The setup.py file in Python packaging is a vestige of a bygone era, and understanding its configuration for legacy package builds requires appreciating the shift from direct script execution to declarative metadata.

Let’s see what happens when a package is built using setup.py. Imagine we have a simple package structure:

my_package/
    __init__.py
    module.py
setup.py
README.md

And setup.py looks like this:

from setuptools import setup, find_packages

setup(
    name='my_package',
    version='0.1.0',
    packages=find_packages(),
    description='A simple example package',
    author='Example User',
    author_email='user@example.com',
    url='http://example.com/my_package',
    install_requires=[
        'numpy>=1.20.0',
    ],
    classifiers=[
        'Programming Language :: Python :: 3',
        'License :: OSI Approved :: MIT License',
        'Operating System :: OS Independent',
    ],
    python_requires='>=3.7',
)

When you run pip wheel . or the older python setup.py sdist bdist_wheel in the directory containing this setup.py, Python executes the script, and the setup() function imported from setuptools takes over. (Invoking setup.py directly is deprecated; python -m build is the recommended replacement.) setup() reads its keyword arguments to gather the package's metadata: name, version, dependencies, supported Python versions, and so on. This information is then used to generate distribution files such as source distributions (.tar.gz) and wheels (.whl), which pip uses to install your package.
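To make the mapping from setup() arguments to built metadata concrete, here is a minimal sketch that parses the kind of METADATA file a wheel carries. The file contents below are illustrative only, written by hand for the example package; the exact fields and the Metadata-Version value depend on the setuptools version that builds the wheel.

```python
from email import message_from_string

# Illustrative METADATA contents for the example package (hand-written here;
# real output varies with the setuptools version used for the build).
METADATA = """\
Metadata-Version: 2.1
Name: my_package
Version: 0.1.0
Summary: A simple example package
Requires-Python: >=3.7
Requires-Dist: numpy>=1.20.0
"""

# Core metadata uses an email-header-like format, so the stdlib parser can read it.
fields = message_from_string(METADATA)

print(fields["Name"])                    # from name=
print(fields["Version"])                 # from version=
print(fields["Requires-Python"])         # from python_requires=
print(fields.get_all("Requires-Dist"))   # from install_requires=
```

Installers read these fields rather than re-running setup.py, which is why a wheel can be installed without executing any project code.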

The core problem setup.py solved was providing a standardized way for Python packages to declare their build and installation requirements. Before distutils introduced setup.py, and setuptools later extended it, installing packages was often a manual, error-prone process involving copying files and running custom scripts. setup.py centralized this by defining a common interface for package authors to specify:

  • Package Metadata: Name, version, author, description, license, URL.
  • Code Location: Which directories contain the Python modules to be packaged (packages=find_packages()).
  • Dependencies: What other packages are required for this package to function (install_requires).
  • Python Version Compatibility: Which Python interpreters are supported (python_requires).
  • Entry Points: For creating command-line scripts or plugins.
  • Data Files: Non-Python files that need to be included.
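The last two items in the list do not appear in the example setup.py. A sketch of how they might be declared follows; the command name my-command, the main() function in module.py, and the data/*.json pattern are all hypothetical and exist only for illustration.

```python
from setuptools import find_packages, setup

setup(
    name='my_package',
    version='0.1.0',
    packages=find_packages(),
    # Entry point: installs a `my-command` executable that calls main()
    # in my_package/module.py (command name and main() are hypothetical).
    entry_points={
        'console_scripts': ['my-command = my_package.module:main'],
    },
    # Data files: ships non-Python files that live inside the package
    # directory (the data/*.json pattern is hypothetical).
    package_data={'my_package': ['data/*.json']},
)
```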

The setuptools library, the de facto standard build tool behind setup.py, provides the setup() function. This function is the central orchestrator, taking keyword arguments that describe the package. find_packages() is a helper that automatically discovers the packages in your project directory (directories containing an __init__.py), which is easier and less error-prone than listing each one by hand.
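The discovery step can be approximated with the standard library alone. The function below is a simplified stand-in, not setuptools' actual implementation: the name discover_packages is made up for this sketch, and the real find_packages() additionally supports include and exclude patterns.

```python
import os


def discover_packages(root="."):
    """Return dotted names of directories under root that contain __init__.py.

    Simplified stand-in for setuptools.find_packages(); no include/exclude support.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            # Not a package: stop descending, since subpackages
            # must live inside a parent package.
            dirnames[:] = []
            continue
        rel = os.path.relpath(dirpath, root)
        found.append(rel.replace(os.sep, "."))
    return sorted(found)

# Usage, from the directory containing setup.py:
#     discover_packages(".")
```

For the layout shown earlier, this would report my_package and ignore any directory without an __init__.py.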

pyproject.toml has largely superseded setup.py for new projects, moving configuration toward declarative metadata. However, setup.py is still supported by setuptools and remains common in older projects that haven’t migrated. When a build runs against a project with a setup.py, the file is executed as an ordinary Python script. This means you can include arbitrary Python code within your setup.py for dynamic configuration, though this is generally discouraged in favor of declarative metadata in pyproject.toml. For example, you could dynamically determine the version from a file or environment variable, a common practice in legacy projects.
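A minimal sketch of that dynamic-version pattern is shown below. It assumes the package defines a __version__ string in my_package/__init__.py, which the example project above does not state; read_version is a hypothetical helper name.

```python
import re
from pathlib import Path


def read_version(init_path):
    """Extract the __version__ string from a module without importing it."""
    text = Path(init_path).read_text(encoding="utf-8")
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", text, re.MULTILINE
    )
    if match is None:
        raise RuntimeError(f"no __version__ found in {init_path}")
    return match.group(1)

# In setup.py this would be used as (assuming __init__.py defines __version__):
#     setup(name='my_package',
#           version=read_version('my_package/__init__.py'), ...)
```

Reading the file as text avoids importing the package at build time, when its dependencies may not be installed yet.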

A key aspect of legacy builds is handling extensions written in C or other compiled languages. setup.py lets you define Extension objects and pass them to setup() through the ext_modules argument. setuptools then invokes the platform's compiler during the build, using machinery inherited from distutils (which was removed from the standard library in Python 3.12 and is now bundled with setuptools), and produces platform-specific binary modules.
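As a sketch of what such a declaration can look like, the setup.py below adds one C extension to the example package. The module name my_package._speedups and the source path src/speedups.c are hypothetical; the project above contains no C code.

```python
from setuptools import Extension, find_packages, setup

# Hypothetical C extension: the module name and source path are illustrative.
speedups = Extension(
    name='my_package._speedups',
    sources=['src/speedups.c'],
)

setup(
    name='my_package',
    version='0.1.0',
    packages=find_packages(),
    # Each Extension listed here is compiled during the build.
    ext_modules=[speedups],
)
```

Because the result is compiled, the wheel built from this file is tied to a specific platform and Python version rather than being a universal pure-Python wheel.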

The defining trait of setup.py is its flexibility, which is also its greatest weakness. Because it’s a Python script, you can do anything in it. This allows for complex, dynamic build logic that might be impossible with purely declarative formats. For instance, you could query the system for installed libraries, run external commands to generate code, or branch on the build environment. This power, however, often leads to brittle, non-reproducible builds that are hard to understand and maintain, which is why the Python packaging community has strongly advocated moving to declarative configuration in pyproject.toml.
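A small sketch of environment-dependent logic of this kind is shown below. It picks compiler flags for a hypothetical C extension based on the platform; the helper name compile_args_for and the choice of flags are assumptions made for illustration.

```python
import sys


def compile_args_for(platform):
    """Pick optimization flags for a hypothetical C extension by platform."""
    if platform.startswith("win"):
        return ["/O2"]   # MSVC-style optimization flag
    return ["-O3"]       # GCC/Clang-style optimization flag


# Evaluated when setup.py runs, so the result depends on the build machine.
extra_compile_args = compile_args_for(sys.platform)
```

The same source tree now builds differently depending on where setup.py runs, which is exactly the trade-off described above: convenient for the author, harder to reproduce for everyone else.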

When you run pip install . on a project with a setup.py, pip asks setuptools to build a wheel, on recent pip versions typically inside an isolated build environment, and then installs that wheel. Older pip versions could instead fall back to running setup.py install directly. Either way, setup.py is executed as part of the build, and pip consumes the artifacts it produces.

The next concept you’ll likely encounter is understanding how pyproject.toml has become the modern standard for package configuration, offering a declarative and more robust approach to building and distributing Python packages.
