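pip’s hash checking is the primary mechanism for verifying the integrity of the packages you install: it confirms that the file you received is byte-for-byte the file you expected. It says nothing about who published the file, so it is an integrity check rather than a signature check.
Here’s what a plain pip install requests looks like. Note that the output never mentions hashes: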
$ pip install requests
Collecting requests
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 1.2 MB/s eta 0:00:00
Collecting charset-normalizer<4,>=2 (from requests)
Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (140 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.5/140.5 kB 3.0 MB/s eta 0:00:00
... (other dependencies)
Installing collected packages: urllib3, idna, charset-normalizer, requests
Successfully installed charset-normalizer-3.3.2 idna-3.6 requests-2.31.0 urllib3-2.2.1
When pip downloads a package, it doesn’t just trust the file it receives. Package indexes such as PyPI publish a cryptographic hash next to each download link, and pip compares the downloaded file against it. Think of the hash as a digital fingerprint: change a single byte of the file and the fingerprint changes. A match tells pip the file wasn’t corrupted or truncated on the way. Because this hash comes from the same index as the file, though, it can’t prove the file is the genuine article. For that you need pip’s hash-checking mode, where the expected hashes come from your own requirements file and any mismatch makes pip abort with an error instead of installing a potentially corrupted or malicious package.
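The fingerprint comparison can be sketched in a few lines of Python. This is a minimal illustration using the standard hashlib module, not pip’s own code; the names fingerprint and is_untampered and the sample bytes are made up for the example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_untampered(data: bytes, expected_hex: str) -> bool:
    """True only when the computed digest equals the expected one exactly."""
    return fingerprint(data) == expected_hex.lower()


# Stand-in for a downloaded archive; a real check would read the .whl bytes.
original = b"pretend this is a wheel file"
expected = fingerprint(original)  # recorded while the file was known to be good

tampered = original + b"!"  # a single extra byte is enough to change the digest

print(is_untampered(original, expected))
print(is_untampered(tampered, expected))
```

The comparison is all-or-nothing: there is no notion of a digest being "close", which is what makes a hash usable as a fingerprint.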
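The goal is to stop a swapped artifact from being installed, whether the swap comes from a "man-in-the-middle" attack on the download, a compromised mirror or proxy, or a compromised index. HTTPS already protects the connection itself; a hash recorded in advance adds a check that doesn’t depend on the server being honest. By verifying against a pre-recorded hash, you ensure you’re getting exactly the file you vetted when you recorded it.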
Hashes reach pip from two places. The first is the index: PyPI’s Simple API (PEP 503) appends a fragment such as #sha256=<digest> to each file’s download URL, so pip learns the advertised hash while it resolves dependencies, for wheels (.whl) and source distributions (.tar.gz) alike. The second is your own requirements file, where each pinned requirement carries one or more --hash options.
For example, a hash-pinned entry for requests 2.31.0 has the following shape. The digests are illustrative placeholders, not the real values for that release; a real entry lists one hash per release file you are willing to accept:
requests==2.31.0 \
    --hash=sha256:38676354373206144773531504b3714108067b2e5f01d2f7b8135c817657586b \
    --hash=sha256:1a8830388048856990e2204c1d608b0f263e1815555337732674117960e0c657 \
    --hash=sha256:8b0d6a02236021818569240b26158334f591f703f3b5b55412b2720696304567 \
    --hash=sha256:23000218205d078755f822364d8548a852d649a50041960513b3370584065366 \
    --hash=sha256:6b5d71639d9693013f408c69d1b4346d560836b148b0821e219697f6a5d17d04 \
    --hash=sha256:c6d86772900f72e1179273b6f09c07d7e081776d25115b87f49287404487070a \
    --hash=sha256:2c82789a0336f6b66b74c0ffb2a097169984b563f24c987f10d10b48c532b155
Each --hash option names a hash algorithm (most commonly SHA256) followed by the digest. pip hashes the downloaded file with that algorithm and compares the result against all the hashes listed for the requirement. If any of them match, the download is considered valid. Several hashes are listed because one release usually consists of several files, such as a source distribution plus wheels for different platforms, and each file has its own hash. The list is not a hedge against one algorithm turning out to be weak.
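The "any listed hash may match" rule can be sketched as follows. This is an illustration under stated assumptions, not pip’s implementation: the function name matches_any, the algorithm:hexdigest entry format, and the sample byte strings are chosen for the example.

```python
import hashlib


def matches_any(data: bytes, allowed: list[str]) -> bool:
    """Return True if data hashes to any 'algorithm:hexdigest' entry in allowed."""
    for entry in allowed:
        algorithm, _, expected_hex = entry.partition(":")
        # hashlib.new accepts the algorithm by name, e.g. "sha256" or "sha512".
        actual_hex = hashlib.new(algorithm, data).hexdigest()
        if actual_hex == expected_hex.lower():
            return True
    return False


# Two stand-in release files, each with its own recorded hash.
wheel = b"bytes of the wheel"
sdist = b"bytes of the source distribution"
allowed = [
    "sha256:" + hashlib.sha256(wheel).hexdigest(),
    "sha512:" + hashlib.sha512(sdist).hexdigest(),
]

print(matches_any(wheel, allowed))
print(matches_any(sdist, allowed))
print(matches_any(b"something else entirely", allowed))
```

Either recorded file is accepted and anything else is rejected, which is why a single requirement can cover both a wheel and a source distribution.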
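The most surprising true thing about pip’s hash checking is that the tamper-resistant form is opt-in. A plain pip install only compares the download against the hash the index advertises. Strict hash-checking mode turns on when you pass --require-hashes, or automatically when any requirement in a requirements file carries a --hash option. Once it is on, every requirement, including every transitive dependency, must be pinned with == and have at least one hash, or pip refuses to install. pip also disables its cache of locally built wheels in this mode, because a cached wheel would not match the hash recorded for the source distribution it was built from.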
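pip’s documentation describes hash-checking mode as requiring every requirement to be pinned with == and to carry a --hash option. A rough pre-flight check for a requirements file might look like the sketch below. It is a simplification, not pip’s parser: the function name hash_mode_problems is hypothetical, only full-line comments and option lines are skipped, and <digest> in the sample is a placeholder.

```python
import re


def hash_mode_problems(requirements_text: str) -> list[str]:
    """List requirements that are not pinned with == or lack a --hash option."""
    # Join backslash-continued lines so each requirement is one logical line.
    joined = re.sub(r"\\[ \t]*\n", " ", requirements_text)
    problems = []
    for line in joined.splitlines():
        line = line.strip()
        # Skip blanks, full-line comments, and option lines such as --index-url.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # The project name ends at the first specifier, marker, extra, or space.
        name = re.split(r"[=<>!~;\s\[]", line, maxsplit=1)[0]
        if "==" not in line:
            problems.append(f"{name}: not pinned with ==")
        if "--hash=" not in line:
            problems.append(f"{name}: no --hash option")
    return problems


example = "\n".join([
    "requests==2.31.0 \\",
    "    --hash=sha256:<digest>",
    "urllib3>=2",
])

for problem in hash_mode_problems(example):
    print(problem)
```

In this sample, requests passes and urllib3 is reported twice, once for the loose version specifier and once for the missing hash.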
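pip has no --no-verify-hash option and, as far as its documented options go, no switch that turns hash checking off while a requirements file still carries --hash options. You leave hash-checking mode only by dropping --require-hashes and removing the hashes from the file. Doing that is almost never what you want unless you’re debugging a very specific issue or working in a highly controlled, air-gapped environment where you’ve manually verified every artifact.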
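The mechanism relies on the integrity of the hashes themselves. A hash that pip reads from the index is only as trustworthy as the index: if an attacker could change both a file and its advertised hash, pip would happily verify the malicious file against the tampered hash. Hashes committed to your own requirements file don’t share that weakness, provided they were recorded from artifacts you trust. Even then, a matching hash proves only that the file is the one you pinned, not that the file was safe to begin with.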
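If you’re ever curious about the hash of a package file, download it into a directory of your choice (./artifacts here) with pip download requests==2.31.0 --no-deps -d ./artifacts, then compute the SHA256 hash yourself using sha256sum <filename.whl>, or run pip hash <filename.whl> to get it in --hash form. The pip cache is less useful for this: ~/.cache/pip/wheels/ (or wherever your pip cache is configured) holds wheels that pip built locally from source distributions, and files downloaded from the index sit in pip’s HTTP cache in an internal format rather than as .whl files.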
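For large archives it helps to hash the file in chunks rather than reading it into memory at once. The sketch below does that and formats the result as a --hash option of the kind used in requirements files. The function name hash_option_for and the 64 KiB chunk size are choices made for this example, not anything pip prescribes.

```python
import hashlib
from pathlib import Path


def hash_option_for(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in chunks and format it as a requirements-file --hash option."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(64 * 1024), b""):
            digest.update(chunk)
    return f"--hash={algorithm}:{digest.hexdigest()}"


# Usage, assuming a wheel was downloaded into ./artifacts beforehand:
# print(hash_option_for(Path("artifacts/requests-2.31.0-py3-none-any.whl")))
```

The digest should agree with what sha256sum reports for the same file, since both hash the same bytes with the same algorithm.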
Once you understand hash checking, the next step in securing your Python package installations is exploring pip-tools, whose pip-compile --generate-hashes command writes hash-pinned requirements files for you, and reproducible builds.