Python security hardening is less about adding new features and more about removing things and restricting what’s already there.
Consider this Python script:
    import os
    import subprocess

    def run_command(command):
        try:
            result = subprocess.run(command, shell=True, capture_output=True, text=True, check=True)
            print("STDOUT:", result.stdout)
            print("STDERR:", result.stderr)
        except subprocess.CalledProcessError as e:
            print(f"Error executing command: {e}")
            print("STDOUT:", e.stdout)
            print("STDERR:", e.stderr)

    if __name__ == "__main__":
        user_input = input("Enter a command to run: ")
        run_command(user_input)
This looks simple, but it would be trivial to exploit if the script were exposed to untrusted input: anyone who reaches the input() prompt can run ls /, cat /etc/passwd, or any other command. shell=True is the immediate danger here, because it lets the shell interpret metacharacters. Even without shell=True, a command constructed from untrusted input can still be dangerous.
Here’s a 25-point checklist to harden your Python applications:
1. Avoid eval() and exec():
- Diagnosis: Search your codebase for eval( and exec(.
- Fix: Replace them with safer alternatives. If you need to execute dynamic code, use a dedicated, sandboxed execution environment or parse structured data (like JSON) instead. For example, json.loads() is a safe way to parse JSON strings.
- Why it works: eval() and exec() execute arbitrary Python code, making them prime targets for injection attacks if the input is not strictly controlled.
2. Sanitize User Input (Especially for OS commands):
- Diagnosis: Look for any code that takes user input and passes it to os.system(), subprocess.run(..., shell=True), or similar functions.
- Fix: Use subprocess.run() without shell=True. Pass commands and arguments as a list: subprocess.run(['ls', '-l', '/tmp']). If input must become part of a shell command string, escape it with shlex.quote() so shell metacharacters are treated literally.
- Why it works: shell=True allows the shell to interpret special characters (like ;, |, &, >), which can be used to execute unintended commands. Passing arguments as a list treats them as literal strings, preventing shell interpretation.
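The difference can be sketched as follows. To stay portable, this example launches the current Python interpreter (sys.executable) as the child process rather than ls; the function name echo_argument and the payload string are illustrative assumptions.

```python
import shlex
import subprocess
import sys


def echo_argument(untrusted: str) -> str:
    """Pass untrusted text as one argv entry; no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


payload = "/tmp; echo INJECTED"

# The child receives the whole string as a single argument; nothing is injected.
print(echo_argument(payload))

# If a shell string is unavoidable, shlex.quote() wraps the value in quotes.
print(shlex.quote(payload))
```

With shell=True and naive string concatenation, the same payload would have run echo INJECTED as a second command.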
3. Use subprocess Safely:
- Diagnosis: Review all uses of subprocess. Pay close attention to shell=True and how command arguments are constructed.
- Fix: Prefer subprocess.run() over older functions like os.system(). Always pass commands and arguments as a list when shell=False (the default and recommended setting). If you absolutely need shell=True, ensure the command string is meticulously validated and escaped.
- Why it works: This avoids shell interpretation of metacharacters and provides better control over process execution.
4. Limit File Permissions:
- Diagnosis: Check where your Python scripts write files. Are they world-writable or accessible by unintended users?
- Fix: Use os.chmod() to set restrictive permissions (e.g., 0o600 for owner-only read/write) on sensitive files created by your application.
- Why it works: Prevents unauthorized users from reading, modifying, or deleting critical data files.
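One way to apply this, sketched under the assumption of a POSIX system: instead of creating the file and then calling os.chmod(), the file can be created with mode 0o600 from the start via os.open(), which avoids a short window in which it is readable by others. The function name write_private and the file name are hypothetical.

```python
import os
import stat
import tempfile


def write_private(path: str, data: str) -> None:
    """Create a new file that only the owner can read or write (POSIX)."""
    # O_EXCL refuses to reuse an existing file, so the requested mode always applies.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "token.txt")
    write_private(target, "example-secret")
    mode = stat.S_IMODE(os.stat(target).st_mode)
    print(oct(mode))
```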
5. Securely Manage Secrets:
- Diagnosis: Search for hardcoded passwords, API keys, or other credentials in your source code, configuration files, or environment variables that are not properly secured.
- Fix: Use environment variables for secrets, but ideally, integrate with a dedicated secrets management system (e.g., HashiCorp Vault, AWS Secrets Manager, Azure Key Vault). Load secrets at runtime, not compile time.
- Why it works: Avoids exposing sensitive information in version control or easily accessible configuration files.
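A minimal sketch of loading a secret from the environment at runtime and failing loudly when it is missing. The variable name EXAMPLE_API_KEY, the helper get_secret, and the exception class are illustrative assumptions; a real deployment would have the secrets manager or process supervisor populate the environment.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def get_secret(name: str) -> str:
    """Read a secret from the environment; never fall back to a hardcoded default."""
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value


# For demonstration only: normally the deployment sets this, not the code.
os.environ["EXAMPLE_API_KEY"] = "demo-value"

api_key = get_secret("EXAMPLE_API_KEY")
# Log that the secret was loaded, never the secret itself.
print("loaded key of length", len(api_key))
```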
6. Use ssl for Network Communication:
- Diagnosis: Identify any network communication that is not encrypted.
- Fix: Use the ssl module to wrap sockets for TLS encryption when communicating over networks, and leave certificate verification enabled (ssl.create_default_context() does this). For HTTP, use requests with HTTPS URLs.
- Why it works: Protects data in transit from eavesdropping and man-in-the-middle attacks.
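Wrapping a client socket might look like the sketch below. It relies on ssl.create_default_context(), which turns on certificate and hostname verification; the TLS 1.2 floor and the function names are assumptions chosen for illustration. The connection helper is defined but not called here, so the snippet runs without network access.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context with certificate and hostname checks enabled."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy floor
    return context


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Connect over TCP, then upgrade the socket to TLS."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the plain socket if the handshake fails
        raise


ctx = make_client_context()
print(ctx.verify_mode, ctx.check_hostname)
```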
7. Disable Debugging in Production:
- Diagnosis: Check if debug flags (e.g., DEBUG = True in Django or Flask) are enabled in your production environment.
- Fix: Ensure debug modes are turned off in production deployments. Use configuration management to set DEBUG = False.
- Why it works: Debug modes often expose sensitive information like stack traces, internal variables, and configuration details, which can aid attackers.
8. Use pickle with Extreme Caution:
- Diagnosis: Search for pickle.load() and pickle.dump().
- Fix: Never unpickle data from untrusted sources. If you must serialize/deserialize, consider safer formats like JSON or Protocol Buffers, or use libraries specifically designed for secure serialization.
- Why it works: pickle can execute arbitrary code during deserialization, making it a significant security risk if the pickled data is tampered with.
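To make the risk concrete, here is a deliberately harmless sketch: the class Sneaky (a made-up name) uses __reduce__ to tell pickle which callable to invoke on load. The example uses len, but an attacker could name any importable function. The JSON round trip at the end shows the data-only alternative.

```python
import json
import pickle


class Sneaky:
    def __reduce__(self):
        # On load, pickle calls len("attacker-chosen") and returns the result.
        return (len, ("attacker-chosen",))


blob = pickle.dumps(Sneaky())
loaded = pickle.loads(blob)

# We did not get a Sneaky instance back: a function call ran during loading.
print(type(loaded), loaded)

# JSON carries only data, so decoding cannot trigger a call like this.
restored = json.loads(json.dumps({"user": "alice", "roles": ["admin"]}))
print(restored)
```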
9. Install Dependencies Securely:
- Diagnosis: Review your requirements.txt or Pipfile. Are you pinning exact versions? Are you using a lock file?
- Fix: Pin exact versions, for example with pip freeze > requirements.txt. Use pip-tools or Poetry to generate lock files for reproducible dependency installation.
- Why it works: Reduces exposure to supply chain attacks in which malicious code is injected into a new release of a dependency, and ensures consistent, tested environments.
10. Regularly Update Dependencies:
- Diagnosis: Are you aware of the CVEs (Common Vulnerabilities and Exposures) affecting your dependencies?
- Fix: Use tools like pip-audit, safety, or GitHub’s Dependabot to scan your dependencies for known vulnerabilities and update them promptly.
- Why it works: Patches known security holes in third-party libraries that your application relies on.
11. Input Validation and Type Checking:
- Diagnosis: Examine how your application handles external input (API requests, form submissions, file uploads). Is it validated rigorously?
- Fix: Use validation libraries like Pydantic or Cerberus to define expected data schemas and validate all incoming data against them. Explicitly check data types.
- Why it works: Ensures that only data conforming to expected formats and types is processed, preventing malformed inputs from causing unexpected behavior or crashes.
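The same idea can be sketched with only the standard library. This is a hand-rolled stand-in for what Pydantic or Cerberus would do with a declared schema, not their API; the SignupRequest fields and limits are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupRequest:
    username: str
    age: int

    def __post_init__(self):
        if not isinstance(self.username, str) or not 3 <= len(self.username) <= 32:
            raise ValueError("username must be a string of 3 to 32 characters")
        # bool is a subclass of int, so it has to be rejected explicitly.
        if isinstance(self.age, bool) or not isinstance(self.age, int):
            raise ValueError("age must be an integer")
        if not 0 < self.age < 150:
            raise ValueError("age is out of range")


def parse_signup(payload: dict) -> SignupRequest:
    """Reject unknown or missing keys, then validate types and ranges."""
    expected = {"username", "age"}
    if set(payload) != expected:
        raise ValueError(f"expected exactly the keys {sorted(expected)}")
    return SignupRequest(**payload)


print(parse_signup({"username": "alice", "age": 30}))

try:
    parse_signup({"username": "alice", "age": "30; DROP TABLE users"})
except ValueError as exc:
    print("rejected:", exc)
```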
12. Use jinja2 or other templating engines safely:
- Diagnosis: If you’re rendering HTML or other text from templates using user-supplied data.
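- Fix: Ensure autoescaping is enabled. Plain jinja2 leaves it off unless you pass autoescape=True or select_autoescape() to the Environment; Flask enables it for HTML templates by default. Be cautious with |safe filters; only use them on data you know is safe.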
- Why it works: Autoescaping converts potentially malicious HTML/script tags into their literal string representation, preventing Cross-Site Scripting (XSS) attacks.
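To show the effect of escaping without depending on a template engine, the sketch below uses the standard library's html.escape() as a stand-in for what an autoescaping engine applies to interpolated values. The render_comment helper is hypothetical.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Escape every user-supplied value before placing it into markup."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))


# The script tag is rendered as visible text instead of being executed.
print(render_comment("mallory", "<script>alert(1)</script>"))
```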
13. Limit Access to Sensitive Modules:
- Diagnosis: Are there internal modules that should not be imported or used by external components?
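- Fix: Structure your project to avoid exposing sensitive modules. Use __init__.py and __all__ to define what’s meant to be importable, keeping in mind that Python does not enforce import restrictions, so this is a convention rather than a security boundary. Consider using separate processes or microservices for highly sensitive operations.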
- Why it works: Restricts the attack surface by making it harder for an attacker to gain access to powerful internal functions.
14. Use argparse for Command-Line Arguments:
- Diagnosis: If your script takes command-line arguments, are you manually parsing sys.argv?
- Fix: Use the argparse module to define and parse command-line arguments.
- Why it works: Provides robust argument parsing, helps prevent common mistakes, and makes your CLI interface more user-friendly and less prone to injection-like issues if arguments are improperly handled.
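A small sketch of what this buys you: type conversion and an allowlist of values are declared once, and anything else is rejected before your own code runs. The program name report and its options are made up for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="report", description="Export a report.")
    # choices acts as an allowlist; anything else exits with a usage error.
    parser.add_argument("--format", choices=["json", "csv"], default="json")
    # type=int rejects non-numeric values before the program logic sees them.
    parser.add_argument("--limit", type=int, default=10)
    parser.add_argument("name", help="name of the report to export")
    return parser


# Passing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--format", "csv", "--limit", "5", "daily"])
print(args.format, args.limit, args.name)
```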
15. Avoid os.popen():
- Diagnosis: Search for os.popen(.
- Fix: Replace with subprocess.Popen or subprocess.run.
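- Why it works: os.popen() always runs its command through the shell, so it carries the same injection risk as shell=True; the subprocess module lets you pass an argument list and gives finer control over the process.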
16. Securely Handle File Uploads:
- Diagnosis: If your application accepts file uploads.
- Fix: Validate file types, sizes, and scan for malware. Store uploaded files outside the webroot, in a secure, non-executable location. Rename files to prevent path traversal attacks.
- Why it works: Prevents attackers from uploading malicious scripts or executables that could be run on the server.
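The renaming and validation steps could be sketched like this. The extension allowlist, the size limit, and the directory are placeholder assumptions, and the function only computes where a file would be stored; checking the actual content type and scanning for malware are separate steps not shown here.

```python
import uuid
from pathlib import Path

ALLOWED_EXTENSIONS = {".png", ".jpg", ".pdf"}  # assumed allowlist
MAX_BYTES = 5 * 1024 * 1024  # assumed size limit


def storage_path(upload_dir: Path, client_filename: str, size: int) -> Path:
    """Pick a server-generated name; the client's file name is never used as a path."""
    if size > MAX_BYTES:
        raise ValueError("file too large")
    extension = Path(client_filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension {extension!r} is not allowed")
    # Only the validated extension survives, so "../" sequences cannot escape upload_dir.
    return upload_dir / f"{uuid.uuid4().hex}{extension}"


uploads = Path("/srv/app-uploads")  # hypothetical directory outside the webroot
print(storage_path(uploads, "../../etc/cron.d/evil.png", 1024))
```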
17. Rate Limiting:
- Diagnosis: Is your application vulnerable to brute-force attacks or denial-of-service?
- Fix: Implement rate limiting on API endpoints, login attempts, and resource-intensive operations. Libraries like Flask-Limiter or django-ratelimit can help.
- Why it works: Prevents attackers from overwhelming your application with requests or trying to guess credentials repeatedly.
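For intuition, here is a minimal in-memory sliding-window limiter. It is a sketch of the general technique, not how any particular library implements it; the class name and parameters are assumptions, and a multi-process deployment would need shared storage such as a cache server instead of a local dict.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
# Five rapid login attempts from one address: only the first three pass.
print([limiter.allow("203.0.113.7") for _ in range(5)])
```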
18. Content Security Policy (CSP):
- Diagnosis: Are you serving dynamic web content?
- Fix: Implement a CSP header in your web application responses. This tells the browser which dynamic resources (scripts, styles, etc.) are allowed to load.
- Why it works: Mitigates XSS attacks by restricting the sources from which content can be loaded.
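A framework-neutral sketch of assembling the header value. The directives and the CDN host are example choices, not a recommended policy, and build_csp is a hypothetical helper; how the header is attached to a response depends on your framework.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join CSP directives into a single header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


policy = build_csp(
    {
        "default-src": ["'self'"],
        "script-src": ["'self'", "https://cdn.example.com"],  # example host
        "object-src": ["'none'"],
    }
)

# Attach this to every HTML response using your framework's header API.
headers = {"Content-Security-Policy": policy}
print(headers)
```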
19. Use warnings module judiciously:
- Diagnosis: Are you using warnings.warn() to indicate potential issues, but not handling them?
- Fix: In production, configure the warnings module to treat specific warnings as errors or to ignore them entirely if they are not relevant to security or stability.
- Why it works: Prevents potentially exploitable conditions or noisy output from impacting the production environment.
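Promoting a warning category to an error can be sketched as follows. The function legacy_hash is a made-up example of code that emits a DeprecationWarning, and catch_warnings() keeps the filter change local to the block.

```python
import warnings


def legacy_hash(data: str) -> int:
    """Hypothetical function that signals deprecation through the warnings module."""
    warnings.warn("legacy_hash is deprecated", DeprecationWarning, stacklevel=2)
    return hash(data)


outcome = "not raised"
with warnings.catch_warnings():
    # Inside this block, DeprecationWarning is raised as an exception.
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_hash("x")
    except DeprecationWarning as exc:
        outcome = f"raised: {exc}"

print(outcome)
```

The same filter can be applied process-wide at startup, for example with the -W command-line option or the PYTHONWARNINGS environment variable.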
20. Principle of Least Privilege:
- Diagnosis: Is your Python application running with more permissions than it needs?
- Fix: Run your Python processes under a dedicated, unprivileged user account. Limit the file system access, network access, and system calls that this user can make.
- Why it works: If the application is compromised, the attacker’s capabilities are severely limited by the restricted privileges of the compromised process.
21. Sanitize XML Input:
- Diagnosis: If your application parses XML data.
- Fix: Use libraries like defusedxml to parse XML. These libraries disable potentially dangerous features like external entity expansion (XXE).
- Why it works: Prevents XXE attacks, which can lead to information disclosure or denial-of-service.
22. Secure Serialization of Data:
- Diagnosis: Beyond pickle, consider other serialization formats.
- Fix: For inter-process communication or data storage, prefer formats like JSON, MessagePack, or Protocol Buffers. Ensure schemas are defined and validated.
- Why it works: These formats are generally data-only and do not execute code upon deserialization, making them safer than pickle.
23. Use a Web Application Firewall (WAF):
- Diagnosis: Is your Python web application exposed to the internet?
- Fix: Deploy a WAF (e.g., ModSecurity, Cloudflare WAF, AWS WAF) in front of your application.
- Why it works: A WAF can filter out common malicious requests (SQL injection, XSS, etc.) before they even reach your Python application.
24. Secure Configuration Management:
- Diagnosis: How are your application’s configuration files managed and deployed?
- Fix: Use secure configuration management tools. Store sensitive configuration data encrypted or in a secrets manager. Ensure configuration files have appropriate permissions.
- Why it works: Prevents configuration drift and ensures that sensitive settings are not exposed or accidentally altered.
25. Regular Security Audits and Penetration Testing:
- Diagnosis: When was the last time your application was professionally audited for security flaws?
- Fix: Schedule regular security audits and penetration tests by qualified security professionals.
- Why it works: Proactively identifies vulnerabilities that might have been missed by automated tools or manual checklists.
After applying these, the next error you’re likely to encounter is a ModuleNotFoundError if you’ve removed a dependency that was implicitly relied upon, or a FileNotFoundError if you’ve secured file access too aggressively.