Python’s memory management is a lot more sophisticated than just "garbage collection happens." The real magic, and where most of the overhead lies, is in how Python actually requests and holds onto memory from the operating system.

Let’s see Python in action with some basic memory allocation.

import sys

# Allocate a small integer
small_int = 100
print(f"Size of small_int: {sys.getsizeof(small_int)} bytes")

# Allocate a larger integer
large_int = 10**100
print(f"Size of large_int: {sys.getsizeof(large_int)} bytes")

# Allocate a string
my_string = "hello world"
print(f"Size of my_string: {sys.getsizeof(my_string)} bytes")

# Allocate a list
my_list = [1, 2, 3, 4, 5]
print(f"Size of my_list: {sys.getsizeof(my_list)} bytes")

# Allocate a dictionary
my_dict = {"a": 1, "b": 2}
print(f"Size of my_dict: {sys.getsizeof(my_dict)} bytes")

This looks straightforward, but behind the scenes, Python isn’t just asking the OS for exactly 24 bytes for small_int. It’s using a layered system to manage memory efficiently. The core components are the allocator and arenas.

The primary goal of Python’s memory management is to minimize the overhead of frequent, small memory requests to the operating system. Each call to malloc() (or its equivalent) from the OS is relatively expensive. Python tries to amortize this cost by managing memory in larger chunks.

At the lowest level, Python uses a memory allocator. For small objects (typically less than 512 bytes), Python uses a specialized allocator (like pymalloc) that’s optimized for speed and fragmentation reduction. This allocator manages memory in fixed-size "blocks" within larger "arenas." An arena is a contiguous block of memory obtained from the operating system. When Python needs memory for a small object, it first checks if there’s an available block of the correct size within an existing arena. If so, it reuses it. Only when an arena is exhausted or no suitable blocks are available does Python request a new arena from the OS.

This strategy is incredibly effective. Imagine creating thousands of small objects. Instead of making thousands of individual calls to the OS, Python makes a few calls to get large arenas and then carves out small pieces from those arenas for its objects. This drastically reduces system calls and improves performance. For larger objects (over 512 bytes), Python falls back to the system’s default allocator (like malloc from glibc).

The pymalloc allocator is particularly clever. It pre-allocates memory in chunks called arenas. Each arena is typically 256 KB. Within an arena, memory is divided into pools for different object sizes. When you create a small object, Python finds the appropriate pool, grabs a free block, and returns it. This avoids fragmentation and speeds up allocation because Python doesn’t need to search for arbitrary free memory; it just picks from a pre-defined slot.

The surprising part is how Python handles the deallocation of these small objects. When an object is garbage collected, its memory block isn’t immediately returned to the OS. Instead, it’s marked as free within its pool and can be reused for a new object of the same size. This "pool" concept is key to Python’s efficiency. Only when an entire arena is completely empty does Python consider releasing it back to the OS.

The size of these arenas and the block sizes within them are not arbitrary. They are carefully chosen based on typical object sizes in Python and the performance characteristics of the underlying operating system. For instance, the default arena size is 256 KB, and the block sizes are powers of 2, ranging from 8 bytes up to 512 bytes. This ensures efficient packing and minimal wasted space.

One subtle aspect is how Python’s garbage collector interacts with this. When an object is no longer referenced, its memory is reclaimed by the allocator. For small objects, this means the block is returned to its pool. For larger objects, it’s a more direct call to the system allocator. The generational garbage collector in Python primarily tracks cycles of references, but the underlying memory management (allocators and arenas) is what makes the frequent creation and destruction of objects feasible.

Understanding this layering of arenas, pools, and blocks is crucial for debugging memory bloat. When you see high memory usage, it’s often not just a few large objects but many small objects held within partially filled arenas that haven’t yet been released to the OS.

Want structured learning?

Take the full Python course →