I noticed something interesting about Python:
>>> a=10
>>> b=10
>>> print(a is b)
True
But also,
>>> a=1000
>>> b=1000
>>> print(a is b)
False
The small integer cache
Python (specifically CPython) pre-caches integers in the range -5 to 256 at interpreter startup. These integers are interned, meaning there’s only one shared object for each of them.
So when you do:
>>> a = 42
>>> b = 42
Both a and b point to the same memory address, because 42 is inside the small-int cache. That’s why a is b is True.
But:
>>> x = 1000
>>> y = 1000
Now you’re outside that cached range, so Python creates two separate int objects—even if their values are the same.
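You can see the boundary right at 256 in the interactive interpreter (typed line by line; a script can behave differently, since constants compiled into the same code object may be folded into one):
>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False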
CPython Implementation
Inside CPython’s source code (Objects/longobject.c), there’s an array called _PyLong_SMALL_INTS. This array is initialized at startup, and every small int (-5 to 256) that Python accesses is returned from it by this function:
static PyObject *
get_small_int(sdigit ival)
{
    assert(IS_SMALL_INT(ival));
    return (PyObject *)&_PyLong_SMALL_INTS[_PY_NSMALLNEGINTS + ival];
}
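Here’s a minimal Python-level sketch of what that lookup does, assuming the documented -5 to 256 range; the names below are illustrative, not CPython’s actual API:

NSMALLNEGINTS = 5      # cached negative values: -5 .. -1
NSMALLPOSINTS = 257    # cached non-negative values: 0 .. 256

# One shared object per small value; -5 lands at index 0, 256 at index 261.
SMALL_INTS = list(range(-NSMALLNEGINTS, NSMALLPOSINTS))

def get_small_int(ival):
    assert -NSMALLNEGINTS <= ival < NSMALLPOSINTS
    return SMALL_INTS[NSMALLNEGINTS + ival]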
This holds even when the integer is created at runtime (e.g., int("10") or 3 + 7): CPython returns the already-initialized object from this array instead of allocating a new one. And since the cache is set up once per Python process and shared by every thread in it, small-int ids remain consistent across threads too.
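For example, all of these produce the same cached object in the REPL (the threading example below then shows the address is stable across threads; the exact id values will differ on your machine):
>>> a = 10
>>> b = int("10")
>>> c = 3 + 7
>>> a is b is c
True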
>>> import threading
>>>
>>> def print_id():
... x = 3
... print(id(x))
...
>>> threads = [threading.Thread(target=print_id) for _ in range(4)]
>>> for t in threads: t.start()
...
4327116120
4327116120
4327116120
4327116120
TL;DR
- Integers from -5 to 256 are cached and reused.
- This is why a is b might be True for small numbers but False for large ones.
- It’s a smart performance trick; just don’t confuse object identity (is) with value equality (==), as the quick check below shows.
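A quick check of that last point, using a value outside the cache:
>>> x = 1000
>>> y = 1000
>>> x == y   # value equality
True
>>> x is y   # object identity
False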