How much space does a Python object take in memory?
I was working on a large dictionary in Python for a data science project. The Resource Monitor (a Windows utility that displays information about hardware usage) showed an enormous amount of memory being consumed in a short amount of time. I knew that my draft code was not optimal, but the rate of memory growth did not match the rate at which my dictionary's length was growing. It seemed that the length of my dictionary did not have a linear relationship with the dictionary object's size in memory. I decided to check the size of my dictionary in memory. I was sure that there should be a Python function that gives me the answer, right? Of course, I used Google to find that magic function. After an hour of research, I joined the group of Python programmers who realized that there is no straightforward solution to this question. Why? Read this article.
If you are not familiar with how Python manages memory, I would recommend reading the following article of mine first. In layman's terms, it explains how Python allocates memory to objects.
As I explained in the article, Python objects get stored in the Heap Memory (i.e., a dynamic memory). You can get the address of an object on the Heap memory (aka the Heap) by using a function called id() (in CPython, the value it returns is the object's memory address). You can read more details in the above article.
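As a quick illustration of id(), here is a minimal sketch. The variable names are mine, and reading the returned number as a memory address assumes CPython.

```python
# id() returns an integer identifier for an object.
# In CPython (an assumption here), that integer is the object's memory address.
a = [1, 2]
b = a        # a second name for the same object
c = [1, 2]   # an equal value, but a separate object on the Heap

same_object = id(a) == id(b)
different_object = id(a) != id(c)

print(hex(id(a)), same_object, different_object)
```

Two names that point to the same object share one id, while two equal but separate lists do not.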
But in that article, we did not discuss anything about the size of objects on the Heap. What is the size of an object in memory? Here, I give you three answers to this question. One is simple but wrong, the next one is a little bit more complex and more accurate, and the last solution is as correct as we can get.
Why should I give you a simple but wrong answer? The reason is that if you see the correct answer first, you might not understand why the answer is a little bit complicated. Also, you might not have a good understanding of the reasons behind that. After all, we are reading articles to understand the reasons behind code and solutions; otherwise, StackOverflow is full of correct and verified solutions for almost everything.
In Python, the most basic function for measuring the size of an object in memory is sys.getsizeof(). Let’s take a look at this function using a few examples.
>>> import sys
>>> sys.getsizeof(1)
28
>>> sys.getsizeof('Python')
55
>>> sys.getsizeof([1, 2, 3])
88
These three examples show the size of an integer, a string, and a list in bytes (the exact numbers depend on your Python version and platform). At first glance, everything seems good, and you wonder why this article should be written, right? Give me a few minutes, and I might convince you (as I got convinced after reading the other examples for the first time). Let’s see another example.
>>> sys.getsizeof('')
49
>>> sys.getsizeof('P')
50
>>> sys.getsizeof('Py')
51
>>> sys.getsizeof('Pyt')
52
First, I have an empty string. It took 49 bytes! Then I have a string with only one character, and its size is 50 bytes. I added more characters, and it seems that each character adds one byte to the size of my string object. How do we explain this observation? Actually, it is easy. In Python, like almost everything else, a string is an object, not only a collection of characters. An object (in this case, a string object), in addition to its value (i.e., collection of characters), has different attributes and related components. When we create an object, Python stores all this information in memory. Therefore, we have an overhead even for an empty string.
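The one-byte-per-character pattern holds for plain ASCII text. The sketch below, which assumes CPython's compact string storage, shows that the step can be larger for characters outside the Latin-1 range. The variable names are only for illustration.

```python
import sys

# Fixed overhead of a string object, even with no characters in it.
empty_overhead = sys.getsizeof('')

# Growth per extra character for ASCII text.
ascii_step = sys.getsizeof('Py') - sys.getsizeof('P')

# Growth per extra character for a string of euro signs.
# Assumption: CPython stores such strings with two bytes per character.
wide_step = sys.getsizeof('€€') - sys.getsizeof('€')

print(empty_overhead, ascii_step, wide_step)
```

So the overhead of an empty string is fixed, but the cost of each additional character depends on the widest character in the string.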
Let’s check the same thing for a list.
>>> sys.getsizeof([])
64
>>> sys.getsizeof([1])
72
>>> sys.getsizeof([1, 2])
80
We see the same story here. A list object has 64 bytes of overhead. For each additional item, its size grows by 8 bytes. Okay, it was strange at first, but it makes sense now. Let’s see another example (I promise you, this one is more interesting).
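The 8 bytes per item is the size of one pointer on a 64-bit build. Here is a small sketch that checks this, assuming CPython and using `[None] * n` so that the list is allocated with exactly n slots. Lists grown with append() reserve spare slots, so their sizes jump in steps instead.

```python
import struct
import sys

# Size of a C pointer on this interpreter (8 on a 64-bit build).
pointer_size = struct.calcsize('P')

# Overhead of the list object itself, with no item slots.
base = sys.getsizeof([])

# [None] * n allocates exactly n slots (CPython behaviour, an assumption here).
sizes = {n: sys.getsizeof([None] * n) for n in range(5)}

for n, size in sizes.items():
    print(n, size, size - base)
```

The last column should grow by one pointer per item, no matter what the items are.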
>>> sys.getsizeof([1, 2])
80
>>> sys.getsizeof([3, 4, 5, 1])
96
>>> sys.getsizeof([1, 2, [3, 4, 5, 1]])
88
First, I have a list of [1, 2] which takes 80 bytes of memory. I have another list of [3, 4, 5, 1] which took 96 bytes. So far, everything makes sense. For a list object, we have 64 bytes of overhead and 8 bytes for each additional item. Now, I nest the second list inside the first list. The resulting list will be something like [1, 2, [3, 4, 5, 1]]. When I get the size of this new list object, its size is 88 bytes. What?!! The size of the new nested list (i.e., 88 bytes) is even less than the size of my second list (i.e., 96 bytes). How is it possible?
Let’s play it back. First, I had a list of two items (i.e., integer numbers). It took 80 bytes of memory, as we expected. When we added a new item, which was a list, it added only 8 bytes to my list. It seems that, no matter what, an additional item takes 8 bytes. It seems that a list object is not storing the items but references to the items (i.e., their memory addresses). THAT’S TRUE. When you create a list object, the list object by itself takes 64 bytes of memory, and each item adds 8 bytes of memory to the size of the list because of references to other objects. It means that in the previous example, the list of [1, 2, [3, 4, 5, 1]] is stored in memory like [reference to obj1, reference to obj2, reference to obj3]. The size of each reference is 8 bytes. In this case, obj1, obj2, and obj3 are stored somewhere else in memory. Therefore, to get the actual size of our list object, in addition to getting the size of the list, we need to include the size of each member object (which we call items).
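To see that only references are counted, here is a minimal sketch that nests a tiny list and a large list inside otherwise identical outer lists. The variable names are hypothetical.

```python
import sys

small_inner = [3]
big_inner = list(range(1000))

# Both outer lists hold exactly three references.
outer_small = [1, 2, small_inner]
outer_big = [1, 2, big_inner]

print(sys.getsizeof(outer_small), sys.getsizeof(outer_big))
print(sys.getsizeof(big_inner))
```

Both outer lists report the same size, even though one of them refers to a list with a thousand items.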
As we learned from the previous section, sys.getsizeof() only gives us the size of the object and its attributes in memory. It does not include the size of referenced objects (e.g., items in a list) and their attributes. To get the actual size of an object, we must iterate through all components of an object (e.g., items in a list object) and add their sizes together. The following figure is an example.
As the above figure shows, the size of an object such as [1, 2, [3, 4, 5, 1]] is 352 bytes. However, there is a mistake in this calculation. If you look at the list of objects in the figure, you see the same memory address on rows 2 and 8 (highlighted with *). It seems that 1 (i.e., an integer object) in the main list and 1 in the nested list are stored at the same memory address. As I explained in a previous article (link), Python stores integers in the range [-5, 256] once and points all references to the same memory address (for memory optimization). Therefore, it is better to identify duplicates using their memory addresses (via id()) and count their memory size only once. For our example, we must remove duplicates before summing the memory sizes. The following figure shows the correct answer, which is 324 bytes.
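The bookkeeping described above can be sketched as a short recursive function. This is a minimal sketch, not a complete solution: the name `deep_size` is hypothetical, it only descends into the built-in containers listed in the code, and the numbers it prints depend on your Python version.

```python
import sys

def deep_size(obj, seen=None):
    """Sum sys.getsizeof() over obj and its items, counting each object once."""
    if seen is None:
        seen = set()
    # Skip objects that were already counted (same memory address).
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(item, seen) for item in obj)
    elif isinstance(obj, dict):
        size += sum(deep_size(k, seen) + deep_size(v, seen)
                    for k, v in obj.items())
    return size

nested = [1, 2, [3, 4, 5, 1]]
print(sys.getsizeof(nested), deep_size(nested))
```

Because the integer 1 appears twice but is a single shared object, it contributes to the total only once.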
The previous solution was more accurate than what we calculated initially, but unfortunately, it still has some caveats. With class instances, other elements that you might not think of (e.g., obj.__dict__ or obj.__slots__) might also be stored in memory. Tracking these elements manually is hard and sometimes impossible. A better way of finding all the elements attached to your object is to use a function from the Python Garbage Collector interface called gc.get_referents().
If you are not familiar with the Python Garbage Collector, I recommend reading this article. The Garbage Collector keeps track of objects and their associated elements in the Heap and removes them when the program does not need them anymore.
Here we can take advantage of the Garbage Collector interface (link) to find all elements linked to the object whose size in memory we want to know. The following code iterates through all objects and elements attached to the original object and adds their size to the total size of the object.
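The original listing is not reproduced in this text, so what follows is a minimal sketch of the approach under my own assumptions. The function name `actual_size` and the choice to skip classes, modules, and functions are illustrative; gc.get_referents() and sys.getsizeof() are the standard library calls.

```python
import gc
import sys
from types import FunctionType, ModuleType

# Assumption: classes, modules, and functions are shared by many objects,
# so they are not counted as part of any single object's size.
EXCLUDED_TYPES = (type, ModuleType, FunctionType)

def actual_size(obj):
    """Sum the sizes of obj and everything reachable from it, once each."""
    if isinstance(obj, EXCLUDED_TYPES):
        raise TypeError('actual_size() does not take an argument of type: '
                        + str(type(obj)))
    seen_ids = set()
    total = 0
    objects = [obj]
    while objects:
        need_referents = []
        for o in objects:
            if not isinstance(o, EXCLUDED_TYPES) and id(o) not in seen_ids:
                seen_ids.add(id(o))
                total += sys.getsizeof(o)
                need_referents.append(o)
        # Ask the garbage collector what these objects refer to directly.
        objects = gc.get_referents(*need_referents)
    return total

print(actual_size([1, 2, [3, 4, 5, 1]]))
```

For the nested list from the earlier example, this should give the same total as counting each distinct object once. For class instances, it also follows attributes that a manual walk over items would miss.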
I also found a good solution provided by the following article. Although the solution works for a limited set of objects, the solution looks solid.
Measuring the size of Python objects in memory is not an easy task. There is not a built-in and straightforward solution for finding the actual size of the objects. In this article, we learned why it isn't easy to measure the objects' actual size. Also, I provided a solution that works for many (not all) objects in Python.
Follow me on Twitter for the latest stories: https://twitter.com/TamimiNas
FAQs
What is the size of objects in memory in Python list? ›
When you create a list object, the list object by itself takes 64 bytes of memory, and each item adds 8 bytes of memory to the size of the list because of references to other objects.
How do I get the size of a Python object in memory? ›sys.getsizeof() returns the storage size of an object in bytes. It counts only the object itself, not the objects it references.
How do I fix memory error in Python? ›- Allocate More Memory.
- Work with a Smaller Sample.
- Use a Computer with More Memory.
- Use a Relational Database.
- Use a Big Data Platform.
A MemoryError means that the interpreter has run out of memory to allocate to your Python program. This may be due to an issue in the setup of the Python environment or it may be a concern with the code itself loading too much data at the same time.
What does __sizeof__ do in Python? ›The __sizeof__() method returns the internal size of an object in bytes. It does not include garbage collector overhead or the size of referenced objects; for a generator, for example, it cannot account for the values the generator will produce. sys.getsizeof() calls __sizeof__() and adds the garbage collector overhead when the object is managed by the garbage collector.
How to find memory leak in Python? ›You can detect memory leaks in Python by monitoring your Python app's performance via an Application Performance Monitoring tool such as Scout APM. Once you detect a memory leak, there are multiple ways to solve it.
How to check memory consumption in Python? ›Working with Python Memory Profiler
You can use it by putting the @profile decorator around any function or method and running python -m memory_profiler myscript. You'll see line-by-line memory usage once your script exits.
Memory allocation in Python
The function calls and the references are stored in the stack memory whereas all the value objects are stored in the heap memory.
You need to hold a reference to an object (i.e. assign it to a variable or store it in a list). There is no language support for going from an object address directly to an object (i.e. pointer dereferencing).
How do I free up memory in Python? ›Use del to remove a reference to an object, and call gc.collect() to ask the garbage collector to reclaim unreachable objects, including those caught in reference cycles. Thus, you can use a combination of del and gc.collect() to free memory held by objects you no longer need.
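Here is a minimal sketch of that combination. It builds two objects that refer to each other, removes the names with del, and then runs gc.collect(). The `Node` class and the weak reference used as a probe are assumptions for illustration.

```python
import gc
import weakref

class Node:
    """A tiny object that can hold a reference to another Node."""
    pass

a, b = Node(), Node()
a.other, b.other = b, a      # reference cycle: a -> b -> a

probe = weakref.ref(a)       # lets us check whether the object still exists

del a, b                     # names are gone, but the cycle keeps both alive
alive_before = probe() is not None

gc.collect()                 # the cycle collector reclaims unreachable cycles
alive_after = probe() is not None

print(alive_before, alive_after)
```

Without the cycle, del alone would be enough in CPython, because reference counting frees an object as soon as its last reference disappears.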
Do you have to worry about memory in Python? ›It is the manager keeping Python's memory in check, thus enabling you to focus on your code instead of having to worry about memory management. Due to its simplicity, however, Python does not provide you much freedom in managing memory usage, unlike in languages like C++ where you can manually allocate and free memory.
How do you find the size of a set in Python? ›To determine how many items a set has, use the len() function.
How do you find out what is causing a memory leak? ›To find a memory leak, look at how much RAM the system is using. The Resource Monitor in Windows can be used to accomplish this. In Windows 8.1 and Windows 10: To open the Run dialogue, press Windows+R, then type "resmon" and click OK.
Are memory leaks common in Python? ›Like other languages, Python can suffer from memory leaks. CPython's built-in memory manager and garbage collector handle most memory management. However, memory leaks still occur sometimes, for example when references to objects that are no longer needed are kept alive. It is a challenge for us programmers to solve this issue.
How do I know if I have a memory leak? ›The system can have a myriad of symptoms that point to a leak, though: decreased performance, a slowdown plus the inability to open additional programs, or it may freeze up completely.
How do you manage memory in Python? ›
In languages such as C and C++, the programmer has to manually allocate memory before it can be used by the program and release it when the program no longer needs it. In Python, memory management is automatic! Python automatically handles the allocation and deallocation of memory.
What is the command to check memory usage? ›Entering cat /proc/meminfo in your terminal opens the /proc/meminfo file. This is a virtual file that reports the amount of available and used memory. It contains real-time information about the system's memory usage as well as the buffers and shared memory used by the kernel.
How do I track memory usage? ›Check Computer Memory Usage Easily
To open up Resource Monitor, press Windows Key + R and type resmon into the search box. Resource Monitor will tell you exactly how much RAM is being used, what is using it, and allow you to sort the list of apps using it by several different categories.
No, they are in a different memory called “Heap Memory” (also called the Heap). To store objects, we need memory with dynamic memory allocation (i.e., the size of memory and objects can change). The Python interpreter actively allocates and deallocates memory on the Heap (which C/C++ programmers must do manually).
What is the memory size of Python? ›Python has a pymalloc allocator optimized for small objects (smaller or equal to 512 bytes) with a short lifetime. It uses memory mappings called “arenas” with a fixed size of 256 KiB.
How do I remove an object from memory in Python? ›Using the del statement. Running del on a name removes that name's reference to the object; once no references remain, CPython reclaims the object. If the class defines the special method __del__() (a finalizer, often called a destructor), Python calls it just before the object is destroyed.
Do Python objects get deleted? ›The object obj references will be deleted when the function exits. That's because CPython (the default Python implementation) uses reference counting to track object lifetimes. obj is a local variable and only exists for the duration of the function.
Why is Python memory inefficient? ›Python allocates its objects dynamically, which is a slower way of memory allocation than static allocation. Once static memory is allocated, its size cannot be changed and it cannot be reused, which makes it less efficient in space. With dynamic allocation, the memory size can be changed after allocation and the memory can be reused as well.
How do I check memory list size? ›Pass the list object to sys.getsizeof(), which returns the size of the list in bytes, including the garbage collector overhead.
How do I check memory usage in Python list? ›You can use it by putting the @profile decorator around any function or method and running python -m memory_profiler myscript. You'll see line-by-line memory usage once your script exits.
How is a Python list stored in memory? ›The list is based on an array. An array is a set of elements ① of the same size, ② located in memory one after another, without gaps. Since elements are the same size and placed contiguously, it is easy to get an array item by index. All we need is the memory address of the very first element (the “head” of the array).
How do you find the memory size of an array in Python? ›Using size and itemsize attributes of NumPy array
size: This attribute gives the number of elements present in the NumPy array. itemsize: This attribute gives the memory size of one element of the NumPy array in bytes. Multiplying the two gives the memory used by the array's data.
Does Python clean up memory? ›The memory Heap in Python holds the objects and other data structures used in the program. So, when a variable (a reference to an object) is no longer in use, the Python memory manager frees up the space, i.e. it removes the unnecessary object.
Does Python take up a lot of memory? ›Storing integers or floats in Python has a huge overhead in memory. Learn why, and how NumPy makes things better. Objects in Python have large memory overhead. Learn why, and what to do about it: avoiding dicts, fewer objects, and more.
How to check dict size in memory Python? ›
Finding the Size of the Dictionary in Bytes
The memory size of the dictionary object in bytes can be determined by the getsizeof() function. This function is available from the sys module. Like len() , it can be used to find the size of any Python object.
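Keep in mind that sys.getsizeof() on a dictionary reports only the dictionary's own table, not its keys and values. The following minimal sketch also shows why a dictionary's size does not grow linearly with its length: the table is resized in steps. The variable names are hypothetical, and the printed sizes depend on the Python version.

```python
import sys

d = {}
sizes = []
for i in range(100):
    d[i] = i
    # Record the shallow size of the dict after each insertion.
    sizes.append(sys.getsizeof(d))

# The distinct values are the steps at which the dict was resized.
distinct_sizes = sorted(set(sizes))
print(len(sizes), distinct_sizes)
```

One hundred insertions produce only a handful of distinct sizes, because the dictionary reserves room for future items each time it grows.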
All you have to do is add the decorator function and the process_memory function to your code, and this will give you the memory consumed by the code, along with the memory usage before and after it runs.
How do you find the memory location of an object in Python? ›Method 1: Using id()
We can get an address using the id() function. id() function gives the address of the particular object.
Empty list. When an empty list [] is created, no space for elements is allocated - this can be seen in PyList_New . 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.