Memory Management In Python: Revealing Python’s Secret

Hello Pythonistas, welcome back!🐍

Do you have a short-term memory loss🥴 problem?

Well, for me, I don’t even remember if I had my lunch.🥪😅

But today we are going to reveal how Python does memory management… (not just remembering but also forgetting things that are meant to be forgotten🙄)

Don’t worry, we’ll be taking it step by step, using an example of an empty library where you’re the librarian.📚👩‍🏫

So get ready, and let’s reveal Python’s memory management together!🚀💻

Previous post’s challenge’s solution

The solution to challenge:

from abc import ABC, abstractmethod
from customtkinter import *


class Character(ABC):
    @abstractmethod
    def ability(self):
        pass


class Unicorn(Character):
    def ability(self):
        return "Has the power to grant wishes, but only ones that involve glitter and rainbows."


class Yeti(Character):
    def ability(self):
        return "Has the power of invisibility, but only when it’s snowing and there are no other Yetis around to watch."


class Loch(Character):
    def ability(self):
        return "Has the power to summon a lake monster tsunami, but can only use it once every 1000 years."


class App(CTk):
    def __init__(self):
        super().__init__()
        # Each entry pairs a display name with a character object
        self.C = [
            ("Unicorn", Unicorn()),
            ("Yeti", Yeti()),
            ("Loch", Loch()),
        ]
        self.selected = IntVar()
        for i, character in enumerate(self.C):
            self.rb = CTkRadioButton(self, text=character[0], variable=self.selected, value=i, command=self.show_charObj)
            # Display the buttons with a padding(spacing) of 10 pixels
            self.rb.pack(pady=10)
        # Create an empty label to display the ability later
        self.show = CTkLabel(self, text="")
        # Display the label
        self.show.pack()

    def show_charObj(self):
        # Remove the previous display of the ability (if any)
        self.show.pack_forget()
        temp = self.C[self.selected.get()][1].ability()
        # Create a label to display the ability
        self.show = CTkLabel(self, corner_radius=25, text=temp)
        # Display the label with a padding of 10 pixels
        self.show.pack(pady=10)


if __name__ == '__main__':
    app = App()
    app.mainloop()

So, we have this cool little app that showcases the abilities of three different characters – a Unicorn🦄, a Yeti👹, and a Loch.

Each character has a unique ability that’s defined using a method called ability().

These characters are defined as subclasses of an abstract base class called Character, which is like a blueprint for creating new characters.

Now, when you open the app, you’ll see these three characters displayed as radio 🔘 buttons. All you need to do is select one of the radio buttons to see the corresponding character’s ability in a label🏷️. It’s that simple!

Behind the scenes, the app first creates a list called self.C that contains the name of each character🦄👹 and the character’s object.

Then, it creates radio🔘 buttons for each character in the 📃 list self.C. The variable self.selected keeps track of which radio button is selected by the user.

When you select a radio button, the app calls the show_charObj() method. This method removes 🗑️ the previous display of the character’s ability (if any), retrieves the ability of the selected character🦄👹 from the self.C list, and displays it in a label.

That’s all there is to it!💁

Now let’s delve into Python’s memory management.

Memory Is an Empty Library

Before delving into memory management, let’s see what memory is in the first place.🤔

Let’s imagine the computer’s memory is an empty library📚, waiting to be filled with books📖. When you turn on your computer, there are no👎 books in the library, just like there is no👎 data in memory.

As you start using different applications, books will be added to the library, and the memory will be occupied. 📚👨‍💻 But just like authors can’t randomly place any book anywhere in a library, applications need the memory manager’s permission to use memory.

Over time⏱️, some books in the library will become obsolete or go out of use. These books need to be removed to make room for new ones. Similarly, memory not in use needs to be freed up.🗑️

As the librarian👩‍🏫, you play two important roles:

  • deciding where each book will be placed in the library (the memory manager), and
  • removing books that are no longer useful (the garbage collector).🧑‍🏫📚

In essence, the computer’s memory is like an empty library, waiting to be filled with countless amazing🤩 books. With your help, the books can be organized and managed effectively, allowing the computer to run smoothly.🚀

Moving on to the next level of memory management.

From Hardware to Software: Understanding Memory Management

Awesome🎉, you’ve got the basics of memory management down!

In our library analogy📚, providing space for books is like allocating memory💾, while removing books no longer in use is like freeing memory🗑️.

But have you ever wondered where memory comes from and where it goes🚶‍♂️?

Memory is like a magical✨ physical place where all your data and instructions are stored💾. It’s like a secret🕵️‍♀️ hideaway inside your hardware devices, such as your RAM or hard disk.

As your Python code interacts with memory, each request passes through several layers of abstraction: from the hardware, to the operating system (which shares memory among many other applications), and finally to the Python application that manages memory for your code.

So, let’s dive into memory management by the Python application.

Python’s Default Implementation: What You Need to Know

So, you may be thinking🤔, what the heck does it mean to implement Python?

Well, it means creating a program or software that interprets and executes Python code. It’s like building a special house🏠 where Python 🐍 can live and play 🎉.

The default implementation of Python is called CPython, and it is written in the C programming language. Sounds strange, right?🤯

Your Python code is first compiled into more computer-readable instructions called bytecode, which a virtual machine then interprets when you run your code.

The .pyc files you see inside the __pycache__ folder contain this bytecode.
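If you are curious what that bytecode looks like, the standard library's dis module can show it. This is just a small sketch: the add function is a made-up example, and the exact instruction names you see depend on your Python version.

```python
import dis

# A tiny made-up function to inspect
def add(a, b):
    return a + b

# The compiled bytecode lives on the function's code object as raw bytes
raw_bytecode = add.__code__.co_code

# dis turns those raw bytes into readable instruction names
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

print(type(raw_bytecode))
print(opnames)
```

The list of names is what the virtual machine steps through when you call add().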

There are various other implementations of Python:

  • Jython (written in Java)
  • IronPython (written in C#)
  • PyPy (written in RPython, a restricted subset of Python, and equipped with a just-in-time compiler)

In this article, I’ll only discuss CPython.

You need to understand CPython as it is the hero 🦸 of the story. After all, memory management in Python depends on the data structures and algorithms of CPython.

You must have heard at least once that “everything in Python is an object”. That is true at the implementation level.

Every object in CPython starts with a C struct called PyObject, so PyObject is the grand-daddy👴 of all the Python objects.

Like any other grand-daddy, it has 2 major traits:

  • ob_refcnt: reference count 🔢
  • ob_type: pointer to the object’s type🔍

struct is a user-defined datatype in C. It is like a class with only attributes and no methods.

The reference count tells CPython when an object is no longer in use, which is how Mr. garbage collector 🧹 knows what to clean up. The type pointer simply points to another struct that describes the object’s datatype.
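You can't see the C struct from Python, but you can see its effect: every value, whatever it is, reports a type. Here is a small sketch (shelve and Book are made-up names for illustration):

```python
def shelve():
    return "shelved"

class Book:
    pass

# Numbers, strings, lists, functions, classes, instances and None
samples = [42, "python", [1, 2, 3], shelve, Book, Book(), None]

for value in samples:
    # type() reports what the object's type pointer refers to
    print(f"{value!r:40} -> {type(value).__name__}")

# Every one of them is an object
all_objects = all(isinstance(value, object) for value in samples)
print(all_objects)

# Even classes are objects: the type of a class is `type`
print(type(Book) is type)
```

So the function, the class, and even None all carry the same two grand-daddy traits.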

Memory needs to be allocated and deallocated with care. Imagine two threads of the same Python program updating the same piece of memory at once. It would be like handing the same memory space to ChatGPT 💬 and YouTube 📺: one would try to write answers and the other would try to play a video in the same spot.

To avoid such traffic issues, CPython uses the GIL 🚦 (our traffic control cop) 👮‍♂️…

Understanding Python’s Global Interpreter Lock (GIL)

Now, about our traffic 🚦 control cop. I mean Global Interpreter Lock.🔒

As the name suggests (both names), it addresses the challenge of managing 💼🤝👮‍♂️ shared resources, such as a computer’s memory.

To stop two threads from stepping on each other’s toes, the GIL allows only one thread to execute Python code at a time within a Python process.👊

It’s like when two authors want their books placed at the same position in the library📚. Both can’t have it at once, so you, the librarian, let only one of them place a book at a time.

So, our hero🦸 CPython takes the help of Python’s cop, the GIL, so that the application’s threads don’t get into a nasty fight with one another, and memory management can go on safely.

The GIL locks the entire interpreter, meaning that no other thread can run Python code until the current one releases the lock.
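Here is a small sketch of several threads sharing one object. The names shelf and add_books are made up for this example. In CPython each single append completes safely even though four threads are hitting the same list.

```python
import threading

shelf = []  # one list object shared by every thread

def add_books(author, count):
    # Each append is a single list operation on the shared object
    for number in range(count):
        shelf.append((author, number))

threads = [
    threading.Thread(target=add_books, args=(f"author-{i}", 1000))
    for i in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(shelf))
```

Keep in mind this only protects the interpreter's own bookkeeping. If your code does a multi-step update on shared data (read, change, write back), you would still want a threading.Lock around it.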

The use of our cop, the GIL, is heavily debated in the Python community. You can read these Reddit threads to know more📖🔍.

Garbage collector: python’s broom

You know what they say about books📚 that nobody is reading? It’s time⏰ to let them go!

We saw that every object in Python has two things:

  • a reference count and🔢
  • a pointer to its type👉

Whenever an object’s reference count drops to 0, it qualifies for being dumped🗑️ from memory, just like those books.📚

Here are a few ways in which the reference count of any object increases📈:

  1. If you assign the object to another variable
li = [1, 2, 3]
# Reference count = 1
new_li = li
# Reference count = 2
  2. If you pass the object as an argument (the extra reference lasts for the duration of the call)
total = sum(li)
  3. If you use it in a list
matrix = [li, li, li]

Well, there are a lot of other ways in which the reference count increases; these were just a few.🤏

If you want to know the current reference count of an object, then you don’t need to strain your eyes🤓 and count.👆✌️…

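sys.getrefcount() will do the job👌 for you, but remember‼️ the number it reports is generally one higher than you might expect, because passing the object to the function temporarily adds a reference.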
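To see the counter move, here is a small sketch. Since the absolute number includes that temporary reference (and can vary between Python versions), it compares each reading against a baseline instead of trusting the raw value.

```python
import sys

li = [1, 2, 3]
base = sys.getrefcount(li)        # baseline, includes the call's own temporary reference

new_li = li                       # one more name for the same list
after_alias = sys.getrefcount(li)

matrix = [li, li, li]             # three more references stored inside another list
after_list = sys.getrefcount(li)

del new_li, matrix                # drop all the extra references again
after_del = sys.getrefcount(li)

print(after_alias - base, after_list - base, after_del - base)
```

Each new name or container slot adds one to the count, and deleting them brings the count back to the baseline.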

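Now, you must have guessed that Mr. Garbage🧹 collector collects objects with a reference count of 0 and thus frees up the memory.🗑️ In CPython this happens as soon as the count reaches 0. A separate cycle collector (the gc module) takes care of objects that only refer to each other.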
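Here is a small sketch of both clean-up paths in CPython, using weak references to check whether an object is still around (a weak reference does not raise the count). Book is a made-up class, and the automatic cycle collector is paused for a moment so the demo is predictable.

```python
import gc
import weakref

class Book:
    pass

# Case 1: the last reference goes away, so the object is freed right away
book = Book()
watcher = weakref.ref(book)
del book
freed_immediately = watcher() is None

# Case 2: two objects that refer to each other never reach a count of 0
was_enabled = gc.isenabled()
gc.disable()                      # pause automatic collection for the demo
a, b = Book(), Book()
a.other, b.other = b, a
cycle_watcher = weakref.ref(a)
del a, b
survived_del = cycle_watcher() is not None

gc.collect()                      # the cycle collector finds and frees the pair
collected = cycle_watcher() is None
if was_enabled:
    gc.enable()

print(freed_immediately, survived_del, collected)
```

The second case is why reference counting alone isn't enough: the pair keeps each other alive until the cycle collector steps in.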

But what does freeing up the memory mean? Memory isn’t a prisoner🦹‍♂️ at the end of the day.

Let’s take a peek👀 into CPython’s memory management.

Crafting Python’s Memory: A Peek into CPython’s Memory Management

CPython’s Memory Management

Our hero🦸 CPython doesn’t directly get access to the physical💾 memory in the computer.

As said before, the operating system abstracts this memory.

Our hero🦸 gets a chunk of virtual memory that the OS-specific memory manager carves🍰 out for it.

Everything in Python is an object for us, but our hero also has to do some internal, non-object stuff. So, CPython divides the chunk of memory it has into two✌️ portions:

  • Object related
  • Internal and Non-Object related.

Now, I have written this in simple words; you can read the official doc (I have linked it below) to know every detail.

Within the object-related area, the object allocator is the boss🤵. It gets called every time an object needs memory to be:

  • allocated or
  • freed.

It is a kind😊 boss who specializes in small amounts of data (requests of up to 512 bytes in current CPython versions). It is smart🤵 and efficient too, as it allocates memory only when required.

There are three✌️👆 main parts of CPython’s memory allocation:

  • Arenas,
  • Pools, and
  • Blocks

Before understanding these, you should first know the:

  • malloc() and
  • free()

functions of the C language.

malloc() requests a block of memory for our program to use. free() hands a previously requested block back to the allocator so that it can be reused.

Now, let’s see pools🏊 first.

Pools

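A pool (not a swimming🏊‍♂️ pool of course) is a chunk of memory split into blocks of one particular size, for example a pool made up entirely of 8-byte blocks. Pools that hold the same block size are linked to one another in a doubly linked list.

There are multiple pools like this (not to dive in) for different block sizes. The size depends on how many bytes are requested, not on the class of the object, so an int, a str, and an instance of your own class can land in the same kind of pool if they need the same amount of memory.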

These swimming pools can be in 3 states:

  • 👥 Used
  • 🏊‍♀️ Full
  • 🚫 Empty

Used pools are the ones that still have room for more data.

Full pools are the ones that cannot hold any more data.

Empty pools are the ones that don’t hold any data.

Whenever a block of memory is requested, the linked list of used pools is checked first. If one of them has space, the block is allocated from it; otherwise our hero CPython turns to an empty pool.

Whenever the hero takes space from an empty pool, that pool is added to the used pools.

And whenever a block in a full pool is freed, that pool goes back to the used pools.

That’s how these pools switch their states.🔄
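The state changes described above can be sketched with a toy model. To be clear, ToyPool is a made-up class for illustration only; CPython's real pools are C structs, not Python objects.

```python
class ToyPool:
    """A toy stand-in for a pool: a fixed number of equal-sized blocks."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.block_count = block_count
        self.free_blocks = list(range(block_count))  # indexes of unused blocks

    @property
    def state(self):
        if len(self.free_blocks) == self.block_count:
            return "empty"   # nothing handed out
        if not self.free_blocks:
            return "full"    # nothing left to hand out
        return "used"        # some blocks taken, some available

    def allocate(self):
        if not self.free_blocks:
            raise MemoryError("pool is full")
        return self.free_blocks.pop()

    def free(self, block):
        self.free_blocks.append(block)


pool = ToyPool(block_size=8, block_count=2)
states = [pool.state]        # before any allocation
first = pool.allocate()
states.append(pool.state)    # one block taken
second = pool.allocate()
states.append(pool.state)    # both blocks taken
pool.free(first)
states.append(pool.state)    # one block handed back
print(states)
```

The pool moves from empty to used to full as blocks are taken, and back to used as soon as one block is returned.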

Blocks

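Blocks are the fixed-size slots that a pool is divided into.

These are the smallest🔎 units of memory; each one is a fixed number of bytes.

Well, just like pools, they have 3 states:

  • Untouched: has never been allocated
  • Free: was allocated earlier, but has since been freed
  • Allocated: currently holds data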
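Because blocks come in fixed sizes, a request gets rounded up to the nearest block size. Here is a rough sketch of that rounding. The function name block_size_for is made up, the 8-byte step follows the example used in this post (newer 64-bit builds may use 16), and requests above 512 bytes are assumed to skip the pools entirely.

```python
SMALL_REQUEST_THRESHOLD = 512  # bytes; bigger requests are assumed to bypass the pools

def block_size_for(request_bytes, alignment=8):
    """Round a positive request up to the block size that would serve it."""
    if request_bytes <= 0:
        raise ValueError("request must be positive")
    if request_bytes > SMALL_REQUEST_THRESHOLD:
        return None  # left to the general-purpose allocator instead
    # Round up to the next multiple of `alignment`
    return ((request_bytes + alignment - 1) // alignment) * alignment

for request in (1, 8, 9, 20, 512, 513):
    print(request, "->", block_size_for(request))
```

So a 20-byte request would sit in a 24-byte block under these assumptions, with a few bytes left unused.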

Arena

An arena has a lot of swimming🏊‍♂️ pools.

Now, arenas do not❌ have any explicit states like pools and blocks. Instead, they are kept in a sorted doubly linked list called usable_arenas. The one with the fewest free pools comes first👆 and so on.
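That ordering can be sketched in a few lines. ToyArena and the pool counts are invented for illustration; the real list lives in C inside CPython.

```python
from dataclasses import dataclass

@dataclass
class ToyArena:
    name: str
    free_pools: int  # how many pools in this arena are still unused

arenas = [ToyArena("A", 40), ToyArena("B", 3), ToyArena("C", 17)]

# Fewest free pools first, so the fullest arena sits at the front
usable_arenas = sorted(arenas, key=lambda arena: arena.free_pools)

# New memory is taken from the arena at the front of the list
chosen = usable_arenas[0]
print([arena.name for arena in usable_arenas], "->", chosen.name)
```

The almost-full arena B gets picked first, which leaves the emptier arenas alone.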

But why? Wouldn’t the opposite make more sense🤔 after all?

And that brings us to the idea💡 of what freeing up memory even means.

Python doesn’t hand freed blocks straight back to the operating system🖥️. It keeps that memory to itself for later use.

Memory truly goes back to the operating system only when an entire arena is released, and that requires every pool in it to be empty. By filling the fullest arenas first, CPython gives the emptier arenas a chance to become completely empty and be released.

While freeing up empty arenas can help reduce memory usage, it’s not always necessary or practical to do so.

CPython has its own mechanisms for managing memory, and it’s generally best to let the interpreter handle memory allocation and deallocation.
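If you just want to watch how much memory your Python objects are holding, the standard library's tracemalloc module can help. A small sketch follows; note that it measures memory that Python has allocated for objects, not what has been handed back to the operating system.

```python
import tracemalloc

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# Allocate a pile of small string objects
books = [str(n) * 10 for n in range(50_000)]
with_books, _ = tracemalloc.get_traced_memory()

# Drop the only reference so the objects can be freed
del books
after_delete, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"before: {before} bytes")
print(f"with books: {with_books} bytes")
print(f"after delete: {after_delete} bytes (peak {peak})")
```

The traced amount drops after the del, even though CPython may keep the underlying arenas around for later use.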

Official python doc on CPython

Conclusion

I hope I was able to get memory management right into your memory🧠.

We first started off getting around the 2 basic tasks related to memory management:

  • 👉allocating and
  • 👉freeing memory.

Next, we saw that there are multiple levels between your Python code and the actual physical memory.

Then, we went on to see memory management by the Python application. Here we saw that Python’s default implementation is written in C.

Further, we met the traffic🚦 police of CPython’s memory management, the GIL. It is responsible for keeping memory management fight-free.

After this, we took a deeper insight into the garbage collector♻️ in CPython’s memory management.

Finally, we saw in-depth our hero🦸‍♂️ CPython’s memory management and saw what freeing up memory means.

I hope I was able to give you a comprehensive insight into Python’s memory management…


Challenge 🧗‍♀️

Your challenge is to pin down a major difference between CPython and PyPy’s memory management.

Stay happy😄, keep coding, and do suggest improvements if you have any.👍

Take care and have a great😊 time. I’ll see you soon in the next post…Bye Bye👋
