Memory Leak

What is it?

  • A memory leak happens when an application keeps allocating memory for objects it no longer needs but never releases it, so the available RAM shrinks steadily over the application's lifetime

What does Memory Leak cost me?

  • It is almost impossible to scale up while a memory leak exists: any extra capacity is eventually consumed by the leak

  • It also degrades the experience of current users

  • Example:

    • In my current organization, we run a 5-node cluster for different services

    • One of the applications, deployed on 1 node, had a leak

    • Because of the leak, that application gradually ate up all the RAM on its node

    • Hence, the other applications were effectively running on only 4 nodes

    • And the leaking application had to be restarted every time it exhausted the RAM

    • This hampered the experience of current users, cost us unnecessarily for a bigger server, and blocked us from scaling

Possible Memory Leak cases?

  • Some low-level C library is leaking

    • We can skip this case for now (it comes from how CPython itself is implemented)

  • Python code keeps global lists or dicts that grow over time because objects are never removed after use

    • Need to find the source of the growth and fix the leak

  • There are some reference cycles in the app

    • These are handled automatically by Python's garbage collector (its cycle detector)

  • Multiple instances of a very heavy package are created within one application

    • This can be fixed by creating the instance globally once and using it everywhere
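As a minimal sketch of the second case above — a module-level cache that only ever grows — together with one possible fix that bounds its size. The names `leaky_lookup`, `BoundedCache`, and `expensive_compute` are illustrative, not from any real library:

```python
from collections import OrderedDict

def expensive_compute(key):
    # Stand-in for real work whose result we want to cache
    return key * 2

# Leaky pattern: every key ever seen stays alive for the process lifetime
_leaky_cache = {}

def leaky_lookup(key):
    if key not in _leaky_cache:
        _leaky_cache[key] = expensive_compute(key)
    return _leaky_cache[key]

# Fix: bound the cache so the oldest entries are evicted
class BoundedCache:
    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)      # mark as recently used
            return self._data[key]
        value = expensive_compute(key)
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)   # evict the oldest entry
        return value
```

With the bounded version, memory usage stays flat no matter how many distinct keys pass through the application.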

How to identify Memory Leak?

  • The most common symptom is the server running out of free memory

  • Track the server's available RAM for 2-3 days after the application has been deployed

  • Use a profiling tool that shows which part of the application is using how much RAM
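In Python, the standard library's tracemalloc module can serve as such a profiling tool. A small sketch, where the "leak" is simply a list that only ever grows:

```python
import tracemalloc

tracemalloc.start()

# Simulate a leak: a module-level list that only grows
leaked = []
for i in range(10_000):
    leaked.append("payload-%d" % i)

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")
for stat in top[:3]:
    print(stat)  # file:line, total size, and allocation count
```

Taking two snapshots some time apart and diffing them (`snapshot2.compare_to(snapshot1, "lineno")`) points directly at the lines whose allocations keep growing.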

How to handle Memory Leak?

  • Manually dispose of resources that are no longer needed (even though a reference to the resource still exists).

    • Nearly all languages include resource types that aren't automatically freed.

    • You need to write specific code that tells the application the resource's work has finished
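A minimal Python sketch of manual disposal, using an in-memory buffer as the resource: calling close() releases it even though the variable still references the object, and a with-block does the same automatically:

```python
import io

# Explicit disposal: close() releases the underlying buffer even though
# the variable `buf` still references the (now unusable) object
buf = io.StringIO()
buf.write("some data")
buf.close()

# Idiomatic: a with-block guarantees release even if an exception occurs
with io.StringIO() as buf2:
    buf2.write("some data")

assert buf.closed and buf2.closed
```

The same pattern applies to files, sockets, and database connections: prefer the with-block so the release code runs on every exit path.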

  • Most languages are equipped with an automatic memory management system called a garbage collector, which frees memory that the application no longer needs.

    • In CPython, when an object's reference count drops to zero, it is freed immediately; a separate cycle collector reclaims objects trapped in reference cycles
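A small sketch of why the cycle collector is needed: two objects that reference each other never reach a reference count of zero, yet gc.collect() still reclaims them. A weakref is used only to observe whether the objects are still alive:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.ref = None

gc.disable()  # demo only: keep automatic collection from interfering

a, b = Node(), Node()
a.ref, b.ref = b, a          # reference cycle: a -> b -> a
probe = weakref.ref(a)

del a, b
assert probe() is not None   # refcounts never hit zero: still alive

gc.collect()                 # the cycle detector reclaims both nodes
assert probe() is None

gc.enable()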

  • Within an application, if a single instance of a package is enough, create it globally (e.g. in a config module) and import that instance wherever it is required

    Example: spaCy's "en_core_web_lg" model is fairly large, close to 3 GB in size; we were using 3 instances of it earlier. Once we identified the issue, we switched to a single, globally declared instance
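One way to sketch this single-global-instance pattern in Python is a cached factory function; `HeavyModel` below is a hypothetical stand-in for a heavy object such as a loaded spaCy model:

```python
import functools

class HeavyModel:
    """Hypothetical stand-in for a heavy object (e.g. a large NLP model)."""
    def __init__(self):
        self.weights = [0.0] * 100_000  # pretend this is gigabytes of data

@functools.lru_cache(maxsize=1)
def get_model():
    # Built once on the first call; every later call returns the same object
    return HeavyModel()

m1 = get_model()
m2 = get_model()
assert m1 is m2  # one instance shared application-wide
```

Any module can then call `get_model()` freely: the heavy object is constructed only once, no matter how many call sites exist.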

Few More Concepts

To check an object's memory location, we can use hex(id(<value>)); it returns the address of the object

# Since x and y hold the same small value, they point to the same memory location
x = 1
y = 1
hex(id(x))
hex(id(y))
  • Python uses a process called "interning": it stores only one copy of the object on the heap and makes different variables point to that same memory address when they use the same value

  • Interning applies only to small integers (-5 to 256) and some strings; it does not apply to large integers, floats, lists, dictionaries, or tuples.
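A quick sketch of interning in CPython: small equal integers share one object, while equal large integers built at runtime do not. The int("...") calls are used to construct the values at runtime, since large integer literals in the same code block can be merged by compile-time constant folding:

```python
# Small integers (-5 to 256) are cached/interned by CPython
a = 100
b = 100
assert a is b                  # same object, same memory address

# Large integers built at runtime are separate objects
x = int("1000000")
y = int("1000000")
assert x == y and x is not y   # equal values, different addresses
print(hex(id(x)), hex(id(y)))
```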

Common Ways to Reduce the Space Complexity

to be updated....

References for Further Reading
