Python Notes: Global Interpreter Lock

Why Python uses the GIL

Python uses reference counting for memory management. Each object in Python has a reference count that keeps track of the number of references to that object. When the reference count drops to 0, Python releases the object from memory.
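The counter can be observed from Python code in CPython. The following is a small sketch using sys.getrefcount; the helper name refcount_delta is made up for illustration, and because the absolute numbers reported by getrefcount differ between interpreter versions, only the differences are inspected.

```python
import sys

def refcount_delta():
    """Return how the reference count changes when a second name is bound."""
    data = []                        # a fresh list object
    before = sys.getrefcount(data)   # absolute value is implementation-specific
    alias = data                     # bind a second name to the same object
    after = sys.getrefcount(data)
    del alias                        # dropping the name gives the reference back
    restored = sys.getrefcount(data)
    return after - before, restored - before

print(refcount_delta())
```

The first number shows the change after binding the extra name, the second the change after deleting it again.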

However, in a multithreaded program, multiple threads might access the same object, and a race condition could corrupt its reference count. Objects that should be released could then stay in memory or, in the worst case, objects that are still in use could be released incorrectly.

To solve this problem, Python introduced the GIL, a global lock at the interpreter level. The rule is that any Python code has to acquire this lock before it can execute. You might ask: why not add one lock to each object? With many fine-grained locks, threads could acquire them in conflicting orders, which could result in deadlock.

In this way, Python guarantees that only one thread at a time can change an object's reference count.

Problems with the GIL

The GIL solution, however, means that Python threads cannot make use of multiple CPUs. If your code is CPU intensive, Python multithreading will not help you at all. However, if your program is I/O intensive rather than CPU intensive, for example a network application, Python threads are still a good choice, because a thread releases the GIL while it waits for I/O.
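As a rough sketch of the I/O-bound case, the snippet below uses time.sleep as a stand-in for a network call. The function name fake_download and the 0.2 second delay are assumptions for illustration. A blocking call such as sleep releases the GIL, so the waits can overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(task_id, delay=0.2):
    """Pretend to wait on the network; sleeping releases the GIL."""
    time.sleep(delay)
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_download, range(4)))
elapsed = time.perf_counter() - start

# The four waits overlap, so the total is usually well below 4 * 0.2 seconds.
print(results, round(elapsed, 2))
```

Replacing the sleep with a pure-Python computation would typically remove the benefit, since only one thread can run Python bytecode at a time.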

Solutions to this problem?

Are there solutions to the problem? Yes, there are. The Python community has tried many times to solve it. The GIL is not really tied to the Python language itself; it is tied to the Python interpreter. So, if we change the underlying interpreter, Python can support true multithreading. For example, Jython, which is implemented in Java, does not have a GIL.

However, another important reason why the GIL has not been removed is that Python has many extension libraries written in C. Those libraries work well with Python because they don't need to worry about multithreading models; the GIL model is really easy to integrate with. Moving those libraries to other interpreters is hard work.

Another solution is to use multiple processes instead of multiple threads. Python has well-designed libraries that support this, such as the multiprocessing module. However, processes are more expensive for the operating system to create and manage than threads, and data has to be copied between them, so a multiprocess program carries more overhead than a multithreaded one.

Python Notes: Decorator

By definition, a decorator is a function that takes another function and extends the behavior of the latter function without explicitly modifying it. Built-in decorators such as @staticmethod, @classmethod, and @property work in the same way.

How does a decorator work? Let's take a look at the following example:

def my_decorator(some_func):
    def wrapper():
        some_func()  # call the original function first
        print("Some function is being called.")

    return wrapper

def another_func():
    print("Another function is being called.")

foo = my_decorator(another_func)
foo()


Calling foo() prints the following on the screen:

Another function is being called.
Some function is being called.

As you can see, we pass some_func into a closure, do something before or after calling it without modifying its original behavior, and return the closure. As we already learned, Python functions are first-class objects just like any other Python object, so the returned function can be called like any other function.


The above example is already very close to a decorator. The difference is that a decorator is usually applied with the @ symbol. This is an example of Python syntactic sugar, that is, syntax that makes things easier to read or to express. Writing @my_decorator above a function definition is equivalent to another_func = my_decorator(another_func). For example, the following is a decorator in use:

@my_decorator
def another_func():
    print("Another function is being called.")

Decorators that take arbitrary arguments

In Python, we can use *args and **kargs to accept arbitrary positional and keyword arguments. The following example shows how to forward arbitrary arguments in a decorator:

def proxy(func):
    def wrapper(*args, **kargs):
        return func(*args, **kargs)
    return wrapper

Decorators with parameters

Sometimes we want the decorator itself to take parameters. In that case we need one more level of nesting: an outer function that accepts the parameters and returns the real decorator:

def decorator(argument):
    def real_decorator(function):
        def wrapper(*args, **kwargs):
            # 'argument' is available here through the closure
            result = function(*args, **kwargs)
            return result
        return wrapper
    return real_decorator
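To make the three nested levels concrete, here is a minimal sketch of a parameterized decorator. The names repeat, times and greet are made up for illustration and do not come from any library.

```python
import functools

def repeat(times):
    """Decorator factory: 'times' is the decorator's own parameter."""
    def real_decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            # Call the wrapped function 'times' times and collect the results.
            return [function(*args, **kwargs) for _ in range(times)]
        return wrapper
    return real_decorator

@repeat(3)  # same as: greet = repeat(3)(greet)
def greet(name):
    return "Hello, " + name

print(greet("Ann"))
```

Note that repeat(3) is evaluated first and returns real_decorator, which is then applied to greet.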

Decorator tips

One practical tip when defining a decorator is to use functools.wraps. It copies the metadata of the original function, such as its name and docstring, onto the wrapper, so the decorated function still looks like the original to tools that inspect it.

import functools

def uppercase(func):
    @functools.wraps(func)  # keep func's name and docstring on the wrapper
    def wrapper():
        return func().upper()
    return wrapper
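The effect of functools.wraps is easiest to see by comparing a decorator that uses it with one that does not. In this sketch the names plain, preserving and the two hello functions are illustrative only.

```python
import functools

def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def preserving(func):
    @functools.wraps(func)  # copies __name__, __doc__ and more onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def hello_plain():
    """Say hello."""
    return "hello"

@preserving
def hello_wrapped():
    """Say hello."""
    return "hello"

print(hello_plain.__name__, hello_plain.__doc__)
print(hello_wrapped.__name__, hello_wrapped.__doc__)
```

Without wraps, the decorated function reports the wrapper's name and loses its docstring, which makes debugging and generated documentation harder to read.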

Python Notes: Functions as first-class objects

According to The History of Python blog, everything in Python is a first-class object. That means all objects that can be named in the language (e.g., integers, strings, functions, classes, modules, methods, etc.) have equal status. That is, they can be assigned to variables, placed in lists, stored in dictionaries, passed as arguments, and so forth.

Essentially, functions return a value based on the given arguments. In Python, functions are first-class objects as well. This means that functions can be passed around and used as arguments, just like any other value (e.g., a string, int, or float).
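A short sketch of what this means in practice: a function is assigned to a variable, stored in a dictionary, and passed as an argument. The function names add, multiply and apply are made up for this example.

```python
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

def apply(func, a, b):
    """Call whatever function is passed in."""
    return func(a, b)

plus = add                                  # assigned to a variable
operations = {"add": add, "mul": multiply}  # stored in a dictionary

print(plus(2, 3))
print(apply(multiply, 2, 3))                # passed as an argument
print([op(4, 5) for op in operations.values()])
```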

Internally, Python uses a common C data structure everywhere in the interpreter to represent all objects, whether the object is a Python function or an integer.

However, when it comes to functions as first-class objects, there are subtle things to think about in the design, particularly for methods.

Consider the following class definition:

class A:
    def __init__(self, x):
        self.x = x

    def foo(self, bar):
        print(self.x, bar)

What happens if you assign the method to a variable through the class: b = A.foo? The first argument of the function still has to be an instance. To handle this, Python 2 returns an unbound method, which is a wrapper around the original function that requires the first argument to be an instance of the class: a = A(1), b(a, 2). In Python 3, however, this restriction was removed because it was not found to be very useful, and A.foo is simply the plain function.

Let’s think about the second case, when you have an instance of the class: a = A(1), b = a.foo. In this case, Python returns a bound method, which is a thin wrapper around the original function. The bound method stores the instance internally and passes it as the first argument whenever the method is called.
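The difference can be checked directly in Python 3. The sketch below repeats a class like A so that it is self-contained, with foo returning a tuple instead of printing to make the result easy to inspect; the variable names plain and bound are illustrative.

```python
class A:
    def __init__(self, x):
        self.x = x

    def foo(self, bar):
        return (self.x, bar)

a = A(1)

plain = A.foo   # Python 3: a plain function, no instance attached
bound = a.foo   # a bound method that remembers the instance

print(plain(a, 2))  # the instance must be passed explicitly
print(bound(2))     # the instance is supplied automatically
print(bound.__self__ is a, bound.__func__ is A.foo)
```

The attributes __self__ and __func__ expose the stored instance and the underlying function of a bound method.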

Python Notes: Iterator, Generator and Co-routine


Python supports iteration, for example, iterating over a list:

for elem in [1, 2, 3]:
    print(elem)

Iterating over a dict:

for key in {'Google': 'G',
            'Yahoo': 'Y',
            'Microsoft': 'M'}:
    print(key)

Iterating over a file:

with open('path/to/file.csv', 'r') as file:
    for line in file:
        print(line)

We consume iterable objects in many ways: reductions such as sum(s) and min(s), constructors such as list(s), and the in operator, as in item in s.

We can iterate over these objects because of the iteration protocol: any object that implements __iter__() is an iterable, and the iterator it returns must implement __next__() (named next() in Python 2). For example, we can define an object that is its own iterator in the following way:

class count:
    def __init__(self, start):
        self.count = start

    def __iter__(self):
        return self

    def __next__(self):  # named next() in Python 2
        if self.count <= 0:
            raise StopIteration
        r = self.count
        self.count -= 1
        return r

We can use the above example in this way:

c = count(5)
for i in c:
    print(i)
# 5, 4, 3, 2, 1
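Under the hood, a for loop does roughly the following: it calls iter() once and then calls next() repeatedly until StopIteration is raised. Here is a sketch of that expansion using a plain list; the variable names are illustrative.

```python
items = [1, 2, 3]
seen = []

it = iter(items)         # calls items.__iter__()
while True:
    try:
        elem = next(it)  # calls it.__next__()
    except StopIteration:
        break            # a for loop stops silently at this point
    seen.append(elem)

print(seen)
```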


So what is a generator? By definition: a generator is a function that produces a sequence of results instead of a single value.

A generator function behaves very differently from a normal function: calling it creates a generator object but does not execute the body until next() is called on that object. The following is an example of a generator:

def count(n):
    while n > 0:
        yield n
        n -= 1

c = count(5)

Note that when we first create the generator with count(5), nothing executes. The body starts running the first time we call next(c). It then suspends at the yield statement until next() is called again.
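This lazy behaviour can be made visible by recording when the body actually runs. The sketch below is a variant of count with an added log list, which is an assumption for illustration and not part of the original example.

```python
log = []

def count(n):
    log.append("started")  # runs only when the first value is requested
    while n > 0:
        yield n
        n -= 1

c = count(2)
created_log = list(log)    # snapshot taken before any next() call

first = next(c)            # runs the body up to the first yield
second = next(c)           # resumes after the yield and yields again
print(created_log, first, second, log)
```

The snapshot taken right after count(2) is still empty, which shows that creating the generator did not run any of its code.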

In other words, a generator is a convenient way of writing an iterator without having to implement the iterator protocol yourself.

Besides yield-based generator functions, Python also supports generator expressions:

a = [1, 2, 3, 4]
b = [x*2 for x in a]
c = (x*2 for x in a)

b is still a regular list, while c is a generator.
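One practical consequence, sketched below with the same variables, is that a generator can be consumed only once, while a list can be traversed repeatedly. The names first_pass and second_pass are illustrative.

```python
a = [1, 2, 3, 4]
b = [x * 2 for x in a]  # list: built immediately, can be reused
c = (x * 2 for x in a)  # generator: values are produced on demand

first_pass = list(c)    # consumes the generator
second_pass = list(c)   # nothing left to produce

print(b, first_pass, second_pass)
```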


A Python coroutine is very similar to a generator. Consider the following pattern:

def receive_count():
    try:
        while True:
            n = (yield)  # yield expression: receives the value passed to send()
            print("T-minus", n)
    except GeneratorExit:
        print("Exit from generator.")

This form of generator is called a coroutine. A coroutine differs from a generator in that it receives data instead of producing it. Think of it as a consumer or receiver.

To use a coroutine, you need to call next() on it first so that the function runs up to the yield expression; then you can use send() to pass a value into the function. For example:

    c = receive_count()
    next(c)    # advance to the first yield
    c.send(1)  # send 1 to the coroutine
    # prints "T-minus 1"

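Calling next() by hand is easy to forget, so a common trick is to write a small decorator, often named consumer, that calls next() on the new coroutine for you. Such a decorator is not built into Python; you have to define it yourself. With it, the coroutine can be used directly.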
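One possible way to write such a decorator is sketched here. The name consumer is only a convention, and the received list is an assumption added to make the effect visible without printing.

```python
import functools

def consumer(func):
    """Hypothetical helper: start the coroutine so send() works right away."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return start

received = []

@consumer
def receive_count():
    while True:
        n = (yield)
        received.append(n)

c = receive_count()
c.send(1)  # no explicit next() needed
c.send(2)
c.close()  # raises GeneratorExit inside the coroutine to stop it
print(received)
```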

Then the question is: why don't we just write the coroutine as a regular function that receives the value directly instead of relying on a yield expression? The examples given here don't fully justify its value. More often, people use coroutines to implement application-level multitasking. I will say more about this later.

Python Notes: Context management

Python supports context management, which is often used when handling resources such as files and network connections. The with statement helps make sure the resources are cleaned up or released. There are two key methods for context management: __enter__ and __exit__.

__enter__ is called when execution enters the with statement, while __exit__ is called when the with block finishes, whether normally or because of an exception.

One very common usage of with statement is when we open up files:

    with open('/path/to/file', 'r') as file:
        for line in file:
            print(line)

The with statement in this example automatically closes the file no matter how the with block exits.

The with statement also supports nesting, for example:

    with open('/open/to/file', 'r') as infile:
        with open('write/to/file', 'w') as outfile:
            for line in infile:
                if line.startswith('Test'):
                    outfile.write(line)  # copy matching lines to the output

If you want your own class to support the with statement, just implement these two methods on it, for example:

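class Resource:
    def __init__(self, res):
        self.res = res

    def __enter__(self):
        # acquire the resource here; the return value is bound by "as"
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # release the resource here; returning False lets exceptions propagate
        return False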
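To see the order in which the two methods run, here is a filled-in sketch of such a class. The events list and the "db" resource name are assumptions for illustration. The block raises an exception on purpose to show that __exit__ still runs.

```python
events = []

class Resource:
    def __init__(self, res):
        self.res = res

    def __enter__(self):
        events.append("acquire " + self.res)
        return self   # bound to the name after 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        events.append("release " + self.res)
        return False  # do not swallow exceptions

try:
    with Resource("db") as r:
        events.append("use " + r.res)
        raise ValueError("something went wrong")
except ValueError:
    events.append("error propagated")

print(events)
```

Returning True from __exit__ instead would suppress the exception, which is rarely what you want for resource cleanup.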

There is another way to support context management, which is to use contextlib.contextmanager. We will introduce this later.


Python Notes: Closure


The following is an example of python closure.

def outer_function(outer_arg):
    closure_var = outer_arg
    def inner_function(inner_arg):
        print("{} {}".format(closure_var, inner_arg))

    return inner_function

# Usage of a python closure
closure_func = outer_function("X")
closure_func("Y1") # prints "X Y1"
closure_func("Y2") # prints "X Y2"

Variables and nested function

Python has two main types of variables: local variables and global variables. Variables defined within a function have local scope, while variables defined outside any function have global scope.

When a function is defined within another function, it is called a nested function. A nested function can access the outer function’s variables. In the example above, the outer function defines a variable called closure_var, which is initialized from the input argument. The inner function is able to access this variable and use it in its own body, even after the outer function has returned.

There are two steps to using a closure: call the outer function to create the closure and assign it to a variable, then call that variable with arguments to invoke the inner function.
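The two steps can also be sketched with a closure that keeps state between calls. The names make_counter, increment and count are made up for illustration; nonlocal is needed here because the inner function rebinds the outer variable instead of only reading it.

```python
def make_counter(start):
    count = start       # lives in the enclosing scope, not globally
    def increment(step=1):
        nonlocal count  # rebind the enclosing variable
        count += step
        return count
    return increment

counter_a = make_counter(0)    # step 1: create the closure
counter_b = make_counter(100)  # each call gets its own 'count'

# step 2: call the closures; their state is independent
print(counter_a(), counter_a(), counter_b(5))
```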

When to use closure?

  • Reduce the use of global variables

A closure can hide variables inside the function, which reduces the need for global variables.

  • Replace single-method classes.
    For example, the closure above does the same job as the following single-method class:
class OuterClass:
    def __init__(self, outer_arg):
        self.arg = outer_arg

    def inner_function(self, inner_arg):
        print("{} {}".format(self.arg, inner_arg))

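In general, calling a closure tends to be slightly faster than calling a method on an instance, because no attribute lookup on self is needed, although the difference is usually small.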