Optimizing a Python Program

These days, I’ve been optimizing a Python program I wrote. Optimizing is a fun task, but very difficult. Most of the time, the first solution I think is even worse than the previous situation. I need more experience.

Some processes were too slow in my program and I realized it was because I was performing too much disk I/O operations. I thought a solution could be read more data in memory and operate there. Now I have excessive memory consumption.

Here is a very simplified description of my memory consumption problem:

I have a text file. Each line in the file represents an item of a large list. Each line has two string values separated by a character. Something like a CSV file. I have to read the file content and put it in a list.

A line in the file looks like this:

Content of the first value|Content of the second value

The separator is '|'

Here is a simple Python program that read the file:

class Field:
    def __init__(self, line):
        self.value1, self.value2 = line.split('|')

fields = []

with open('test_data') as file_:
    for line in file_:
        fields.append(Field(line))

Running this program with a test file of about 42 MB gives this results:

Execution time (time): 0m4.108s
Memory consumed (pmap): 166652K

I was surprised by the high memory usage of the program. If the file is 42MiB, I thought the program should use a similar amount of memory, obviously a higher amount, but not almost four times the size of the file.

An equivalent program in C (error checking is omitted):

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define VALUE1_MAX 30
#define VALUE2_MAX 80
#define LINE_SIZE VALUE1_MAX + VALUE2_MAX + 3
#define BUFFER 10000

typedef struct
{
    char value1[VALUE1_MAX+1];
    char value2[VALUE2_MAX+1];
} field;

int main()
{
    FILE *file = fopen("test_data", "r");

    field *fields = (field*) malloc(BUFFER*sizeof(field));
    char line[LINE_SIZE];
    char *part;
    long i=0;
    long size = BUFFER;
    while(fgets(line, LINE_SIZE, file) != NULL) {
        part = strtok(line, "|");
        strcpy(fields[i].value1, part);
        part = strtok(NULL, "|");
        strcpy(fields[i].value2, part);

        i++;
        if (i == size) {
            size += BUFFER;
            fields = (field*) realloc(fields, size*sizeof(field));
        }
    }
    fclose(file);
    free(fields);
    return 0;
}

Results for the C program:

Execution time (time): 0m0.537s
Memory consumed (pmap): 57548K

This is much better.

The problem with the Python program seems to be the Field objects using more memory than they need. Testing the program without the Field creations, changing fields.append(Field(line)) withfields.append(line) seems to perform better:

Execution time (time): 0m0.575s
Memory consumed (pmap): 66808K

Clearly, the Field object is the bottleneck both in memory consumption and execution time. This is probably because of some default memory allocations that Python makes for the object and its fields. Python is a really cool language, but it doesn’t let you control the way the memory is used. This is a positive thing in most of the cases, but in some of them, like this one, is negative.

Most of the times, there are only very small parts of a program that really need to be optimized. And a programmer is much more productive with Python than with C. It doesn’t make sense to rewrite the program in C. Instead, a C module could be written for the bottlenecks.

I was too lazy to learn how to use the Python C API, so I looked a this project called Cython. Cython is a language designed for writing Python extensions. It’s very similar to Python, but is translated to C and compiled to an easy to use Python module. Cython also lets you mix C code and Python code easily. It lets you use high level python objects or low level C data types as you need and mix them properly.

I decided to rewrite the Field class in Cython:

#field.pyx 
DEF VALUE1_MAX = 30
DEF VALUE2_MAX = 80

cdef extern from "string.h": 
    char *strcpy(char *dest, char *src)

cdef class Field: 
    cdef readonly char value1[VALUE1_MAX+1] 
    cdef readonly char value2[VALUE2_MAX+1] 
    def __init__(self, line):
        v1, v2 = line.split('|')
        strcpy(self.value1, v1)
        strcpy(self.value2, v2)

This extension type can be used almost in the same way than a real Python object:

>>> f = Field('Hello|World')
>>> f.value1
'Hello'
>>> f.value2
'World'
>>>

I had to modify the original Python script to use the new module:

from field import Field

fields = []

with open('test_data') as file_:
    for line in file_:
        fields.append(Field(line))

Results of the new program:

Execution time (time): 0m1.257s
Memory consumed (pmap): 69800K

This is a huge improvement. With a very small change, the program now consumes almost 100MB less memory and it runs three times faster. I could write more parts in Cython, using strtok() instead of str.split(), or even rewriting the entire list and reading process. I would probable get a performance very similar to the C program. But I’m comfortable with the results now. I’m still surprised with the small effort compared to the awesome results.

If you want to do your own tests. Here is a simple script to generate a test file with 500k values:

import string
import random

with open('test_data', 'w') as f:
    for i in range(500000):
        value1 = ''.join(random.choice(string.letters)
                         for s in range(random.randint(15, 30)))
        value2 = ''.join(random.choice(string.letters)
                         for s in range(random.randint(50, 80)))
        f.write(value1 + '|' + value2 + '\n')

Python vs C#: Queries

One of the most beloved C# 3.0 features is Linq. Linq brings great power to C#, it allows you to easily write structured queries over collections or remote data sources. Now with C# is possible to make queries as easy as with other languages like Python. I decided to compare the way you make queries with C# and with Python. I found a great page showing 101 Linq examples, I decided to write Python versions of this examples. Which version do you like more?

Where – Simple 1

C# version:

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var lowNums = from n in numbers where n < 5 select n;

Python version:

numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0]
low_nums = (n for n in numbers if n < 5)

 

Where – Indexed

C# version:

string[] digits = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };
var shortDigits = digits.Where((digit, index) => digit.Length < index);

Python version:

digits = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine']
short_digits = (digit for index, digit in enumerate(digits) if len(digit) < index)

 

Select – Simple 1

C# version:

var numsPlusOne = from n in numbers select n + 1;

Python version:

nums_plus_one = (n + 1 for n in numbers)

 

Select – Anonymous Types 1

C# version:

string[] words = { "aPPLE", "BlUeBeRrY", "cHeRry" };

var upperLowerWords =
    from w in words
    select new {Upper = w.ToUpper(), Lower = w.ToLower()};

Python version:

The exact Python version would be something like:

words = ['aPPLE', 'BlUeBeRrY', 'cHeRry']

upper_lower_words = ( type('', (), {'upper': w.upper(), 'lower': w.upper() })
                      for w in words)

But I feel more Pythonic this:

upper_lower_words = ( (w.lower(), w.upper()) for w in words)

Or even this:

upper_lower_words = ( {'upper': w.upper(), 'lower': w.upper() }
                      for w in words)

SelectMany – Compound from 1

C# version:

int[] numbersA = { 0, 2, 4, 5, 6, 8, 9 };
int[] numbersB = { 1, 3, 5, 7, 8 };

var pairs =
    from a in numbersA,
         b in numbersB
    where a < b
    select new {a, b};

Python version:

numbersA = [0, 2, 4, 5, 6, 8, 9]
numbersB = [1, 3, 5, 7, 8 ]

pairs = ( (a, b) for a in numbersA 
                 for b in numbersB 
                 if a < b)

SelectMany – from Assignment

C# version:

var orders = from c in customers,
                  o in c.Orders,
                  total = o.Total
             where total >= 2000.0M
             select new {c.CustomerID, o.OrderID, total};

Python version:

I couldn’t find how to make the assignment in Python, so the version is:

orders = ( {'customer_id': c.customer_id,
            'order_id': o.order_id,
            'total': o.total }
           for c in customers
           for o in c.orders
           if o.total > 2000)

SelectMany – Multiple from

C# version:

var orders = from c in customers
             where c.Region == "WA"
             from o in c.Orders
             where o.OrderDate >= cutoffDate
             select new {c.CustomerID, o.OrderID};

Python version:

orders = ( (c.customer_id, o.order_id)
           for c in customers if c.region == 'WA'
           for o in c.orders if o.date >= cutoff_date)

Take Simple

C# version:

var first3Numbers = numbers.Take(3);

Python version:

if we are working with something like a list, we could do:

first_3_numbers = numbers[:3]

but, if we are working with iterators, we must do:

first_3_numbers = itertools.islice(numbers, None, 3)

Skip – Simple

C# version:

var allButFirst4Numbers = numbers.Skip(4);

Python version:

all_but_fist_4_numbers = numbers[4:] # list version all_but_fist_4_numbers = itertools.islice(numbers, 4, None) # iterator version 

TakeWhile – Simple

C# version:

var firstNumbersLessThan6 = numbers.TakeWhile(n => n < 6);

Python version:

fist_numbers_less_that_6 = itertools.takewhile(lambda x: x < 6, numbers)

SkipWhile – Simple

C# version:

var allButFirst3Numbers = numbers.SkipWhile(n => n % 3 != 0);

Python version:

all_but_first_3_numbers = itertools.dropwhile(lambda x: x % 3 != 0, numbers)

First & Last

C# version:

numbers.First()
numbers.Last()

Python version:

numbers[0]  # first for a list numbers[-1] # last for a list 
numbers.next()   # first for iterator list(numbers)[0] # first for iterator 
list(numbers)[-1] # last for iterator 

First – Indexed

C# version:

int evenNum = numbers.First((num, index) => (num % 2 == 0) && (index % 2 == 0));

Python version:

even_num = [n for i, n in enumerate(numbers) if n%2 == 0 and i%2 == 0][0]

or:

even_num = (n for i, n in enumerate(numbers) if n%2 == 0 and i%2 == 0).next()

to be continued…

PyBotLearn

PyWeek is finished. Unfortunately, I didn’t have enough time for it this year, so I’m almost a DNF. The theme for this year was Robot. My idea ended up being more like an application than a game. I wanted to create an educational environment for teaching programing to children. Something similar to Logo. The idea was to build a game where your could program a small robot. It should be interactive so children could experiment on the fly.

In other words, It should be something like this: A window with a robot and a world, bellow a Python console. The users should be able to write commands in the console and the robot should execute them. The robot should be able to interact with it’s environment and with the user too. Error, warnings, etc, should be given to the programmer in a friendly way. Additionally, there should be various challenges that the programmers should solve. This should be the fun factor for the game.

But, unfortunately, as I said before, I didn’t have more that a couple of hours every day since Thursday, so I’ve failed on most of my objectives, and the only thing I have is a barely working demo. No challenges, no fun, no objects, no cool environment. Any way, I still like the idea, and I will try to continue with this project. I would love to hear more opinions about it.

My final submission was called “PyBotLearn”. You can download it from my PyWeek 6 Entry Page.

Here is a small video of the demo:

A higher resolution video can be downloaded from here

PyWeek Ended

PyWeek is over. It was absolutelly fun!. My final entry is not what I would call a finished product, but it’s not bad. A couple of hours before the challenge end, the pyweek.org server went down. We had to send a md5 sum of our final entries to one of the event’s coordinators via e-mail.

Video of my game:

http://video.google.com/googleplayer.swf?docId=-3566000892545301155&hl=en

My code and more detailed comments in my PyWeek Entry Page.

PyWeek

Tomorrow, I’m going to participate in the fifth edition of PyWeek. PyWeek is a challenge in which participants must develop a video game in one week using Python. I like the idea because it brings a possibility to finish a project and have some fun by the way.

Some of the games created during PyWeek are really awesome. It’s amazing the fact that they were made in only one week. My favorite games of previous editions of PyWeek are:

I also like the competition and challenge feeling that you can breath in PyWeek.s

It’s possible to participate in two categories: Individual and Team. This time I am going to participate as Individual. I am thinking in use PyGame only. Even when some people are talking about Panda3D. I also want to use Blender to create pre-rendered sprites. I have been learning it secretly for a while. The result has been exactly what I expected: I suck as a graphic artist. My models are absolutely ugly, but at least I can do something for a game. By the way, now I prefer Blender to Wings3D for 3D modelling.

Screenshots of my attempt to model an aircraft with Blender. I also tried some kind of cell-shading or toon-shading redering:

If I suck with Blender. I prefer not even talk about my talent with sounds and music.

See you in one (py)week!

Dolor de cabeza con Python

Un problema de usar lenguajes dinámicos es que, al no tener etapa de compilación, no es posible detectar muchos de los errores sino hasta que se lanza alguna excepción mientras el programa se ejecuta. Un problema peor es cuando por alguna razón el error no produce una excepción y el programa termina funcionando erróneamente si dar ninguna pista sobre dónde puede estar el problema.

Examinen este pedazo de código en Python:

class MyClass:
	pass

if MyClass() < 1:
	do_something()
else:
	do_something_else()

o algo más curioso todavía:

if MyClass() < float('-infinity'):
	do_something()

do_something() siempre se ejecuta.

Lo correcto debería ser que al hacer este tipo de comparaciones se lanzara una excepción. La única forma de poderlo hacer debería se cuando sea explícito que el objeto puede compararse.

Noten que este código si lanza una excepción del tipo TypeError:

a = MyClass() + 1

Según lo que me dijeron en #python, parece ser que todos los objetos en Python están habilitados para hacer comparaciones. Esta es la razón por la cual se pueden ordenar fácilmente listas con cualquier tipo de datos en ellas. Debido a que arreglar esto supondría un corte con la compatibilidad hacia atrás, sólo podremos disfrutar de una adecuado comportamiento hasta que tengamos Python 3.0. Si esto hubiera estado listo ahora mismo me habría ahorrado un gran dolor de cabeza buscando un error.