Python gallery generator

It's so hard to find a good gallery generator these days :) All I wanted was something really simple that could show a few photos in a slide-show style interface. It should use static HTML and leave the important configuration up to a style sheet.

All the ones on the market seemed to be a bit of overkill, so I wrote my own. Of course, a sample is worth a thousand words.

Importing a python file as a config file

In some circumstances you might like to have the configuration file for your Python program actually be another Python program, especially if you want your users to be able to write really cool config files. At first glance eval() looks like a nice way to do this, but unfortunately it won't work because import isn't an expression, it's a statement.

>>> config_file = "config"
>>> eval("import " + config_file)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    import config
         ^
SyntaxError: invalid syntax

You can get around this however with the ihooks library. First, create a config file called config.py with the following.

class Config:
    some_config_dictionary = {"option": "Hello Config"}

Then you can import the config as per the following

import ihooks, os

def import_module(filename):
    # ihooks ships with Python 2 only
    loader = ihooks.BasicModuleLoader()
    # split "dir/config.py" into a directory and a bare module name
    path, basename = os.path.split(filename)
    name, ext = os.path.splitext(basename)
    # returns None if no module of that name is found in the directory
    stuff = loader.find_module_in_dir(name, path)
    if not stuff:
        raise ImportError(name)
    module = loader.load_module(name, stuff)
    return module

def config_example():
    config = import_module("config.py").Config
    print(config.some_config_dictionary["option"])

if __name__ == "__main__":
    config_example()

All going well, this should print out Hello Config
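The ihooks module used above only exists in Python 2. On Python 3 the same idea can probably be expressed with the standard importlib machinery; the following is a minimal sketch under that assumption, not a drop-in replacement. The names import_module_from_path and demo are made up for illustration, and the demo writes a throwaway config.py into a temporary directory so that it is self-contained.

```python
import importlib.util
import os
import tempfile


def import_module_from_path(filename):
    """Load a Python source file as a module, given its path."""
    name = os.path.splitext(os.path.basename(filename))[0]
    spec = importlib.util.spec_from_file_location(name, filename)
    if spec is None or spec.loader is None:
        raise ImportError(name)
    module = importlib.util.module_from_spec(spec)
    # Actually run the file's code inside the new module object.
    spec.loader.exec_module(module)
    return module


def demo():
    # Hypothetical config file, mirroring the Config class shown above.
    source = (
        "class Config:\n"
        '    some_config_dictionary = {"option": "Hello Config"}\n'
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.py")
        with open(path, "w") as f:
            f.write(source)
        config = import_module_from_path(path).Config
        return config.some_config_dictionary["option"]


if __name__ == "__main__":
    print(demo())
```

Either way the config file is executed as ordinary Python, so it should only ever come from someone you trust.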

Decoding binary objects

bindecode is a little app I wrote to decode a bunch of bits into something more human friendly. Currently it supports most important IA64 registers and a few other things, though if someone wants to send me patches for other architectures that is cool.

A screenshot should suffice

ianw@lime:~/programs/junkcode/code/bindecode$ python ./bindecode.py

Bitmap Decoder
--------------
Tab shows bitmaps to complete, quit or Ctrl-D to exit

Input Bitmap Type >
386_pte           ia64_itir         ia64_psr          ia64_rr           to_32bit_binary   unix_permissions
ia64_ifa          ia64_pkr          ia64_pte          ia64_tlb_gr       to_64bit_binary

Input Bitmap Type > to_32bit_binary

to_32bit_binary value > 0xff
Decoded output for a Convert to a 32 bit binary
Bits | 0000 0000 0000 0000 0000 0000 1111 1111

to_32bit_binary value > o123
Decoded output for a Convert to a 32 bit binary
Bits | 0000 0000 0000 0000 0000 0000 0101 0011

to_32bit_binary value > b101010101010
Decoded output for a Convert to a 32 bit binary
Bits | 0000 0000 0000 0000 0000 1010 1010 1010

to_32bit_binary value > q

Input Bitmap Type > ia64_pte

ia64_pte value > 0x0210000005be01e1

Decoded output for a IA64 PTE Entry
           Present | True
          Reserved | 0
 Memory Attributes | 0x0
     Page Accessed | True
        Page Dirty | True
   Privilege Level | 0x3
     Access Rights | . . .
               PFN | 0x5be0
Exception Deferral | True
           Ignored | 0x10

ia64_pte value > q

Input Bitmap Type > q

The real thing is a bit more colourful with some ANSI colours.
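For the curious, the core of this kind of decoding is just shifting and masking. The following is a rough sketch of the idea, not the actual bindecode source: the function names are made up, the decimal fallback is an assumption, and the IA64 PTE field positions are my reading of the layout, checked only against the sample output above.

```python
def parse_value(text):
    """Parse the input styles shown above: 0x.. hex, o.. octal, b.. binary."""
    text = text.strip().lower()
    if text.startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("o"):
        return int(text[1:], 8)
    if text.startswith("b"):
        return int(text[1:], 2)
    return int(text, 10)  # assumption: anything else is decimal


def to_binary(value, width=32):
    """Format value as zero-padded binary in groups of four bits."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in %d bits" % width)
    bits = format(value, "0%db" % width)
    return " ".join(bits[i:i + 4] for i in range(0, width, 4))


def extract(value, low_bit, nbits):
    """Return the nbits-wide field of value starting at low_bit."""
    return (value >> low_bit) & ((1 << nbits) - 1)


# (name, lowest bit, number of bits); an assumed layout, see the note above.
IA64_PTE_FIELDS = [
    ("Present", 0, 1),
    ("Memory Attributes", 2, 3),
    ("Page Accessed", 5, 1),
    ("Page Dirty", 6, 1),
    ("Privilege Level", 7, 2),
    ("PFN", 12, 38),
    ("Exception Deferral", 52, 1),
    ("Ignored", 53, 11),
]


def decode(value, fields):
    """Decode value into a dict of field name -> integer field value."""
    return {name: extract(value, low, n) for name, low, n in fields}


if __name__ == "__main__":
    print("Bits | " + to_binary(parse_value("0xff")))
    for name, field in decode(0x0210000005be01e1, IA64_PTE_FIELDS).items():
        print("%18s | %s" % (name, hex(field)))
```

Adding another register type is then mostly a matter of writing another field table.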

On the Linux development model

Since 2.6 it seems that the distinction between the "stable" series and "unstable" series has pretty much disappeared. As someone who needs to keep up with current developments, I often find this a real pain; it is like trying to build a house on quicksand. But one thing I hadn't thought of is that this increases compatibility, as evidenced by what happens when the model is exactly the opposite: in this post Andrew Morton suggests that SUSE may have caused two interfaces to the same thing to exist by releasing before something was accepted into the kernel.org sources.

So there's one advantage to having the moving releases as we do now -- nobody has an excuse not to keep pushing their stuff into the official trees.

On swig and size_t

Generally, when passing the size of something around it's good to use size_t and, should that something be a blob (in the binary object type sense), it probably wants to be a void*.

However, the cstring extension to SWIG uses int for sizes and char* for data, for example:

%cstring_output_withsize(parm, maxparm): This macro is used to handle bounded character output functions where both a char * and a pointer int * are passed. Initially, the int * parameter points to a value containing the maximum size. On return, this value is assumed to contain the actual number of bytes. As input, a user simply supplies the maximum length. The output value is a string that may contain binary data.

You could potentially create your own typemaps to handle this and re-write large parts of the cstring SWIG interface, but the point would be moot because by the time it gets back to the Python API it has to be an int anyway; e.g. calls like PyObject* PyString_FromStringAndSize(const char *v, int len) all take an int. Since Python supports binary strings everything should be a char* too (this is less critical, but if you want to build with -Wall -Werror, as you should, you'll need to make sure the types are right).

I would recommend not following some of the SWIG instructions about doing your own typedef for size_t. This seems fraught with danger and you're only going to be calling Python API functions that expect an int anyway. Be aware that if you really need to pass something around with a size that doesn't fit in an int, you'll have some work to do; otherwise design your API with the right types for the Python API.

Pthreads mutex subtleties

Is the following code valid?

#include <stdio.h>
#include <signal.h>
#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void sigint_handler(int sig)
{
        pthread_mutex_unlock(&mutex);
        return;
}


int main(void)
{

        signal(SIGINT, sigint_handler);

        pthread_mutex_lock(&mutex);

        printf("About to wait\n");

        pthread_mutex_lock(&mutex);

        printf("Ok, done!\n");
}

It certainly works on Linux, but on FreeBSD at least it will not, as deadlock detection kicks in when the thread tries to lock the mutex it has already locked. Using signals in threaded code is probably a bad idea, but I believe the deadlock detection is broken in this case.

Of course, the right way to do this is to use a condition variable and use a pthread_cond_wait() call in main() and wake it up from the signal handler. This works fine.

Moral of the story -- check the return value of pthread_mutex_lock() because sometimes it may not be what you expect! It's much easier to debug an error message about pthread_mutex_lock returning some strange EDEADLK code than to try and find why your multithreaded application seems to be randomly crashing on another operating system.

Comparing some of the 386, AMD64 and IA64 ABI

Apart from the obvious 32-64 bit distinction between 386 and AMD64 there are two other interesting comparisons; parameter passing conventions and position independent code conventions.

Parameter Passing : on x86 parameters are passed via the stack. On AMD64 the first six "integer" arguments (anything that fits in a 64 bit register, basically) are passed via registers, similarly some floats can be passed via SSE registers. Only after this is data passed on the stack. On IA64, the first 8 arguments are passed in registers, whilst the rest are put on the stack.

On both AMD64 and IA64, there is an extra 16 byte "scratch area" (IA64) / 128 byte "red zone" (AMD64) that sits at the bottom of the current stack frame. I would suggest that the smaller IA64 scratch area size is because of register windowing, which AMD64 does not support. On both architectures this is reserved and not modified by signal or interrupt handlers. "Leaf functions" (functions that do not call other functions) can use this area as their entire stack frame, saving some considerable overhead.

Varargs functions cause some confusion on AMD64/IA64, since arguments might be integers or might be floats, meaning they should be passed in either general or float/SSE registers respectively. On AMD64, functions known to be varargs functions should have a prologue that saves all argument registers to a "register save area" that has a known layout (the caller also passes an upper bound on the number of floating point arguments, so the prologue can avoid saving unnecessary registers). Then, as you use the va_arg macro to go through the arguments, you grab them from the register save area. On IA64, you assume that the first 8 arguments were passed in via registers, and save these registers to your scratch area (2 registers) and 48 bytes of your stack (remaining 6 registers). This means all your arguments are stacked together (the incoming parameter list sits up against the scratch area) and va_arg can simply "walk" upwards.

Undefined functions (those called without a prototype) are a bit more tricky; IA64 suggests that if a float is passed into a function with an undefined parameter, it should be copied to both the first general purpose register and the first floating point register, just to be safe. AMD64 doesn't seem to make such assumptions for you. For example, on IA64:

ianw@lime:/tmp$ cat function.c
void function(float f)
{
        printf("%f\n", f);
}
ianw@lime:/tmp$ cat test.c
extern function();

int main(void)
{
        float f = 10000.01;
        function(f);
}
ianw@lime:/tmp$ gcc -o test test.c function.c
ianw@lime:/tmp$ ./test
10000.009766

That same code on AMD64 returns 0.

IP relative addressing: Position Independent Code (PIC) is code that can be loaded anywhere into memory and work. This is important because shared libraries may not always be at the same address, since other shared libraries might be loaded before or after them, etc. To maintain position independence, you can't rely on the base address of any code (because it might change) so you add a layer of indirection between your calls. In Linux/ELF land this is done with a Global Offset Table (GOT).

You can think of the GOT as a big two-column list that holds a symbol and its "real address". Thus, instead of loading the symbol directly, you load the value from the GOT, and then load that value to find the real thing.

Note that you always know the relative address of the GOT, because although the base address might change, the distance between your code and the GOT will not. This means that if you need to load an address from the GOT, the easiest way is to load via an offset from the current instruction to the GOT entry. The compiler knows the current instruction offset (note it can't know the current instruction address, because the binary might be anywhere in memory), so it wants to say load the address at (CURRENT_INSTRUCTION - OFFSET_TO_GOT_ENTRY).

386 just can't do this -- there is no way to address memory at an offset from the current instruction pointer. The only way you can do it is to keep a pointer to the GOT in a register (%ebx), and then offset from that. This wastes a whole register, and when you only have a few, as on the 386, this is a big killer.

AMD64 fixes this and allows you to offset from the current instruction pointer. This frees up a register, and changes the ABI by removing the distinction between the Absolute PLT and PIC PLT.

The PLT is a further enhancement that facilitates lazy binding. The PLT is a set of "stubs" that point to a fix up function in the dynamic loader. At first, the GOT entries for functions point to the PLT entry for that function.

When you call the function, you don't go directly to it; you load its value via the GOT and then jump to that value. As mentioned, at first this points to the PLT stub. This calls the lookup function in the dynamic loader which goes off and finds the real function (this might actually be in another shared library that needs to be loaded, for example). As arguments to this lookup function you pass the function name you're looking for (obviously) and the GOT entry of the original call. The dynamic loader finds the function, but then additionally fixes up the GOT entry to no longer point to the PLT stub, but to point directly to the required function. This means the next time you load from the GOT, you get the direct address of the function without the overhead of the PLT stub again.
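If the fix up dance is hard to picture, here is a toy model of it in Python. It is only an analogy under loud assumptions: a dict stands in for the GOT, a closure stands in for the PLT stub, and the ToyLoader class and its method names are invented for illustration. Nothing here reflects how a real dynamic loader is implemented.

```python
class ToyLoader:
    """Toy stand-in for a dynamic loader doing lazy binding."""

    def __init__(self, libraries):
        self.libraries = libraries  # symbol name -> real function
        self.got = {}               # symbol name -> current call target
        self.lookups = 0            # how many slow lookups have happened

    def _plt_stub(self, name):
        def stub(*args):
            # The slow path: find the real function...
            self.lookups += 1
            real = self.libraries[name]
            # ...then fix up the GOT entry so the stub is skipped next time.
            self.got[name] = real
            return real(*args)
        return stub

    def link(self, name):
        # At load time the GOT entry points at the stub, not the function.
        self.got[name] = self._plt_stub(name)

    def call(self, name, *args):
        # Every call goes indirectly through the GOT.
        return self.got[name](*args)


def square(x):
    return x * x


if __name__ == "__main__":
    loader = ToyLoader({"square": square})
    loader.link("square")
    print(loader.call("square", 3))  # first call goes via the stub
    print(loader.call("square", 4))  # second call goes straight through
    print(loader.lookups)
```

The second call never touches the stub, which is the whole point of fixing up the GOT entry.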

IA64, allowing IP relative addressing, similarly doesn't have a distinction between absolute and PIC PLTs.

Debian release checklist

  1. Scan through code to ensure that it does not clag in different code as a library, etc, that may be under a different license
  2. lintian
  3. linda
  4. pbuilder