Split debugging info -- symbols

In a previous post I mentioned split debugging info.

One addendum to this is how symbols are handled. Symbols are separate from debugging info (i.e. the information about variable names, types, etc. that you get when -g is turned on), but are necessary for a good debugging experience.

You have a choice, however, of where you leave the symbol files. You can leave them in your shipping binary/library so that users who don't have the full debugging info available still get a backtrace that at least has function names. The cost is slightly larger files for everyone, even if the symbols are never used. This appears to be what Red Hat does with its system libraries, for example.

The other option is to keep the symbols in the .debug files alongside the debug info. This results in smaller libraries, but really requires you to have the debug info files available for workable debugging. This appears to be what Debian does.

So, how do you go about this? Well, it depends on what tools you're using.

For binutils strip, there is some overlap between the --strip-debug and --only-keep-debug options: both keep the symbol table. --strip-debug will keep the symbol table in the binary, and --only-keep-debug will also keep the symbol table in the debug file.

$ gcc -g -o main main.c
$ readelf --sections ./main | grep symtab
  [36] .symtab           SYMTAB          00000000 000f48 000490 10     37  53  4
$ cp main main.debug
$ strip --only-keep-debug main.debug
$ readelf --sections ./main.debug | grep symtab
  [36] .symtab           SYMTAB          00000000 000b1c 000490 10     37  53  4
$ strip --strip-debug ./main
$ readelf --sections ./main | grep symtab
  [36] .symtab           SYMTAB          00000000 000b1c 000490 10     37  53  4

Of course, you can then run strip (with no arguments) on the final binary to get rid of the symbol table; but other than manually pulling out the .symtab section with objcopy, I'm not aware of any way to remove it from the debug info file.
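For illustration, the section check that readelf is doing above can be approximated in a few lines of Python. This is only a sketch: it assumes a 64-bit little-endian ELF file with no error handling, the function name is my own, and /bin/sh is just an example path.

```python
import struct

def elf_section_names(path):
    # Minimal parser for a 64-bit little-endian ELF file: read the
    # section header table and resolve each name via .shstrtab.
    with open(path, 'rb') as f:
        data = f.read()
    assert data[:4] == b'\x7fELF', 'not an ELF file'
    e_shoff, = struct.unpack_from('<Q', data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from('<HHH', data, 0x3a)

    def section(i):
        # sh_name is at offset 0x0; sh_offset/sh_size at 0x18/0x20
        base = e_shoff + i * e_shentsize
        sh_name, = struct.unpack_from('<I', data, base)
        sh_offset, sh_size = struct.unpack_from('<QQ', data, base + 0x18)
        return sh_name, sh_offset, sh_size

    # the section-name string table is itself a section
    _, stroff, strsize = section(e_shstrndx)
    strtab = data[stroff:stroff + strsize]

    names = []
    for i in range(e_shnum):
        sh_name, _, _ = section(i)
        names.append(strtab[sh_name:strtab.index(b'\0', sh_name)].decode())
    return names

names = elf_section_names('/bin/sh')
print('.symtab' in names)   # False if /bin/sh has been fully stripped
```

In other words, "has this been stripped?" boils down to whether a .symtab section is still present in the section header table.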

Contrast this with elfutils, which is more commonly used on Red Hat-based systems, I think.

eu-strip's --strip-debug does the same thing: it leaves the symtab section in the binary. However, it also has a -f option, which puts any sections removed during the strip into a separate file. Therefore, you can create any combination you wish: eu-strip -f results in a fully stripped binary with both symbols and debug data in the .debug file, while eu-strip -g -f results in debug data only in the .debug file, with the symbol data retained in the binary.

The only thing to be careful about is using eu-strip -g -f and then further stripping the binary, consequently destroying the symbol table but retaining the debug info. This can lead to some strange things in backtraces:

$ gcc -g -o main main.c
$ eu-strip -g -f main.debug main
$ strip ./main
$ gdb ./main
GNU gdb (GDB) 7.1-debian
...
(gdb) break foo
Breakpoint 1 at 0x8048397: file main.c, line 2.
(gdb) r
Starting program: /home/ianw/tmp/symtab/main

Breakpoint 1, foo (i=100) at main.c:2
2         return i + 100;
(gdb) back
#0  foo (i=100) at main.c:2
#1  0x080483b1 in main () at main.c:6
#2  0x423f1c76 in __libc_start_main (main=Could not find the frame base for "__libc_start_main".
) at libc-start.c:228
#3  0x08048301 in ?? ()

Note one difference between strip and eu-strip: binutils strip will leave the .gnu_debuglink section in, while eu-strip will not:

$ gcc -g -o main main.c
$ eu-strip -g -f main.debug main
$ readelf --sections ./main| grep debuglink
  [29] .gnu_debuglink    PROGBITS        00000000 000bd8 000010 00      0   0  4
$ eu-strip main
$ readelf --sections ./main| grep debuglink
$ gcc -g -o main main.c
$ eu-strip -g -f main.debug main
$ strip main
$ readelf --sections ./main| grep debuglink
  [27] .gnu_debuglink    PROGBITS        00000000 0005d8 000010 00      0   0  4

A short tour of TERM

I'd wager that more people today don't know what the TERM environment variable really does than do (yes, everyone who used to use ed over a 300-baud acoustic coupler is laughing, but times have changed!).

I recently got hit by what was, at the time, a strange bug. Consider the following trivial Python program, that uses curses and issues the "hide the cursor" command.

$ cat test-curses.py
import curses

curses.initscr()

curses.curs_set(0)

curses.nocbreak()
curses.echo()
curses.endwin()

If you run it in an xterm or on the Linux console, nothing should happen. But change your TERM to vt102 and it won't work:

$ TERM=vt102 python ./test-curses.py
Traceback (most recent call last):
  File "./test-curses.py", line 5, in <module>
    curses.curs_set(0)
_curses.error: curs_set() returned ERR

(Obvious when you explicitly reset the TERM yourself; not so obvious when some script is doing it behind your back -- as I can attest!) So, what really happened there? The curs_set man page gives us the first clue:

curs_set returns the previous cursor state, or ERR if the requested visibility is not supported.

How do we know if visibility is supported? For that, curses consults the terminfo database. This is a giant database stretching back to the dawn of time that says what certain terminals are capable of, and how to get them to do whatever it is you want them to do. The TERM environment variable tells the terminfo libraries which database entry to use (see /usr/share/terminfo/*) when reporting features to clients such as ncurses. man 5 terminfo tells us:

cursor_invisible | civis | vi | make cursor invisible

So curses is asking terminfo how to make the cursor invisible, and getting back "that's not possible here", and hence throwing our error.

But how can a person tell if their current terminal supports this feature? infocmp will echo out the features and what escape codes are used to access them. So we can check easily:

$ echo $TERM
xterm
$ infocmp | grep civis
                       bel=^G, blink=\E[5m, bold=\E[1m, cbt=\E[Z, civis=\E[?25l,
$ TERM=vt102 infocmp | grep civis
$

So we can now clearly see that the vt102 terminal doesn't support civis, and thus has no way to make the cursor invisible; hence the error code.

If you're ever looking for a good read, check out the termcap/terminfo master file. The comments are like a little slice of the history of computing!

Cook's Illustrated v Food52 Cookie Challenge

I saw Christopher Kimball, doyen of the Cook's Illustrated empire, at our local bookstore a while ago and, as one of my old professors would say, he was "good value".

He did, however, have a bit of a rant about the internet and how random websites just did not produce recipes that could compare to the meticulous testing that a Cook's Illustrated recipe went through. I was interested to see this come to a head on Slate where Cook's Illustrated has put their recipes up against food52.com for a head-to-head battle.

So yesterday, my wife and I took up the challenge. We resolved to follow both recipes meticulously, which I believe we achieved after a trip for ingredients.

Neither was particularly easier or harder than the other -- the Cook's Illustrated had a lot of fiddling with spices, while the food52.com one required creaming the butter and sugar.

Both came out pretty much as you would expect, although the Cook's Illustrated ones were a little flat. The recipe does say to be careful to not overwork the dough; it may be partially user error.

We were split. I liked the Cook's Illustrated one better, as the spices really were quite mellow and very enjoyable with a cup of coffee. My wife tended towards the plainer food52.com ones, but she is a big fan of a plain sugar cookie.

The ultimate test, however, was leaving both of them on the bench at work with a request to vote. The winner was clear -- 10 votes for Cook's Illustrated and only 3 for food52.com.

So, maybe Kimball has a point. Either way, when there's cookies, everyone's a winner!

Some photos of the results:

This laptop has Super Cow Powers

Tip: if you would like your own cow sticker, take a cute child through the checkouts at your local Whole Foods.

How much slower is cross compiling to your own host?

The usual case for cross-compiling is that your target is so woefully slow and under-powered that you would be insane to do anything else.

However, sometimes for one of the best reasons of all, "historical reasons", you might ship a 64-bit product but support building on 32-bit hosts, and thus cross-compile even on a very fast architecture like x86. How much does this cost, even though almost everyone is running the 32-bit cross-compiler on a modern 64-bit machine?

To test, I got a 32-bit cross and a 64-bit native x86_64 compiler and toolchain, in this case based on gcc 4.1.2 and binutils 2.17. I then did an allyesconfig build of a Linux 2.6.33 x86_64 kernel 3 times with the cross toolchain and then with the native one. The results (in seconds):

          32-bit    64-bit
          6090      5684
          6050      5616
          6063      5652
average   6067      5650

So, all up, building your 64-bit code natively on a 64-bit machine is ~7% faster than using the 32-bit cross-compiler.
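(The ~7% figure is just the relative difference between the two averages:)

```python
cross, native = 6067, 5650   # average build times in seconds, from above

# relative saving of the native build over the cross build
saving = 100.0 * (cross - native) / cross
print(round(saving, 1))   # -> 6.9
```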

What exactly does -Bsymbolic do? -- update

Some time ago I wrote a description of the -Bsymbolic linker flag which could do with some further explanation. The original article is a good starting place.

One interesting point that I didn't go into was the potential for code optimisation that -Bsymbolic brings about. I'm not sure if I missed that at the time or if the toolchain has changed; both are probably equally likely!

Let me recap the example...

ianw@jj:/tmp/bsymbolic$ cat Makefile
all: test test-bsymbolic

clean:
	rm -f *.so test test-bsymbolic

liboverride.so : liboverride.c
	$(CC) -Wall -O2 -shared -fPIC -o liboverride.so $<

libtest.so : libtest.c
	$(CC) -Wall -O2 -shared -fPIC -o libtest.so $<

libtest-bsymbolic.so : libtest.c
	$(CC) -Wall -O2 -shared -fPIC -Wl,-Bsymbolic -o $@ $<

test : test.c libtest.so liboverride.so
	$(CC) -Wall -O2 -L. -Wl,-rpath=. -ltest -o $@ $<

test-bsymbolic : test.c libtest-bsymbolic.so liboverride.so
	$(CC) -Wall -O2 -L. -Wl,-rpath=. -ltest-bsymbolic -o $@ $<
$ cat liboverride.c
#include <stdio.h>

int foo(void)
{
    printf("override foo called\n");
    return 0;
}
$ cat libtest.c
#include <stdio.h>

int foo(void) {
    printf("libtest foo called\n");
    return 1;
}

int test_foo(void) {
    return foo();
}
$ cat test.c
#include <stdio.h>

int test_foo(void);

int main(void)
{
    printf("%d\n", test_foo());
    return 0;
}

In words: libtest.so provides test_foo(), which calls foo() to do the actual work. libtest-bsymbolic.so is simply built with the flag in question, -Bsymbolic. liboverride.so provides an alternative version of foo() designed to override the original via an LD_PRELOAD of the library.

test is built against libtest.so, test-bsymbolic against libtest-bsymbolic.so.

Running the examples, we can see that the LD_PRELOAD does not override the symbol in the library built with -Bsymbolic.

$ ./test
libtest foo called
1

$ ./test-bsymbolic
libtest foo called
1

$ LD_PRELOAD=liboverride.so ./test
override foo called
0

$ LD_PRELOAD=liboverride.so ./test-bsymbolic
libtest foo called
1

There are a couple of things going on here. Firstly, you can see that the SYMBOLIC flag is set in the dynamic section, leading to the dynamic linker behaviour I explained in the original article:

ianw@jj:/tmp/bsymbolic$ readelf --dynamic ./libtest-bsymbolic.so

Dynamic section at offset 0x550 contains 22 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x00000010 (SYMBOLIC)                   0x0
...

However, there is also an effect on generated code. Have a look at the PLTs:

$ objdump --disassemble-all ./libtest.so
Disassembly of section .plt:

[... blah ...]

0000039c <foo@plt>:
 39c:   ff a3 10 00 00 00       jmp    *0x10(%ebx)
 3a2:   68 08 00 00 00          push   $0x8
 3a7:   e9 d0 ff ff ff          jmp    37c <_init+0x30>
$ objdump --disassemble-all ./libtest-bsymbolic.so
Disassembly of section .plt:

00000374 <__gmon_start__@plt-0x10>:
 374:   ff b3 04 00 00 00       pushl  0x4(%ebx)
 37a:   ff a3 08 00 00 00       jmp    *0x8(%ebx)
 380:   00 00                   add    %al,(%eax)
        ...

00000384 <__gmon_start__@plt>:
 384:   ff a3 0c 00 00 00       jmp    *0xc(%ebx)
 38a:   68 00 00 00 00          push   $0x0
 38f:   e9 e0 ff ff ff          jmp    374 <_init+0x30>

00000394 <puts@plt>:
 394:   ff a3 10 00 00 00       jmp    *0x10(%ebx)
 39a:   68 08 00 00 00          push   $0x8
 39f:   e9 d0 ff ff ff          jmp    374 <_init+0x30>

000003a4 <__cxa_finalize@plt>:
 3a4:   ff a3 14 00 00 00       jmp    *0x14(%ebx)
 3aa:   68 10 00 00 00          push   $0x10
 3af:   e9 c0 ff ff ff          jmp    374 <_init+0x30>

Notice the difference? There is no PLT entry for foo() when -Bsymbolic is used.

Effectively, the toolchain has noticed that foo() can never be overridden and has optimised out the PLT call for it. This is analogous to using "hidden" attributes for symbols, which I have detailed in another article on symbol visibility attributes (which also goes into PLTs, if the above meant nothing to you).

So -Bsymbolic does have more side-effects than just setting a flag to tell the dynamic linker about binding rules -- it can actually result in optimised code. However, I'm still struggling to find good use-cases for -Bsymbolic that can't be done better with version scripts and visibility attributes. I would certainly recommend using those methods if at all possible.
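If the linker details are unfamiliar, the binding difference can be loosely mimicked in Python: a name looked up at call time can be interposed after the fact (like a call through the PLT), while a reference captured early cannot (like a call bound by -Bsymbolic). This is only an analogy of the interposition behaviour, not how the dynamic linker actually works:

```python
def foo():
    return "libtest foo called"

def test_foo():
    # foo is looked up in the module namespace at call time -- it can
    # still be interposed, like a call that goes through the PLT
    return foo()

_foo = foo
def test_foo_symbolic():
    # reference captured when the module was defined -- no longer
    # interposable, like a call bound at link time by -Bsymbolic
    return _foo()

def override_foo():
    return "override foo called"

foo = override_foo   # the moral equivalent of LD_PRELOAD

print(test_foo())            # override foo called
print(test_foo_symbolic())   # libtest foo called
```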

Thanks to Ryan Lortie for comments on the original article.

Handling hostnames, UDP and IPv6 in Python

So, you have some application where you want the user to specify a remote host/port, and you want to support IPv4 and IPv6.

For literal addresses, things are fairly simple. IPv4 addresses are simple, and RFC2732 has things covered by putting the IPv6 address within square brackets.

It gets more interesting as to what you should do with hostnames. The problem is that getaddrinfo can return you multiple addresses, but without extra disambiguation from the user it is very difficult to know which one to choose. RFC4472 discusses this, but there does not appear to be any good solution.

Possibly you can do something like ping/ping6 and have a separate program name or configuration flag to choose IPv6. This comes at a cost of transparency.

The glibc implementation of getaddrinfo() puts considerable effort into deciding if you have an IPv6 interface up and running before it will return an IPv6 address. It will even recognise link-local addresses and sort addresses more likely to work to the front of the returned list as described here. However, there is still a small possibility that the IPv6 interface doesn't actually work, and so the library will sort the IPv6 address as first in the returned list when maybe it shouldn't be.

If you are using TCP, you can connect to each address in turn to find one that works. With UDP, however, the connect exchanges no packets; it merely records the default destination, so it can only catch immediately obvious problems such as a missing route.

So I believe the best way to handle hostnames for UDP connections, at least on Linux/glibc, is to trust getaddrinfo to return the sanest values first, try a connect on the socket anyway just for extra safety, and then essentially hope it works. Below is some example code to do that (the literal-address splitting is borrowed from Python's httplib).

import socket

DEFAULT_PORT = 123

host = '[fe80::21c:a0ff:fb27:7196]:567'

# the port will be anything after the last :
p = host.rfind(":")

# ipv6 literals should have a closing brace
b = host.rfind("]")

# if the last : is outside the [addr] part (or if we don't have []'s)
if p > b:
    try:
        port = int(host[p+1:])
    except ValueError:
        print "Non-numeric port"
        raise
    host = host[:p]
else:
    port = DEFAULT_PORT

# now strip off ipv6 []'s if there are any
if host and host[0] == '[' and host[-1] == ']':
    host = host[1:-1]

print "host = <%s>, port = <%d>" % (host, port)

the_socket = None

res = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)

# go through all the returned addresses in order, using the first
# one we can successfully connect to.
for r in res:
    af,socktype,proto,cname,sa = r

    try:
        the_socket = socket.socket(af, socktype, proto)
        the_socket.connect(sa)
    except socket.error, e:
        # this address didn't work; clear the socket and try the next one
        the_socket = None
        continue

    break

if the_socket is None:
    raise socket.error, "Could not get address!"

# ready to send!
the_socket.send("hi!")
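For what it's worth, the host/port splitting logic above pulls out neatly into a reusable function. A Python 3 sketch of the same algorithm (the function name is my own invention):

```python
DEFAULT_PORT = 123

def split_hostport(hostport, default_port=DEFAULT_PORT):
    # the port, if any, is whatever follows the last ':' that sits
    # outside an RFC 2732 [ipv6-literal] bracket pair
    p = hostport.rfind(':')
    b = hostport.rfind(']')
    if p > b:
        host, port = hostport[:p], int(hostport[p + 1:])
    else:
        host, port = hostport, default_port
    # strip the ipv6 brackets themselves, if present
    if host.startswith('[') and host.endswith(']'):
        host = host[1:-1]
    return host, port

print(split_hostport('[fe80::21c:a0ff:fb27:7196]:567'))
print(split_hostport('example.com'))
print(split_hostport('10.0.0.1:8080'))
```

A non-numeric port still raises ValueError from int(), just as in the inline version.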

RFC3164 smells

From RFC3164, which is otherwise about syslog formats:

6. Security Considerations

An odor may be considered to be a message that does not require any acknowledgement. People tend to avoid bad odors but are drawn to odors that they associate with good food. The acknowledgement of the receipt of the odor or scent is not required and indeed it may be the height of discretion to totally ignore some odors. On the other hand, it is usually considered good civility to acknowledge the prowess of the cook merely from the ambiance wafting from the kitchen. Similarly, various species have been found to utilize odors to attract mates. One species of moth uses this scent to find each other. However, it has been found that bolas spiders can mimic the odor of the female moths of this species. This scent will then attract male moths, which will follow it with the expectation of finding a mate. Instead, when they arrive at the source of the scent, they will be eaten [8]. This is a case of a false message being sent out with inimical intent.

...

Along the lines of the analogy, computer event messages may be sent accidentally, erroneously and even maliciously.

This smells more like "I bet nobody ever really reads this RFC, let's put some stuff in the middle to see if they do!".