Running new applications on old glibc

Glibc (short for GNU Libc, or GNU C Library) is a library that provides the interface between application programs and the Linux kernel. Although its official name is the "C" library (library for programs written in the "C" language), virtually all dynamically linked program binaries depend on it - it is the de-facto system library in almost all Linux operating systems.

Glibc is in active development and new versions are released from time to time. This poses a problem: what happens to application and program binaries that were built to run with a different version of glibc than the one currently installed in your operating system?

Glibc has a versioning system that allows backward compatibility (older programs built to run on older versions of glibc will continue to run on new glibc); but it is of no help the other way around: programs that depend on newer glibc will usually not run on systems with older glibc.

There is a way.

Background

First, the obvious: this issue only arises when you don't have the source of the program binaries in question. If you do have it, the solution is very simple: just re-compile the application against your existing glibc (patching it if necessary), and the problem is solved then and there.

Thus from now on, we will assume that you do not have access to the source code (or if you do, it is very difficult or impractical to re-build the application from source).


The problem stated in the abstract above is faced not only by glibc, but by every other library. But for other libraries, it is possible and easy to:

  1. upgrade and install new versions of the libraries; and
  2. for some libraries, keep multiple versions installed at the same time - satisfying the need of older programs while at the same time catering to the requirement of newer programs.

But it is not so easy for glibc. Glibc, by virtue of being THE system library, is used by all programs, including many of the programs that are involved in an upgrade process - upgrading it is the equivalent of pulling out the rug on which one is standing. It can be done, but it's tricky.

Nor does the other relief, running multiple versions of the same library, apply to glibc - being THE system library, it cannot simply be installed in multiple versions (unless you want to resort to tricks like the one described here: http://bitwagon.com/rtldi/rtldi.html ).

The actual problem is usually caused by a mismatch between the expected version and the installed version. Most of the time, the problem is missing symbols - there are some symbols (functions, variables) exported from the "expected version" of the library, but they don't exist in the version that is actually installed.

Another problem arises when a function of the same name has different behaviour in different versions of the library (i.e. different side-effects, taking a different number of parameters, etc).

The Usual Trick

The usual trick to solve this problem is to use LD_LIBRARY_PATH and LD_PRELOAD. LD_LIBRARY_PATH is used to supply missing functions; and LD_PRELOAD is used to override known problematic functions with better working ones.

There are tons of articles on the Internet that use these two solutions. One of them is here: http://www.novell.com/coolsolutions/feature/11775.html (this particular link is about running old apps on new glibc, the opposite of what we're dealing with here, but the tricks are valid nonetheless and will work for many cases - but not always).

Problem solved? No.

When The Usual Trick Doesn't Work

The specific problem that motivated me to write this article is this error message, which cannot be addressed by the previous trick:

./myprogram: /lib/libc.so.6: version `GLIBC_2.14' not found (required by ./myprogram)

Sometime around glibc 2.1 (at the time this is written, we are at glibc 2.19), glibc introduced "symbol versioning" - in which glibc can support multiple functions of the same name, each tied to a particular version of glibc. This was meant to support backward compatibility, and it worked spectacularly in that respect. But it also means that new programs built on a new glibc will by default use the new version of a function (even if an older, compatible version exists and works well enough). When this happens, the resulting binary will only run on the newer glibc.

Neither the LD_LIBRARY_PATH nor the LD_PRELOAD trick will work, because glibc's dynamic linker performs this version check before anything else is done. So what can we do about this?

For program authors / distributors: one of the ways to avoid having this problem in the first place is to build and compile the program using an older glibc.

Alternatively, one can also use the technique outlined here to build and compile on a newer glibc while still forcing the toolchain to use the older versions of the functions. Of course, when doing this, one needs to be aware that there are differences (sometimes subtle) between the versions of a function, and the application must take the necessary care accordingly.

When this is done properly, the resulting binary will work with both the old and the new glibc.
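As a rough illustration of binding to an older symbol version at build time, here is a minimal sketch using the assembler's .symver directive, which is one commonly used way to do this (it may or may not be the technique the link above referred to). It assumes x86-64 Linux with glibc, where the old memcpy is exported as memcpy@GLIBC_2.2.5; on other platforms the guard leaves the default binding alone. The function name copy_bytes is made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Assumption: x86-64 glibc, where the pre-2.14 memcpy is memcpy@GLIBC_2.2.5.
 * The directive makes every memcpy reference emitted from this file ask for
 * that version instead of the default one chosen at link time. */
#if defined(__x86_64__) && defined(__GLIBC__)
__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");
#endif

/* Hypothetical helper: an ordinary, non-overlapping copy. */
void copy_bytes(char *dst, const char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

Running readelf -s on the resulting binary is a quick way to check which version the memcpy reference actually ended up with. The compiler may also inline small copies, in which case no memcpy reference is emitted at all.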

The Solution

The solution consists of two parts:

  1. "Weaken" the version dependency so it doesn't cause the dynamic linker to abort the program on start.
  2. Then supply the missing functions (the functions that depend on the newer glibc).

Get your hex editor ready to patch the binary, since no regular tool supports doing this (though, once you know the steps, it's not difficult to script the entire process). You will also need readelf.

Before I continue, however, it's good to know how the versioning information is kept in the binary. Having this knowledge makes the mechanical steps that we will do later much clearer.

Illustrative data

Here is an illustrative output for dissection purposes. The three tables below are snippets of outputs from readelf -s myprogram (first table) and readelf -V myprogram (the next two tables). These are outputs from a real program; I just called it "myprogram" to protect the innocent.

Again for illustration only, we will only look for glibc symbol versions higher than 2.13. At the time the original version of this article was written, my OS used glibc 2.13, so any reference to a version newer than that is a problem.

It turns out that the only problematic version for this myprogram is GLIBC_2.14 (memcpy). I didn't know this in advance, and in reality you will have to figure it out yourself, but I'm telling you this so you understand why the information is given the way it is.
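When figuring out which version names are too new, note that they have to be compared number by number, not as text (GLIBC_2.4 is older than GLIBC_2.13). A minimal sketch of such a check follows; the function name needs_newer_glibc is made up for this example, and it only looks at the first two components of the name.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if a version name such as "GLIBC_2.14" is newer than the
 * installed major.minor, and 0 otherwise. Names that do not look like
 * GLIBC_<major>.<minor> (GLIBC_PRIVATE, GLIBCXX_3.4, ...) also yield 0. */
int needs_newer_glibc(const char *name, int have_major, int have_minor) {
    int major = 0, minor = 0;
    assert(name != NULL);
    if (sscanf(name, "GLIBC_%d.%d", &major, &minor) != 2)
        return 0;
    if (major != have_major)
        return major > have_major;
    return minor > have_minor;
}
```

With an installed glibc 2.13, the names that appear in the tables below give 1 for GLIBC_2.14 only.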

Symbol table '.dynsym' contains 265 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __printf_chk@GLIBC_2.3.4 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND ftell@GLIBC_2.2.5 (3)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _Znam@GLIBCXX_3.4 (4)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __errno_location@GLIBC_2.2.5 (5)
... <snip> ...
    92: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND sendmsg@GLIBC_2.2.5 (5)
    93: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __vsnprintf_chk@GLIBC_2.3.4 (2)
    94: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND memcpy@GLIBC_2.14 (12)
    95: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND scandir64@GLIBC_2.2.5 (3)
    96: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND connect@GLIBC_2.2.5 (5)
... <snip> ...
Table 1: .dynsym table, from readelf -s

Version symbols section '.gnu.version' contains 265 entries:
 Addr: 0000000000402c42  Offset: 0x002c42  Link: 5 (.dynsym)
  000:   0 (*local*)       2 (GLIBC_2.3.4)   3 (GLIBC_2.2.5)   4 (GLIBCXX_3.4)
  004:   5 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   5 (GLIBC_2.2.5)   6 (GLIBCXX_3.4.15)
  008:   3 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   7 (GLIBC_2.3)
... <snip> ...
  054:   5 (GLIBC_2.2.5)   a (GLIBC_PRIVATE)   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)
  058:   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)
  05c:   5 (GLIBC_2.2.5)   2 (GLIBC_2.3.4)   c (GLIBC_2.14)    3 (GLIBC_2.2.5)
  060:   5 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   5 (GLIBC_2.2.5)   2 (GLIBC_2.3.4)
  064:   5 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   4 (GLIBCXX_3.4)
... <snip> ...
Table 2: .gnu.version table, from readelf -V

Version needs section '.gnu.version_r' contains 5 entries:
 Addr: 0x0000000000402e58  Offset: 0x002e58  Link: 6 (.dynstr)
  000000: Version: 1  File: libgcc_s.so.1  Cnt: 1
  0x0010:   Name: GCC_3.0  Flags: none  Version: 14
  0x0020: Version: 1  File: libdl.so.2  Cnt: 1
  0x0030:   Name: GLIBC_2.2.5  Flags: none  Version: 11
  0x0040: Version: 1  File: libpthread.so.0  Cnt: 1
  0x0050:   Name: GLIBC_2.2.5  Flags: none  Version: 5
  0x0060: Version: 1  File: libstdc++.so.6  Cnt: 3
  0x0070:   Name: CXXABI_1.3  Flags: none  Version: 8
  0x0080:   Name: GLIBCXX_3.4.15  Flags: none  Version: 6
  0x0090:   Name: GLIBCXX_3.4  Flags: none  Version: 4
  0x00a0: Version: 1  File: libc.so.6  Cnt: 7
  0x00b0:   Name: GLIBC_2.4  Flags: none  Version: 13
  0x00c0:   Name: GLIBC_2.14  Flags: none  Version: 12
  0x00d0:   Name: GLIBC_PRIVATE  Flags: none  Version: 10
  0x00e0:   Name: GLIBC_2.3.2  Flags: none  Version: 9
  0x00f0:   Name: GLIBC_2.3  Flags: none  Version: 7
  0x0100:   Name: GLIBC_2.2.5  Flags: none  Version: 3
  0x0110:   Name: GLIBC_2.3.4  Flags: none  Version: 2
Table 3: .gnu.version_r table, from readelf -V

The three tables are:

  1. .dynsym table. This contains the information about all symbols required for dynamic linking. Somewhere in this table lurks the name of the function that depends on that new glibc.

    In this example, it's memcpy@GLIBC_2.14. You can search for all offending functions by using grep. Please note the number in parentheses - for memcpy it's (12).

  2. .gnu.version table. This contains the versioning information for all dynamic symbols. Every symbol listed in .dynsym will have a corresponding entry here.

    For example, both this table and .dynsym show 265 entries. Also, the memcpy function is entry number 94 of .dynsym. 94 in hex is 0x5e, and we find that the 0x5e-th entry of this table (the third entry in the row labelled 05c) contains GLIBC_2.14.

    Also note that in front of GLIBC_2.14, the 0x5e-th entry also shows c, which is hexadecimal 0x0c, which in decimal is 12 - this is the same value you saw in parentheses next to the memcpy function. That is not a coincidence.

  3. .gnu.version_r table - This table contains the library versions that are required by the binary (hence the _r suffix). Every entry shows the version name (GLIBC_2.2.5, GLIBC_2.14, etc) and the "Version" number at the end of it. The "Version" number is a misnomer; it's actually more of an index that other tables can use to refer to it.

    If you look at it, GLIBC_2.14 has "Version" (actually index) 12. This is the same 12 or 0x0c that earlier tables referred to.

In short, a binary has a list of imported symbols (table 1). Each imported symbol has a version, and the "index" of that version is given by table 2. The version itself is described by table 3. It doesn't quite stop there - table 3 only contains version records, while the actual version strings (GLIBC_2.14) are stored elsewhere (in the .dynstr table), but that is not important for our purpose.
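To make the cross-referencing between table 1 and table 2 concrete, here is a small sketch of the index arithmetic. It assumes that readelf prints .gnu.version four entries per row and labels each row with the hexadecimal index of its first entry, as in the output shown above; the function names are made up for this example.

```c
#include <assert.h>

/* Assumption: four entries per printed row, as in the readelf -V output above. */
enum { ENTRIES_PER_ROW = 4 };

/* Row label under which a given .dynsym index is printed. */
unsigned versym_row_label(unsigned symbol_index) {
    return symbol_index - symbol_index % ENTRIES_PER_ROW;
}

/* Zero-based column of that symbol within its row. */
unsigned versym_column(unsigned symbol_index) {
    return symbol_index % ENTRIES_PER_ROW;
}

/* The reverse direction: row label plus column gives the .dynsym index. */
unsigned versym_index(unsigned row_label, unsigned column) {
    assert(column < ENTRIES_PER_ROW);
    return row_label + column;
}
```

For memcpy (entry 94) this gives row 0x5c, column 2, which is where the c (GLIBC_2.14) entry sits in table 2.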

Now on to the actual steps.


Step 1: Weaken the version dependency

To weaken the dependency, we need to mark the version GLIBC_2.14 as optional, that is WEAK. The table that contains this information (thus the table that we want to patch) is Table 3, .gnu.version_r.

To do this, we notice a few facts:

  1. The table .gnu.version_r starts at file offset 0x002e58.

  2. The record for GLIBC_2.14 is located at offset 0x00c0 within this table - that is, at file offset 0x002f18 (0x002e58 + 0x00c0).

  3. There are no flags set for GLIBC_2.14. Thus we would expect the storage location that holds the flags to contain zeros.

  4. The data located at 0x00c0 is an Elfxx_Vernaux structure (see here for details).

  5. The flags (vna_flags) are the second member of the structure, an Elfxx_Half, right after vna_hash, which is an Elfxx_Word (Elfxx_Word is a 32-bit integer, i.e. 4 bytes, and Elfxx_Half is 16 bits, i.e. two bytes).

  6. Thus the flags are located 0x04 bytes after the start of the structure, and span two bytes. With the above example, the flags are located at 0x002f18 + 0x04, or at file offset 0x002f1c.
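The arithmetic in the list above can be sketched in C as follows. The struct mirrors the layout of Elf64_Vernaux from <elf.h> using plain fixed-width types so that the example stands on its own; the names vernaux_t and vernaux_flags_offset are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors Elf64_Vernaux: Elfxx_Word is 32 bits, Elfxx_Half is 16 bits. */
typedef struct {
    uint32_t vna_hash;   /* hash of the version name */
    uint16_t vna_flags;  /* the field we want to patch */
    uint16_t vna_other;  /* the "Version" index shown by readelf */
    uint32_t vna_name;   /* offset of the name in .dynstr */
    uint32_t vna_next;   /* offset to the next record */
} vernaux_t;

/* File offset of vna_flags, given the section's file offset and the
 * record's offset within the section (both as printed by readelf -V). */
long vernaux_flags_offset(long section_offset, long record_offset) {
    assert(sizeof(vernaux_t) == 16);  /* matches the 0x10 stride in table 3 */
    return section_offset + record_offset
         + (long)offsetof(vernaux_t, vna_flags);
}
```

With the numbers from the example (section at 0x002e58, record at 0x00c0) the result is 0x002f1c.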

Get your favorite hex editor and start editing the binary. Go to the location given by the final point above (0x002f1c). As it turns out, for my example binary, the two bytes starting at that file offset are 0x0000 - and this corresponds to "Flags: none" - so we are at the correct place.

To make the version weak, just put 0x2 there (0x02 corresponds to VER_FLG_WEAK; on a little-endian machine such as x86-64 it goes into the first of the two bytes), then save the file. Then we check the version again using "readelf -V xxx".

We've now got:

Version needs section '.gnu.version_r' contains 5 entries:
 Addr: 0x0000000000402e58  Offset: 0x002e58  Link: 6 (.dynstr)
  000000: Version: 1  File: libgcc_s.so.1  Cnt: 1
  0x0010:   Name: GCC_3.0  Flags: none  Version: 14
... <snip> ...
  0x0090:   Name: GLIBCXX_3.4  Flags: none  Version: 4
  0x00a0: Version: 1  File: libc.so.6  Cnt: 7
  0x00b0:   Name: GLIBC_2.4  Flags: none  Version: 13
  0x00c0:   Name: GLIBC_2.14  Flags: WEAK   Version: 12
  0x00d0:   Name: GLIBC_PRIVATE  Flags: none  Version: 10
... <snip> ...

Notice that GLIBC_2.14 is now marked as WEAK, just as we told it to be. Now try to run the binary. It will still not run, but now we get two messages (notice the mention of "weak version"):

myprogram: /lib/libc.so.6: weak version `GLIBC_2.14' not found (required by myprogram)
symbol memcpy, version GLIBC_2.14 not defined in file libc.so.6 with link time reference

It looks as if we made the situation worse (one error has now become two), but the fact that the program is complaining about a missing memcpy is in fact good news compared to when it was not complaining about it at all (which meant that the program was aborted before it even had the chance to check for missing dependencies).

So we are on the right track, and on to step 2.
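For those who would rather script the hex-editor edit from this step, here is a minimal sketch of a helper that overwrites one 16-bit field in a file. It assumes a little-endian ELF file (as on x86-64), and the name patch_half_le is made up for this example. Work on a copy of the binary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Overwrites the 16-bit little-endian value at `offset` in the file `path`.
 * Returns the value that was there before, or -1 on any I/O error. */
long patch_half_le(const char *path, long offset, uint16_t new_value) {
    unsigned char buf[2];
    long old;
    FILE *f;

    assert(path != NULL && offset >= 0);
    f = fopen(path, "r+b");
    if (f == NULL)
        return -1;
    if (fseek(f, offset, SEEK_SET) != 0 || fread(buf, 1, 2, f) != 2) {
        fclose(f);
        return -1;
    }
    old = buf[0] | (buf[1] << 8);
    buf[0] = (unsigned char)(new_value & 0xff);   /* low byte first */
    buf[1] = (unsigned char)(new_value >> 8);
    /* Reposition before switching from reading to writing. */
    if (fseek(f, offset, SEEK_SET) != 0 || fwrite(buf, 1, 2, f) != 2) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? old : -1;
}
```

For the example binary, calling it with offset 0x002f1c and value 0x2 (VER_FLG_WEAK) would correspond to the edit described above, and the returned old value should be 0 if the offset is right.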


Step 2: Add the missing functions

Obviously, our next step is to supply the missing memcpy function.

There are many ways to do this, the easiest being to create our own shared library that implements the missing functions and to use LD_PRELOAD to load it before the binary runs.

For this example, you want to implement a proper memcpy function. If you read up on how glibc 2.14 changed memcpy, you will find that memmove is a safe stand-in: it takes the same arguments and also copes with overlapping memory areas.

So this will do:

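#include <string.h>
#include <unistd.h>	/* for write() */

/* Stand-in for the missing memcpy: forward everything to memmove. */
void* memcpy(void *dest, const void *src, size_t n) {
	write(1,"yes\n",4);	/* debug marker: shows that our override is being called */
	return memmove(dest, src, n);
}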

And build it like this:

gcc -s -shared -o mylib.so -fPIC -fno-builtin mylib.c

And run the binary like this:

LD_PRELOAD=./mylib.so myprogram

And you still get the message

myprogram: /lib/libc.so.6: weak version `GLIBC_2.14' not found (required by myprogram)

but the application runs !!


Another alternative, more tedious and less reliable, is to determine whether there is another version of the same function from an older glibc version, and patch the binary so that it uses that function instead of the version from the newer glibc.

For memcpy, obviously there is, since memcpy is a well-known function and has existed for years. There is a memcpy version that has existed since glibc 2.2.5 (aka memcpy@GLIBC_2.2.5).

This is, however, a risky business, because there is a reason why those functions were superseded. In the memcpy example, the old version is not reliable if the source and destination memory areas overlap. The 2.14 version supposedly maps this to memmove (which does not have this restriction but is slower), and that's what I did with my earlier LD_PRELOAD example.
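To see why overlap matters, here is a small sketch of an in-place shift where the source and destination areas overlap. This is exactly the situation that memmove is specified to handle and memcpy is not. The function name shift_right_by_two is made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Copies the first len-2 bytes of buf to buf+2, in place. Source and
 * destination overlap, so memmove is the correct call here; with memcpy
 * the result would be undefined. */
void shift_right_by_two(char *buf, size_t len) {
    assert(buf != NULL && len >= 2);
    memmove(buf + 2, buf, len - 2);
}
```

Applied to the six characters of "abcdef", this leaves "ababcd" in the buffer.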

However, if you're game to take the risk, then read on.

To redirect a function call to an older copy of itself:

  1. You must obviously ensure that an older copy exists.

    For the memcpy example there is an older copy from GLIBC_2.2.5 (memcpy@GLIBC_2.2.5).

  2. You must ensure that the symbol version GLIBC_2.2.5 for the library that you want to use (in this case, libc.so.6) exists. Check this in the 3rd table ( .gnu.version_r table).

    For our example, again, we find that this is the case. libc.so.6 does contain GLIBC_2.2.5; it is located at offset 0x0100 (not important) and its "Version" (aka index) is 3 (this is what we need to know).

  3. The table that contains the information, and thus the table that we want to patch, is Table 2, .gnu.version. Its file offset is 0x002c42.

  4. The function that we want to patch is memcpy; from Table 1 ( .dynsym) we know that it is entry number 94 (or 0x5e).

  5. Each entry is an Elfxx_Half, or 16 bits, or two bytes. So the 0x5e-th entry corresponds to an offset of 0xbc (0x5e x 2).

  6. So the version index for memcpy is located at file offset 0x002c42 + 0xbc, or 0x002cfe.

  7. That entry, if you look with your favorite hex editor, should have the value of 0x0c (12) - which means it currently points to GLIBC_2.14.

  8. You want to change the value to 0x03 - which points to GLIBC_2.2.5.
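The offset calculation and the new entry value from the steps above can be sketched as follows. The function names are made up for this example. The second helper also preserves the top bit of the 16-bit entry, which the ELF symbol versioning format uses as a "hidden" flag; for an undefined symbol such as memcpy in this binary that bit is normally clear, so this is only a precaution.

```c
#include <assert.h>
#include <stdint.h>

/* File offset of the .gnu.version entry for a symbol. The entries are
 * 16 bits each, one per .dynsym entry, in the same order. */
long versym_entry_offset(long section_offset, unsigned long symbol_index) {
    return section_offset + (long)(symbol_index * sizeof(uint16_t));
}

/* New entry value that points at version index `new_index`, keeping
 * bit 15 (the hidden flag) of the old entry unchanged. */
uint16_t retarget_versym(uint16_t old_entry, uint16_t new_index) {
    assert(new_index <= 0x7fff);
    return (uint16_t)((old_entry & 0x8000u) | new_index);
}
```

For the example, versym_entry_offset(0x002c42, 94) is 0x002cfe, and retargeting the old entry 0x000c to index 3 yields 0x0003.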

Well, if you do that properly, and then run readelf -sV myprogram, you will see the following:

Symbol table '.dynsym' contains 265 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
... <snip> ...
    93: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __vsnprintf_chk@GLIBC_2.3.4 (2)
    94: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND memcpy@GLIBC_2.2.5 (3)
    95: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND scandir64@GLIBC_2.2.5 (3)
... <snip> ...

Version symbols section '.gnu.version' contains 265 entries:
 Addr: 0000000000402c42  Offset: 0x002c42  Link: 5 (.dynsym)
... <snip> ...
  058:   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)   4 (GLIBCXX_3.4)
  05c:   5 (GLIBC_2.2.5)   2 (GLIBC_2.3.4)   3 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)
  060:   5 (GLIBC_2.2.5)   3 (GLIBC_2.2.5)   5 (GLIBC_2.2.5)   2 (GLIBC_2.3.4)
... <snip> ...

As you can see, memcpy is now pointing at GLIBC_2.2.5. You can now run the binary without any LD_PRELOAD or self-built shared library. You still get the warning about the missing WEAK version, but the program now runs. Whether it runs successfully or not depends on what assumptions the original program makes about memcpy.

For my particular case, it happened that re-directing memcpy to its older self worked quite well too; it lasted until I finally upgraded my OS and left glibc 2.13 behind.

Final Notes

The method explained in this article is not the only one. There are many other ways to do the same; most of them will recommend that you:
  1. Upgrade glibc (if possible).
  2. Install a newer version of glibc in a non-standard location and wrap the binaries with scripts that start them with an explicit call to the dynamic linker (ld-linux.so).
  3. Install a newer version of glibc in a chroot and run the applications in the chroot.
They all have their own merits and should definitely be considered. The method I outlined above is one which is reserved for the last resort when none of the above (and others) are possible for whatever reasons.

References