Preparing Native Compiler

This article highlights what it takes to build a self-hosting, native, hard-float compiler.

The Need for a Native Compiler

Why do we need a native compiler? Plain and simple: many software packages were never designed with cross-compiling in mind, and will not build with a cross-compiler (unless you spend an inordinate amount of time fixing them).

Even those that do build aren't always compiled correctly with a cross-compiler. In fact, we're quite fortunate that the basic tools are all cross-compiler ready (util-linux, coreutils, the toolchain itself, etc.).

Native compiling on ARM is a relatively new development. Previously it wasn't possible, since most ARM platforms were low-power systems (as in, not much CPU power and very little memory) incapable of hosting a native compiler.

Another intriguing way of doing native compilation is through Qemu, without any need for a board at all. Qemu is slow, and compiling in Qemu is even slower, yet it may beat native compiling on a real ARM machine! Surprised? This interesting idea is treated at great length at Rob Landley's Aboriginal Linux; I'm not going to spoil the fun for you. In fact, the first native compiler I use (in order to short-circuit the process) is the one from Aboriginal Linux.

The Method We Use

The proper way of bootstrapping a native compiler is like this:

  1. create a cross-compiler
  2. use the cross-compiler to build the native compiler

As we did in FirstBoot, we are going to short-circuit this.

What we will do is use Aboriginal's native compiler (not the cross-compiler we used in FirstBoot) to build another native compiler, following the Linux From Scratch (LFS) method (version 7.1).

The reasons why I wanted to create my own native compiler instead of using Aboriginal's one are:

  1. I want to create my own libc, and have control over what's in and what's out.

    As it turns out, libc --- which is part of the toolchain --- is intricately linked with the rest of the tools, like the compiler, linker, etc. Using Aboriginal's compiler means I'm stuck with the libc built into it (uClibc) and I don't have control over how that libc was configured.

  2. Aboriginal's compiler is targeted at ARMv6, with soft-float.

    My development board and all the interesting boards I'm looking at are ARMv7, many of which come with a hardware floating point unit (FPU). Properly compiled, software that uses the FPU can be up to 40% faster than software using soft-float (= software floating point emulation).

  3. Aboriginal's compiler is rather dated (version 4.2.1)

    While I don't need the latest toolchain (which frequently comes with bugs), I do need something newer, as Aboriginal's one doesn't understand the ARMv7 architecture (and thus fails to compile the linux-sunxi kernel and u-boot, which is why we needed Linaro's cross-compiler in FirstBoot).

The process to cross the chasm from cross-compiler to native-compiler, assuming the libc is glibc, is as follows:
  1. create a naked-binutils - binutils that will run without libc
  2. create a naked-gcc - gcc that will run without using libc
  3. create glibc from naked-gcc and naked-binutils
  4. create binutils which uses the new glibc
  5. create gcc which uses the new binutils and new glibc
  6. we can then retire the old toolchain (gcc, binutils, libc)

Requirements and Setup

  1. Aboriginal native compiler for armv6l from here. Extract it somewhere to a device that is accessible to the Mele (I extracted it to the LFS work partition).
  2. Busybox with almost all applets enabled (the equivalents of coreutils, diffutils, findutils, and util-linux must all be enabled), with the correct symlinks already set up.

My setup:

  1. I use the initrd, so all the symlinks for busybox are already set up.

  2. I use two separate storage devices (the Mele can't boot from USB, and using an SD card for activities with a lot of reads and writes is slow).
    • SD Card: I use the second partition of my SD Card (formatted as ext3) as the "save partition", so that I can keep persistent settings.
    • A USB hard disk with two partitions: one for swap (1GB) and one for the LFS work partition (/dev/sda3, mounted at /mnt/sda3)
    • My boot.cmd and uEnv.txt for these settings are as follows:

     setenv bootargs console=${console} ${audio} root=${root} ${rootwait} ${panicargs} loglevel=${loglevel} ${basesfs} ${savefile} ${coldplug} ${extras}
     fatload mmc 0 0x43000000 script.bin
     fatload mmc 0 0x48000000 ${kernel}
     if fatload mmc 0 0x43100000 ${initrd}; then
         bootm 0x48000000 0x43100000;
     else
         bootm 0x48000000;
     fi
    
    My boot.cmd

     console=tty0
     kernel=uImage
     initrd=uInitrd
     loglevel=3
     audio=hdmi.audio=EDID:0
     video=disp.screen0_output_mode=EDID:1280x720p60
     #root=/dev/mmcblk0p2
     #rootwait=rootwait
     rootwait=waitdev=3
     panicargs=panic=10
     #basesfs=basesfs=local
     savefile=savefile=direct:device:sda3:/
     #coldplug=
     extras=sunxi_ve_mem_reserve=0 sunxi_g2d_mem_reserve=0 sunxi_no_mali_mem_reserve sunxi_fb_mem_reserve=16
    
    My uEnv.txt

Important note before you start building

The Mele doesn't have a battery backup for its RTC, which means that as soon as it is powered off the time is reset to 2010. This isn't good for the build process, as make relies heavily on file timestamps to determine whether a file is up to date or needs rebuilding. The system time must be correct (or not too far off) for that to work.

One way to do this is to enter the date and time every time the system boots up (remember the IBM PC? :) ). It gets boring pretty quickly. A better way is to save the current time before shutting down, and use this last shutdown time as the system time upon restart. The places to edit are /etc/rc.d/rcS (startup) and /etc/rc.d/rcK (shutdown). Alternatively, you can set up a script which connects to the network and syncs the time using NTP.
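The save-and-restore idea can be sketched roughly as below. This is only an illustration, assuming a POSIX shell (busybox ash should do) and a date command that accepts "-s @SECONDS"; the stamp file path /etc/last-shutdown and the function names are made up for this example and are not part of the actual Fatdog scripts.

```shell
#!/bin/sh
# Hypothetical location of the saved timestamp; any persistent path will do.
STAMP=${STAMP:-/etc/last-shutdown}

# Call from rcK (shutdown): record the current time as seconds since the epoch.
save_time() {
    date +%s > "$1"
}

# Print the saved time only if it is later than the current clock, so a
# clock that is already correct is never moved backwards.
newer_saved_time() {
    [ -r "$1" ] || return 1
    saved=$(cat "$1")
    now=$(date +%s)
    [ "$saved" -gt "$now" ] && echo "$saved"
}

# Call from rcS (startup): set the clock from the stamp file (needs root).
restore_time() {
    ts=$(newer_saved_time "$1") && date -s "@$ts"
}
```

In rcK you would call save_time "$STAMP", and in rcS restore_time "$STAMP". The clock will still lag by however long the board was switched off, but timestamps never go backwards, which is what make cares about.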

Build Steps

  1. Follow the preparation outlined in Chapter 4 of LFS 7.1.

    • Setting up is different, because we use "busybox ash" instead of bash. You can use bash if you want to (it is included in the Aboriginal toolchain), but I didn't use it initially because it has no history recall.

    • Make sure that Aboriginal's toolchain is accessible from within the "lfs" user (otherwise you can't build anything).

    • Copy the /lib directory of Aboriginal's toolchain to the /lib directory. This is important because the initial binaries are compiled using Aboriginal's library and will not run without it. It can be scrapped once our new toolchain becomes self-hosting (at the end of Chapter 5).

    • Copy the static bash from the Aboriginal toolchain to /bin. You don't need to symlink it to 'sh', but it has to be in /bin because many autotools build systems will look for it there.
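    The two copy steps above could look something like this. It is a sketch only: the function name is hypothetical, and the location of bash inside the Aboriginal toolchain is passed in as an argument because it depends on where and how you extracted it.

```shell
#!/bin/sh
# Copy Aboriginal's runtime libraries and static bash into a root filesystem.
# $1 = Aboriginal toolchain directory (contains lib/)
# $2 = path to the static bash binary inside that toolchain (assumption)
# $3 = target root, defaults to /
install_aboriginal_runtime() {
    toolchain=$1
    bash_bin=$2
    root=${3:-/}
    mkdir -p "$root/lib" "$root/bin" || return 1
    # "lib/." copies the directory contents, including symlinks, into /lib
    cp -a "$toolchain/lib/." "$root/lib/" || return 1
    cp "$bash_bin" "$root/bin/bash"
}
```

    Passing a scratch directory as the third argument lets you try it out without touching the live /lib.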

  2. Start with Chapter 5.

    1. Binutils pass #1 (naked binutils)
      Build as documented in LFS.

    2. GCC pass #1 (naked gcc - compiled to use naked binutils)
      Build as documented in LFS.

    3. Linux headers (for building glibc)
      Get the headers from linux-sunxi instead:
       cd /path/to/linux-sunxi/kernel
       make ARCH=arm headers_install INSTALL_HDR_PATH=headers
      
      and then copy /path/to/linux-sunxi/kernel/headers/include to /tools/include.

    4. glibc pass #1 (temporary glibc - built using naked gcc and binutils)
      Get the corresponding version of the glibc-ports tarball (glibc-ports-2.14.1 in this case) from the GNU glibc download area. Expand this tarball inside the glibc-2.14.1 directory and rename the extracted directory to "ports". This tarball contains the support needed to build glibc for the ARM platform.

      Then create the following configparms before you run the configure step (no quotes around the flags, as configparms is read by make and the quotes would be passed on literally):

       CFLAGS += -mfloat-abi=hard -mfpu=vfpv3
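      As a rough sketch, the ports setup and the configparms file could be scripted as follows. The function names are invented for this example, the tarball name and compression are assumptions, and configparms is written to the separate build directory as LFS does; if your tar does not auto-detect compression, add the matching flag (e.g. -j for bzip2).

```shell
#!/bin/sh
# Unpack the ports tarball inside the glibc source tree and rename the
# extracted directory to "ports".
# $1 = glibc source directory, $2 = path to the glibc-ports tarball
add_ports() {
    glibc_dir=$1
    ports_tarball=$2
    tar -xf "$ports_tarball" -C "$glibc_dir" || return 1
    mv "$glibc_dir"/glibc-ports-* "$glibc_dir/ports"
}

# Write configparms into the glibc build directory ($1) before configure runs.
write_configparms() {
    echo 'CFLAGS += -mfloat-abi=hard -mfpu=vfpv3' > "$1/configparms"
}

# Example (paths are assumptions):
#   add_ports glibc-2.14.1 glibc-ports-2.14.1.tar.bz2
#   mkdir -p glibc-build && write_configparms glibc-build
```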
      

    5. binutils pass #2 (temporary binutils - binaries produced by this tool will use temporary glibc)
      Edit config.guess and change "gnueabihf" to "gnueabi".

      Reason? The binutils build process is smart enough to detect that the (naked) gcc generates hard-float code by default, and changes the target triplet to "gnueabihf". Unfortunately, the GCC build system doesn't recognise "gnueabihf" and gets utterly confused (generating OABI instead of EABI altogether), and the build stops later (when compiling libgcc).

      There is a way of fixing this, which we will apply to the final compiler (gcc pass #3) that we build in Chapter 6.
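      The config.guess edit can be done by hand or with sed. A minimal sketch, assuming a sed that supports -i (GNU sed and busybox sed both do); the function name is hypothetical, and it keeps a backup copy so the change is easy to undo:

```shell
#!/bin/sh
# Rewrite every "gnueabihf" in the given config.guess to "gnueabi", in place.
# The untouched original is kept alongside as <file>.orig.
force_gnueabi() {
    cp "$1" "$1.orig" &&
    sed -i 's/gnueabihf/gnueabi/g' "$1"
}

# Example, run from the top of the binutils source tree:
#   force_gnueabi config.guess
```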

    6. gcc pass #2 (temporary gcc - built to use temporary binutils and glibc)
      Add the following to the configure step to build for hard-float by default:
       # "..." stands for the configure options given in LFS
       CFLAGS_FOR_TARGET="-mfloat-abi=hard -mfpu=vfpv3" \
       ../configure ... \
       --with-arch=armv7-a --with-cpu=cortex-a8 --with-tune=cortex-a8 --with-float=hard --with-fpu=vfpv3
      

      By the end of this step, you have a working temporary native compiler (in /tools) which you can use to build stuff. The toolchain you have in /tools is now comparable in power to the Aboriginal native compiler which you used to bootstrap the process.

      Don't delete that one yet, though, because the Aboriginal compiler also provides other build tools like "make" and friends that you haven't built yet (you will, soon enough).
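      One way to confirm that the new compiler really defaults to hard-float is to compile a trivial program and look at its ARM build attributes with readelf -A; objects built for the hard-float calling convention carry the line "Tag_ABI_VFP_args: VFP registers". The helper names below are hypothetical, and the sketch assumes the compiler under test and readelf are in PATH.

```shell
#!/bin/sh
# Succeeds if the `readelf -A` output on stdin marks the hard-float
# calling convention (arguments passed in VFP registers).
uses_hardfloat_abi() {
    grep -q 'Tag_ABI_VFP_args: VFP registers'
}

# Compile a trivial program with default flags and check the result.
# $1 = compiler to test (defaults to gcc)
check_default_float_abi() {
    cc=${1:-gcc}
    tmp=$(mktemp -d) || return 1
    echo 'int main(void) { return 0; }' > "$tmp/t.c"
    "$cc" -o "$tmp/t" "$tmp/t.c" && readelf -A "$tmp/t" | uses_hardfloat_abi
    status=$?
    rm -rf "$tmp"
    return $status
}
```

      For example, "check_default_float_abi /tools/bin/gcc && echo hard-float" should print "hard-float" for the compiler built in this step, and nothing for Aboriginal's soft-float one.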

    7. Build enough of the rest of Chapter 5's tools that the toolchain can build the final compiler in Chapter 6 (tools like make, m4, perl). You don't have to build all of them, as many can be replaced by busybox's versions of the same tools (gzip, bzip2, xz, coreutils, diffutils, findutils, awk, sed, tar, etc.). The busybox versions are definitely good enough to bootstrap Chapter 5, and they are also good enough for Chapter 6. Bash will fail to build because we don't have bison/yacc, so just skip it and use the ready-made bash from Aboriginal.

      Building the test packages (Tcl, Expect, DejaGNU, and Check) is optional although it is recommended.

    8. Perl will fail to build because it expects to find errno.h in /usr/include, as noted here: http://www.linuxquestions.org/questions/linux-newbie-8/perl-compilation-failing-no-error-definitions-found-at-errno_pm-pl-922506/
      Patch ext/Errno/Errno_pm.PL to make it find errno.h in /tools/include.

      By the end of Chapter 5, you can remove the Aboriginal native compiler, as you have fully bootstrapped the native toolchain.

  3. On to Chapter 6

    Before you build, read the initial sections of Chapter 6, the ones on package management. How do you plan to do it?

    As for me, I use paco to log the packages that I install in Chapter 6 (which will reside in the final root filesystem instead of in /tools). This tool also enables me to create a binary tarball which I can re-use later. At the time of writing, I have not yet decided which package manager I will use (although slapt-get and gslapt sound nice) --- the most important thing is that I have binary tarball snapshots of the new installation, which I can convert into any other package format in the future. Paco comes in handy for that.

    Once that step is cleared, we can continue to build. The most critical steps here are those up to building the final compiler (step 6.17 - GCC pass 3).

    1. glibc pass #2 (final glibc - residing in root filesystem, not in /tools)
      Use the following configparms to ensure it is compiled in hard-float:
       CFLAGS += -mfloat-abi=hard -mfpu=vfpv3 -march=armv7-a -mcpu=cortex-a8 -mtune=cortex-a8 -pipe -O3
      

    2. binutils pass #3 (final binutils - binaries produced will use final glibc)
      No additional patches are needed. Do not patch "gnueabihf" to "gnueabi", as this time we will be patching gcc instead.

    3. gcc pass #3 (final gcc - built to use final binutils and glibc)
      Apply the patch from here: http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00444.html to make it "gnueabihf" aware. Then edit config.guess, replacing "gnueabi" with "gnueabihf".
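      The config.guess edit for this pass can again be scripted. The sketch below is an illustration with a made-up function name; it first normalises any existing "gnueabihf" back to "gnueabi" so that running it twice does not produce "gnueabihfhf".

```shell
#!/bin/sh
# Make config.guess report "gnueabihf" wherever it said "gnueabi".
# Step 1 collapses any existing "gnueabihf" so step 2 cannot double the suffix.
force_gnueabihf() {
    sed -i -e 's/gnueabihf/gnueabi/g' -e 's/gnueabi/gnueabihf/g' "$1"
}

# Example, run from the top of the patched gcc source tree:
#   force_gnueabihf config.guess
```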

      Use the configure line from LFS, adding this at the end so that hard-float is the default:

       --with-arch=armv7-a --with-cpu=cortex-a8 --with-tune=cortex-a8 \
       --with-fpu=vfpv3 --with-float=hard
      

    4. That's it.

      You can choose to continue building the rest of the tools: autoconf, automake, make, patch, perl, m4, etc. are needed to build other packages; the rest is really up to you. Again, you can build busybox and use its tools, or you can build and use the full coreutils, diffutils, etc.

      I don't use the rest of LFS, e.g. Chapter 7 (bootscripts and the boot process), because my system is already bootable using busybox and the Fatdog initrd. Instead, I will build more packages and get them to run inside a chroot. Once this is working, I will compress the root filesystem and turn it into an SFS, the basesfs SFS of FatdogArm.

Next: BuildingApplications