Seamonkey illegal instruction

The Seamonkey compilation for FatdogArm is finally done. I must have compiled it more than 10 times; I have lost count. Once I overcame the libxul.so linking failure, everything went smoothly, except that the resulting binary refused to run, dying with a terse "illegal instruction" or sometimes "segmentation fault". This happened on every subsequent compile, no matter which configure flags I used.

I went through 4 or 5 more compiles before I realised what was wrong, and this is the story.

Once I was quite certain that the crash had nothing to do with the configure flags, I tried to use gdb to track it down, but it didn't really help. It didn't give a meaningful stack trace and it refused to disassemble the location that contained the illegal instruction. I couldn't even put a breakpoint on or near the crash location.

Then I ran strace. It gave me something useful: the last system call made before the crash, which was an open of /proc/self/auxv. There are a few locations in the seamonkey code base that do this, but none of them seemed plausible to me (a few of those locations are in libraries which didn't even get compiled in because of the configure flags I used).

Then I looked at the content of /proc/self/auxv itself. Among other things, it gave me the base address of the dynamic linker (ld-linux.so). Comparing that address with the addresses shown in strace, it became clear that those system calls, including the last one before the crash, were not seamonkey's; they were in fact made by the dynamic linker. This was confirmed by looking at /proc/self/maps: those addresses were indeed mapped to ld-linux.so. What does it mean? It means that the crash happened before seamonkey itself started to run; it happened in the dynamic linker, while it was preparing seamonkey for execution.
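The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the tooling I actually used: the function names parse_auxv and find_mapping are made up for this example, and it assumes the usual Linux layout of the auxiliary vector (pairs of native-sized words terminated by an AT_NULL entry, with AT_BASE = 7 holding the dynamic linker's base address).

```python
import struct

AT_NULL = 0  # marks the end of the auxiliary vector
AT_BASE = 7  # base address of the program interpreter (the dynamic linker)


def parse_auxv(data, word_size=struct.calcsize("P")):
    """Decode raw auxv bytes into a {type: value} dict.

    word_size is 4 on 32-bit targets (such as 32-bit ARM) and 8 on 64-bit ones.
    """
    code = {4: "I", 8: "Q"}[word_size]
    entry = struct.Struct("=" + code * 2)  # native byte order, fixed size
    usable = len(data) - len(data) % entry.size  # ignore any trailing partial entry
    result = {}
    for key, value in entry.iter_unpack(data[:usable]):
        if key == AT_NULL:
            break
        result[key] = value
    return result


def find_mapping(maps_text, address):
    """Return the path of the mapping that contains address, or None.

    maps_text is the text of a /proc/<pid>/maps file.
    """
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if not fields:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        if start <= address < end:
            # Anonymous mappings have no path column.
            return fields[5].strip() if len(fields) > 5 else "[anonymous]"
    return None
```

To try it, read /proc/self/auxv in binary mode and /proc/self/maps as text, then call find_mapping(maps_text, parse_auxv(auxv_bytes)[AT_BASE]). Note that this inspects the Python interpreter's own process; for a binary that dies as early as seamonkey did, the address to look up would be one taken from the crash or from strace instead.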

The only plausible reason for a crash like that is that the executable (or one of its dynamic libraries) is corrupted.

But who or what can corrupt freshly compiled binaries? It can't be gcc or the linker (unless seamonkey somehow managed to trigger a very subtle and obscure toolchain bug), because all the other binaries I have built so far work perfectly. As it turns out, the culprit was elfhack, a Mozilla build step that rewrites the freshly linked ELF libraries to pack their relocations: https://wiki.mozilla.org/Elfhack.

Once I realised this, it was straightforward to add "--disable-elf-hack" to the configure flags. The resulting seamonkey works very well; indeed, the Calendar function now works too (it didn't work in 2.19). I wonder what changed between 2.19 and 2.20, because I certainly didn't use that switch when I built 2.19, and 2.19 compiled cleanly on the first attempt. I could have looked at the differences, but for now I'm happy that SM 2.20 works.

SM 2.20 will be the default browser in the alpha2 release of FatdogArm.



Posted on 12 Sep 2013, 17:58 - Categories: FatdogArm Linux Arm