Extreme C
Executable Object Files

Now it's time to talk about executable object files. You should know by now that an executable object file is one of the final products of a C project. Like relocatable object files, executable object files contain the same items: the machine-level instructions, the values for initialized global variables, and the symbol table. However, the arrangement can be different. We can demonstrate this with ELF executable object files, since they are easy to generate and their internal structure is easy to study.

In order to produce an executable ELF object file, we continue with example 3.1. In the previous section, we generated relocatable object files for the two sources existing in the example, and in this section, we are going to link them to form an executable file.

The following command does that for you, as explained in the previous chapter:

$ gcc funcs.o main.o -o ex3_1.out

$

Shell Box 3-5: Linking previously built relocatable object files in example 3.1

In the previous section, we spoke about the sections present in an ELF object file. An ELF executable object file contains more sections than a relocatable one, and it also contains segments. Every ELF executable object file, and, as you will see later in this chapter, every ELF shared object file, has a number of segments in addition to sections. Each segment consists of zero or more sections, and the sections are put into segments based on their content.

For example, all sections containing machine-level instructions go into the same segment. You will see in Chapter 4, Process Memory Structure, that these segments nicely map to static memory segments found in the memory layout of a running process.

Let's look at the contents of an executable file and meet these segments. As with relocatable object files, we can use the readelf command to show the sections, and this time also the segments, found in an executable ELF object file:

$ readelf -hSl ex3_1.out

ELF Header:

Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00

Class: ELF64

Data: 2's complement, little endian

Version: 1 (current)

OS/ABI: UNIX - System V

ABI Version: 0

Type: DYN (Shared object file)

Machine: Advanced Micro Devices X86-64

Version: 0x1

Entry point address: 0x4f0

Start of program headers: 64 (bytes into file)

Start of section headers: 6576 (bytes into file)

Flags: 0x0

Size of this header: 64 (bytes)

Size of program headers: 56 (bytes)

Number of program headers: 9

Size of section headers: 64 (bytes)

Number of section headers: 28

Section header string table index: 27

Section Headers:

[Nr] Name Type Address Offset

Size EntSize Flags Link Info Align

[ 0] NULL 0000000000000000 00000000

0000000000000000 0000000000000000 0 0 0

[ 1] .interp PROGBITS 0000000000000238 00000238

000000000000001c 0000000000000000 A 0 0 1

[ 2] .note.ABI-tag NOTE 0000000000000254 00000254

0000000000000020 0000000000000000 A 0 0 4

[ 3] .note.gnu.build-i NOTE 0000000000000274 00000274

0000000000000024 0000000000000000 A 0 0 4

...

[26] .strtab STRTAB 0000000000000000 00001678

0000000000000239 0000000000000000 0 0 1

[27] .shstrtab STRTAB 0000000000000000 000018b1

00000000000000f9 0000000000000000 0 0 1

Key to Flags:

W (write), A (alloc), X (execute), M (merge), S (strings), I (info),

L (link order), O (extra OS processing required), G (group), T (TLS),

C (compressed), x (unknown), o (OS specific), E (exclude),

l (large), p (processor specific)

Program Headers:

Type Offset VirtAddr PhysAddr

FileSiz MemSiz Flags Align

PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040

0x00000000000001f8 0x00000000000001f8 R 0x8

INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238

0x000000000000001c 0x000000000000001c R 0x1

[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

...

GNU_EH_FRAME 0x0000000000000714 0x0000000000000714 0x0000000000000714

0x000000000000004c 0x000000000000004c R 0x4

GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000

0x0000000000000000 0x0000000000000000 RW 0x10

GNU_RELRO 0x0000000000000df0 0x0000000000200df0 0x0000000000200df0

0x0000000000000210 0x0000000000000210 R 0x1

Section to Segment mapping:

Segment Sections...

00

01 .interp

02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame

03 .init_array .fini_array .dynamic .got .data .bss

04 .dynamic

05 .note.ABI-tag .note.gnu.build-id

06 .eh_frame_hdr

07

08 .init_array .fini_array .dynamic .got

$

Shell Box 3-6: The ELF content of ex3_1.out executable object file

A few notes about the output in Shell Box 3-6:

  • We can see that, from the ELF point of view, the type of the object file is a shared object file (DYN). In other words, in ELF, an executable object file is a shared object file that has some specific segments, such as INTERP. This segment (actually the .interp section that it refers to) names the loader program, /lib64/ld-linux-x86-64.so.2 in this output, that is used to load and execute the executable object file.
  • Four segments in the Section to Segment mapping are worth noting: 01, 02, 03, and 04. The first one is the INTERP segment explained in the previous bullet point. The second one is the TEXT segment. It contains all the sections that hold machine-level instructions. The third one is the DATA segment, which contains all the values used to initialize the global variables and other early structures. The fourth one refers to the .dynamic section, where the information related to dynamic linking can be found, for instance, the shared object files that need to be loaded as part of the execution.
  • As you can see, there are more sections than in a relocatable object file, probably filled with the data required to load and execute the object file.

As we explained in the previous section, the symbols found in the symbol table of a relocatable object file do not have absolute, final addresses. That's because the sections containing machine-level instructions are not linked yet.

In a deeper sense, linking a number of relocatable object files means collecting all similar sections from the given relocatable object files, putting them together to form a bigger section, and finally putting the resulting section into the output executable or shared object file. Only after this step can the symbols be finalized and obtain addresses that are not going to change. In executable object files, these addresses are absolute, while in shared object files, it is the relative addresses that are fixed. We will discuss this more in the section dedicated to dynamic libraries.

Let's look at the symbol table found in the executable file ex3_1.out. Note that the symbol table has many entries and that's why the output is not fully shown in the following shell box:

$ readelf -s ex3_1.out

Symbol table '.dynsym' contains 6 entries:

Num: Value Size Type Bind Vis Ndx Name

0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND

...

5: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (2)

Symbol table '.symtab' contains 66 entries:

Num: Value Size Type Bind Vis Ndx Name

0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND

...

45: 0000000000201000 0 NOTYPE WEAK DEFAULT 22 data_start

46: 0000000000000610 47 FUNC GLOBAL DEFAULT 13 max_3

47: 0000000000201014 4 OBJECT GLOBAL DEFAULT 22 b

48: 0000000000201018 0 NOTYPE GLOBAL DEFAULT 22 _edata

49: 0000000000000704 0 FUNC GLOBAL DEFAULT 14 _fini

50: 00000000000005fa 22 FUNC GLOBAL DEFAULT 13 max

51: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_

...

64: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@@GLIBC_2.2

65: 00000000000004b8 0 FUNC GLOBAL DEFAULT 10 _init

$

Shell Box 3-7: The symbol tables found in the ex3_1.out executable object file

As you can see in the preceding shell box, an executable object file has two different symbol tables. The first one, .dynsym, contains the symbols that should be resolved when loading the executable. The second one, .symtab, contains all the resolved symbols together with the unresolved symbols brought over from the dynamic symbol table.

As you can see, the resolved symbols in the symbol table have the final addresses they obtained after the linking step. For example, the max and max_3 symbols are located at 0x5fa and 0x610, respectively.

In this section, we took a brief look into the executable object file. In the next section, we are going to talk about static libraries.