Mar 22, 2011

Windows PE Header

Understanding Windows PE Header is essential to perform Reverse Engineering Malware.

Each executable file has a COFF (Common Object File Format), which is used from the OS loader to run the program. Windows Portable Executable (PE) is one of the COFF available in Windows Operating System, while the Executable Linking File (ELF) is the main Linux COFF.

Microsoft migrated to the PE format with the introduction of the Windows NT 3.1 operating system. PE/COFF headers still include an MS-DOS executable program, which is by default a stub that displays the simple message "This program cannot be run in DOS mode" (or similar). PE also continues to serve the changing Windows platform. Some extensions include the .NET PE format (see below), a 64-bit version called PE32+ (sometimes PE+), and a specification for Windows CE.

MZ are the first 2 bytes you will see in any PE file opened in a hex editor. The DOS header occupies the first 64 bytes of the file - ie the first 4 rows seen in the hexeditor in the picture below. The last DWORD before the DOS stub begins contains 00h 01h 00h 00h, which is the offset where the PE header begins. The DOS stub is the piece of software that runs if the executable is run from DOS environment (for example DOS shell). For retro-compatibility it often executes a printf("This program must be run under Win32");. 

The PE header begins with its signature 50h, 45h, 00h, 00h (the letters "PE" followed by two terminating zeroes). If in the Signature field of the PE header, you find an NE signature here rather than a PE, you're working with a 16-bit Windows New Executable file. Likewise, an LE in the signature field would indicate a Windows 3.x virtual device driver (VxD). An LX here would be the mark of a file for OS/2 2.0. FileHeader is the next 20 bytes of the PE file and contains info about the physical layout & properties of the file e.g. number of sections. OptionalHeader is always present and forms the next 224 bytes. It contains info about the logical layout inside the PE file e.g. AddressOfEntryPoint. Its size is given by a member of FileHeader. The structures of these members are also defined in windows.inc.

Not all these section must be used, but you need to modify the NumberOfSections to add or delete sections from a PE file. The best way to analyze those section is by using PEExplorer or PEID.

EntryPoint is The Relative Virtual Addresses (RVA) of the first instruction that will be executed when the PE loader is ready to run the PE file. If you want to divert the flow of execution right from the start, you need to change the value in this field to a new RVA and the instruction at the new RVA will be executed first. Executable packers usually redirect this value to their decompression stub, after which execution jumps back to the original entry point of the app the OEP. Of further note is the Starforce protection in which the CODE section is not present in the file on disk but is written into virtual memory on execution.

ImageBase is the preferred load address for the PE file. For example, if the value in this field is 400000h, the PE loader will try to load the file into the virtual address space starting at 400000h. The word "preferred" means that the PE loader may not load the file at that address if some other module already occupied that address range. In 99% of cases it is 400000h.

SectionAlignment is the granularity of the alignment of the sections in memory. For example, if the value in this field is 4096 (1000h), each section must start at multiples of 4096 bytes. If the first section is at 401000h and its size is 10 bytes, the next section must be at 402000h even if the address space between 401000h and 402000h will be mostly unused.

FileAlignment is the granularity of the alignment of the sections in the file. For example, if the value in this field is 512 (200h), each section must start at multiples of 512 bytes. If the first section is at file offset 200h and the size is 10 bytes, the next section must be located at file offset 400h: the space between file offsets 522 and 1024 is unused/undefined.

SizeOfImage is the overall size of the PE image in memory. It's the sum of all headers and sections aligned to SectionAlignment.

SizeOfHeaders is the size of all headers + section table. In short, this value is equal to the file size minus the combined size of all sections in the file. You can also use this value as the file offset of the first section in the PE file.

DataDirectory It is the final 128 bytes of OptionalHeader, which in turn is the final member of the PE header IMAGE_NT_HEADERS. DataDirectory is an array of 16 IMAGE_DATA_DIRECTORY structures, 8 bytes apiece, each relating to an important data structure in the PE file. Each array refers to a predefined item, such as the import table. The structure has 2 members which contain the location and size of the data structure in question: VirtualAddress is the relative virtual address (RVA) of the data structure , and isize contains the size in bytes of the data structure.



Windows PE Header


References: