PE Header Fundamentals: The First Step in Malware Analysis
This beginner-friendly guide introduces the fundamentals of the Portable Executable (PE) format, focusing on how the PE header and its various sections work.
This guide covers the basics of the Portable Executable (PE) format, including PE headers, imports, and exports, how to detect packing or obfuscation, and how to use tools like Detect It Easy (DIE) to analyze Windows executables.
What is the Portable Executable (PE) Format?
- The format of a file can reveal a lot of information about the functionality of the program. The Portable Executable (PE) file format is used by Windows executables, object code, and DLLs. The PE file format is a data structure that contains the information necessary for the Windows OS loader to manage the wrapped executable code. Nearly every file with executable code that is loaded by Windows is in the PE file format. Most of the malware we analyze is built for the Windows OS, so understanding how to extract information from PE files is critical.
PE Files begin with a header which includes information about the code, the type of application, required library functions, and space requirements. The information in a PE header is very valuable for formulating an effective approach to analyzing that specific binary.
Running the “file” command on a file is a quick way to determine what type of file you’re dealing with.
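The same check can be approximated by hand, which helps show where the PE header actually lives in the file. The sketch below is a minimal illustration using only the Python standard library; the function name `looks_like_pe` is hypothetical. It relies on two facts from the PE format: the file starts with the bytes `MZ`, and the 4-byte field at offset 0x3C holds the offset of the `PE\0\0` signature.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if the buffer has an MZ header that points at a PE signature."""
    # The DOS header is 64 bytes long and begins with the ASCII bytes "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew (4 bytes, little-endian, at offset 0x3C) is the file offset
    # of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # The signature is the four bytes "PE\0\0". A short or out-of-range
    # slice simply fails the comparison.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

To try it on a real file, read the file's bytes (for example with `open(path, "rb").read()`) and pass them to the function. This only confirms the signatures are present; it does not validate the rest of the header.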
What are PE Headers?
- The PE file format contains a header followed by a series of sections. The header contains metadata about the file itself. Following the header are the actual sections of the file, each of which contains useful information.
Important Sections of the PE File
- .text: This section contains the instructions that the CPU executes. Typically this is the only section that can execute and it should be the only section that includes code.
- .rdata: The .rdata section typically contains the import and export information. This section can also store other read-only data used by the program.
- .data: The .data section contains the program's global data. Global data is accessible from anywhere in the program.
- .rsrc: This section includes the resources used by the executable. This includes things like icons, images, menus, and strings.
PE File Sections Table
Section Name | Description |
---|---|
.text | Contains the executable code |
.rdata | Holds read-only data that is globally accessible within the program |
.data | Stores global data accessed throughout the program |
.idata | Stores the import function information (Not always present) |
.edata | Stores the export function information (Not always present) |
.pdata | Present only in 64-bit executables; stores exception-handling information |
.rsrc | Stores resources needed by the executable (images, icons, etc.) |
.reloc | Contains information for relocation of library files |
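To see which of these sections a given file actually contains, the section table can be read directly from the headers. The following is a rough sketch using only the standard library, not a replacement for a real parser; `Section` and `list_sections` are hypothetical names. It assumes the standard layout: a 20-byte COFF header after the `PE\0\0` signature, then the optional header, then one 40-byte entry per section.

```python
import struct
from typing import List, NamedTuple

IMAGE_SCN_MEM_EXECUTE = 0x20000000  # section characteristic flag: executable


class Section(NamedTuple):
    name: str
    virtual_size: int   # size of the section once loaded in memory
    raw_size: int       # size of the section's data on disk
    executable: bool


def list_sections(data: bytes) -> List[Section]:
    """Parse the section table of a PE file held in memory."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    coff = pe_offset + 4
    # COFF header: NumberOfSections at +2, SizeOfOptionalHeader at +16.
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        (name, vsize, _vaddr, raw_size, _raw_ptr,
         _relocs, _lines, _nrelocs, _nlines, flags) = struct.unpack_from(
            "<8sIIIIIIHHI", data, table + i * 40)
        sections.append(Section(
            name.rstrip(b"\x00").decode("ascii", "replace"),
            vsize, raw_size, bool(flags & IMAGE_SCN_MEM_EXECUTE)))
    return sections
```

Printing the result for a sample gives a quick view of section names, which sections are executable, and how the on-disk size compares with the in-memory size.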
Common Tools for PE Analysis
PEStudio
PEStudio is a powerful static analysis tool that quickly scans Portable Executable (PE) files for suspicious indicators. It highlights metadata like imported libraries, strings, sections, and resources, and flags anomalies such as blacklisted functions, unsigned files, or unusual entropy levels. It’s often used as a first step to determine whether a file warrants deeper analysis.
PEview
PEview is a lightweight tool that allows you to explore the structure of a PE file in detail. It breaks down headers, sections, imports, and exports in a clean and readable format. While it doesn’t analyze the file’s behavior, it’s excellent for understanding the internal organization of an executable and identifying unusual modifications.
DIE (Detect It Easy)
DIE, short for “Detect It Easy,” is a versatile tool that identifies packers, cryptors, and compilers within PE files. It supports multiple file formats and provides detailed signatures and entropy information to help analysts determine if a file has been packed or obfuscated. Its straightforward interface and flexibility make it a great alternative to older tools like PEiD.
Examining PE Files with DIE (Detect It Easy)
- [Overview of image file headers and how to analyze them with DIE]
Imports and Exports of a PE File
- Imports
- Imports are functions or libraries that the executable relies on to interact with the Windows operating system. These functions are loaded from external DLLs (Dynamic Link Libraries), enabling the binary to perform system-level operations, such as file handling, network communication, or memory management.
- Exports
- Exports are functions or symbols made available by a binary for use by other executables or libraries. These functions are typically provided by DLLs to share functionality, such as APIs or reusable code, with other programs. Exported functions allow external programs to call into the binary and use its features.
- Common DLLs and Their Purposes
Kernel32.dll
Provides core Windows API functionality such as memory management, file operations, and process/thread creation. Almost all Windows programs rely on this DLL.
Advapi32.dll
Handles advanced system operations like registry access, service control, and security-related tasks such as user authentication and cryptographic functions.
User32.dll
Manages user interface components like windows, buttons, and mouse input. It is crucial for GUI-based applications.
Gdi32.dll
Supports Graphics Device Interface (GDI) functions, which allow programs to create and manipulate visual objects like fonts, images, and shapes for rendering in the UI.
Ntdll.dll
Acts as a bridge to the Windows kernel, providing low-level system functions. Malware often interacts with Ntdll.dll directly for stealthy operations, bypassing standard API calls.
Wsock32.dll/Ws2_32.dll
Provides networking functionality via the Winsock API. Used for socket programming to enable TCP/IP communication between computers. Ws2_32.dll is the updated version and more commonly used in modern programs.
Wininet.dll
Enables Internet-related functions such as HTTP and FTP communication. It is frequently used in applications that require downloading or uploading data over the web.
For more information, check out this awesome resource: MalAPI.io
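During triage it can help to turn a list of imported DLL names into a rough capability summary. The sketch below simply encodes the descriptions above as a lookup table; `DLL_CAPABILITIES` and `summarize_imports` are hypothetical names, and the DLL list is assumed to come from a tool such as PEStudio or DIE.

```python
# Capability notes taken from the DLL descriptions in this guide.
DLL_CAPABILITIES = {
    "kernel32.dll": "core API: memory, files, processes and threads",
    "advapi32.dll": "registry, services, security and cryptography",
    "user32.dll": "user interface",
    "gdi32.dll": "graphics",
    "ntdll.dll": "low-level interface to the Windows kernel",
    "wsock32.dll": "networking (Winsock)",
    "ws2_32.dll": "networking (Winsock)",
    "wininet.dll": "HTTP and FTP communication",
}


def summarize_imports(dll_names):
    """Map each imported DLL name to a capability note, or 'unknown'."""
    # DLL names are case-insensitive on Windows, so compare in lowercase.
    return {name: DLL_CAPABILITIES.get(name.lower(), "unknown")
            for name in dll_names}
```

An "unknown" result only means the DLL is not in this small table, not that it is suspicious.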
- Function Naming Conventions
- Functions with “W” in Their Name
In Windows API functions, a “W” at the end of a function name indicates that the function takes wide-character (Unicode) strings. For example, CreateFileW expects string parameters in Unicode format, providing better support for internationalization.
The counterparts to these functions are those ending in “A”, which use ANSI encoding for string parameters (e.g., CreateFileA). Windows internally defaults to the Unicode-based functions (W versions) to avoid issues with character encoding and ensure compatibility with modern applications.
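When comparing import lists, it is often convenient to treat the A and W variants of a function as the same API. The helper below is a naive heuristic sketch, not an authoritative rule: `split_api_name` is a hypothetical name, and it assumes a trailing uppercase `A` or `W` that directly follows a lowercase letter is an encoding suffix. That assumption can misfire on names that simply happen to end that way.

```python
def split_api_name(name: str):
    """Split an API name into (base, variant); variant is 'ANSI', 'Unicode', or None."""
    # Heuristic: "CreateFileW" ends in an uppercase W right after a lowercase
    # letter, while "ShowWindow" ends in a lowercase w and is left untouched.
    if len(name) > 1 and name[-1] in "AW" and name[-2].islower():
        variant = "ANSI" if name[-1] == "A" else "Unicode"
        return name[:-1], variant
    return name, None
```

With this, `CreateFileA` and `CreateFileW` both normalize to the base name `CreateFile`, which makes lookups against an API reference simpler.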
Static Analysis of a Malware Sample in Practice
- [Show a sample being analyzed, link to YouTube video timestamp]
- [Example with fewer imports indicating packing or obfuscation]
- [Explain virtual size vs. actual size in sections]
Packed and Obfuscated Malware
Obfuscation
Malware authors use obfuscation to hide the true intent of their code and make analysis more difficult. Obfuscation can involve renaming variables and functions to meaningless names, inserting junk code, or using encryption to disguise strings and instructions. These techniques hinder reverse engineers by making it harder to understand the malware’s functionality and detect malicious behavior during static analysis.
Packing
Packing is a technique used by malware authors to compress or encode the original code of an executable, often wrapping it in a loader program. This process reduces the visibility of the executable’s contents, making it harder for analysts and antivirus programs to inspect. During execution, the loader unpacks the original code in memory, where it is then executed.
Indicators of Packing
Packed malware often exhibits certain characteristics that can raise red flags during analysis, such as:
- Minimal or encrypted strings: Strings like function names, URLs, or error messages may be missing or appear as gibberish.
- Odd section sizes: Sections such as .text, .data, or .rdata might have unusually large or small sizes compared to typical programs.
- High entropy: Packed sections are highly compressed or encrypted, leading to higher entropy (randomness).
- Suspicious imports: A packed executable might import only a few functions, like LoadLibrary or GetProcAddress, to dynamically resolve additional APIs at runtime.
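The entropy indicator is easy to compute yourself. The sketch below calculates Shannon entropy in bits per byte, which ranges from 0 (a single repeated byte value) to 8 (all byte values equally likely). The function name `shannon_entropy` is hypothetical, and the common practice of treating values above roughly 7 as a hint of compression or encryption is a rule of thumb assumed here, not a hard threshold.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    # H = -sum(p * log2(p)) over the observed byte values.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Running this on the raw bytes of each section, rather than on the whole file, makes it easier to spot a single packed section inside an otherwise ordinary executable.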
Detecting and Unpacking
To analyze packed malware, identifying the packer used is often the first step. Tools like PEiD, Detect It Easy (DIE), or Exeinfo PE can help recognize common packers. Once identified, unpacking involves these basic steps:
- Dynamic Analysis: Run the malware in a controlled environment (sandbox or VM) to observe its behavior and extract the unpacked code from memory.
- Breakpoint Placement: Use debuggers like x64dbg to set breakpoints on key API calls (e.g., VirtualAlloc, LoadLibrary).
- Manual Dumping: After the malware unpacks itself in memory, dump the memory to obtain the original executable for further analysis.
Unpacking is an essential skill in malware analysis, as it allows analysts to strip away layers of obfuscation and examine the true functionality of malicious code.
Additional Visual Aids
- [Optional Additional Graphics:
- Timeline/flowchart of a static analysis process
- Diagram of the layered structure of a PE file]
PE Header Summary
The PE header provides critical information about the structure and behavior of a Portable Executable (PE) file. Here are the key points to understand:
- Imports: Define the external functions and libraries the executable relies on, such as DLLs used for system operations (e.g., Kernel32.dll, User32.dll).
- Exports: List the functions or symbols that the executable makes available for other programs, often used in shared libraries like DLLs.
- Timestamp: Indicates when the executable was compiled, though it can be manipulated by malware authors to evade detection or analysis.
- Sections: Represent logical divisions of the executable, such as .text (code), .data (initialized data), and .rdata (read-only data). Analyzing these sections can reveal packed or obfuscated content.
- Subsystems: Specify the environment required for execution, such as Windows GUI for graphical applications or Windows CUI for console programs.
- Resources: Contain embedded assets like icons, images, dialogs, or version information. Malware often hides additional payloads or configuration data in resource sections.
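Two of these fields, the compile timestamp and the subsystem, sit at fixed offsets and can be read with a few lines of code. The following is a minimal sketch using only the standard library; `header_summary` is a hypothetical name. It assumes the standard layout in which TimeDateStamp is 4 bytes into the COFF header and Subsystem is 68 bytes into the optional header (the same offset in 32-bit and 64-bit files). Only three subsystem values are mapped here.

```python
import struct
from datetime import datetime, timezone

# Subset of the subsystem values defined by the PE format.
SUBSYSTEMS = {1: "Native", 2: "Windows GUI", 3: "Windows CUI"}


def header_summary(data: bytes) -> dict:
    """Extract the compile timestamp and subsystem from a PE file in memory."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    coff = pe_offset + 4
    # TimeDateStamp: seconds since 1970-01-01 UTC, at COFF header offset 4.
    (timestamp,) = struct.unpack_from("<I", data, coff + 4)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    if opt_size < 70:
        raise ValueError("optional header too small to hold Subsystem")
    # Subsystem: 2-byte field at offset 68 of the optional header.
    (subsystem,) = struct.unpack_from("<H", data, coff + 20 + 68)
    return {
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "subsystem": SUBSYSTEMS.get(subsystem, f"other ({subsystem})"),
    }
```

Because the timestamp can be forged, treat the decoded date as a lead to corroborate rather than as evidence on its own.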
Additional Resources & Next Steps
- [Links to reference materials, tools, and further reading]
Additional Suggestions:
- Extra Sections to Consider:
- Real-World Case Study: A short section on an infamous malware sample and how its PE structure revealed crucial details.
- Additional Custom Graphics:
- A simplified flowchart showing the steps an analyst might take during initial static analysis of a PE file.