PE Header Fundamentals: The First Step in Malware Analysis

This beginner-friendly guide introduces the fundamentals of the Portable Executable (PE) format, focusing on how the PE header and its various sections work.

This guide covers the basics of the Portable Executable (PE) format: PE headers, imports and exports, detecting packing or obfuscation, and using tools like Detect It Easy (DIE) to analyze Windows executables.


What is the Portable Executable (PE) Format?

  • The format of a file can reveal a lot of information about the functionality of the program. The Portable Executable (PE) file format is used by Windows executables, object code, and DLLs. The PE file format is a data structure that contains the information necessary for the Windows OS loader to manage the wrapped executable code. Nearly every file with executable code that is loaded by Windows is in the PE file format. Most of the malware we analyze is built for the Windows OS, so understanding how to extract information from PE files is critical.

PE files begin with a header that includes information about the code, the type of application, required library functions, and space requirements. The information in a PE header is very valuable for formulating an effective approach to analyzing that specific binary.

Running the “file” command on a file is a quick way to determine what type of file you’re dealing with.

[Image: output of the file command]
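
The same quick check can be approximated by hand. Below is a minimal sketch, not how the file command itself is implemented: it looks for the "MZ" magic at the start of the data, follows the 4-byte pointer at offset 0x3C, and checks for the "PE\0\0" signature there. The function name looks_like_pe is made up for this example.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if data starts with a DOS header that points at a PE signature."""
    # Every PE file starts with the DOS magic "MZ"; the DOS header is 0x40 bytes.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The little-endian DWORD at offset 0x3C (e_lfanew) holds the file
    # offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, so this is simply False.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

To try it on a real sample, read the file in binary mode and pass the bytes to the function. Only the headers are inspected, so the first few kilobytes are normally enough.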


What are PE Headers?

  • The PE file format contains a header followed by a series of sections. The header contains metadata about the file itself. Following the header are the actual sections of the file, each of which contains useful information.
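
To make the header metadata concrete, here is a small sketch that reads the 20-byte COFF file header that follows the PE signature. The field layout comes from the PE specification; the function name, the returned dictionary keys, and the short machine-type table are choices made for this example, and the table is far from complete.

```python
import struct
from datetime import datetime, timezone

# A few common Machine values; many more exist.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}
IMAGE_FILE_DLL = 0x2000  # Characteristics flag set for DLLs


def parse_file_header(data: bytes) -> dict:
    """Decode a few fields of the COFF file header of a PE image."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    machine, sections, stamp, _, _, opt_size, flags = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": sections,
        # Seconds since 1970-01-01 UTC; this value can be forged.
        "compiled": datetime.fromtimestamp(stamp, tz=timezone.utc),
        "optional_header_size": opt_size,
        "is_dll": bool(flags & IMAGE_FILE_DLL),
    }
```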

Important Sections of the PE File

  • .text: This section contains the instructions that the CPU executes. Typically this is the only section that can execute and it should be the only section that includes code.
  • .rdata: The .rdata section typically contains the import and export information. This section can also store other read-only data used by the program.
  • .data: The .data section contains the program's global data. Global data is accessible from anywhere in the program.
  • .rsrc: This section includes the resources used by the executable. This includes things like icons, images, menus, and strings.

PE File Format

PE File Sections Table

Section Name    Description
.text           Contains the executable code
.rdata          Holds read-only data that is globally accessible within the program
.data           Stores global data accessed throughout the program
.idata          Stores the import function information (not always present)
.edata          Stores the export function information (not always present)
.pdata          Present only in 64-bit executables; stores exception-handling information
.rsrc           Stores resources needed by the executable (images, icons, etc.)
.reloc          Contains relocation information used when the image cannot be loaded at its preferred base address

PE Header Diagram
Image source: corkami
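
The section table itself is straightforward to read. The sketch below walks the 40-byte section headers and reports each section's name, virtual size, raw size, and whether it is executable or writable. The header layout and flag values come from the PE specification; parse_sections and the dictionary keys are names chosen for this example.

```python
import struct

# Section characteristic flags from the PE specification.
MEM_EXECUTE = 0x20000000
MEM_WRITE = 0x80000000


def parse_sections(data: bytes) -> list:
    """Return one dict per section header found in a PE image."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    (count,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    # The table follows the 4-byte signature, the 20-byte file header,
    # and the optional header.
    table = pe_offset + 24 + opt_size
    sections = []
    for i in range(count):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _, _, _, _, flags) = struct.unpack_from(
            "<8sIIIIIIHHI", data, table + 40 * i
        )
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,    # size once loaded in memory
            "virtual_address": vaddr,
            "raw_size": raw_size,     # size on disk
            "raw_offset": raw_ptr,
            "executable": bool(flags & MEM_EXECUTE),
            "writable": bool(flags & MEM_WRITE),
        })
    return sections
```

A section whose raw size is zero or far smaller than its virtual size gets filled in at runtime, and a section that is both writable and executable is unusual for compiler output. Both are worth a closer look.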

Common Tools for PE Analysis

  1. PEStudio
    PEStudio is a powerful static analysis tool that quickly scans Portable Executable (PE) files for suspicious indicators. It highlights metadata like imported libraries, strings, sections, and resources, and flags anomalies such as blacklisted functions, unsigned files, or unusual entropy levels. It’s often used as a first step to determine whether a file warrants deeper analysis.

  2. PEview
    PEview is a lightweight tool that allows you to explore the structure of a PE file in detail. It breaks down headers, sections, imports, and exports in a clean and readable format. While it doesn’t analyze the file’s behavior, it’s excellent for understanding the internal organization of an executable and identifying unusual modifications.

  3. DIE (Detect It Easy)
    DIE, short for “Detect It Easy,” is a versatile tool that identifies packers, cryptors, and compilers within PE files. It supports multiple file formats and provides detailed signatures and entropy information to help analysts determine if a file has been packed or obfuscated. Its straightforward interface and flexibility make it a great alternative to older tools like PEiD.

Examining PE Files with DIE (Detect It Easy)

  • [Overview of image file headers and how to analyze them with DIE]

Imports and Exports of a PE File

  • Imports
    • Imports are functions or libraries that the executable relies on to interact with the Windows operating system. These functions are loaded from external DLLs (Dynamic Link Libraries), enabling the binary to perform system-level operations, such as file handling, network communication, or memory management.
  • Exports
    • Exports are functions or symbols made available by a binary for use by other executables or libraries. These functions are typically provided by DLLs to share functionality, such as APIs or reusable code, with other programs. Exported functions allow external programs to call into the binary and use its features.
  • Common DLLs and Their Purposes
    • Kernel32.dll
      Provides core Windows API functionality such as memory management, file operations, and process/thread creation. Almost all Windows programs rely on this DLL.

    • Advapi32.dll
      Handles advanced system operations like registry access, service control, and security-related tasks such as user authentication and cryptographic functions.

    • User32.dll
      Manages user interface components like windows, buttons, and mouse input. It is crucial for GUI-based applications.

    • Gdi32.dll
      Supports Graphics Device Interface (GDI) functions, which allow programs to create and manipulate visual objects like fonts, images, and shapes for rendering in the UI.

    • Ntdll.dll
      Acts as a bridge to the Windows kernel, providing low-level system functions. Malware often interacts with Ntdll.dll directly for stealthy operations, bypassing standard API calls.

    • Wsock32.dll/Ws2_32.dll
      Provides networking functionality via the Winsock API. Used for socket programming to enable TCP/IP communication between computers. Ws2_32.dll is the updated version and more commonly used in modern programs.

    • Wininet.dll
      Enables Internet-related functions such as HTTP and FTP communication. It is frequently used in applications that require downloading or uploading data over the web.

    For more information, check out this awesome resource: MalAPI.io
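
As a rough illustration of how the DLL list above can guide triage, the sketch below maps imported DLL names to the capabilities described in this section. The table only restates the descriptions above, and the function name summarize_imports is invented for the example. The list of DLL names is assumed to come from a tool such as PEStudio or DIE.

```python
# Capability hints taken from the descriptions in this section.
DLL_CAPABILITIES = {
    "kernel32.dll": "core API: memory, files, processes and threads",
    "advapi32.dll": "registry, services, security and cryptography",
    "user32.dll": "user interface: windows, buttons, mouse input",
    "gdi32.dll": "graphics rendering",
    "ntdll.dll": "low-level interface to the Windows kernel",
    "wsock32.dll": "networking (Winsock)",
    "ws2_32.dll": "networking (Winsock 2)",
    "wininet.dll": "HTTP and FTP communication",
}


def summarize_imports(dll_names):
    """Map each imported DLL name to a short capability hint."""
    summary = {}
    for name in dll_names:
        # DLL names are case-insensitive on Windows, so normalize first.
        summary[name] = DLL_CAPABILITIES.get(name.lower(), "unknown")
    return summary
```

For example, a sample that imports Ws2_32.dll and Advapi32.dll may talk to the network and touch the registry or services, which suggests what to watch for during dynamic analysis.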

  • Function Naming Conventions
    • Functions with “W” in Their Name
      In Windows API functions, a “W” at the end of a function name indicates that the function takes wide-character (Unicode, UTF-16) strings. For example, CreateFileW expects its string parameters in Unicode format, which provides better support for internationalization.

      The counterparts to these functions end in “A” and take ANSI-encoded string parameters (e.g., CreateFileA).

      Windows uses Unicode internally, so the A versions generally convert their string arguments and then call the corresponding W versions.
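
When reading an import list, it can help to strip the suffix so that CreateFileA and CreateFileW are treated as the same API. The helper below is a heuristic sketch with an invented name: it assumes that a trailing capital A or W preceded by a lowercase letter is an encoding suffix, which can misfire on names that happen to end that way.

```python
def split_encoding_suffix(api_name):
    """Return (base_name, encoding), where encoding is 'ANSI', 'Unicode', or None."""
    # Heuristic: treat a trailing capital A or W that follows a lowercase
    # letter as the encoding suffix.
    if (len(api_name) > 1 and api_name[-1] in "AW"
            and api_name[-2].islower()):
        encoding = {"A": "ANSI", "W": "Unicode"}[api_name[-1]]
        return api_name[:-1], encoding
    return api_name, None
```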

Static Analysis of a Malware Sample in Practice

  • [Show a sample being analyzed, link to YouTube video timestamp]
  • [Example with fewer imports indicating packing or obfuscation]
  • [Explain virtual size vs. actual size in sections]

Packed and Obfuscated Malware

Obfuscation

Malware authors use obfuscation to hide the true intent of their code and make analysis more difficult. Obfuscation can involve renaming variables and functions to meaningless names, inserting junk code, or using encryption to disguise strings and instructions. These techniques hinder reverse engineers by making it harder to understand the malware’s functionality and detect malicious behavior during static analysis.


Packing

Packing is a technique used by malware authors to compress or encode the original code of an executable, often wrapping it in a loader program. This process reduces the visibility of the executable’s contents, making it harder for analysts and antivirus programs to inspect. During execution, the loader unpacks the original code in memory, where it is then executed.

PE Packing Process


Indicators of Packing

Packed malware often exhibits certain characteristics that can raise red flags during analysis, such as:

  • Minimal or encrypted strings: Strings like function names, URLs, or error messages may be missing or appear as gibberish.
  • Odd section sizes: Sections such as .text, .data, or .rdata might have unusually large or small sizes compared to typical programs.
  • High entropy: Packed sections are compressed or encrypted, which pushes their entropy (randomness) toward the maximum of 8 bits per byte.
  • Suspicious imports: A packed executable might import only a few functions, like LoadLibrary or GetProcAddress, to dynamically resolve additional APIs at runtime.
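
The entropy indicator is easy to compute yourself. Below is a minimal sketch of Shannon entropy over the bytes of a section (or a whole file), measured in bits per byte. Plain machine code usually scores well below the 8.0 maximum, while compressed or encrypted data sits close to it. Any cut-off you choose (values around 7 are a common rule of thumb) is an assumption to tune, not a hard rule.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed frequency p of each byte value.
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )
```

A buffer of identical bytes scores 0.0, and a buffer in which every byte value appears equally often scores 8.0.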

Detecting and Unpacking

To analyze packed malware, identifying the packer used is often the first step. Tools like PEiD, Detect It Easy (DIE), or Exeinfo PE can help recognize common packers. Once identified, unpacking involves these basic steps:

  1. Dynamic Analysis: Run the malware in a controlled environment (sandbox or VM) to observe its behavior and extract the unpacked code from memory.
  2. Breakpoint Placement: Use debuggers like x64dbg to set breakpoints on API calls that unpacking stubs commonly use (e.g., VirtualAlloc, LoadLibrary).
  3. Manual Dumping: After the malware unpacks itself in memory, dump the memory to obtain the original executable for further analysis.

Unpacking is an essential skill in malware analysis, as it allows analysts to strip away layers of obfuscation and examine the true functionality of malicious code.
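
For the identification step, one cheap hint is the section names a packer leaves behind. The sketch below is not how PEiD or DIE work internally (they rely on much richer signatures). It only checks section names against a short, incomplete table of well-known defaults, and both the table and the function name are assumptions made for this example. Section names are trivial to rename, so an empty result proves nothing.

```python
# Default section names left behind by a few well-known packers.
PACKER_SECTION_HINTS = {
    "UPX0": "UPX",
    "UPX1": "UPX",
    ".aspack": "ASPack",
    ".vmp0": "VMProtect",
    ".vmp1": "VMProtect",
    ".themida": "Themida",
}


def guess_packers(section_names):
    """Return a sorted list of packers hinted at by the section names."""
    return sorted(
        {PACKER_SECTION_HINTS[name] for name in section_names
         if name in PACKER_SECTION_HINTS}
    )
```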

Additional Visual Aids

  • [Optional Additional Graphics:
    • Timeline/flowchart of a static analysis process
    • Diagram of the layered structure of a PE file]

PE Header Summary

The PE header provides critical information about the structure and behavior of a Portable Executable (PE) file. Here are the key points to understand:

  • Imports: Define the external functions and libraries the executable relies on, such as DLLs used for system operations (e.g., Kernel32.dll, User32.dll).

  • Exports: List the functions or symbols that the executable makes available for other programs, often used in shared libraries like DLLs.

  • Timestamp: Indicates when the executable was compiled, though it can be manipulated by malware authors to evade detection or analysis.

  • Sections: Represent logical divisions of the executable, such as .text (code), .data (initialized data), and .rdata (read-only data). Analyzing these sections can reveal packed or obfuscated content.

  • Subsystems: Specify the environment required for execution, such as Windows GUI for graphical applications or Windows CUI for console programs.

  • Resources: Contain embedded assets like icons, images, dialogs, or version information. Malware often hides additional payloads or configuration data in resource sections.
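
As one worked example from this summary, the subsystem can be read straight out of the optional header. The sketch below assumes a well-formed image. It relies on the Subsystem field sitting at offset 68 of the optional header in both PE32 and PE32+ files, as laid out in the PE specification. The lookup table covers only three values, and read_subsystem is a name chosen for this example.

```python
import struct

# Only the most common Subsystem values; others exist (EFI, Xbox, ...).
SUBSYSTEMS = {1: "Native", 2: "Windows GUI", 3: "Windows CUI (console)"}


def read_subsystem(data: bytes) -> str:
    """Return a readable name for the Subsystem field of a PE image."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The optional header follows the 4-byte signature and the 20-byte
    # file header.
    optional = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unexpected optional header magic")
    (value,) = struct.unpack_from("<H", data, optional + 68)
    return SUBSYSTEMS.get(value, f"other ({value})")
```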

Additional Resources & Next Steps

  • [Links to reference materials, tools, and further reading]

This post is licensed under CC BY 4.0 by the author.