Previous step is here.
Let's continue! It's time to write an unpacker, and that is what we will do in this step. We will not process the import table yet, because this lesson already has plenty of other things to cover.
We will start with the following. To operate, the unpacker definitely needs two WinAPI functions: LoadLibraryA and GetProcAddress. In my old packer I wrote the unpacker stub in MASM32 without creating an import table at all. I looked up these function addresses in kernel32.dll by hand, which is rather complicated and hardcore and, besides, may make antivirus software seriously suspicious. This time, let's create an import table and have the loader give us these function addresses. Of course, an import table with only these two functions is as suspicious as having no imports at all, but nothing prevents us from adding more random imports from different .dll files in the future. Where will the loader store these two function addresses? It's time to expand our packed_file_info structure!
//Structure to store packed file information
struct packed_file_info
{
  BYTE number_of_sections; //Number of original file sections
  DWORD size_of_packed_data; //Size of packed data
  DWORD size_of_unpacked_data; //Size of original data

  DWORD load_library_a; //LoadLibraryA procedure address from kernel32.dll
  DWORD get_proc_address; //GetProcAddress procedure address from kernel32.dll
  DWORD end_of_import_address_table; //IAT end
};
I added three fields to the structure. The loader will write the addresses of LoadLibraryA and GetProcAddress from kernel32.dll to the first two fields. The last field marks the end of the import address table; we will write zero to it to tell the loader that we don't need any more functions. I will tell you more about this later.
Now we need to create a new import table. My PE library will be very handy for this. (We will forget about the original import table for now.)
//....
//Set required virtual size
image.set_section_virtual_size(added_section, total_virtual_size);
//....

std::cout << "Creating imports..." << std::endl;

//Create kernel32.dll library imports
pe_base::import_library kernel32;
kernel32.set_name("kernel32.dll"); //Set library name

//Create function to import
pe_base::imported_function func;
func.set_name("LoadLibraryA"); //Its name
kernel32.add_import(func); //Add it to the library

//And second function
func.set_name("GetProcAddress");
kernel32.add_import(func); //Add it, too

//Get load_library_a field relative address (RVA)
//of our packed_file_info structure, which was placed at the beginning of
//added section, remember?
DWORD load_library_address_rva = pe_base::rva_from_section_offset(added_section,
  offsetof(packed_file_info, load_library_a));

//Set this address as
//import address table (IAT) address
kernel32.set_rva_to_iat(load_library_address_rva);

//Create imported library list
pe_base::imported_functions_list imports;
//Add our library to it
imports.push_back(kernel32);

//Set up import rebuilder
pe_base::import_rebuilder_settings settings;
//We don't need original import address table (explanations will be given further)
settings.build_original_iat(false);
//Rewrite IAT to the address
//which was specified (load_library_address_rva)
settings.save_iat_and_original_iat_rvas(true, true);
//Place imports right after packed data end
settings.set_offset_from_section_start(added_section.get_raw_data().size());

//Rebuild imports
image.rebuild_imports(imports, added_section, settings);
The first part is clear: we created a library import, added a couple of functions to it, and created an imported library list consisting of kernel32.dll only. The line that needs explaining is the one where we set the RVA of the IAT (kernel32.set_rva_to_iat). For each imported library, a descriptor structure (IMAGE_IMPORT_DESCRIPTOR) is created in the import table.
For each imported DLL, the loader writes imported function addresses to the Import Address Table (IAT), and it takes the imported function names or ordinals from the Original Import Address Table (in other words, the Import Lookup Table). We can manage without the latter; for example, Borland compilers always work that way and don't care about the Import Lookup Table. In that case the single Import Address Table initially contains the imported function ordinals or names, and the loader writes the imported function addresses over this data. We will not create an Original Import Address Table either (the import tables will take less space), so let's turn this option off in the imports rebuilder.
The settings.save_iat_and_original_iat_rvas call sets up the rebuilder so that it does not create its own IAT and Original IAT, but writes everything to the addresses specified for each library (remember the kernel32.set_rva_to_iat call?).
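For reference, the per-library descriptor that the loader walks has a fixed 20-byte layout. The sketch below mirrors the fields of IMAGE_IMPORT_DESCRIPTOR from winnt.h using portable fixed-width types. The names import_descriptor_sketch and has_lookup_table are made up for illustration and are not part of the PE library used in this series.

```cpp
#include <cstdint>

//Portable mirror of IMAGE_IMPORT_DESCRIPTOR (winnt.h), one per imported DLL
//The struct name and the helper below are illustrative only
struct import_descriptor_sketch
{
  std::uint32_t OriginalFirstThunk; //RVA of Import Lookup Table (0 if absent)
  std::uint32_t TimeDateStamp;      //0 unless the import is bound
  std::uint32_t ForwardChain;       //Index of first forwarder reference
  std::uint32_t Name;               //RVA of the DLL name string
  std::uint32_t FirstThunk;         //RVA of Import Address Table (IAT)
};

static_assert(sizeof(import_descriptor_sketch) == 20, "descriptor is 20 bytes");

//True when the descriptor carries a separate lookup table
inline bool has_lookup_table(const import_descriptor_sketch& descriptor)
{
  return descriptor.OriginalFirstThunk != 0;
}
```

With the Original IAT turned off, as in our rebuilder settings, OriginalFirstThunk would presumably be zero and FirstThunk would hold the RVA passed to set_rva_to_iat.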
Then we just rebuild the import table. Run the still unfinished packer, passing it a file name as the first parameter, and watch the result. Make sure that everything works as expected.
Now we open the resulting binary in OllyDbg and make sure that the loader wrote the addresses of the two required functions to the correct places.
As you can see, the addresses we need have been written to 0x1009 and 0x100D, which means everything was done correctly. (The entry point address is still absolutely random and there isn't any unpacker, so the file will not run yet, but we have achieved a lot already.)
Let's go further. Now we need to prepare our sources for developing the unpacker. We move all structures from main.cpp to the structs.h file, which will contain the following:
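The values 0x1009 and 0x100D follow directly from the structure layout. Here is a minimal sketch of that arithmetic, assuming the structure is packed to 1 byte and the added section starts at RVA 0x1000. The names packed_file_info_sketch, assumed_section_rva and field_rva are hypothetical stand-ins, and fixed-width types replace BYTE and DWORD so that the snippet builds without Windows.h.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
//Stand-in for packed_file_info (BYTE -> uint8_t, DWORD -> uint32_t)
struct packed_file_info_sketch
{
  std::uint8_t number_of_sections;
  std::uint32_t size_of_packed_data;
  std::uint32_t size_of_unpacked_data;

  std::uint32_t load_library_a;
  std::uint32_t get_proc_address;
  std::uint32_t end_of_import_address_table;
};
#pragma pack(pop)

//Assumption: RVA of the section that starts with the structure
const std::uint32_t assumed_section_rva = 0x1000;

//RVA of a structure field, given its offset inside the structure
inline std::uint32_t field_rva(std::size_t field_offset)
{
  return assumed_section_rva + static_cast<std::uint32_t>(field_offset);
}
```

Calling field_rva(offsetof(packed_file_info_sketch, load_library_a)) gives 0x1009, because one BYTE and two DWORD fields (9 bytes) precede the field. The next field lands 4 bytes later, at 0x100D.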
#pragma once

#include <Windows.h>

#pragma pack(push, 1)

//Structure to store packed section data
struct packed_section
{
  char name[8]; //Section name
  DWORD virtual_size; //Virtual size
  DWORD virtual_address; //Virtual address (RVA)
  DWORD size_of_raw_data; //Raw data size
  DWORD pointer_to_raw_data; //Raw data file offset
  DWORD characteristics; //Section characteristics
};

//Structure to store packed file information
struct packed_file_info
{
  BYTE number_of_sections; //Number of original file sections
  DWORD size_of_packed_data; //Size of packed data
  DWORD size_of_unpacked_data; //Size of original data

  DWORD load_library_a; //LoadLibraryA procedure address from kernel32.dll
  DWORD get_proc_address; //GetProcAddress procedure address from kernel32.dll
  DWORD end_of_import_address_table; //IAT end
};

#pragma pack(pop)
There is no need to explain anything here, we just moved the code. We then include this file in main.cpp:
//Our structures header file
#include "structs.h"
It's time for hardcore! We will develop an unpacker. I thought a little and decided not to use MASM32, but to write it in C with some C++ elements and inline assembler, which makes the code more readable. So, we create a new project in the solution and call it unpacker. We add an unpacker.cpp file to it and create parameters.h. Then we set up the project settings as we did for lzo-2.06 at the first step, to make the build small and base independent. Set unpacker_main as the entry point (Linker - Advanced - Entry Point). Then, in Configuration Manager (see step 1), make this project always build in the Release configuration.
Make the simple_pe_packer project depend on the unpacker project (Project Dependencies, as at step 1) and include the parameters.h file in the packer project. We will write the required unpacker build parameters to this file:
//Header file with unpacker parameters
#include "../unpacker/parameters.h"
Now we start to develop the unpacker itself. Open unpacker.cpp...
//Include structures file from packer project
#include "../simple_pe_packer/structs.h"

//Create function without prologue and epilogue
void __declspec(naked) unpacker_main()
{
  //Create prologue manually
  __asm
  {
    push ebp;
    mov ebp, esp;
    sub esp, 128;
  }

  //... description is below ...//

  //Create epilogue manually
  _asm
  {
    leave;
    ret;
  }
}
Now for the explanations. First, we included the file that contains the packer structure declarations, because we will need them in the unpacker. Then we create the entry point, the unpacker_main procedure. Notice that this function is declared as naked. This tells the compiler not to generate a prologue and epilogue (stack frame) for this function automatically. We need to do this manually, and I will explain why at the next step. For now we create the same prologue and epilogue that the MSVC++ compiler would create. The "sub esp, 128" line allocates 128 bytes on the stack, which will be enough for our needs. The unpacker will not do anything serious at this step. We need the prologue and epilogue so that we can allocate memory on the stack without additional issues. At the end we write the ret instruction, which returns control to the kernel. Now we will write the simplest possible unpacker body. Let it just greet us by displaying a message box with the text "Hello!".
//Image loading address
unsigned int original_image_base;
//First section relative address,
//in which the packer stores its information
//and packed data themselves
unsigned int rva_of_first_section;

//These instructions are required only to
//replace the addresses in unpacker builder with real ones
__asm
{
  mov original_image_base, 0x11111111;
  mov rva_of_first_section, 0x22222222;
}
Here we declared two local variables. The first one holds the image loading address, and the second one holds the relative address of the first section, in which, as you remember, we store all the information the unpacker needs and the packed data itself. The packer will write the real values in place of 0x11111111 and 0x22222222.
//Get pointer to structure with information
//carefully prepared by packer
const packed_file_info* info;
//It is stored in the beginning
//of packed file first section
info = reinterpret_cast<const packed_file_info*>(original_image_base + rva_of_first_section);

//Two LoadLibraryA and GetProcAddress function prototypes typedefs
typedef HMODULE (__stdcall* load_library_a_func)(const char* library_name);
typedef INT_PTR (__stdcall* get_proc_address_func)(HMODULE dll, const char* func_name);

//Read their addresses from packed_file_info structure
//Loader puts them there for us
load_library_a_func load_library_a;
get_proc_address_func get_proc_address;
load_library_a = reinterpret_cast<load_library_a_func>(info->load_library_a);
get_proc_address = reinterpret_cast<get_proc_address_func>(info->get_proc_address);
Everything should be clear here. At the beginning of the first section of the packed file there is the packed_file_info structure, which is created by the packer. It has three additional fields that serve as a small import address table: the loader itself fills the first two, because we set up the import table that way, and the third one stays zero as a terminator. We read the LoadLibraryA and GetProcAddress addresses from these fields. You may also ask why I declare all variables first and assign their values later, although I could do both in one line. The thing is that a naked function does not allow declaring a variable and initializing it right away.
And the last (for now) part of the unpacker code:
//Create buffer on stack
char buf[32];
//user32.dll
*reinterpret_cast<DWORD*>(&buf[0]) = 'resu';
*reinterpret_cast<DWORD*>(&buf[4]) = 'd.23';
*reinterpret_cast<DWORD*>(&buf[8]) = 'll';

//Load user32.dll library
HMODULE user32_dll;
user32_dll = load_library_a(buf);

//MessageBoxA function prototype typedef
typedef int (__stdcall* message_box_a_func)(HWND owner, const char* text, const char* caption, DWORD type);

//MessageBoxA
*reinterpret_cast<DWORD*>(&buf[0]) = 'sseM';
*reinterpret_cast<DWORD*>(&buf[4]) = 'Bega';
*reinterpret_cast<DWORD*>(&buf[8]) = 'Axo';

//Get MessageBoxA function address
message_box_a_func message_box_a;
message_box_a = reinterpret_cast<message_box_a_func>(get_proc_address(user32_dll, buf));

//Hello!
*reinterpret_cast<DWORD*>(&buf[0]) = 'lleH';
*reinterpret_cast<DWORD*>(&buf[4]) = '!o';

//MessageBox call
message_box_a(0, buf, buf, MB_ICONINFORMATION);
This should also be clear in general, except for the weird way the strings are filled. We allocated the buf buffer on the stack. All our strings must be built on the stack as well: we cannot put anything into a data section, because that would inevitably produce relocations and the code would become base dependent. This is why we write strings so absurdly, directly into the stack buffer, 4 bytes at a time. We also have to keep in mind the reversed (little-endian) byte order of the x86 architecture we are writing for, so the letters in each 4-byte string piece are arranged backwards.
First we load the user32.dll library, then we get the MessageBoxA procedure address from it, and then we call it. That's all for the unpacker!
There is one thing left: we need to insert the unpacker code into the packed file and set it up. I decided to automate this. To do so, let's add a new project named unpacker_converter to the solution. The purpose of this project is to open the result of the unpacker compilation, the unpacker.exe file, read the data of its only section (which is the code) and convert it to an .h file, which we then include in the simple_pe_packer project. Let's set the same include directory in the unpacker_converter project as in the simple_pe_packer project, in order to include the PE library .h files, then add a main.cpp file to the project and start development.
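To see why 'resu' ends up as "user" in memory, here is a small portable sketch of the same byte arithmetic. The helpers pack_chars and store_le are hypothetical names introduced only for this illustration. The unpacker itself relies on MSVC multi-character constants instead, which put the first character into the most significant byte.

```cpp
#include <cstdint>
#include <cstring>

//Packs up to four characters into a DWORD value so that the first
//character ends up in the lowest byte (the value MSVC gives to 'resu'
//when text is "user")
inline std::uint32_t pack_chars(const char* text)
{
  std::uint32_t value = 0;
  std::size_t length = std::strlen(text);
  if(length > 4)
    length = 4;

  for(std::size_t i = 0; i < length; ++i)
    value |= static_cast<std::uint32_t>(static_cast<unsigned char>(text[i])) << (8 * i);

  return value;
}

//Writes a DWORD byte by byte, lowest byte first, the way x86 stores it
inline void store_le(char* destination, std::uint32_t value)
{
  for(int i = 0; i < 4; ++i)
    destination[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
}
```

For example, pack_chars("user") yields 0x72657375, and storing the three pieces "user", "32.d" and "ll" at offsets 0, 4 and 8 leaves a null-terminated "user32.dll" in the buffer, because the unused upper bytes of the last piece are zero.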
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <iomanip>
//PE library header file
#include <pe_32_64.h>

//Directives to allow linking with built PE library
#ifndef _M_X64
#  ifdef _DEBUG
#    pragma comment(lib, "../../Debug/pe_lib.lib")
#  else
#    pragma comment(lib, "../../Release/pe_lib.lib")
#  endif
#else
#  ifdef _DEBUG
#    pragma comment(lib, "../../x64/Debug/pe_lib.lib")
#  else
#    pragma comment(lib, "../../x64/Release/pe_lib.lib")
#  endif
#endif

int main(int argc, char* argv[])
{
  //Usage hints
  if(argc != 3)
  {
    std::cout << "Usage: unpacker_converter.exe unpacker.exe output.h" << std::endl;
    return 0;
  }

  //Open unpacker.exe file - its name
  //and path are stored in argv array at index 1
  std::ifstream file(argv[1], std::ios::in | std::ios::binary);
  if(!file)
  {
    //If file open failed - display message and exit with an error
    std::cout << "Cannot open " << argv[1] << std::endl;
    return -1;
  }

  try
  {
    std::cout << "Creating unpacker source file..." << std::endl;

    //Try to open the file as 32-bit PE file
    //Last two arguments are false, because we don't need them
    //"raw" file bound import data and
    //"raw" debug information data
    //They are not used while packing, so we don't load these data
    pe32 image(file, false, false);

    //Get unpacker sections list
    pe_base::section_list& unpacker_sections = image.get_image_sections();
    //Make sure, that there is only one section (because unpacker doesn't have imports and relocations)
    if(unpacker_sections.size() != 1)
    {
      std::cout << "Incorrect unpacker" << std::endl;
      return -1;
    }

    //Get reference to this section data
    std::string& unpacker_section_data = unpacker_sections.at(0).get_raw_data();
    //Remove null bytes at the end of this section,
    //which were added by compiler for alignment
    pe_base::strip_nullbytes(unpacker_section_data);

    //Open output .h file for writing
    //Its name is stored in argv[2]
    std::ofstream output_source(argv[2], std::ios::out | std::ios::trunc);

    //Start to generate the source code
    output_source << std::hex << "#pragma once" << std::endl << "unsigned char unpacker_data[] = {";
    //Current read data length
    unsigned long len = 0;
    //Total section data length
    std::string::size_type total_len = unpacker_section_data.length();

    //For each byte of data
    for(std::string::const_iterator it = unpacker_section_data.begin(); it != unpacker_section_data.end(); ++it, ++len)
    {
      //Add line endings to
      //provide code readability
      if((len % 16) == 0)
        output_source << std::endl;

      //Write byte value
      output_source
        << "0x" << std::setw(2) << std::setfill('0')
        << static_cast<unsigned long>(static_cast<unsigned char>(*it));

      //And a comma if needed
      if(len != total_len - 1)
        output_source << ", ";
    }

    //End of code
    output_source << " };" << std::endl;
  }
  catch(const pe_exception& e)
  {
    //If by any reason it fails to open
    //Display error message and exit
    std::cout << e.what() << std::endl;
    return -1;
  }

  return 0;
}
I will not explain this code in detail, since much of it is already familiar to you. I will just say that it generates an unpacker.h file from the unpacker.exe file, like this:
#pragma once
unsigned char unpacker_data[] = {
0x55, 0x8b, 0xec, 0x81, 0xec, 0x80, 0x00, 0x00, 0x00, 0xc7, 0x45, 0xfc, 0x11, 0x11, 0x11, 0x11,
0xc7, 0x45, 0xf8, 0x22, 0x22, 0x22, 0x22, 0x8b, 0x45, 0xfc, 0x03, 0x45, 0xf8, 0x8b, 0x48, 0x09,
0x8b, 0x70, 0x0d, 0x8d, 0x45, 0xd8, 0x50, 0xc7, 0x45, 0xd8, 0x75, 0x73, 0x65, 0x72, 0xc7, 0x45,
0xdc, 0x33, 0x32, 0x2e, 0x64, 0xc7, 0x45, 0xe0, 0x6c, 0x6c, 0x00, 0x00, 0xff, 0xd1, 0x8d, 0x4d,
0xd8, 0x51, 0x50, 0xc7, 0x45, 0xd8, 0x4d, 0x65, 0x73, 0x73, 0xc7, 0x45, 0xdc, 0x61, 0x67, 0x65,
0x42, 0xc7, 0x45, 0xe0, 0x6f, 0x78, 0x41, 0x00, 0xff, 0xd6, 0x6a, 0x40, 0x8d, 0x4d, 0xd8, 0x51,
0x51, 0x6a, 0x00, 0xc7, 0x45, 0xd8, 0x48, 0x65, 0x6c, 0x6c, 0xc7, 0x45, 0xdc, 0x6f, 0x21, 0x00,
0x00, 0xff, 0xd0, 0xc9, 0xc3 };
This data is the hexadecimal representation of the first and only section of the unpacker, its code. It is still very small and simple. How do we make unpacker_converter generate such an .h file for us automatically every time the unpacker is rebuilt? We need to edit the unpacker project settings (Build Events - Post-Build Event):
"..\unpacker_converter.exe" "..\Release\unpacker.exe" "..\simple_pe_packer\unpacker.h"
Why didn't I use the $(Configuration) macro in this command? For the unpacker project it always resolves to "Release", because the project builds as release in both debug and release configurations (we changed this in Configuration Manager earlier), so it cannot point to the folder where unpacker_converter.exe was actually built. Instead, we will simply copy the unpacker_converter.exe file from ITS current configuration folder to the solution root folder, and the unpacker project will take it from there. So, the last thing we should do is fix the unpacker_converter project configuration (Build Events - Post-Build Event):
copy /Y "..\$(Configuration)\unpacker_converter.exe" "..\unpacker_converter.exe"
We also have to set a project dependency: unpacker depends on unpacker_converter (this is probably not entirely logical, but it is fine). After that we build everything in both release and debug configurations.
Now let me explain what we are going to write to the parameters.h file. It will contain the following:
#pragma once

static const unsigned int original_image_base_offset = 0x0C;
static const unsigned int rva_of_first_section_offset = 0x13;
These are the offsets of the two numbers 0x11111111 and 0x22222222, relative to the beginning of the unpacker code in its built binary form. The packer will overwrite these numbers with real values. The offsets 0xC (12) and 0x13 (19) can be found with any HEX editor or by looking at the autogenerated unpacker.h file. They are likely to remain unchanged, because we are not going to add code before the two mov instructions in the unpacker.
Let's include the autogenerated unpacker.h file in the simple_pe_packer project:
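Instead of a HEX editor, the two offsets can also be checked with a few lines of code. The sketch below is not part of the packer; find_marker and unpacker_prefix are hypothetical names, and the array is just the first 23 bytes of the unpacker_data array generated earlier in this step.

```cpp
#include <cstddef>
#include <cstdint>

//Returns the offset of the first little-endian occurrence of marker
//in code, or size if the marker is not found
inline std::size_t find_marker(const unsigned char* code, std::size_t size, std::uint32_t marker)
{
  for(std::size_t i = 0; i + 4 <= size; ++i)
  {
    std::uint32_t value =
      static_cast<std::uint32_t>(code[i])
      | (static_cast<std::uint32_t>(code[i + 1]) << 8)
      | (static_cast<std::uint32_t>(code[i + 2]) << 16)
      | (static_cast<std::uint32_t>(code[i + 3]) << 24);

    if(value == marker)
      return i;
  }

  return size;
}

//Beginning of the generated unpacker code: prologue and two mov instructions
const unsigned char unpacker_prefix[] = {
  0x55, 0x8b, 0xec, 0x81, 0xec, 0x80, 0x00, 0x00, 0x00, 0xc7, 0x45, 0xfc,
  0x11, 0x11, 0x11, 0x11, 0xc7, 0x45, 0xf8, 0x22, 0x22, 0x22, 0x22 };
```

Searching unpacker_prefix for 0x11111111 returns 12 (0x0C) and searching for 0x22222222 returns 19 (0x13), which matches parameters.h. A check like this could guard against the offsets silently shifting if the prologue ever changes.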
//Unpacker body (autogenerated)
#include "unpacker.h"
The last stage of this step is inserting the unpacker body into the packed file. At the previous step we put a placeholder there:
//Here will be unpacker code and something else in the future
unpacker_section.get_raw_data() = "Nothing interesting here...";
Now we will insert unpacker code there and set it up:
//...
{
  //Get a reference to unpacker section data
  std::string& unpacker_section_data = unpacker_section.get_raw_data();
  //Write unpacker code there
  //This code is stored in autogenerated file
  //unpacker.h, which we included in main.cpp
  unpacker_section_data = std::string(reinterpret_cast<const char*>(unpacker_data), sizeof(unpacker_data));
  //Write image load address
  //to required offsets
  *reinterpret_cast<DWORD*>(&unpacker_section_data[original_image_base_offset]) = image.get_image_base_32();
  //and first packed file section virtual address,
  //which stores data to unpack and information about them
  //At the beginning of this section, as you remember,
  //packed_file_info structure is stored
  *reinterpret_cast<DWORD*>(&unpacker_section_data[rva_of_first_section_offset]) = image.get_image_sections().at(0).get_virtual_address();
}

//Add this section too
const pe_base::section& unpacker_added_section = image.add_section(unpacker_section);
//Set new entry point - now it points to
//the beginning of unpacker
image.set_ep(image.rva_from_section_offset(unpacker_added_section, 0));
//...
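The two reinterpret_cast writes above trust that the offsets from parameters.h lie inside the unpacker code. A slightly more defensive variant is sketched below, under the assumption that we want an exception instead of silent memory corruption when the offsets get out of date. The helper name patch_dword is hypothetical and does not exist in the packer sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

//Overwrites four bytes of data at offset with value in little-endian order
//Throws std::out_of_range if the DWORD does not fit into the buffer
inline void patch_dword(std::string& data, std::size_t offset, std::uint32_t value)
{
  if(offset > data.size() || data.size() - offset < sizeof(value))
    throw std::out_of_range("patch offset is outside of unpacker code");

  //Emit bytes lowest first, independent of host byte order
  for(std::size_t i = 0; i < sizeof(value); ++i)
    data[offset + i] = static_cast<char>((value >> (8 * i)) & 0xFF);
}
```

In the packer it could be used as patch_dword(unpacker_section_data, original_image_base_offset, image.get_image_base_32()), and likewise for rva_of_first_section_offset.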
That's all! Now the unpacker will be set up and added to the packed file. Let's test it. As always, we pack the packer itself and get the packed_simple_pe_packer.exe file as a result. We run it and see the long-awaited window that took us so much work!
So, the unpacker is correctly built, set up, converted and run, which is good. In the next steps we will make it do more sensible things!
As always I share the full packer solution (except PE library): Own PE packer step 3