Previous step is here.
Right away I want to say that as I write this series of articles I keep fixing things and updating my PE library (note that this step applies to the 0.1.x versions as well).
And so we continue developing our own packer. At this step it is time to get down to packing the PE file itself. A long time ago I shared a simple packer that was ineffective for two reasons: firstly, it used the standard Windows functions for compressing and decompressing data, which are rather slow and give a low compression ratio; secondly, every PE file section was packed individually, which is far from optimal. This time I will do things differently. We are going to read the data of all sections at once, assemble it into one block and pack that. As a result, the output file will have only one section (actually two - I will explain this later), and we can place the resources, the packer code and helper tables into it. This gives us some benefits: we no longer waste space on file alignment, and besides, the LZO algorithm beats RtlCompressBuffer in every respect.
So the packer algorithm will be roughly the following: read all sections, copy their data into one buffer and pack it, place the packed buffer into a new section and delete all the remaining sections. We will have to store the parameters of all original file sections so that the unpacker can restore them later. Let's write a dedicated structure for this:
#pragma pack(push, 1)
//Structure to store packed section data
struct packed_section
{
    char name[8];              //Section name
    DWORD virtual_size;        //Virtual size
    DWORD virtual_address;     //Virtual address (RVA)
    DWORD size_of_raw_data;    //Raw data size
    DWORD pointer_to_raw_data; //Raw data file offset
    DWORD characteristics;     //Section characteristics
};
For each section such a structure will be placed at a specific offset in the packed file, and the unpacker code will read these structures. They store all the information required to restore the PE file sections.
Besides that, it will be handy to have a structure for various useful information about the original file, which the unpacker will also need. For now it has only three fields; most likely I will expand it over time:
//Structure to store information about packed file
struct packed_file_info
{
    BYTE number_of_sections;     //Number of original file sections
    DWORD size_of_packed_data;   //Size of packed data
    DWORD size_of_unpacked_data; //Size of original data
};
#pragma pack(pop)
Please note that both structures are packed with alignment 1. This is required to reduce their size. Besides that, fixing the alignment explicitly will save you from various issues when reading the structures back from the file during unpacking.
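As a quick standalone sanity check (a sketch of my own, not part of the packer code), you can verify the sizes: with 1-byte packing packed_section occupies exactly 28 bytes and packed_file_info exactly 9 bytes, whereas without the pragma the compiler would pad packed_file_info up to 12 bytes.

//Standalone sanity check (not part of the packer):
//with 1-byte packing the structure sizes are exactly the sum of their fields
#include <iostream>
#include <windows.h>

#pragma pack(push, 1)
struct packed_section
{
    char name[8];
    DWORD virtual_size;
    DWORD virtual_address;
    DWORD size_of_raw_data;
    DWORD pointer_to_raw_data;
    DWORD characteristics;
};

struct packed_file_info
{
    BYTE number_of_sections;
    DWORD size_of_packed_data;
    DWORD size_of_unpacked_data;
};
#pragma pack(pop)

int main()
{
    std::cout << sizeof(packed_section) << std::endl;   //prints 28 (8 + 5 * 4)
    std::cout << sizeof(packed_file_info) << std::endl; //prints 9  (1 + 2 * 4)
    return 0;
}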
Let's go further. Before packing, it is a good idea to calculate the entropy of the file sections to decide whether packing makes sense at all or the file has already been packed. My library allows doing that. It is also worth checking whether we were given a .NET binary - we will not pack such files.
...
try
{
    //Try to open the file as a 32-bit PE file
    //The last two arguments are false, because we don't need
    //"raw" bound import data and
    //"raw" debug information data -
    //they are not used while packing, so we don't load them
    pe32 image(file, false, false);

    //Check if a .NET image was passed
    if(image.is_dotnet())
    {
        std::cout << ".NET image cannot be packed!" << std::endl;
        return -1;
    }

    //Calculate the entropy of the file sections to make sure the file is not packed already
    {
        std::cout << "Entropy of sections: ";
        double entropy = image.calculate_entropy();
        std::cout << entropy << std::endl;

        //There is an opinion that
        //normal PE files have entropy below 6.8
        //If it is higher, the file is most probably compressed
        //So (for now) we will not pack files
        //with high entropy, it makes little sense
        if(entropy > 6.8)
        {
            std::cout << "File has already been packed!" << std::endl;
            return -1;
        }
    }
...
Now let's turn to packing the sections. Add an #include <string> line to the beginning of main.cpp - we will need strings to build the data blocks (they store their data contiguously, so we can write them to the file directly). We could also use vectors (std::vector), but there is no big difference.
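Just to illustrate the point (a tiny standalone sketch with made-up names, not taken from the packer code): std::string happily stores arbitrary bytes, including zero bytes, and the whole buffer can then be written to a file in one call.

//Minimal illustration: std::string as a binary buffer
//(dummy_header and test.bin are made up for this example)
#include <string>
#include <fstream>

struct dummy_header
{
    unsigned int magic;
    unsigned int size;
};

int main()
{
    dummy_header hdr = { 0x12345678, 42 };

    std::string buffer;
    //Append the raw bytes of the structure, zero bytes included
    buffer.append(reinterpret_cast<const char*>(&hdr), sizeof(hdr));
    //Append some raw data after it
    buffer += std::string("\x00\x01\x02", 3);

    //Write the whole buffer to a file in one call
    std::ofstream file("test.bin", std::ios::out | std::ios::binary);
    file.write(buffer.data(), buffer.size());
    return 0;
}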
First of all, we need to initialize the LZO library:
//Initialize LZO compression library
if(lzo_init() != LZO_E_OK)
{
    std::cout << "Error initializing LZO library" << std::endl;
    return -1;
}
Next, we read the file sections:
std::cout << "Reading sections..." << std::endl;

//Get PE file sections list
const pe_base::section_list& sections = image.get_image_sections();
if(sections.empty())
{
    //If file has no sections, we have nothing to pack
    std::cout << "File has no sections!" << std::endl;
    return -1;
}
Now let's get to packing the file:
//PE file basic information structure
packed_file_info basic_info = {0};

//Get and save original section count
basic_info.number_of_sections = sections.size();

//String to store each section's packed_section structure one-by-one
std::string packed_sections_info;

{
    //Allocate required memory for these structures
    packed_sections_info.resize(sections.size() * sizeof(packed_section));

    //Raw data of all sections, read from the file and assembled together
    std::string raw_section_data;

    //Current section index
    unsigned long current_section = 0;

    //List all sections
    for(pe_base::section_list::const_iterator it = sections.begin(); it != sections.end(); ++it, ++current_section)
    {
        //Next section reference
        const pe_base::section& s = *it;

        {
            //Create section information structure
            //and fill it
            packed_section& info =
                reinterpret_cast<packed_section&>(packed_sections_info[current_section * sizeof(packed_section)]);

            //Section characteristics
            info.characteristics = s.get_characteristics();
            //File data pointer
            info.pointer_to_raw_data = s.get_pointer_to_raw_data();
            //File data size
            info.size_of_raw_data = s.get_size_of_raw_data();
            //Relative virtual section address
            info.virtual_address = s.get_virtual_address();
            //Virtual section size
            info.virtual_size = s.get_virtual_size();

            //Copy section name (8 characters maximum)
            memset(info.name, 0, sizeof(info.name));
            memcpy(info.name, s.get_name().c_str(), s.get_name().length());
        }

        //If section is empty, switch to the next one
        if(s.get_raw_data().empty())
            continue;

        //If it is not empty - copy its data to the string
        //containing all sections' data
        raw_section_data += s.get_raw_data();
    }

    //If all sections are empty, there is nothing to pack!
    if(raw_section_data.empty())
    {
        std::cout << "All sections of PE file are empty!" << std::endl;
        return -1;
    }

    //We will pack both buffers assembled together
    //(see below)
    packed_sections_info += raw_section_data;
}
Let me explain the code above a little. We created two buffers - packed_sections_info and raw_section_data. Don't be confused that these are strings (std::string) - they can store binary data just fine. The first buffer holds the packed_section structures laid out one after another, created and filled for every section of the PE file. The second buffer holds the raw data of all sections assembled together. After unpacking we will be able to split this data back into sections, because the raw data size of each section is stored in the first buffer and will be available to the unpacker. Next we need to compress the resulting raw_section_data buffer. We might as well compress the packed_sections_info buffer together with it - so let's do that. We concatenate the strings (in fact, binary buffers) packed_sections_info and raw_section_data - this is done at the end of the previous code block.
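To make the layout clearer, here is a rough sketch (an illustration only, not the real unpacker, which will be written in a later step) of how the decompressed blob could be split apart again, assuming the layout we have just built: the packed_section structures first, then all the raw data.

//Rough sketch (not the real unpacker) of splitting the decompressed buffer,
//assuming the layout built above: N packed_section structures, then raw data.
//packed_section and packed_file_info are the structures declared earlier.
#include <vector>

void split_unpacked_data(const char* unpacked_data, const packed_file_info& info,
    std::vector<packed_section>& section_infos, const char*& raw_data)
{
    //The first number_of_sections * sizeof(packed_section) bytes
    //are the section description structures
    const packed_section* first =
        reinterpret_cast<const packed_section*>(unpacked_data);
    section_infos.assign(first, first + info.number_of_sections);

    //Everything after them is the concatenated raw data of all sections;
    //the size_of_raw_data fields tell how to cut it into pieces again
    raw_data = unpacked_data + info.number_of_sections * sizeof(packed_section);
}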
Next, we create a new PE file section to hold our packed data:
//New section
pe_base::section new_section;
//Name - .rsrc (see description below)
new_section.set_name(".rsrc");
//Available for reading, writing and execution
new_section.readable(true).writeable(true).executable(true);
//Reference to section raw data
std::string& out_buf = new_section.get_raw_data();
So, we have created a new section (but have not added it to the PE file yet). Why did I name it .rsrc? For one simple reason: files keep their resources in a section named .rsrc, and the main file icon and its version information are stored in the resources. Unfortunately, Windows Explorer could read and display the file icon ONLY if the section with the resources was named .rsrc. As far as I know this issue was fixed in later Windows versions and service packs, but it is better to play it safe. We do not work with resources yet, so this is done for the future.
The next step is compressing the data. This part is slightly lower-level... Here we will need the Boost library. Don't have it? It's time to download, build and install it - this is very easy. In fact, the class I am going to use does not even need to be built: just download the library, unpack it to some folder, for example C:\boost, and add the path to the Boost header files to the project's include directories. If I need a Boost class later that does require building, I will explain how to do that.
Add an #include <boost/scoped_array.hpp> line to the main.cpp headers. Now we can pack the data:
//Create a smart pointer
//and allocate the memory required by the LZO algorithm
//The smart pointer will release this memory
//automatically when needed
//We use the lzo_align_t type to
//align memory as required
//(from LZO documentation)
boost::scoped_array<lzo_align_t> work_memory(new lzo_align_t[LZO1Z_999_MEM_COMPRESS]);

//Unpacked data length
lzo_uint src_length = packed_sections_info.size();
//Save it to our file information structure
basic_info.size_of_unpacked_data = src_length;

//Packed data length
//(unknown yet)
lzo_uint out_length = 0;

//Compressed data buffer
//(length based on LZO documentation recommendations)
out_buf.resize(src_length + src_length / 16 + 64 + 3);

//Perform data compression
std::cout << "Packing data..." << std::endl;
if(LZO_E_OK != lzo1z_999_compress(reinterpret_cast<const unsigned char*>(packed_sections_info.data()),
    src_length,
    reinterpret_cast<unsigned char*>(&out_buf[0]),
    &out_length,
    work_memory.get()))
{
    //If something goes wrong, exit
    std::cout << "Error compressing data!" << std::endl;
    return -1;
}

//Store packed data length in our structure
basic_info.size_of_packed_data = out_length;

//Shrink the output buffer to the resulting
//compressed data length, which is now known
out_buf.resize(out_length);

//Assemble the buffer, this will be
//the resulting data of our new section
out_buf =
    //basic_info structure data
    std::string(reinterpret_cast<const char*>(&basic_info), sizeof(basic_info))
    //Output buffer
    + out_buf;

//Check if the file really became smaller
if(out_buf.size() >= src_length)
{
    std::cout << "File is incompressible!" << std::endl;
    return -1;
}
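For reference (the unpacker itself will only be written in a later step), the decompression counterpart in LZO is much simpler - lzo1z_decompress needs no working memory. Below is a hedged sketch, assuming the LZO headers are reachable as <lzo/lzo1z.h> and the compressed buffer together with its packed_file_info has already been located in the packed file:

//Sketch of the decompression counterpart (not the real unpacker);
//buffer sizes come from the packed_file_info structure described above
#include <vector>
#include <lzo/lzo1z.h>

bool decompress_data(const unsigned char* packed, const packed_file_info& info,
    std::vector<unsigned char>& unpacked)
{
    //Allocate exactly as much memory as the original data occupied
    unpacked.resize(info.size_of_unpacked_data);

    lzo_uint out_length = unpacked.size();

    //Unlike lzo1z_999_compress, lzo1z_decompress needs no working memory
    if(LZO_E_OK != lzo1z_decompress(packed, info.size_of_packed_data,
        &unpacked[0], &out_length, 0))
        return false;

    //The decompressed length must match the stored original length
    return out_length == info.size_of_unpacked_data;
}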
Now we have to delete the PE file sections we no longer need and add our new section:
{
    //Get reference to the first existing
    //PE file section
    const pe_base::section& first_section = image.get_image_sections().front();
    //Set virtual address of the section to be added (see below)
    new_section.set_virtual_address(first_section.get_virtual_address());

    //Get reference to the last existing
    //PE file section
    const pe_base::section& last_section = image.get_image_sections().back();
    //Calculate total virtual data size
    DWORD total_virtual_size =
        //Last section virtual address
        last_section.get_virtual_address()
        //Aligned last section virtual size
        + pe_base::align_up(last_section.get_virtual_size(), image.get_section_alignment())
        //Subtract first section virtual address
        - first_section.get_virtual_address();

    //Delete all PE file sections
    image.get_image_sections().clear();

    //Change the file alignment if it was greater than
    //0x200 - this is the minimal allowed value
    //for aligned PE files
    image.realign_file(0x200);

    //Add our section and get a reference to the
    //already added section with recalculated addresses and sizes
    pe_base::section& added_section = image.add_section(new_section);

    //Set the required virtual size for it
    image.set_section_virtual_size(added_section, total_virtual_size);
}
What happened here? Let me explain in more detail. First we determined the virtual address of the first PE file section (more on this below). After that we determined the total virtual size of all sections. Since each section's virtual address equals the virtual address of the previous section plus its aligned virtual size, the virtual address and size of the last section give us the total virtual size of all sections plus the address of the first section. Subtracting that first section address leaves the pure virtual size of all sections together. This could be done more simply by calling image.get_size_of_image(), which returns essentially the same value, only taken from the PE header - but never mind. Next we delete all existing PE file sections. After that we add our section to the PE file and get a reference to the added section with recalculated addresses and sizes (after adding, we work with this reference). Then we need to reserve enough memory to unpack all the sections into - that's why we change the virtual size of the newly added section to the total size of all the original sections. By default the virtual size of an added section is calculated automatically, which doesn't quite suit us: the memory region occupied by our section must exactly match the region occupied by all the original file sections. My library allows setting a section's virtual address explicitly if it is the first section in the file (i.e. there were no other sections before it was added), which is exactly our situation. That is why we determined the virtual address of the first section and assigned it to our new section.
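Here is a quick worked example with made-up numbers (an illustration only - pe_base::align_up is replaced by the usual power-of-two round-up so the snippet compiles on its own):

//Worked example with made-up numbers, not taken from a real file
#include <iostream>

//The usual power-of-two round-up (presumably what pe_base::align_up does)
unsigned long align_up(unsigned long value, unsigned long alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

int main()
{
    unsigned long first_section_va = 0x1000;          //first section virtual address
    unsigned long last_section_va = 0x5000;           //last section virtual address
    unsigned long last_section_virtual_size = 0x1A00; //last section virtual size
    unsigned long section_alignment = 0x1000;         //section alignment

    unsigned long total_virtual_size =
        last_section_va
        + align_up(last_section_virtual_size, section_alignment) //0x2000
        - first_section_va;

    //Prints 6000: the memory span the original sections occupied,
    //counted from the first section's virtual address
    std::cout << std::hex << total_virtual_size << std::endl;
    return 0;
}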
We also changed the file alignment to the minimum value allowed for aligned files, and did it while the file had no sections at all, so the operation goes faster.
However, one section is not enough, and we have to create and add another one. You might ask: what for? The answer is simple: after unpacking, the first section will contain the data of all original file sections, and we still have to place the unpacker somewhere. You might say: just put it at the end of that section. But then it would be overwritten by the original file data during unpacking! Of course, you could really place it in the same section, allocate memory (with VirtualAlloc or in some other way) right before unpacking, copy the unpacker body there and run it from that memory. But that memory then has to be released somehow, and if we release it from within itself, the application will crash: the memory is gone, while the EIP register, which points to the currently executing instruction, points to nowhere. So we cannot do without an additional section. If you look at UPX or Upack, you will see that they also have two to three sections.
{
    //New section
    pe_base::section unpacker_section;
    //Name - coderpub
    unpacker_section.set_name("coderpub");
    //Available for reading and execution
    unpacker_section.readable(true).executable(true);
    //Here will be the unpacker code and something else in the future
    unpacker_section.get_raw_data() = "Nothing interesting here...";
    //Add this section too
    image.add_section(unpacker_section);
}
Let's move on to the next part, where we abuse the PE file a little:
//Delete all frequently used directories
//We will bring them back later and handle them
//correctly, but this will do for now
//We leave only imports (and we will not process them yet)
image.remove_directory(IMAGE_DIRECTORY_ENTRY_BASERELOC);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_EXPORT);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_IAT);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_RESOURCE);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_SECURITY);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_TLS);
image.remove_directory(IMAGE_DIRECTORY_ENTRY_DEBUG);

//Strip the directory table, removing all empty directories
//Not completely, but down to a minimum of 12 elements, because the
//original file may have the first 12 of them in use
image.strip_data_directories(16 - 4);

//Remove the stub from the header, if there was any
image.strip_stub_overlay();
I removed almost all of the more or less useful directories from the headers. This is completely wrong, of course - most files will stop working after this. But remember that we are improving the packer step by step, so let it stay this way for now. I kept only the import directory, and I did not handle it in any way. Imports are the first thing we will have to handle properly, because it is very hard to find a file without imports, and we need something to test our packer on.
Then I stripped the directory table, and since all the directories are now removed, I also stripped the stub from the header (usually the DOS stub and the Rich MSVC++ signature live there, and we don't need them). We strip the directory table down to 12 elements and no further: the first 12 elements may be in use in the original file, and we will have to restore them. We could, of course, leave the absolute minimum of elements in the table, but this would give no real size benefit and would add extra code to the unpacker in case we suddenly need to expand the table back. Why cut the table to exactly 12 elements? Because the last four are definitely not required to launch a PE file successfully, so we can easily do without them. We could also check dynamically whether the file actually uses the remaining directories (load configuration, TLS and so on) and strip the table even further if it doesn't, but, I repeat, there is little point in that.
The last thing left to do is to save the packed file under a new name:
//Create new PE file
//Get the name of the original file without the directory
std::string base_file_name(argv[1]);
std::string dir_name;
std::string::size_type slash_pos;
if((slash_pos = base_file_name.find_last_of("/\\")) != std::string::npos)
{
    dir_name = base_file_name.substr(0, slash_pos + 1);    //Source file directory
    base_file_name = base_file_name.substr(slash_pos + 1); //Source file name
}

//Give the new file a name: "packed_" + original_file_name
//Prepend the original directory name to save it to the folder
//where the original file is stored
base_file_name = dir_name + "packed_" + base_file_name;

//Create the file
std::ofstream new_pe_file(base_file_name.c_str(), std::ios::out | std::ios::binary | std::ios::trunc);
if(!new_pe_file)
{
    //If we failed to create the file - display an error message
    std::cout << "Cannot create " << base_file_name << std::endl;
    return -1;
}

//Rebuild PE image
//Strip the DOS header, writing the NT headers over it
//(the second parameter (true) is responsible for this)
//Do not recalculate SizeOfHeaders - the third parameter is responsible for this
image.rebuild_pe(new_pe_file, true, false);

//Tell the user that the file was successfully packed
std::cout << "Packed image was saved to " << base_file_name << std::endl;
Nothing complicated happens in this part of the code; everything should be more or less clear from the comments. So that is all for this step. It is a more than substantial one, and there is plenty to think about. Obviously, the packed file will not load yet: it has no unpacker, we don't handle imports, we don't fix the entry point and much more... However, we can already estimate the compression ratio and check whether everything is packed as intended using any PE file viewer (I use CFF Explorer).
Original file:
Packed file:
As you can see, the first section's Virtual Address + Virtual Size in the second screenshot matches the SizeOfImage in the first one, and the virtual address of the first section has not changed - exactly what we wanted to achieve. In the second screenshot you can also see the contents of the second section, "coderpub". The compression ratio is not bad: from 1266 KB down to 362 KB.
See you at the next step! Questions are welcome - feel free to ask them in the comments.
And, as always, here is the current project version with the latest changes: own PE packer step 2.