Marco Ramilli explained MBR works and how is it possible to write a
bootloader program, this skill will help you to analyze next BootLoader Malware.
From time to time we might observe special Malware storing themselves into a MBR and run during the booting process. Attackers could use this neat technique to infect and to mess-up your disk and eventually asking for a ransom before restoring original disk-configurations (Petya was just one of the most infamous boot-ransomware). But this is only an already known scenario while humongous possibilities are still available for the attacker who holds physical rights to open your disk and to write in it whatever he desires. For this reason I believe it would be interesting to understand how MBR works and how is it possible to write a boot loader program, this skill will help you during the analysis of your next Boot Loader Malware.
How the PC boot process works ?
Actually the boot process is super easy. When you press the power button you are providing the right power to every electronic chips who needs it. The BIOS once is reached by electrical power starts by running its own stored code and when it finishes running its initialization routines it looks for bootable devices. A bootable device is a physically connected device who has 521 bytes of code at its beginning and that contains the boot magic number: 0x55AA as last 2 bytes. If the BIOS find 510 bytes followed by 0x55AA it takes the previous 510 bytes moves them into RAM (to 0x7c00 address) and assumes they are executable bytes. This code is the so-called bootloader. Just a side note: bootloader shall be written in 16bit since x86 compatible CPUs are working in “real_mode” due the limited available instruction set.
I am used to write and read assembly on “Intel sintax” (it’s the one I learned during my studies) but today I’d love to use GNU Assembler (compiler&linker) who implements AT&T syntax, which is quite different from the Intel one but it will just work fine for the simple code we are going to write. The first tool we are going to use is as, the GNU compiler, which takes as input an assembly file and it returns its binary representation. as -o boot.o boot.asm is what we are looking for. After the compiler we need a “linker” (GNU linker is called ld). We need to tell to the liner that we want a plain binary file without linked libraries or linked symbols, fir such a reason we’re going to use –oformat binar. We also need to tell to the “linker” where the code starts (-e main). We would add the parameter -Ttext 0x7c00 just in case the code we are going to write does not fit into a 16bit address space, so we will force our linker to map the main function at such address which we know be the address where the BIOS runs bootloaders. Assuming our code named boot.asm and our original entry point to be labelled as ‘main’ we could use the following command: ld -o boot.bin –oformat binary -e main -Ttext 0x7c00 boot.o. For running the compiled code I’ve just used qemu in the following simple way: qemu-system-x86_64 boot.bin
The following code runs on boot showing up 3 strings and a realtime clock progression. The code have been developed as demo, not caring about performance and optimization, I am sure the code could be optimized and beautified, but this is not my point for this post.
Since the BIOS is in near memory, we are able to use a whole BIOS instruction set as described in here. The used interrupts for the demo bootloader are the following:
1. Int_10,02 for setting up screen size
2. int_10,07 for cleaning the screen from BIOS outputs
3. int_12a,02 for setting cursor positions
4. int_1a,02 for reading the clock status
5.int_10,0e for writing character to screen
Following the source “booting source” code:
Even if the code is
2] .global main
The last two lines:
112] .fill 510-(.-init), 1, 0
113] .word 0xaa55
The entire code exploits %cx register to setup the current state. For example %cx could be: 0x0000if msg is printed, 0x0001 if msg2 is printed, 0x0002 if msg3 is printed and 0x0003 if we want to start the clock printing loop. A very nice primitive command lodsb is used to iterate over string characters (for more details here) in order to print them to monitor until null byte ( ).
Running the boot image
The original post is available on Marco Ramilli’s blog
About the author: Marco Ramilli, Founder of Yoroi
I am a computer security scientist with an intensive hacking background. I do have a MD in computer engineering and a PhD on computer security from University of Bologna. During my PhD program I worked for US Government (@ National Institute of Standards and Technology, Security Division) where I did intensive researches in Malware evasion techniques and penetration testing of electronic voting systems.
I do have experience on security testing since I have been performing penetration testing on several US electronic voting systems. I’ve also been encharged of testing uVote voting system from the Italian Minister of homeland security. I met Palantir Technologies where I was introduced to the Intelligence Ecosystem. I decided to amplify my cybersecurity experiences by diving into SCADA security issues with some of the biggest industrial aglomerates in Italy. I finally decided to found Yoroi: an innovative Managed Cyber Security Service Provider developing some of the most amazing cybersecurity defence center I’ve ever experienced! Now I technically lead Yoroi defending our customers strongly believing in: Defence Belongs To Humans