The 404CTF is a well-known French online CTF which lasts for a month. This year there were more than 2000 participants, and I managed to place in the top 150. I learned so much during that month thanks to the high-quality challenges. Among them, I especially enjoyed the 3 Revers(ibl)e Engineering
challenges, where I learned much more about automating reverse engineering tasks. That’s why I decided to share this writeup of the second one.
Revers(ibl)e Engineering [1/2]
Here is a translated version of the challenge description we were given :
After a challenging year marked by competitions, you decide to return to your hometown. You are filled with emotion as you spot the dojo of your childhood and decide to visit it. Your former sensei is there, smiling.
"Clairvoyance is the most formidable weapon of a fighter. To anticipate each movement before it is made, that is the true mastery of combat. Rise to my challenge and prove your worth."
Retrieve a zip archive with netcat containing a crackme and a token, send back the token with the solution to the crackme to a second server, receive a flag... Easy. Small detail: you have twenty seconds to do all this, and the binary changes with each attempt.
Connection:
nc challenges.404ctf.fr 31998 > chall.zip
nc challenges.404ctf.fr 31999
Author: @Izipak (_hdrien)
You can access these two examples of archives sent by the server : archive 1, archive 2.
Let’s test one of these binaries :
$ ./crackme.bin A
L'argument doit comporter 16 caractères.
$ ./crackme.bin AAAAAAAAAAAAAAAA
Dommage... Essaie encore!
crackme.bin
asks for a 16 characters long input, then shuts down as it doesn’t seem to be the correct password.
Let’s try to send a solution :
$ nc challenges.404ctf.fr 31998 > chall.zip && unzip chall.zip && cat token.txt && printf '\n' && nc challenges.404ctf.fr 31999
Archive: chall.zip
extracting: crackme.bin
extracting: token.txt
d5c35f615fa524da3326d295ea7585b6
Token ?
d5c35f615fa524da3326d295ea7585b6
Alors, la solution ?
test
Nope...
The token seems to be used by the server to identify which crackme we’ve got.
Let’s analyse the first crackme.
Static analysis
Using Ghidra, we can find these functions :
undefined8 main(int param_1,long user_input)
{
int win_condition;
undefined8 uVar1;
size_t user_input_len;
void *encoded_input;
undefined8 encoded_pass_1;
undefined8 encoded_pass_2;
int input_len;
if (param_1 < 2) {
puts("J\'ai besoin d\'un argument!");
uVar1 = 1;
}
else {
user_input_len = strlen(*(char **)(user_input + 8));
input_len = (int)user_input_len;
if (input_len == 16) {
encoded_pass_1 = 0xa9dab58698ccb89d;
encoded_pass_2 = 0xbbd949da83d394c9;
encoded_input = (void *)encode_input(*(undefined8 *)(user_input + 8));
win_condition = memcmp(encoded_input,&encoded_pass_1,16);
if (win_condition == 0) {
puts("GG!");
uVar1 = 0;
}
else {
puts("Dommage... Essaie encore!");
uVar1 = 1;
}
}
else {
puts(&DAT_00102028);
uVar1 = 1;
}
}
return uVar1;
}
void * encode_input(long input)
{
byte bVar1;
byte bVar2;
void *pvVar3;
int local_c;
pvVar3 = malloc(0x10);
for (local_c = 0; local_c < 0x10; local_c = local_c + 1) {
bVar1 = *(byte *)(input + local_c);
bVar1 = bVar1 ^ (byte)((bVar1 >> 2 & 1) << 4) ^ (bVar1 >> 5) * '\x02' & 2;
bVar1 = bVar1 ^ (byte)(((uint)bVar1 & (int)(uint)bVar1 >> 3 & 1U) << 7);
bVar1 = bVar1 ^ (bVar1 >> 7 & (byte)((int)(uint)bVar1 >> 2) & 1) * '\x02';
bVar1 = bVar1 ^ ((char)bVar1 >> 7) * -2 & 2U;
bVar1 = bVar1 ^ (byte)((int)(uint)bVar1 >> 5) & 1 & (byte)((int)(uint)bVar1 >> 1) ^ 0x40;
bVar2 = bVar1 ^ bVar1 >> 1 & 1;
bVar1 = bVar2 ^ (byte)((int)(uint)bVar2 >> 5) & 1 & (byte)((int)(uint)bVar2 >> 2) ^
(byte)((bVar1 >> 5 & 1) << 4);
bVar1 = bVar1 ^ (byte)((bVar1 & 1) << 2);
bVar1 = bVar1 ^ (byte)((int)(uint)bVar1 >> 1) & 1 & (byte)((int)(uint)bVar1 >> 6) ^ 0x21;
bVar1 = bVar1 ^ bVar1 >> 5 & 1 ^ 0x80;
bVar1 = bVar1 ^ (byte)(((int)(uint)bVar1 >> 2 & (int)(uint)bVar1 >> 1 & 1U) << 3) ^
(bVar1 >> 7) << 3;
*(byte *)((long)local_c + (long)pvVar3) =
bVar1 ^ (byte)(((int)(uint)bVar1 >> 3 & (int)(uint)bVar1 >> 4 & 1U) << 2) ^ 5;
}
return pvVar3;
}
The binary is stripped of symbols, but I’ve renamed some of the function and variables to make this readable.
As we can see in the code above, the main
function starts by checking if our input (stored as user_input
) is indeed 16 characters long :
user_input_len = strlen(*(char **)(user_input + 8));
input_len = (int)user_input_len;
if (input_len == 16) {
If that’s the case, it passes it to the encode_input
function. This function will encode our input, and return it as encoded_input
. Finally, the encoded password stored under encoded_pass_1
and encoded_pass_2
will be compared to encoded_input
:
encoded_pass_1 = 0xa9dab58698ccb89d;
encoded_pass_2 = 0xbbd949da83d394c9;
encoded_input = (void *)encode_input(*(undefined8 *)(user_input + 8));
win_condition = memcmp(encoded_input,&encoded_pass_1,16);
A detail to understand in the previous snippet of code, is that the memcmp
function is given a pointer towards encoded_pass_1
, but is being told that the string to compare is 16 bytes long. Since both encoded_pass_1
and encoded_pass_2
are 8 bytes long, and stored right next to each other on the stack (being declared one after the other), memcmp
will compare our encoded input to both variables.
Next, the program will tell us if we gave the right password depending on the memcmp
return value :
if (win_condition == 0) {
puts("GG!");
uVar1 = 0;
}
else {
puts("Dommage... Essaie encore!");
uVar1 = 1;
}
In other words, the password will be a 16 characters long input that once encoded by encode_input
corresponds to encoded_pass_1
and encoded_pass_2
.
Let’s analyze the second crackme to try and find differences between the two programs. Using Ghidra, we can find :
encoded_pass_1 = 0x19182d3d0fd0973e;
encoded_pass_2 = 0x30272025ded09705;
void * encode_input(long param_1)
{
byte bVar1;
byte bVar2;
void *pvVar3;
int local_c;
pvVar3 = malloc(0x10);
for (local_c = 0; local_c < 0x10; local_c = local_c + 1) {
bVar1 = *(byte *)(param_1 + local_c) ^ (*(byte *)(param_1 + local_c) >> 4) << 7 ^ 0x80;
bVar1 = bVar1 ^ (bVar1 >> 7) << 2;
bVar1 = bVar1 ^ (byte)(((int)(uint)bVar1 >> 2 & (int)(uint)bVar1 >> 1 & 1U) << 7);
bVar1 = bVar1 ^ (bVar1 >> 6) * '\x02' & 2 ^ 0x80;
bVar2 = bVar1 ^ (byte)(((uint)(bVar1 >> 7) & (int)(uint)bVar1 >> 4 & 1U) << 2) ^
(byte)((bVar1 >> 6 & 1) << 4);
bVar2 = bVar2 ^ (byte)((bVar2 >> 1 & 1) << 2);
bVar2 = bVar2 ^ (byte)(((uint)(bVar1 >> 7) & (int)(uint)bVar2 >> 5 & 1U) << 2);
bVar1 = bVar2 ^ (byte)(((int)(uint)bVar2 >> 4 & (int)(uint)bVar2 >> 1 & 1U) << 6) ^ 6;
bVar1 = bVar1 ^ ((byte)((int)(uint)bVar1 >> 6) & (byte)((int)(uint)bVar1 >> 3) & 1) * '\x02' ^
0xc0;
bVar1 = bVar1 ^ (bVar1 & (byte)((int)(uint)bVar1 >> 2) & 1) * '\x02';
*(byte *)((long)local_c + (long)pvVar3) =
bVar1 ^ ((byte)((int)(uint)bVar1 >> 5) & (byte)((int)(uint)bVar1 >> 6) & 1) * '\x02' ^ 1;
}
return pvVar3;
}
As we can see, it seems that both the encoded password and encoding function are randomly generated for each crackme. But since we’ve identified both the encoded password, and the encoding function, we can simply bruteforce the crackme.
To do so, we’ll need to test every characters possible to find their encoded version and cross reference our results with the encoded password. We’ll also need to automate this procedure since it must be completed in under 20 seconds.
Let’s try all this on the first crackme using GDB and Python.
Dynamic analysis
We can execute any GDB command from a Python script, and get its output as a string to parse. For example, here is a script retrieving the entry point of a program :
import gdb
output = gdb.execute("info file", from_tty=False, to_string=True)
for i in output.split("\n"):
if "Entry point" in i:
print("This is the entry point : ", i.split(" ")[2])
$ gdb -q crackme.bin -x example.py
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
Reading symbols from crackme.bin...
(No debugging symbols found in crackme.bin)
This is the entry point : 0x1080
Let’s try, step by step, to test a few characters. We’ll code a script executing the same commands we do, and then put it all together to crack this binary.
First, the binary being stripped and PIE being activated, we’ll need to calculate the address of the instructions we want to put breakpoints at based on the entry point and offsets we’ll determine. Let’s start by retrieving the entry point :
gef➤ info file
Symbols from "/home/coucou/Documents/404CTF_WriteUps/Reversible_Engineering_1/chall_example1/crackme.bin".
Local exec file:
`/home/coucou/Documents/404CTF_WriteUps/Reversible_Engineering_1/chall_example1/crackme.bin',
file type elf64-x86-64.
Entry point: 0x1080'
[...]
gef➤ run
Starting program: /home/coucou/Documents/404CTF_WriteUps/Reversible_Engineering_1/chall_example1/crackme.bin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
J'ai besoin d'un argument!
[Inferior 1 (process 49546) exited with code 01]
gef➤ info file
Symbols from "/home/coucou/Documents/404CTF_WriteUps/Reversible_Engineering_1/chall_example1/crackme.bin".
Local exec file:
`/home/coucou/Documents/404CTF_WriteUps/Reversible_Engineering_1/chall_example1/crackme.bin',
file type elf64-x86-64.
Entry point: 0x555555555080'
[...]
Once we’ve ran the program once, the addresses won’t change inside GDB anymore.
We’ve coded this first part earlier to show how Python works with GDB, so let’s jump directly to the next step : extracting the encoded password. Those should be in memory when they’re being defined in the main
function :
encoded_pass_1 = 0xa9dab58698ccb89d;
encoded_pass_2 = 0xbbd949da83d394c9;
Let’s try to find these instructions in GDB :
───────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x5555555551c7 jmp 0x55555555523d
0x5555555551c9 movabs rax, 0xa9dab58698ccb89d
0x5555555551d3 movabs rdx, 0xbbd949da83d394c9
→ 0x5555555551dd mov QWORD PTR [rbp-0x20], rax
0x5555555551e1 mov QWORD PTR [rbp-0x18], rdx
0x5555555551e5 mov rax, QWORD PTR [rbp-0x30]
0x5555555551e9 add rax, 0x8
0x5555555551ed mov rax, QWORD PTR [rax]
0x5555555551f0 mov rdi, rax
gef➤ print $rax
$1 = 0xa9dab58698ccb89d
gef➤ print $rdx
$3 = 0xbbd949da83d394c9
We can see that the encoded password is stored in the registers. Now let’s calculate the offset between this instruction and the entry point using Python :
>>> 0x5555555551dd - 0x555555555080
349
Using all of this, we can write this script extracting the encoded password :
import gdb
ENCODED_PASS_OFFSET = 349
def gdb_exec(command: str) -> str:
return gdb.execute(command, from_tty=False, to_string=True)
def find_entry() -> str:
info = gdb_exec("info file")
for i in info.split("\n"):
if "Entry point:" in i:
return i.split(" ")[2]
return ""
def find_encoded_pass(entry_addr: str) -> [hex]:
pass_addr = hex(int(entry_addr, 16)+ENCODED_PASS_OFFSET)
gdb_exec(f"break *{pass_addr}")
gdb_exec("run "+"A"*16)
full_pass = [ "_" for i in range(16) ]
enc_pass = gdb_exec("print $rax")
enc_pass = enc_pass.split(" ")[2][2:18]
for i in range(int(len(enc_pass)/2)):
full_pass[7-i] = hex(int(enc_pass[(i*2):(i*2)+2], 16))
enc_pass = gdb_exec("print $rdx")
enc_pass = enc_pass.split(" ")[2][2:18]
for i in range(int(len(enc_pass)/2)):
full_pass[15-i] = hex(int(enc_pass[(i*2):(i*2)+2], 16))
gdb_exec("delete breakpoints")
return full_pass
if __name__ == "__main__":
gdb_exec("file crackme.bin")
gdb_exec("run")
entry_addr = find_entry()
enc_pass = find_encoded_pass(entry_addr)
print(enc_pass)
$ gdb -q -x test.py
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
J'ai besoin d'un argument!
[Inferior 1 (process 50339) exited with code 01]
Breakpoint 1, 0x00005555555551dd in ?? ()
['0x9d', '0xb8', '0xcc', '0x98', '0x86', '0xb5', '0xda', '0xa9', '0xc9', '0x94', '0xd3', '0x83', '0xda', '0x49', '0xd9', '0xbb']
Now we’ll need to input characters and extract their encoded versions. The encoded input should be in memory during the memcmp
call :
win_condition = memcmp(encoded_input,&encoded_pass_1,16);
Let’s try and find this instruction in GDB :
gef➤ r $(python3 -c 'import sys; sys.stdout.buffer.write(b"ABCDEFGHIJQLMNOP")')
───────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x5555555551ff mov edx, 0x10
0x555555555204 mov rsi, rax
0x555555555207 mov rdi, rcx
●→ 0x55555555520a call 0x555555555050 <memcmp@plt>
↳ 0x555555555050 <memcmp@plt+0000> jmp QWORD PTR [rip+0x2fba] # 0x555555558010 <[email protected]>
0x555555555056 <memcmp@plt+0006> push 0x2
0x55555555505b <memcmp@plt+000b> jmp 0x555555555020
0x555555555060 <malloc@plt+0000> jmp QWORD PTR [rip+0x2fb2] # 0x555555558018 <[email protected]>
0x555555555066 <malloc@plt+0006> push 0x3
0x55555555506b <malloc@plt+000b> jmp 0x555555555020
───────────────────────────────────────────────────────────────────────────────────────────── arguments (guessed) ────
memcmp@plt (
$rdi = 0x00005555555592a0 → 0xa5b3bab8bdafa2a8,
$rsi = 0x00007fffffffd960 → 0xa9dab58698ccb89d,
$rdx = 0x0000000000000010,
$rcx = 0x00005555555592a0 → 0xa5b3bab8bdafa2a8
)
gef➤ x/2gx $rdi
0x5555555592a0: 0xa5b3bab8bdafa2a8 0xb933b638b1bcaa2f
Our encoded password seems to be stored inside rdi
. Since we are working on a little endian system, we can deduce that the character A
is being encoded to 0xa8
, B
to 0xa2
, etc. Let’s calculate the offset between this instruction and the entry point using Python :
>>> 0x55555555520a - 0x555555555080
394
Using all of this, we can add these functions to our script to input 16 characters and extract their encoded version :
MEMSET_OFFSET = 394
def set_break(entry_addr: str) -> None:
new_char_addr = hex(int(entry_addr, 16)+MEMSET_OFFSET)
gdb_exec(f"break *{new_char_addr}")
def encode(x: [chr]) -> [int]:
argument = ""
for i in x:
argument += i
gdb_exec("run "+argument)
enc_x_raw = gdb_exec("x/2gx $rdi")
full_enc_x = [ "_" for i in range(16) ]
enc_x = enc_x_raw.split("\t")[1][2:18]
for i in range(int(len(enc_x)/2)):
full_enc_x[7-i] = hex(int(enc_x[(i*2):(i*2)+2], 16))
enc_x = enc_x_raw.split("\t")[2][2:18]
for i in range(int(len(enc_x)/2)):
full_enc_x[15-i] = hex(int(enc_x[(i*2):(i*2)+2], 16))
return full_enc_x
if __name__ == "__main__":
gdb_exec("file crackme.bin")
gdb_exec("run")
entry_addr = find_entry()
enc_pass = find_encoded_pass(entry_addr)
set_break(entry_addr)
encoded = encode("ABCDEFGHIJQLMNOP")
print(encoded)
$ gdb -q -x test.py
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 13.2 in 0.00ms using Python engine 3.11
J'ai besoin d'un argument!
[Inferior 1 (process 51348) exited with code 01]
Breakpoint 1, 0x00005555555551dd in ?? ()
Breakpoint 2, 0x000055555555520a in ?? ()
['0xa8', '0xa2', '0xaf', '0xbd', '0xb8', '0xba', '0xb3', '0xa5', '0x2f', '0xaa', '0xbc', '0xb1', '0x38', '0xb6', '0x33', '0xb9']
Those values match the one we got earlier : our script is indeed working ! Let’s add the final touch to it, and crack this binary.
Exploit
To finish our script, we’ll need to repeat the operation we did previously, checking new characters every time, and comparing their encoded version to the encoded password to find the solution. This function should do the trick :
CHARACTERS = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
def find_pass(enc_pass: [hex]) -> str:
password = [ "_" for i in range(16) ]
for i in range(4):
charac = CHARACTERS[i*16:16+i*16]
if i == 3:
charac.append("A")
charac.append("A")
encoded = encode(charac)
for k in range(16):
for j in range(16):
if enc_pass[k] == encoded[j]:
password[k] = charac[j]
return "".join(password)
if __name__ == "__main__":
gdb_exec("file crackme.bin")
gdb_exec("run")
entry_addr = find_entry()
enc_pass = find_encoded_pass(entry_addr)
set_break(entry_addr)
password = find_pass(enc_pass)
print(password)
$ gdb -q -x solver.py
cE2bxX4T3j5q496S
$ ./crackme.bin cE2bxX4T3j5q496S
GG!
We got the solution ! Now all that remains, is sending it back to the servers immediately using Python. This script puts it all together and should give us the flag in a few seconds :
$ nc challenges.404ctf.fr 31998 > chall.zip && unzip chall.zip && chmod +x crackme.bin && gdb -q -x solver.py
Token ?
> 7bdbab98c126b915d314e52aeb63b33b
Alors, la solution ?
> M2GW4OQsS0MiLsYm
GG. Voila ton flag!
404CTF{e9d749db81e9f8caf745a5547da13579}
We get the flag !
You can find more write-ups of the 404CTF on my github