Can you crack it? Hell yeah!
UK intelligence agency GCHQ challenge… Can you crack it?
Whilst browsing the BBC website I found an article which caught my eye: http://www.bbc.co.uk/news/technology-15968878. Basically, the UK intelligence agency GCHQ has launched a code-cracking competition to help attract new talent. I thought I'd give it a try to see how complicated it is. And so it began…
Warm up
Target: http://www.canyoucrackit.co.uk/ OK, the first suspicious thing was that the puzzle is published as an image and not as plain text. So I assumed that something more was hidden in this image than just a picture of some text. I loaded the picture into different image editors and tried manipulating colors, contrast and so on… However, after wasting about an hour I still had no idea what it could be, and I felt like I was getting nowhere. Then I found something in the image header. I used an online Exif viewer: http://regex.info/exif.cgi?imgurl=http%3A%2F%2Fwww.canyoucrackit.co.uk%2Fimages%2Fcyber.png The comment field looked interesting:
QkJCQjIAAACR2PFtcCA6q2eaC8SR+8dmD/zNzLQC+td3tFQ4qx8O447T DeuZw5P+0SsbEcYR78jKLw==
Even a monkey can tell that this is Base64-encoded data. So I decoded it and got something which did not make any sense at that moment: a binary string which starts with BBBB2…. Let's leave it for a while.
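For anyone who wants to repeat the decoding step, here is a minimal Python sketch. It only uses the comment string quoted above; the variable names are mine, and the whitespace inside the string is assumed to be a line-wrapping artifact, so it is stripped before decoding.

```python
import base64

# Comment field exactly as quoted above, including the stray space.
COMMENT = (
    "QkJCQjIAAACR2PFtcCA6q2eaC8SR+8dmD/zNzLQC+td3tFQ4qx8O447T "
    "DeuZw5P+0SsbEcYR78jKLw=="
)

# Drop whitespace (assumed to be a wrapping artifact), then decode.
raw = base64.b64decode("".join(COMMENT.split()))

print(len(raw), "bytes")
print(raw[:5])        # the readable prefix of the blob
print(raw.hex(" "))   # full hex dump, one byte per group
```

The readable prefix is the BBBB2 mentioned above; everything after the first few bytes looks like random binary data.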
Digging deeper
I couldn't find anything else interesting, so I started digging into the data represented in the image. There is an interesting string in the binary file, …X=AAAAuCX=BBBBu (I can sense a correlation between the data encoded in the image header and the data represented in the image)… However, what really caught my eye is that it starts with 0xEB (which normally represents a short jmp instruction) and that towards the end we have …0xcd 0x80 0x90 0x90, which would represent:
CD80 int 0x80
0x90 nop
0x90 nop
This made me think that it is machine code. Let's try to disassemble it:
ndisasm -b 32 code.bin > disasmcode.asm
which produced the following listing:
00000000 EB04 jmp short 0x6
00000002 AF scasd
00000003 C2BFA3 ret 0xa3bf
00000006 81EC00010000 sub esp,0x100
0000000C 31C9 xor ecx,ecx
0000000E 880C0C mov [esp+ecx],cl
00000011 FEC1 inc cl
00000013 75F9 jnz 0xe
00000015 31C0 xor eax,eax
00000017 BAEFBEADDE mov edx,0xdeadbeef
0000001C 02040C add al,[esp+ecx]
0000001F 00D0 add al,dl
00000021 C1CA08 ror edx,0x8
00000024 8A1C0C mov bl,[esp+ecx]
00000027 8A3C04 mov bh,[esp+eax]
0000002A 881C04 mov [esp+eax],bl
0000002D 883C0C mov [esp+ecx],bh
00000030 FEC1 inc cl
00000032 75E8 jnz 0x1c
00000034 E95C000000 jmp dword 0x95
00000039 89E3 mov ebx,esp
0000003B 81C304000000 add ebx,0x4
00000041 5C pop esp
00000042 58 pop eax
00000043 3D41414141 cmp eax,0x41414141
00000048 7543 jnz 0x8d
0000004A 58 pop eax
0000004B 3D42424242 cmp eax,0x42424242
00000050 753B jnz 0x8d
00000052 5A pop edx
00000053 89D1 mov ecx,edx
00000055 89E6 mov esi,esp
00000057 89DF mov edi,ebx
00000059 29CF sub edi,ecx
0000005B F3A4 rep movsb
0000005D 89DE mov esi,ebx
0000005F 89D1 mov ecx,edx
00000061 89DF mov edi,ebx
00000063 29CF sub edi,ecx
00000065 31C0 xor eax,eax
00000067 31DB xor ebx,ebx
00000069 31D2 xor edx,edx
0000006B FEC0 inc al
0000006D 021C06 add bl,[esi+eax]
00000070 8A1406 mov dl,[esi+eax]
00000073 8A341E mov dh,[esi+ebx]
00000076 883406 mov [esi+eax],dh
00000079 88141E mov [esi+ebx],dl
0000007C 00F2 add dl,dh
0000007E 30F6 xor dh,dh
00000080 8A1C16 mov bl,[esi+edx]
00000083 8A17 mov dl,[edi]
00000085 30DA xor dl,bl
00000087 8817 mov [edi],dl
00000089 47 inc edi
0000008A 49 dec ecx
0000008B 75DE jnz 0x6b
0000008D 31DB xor ebx,ebx
0000008F 89D8 mov eax,ebx
00000091 FEC0 inc al
00000093 CD80 int 0x80
00000095 90 nop
00000096 90 nop
00000097 E89DFFFFFF call dword 0x39
0000009C 41 inc ecx
0000009D 41 inc ecx
0000009E 41 inc ecx
0000009F 41 inc ecx
A quick scan of the code gave me the impression that this IS the way to go, as we can find instructions like
..
mov edx,0xdeadbeef
..
inc al
int 0x80 ; nice way to exit from the program
..
We also have three nice loops. Let’s run it and see what happens.
/* Stage 1 bytes as read from the image. */
char code[] =
    "\xeb\x04\xaf\xc2\xbf\xa3\x81\xec\x00\x01\x00\x00\x31\xc9\x88\x0c"
    "\x0c\xfe\xc1\x75\xf9\x31\xc0\xba\xef\xbe\xad\xde\x02\x04\x0c\x00"
    "\xd0\xc1\xca\x08\x8a\x1c\x0c\x8a\x3c\x04\x88\x1c\x04\x88\x3c\x0c"
    "\xfe\xc1\x75\xe8\xe9\x5c\x00\x00\x00\x89\xe3\x81\xc3\x04\x00\x00"
    "\x00\x5c\x58\x3d\x41\x41\x41\x41\x75\x43\x58\x3d\x42\x42\x42\x42"
    "\x75\x3b\x5a\x89\xd1\x89\xe6\x89\xdf\x29\xcf\xf3\xa4\x89\xde\x89"
    "\xd1\x89\xdf\x29\xcf\x31\xc0\x31\xdb\x31\xd2\xfe\xc0\x02\x1c\x06"
    "\x8a\x14\x06\x8a\x34\x1e\x88\x34\x06\x88\x14\x1e\x00\xf2\x30\xf6"
    "\x8a\x1c\x16\x8a\x17\x30\xda\x88\x17\x47\x49\x75\xde\x31\xdb\x89"
    "\xd8\xfe\xc0\xcd\x80\x90\x90\xe8\x9d\xff\xff\xff\x41\x41\x41\x41";

int main(void) {
    /* Cast the byte array to a function pointer and jump into it. */
    (*(void(*)()) code)();
    return 0;
}
After executing the code above it seemingly “does” nothing. So I started to analyze the code.
OK, we have 3 loops which basically do key scheduling and decryption, and it looks like an RC4 implementation. However, this piece of code tells us that we are missing something.
...
00000034 E95C000000 jmp dword 0x95
00000039 89E3 mov ebx,esp
0000003B 81C304000000 add ebx,0x4
00000041 5C pop esp
00000042 58 pop eax
00000043 3D41414141 cmp eax,0x41414141
00000048 7543 jnz 0x8d
0000004A 58 pop eax
0000004B 3D42424242 cmp eax,0x42424242
00000050 753B jnz 0x8d
..
After executing the call instruction, the code pops the return address into esp, so the stack pointer now points to the end of our mystical data, where we have 0x41414141 in memory during execution. Everything makes sense until we try to pop into the eax register the next value, which would have to be 0x42424242. However, we get nothing useful, as we have already reached the end of our data. Clearly we are missing some data which should start with 0x42424242 (BBBB). Luckily, we already have something like that from the image header. So I appended that data to the existing code and tried to run it again. Using gdb I could examine the memory after execution and found something interesting.
tadas@platoon:~/crack> gdb code
GNU gdb (GDB) SUSE (6.8.91.20090930-2.4)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i586-suse-linux".
For bug reporting instructions, please see:
...
Reading symbols from /home/tadas/crack/code...done.
(gdb) b *code+141
Breakpoint 1 at 0x804c08d
(gdb) r
Starting program: /home/tadas/crack/code
Missing separate debuginfo for /lib/ld-linux.so.2
Try: zypper install -C "debuginfo(build-id)=d7706cbaa0ca09319cb645eac789cb8399078797"
Missing separate debuginfo for /lib/libc.so.6
Try: zypper install -C "debuginfo(build-id)=ee302691046515fe3766ae3b7d47afd3e3a8d063"
Breakpoint 1, 0x0804c08d in code ()
(gdb) x/s $edi-0x32
0xbffff2da: "GET /15b436de1f9107f3778aad525e5d0b20.js HTTP/1.1"
(gdb)
Oh yeah!!! What a great feeling, we found what we were looking for… GET /15b436de1f9107f3778aad525e5d0b20.js HTTP/1.1. Although it took me a while to get familiar with the code and understand it, I managed to pass the first level.
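The same decryption can also be tried without running the machine code at all. The following Python sketch mirrors the three loops of the disassembly as I read them; it is not the code used in the original solution, and the function names are made up for illustration. One detail is taken from the listing rather than from textbook RC4: the keystream byte is loaded into bl, the same register that holds the second index, so that index carries the keystream byte into the next round.

```python
import base64
import struct

# Comment field from the PNG header (joined without the whitespace).
COMMENT = (
    "QkJCQjIAAACR2PFtcCA6q2eaC8SR+8dmD/zNzLQC+td3tFQ4qx8O447T"
    "DeuZw5P+0SsbEcYR78jKLw=="
)

# mov edx,0xdeadbeef / add al,dl / ror edx,0x8 feeds the low byte first.
KEY = struct.pack("<I", 0xDEADBEEF)


def key_schedule(key):
    """Loops 1 and 2: fill S with 0..255, then shuffle it with the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s


def crypt_stream(data, s):
    """Loop 3: XOR data with the keystream, working on a copy of S."""
    s = list(s)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        k = s[(s[i] + s[j]) & 0xFF]
        out.append(byte ^ k)
        j = k  # bl is overwritten by the keystream byte in the listing
    return bytes(out)


blob = base64.b64decode(COMMENT)
# Layout expected by the code: 4-byte marker, 4-byte length, then the data.
magic, length = struct.unpack_from("<4sI", blob, 0)
ciphertext = blob[8:8 + length]
plaintext = crypt_stream(ciphertext, key_schedule(KEY))
print(magic, length, plaintext)
```

If the transcription of the listing is faithful, the printed plaintext should match the request line seen in the gdb session; treat that as something to check rather than a given.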
Stage 2
http://www.canyoucrackit.co.uk/15b436de1f9107f3778aad525e5d0b20.js I got that script. It is a clear definition of a virtual machine which needs to be implemented. Nothing challenging, just boring scripting to implement that VM. It took me a hell of a lot of time to implement, more than I thought, and it's nothing really interesting, so I won't go into details. After completing this task I got another URL: GET /da75370fe15c4148bd4ceec861fbdaa5.exe HTTP/1.0
Stage 3
So far good results have been achieved. I downloaded the executable, which is a Windows executable. It's interesting, as the code in stage 1 is written for Linux and now we have a Windows binary. So I tried to run it, however I got a missing DLL error: I was missing cygcrypt-0.dll. Well, it probably tells us that this executable will do some sort of encrypting/decrypting. Anyway, I downloaded all the missing DLLs and tried to run it again:
C:\sandbox>da75370fe15c4148bd4ceec861fbdaa5.exe
keygen.exe
usage: keygen.exe hostname
C:\sandbox>da75370fe15c4148bd4ceec861fbdaa5.exe www.canyoucrackit.co.uk
keygen.exe
error: license.txt not found
I created an empty license.txt file and tried again:
keygen.exe error: license.txt invalid
Pfuuhh… Now I got the impression that they are looking for people who are capable of writing key generators. So the next step is to find out what we should place into the license.txt file to make this thing work. A quick look into the executable file showed many readable strings:
hqDTK7b8K2rvw
keygen.exe
usage: keygen.exe hostname
r
license.txt
error: license.txt not found
%s
loading stage1 license key(s)...
loading stage2 license key(s)...
error: license.txt invalid
error: gethostbyname() failed
error: connect("%s") failed
GET /%s/%x/%x/%x/key.txt HTTP/1.0
request:
%s
error: send() failed
response:
Two strings look interesting: /%s/%x/%x/%x/key.txt and hqDTK7b8K2rvw. The first one is probably our answer, we just need to replace %s and %x with the correct values; the second one looks like some sort of DES-based password hash. Anyway, first of all I had to figure out what kind of stuff I have to put into the license.txt file to make this thing happy. After spending hours digging through the disassembled code I found out that the first thing we need to put into the file is a string which starts with gchq, followed by the plaintext password behind hqDTK7b8K2rvw, which is 8 characters long. OK, so I started a password cracker and continued my investigation. Instead of waiting for the password cracker to complete (it can take ages, as we know) I did a bit of patching by replacing hqDTK7b8K2rvw with a hash whose password I already know: aaaaaaaa encrypted with salt hq gave me hqWLFYOHRc1H6. Now my license file looks like this: gchqaaaaaaaa.
keygen.exe
loading stage1 license key(s)...
loading stage2 license key(s)...
request:
GET /hqWLFYOHRc1H6/0/0/0/key.txt HTTP/1.0
response:
HTTP/1.0 404 Not Found
Hah, it's verifying the password. However, do we really need it? We can later replace the hash in the URL with the original one. So far so good… But the URL is still not valid and key.txt is not found. OK, more debugging revealed that license.txt must contain extra data:
- gchq
- 8 character password (decrypted)
- 4 bytes (stage 1) ?
- 4 bytes (stage 2) ?
- 4 bytes (stage 2) ?
So I created a license.txt file containing gchqaaaaaaaa123412345678 and got:
GET /hqWLFYOHRc1H6/34333231/34333231/38373635/key.txt HTTP/1.0
It does something, but it's not what I really want. I just have to figure out what data I need. The first thing I thought about was the firmware in the js file, whose purpose I couldn't figure out; the firmware is exactly 8 bytes long: firmware: [0xd2ab1f05, 0xda13f110]. I placed it into my license.txt file and tried again. Fairly soon I got the idea of how the URL is constructed:
/hash/stage1(4 bytes in hex)/stage2(4 bytes in hex)/stage2(4 bytes in hex)/key.txt
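The mapping from license.txt to the URL can be sketched in a few lines of Python. This is my reconstruction from the observed requests, not code taken from the executable: the function name key_path is hypothetical, the hash is passed in separately because it comes from the binary rather than from the file, and the three values are assumed to be read as little-endian 32-bit integers (which is what the 1234 test input suggests).

```python
import struct


def key_path(license_bytes, password_hash):
    """Build the request path from a 24-byte license.txt (hypothetical helper)."""
    # Bytes 0..3 are "gchq", bytes 4..11 the password; the rest are three
    # little-endian 32-bit words: one for stage 1 and two for stage 2.
    stage1, stage2a, stage2b = struct.unpack_from("<3I", license_bytes, 12)
    return "/%s/%x/%x/%x/key.txt" % (password_hash, stage1, stage2a, stage2b)


# The test license from above: "1234", "1234", "5678" as the three words.
print(key_path(b"gchqaaaaaaaa123412345678", "hqWLFYOHRc1H6"))
```

With the test license this yields the same path as the request shown earlier, which is why the ASCII digits appear byte-reversed in the URL.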
And so it continues… I almost lost hope and was close to giving up on guessing the exact missing data and the combination which would have to be inserted into the final URL. I was certain that the firmware had to be used somewhere; however, I was not sure that this data is the final answer to the problem (maybe it's just a key or encrypted data), and I had no idea about the stage 1 data. I tried many combinations. It was already very late on a quiet night, so I thought I'd take a break and go for a walk to get some fresh air. After 30 minutes, on my way back home, I remembered that stage 1 has a jmp instruction which skips over 4 bytes that are never used in that solution. Why would we want that? And we are searching for 4 bytes! So I tried placing them into my URL, /hqDTK7b8K2rvw/a3bfc2af/d2ab1f05/da13f110/key.txt, and got my answer (of course it took me some time trying different combinations)!! Yeahooo. Done. Great success! My final license.txt file looked like this:
67 63 68 71 61 61 61 61 61 61 61 61 AF C2 BF A3 05 1F AB D2 10 F1 13 DA
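As a cross-check, these 24 bytes can be rebuilt from the pieces collected along the way. The Python sketch below is only an illustration under the same little-endian assumption; the constant names are mine, and the password is the placeholder aaaaaaaa used with the patched binary, not the real one.

```python
import struct

# First six bytes of the stage 1 code: jmp short 0x6 plus the skipped bytes.
SHELLCODE_START = bytes.fromhex("eb04afc2bfa3")
# The two firmware words from the stage 2 js file.
FIRMWARE = (0xD2AB1F05, 0xDA13F110)

# The jmp skips offsets 2..5; read them as one little-endian 32-bit word.
(stage1_word,) = struct.unpack("<I", SHELLCODE_START[2:6])

license_bytes = (
    b"gchq"
    + b"aaaaaaaa"  # placeholder password matching the patched hash
    + struct.pack("<3I", stage1_word, *FIRMWARE)
)

print(license_bytes.hex(" ").upper())
print("/%s/%x/%x/%x/key.txt" % ("hqDTK7b8K2rvw", stage1_word, *FIRMWARE))
```

Writing license_bytes to license.txt in binary mode would give the file shown above; the second line prints the path with the original hash substituted.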
With my patched version it produces the following output:
C:\sandbox>patched.exe localhost
keygen.exe
loading stage1 license key(s)...
loading stage2 license key(s)...
request:
GET /hqWLFYOHRc1H6/a3bfc2af/d2ab1f05/da13f110/key.txt HTTP/1.0
response:
HTTP/1.0 404 Not Found
So I just replaced the hash in the URL with the original one and that did the job.
And the final answer is Pr0t3ct!on#cyber_security@12*12.2011+
Conclusion
To sum it up, I think the point of dividing this challenge into stages was to test different skills:
- Stage 1: spot the data, try to understand it, analyze it, glue it together
- Stage 2: get the definition, sit down, implement the solution, get your answer
- Stage 3: brute force and guessing
I think there could be more than one correct answer… Especially when playing with the URL in the last stage.
Personally, Stage 1 was the most interesting and challenging for me. Overall it took me 3 days to find the final solution.



