Analysis of the sample blackenergy

Published on 2013-02-11 14:00:00.

Last week, I wanted to do some RE so I picked a sample called 11.exe whose MD5 hash is 92c52a4e6dda119e73240b17100abf1f (4116da5e629e1fad70ee415f9739af5b after unpacking). I’m not going to detail the unpacking process since it is a classic RunPE. In this article I’m going to explain how to extract the configuration and how to trigger the C&C to grab plugins.

IDA idb (v6.4) here

Function: getProcByHash

When you open the sample in IDA, you immediately notice that it has very few imports. Right at the start, the following pattern appears numerous times:


The first push is a “hash”; the code then calls a function that returns an address in eax and immediately calls eax.

So we can name this function GetProcByHash. First, we have to identify the “hash” function, so we look inside GetProcByHash (40162Eh).

At the beginning of a loop, we notice a call that looks interesting.


No doubt, it is the “hash” function.


We now face multiple solutions:

I chose the second option and created code to enumerate all exported names in ASM.

The code takes a DLL name as input, loads it, and computes the “hash” of each exported function. Code available here

$ wine buildhash.exe kernel32.dll | head
AcquireSRWLockExclusive 0x593cb506
AcquireSRWLockShared 0x3634926f
ActivateActCtx 0x5147f60f
AddAtomA 0x1e1865e5
AddAtomW 0x1e1865f3
AddConsoleAliasA 0x06dc97e5
AddConsoleAliasW 0x06dc97f3
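
For a quick check outside IDA, the same listing can be loaded into a reverse lookup table. This is only a minimal sketch: `parse_hash_list` is a hypothetical helper name, and the input is assumed to use the `name 0xhash` format shown above.

```python
def parse_hash_list(text):
    """Map each 32-bit hash value back to its exported function name."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, value = line.split()
        table[int(value, 16)] = name
    return table

# A few lines copied from the buildhash output above
sample = """\
AddAtomA 0x1e1865e5
AddAtomW 0x1e1865f3
AddConsoleAliasA 0x06dc97e5
"""

hashes = parse_hash_list(sample)
# Resolve a constant as it would appear in a "push <hash>" instruction
print(hashes.get(0x1e1865e5, "unknown"))
```

With such a table, any constant pushed before a call to GetProcByHash can be resolved by hand before the enumeration is imported into IDA.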

We computed hashes for a couple of Windows libraries (kernel32.dll, user32.dll, advapi32.dll, ntdll.dll, ws2_32.dll, wininet.dll). Now we have to import them into IDA as an enumeration.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import idaapi
import idc

# Create a hex-displayed enum named "import" to hold the API hashes
enum_id = idc.AddEnum(0, "import", idaapi.hexflag())
for line in open('hash_list', 'r'):
    line = line.strip()
    if not line:
        continue
    # Each line has the form "<name> <0xhash>"
    (name, value) = line.split(' ')
    value = int(value, 16)
    idc.AddConstEx(enum_id, "hash_%s" % (name), value, idaapi.BADADDR)

During the writing of this small IDA script I found some other elegant solutions

After running the script, we can press the “m” key on a hash constant to get the function name.


Now we can really start to work.

Config extraction

While scrolling through the data, something looks suspicious: a 264-byte block of data with a cross-reference at its beginning.


Following the reference, we arrive at the function 4058ACh (DecConfigStartThrAndWait). It passes the config as arg0 of the function 4029ECh (decode_config).


This loop has 2 interesting calls:


To get a better view, I extracted the config from the binary:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys

if __name__ == "__main__":
  fp = open(sys.argv[1])

./python 11.u.exe | hd
00000000  60 01 00 00 e7 00 00 00  10 00 00 00 00 00 00 00  |`...............|
00000010  5c f6 ee 79 2c df 05 e1  ba 2b 63 25 c4 1a 5f 10  |\..y,....+c%.._.|
00000020  2b 83 08 14 6f f3 42 18  d7 15 42 50 83 48 fc 47  |+...o.B...BP.H.G|
00000030  05 8b bc 78 97 12 45 f9  0d a4 3d 77 fa e9 7c e0  |...x..E...=w..|.|
00000040  96 fd a6 16 5b 4b a5 79  8e 72 53 8c 56 9c 13 36  |....[K.y.rS.V..6|
00000050  fb de 84 48 ca 06 01 46  ee bf 9f e0 b3 c4 8b 0f  |...H...F........|
00000060  ec c5 5d 0d 61 52 9d 87  ca 71 46 70 3a fe b1 a7  |..].aR...qFp:...|
00000070  26 5f ae 0c d4 01 5d e7  c6 8c c1 9d 96 3d 79 da  |&_....]......=y.|
00000080  ed 5f d0 ff ae 3b 97 1c  50 01 ca 98 fb eb b0 58  |._...;..P......X|
00000090  d3 17 b8 4d 90 e9 ef 8d  f0 9f 04 2c c9 31 b8 a1  |...M.......,.1..|
000000a0  29 9a ff bb 16 ee 97 22  9c 84 f1 58 c7 9d 8f 8d  |)......"...X....|
000000b0  8e 0b b2 89 c0 e8 58 e2  e7 85 18 4f bd a7 49 ed  |......X....O..I.|
000000c0  8f d8 0f 0d 8a 38 4e 56  3b 72 03 96 01 06 40 68  |.....8NV;r....@h|
000000d0  9d c0 a6 94 c1 10 ad c0  7d 01 e0 2e c0 71 c1 f6  |........}....q..|
000000e0  3f 76 7b ac 2e a8 5d bf  8d 97 46 aa 4f f5 15 a1  |?v{...]...F.O...|
000000f0  fb 25 ce fe c6 f7 30 93  85 e2 06 ee 53 c7 12 77  |.%....0.....S..w|
00000100  d9 b0 d4 46 ee a0 33 00                           |...F..3.|

By looking at the code, we can extract the structure used by the config data

------------ ------------------------------------------------- ----------------
**Offset**   **Value**                                         **Comment**
0x00         0x0160                                            Final len
0x04         0xe7                                              Compressed len
0x08         0x10                                              Key len
0x10         5c f6 ee 79 2c df 05 e1 ba 2b 63 25 c4 1a 5f 10   Key value
0x20         2b 83 08 14 ...                                   Config data
------------ ------------------------------------------------- ----------------
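
The layout above can be checked against the first 0x20 bytes of the hexdump. This is a minimal sketch, assuming the three length fields are little-endian 32-bit values (the sample is x86). `parse_config_header` is a hypothetical name, and the dword at 0x0c is ignored because its meaning is not known.

```python
import struct

# First 0x20 bytes of the config blob, copied from the hexdump above
header = bytes.fromhex(
    "60010000e70000001000000000000000"
    "5cf6ee792cdf05e1ba2b6325c41a5f10"
)

def parse_config_header(blob):
    """Split the header into (final_len, compressed_len, key)."""
    # Three little-endian dwords at offsets 0x00, 0x04 and 0x08
    final_len, comp_len, key_len = struct.unpack_from("<III", blob, 0)
    # The key starts at 0x10 and is key_len bytes long
    key = blob[0x10:0x10 + key_len]
    return final_len, comp_len, key

final_len, comp_len, key = parse_config_header(header)
print(hex(final_len), hex(comp_len), key.hex())
```

The encrypted data then starts right after the key, at offset 0x20.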

Now we have to identify the encryption algorithm.


Looking at the first call, we notice that it resembles the RC4 KSA, but with a tiny modification.


Original RC4 pseudocode from the wiki

for i from 0 to 255
    S[i] := i
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap values of S[i] and S[j]

Blackenergy RC4 init code

def rc4_init(key):
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256
    return box

The PRGA is the same as the RC4 one from the wiki.
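
To make the difference concrete, here is a Python 3 sketch that puts the textbook key schedule next to the modified one and runs the standard PRGA on the first eight encrypted config bytes from the hexdump above. The function names are my own, and the code works on `bytes` rather than Python 2 strings.

```python
def standard_ksa(key):
    """Textbook RC4 key schedule: key-driven swaps over the identity table."""
    box = list(range(256))
    j = 0
    for i in range(256):
        j = (j + box[i] + key[i % len(key)]) % 256
        box[i], box[j] = box[j], box[i]
    return box

def blackenergy_ksa(key):
    """Modified init described above: XOR each entry with a key byte, no swap."""
    return [i ^ key[i % len(key)] for i in range(256)]

def prga_xor(box, data):
    """Standard RC4 PRGA XORed over data (works on a copy of box)."""
    box = list(box)
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(byte ^ box[(box[x] + box[y]) % 256])
    return bytes(out)

key = bytes.fromhex("5cf6ee792cdf05e1ba2b6325c41a5f10")   # key at offset 0x10
encrypted = bytes.fromhex("2b8308146ff34218")             # data at offset 0x20
plain = prga_xor(blackenergy_ksa(key), encrypted)
print(plain)   # b'<\x00?xml v'
```

The decrypted bytes match the start of the decoded dump (`3c 00 3f 78 6d 6c 20 76`), which is a good sign that the XOR-only init is the right reading of the code.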

With this information we can write a script to decode the configuration:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import string

def arc4(key, data):
    x = 0
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256

    x = y = 0
    out = []
    for char in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(chr(ord(char) ^ box[(box[x] + box[y]) % 256]))

    return ''.join(out)

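if __name__ == "__main__":
    fp = open(sys.argv[1], 'rb')
    blob = fp.read()
    # Offsets from the structure above: 16-byte key at 0x10, data from 0x20
    key = blob[0x10:0x20]
    data = blob[0x20:]
    data = arc4(key, data)
    sys.stdout.write(data)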

./python config | hd
00000000  3c 00 3f 78 6d 6c 20 76  65 72 00 73 69 6f 6e 3d  |<.?xml ver.sion=|
00000010  22 31 2e 74 30 fd f7 cb  63 e3 64 ff e6 67 1e 6f  |"1.t0...c.d..g.o|
00000020  77 0c 3e fa 83 73 2d 31  32 35 9f d0 3f 3e 0d 0a  |w.>..s-125..?>..|
00000030  19 3c 62 6b 56 6e cd 6c  9c 17 20 7e b4 73 18 11  |.<bkVn.l.. ~.s..|
00000040  3a 0d 47 0f 43 0e 61 21  74 79 70 ff f3 68 b7 7e  |:.G.C.a!typ..h.~|
00000050  de 95 2f 0b 47 19 1c 61  64 50 72 c7 3a 2f 10 38  |../.G..adPr.:/.8|
00000060  34 2e 32 e3 cc 31 30 dd  0e 1e 36 7f 83 77 68 69  |4.2..10...6..whi|
00000070  74 65 b1 61 72 d7 b9 63  6c e0 73 2e 7f 70 d0 79  |te.ar..cl.s..p.y|
00000080  2e 43 3c 68 1a 64 68 0d  34 26 29 11 19 84 f8 63  |.C<h.dh.4&)....c|
00000090  6d 53 64 0a 29 2f 0b 48  69 14 5f b7 3f 79 01 31  |mSd.)/.Hi._.?y.1|
000000a0  37 36 33 35 34 ba 04 f8  c7 08 30 39 cf 7f ba f6  |76354.....09....|
000000b0  df e8 f1 18 32 38 df 97  d7 33 ef 1f 76 02 70 8e  |....28...3..v.p.|
000000c0  2e 8c e2 83 86 df f3 66  72 fb 71 fb 33 42 d6 0d  |.......fr.q.3B..|
000000d0  91 1c 01 62 75 69 6c 64  5f d3 9f e6 30 6f 8e 0d  |...build_...0o..|
000000e0  8b 97 53 1c 26 98 00 13                           |..S.&...|


As you can see, the data seems to be compressed, which explains the final length field in the structure. After the RC4 decryption, the program calls the function located at 402AE9h (inflate). To decompress the config, I decided to rip the code since I had no clue what the algorithm was.


To rip the code, I read the raw bytes of the function straight from the binary and ran them like a classic shellcode.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* gcc -m32 -fno-stack-protector -z execstack -o rip rip.c */

typedef void (*finflate)(char *src, char *dst);

int main(int argc, char *argv[]){
        FILE *fp;
        char sc_inflate[0x14F];
        char inf[0x1000] = {0};
        char def[0x1000];
        int len;

        if(argc != 3){
                fprintf(stderr, "%s <11.u.exe> <config>\n", argv[0]);
                return EXIT_FAILURE;
        }

        /* Read the inflate routine (0x14F bytes at file offset 0x1ee9) */
        fp = fopen(argv[1], "rb");
        fseek(fp, 0x1ee9, SEEK_SET);
        len = fread(sc_inflate, 1, sizeof(sc_inflate), fp);
        fclose(fp);
        finflate inflate = (finflate)sc_inflate;

        /* Read the RC4-decrypted, still compressed config */
        fp = fopen(argv[2], "rb");
        len = fread(def, 1, sizeof(def), fp);
        fclose(fp);

        /* Run the ripped code from the stack (hence -z execstack) */
        inflate(def, inf);

        printf("%s", inf);
        /*fwrite(inf, 1, sizeof(inf), stdout);*/

        return EXIT_SUCCESS;
}

A second method consists of using the aPLib library.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#include "aplib.h"

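int main(int argc, char *argv[]){
        unsigned int size, size2;
        char *src, *dst;
        FILE *fp;

        if(argc != 3){
                fprintf(stderr, "%s <config.dec> <output>\n", argv[0]);
                return EXIT_FAILURE;
        }

        /* Load the whole compressed file into memory */
        fp = fopen(argv[1], "rb");
        fseek(fp, 0, SEEK_END);
        size = ftell(fp);
        fseek(fp, 0, SEEK_SET);
        src = malloc(size);
        size2 = fread(src, 1, size, fp);
        fclose(fp);

        /* The output buffer is sized at 4x the compressed size */
        dst = malloc(size2*4);
        size = aP_depack_asm(src, dst);

        fp = fopen(argv[2], "wb");
        fwrite(dst, 1, size, fp);
        fclose(fp);

        return EXIT_SUCCESS;
}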

This outputs:

./aplib/decompress config.dec config.dec.out
cat config.dec.out 
<?xml version="1.0" encoding="windows-1251"?>
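
For readers without the aPLib binaries, the decompression can also be sketched in pure Python 3. The code below is my own port of the publicly documented aPLib depacking scheme, not the author's tool, and `aplib_depack` is a hypothetical name. It is exercised on the first 48 bytes of the decrypted dump above. Since that is only a prefix of the stream, an end marker (`c0 00`) is appended by hand, which is an assumption made purely for this demo.

```python
def aplib_depack(src):
    """Decompress a raw aPLib stream (no header) and return the output bytes."""
    pos = 1
    tag = 0
    bits_left = 0
    out = bytearray(src[:1])       # the first byte is always a literal
    last_offset = 0
    lwm = 0                        # set after a match, cleared after a literal

    def next_byte():
        nonlocal pos
        value = src[pos]
        pos += 1
        return value

    def get_bit():
        nonlocal tag, bits_left
        if bits_left == 0:         # refill the tag byte, consumed MSB first
            tag = next_byte()
            bits_left = 8
        bits_left -= 1
        bit = (tag >> 7) & 1
        tag = (tag << 1) & 0xFF
        return bit

    def get_gamma():
        result = 1
        while True:
            result = (result << 1) | get_bit()
            if not get_bit():
                return result

    def copy(offset, length):
        for _ in range(length):    # byte by byte, so overlapping copies work
            out.append(out[-offset])

    while True:
        if not get_bit():                      # 0: literal byte
            out.append(next_byte())
            lwm = 0
        elif not get_bit():                    # 10: long match
            offset = get_gamma()
            if lwm == 0 and offset == 2:       # reuse the previous offset
                copy(last_offset, get_gamma())
            else:
                offset -= 3 if lwm == 0 else 2
                offset = (offset << 8) | next_byte()
                length = get_gamma()
                if offset >= 32000:
                    length += 1
                if offset >= 1280:
                    length += 1
                if offset < 128:
                    length += 2
                copy(offset, length)
                last_offset = offset
            lwm = 1
        elif not get_bit():                    # 110: short match or end marker
            value = next_byte()
            offset, length = value >> 1, 2 + (value & 1)
            if offset == 0:
                return bytes(out)
            copy(offset, length)
            last_offset = offset
            lwm = 1
        else:                                  # 111: one byte, 4-bit offset
            offset = 0
            for _ in range(4):
                offset = (offset << 1) | get_bit()
            out.append(out[-offset] if offset else 0)
            lwm = 0

# First 48 bytes of the decrypted config, plus a hand-made end marker (c0 00)
prefix = bytes.fromhex(
    "3c003f786d6c207665720073696f6e3d"
    "22312e7430fdf7cb63e364ffe6671e6f"
    "770c3efa83732d3132359fd03f3e0d0a"
)
print(aplib_depack(prefix + b"\xc0\x00"))
```

Run on this prefix, the sketch gives back the XML declaration shown above followed by a CRLF. Only the literal and short-match paths are exercised here, so the long-match branch should be treated as untested.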


Unfortunately, I did not save the pcap file, and by the time I decided to write this article the C&C was already down…

It communicates using the HTTP protocol by sending POST requests.

The POST contains one parameter called “hnp”.

“hnp” contains the real parameters, encrypted with the key found in the config and then hex-encoded.
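
The construction of the “hnp” value can be reproduced offline. The Python 3 sketch below reuses the XOR-only RC4 variant described earlier. The key and query string are the ones from the author's script further down, while `be_rc4` and `build_hnp` are hypothetical names, and no request is actually sent.

```python
import binascii

def be_rc4(key, data):
    """RC4 with the XOR-only key schedule described earlier in this article."""
    box = [i ^ key[i % len(key)] for i in range(256)]
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(byte ^ box[(box[x] + box[y]) % 256])
    return bytes(out)

def build_hnp(server_key, params):
    """Encrypt the query string and hex-encode it for the 'hnp' POST field."""
    cipher = be_rc4(server_key.encode("ascii"), params.encode("ascii"))
    return binascii.hexlify(cipher).decode("ascii")

server_key = "17635454375409656001655428185364111"   # key from the decoded config
client_id = "B595735B55773320500E7F8202B118F5"
params = "id=%s&ln=en&cn=US&nt=2600&bid=01" % client_id

hnp = build_hnp(server_key, params)
# Hex encoding doubles the length of the encrypted query string
print(len(params), len(hnp))
```

Because the cipher is a plain XOR stream, running `be_rc4` with the same key over the decoded hex gives the original query string back, which is handy for checking captured traffic.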

Grab plugin config

The param to grab the plugin config is


It returns the configuration encrypted in RC4 using the id parameter as the key. I do not have a backup of the plugin configuration but it contains the name of the plugin with its version, parameters and key.

Grab plugin


It returns a DLL encrypted with RC4 using the key specified in the configuration received earlier.


Little script to trigger the C&C and grab the configuration or the plugin:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import string
import itertools
import requests

def arc4(key, data):
    x = 0
    box = range(256)
    for i in range(256):
        box[i] = (box[i] ^ ord(key[i % len(key)])) % 256

    x = y = 0
    out = []
    for char in data:
        x = (x + 1) % 256
        y = (y + box[x]) % 256
        box[x], box[y] = box[y], box[x]
        out.append(chr(ord(char) ^ box[(box[x] + box[y]) % 256]))

    return ''.join(out)

if __name__ == "__main__":
    headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en)'}
    server_key = "17635454375409656001655428185364111"
    client_key = "B595735B55773320500E7F8202B118F5"
    plugin_key = "e69ac82a011a69ce912500ea78e95dd4"

    get_module_s = "getp=s&id=%s&ln=en&cn=US&nt=2600&bid=01" % (client_key)
    get_config = "id=%s&ln=en&cn=US&nt=2600&bid=01" % (client_key)
    payload = {'hnp': arc4(server_key, get_config).encode('hex')}
    r = requests.post("", data=payload, headers=headers)
    data = arc4(client_key, r.content)


“s” plugin

When the C&C was up, only one plugin was distributed and it was called “s”.

./md5sum s.dll 
e01810f4cedbaf045cd625b82ec05c62  s.dll

I’m not going to reverse the plugin, but it’s a classic plugin used to send spam. It checks whether the infected machine can reach some well-known SMTP servers. If it manages to connect to one of those servers, it fetches the email and template from the C&C server.