By Ismael Valenzuela.

In this post we will walk through some of the most effective techniques used to filter suspicious connections and investigate network data for traces of malware using Bro, some quick and dirty scripting and other freely available tools like CIF.

This post doesn’t pretend to be a comprehensive introduction to Bro (check the references section at the end of the post for that) but rather a quick reference with tips and hints on how to spot malware traffic using Bro logs and other open source tools and resources.

All the pcap files used throughout this post can be obtained from GitHub. Some of them have been obtained from the large dataset of pcaps available at contagiodump.

Finally, if you are new to Bro I suggest that you start by downloading the latest version of Security Onion, a must-have Linux distribution for packet ninjas. Since version 12.04.4 Security Onion comes with the new Bro 2.2 installed by default, so all you need to do is open the terminal, grab the samples and maybe some coffee… (There is never enough coffee!).

Traffic Analysis with Bro

We will start replaying our first sample through Bro with:

 $ bro -r sample1.pcap local

This command tells Bro to read and process sample1.pcap, pretty much like tcpdump or any other pcap tool does. By adding the keyword “local” at the end of the command, we ask Bro to load the ‘local’ script file, which in Security Onion is located in /opt/bro/share/bro/site/local.bro.

When the command is completed, Bro will generate a number of logs in the current working directory. These logs are highly structured, plain text ASCII and therefore Unix friendly, meaning that you can use your command line kung-fu with awk, grep, sort, uniq, head, tail and all the other usual suspects.

To see the summary of connections for sample1.pcap we can have a quick look at conn.log:

 $ cat conn.log

The figure above shows an excerpt of the output of this command. Notice how the output of Bro logs is structured in columns, each of them representing different fields. These fields are shown in the 7th line of the output header, starting with "ts" (timestamp in seconds since epoch) and "uid" (a unique identifier of the connection that is used to correlate information across Bro logs). Refer to the Bro documentation to learn more about the rest of the fields.

 #separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2014-03-07-13-51-01
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes tunnel_parents
#types time string addr port addr port enum string interval count count string bool count string count count count count table[string]
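If you prefer scripting over one-liners, the same header can drive a small parser. The following is a minimal Python sketch, not part of Bro itself: it assumes the default tab-separated ASCII log format, and the function name read_bro_log and the sample record are made up for illustration.

```python
import io

def read_bro_log(lines):
    """Yield one dict per record from the lines of a Bro ASCII log."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Header values are tab-separated; the first token is "#fields" itself.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            # Data line: pair each value with the field name from the header.
            yield dict(zip(fields, line.split("\t")))

# Made-up two-line log, only to show the call; real logs carry more fields.
sample = io.StringIO(
    "#separator \\x09\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_p\tproto\tconn_state\n"
    "1394196661.0\tCabc123\t172.16.88.10\t80\ttcp\tREJ\n"
)
for record in read_bro_log(sample):
    print(record["id.orig_h"], record["id.resp_p"], record["conn_state"])
```

With a real log you would pass an open file instead, for example read_bro_log(open("conn.log")).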

We can observe a number of connections to port 80 (tcp) and port 53 (udp). conn.log also reports the result of these connections under the field conn_state. Let’s have a closer look at that using bro-cut, an awk-based field extractor for Bro logs.

 $ cat conn.log | bro-cut id.orig_h id.orig_p id.resp_h id.resp_p proto conn_state
…
172.16.88.10 49508 172.16.88.135 80 tcp REJ
172.16.88.10 49510 172.16.88.135 80 tcp REJ
172.16.88.10 57852 172.16.88.135 53 udp SF
172.16.88.10 49509 172.16.88.135 80 tcp REJ
172.16.88.10 57399 172.16.88.135 53 udp SF
172.16.88.10 49510 172.16.88.135 80 tcp REJ
172.16.88.10 57456 172.16.88.135 53 udp SF
172.16.88.10 49511 172.16.88.135 80 tcp S0
172.16.88.10 62602 172.16.88.135 53 udp SF
172.16.88.10 54957 172.16.88.135 53 udp SF
172.16.88.10 49511 172.16.88.135 80 tcp SH
172.16.88.10 49512 172.16.88.135 80 tcp S0
172.16.88.10 64623 172.16.88.135 53 udp SF
172.16.88.10 53702 172.16.88.135 53 udp SF
172.16.88.10 49512 172.16.88.135 80 tcp SH
172.16.88.10 49513 172.16.88.135 80 tcp S0
172.16.88.10 52164 172.16.88.135 53 udp SF
172.16.88.10 49513 172.16.88.135 80 tcp SH
172.16.88.10 49516 172.16.88.135 80 tcp S0
172.16.88.10 54832 172.16.88.135 53 udp SF
172.16.88.10 49516 172.16.88.135 80 tcp SH
172.16.88.10 49517 172.16.88.135 80 tcp S0
172.16.88.10 64102 172.16.88.135 53 udp SF
172.16.88.10 51110 172.16.88.135 53 udp SF
172.16.88.10 49517 172.16.88.135 80 tcp SH
172.16.88.10 49518 172.16.88.135 80 tcp S0
172.16.88.10 55957 172.16.88.135 53 udp SF
172.16.88.10 49519 172.16.88.135 80 tcp S0
172.16.88.10 58988 172.16.88.135 53 udp SF
172.16.88.10 49518 172.16.88.135 80 tcp SH

In this case, we can observe that some of the connections attempted on port 80 were rejected (REJ), while others never had a reply (S0) or left the connection half-open (SH, which means the originator sent a SYN followed by a FIN and a SYN-ACK from the responder was never seen). The reason for this behavior is that sample1.pcap was obtained from one of my sandboxes, where 172.16.88.135 is a virtual machine running REMnux with fakedns and netcat listening on port 80 instead of a full web server.
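When the output runs to hundreds of lines, a quick tally per port and state is easier to read than the raw list. Here is a small Python sketch that does this for lines in the six-column layout printed above; the function name count_states is an illustrative choice, and the input rows are copied from the output shown earlier.

```python
from collections import Counter

def count_states(rows):
    """Count (resp_port, proto, conn_state) triples from bro-cut style rows."""
    counts = Counter()
    for row in rows:
        parts = row.split()
        if len(parts) != 6:
            continue  # skip blank or truncated lines
        _orig_h, _orig_p, _resp_h, resp_p, proto, state = parts
        counts[(resp_p, proto, state)] += 1
    return counts

rows = [
    "172.16.88.10 49508 172.16.88.135 80 tcp REJ",
    "172.16.88.10 57852 172.16.88.135 53 udp SF",
    "172.16.88.10 49511 172.16.88.135 80 tcp S0",
    "172.16.88.10 49511 172.16.88.135 80 tcp SH",
]
for key, n in sorted(count_states(rows).items()):
    print(*key, n)
```

A large share of REJ, S0 or SH entries towards a single responder, as in this sandbox, usually means the client keeps trying to reach a service that is not really there.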

Since we know that there is some HTTP traffic going on here, let’s have a look at another log generated by Bro: http.log

 $ cat http.log | bro-cut id.orig_h id.orig_p id.resp_h id.resp_p host uri referrer

172.16.88.10 49493 172.16.88.135 80 f52pwerp32iweqa57k37lwp22erl48g63m39n60ou.net / -
172.16.88.10 49495 172.16.88.135 80 h54jtbqmuj56hwb48e41p42g33h34c29grbqfxm29.ru / -
172.16.88.10 49511 172.16.88.135 80 iqcqmrn30iuoubuo11crfydvkylrbtmtev.info / -
172.16.88.10 49512 172.16.88.135 80 ezdsaqbulsgzh44m59p42eqmrkxa57n40brcq.com / -
172.16.88.10 49513 172.16.88.135 80 o41lwmqnqarmxiyi35iyftpzaye21osjyjq.ru / -
172.16.88.10 49516 172.16.88.135 80 n30arh24frisbslqmqoxgvpvk47o11pritev.biz / -
172.16.88.10 49517 172.16.88.135 80 jsa57n20hyisjxcre11fwl58gta37i65ovf32o51.info / -
172.16.88.10 49518 172.16.88.135 80 j36lxf52hsj56itc49lqayoveymwfzosi15jw.org / -
172.16.88.10 49519 172.16.88.135 80 g53lvo61ayoucrm49kzgvm69irhwl58erjwfu.net / -
...

Anything weird here? Definitely! The host field of the http.log shows entries that don’t seem to correspond with normal browsing.

A closer look at the dns.log produced by Bro will confirm this:

 $ cat dns.log | bro-cut query | sort -u

a37fwf32k17gsgylqb58oylzgvlsi35b58m19bt.com
a47d20ayd10nvkshqn50lrltgqcxb68n20gup62.com
a47dxn60c59pziulsozaxm59dqj26dynvfsnw.com
a67gwktaykulxczeueqf52mvcue61e11jrc59.com
axgql48mql28h34k67fvnylwo51csetj16gzcx.ru
ayp52m49msmwmthxoslwpxg43evg63esmreq.info
azg63j36dyhro61p32brgyo21k37fqh14d10k37fx.com
cvlslworouardudtcxato51hscupunua57.org
cyh44jud50g33iuarlzgqbup22fqisixf62kr.org
d10h34othyp62b18lyfwnzazj26p42fud50gzc49.biz
d20iwe51ftitg53lvl18a27hvlqjyjtd20gue61.com
dqhzhtbto21h14lvp12iqhtlrnxasarcte61.biz
drp42i25ati55m69pvgza57nyh34hwk57i55m19n60.ru
iqcqmrn30iuoubuo11crfydvkylrbtmtev.info
iqo11c69mud20krk57j16fqnrfwgva67oraql48.com
isjqn30a27hwgqbxnxksi65hrnsgyc49mylt.biz
iupqhxfwpylxm29jsexovj16cqfybwb68aw.org
iwpslvesj26i65oynxhtoyc39o41asdvnqc59.com
j36lxf52hsj56itc49lqayoveymwfzosi15jw.org
jshvprc29ntm69p52j36a17m39ozk67g53crfqow.net
jvbtore21fzm39fse51p32auizl28gxaul68px.com
k17g63l58jucvd30brhyovhsptd10lxd60gqfv.biz
k27ori65cve61kvc49hxptdrb48myo61fueves.org
k47isgzkxp62o51etmwazewmvpvgwbvmvfz.com
kqd60lvlsg63bsg33e11i55kvo41nrj36hzbthr.info
kvm49mynrd60l48lynre21hqfun20a47hyn20kq.org
kyoqpxg53nuf42g43oqo21l48a17d40o31k67j16h44.org
l18k17mzpum69jvlyp62c29hzeyi25kta47a37lv.ru
n50owhwguj66evkug33ewntn10n40puhtlxay.org
nrd30j46cxnwmyc69bscrcyiuhvf22otg43mq.com
nub58p52b38ismtg63mwlwm29evd20g13f52otb68.info
nxhyosg43a47exhum19g23f52fro21byayk57fs.info
o21mwm29gzouhvpub68g43dzntgzn30aultd30.net
o31j16n30eyiql58btmxe21euowb38pxf22b68ou.net
psgsgumukxb18b58dxd40e31f22g53a37bzmxcz.com
pxoxgzkqmqp12a47azjzpze11hteri35iti45.info
pyn30h64krm69bwf12azp52fulskvh24m19nrjy.org
(output truncated)

Looking at the length of the domains requested we can observe a pattern. First of all we will cut out the TLDs (com, info, net…) and then calculate the length of each of the remaining strings.

 $ cat dns.log | bro-cut query | sort -u | cut -d . -f1 > domains-withoutTLD
 $ for i in `cat domains-withoutTLD`; do echo "${#i}"; done | sort -u

34
35
36
37
38
39
40
41
42
43

So all these strings are within a close range of 34 to 43 characters long. Coincidence? Not really. A variant of the ZeuS botnet, the so-called ZeuS Gameover, is known for implementing P2P and Domain Generation Algorithm (DGA) communications to determine the current Command and Control (C&C) domain. When a bot can’t communicate with its botnet via P2P, DGA is used. The domain names generated by ZeuS Gameover consist of a string with a length of 32 to 48 chars and one of the following TLDs: ru, com, biz, info, net or org. The list contains over 1000 domains and changes every 7 days, based on the current date.

A regular expression like this can be used to search for ZeuS domains:

 [a-z0-9]{32,48}\.(ru|com|biz|info|org|net)
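To try the expression against a list of queries, a few lines of Python are enough. This is only a sketch: the helper name looks_like_gameover is made up, the pattern is applied to the whole name rather than searched inside it, and a match is a hint worth investigating, not proof of infection. The first two test names come from the logs above.

```python
import re

# Same pattern as above; fullmatch() requires the whole name to fit it.
ZEUS_DGA = re.compile(r"[a-z0-9]{32,48}\.(ru|com|biz|info|org|net)")

def looks_like_gameover(domain):
    """Return True if the domain fits the ZeuS Gameover DGA pattern."""
    name = domain.strip().lower().rstrip(".")
    return ZEUS_DGA.fullmatch(name) is not None

queries = [
    "iqcqmrn30iuoubuo11crfydvkylrbtmtev.info",
    "j36lxf52hsj56itc49lqayoveymwfzosi15jw.org",
    "winrar-soft.ru",
]
for q in queries:
    print(q, looks_like_gameover(q))
```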

ZeuS Gameover has been reported as one of the most active banking Trojans in 2013, along with Citadel, another well-known piece of malware that has targeted a large number of financial organizations with a focus on Europe and the Middle East.

Kleissner.org maintains a list of 1000 valid domains for ZeuS Gameover and updates it every week. A simple bash script can compare a list of domains obtained from dns.log to the list published by Kleissner.org:

 $ cat dns.log | bro-cut query | sort -u > domains

 $ for i in `cat domains`; do grep -F "$i" ZeusGameover_Domains; done
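The same comparison can be written as a set intersection, which avoids one grep per domain. The sketch below is an illustration under a few assumptions: both inputs hold one domain per line, matching is exact rather than by substring as grep does, and the function names are made up. The blocklist in the demo is a stand-in, not real Kleissner.org data.

```python
def load_domains(lines):
    """Normalise an iterable of domain names into a set."""
    return {line.strip().lower().rstrip(".") for line in lines if line.strip()}

def known_bad(observed, blocklist):
    """Return the observed domains that appear verbatim in the blocklist."""
    return sorted(load_domains(observed) & load_domains(blocklist))

# Stand-in data: the second list only mimics a published domain list.
observed = ["iqcqmrn30iuoubuo11crfydvkylrbtmtev.info\n", "example.org\n"]
blocklist = ["IQCQMRN30IUOUBUO11CRFYDVKYLRBTMTEV.info\n"]
print(known_bad(observed, blocklist))
```

With the files created above, known_bad(open("domains"), open("ZeusGameover_Domains")) would return the overlap.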

SSL Traffic and Notice.log

Malware authors are making increased use of SSL traffic to mask communications with C&C servers, data exfiltration and other malicious actions. Since decrypting SSL communications is not feasible in most scenarios, malware analysts must employ other techniques to spot badness in encrypted sessions. TLS or SSL handshake failures and suspicious, invalid or weird certificates can be indicators of such badness in your network traffic, and the good news is that Bro, by default, already does some of that analysis for you, suggesting potentially interesting network activity for you to investigate.

To demonstrate how Bro can help with finding those indicators, we’ll look at sample2.pcap:

 $ bro -r sample2.pcap local

Notice that a notice.log file has been created in the working directory, along with http.log, ssl.log and others.

Let’s have a look at the contents of notice.log:

 $ cat notice.log | bro-cut msg sub

SSL certificate validation failed with (unable to get local issuer certificate) CN=www.tl6ou6ap7fjroh2o.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.vklxa6kz.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.5rthkzelyecfpir56.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.dctpbbpif6zy54mspih.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.getvdkk6ibned7k3krkc.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.hstk2emyai4yqa5.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.icab4ctxldy.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.bnbhckfytu.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.e6nbbzucq2zrhzqzf.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.cvapjjtbfd6yohbarw5q.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.zhbohcqeanv5hw.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.v6onqj4tmlmcchw23bl.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.gaqq6ld5gdgib.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.hlixz2cz43jepqwl.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.jn4k5f5wi65edy7emll.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.4geh5kzuywu3u.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.rshopmsscpfbw6p.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.c2rwawybhf.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.3gbl5nlxxs37ycdbhvcr.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.qhpomorewmsgxkg2d.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.wtytpviziqgpxsz.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.f5zhq25qq.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.3ktww4bg.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.c2nhdwaukm.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.iqm3bvunu.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.pts5agysxnvyyvbysfv.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.ygn472gapjnkkbplith.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.jaaok2kcxn.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.ktq2go444i.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.ferqncujta3wvl.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.2u5j3bw2r.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.uopxo7ik3i2nti.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.2ugfspjvd3tjaa.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.vjonqvyku.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.6canpulqbqdbqkxc6is.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.42ixw6g5fu44w7sth.net
SSL certificate validation failed with (unable to get local issuer certificate) CN=www.kqwm2iwsvh4xd2q.net
(output truncated)

Hmmm… that looks really suspicious again!

Let’s have a look at the contents of the ssl.log now:

 $ cat ssl.log | bro-cut server_name subject issuer_subject

www.seu4oxkf6.com CN=www.tl6ou6ap7fjroh2o.net CN=www.tbajutyf.com
www.fjpv.com CN=www.vklxa6kz.net CN=www.ohqnkijzzo5vt.com
www.pdpqsu.com CN=www.5rthkzelyecfpir56.net CN=www.qbboo7mcwzv7.com
www.vkojgy6imcvg.com CN=www.dctpbbpif6zy54mspih.net CN=www.m6hoayo5cga.com
www.dbyryztrr7sui3rskjvikes.com CN=www.getvdkk6ibned7k3krkc.net CN=www.7pz4gaio6uc25dyfor.com
www.xqwf7xs6nycmciil3t5e4fy5v.com CN=www.hstk2emyai4yqa5.net CN=www.wc62pgaaorhccubc.com
www.rix56ao4hxldum4zbyim.com CN=www.icab4ctxldy.net CN=www.wmylm3gln.com
www.uabjbwhkanlomodm5xst.com CN=www.bnbhckfytu.net CN=www.w4rlc25peis46haafa.com
www.dl2eypxu3.com CN=www.e6nbbzucq2zrhzqzf.net CN=www.cbj5ajz4qgeieshx32n.com
www.ebd7caljnsax.com CN=www.cvapjjtbfd6yohbarw5q.net CN=www.brbqn4rqhscp4rdq.com
www.qnqxclmrk2cqskkb732czjma.com CN=www.zhbohcqeanv5hw.net CN=www.w3rfg432.com
www.bxstw.com CN=www.v6onqj4tmlmcchw23bl.net CN=www.yc2xz27yoe76.com
www.b6lwb6v.com CN=www.gaqq6ld5gdgib.net CN=www.nu6u7osxzhmgx64.com
www.xf3225vc7drvcgborjll3.com CN=www.ryfg74xnxjg42ln3.net CN=www.y6bn3trq5cesxk.com
www.7dezfrpxuvmtr.com CN=www.svhbg7k2ed7ijcloj2.net CN=www.tfijljrmlqi.net
www.pcnia4i6e6w.com CN=www.yastvwre5fvpq3av.net CN=www.c6dmymzw.com
www.zvnbxtgu5dwe6lwc.com CN=www.u7c2brldvuk3xil.net CN=www.owgtwdiazfmzmwu6a5.com
www.ofbw37.com CN=www.qyccfgkjb.net CN=www.gs52pdnqyd.com
www.zr7kfc25mofcq.com CN=www.oi6z76t4.net CN=www.oe7gv5kxhix2i7eil.com
www.cmeh4agzyphi.com CN=www.jnqlvjcoou26znx.net CN=www.p4tgeg6dhp.com
www.k2u3bnbhxhpl.com CN=www.llhtnj3yyk.net CN=www.qotouwlbhjt.com
www.bneghg3axzl75sn7k2pdzor.com CN=www.shucgk26k4x5inet.net CN=www.j4n2j3sz57cf.com
www.ytedf3vqd4hxjo7rmhe6.com CN=www.noyxmydlc3ncgwv4t7hc.net CN=www.xem2wczmpqtypvzzpsex.com
www.by4seu7gjht7.com CN=www.wgrv4vpyx.net CN=www.eyvoebmi4ls6o6.com
www.cx7dg5bcn4cy.com CN=www.lipko2t5yqirjrqn2e.net CN=www.l4kvblp6bd.com
www.zn26rblhi.com CN=www.5nmv7zbdqdvgbfem6l.net CN=www.l3zkpiwawmpwjbzf.com
www.ecajni2stg3733w4jgi75.com CN=www.k3dbsxb423am5bwcb.net CN=www.uuwdimryu2gi42.com
www.x3os5xrkcr7a2rpmxre2.com CN=www.km6ptswm7mo.net CN=www.giovpc7o3.com
www.2c27bhbej.com CN=www.pymflkqpqdgghnfj.net CN=www.jocupasu2o6b2af2tn.com
www.4x4fp.com CN=www.icab4ctxldy.net CN=www.wmylm3gln.com
www.busdvimuibiundyob3e74js.com CN=www.xwwc4mvab66dnn.net CN=www.7hhuhzlztld46.com
www.zk2sv4vbwtanvh6x.com CN=www.bjxrmwnhp44enzypv6dc.net CN=www.b2ond2dxj.net
www.nijvbs5nuyn7zkemgi.com CN=www.wgwr7qn7v3j.net CN=www.u57w6yc5rvv.com
www.hamsnp.com CN=www.ge26nt2rx.net CN=www.aewmz33hq6rn7x7nud3.com
www.gsen3cievf3px7anzc6j.com CN=www.3zz5we62e.net CN=www.w7sb5mdv7w.com
www.3lwerxmlqmq2jsjioqgx5kkyc.com CN=www.ohfe52bk6gyfzojwgts.net CN=www.jhzi7jmhledqxg.com
www.2ipe23pugsiii.com CN=www.6hfs2womid.net CN=www.aq3w5zrobmejm.com
www.f3vzvxsedn.com CN=www.eelcaqcncssfzliilic.net CN=www.xshjb4uihtmpxh.com
www.hh62esff4qj5.com CN=www.mqhz74wxch4gj.net CN=www.wcmcdpazt7iw7g.com
www.juipuxm76hu6df6.com CN=www.5nmv7zbdqdvgbfem6l.net CN=www.l3zkpiwawmpwjbzf.com
www.6ll3wnw5dmg.com CN=www.suy5hv542.net CN=www.5mypgv7tgzypyaz63w.com
www.h5hgbrs75gl3c5uh5xnld3i.com CN=www.4x4j6xhtk5qh.net CN=www.rmybfv4mrpzlcicfg.net

Again, parsing these logs with bro-cut and other command line tools to generate a list of suspicious domains is straightforward. That list can be compared to a list of well-known malicious domains, or used with various domain reputation services. We will talk more about how to leverage threat intelligence feeds with Bro later in this post.
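As an illustration of that parsing step, here is a Python sketch that pulls the CN value out of certificate subject strings like the ones above and returns a de-duplicated list. It assumes the subject is a comma-separated list of attributes and ignores escaped commas inside a CN; the name common_names is made up for this example.

```python
import re

# Match a CN attribute at the start of the subject or after a comma.
CN_RE = re.compile(r"(?:^|,)\s*CN=([^,]+)")

def common_names(subjects):
    """Return the sorted, de-duplicated CN values found in subject strings."""
    names = set()
    for subject in subjects:
        match = CN_RE.search(subject)
        if match:
            names.add(match.group(1).strip().lower())
    return sorted(names)

subjects = [
    "CN=www.tl6ou6ap7fjroh2o.net",
    "CN=www.vklxa6kz.net",
    "CN=www.vklxa6kz.net",
]
for name in common_names(subjects):
    print(name)
```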

Let’s carry on with our analysis. A closer look at the http.log reveals some potentially interesting User Agents under the user_agent field:

 $ cat http.log | bro-cut user_agent | sort -u

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)

cgminer 2.7.5

Can you see that cgminer user agent? It is a well known fact that malware can use unusual, weird or unique user agents as part of the headers of the HTTP requests. A good study on that was written by Robert Vandenbrink.

In this case the user agent indicates that we’re looking at a bot whose purpose is to deliver bitcoin mining traffic. For more information about this particular bot check Liam Randall’s solutions and scripts on his GitHub.
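On a busy network the list of user agents is far too long to eyeball, so a rough first filter helps. The Python sketch below counts the agents and keeps those that do not start with a common browser prefix. Both the prefix list and the function name unusual_agents are assumptions for illustration; malware can also fake a browser string, so treat this as a starting point only.

```python
from collections import Counter

# Assumption: prefixes that ordinary browsers use; tune this for your network.
BROWSER_PREFIXES = ("Mozilla/", "Opera/")

def unusual_agents(agents):
    """Return {user_agent: count} for agents that do not look like a browser."""
    cleaned = (a.strip() for a in agents)
    counts = Counter(a for a in cleaned if a and a != "-")
    return {ua: n for ua, n in counts.items() if not ua.startswith(BROWSER_PREFIXES)}

agents = [
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)",
    "cgminer 2.7.5",
    "cgminer 2.7.5",
    "-",
]
print(unusual_agents(agents))
```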

The new file analysis framework

The file analysis framework is a new feature introduced with Bro 2.2 that provides plenty of new functionality to network analysts. One of the most powerful features is the ability to extract files from network streams based on multiple criteria: geospatial (i.e. per country of origin), signature based, destination based, etc.

Files can be extracted from various protocols including FTP, HTTP, SMTP and IRC. Others like BitTorrent and SMB will be added in the near future.

Thanks to the powerful Bro language, the new file analysis framework can be combined with actions to do awesome stuff like look up a file in a malware hash registry, upload it to VirusTotal or to a Cuckoo sandbox, or even tweet the results of your analysis!

To demonstrate some of its capabilities we’ll analyze sample3.pcap. As usual we start replaying the capture with Bro:

 $ bro -r sample3.pcap local

You should have a new log: files.log. Let’s have a look at its contents:

 $ cat files.log | bro-cut fuid mime_type filename total_bytes md5

FC7cMq18xeqtT9IGD3 application/zip - 31044 0cbc25ade65bcd7a28dd8ac62ea20186

We have a single entry. We don’t have a filename but Bro has recorded the MIME type and even computed the MD5 hash for us!

Can we extract that file? Of course we can! Open your text editor of choice and save these lines as extract-all.bro:

 event file_new(f: fa_file)
        {
                Files::add_analyzer(f, Files::ANALYZER_EXTRACT);
        }

Congratulations! You’ve written your first Bro script. Next, run the capture against Bro again, this time replacing the ‘local’ script with the new one you just created. You might need to run this as root:

 $ bro -r sample3.pcap extract-all.bro

This command will create a new directory, extract_files, where all extracted files will be located:

 $ ls extract_files

extract-HTTP-FC7cMq18xeqtT9IGD3

Let’s confirm what kind of file we’re looking at:

 $ file extract-HTTP-FC7cMq18xeqtT9IGD3 

extract-HTTP-FC7cMq18xeqtT9IGD3: Zip archive data, at least v2.0 to extract

$ xxd extract-HTTP-FC7cMq18xeqtT9IGD3 | head -10

0000000: 504b 0304 1400 0808 0800 208f 1c41 0000  PK........ ..A..
0000010: 0000 0000 0000 0000 0000 0d00 0000 6234  ..............b4
0000020: 612f 6234 612e 636c 6173 73c5 7979 5c9b  a/b4a.class.yy\.
0000030: 5b76 d8b9 9240 427c 8010 1606 db18 63fb  [v...@B|......c.
0000040: 6110 606c 24b0 0783 0149 0801 daf7 0d09  a.`l$....I......
0000050: edfb 2eb4 22e4 7979 33c9 bc74 3259 babd  ....".yy3..t2Y..
0000060: d725 93ce 6bac a493 f4bd e729 76e3 cc8c  .%..k......)v...
0000070: d32d 69d2 25d3 769a a64d 9ba6 49da a4cd  .-i.%.v..M..I...
0000080: d2c9 d2e9 b44d 9c73 0578 c36f dee4 af9a  .....M.s.x.o....
0000090: 9fbe 7bbe 7bcf 3dfb 39f7 9ecf 3fff a73f  ..{.{.=.9...?..?

$ xxd extract-HTTP-FC7cMq18xeqtT9IGD3 | tail -10

00078b0: db66 0000 6234 612f 6234 642e 636c 6173  .f..b4a/b4d.clas
00078c0: 7350 4b01 0214 0014 0008 0808 0020 8f1c  sPK.......... ..
00078d0: 4167 fdc8 0309 0700 00a7 0f00 000d 0000  Ag..............
00078e0: 0000 0000 0000 0000 0000 0034 7000 0062  ...........4p..b
00078f0: 3461 2f62 3465 2e63 6c61 7373 504b 0102  4a/b4e.classPK..
0007900: 0a00 0a00 0008 0000 208f 1c41 0000 0000  ........ ..A....
0007910: 0000 0000 0000 0000 0400 0000 0000 0000  ................
0007920: 0000 0000 0000 7877 0000 6234 612f 504b  ......xw..b4a/PK
0007930: 0506 0000 0000 0700 0700 9401 0000 9a77  ...............w
0007940: 0000 0000                                ....

While the first bytes in the file header (also known as magic numbers) suggest a ZIP file, the content of the file indicates the presence of Java class files. We can easily confirm that by executing:

 $ jar xf extract-HTTP-FC7cMq18xeqtT9IGD3

Which extracts the Java classes to the b4a directory.
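If you would rather not unpack a suspicious archive on disk, the entry names can be listed from memory. The following Python sketch checks the ZIP magic number and returns the .class entries using the standard zipfile module. The function name list_class_files is illustrative, and the demo builds a tiny stand-in archive instead of using the real sample.

```python
import io
import zipfile

def list_class_files(data):
    """Return names of .class entries if data is a ZIP/JAR archive, else []."""
    if not data.startswith(b"PK\x03\x04"):
        return []  # no ZIP local file header at offset 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [name for name in archive.namelist() if name.endswith(".class")]

# Build a small stand-in archive in memory, only to exercise the function.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("b4a/b4a.class", b"\xca\xfe\xba\xbe")
    archive.writestr("README.txt", b"demo")
print(list_class_files(buf.getvalue()))
print(list_class_files(b"not a zip"))
```

For the extracted file you would read its bytes first, for example open("extract-HTTP-FC7cMq18xeqtT9IGD3", "rb").read().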

We’ll leave the analysis of the Java classes for now, but can you identify if this is a malicious file with the information we have at this moment? Well, let’s see what others know about this file. Remember the MD5 hash included in the files.log? A quick search in VirusTotal reveals that we’re looking at a Java 0-day that was included in the Blackhole Exploit Kit (CVE-2012-4681).
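Before searching for a hash it is worth confirming that the extracted file matches the MD5 that Bro logged. Here is a minimal Python sketch using hashlib; the helper name md5_of is made up, and the demo hashes a throwaway temporary file rather than the real sample.

```python
import hashlib
import os
import tempfile

def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Throwaway file, only to show the call.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
print(md5_of(tmp.name))
os.remove(tmp.name)
```

Running md5_of on the extracted file should give the same value as the md5 column of files.log if the extraction was complete.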

As you can see, the possibilities of using the new file analysis framework are endless. Add a bit of knowledge of the Bro programming language, some python scripting goodness and a few APIs to malware analysis services and you have an awesome cocktail!

Bro, Threat Intelligence and CIF

Threat Intelligence is the new holy grail of security. Finding relevant and up-to-date information on malicious threats is key for all the phases of the security lifecycle, from prevention, to detection, incident response, containment and forensic analysis. The most common types of threat intelligence required by analysts are IP addresses, domains, urls and file hashes that have been observed in relation to malicious activity.

Many organizations provide data feeds that are freely available and that can be used with Bro’s new Intel Framework to log hits seen in network streams, like those from ZeuS and SpyEye Tracker, Malware Domains, Spamhaus, Shadowserver, Dragon Research Group, and others.

While you could download these data feeds on a regular basis, maintaining an updated repository that is actually usable by your tools can be a daunting task, especially given the number of sources and disparity of formats used. This is where the Collective Intelligence Framework (CIF) comes to the rescue.

CIF is now on version 1 (stable) and allows you to parse, normalize, store, process, query, share and produce data sets of threat intelligence.

Having installed a few CIF servers I can tell you it’s somewhat complex (maybe not complex but rather tedious), so I will refer you to the official documentation if you want to set up your own instance (see the References below). For the rest of this section I will assume that you have access to a running instance of CIF.

To enable the Bro Intel Framework and allow the integration of CIF feeds, add these three lines to your local.bro file (in Security Onion that’s in /opt/bro/share/bro/site/local.bro):

 @load frameworks/intel/seen
@load frameworks/intel/do_notice
@load policy/integration/collective-intel

CIF is used mainly in two ways: either to query for data stored about an IP address, a domain or a url, or to produce feeds based on the stored data sets. The data feeds available in version 1 can be seen here:

https://code.google.com/p/collective-intelligence-framework/wiki/API_FeedTypes_v1

In our example, we’ll generate a list of domains related to malware with a confidence level of 75 or greater. To make sure the output is formatted for Bro, append “-p bro”:

 $ cif -q domain/malware -c 75 -p bro > domain-malware.intel

Note that this command won’t work if you don’t have CIF installed. If you don’t have access to a CIF server you can grab a copy of a file formatted for Bro here (note that this will be outdated by the time you download it so use it for testing purposes only).

The figure below shows the contents of the file generated in CIF’s native format (without using the Bro plugin).

In order to import the new data feed we just generated we need to configure Bro’s Input Framework. To do so, add the following lines to your local.bro file:

 redef Intel::read_files += {
"/opt/bro/feeds/domain-malware.intel",
};

Where /opt/bro/feeds/domain-malware.intel is where you have placed the file generated by CIF. You can add as many files as you want. For more information about different methods to refer to these .intel files check http://blog.bro.org/2014/01/intelligence-data-and-bro_4980.html.

Now the Input Framework will read the information from our text-based file and will send it to the Intel Framework for processing.
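If you just want a small file to test this setup without a CIF server, you can write one by hand. The Python sketch below produces the tab-separated layout the Intel Framework reads, with a #fields header naming indicator, indicator_type and meta.source. The function name to_intel and the source label are illustrative assumptions, and the output is much simpler than what CIF generates.

```python
def to_intel(domains, source="example-feed"):
    """Render domains as a Bro Intel Framework file (tab-separated)."""
    lines = ["#fields\tindicator\tindicator_type\tmeta.source"]
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    for domain in unique:
        # Intel::DOMAIN tells the framework to treat the indicator as a domain.
        lines.append(f"{domain}\tIntel::DOMAIN\t{source}")
    return "\n".join(lines) + "\n"

print(to_intel(["winrar-soft.ru"], source="CIF"), end="")
```

Keep the separators as real tab characters when saving the result, since the Input Framework does not accept spaces in their place.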

To demonstrate the combined usage of Bro and CIF I have created sample4.pcap, a simple capture that contains a DNS query to a malicious domain (winrar-soft.ru). Let’s replay this capture with Bro after making all the changes described above:

 $ bro -r sample4.pcap local

See how a new file, intel.log, has been created:

 $ cat intel.log 

#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path intel
#open 2014-03-07-21-28-09
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p fuid file_mime_type file_desc seen.indicator seen.indicator_type seen.where sources
#types time string addr port addr port string string string string enum enum table[string]
1394223877.224159 C7J79H2v6YLWMaJEk6 192.168.68.138 54212 192.168.68.1 53 - - - winrar-soft.ru Intel::DOMAIN DNS::IN_REQUEST CIF - need-to-know
#close 2014-03-07-21-28-10

Since winrar-soft.ru was included in the feed generated by CIF and imported into Bro, we can now identify any attempt to connect to this malicious domain.

Conclusions

Security analysts will never have enough tools or resources to fight malware. Bro and CIF are two of those invaluable resources that every malware analyst should be aware of.

As their creators state, Bro is much more than an IDS. Bro is a full-featured network analysis framework created with a powerful tool, the Bro Programming Language.

If you want to know more about Bro, CIF, Malware Analysis or Network Forensics check the References section.

About the author

Ismael Valenzuela (GCFA, GREM, GCIA, GCIH, GPEN, GWAPT, GCWN, GCUX, GSNA, CISSP, CISM, 27001 Lead Auditor & ITIL Certified) works as a Principal Architect at McAfee Foundstone Services EMEA. Find him on twitter at @aboutsecurity or at http://blog.ismaelvalenzuela.com