In essence, the main idea came to use WAF + YARA (YARA right-to-left = ARAY) to detect malicious files at the WAF level before WAF can forward them to the backend e.g. files uploaded through web functions see: https://owasp.org/www-community/vulnerabilities/Unrestricted_File_Upload
When a web page allows uploading files, most of the WAFs are not inspecting files before sending them to the backend. Implementing WAF + YARA could provide malware detection before WAF forwards the files to the backend.
Yes, one solution is to use ModSecurity + Clamav, most of the pages call ClamAV as a process and not as a daemon, in this case, analysing a file could take more than 50 seconds per file. See this resource: https://kifarunix.com/intercept-malicious-file-upload-with-modsecurity-and-clamav/
:-( A few clues here Black Hat Asia 2019 please continue reading and see below our quick LAB deployment.
Basically, It is a quick deployment (1) with pre-compiled and ready-to-use YARA rules via ModSecurity (WAF) using a custom rule; (2) this custom rule will perform an inspection and detection of the files that might contain malicious code, (3) typically web functions (upload files) if the file is suspicious will reject them receiving a 403 code Forbidden by ModSecurity.
YaraCompile.py
compiles all the yara rules. (Python3 code)test.conf
is a virtual host that contains the mod security rules. (ModSecurity Code)modsec_yara.py
in order to inspect the file that is trying to upload. (Python3 code)/YaraRules/Compiled
/YaraRules/rules
/YaraRules/YaraScripts
/etc/apache2/sites-enabled
/temporal
Blueteamers
: Rule enforcement, best alerting, malware detection on files uploaded through web functions.Redteamers/pentesters
: GreyBox scope , upload and bypass with a malicious file, rule enforcement.Security Officers
: Keep alerting, threat hunting.SOC
: Best monitoring about malicious files.CERT
: Malware Analysis, Determine new IOC.The Proof of Concept is based on Debian 11.3.0 (stable) x64 OS system, OWASP CRC v3.3.2 and Yara 4.0.5, you will find the automatic installation script here wafaray_install.sh
and an optional manual installation guide can be found here: manual_instructions.txt
also a PHP page has been created as a "mock" to observe the interaction and detection of malicious files using WAF + YARA.
alex@waf-labs:~$ su root
root@waf-labs:/home/alex#
# Remember to change YOUR_USER by your username (e.g waf)
root@waf-labs:/home/alex# sed -i 's/^\(# User privi.*\)/\1\nalex ALL=(ALL) NOPASSWD:ALL/g' /etc/sudoers
root@waf-labs:/home/alex# exit
alex@waf-labs:~$ sudo sed -i 's/^\(deb cdrom.*\)/#\1/g' /etc/apt/sources.list
alex@waf-labs:~$ sudo sed -i 's/^# \(deb\-src http.*\)/ \1/g' /etc/apt/sources.list
alex@waf-labs:~$ sudo sed -i 's/^# \(deb http.*\)/ \1/g' /etc/apt/sources.list
alex@waf-labs:~$ echo -ne "\n\ndeb http://deb.debian.org/debian/ bullseye main\ndeb-src http://deb.debian.org/debian/ bullseye main\n" | sudo tee -a /etc/apt/sources.list
alex@waf-labs:~$ sudo apt-get update
alex@waf-labs:~$ sudo apt-get install sudo -y
alex@waf-labs:~$ sudo apt-get install git vim dos2unix net-tools -y
alex@waf-labs:~$ git clone https://github.com/alt3kx/wafarayalex@waf-labs:~$ cd wafaray
alex@waf-labs:~$ dos2unix wafaray_install.sh
alex@waf-labs:~$ chmod +x wafaray_install.sh
alex@waf-labs:~$ sudo ./wafaray_install.sh >> log_install.log
# Test your LAB environment
alex@waf-labs:~$ firefox localhost:8080/upload.php
Once the Yara Rules were downloaded and compiled.
It is similar to when you deploy ModSecurity, you need to customize what kind of rule you need to apply. The following log is an example of when the Web Application Firewall + Yara detected a malicious file, in this case, eicar was detected.
Message: Access denied with code 403 (phase 2). File "/temporal/20220812-184146-YvbXKilOKdNkDfySME10ywAAAAA-file-Wx1hQA" rejected by
the approver script "/YaraRules/YaraScripts/modsec_yara.py": 0 SUSPECTED [YaraSignature: eicar]
[file "/etc/apache2/sites-enabled/test.conf"] [line "56"] [id "500002"]
[msg "Suspected File Upload:eicar.com.txt -> /temporal/20220812-184146-YvbXKilOKdNkDfySME10ywAAAAA-file-Wx1hQA - URI: /upload.php"]
$ sudo service apache2 stop
$ sudo service apache2 start
$ cd /var/log
$ sudo tail -f apache2/test_access.log apache2/test_audit.log apache2/test_error.log
A malicious file is uploaded, and the ModSecurity rules plus Yara denied uploading file to the backend if the file matched with at least one Yara Rule. (Example of Malware: https://secure.eicar.org/eicar.com.txt) NOT EXECUTE THE FILE.
For this demo, we disable the rule 933110 - PHP Inject Attack
to validate Yara Rules. A malicious file is uploaded, and the ModSecurity rules plus Yara denied uploading file to the backend if the file matched with at least one Yara Rule. (Example of WebShell PHP: https://github.com/drag0s/php-webshell) NOT EXECUTE THE FILE.
A malicious file is uploaded, and the ModSecurity rules plus Yara denied uploading file to the backend if the file matched with at least one Yara Rule. (Example of Malware Bazaar (RecordBreaker): https://bazaar.abuse.ch/sample/94ffc1624939c5eaa4ed32d19f82c369333b45afbbd9d053fa82fe8f05d91ac2/) NOT EXECUTE THE FILE.
In case that you want to download more yara rules, you can see the following repositories:
Alex Hernandez aka (@_alt3kx_)
Jesus Huerta aka @mindhack03d
Israel Zeron Medina aka @spk085
Visually inspect all of the regex matches (and their sexier, more cloak and dagger cousins, the YARA matches) found in binary data and/or text. See what happens when you force various character encodings upon those matched bytes. With colors.
pipx install yaralyzer
# Scan against YARA definitions in a file:
yaralyze --yara-rules /secret/vault/sigmunds_malware_rules.yara lacan_buys_the_dip.pdf
# Scan against an arbitrary regular expression:
yaralyze --regex-pattern 'good and evil.*of\s+\w+byte' the_crypto_archipelago.exe
# Scan against an arbitrary YARA hex pattern
yaralyze --hex-pattern 'd0 93 d0 a3 d0 [-] 9b d0 90 d0 93' one_day_in_the_life_of_ivan_cryptosovich.bin
'/.+/'
and immediately get a window into all the bytes in the file that live between front slashes. Same story for quotes, BOMs, etc. Any regex YARA can handle is supported so the sky is the limit.chardet
library is a sophisticated library for guessing character encodings and it is leveraged here.chardet
will also be leveraged to see if the bytes fit the pattern of any known encoding. If chardet
is confident enough (configurable), an attempt at decoding the bytes using that encoding will be displayed.The Yaralyzer's functionality was extracted from The Pdfalyzer when it became apparent that visualizing and decoding pattern matches in binaries had more utility than just in a PDF analysis tool.
YARA, for those who are unaware1, is branded as a malware analysis/alerting tool but it's actually both a lot more and a lot less than that. One way to think about it is that YARA is a regular expression matching engine on steroids. It can locate regex matches in binaries like any regex engine but it can also do far wilder things like combine regexes in logical groups, compare regexes against all 256 XORed versions of a binary, check for base64
and other encodings of the pattern, and more. Maybe most importantly of all YARA provides a standard text based format for people to share their 'roided regexes with the world. All these features are particularly useful when analyzing or reverse engineering malware, whose authors tend to invest a great deal of time into making stuff hard to find.
But... that's also all YARA does. Everything else is up to the user. YARA's just a match engine and if you don't know what to match (or even what character encoding you might be able to match in) it only gets you so far. I found myself a bit frustrated trying to use YARA to look at all the matches of a few critical patterns:
\".+\"
and \'.+\'
)/.+/
). Front slashes demarcate a regular expression in many implementations and I was trying to see if any of the bytes matching this pattern were actually regexes.YARA just tells you the byte position and the matched string but it can't tell you whether those bytes are UTF-8, UTF-16, Latin-1, etc. etc. (or none of the above). I also found myself wanting to understand what was going in the region of the matched bytes and not just in the matched bytes. In other words I wanted to scope the bytes immediately before and after whatever got matched.
Enter The Yaralyzer, which lets you quickly scan the regions around matches while also showing you what those regions would look like if they were forced into various character encodings.
It's important to note that The Yaralyzer isn't a full on malware reversing tool. It can't do all the things a tool like CyberChef does and it doesn't try to. It's more intended to give you a quick visual overview of suspect regions in the binary so you can hone in on the areas you might want to inspect with a more serious tool like CyberChef.
Install it with pipx
or pip3
. pipx
is a marginally better solution as it guarantees any packages installed with it will be isolated from the rest of your local python environment. Of course if you don't really have a local python environment this is a moot point and you can feel free to install with pip
/pip3
.
pipx install yaralyzer
Run yaralyze -h
to see the command line options (screenshot below).
For info on exporting SVG images, HTML, etc., see Example Output.
If you place a filed called .yaralyzer
in your home directory or the current working directory then environment variables specified in that .yaralyzer
file will be added to the environment each time yaralyzer is invoked. This provides a mechanism for permanently configuring various command line options so you can avoid typing them over and over. See the example file .yaralyzer.example
to see which options can be configured this way.
Only one .yaralyzer
file will be loaded and the working directory's .yaralyzer
takes precedence over the home directory's .yaralyzer
.
Yaralyzer
is the main class. It has a variety of constructors supporting:
.yara
file in a directorybytes
Should you want to iterate over the BytesMatch
(like a re.Match
object for a YARA match) and BytesDecoder
(tracks decoding attempt stats) objects returned by The Yaralyzer, you can do so like this:
from yaralyzer.yaralyzer import Yaralyzer
yaralyzer = Yaralyzer.for_rules_files(['/secret/rule.yara'], 'lacan_buys_the_dip.pdf')
for bytes_match, bytes_decoder in yaralyzer.match_iterator():
do_stuff()
The Yaralyzer can export visualizations to HTML, ANSI colored text, and SVG vector images using the file export functionality that comes with Rich. SVGs can be turned into png
format images with a tool like Inkscape or cairosvg
. In our experience they both work though we've seen some glitchiness with cairosvg
.
PyPi Users: If you are reading this document on PyPi be aware that it renders a lot better over on GitHub. Pretty pictures, footnotes that work, etc.
chardet.detect()
thinks about the likelihood your bytes are in a given encoding/language:chardet
s behest