Crawlector (the name Crawlector is a combination of Crawler & Detector) is a threat hunting framework designed for scanning websites for malicious objects.
Note-1: The framework was first presented at the No Hat conference in Bergamo, Italy on October 22nd, 2022 (Slides, YouTube Recording). Also, it was presented for the second time at the AVAR conference, in Singapore, on December 2nd, 2022.
Note-2: The accompanying tool EKFiddle2Yara (is a tool that takes EKFiddle rules and converts them into Yara rules) mentioned in the talk, was also released at both conferences.
This is for checking for malicious urls against every page being scanned. The framework could either query the list of malicious URLs from URLHaus server (configuration: url_list_web), or from a file on disk (configuration: url_list_file), and if the latter is specified, then, it takes precedence over the former.
It works by searching the content of every page against all URL entries in url_list_web or url_list_file, checking for all occurrences. Additionally, upon a match, and if the configuration option check_url_api is set to true, Crawlector will send a POST request to the API URL set in the url_api configuration option, which returns a JSON object with extra information about a matching URL. Such information includes urlh_status (ex., online, offline, unknown), urlh_threat (ex., malware_download), urlh_tags (ex., elf, Mozi), and urlh_reference (ex., https://urlhaus.abuse.ch/url/1116455/). This information will be included in the log file cl_mlog_<current_date><current_time><(pm|am)>.csv (check below), only if check_url_api is set to true. Otherwise, the log file will include the columns urlh_url (list o f matching malicious URLs) and urlh_hit (number of occurrences for every matching malicious URL), conditional on whether check_url is set to true.
URLHaus feature could be disabled in its entirety by setting the configuration option check_url to false.
It is important to note that this feature could slow scanning considering the huge number of malicious urls (~ 130 million entries at the time of this writing) that need to be checked, and the time it takes to get extra information from the URLHaus server (if the option check_url_api is set to true).
It is very important that you familiarize yourself with the configuration file cl_config.ini before running any session. All of the sections and parameters are documented in the configuration file itself.
The Yara offline scanning feature is a standalone option, meaning, if enabled, Crawlector will execute this feature only irrespective of other enabled features. And, the same is true for the crawling for domains/sites digital certificate feature. Either way, it is recommended that you disable all non-used features in the configuration file.
log_to_file
or log_to_cons
), if a Yara rule references only a module's attributes (ex., PE, ELF, Hash, etc...), then Crawlector will display only the rule's name upon a match, excluding offset and length data.To visit/scan a website, the list of URLs must be stored in text files, in the directory βcl_sitesβ.
Crawlector accepts three types of URLs:
[a-zA-Z0-9_-]{1,128} = <url>
<id>[
depth:<0|1>-><\d+>,
total:<\d+>,
sleep:<\d+>] = <url>
For example,
mfmokbel[depth:1->3,total:10,sleep:0] = https://www.mfmokbel.com
which is equivalent to: mfmokbel[d:1->3,t:10,s:0] = https://www.mfmokbel.com
where, <id> := [a-zA-Z0-9_-]{1,128}
depth, total and sleep, can also be replaced with their shortened versions d, t and s, respectively.
40 (10 + (10*3))
URLs.Note 1: Type 3 URL could be turned into type 1 URL by setting the configuration parameter live_crawler to false, in the configuration file, in the spider section.
Note 2: Empty lines and lines that start with β;β or β//β are ignored.
The spider functionality is what gives Crawlector the capability to find additional links on the targeted page. The Spider supports the following featuers:
Type 3
, for the Spider functionality to workexclude_url
config. option. For example, *.zip|*.exe|*.rar|*.zip|*.7z|*.pdf|.*bat|*.db
include_url
config. option. For example, */checkout/*|*/products/*
exclude_https
add_ext_links
. This feature honours the exclude_url
and include_url
config. option.ext_links_only
. This feature honours the exclude_url
and include_url
config. option.site_ranking
in the configuration file provides some options to alter how the CSV file is to be readsite
section provides the capability to expand on a given site, by attempting to find all available top-level domains (TLDs) and/or subdomains for the same domain. If found, new tlds/subdomains will be checked like any other domainrapid_api_key
in the configuration filefind_tlds
enabled, in addition to Omnisint Labs API tlds results, the framework attempts to find other active/registered domains by going through every tld entry, either, in the tlds_file
or tlds_url
tlds_url
is set, it should point to a url that hosts tlds, each one on a new line (lines that start with either of the characters ';', '#' or '//' are ignored)tlds_file
, holds the filename that contains the list of tlds (same as for tlds_url
; only the tld is present, excluding the '.', for ex., "com", "org")tlds_file
is set, it takes precedence over tlds_url
tld_dl_time_out
, this is for setting the maximum timeout for the dnslookup function when attempting to check if the domain in question resolves or nottld_use_connect
, this option enables the functionality to connect to the domain in question over a list of ports, defined in the option tlds_connect_ports
tlds_connect_ports
accepts a list of ports, comma separated, or a list of ranges, such as 25-40,90-100,80,443,8443 (range start and end are inclusive) tld_con_time_out
, this is for setting the maximum timeout for the connect functiontld_con_use_ssl
, enable/disable the use of ssl when attempting to connect to the domainsave_to_file_subd
is set to true, discovered subdomains will be saved to "\expanded\exp_subdomain_<pm|am>.txt"save_to_file_tld
is set to true, discovered domains will be saved to "\expanded\exp_tld_<pm|am>.txt"exit_here
is set to true, then Crawlector bails out after executing this [site] function, irrespective of other enabled options. It means found sites won't be crawled/spideredcl_sites
are allowed.Open for pull requests and issues. Comments and suggestions are greatly appreciated.
Mohamad Mokbel (@MFMokbel)
YARA rule Analyzer to improve rule quality and performance
YARA rules can be syntactically correct but still dysfunctional. yaraQA tries to find and report these issues to the author or maintainer of a YARA rule set.
The issues yaraQA tries to detect are e.g.:
2 of them
in the condition)$ = "\\Debug\\" fullword
)$ = "AA"
; can be excluded from the analysis using --ignore-performance
)I'm going to extend the test set over time. Each minor version will include new features or new tests.
pip install -r requirements.txt
usage: yaraQA.py [-h] [-f yara files [yara files ...]] [-d yara files [yara files ...]] [-o outfile] [-b baseline] [-l level]
[--ignore-performance] [--debug]
YARA RULE ANALYZER
optional arguments:
-h, --help show this help message and exit
-f yara files [yara files ...]
Path to input files (one or more YARA rules, separated by space)
-d yara files [yara files ...]
Path to input directory (YARA rules folders, separated by space)
-o outfile Output file that lists the issues (JSON, default: 'yaraQA-issues.json')
-b baseline Use a issues baseline (issues found and reviewed before) to filter issues
-l level Minium level to show (1=informational, 2=warning, 3=critical)
--ignore-performance Suppress performance-related rule issues
--debug Debug output
python3 yaraQA.py -d ./test/
Suppress all performance issues and only show detection / logic issues.
python3 yaraQA.py -d ./test/ --ignore-performance
Suppress all issues of informational character
python3 yaraQA.py -d ./test/ -level 2
Use a baseline to only see new issues (not the ones that you've already reviewed). The baseline file is an old JSON output of a reviewed state.
python3 yaraQA.py -d ./test/ -b yaraQA-reviewed-issues.json
Example rules with issues can be found in the ./test
folder.
yaraQA writes the detected issues to a file named yaraQA-issues.json
by default.
This listing shows an example of the output generated by yaraQA in JSON format:
[
{
"rule": "Demo_Rule_1_Fullword_PDB",
"id": "SM1",
"issue": "The rule uses a PDB string with the modifier 'wide'. PDB strings are always included as ASCII strings. The 'wide' keyword is unneeded.",
"element": {
"name": "$s1",
"value": "\\\\i386\\\\mimidrv.pdb",
"type": "text",
"modifiers": [
"ascii",
"wide",
"fullword"
]
},
"level": "info",
"type": "logic",
"recommendation": "Remove the 'wide' modifier"
},
{
"rule": "Demo_Rule_1_Fullword_PDB",
"id": "SM2",
"issue": "The rule uses a PDB string with the modifier 'fullword' but it starts with two backslashes and thus the modifier could lead to a dysfunctional rule.",
"element": {
"name": " $s1",
"value": "\\\\i386\\\\mimidrv.pdb",
"type": "text",
"modifiers": [
"ascii",
"wide",
"fullword"
]
},
"level": "warning",
"type": "logic",
"recommendation": "Remove the 'fullword' modifier"
},
{
"rule": "Demo_Rule_2_Short_Atom",
"id": "PA2",
"issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",
"element": {
"name": "$s1",
"value": "{ 01 02 03 }",
"type": "byte"
},
"level": "warning",
"type": "performance",
"recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
},
{
"rule": "Demo_Rule_3_Fullword_FilePath_Section",
"id": "SM3",
"issue": "The rule uses a string with the modifier 'fullword' but it starts and ends with two backslashes and thus the modifier could lead to a dysfunctional rule.",
"element": {
"name": "$s1",
"value": "\\\\ZombieBoy\\\\",
"type": "text",
"modifiers": [
"ascii",
"fullword"
]
},
"level": "warning",
"type": "logic",
"recommendation": "Remove the 'fullword' modifier"
},
{
"rule": "Demo_Rule_4_Condition_Never_Matches",
"id": "CE1",
"issue": "The rule uses a condition that will never match",
"element": {
"condition_segment": "2 of",
"num_of_strings": 1
},
"level": "error",
"type": "logic",
"recommendation": "Fix the condition"
},
{
"rule": "Demo_Rule_5_Condition_Short_String_At_Pos",
"id": "PA1",
"issue": "This rule looks for a short string at a particular position. A short string represents a short atom and could be rewritten to an expression using uint(x) at position.",
"element": {
"condition_segment": "$mz at 0",
"string": "$mz",
"value": "MZ"
},
"level": "warning",
"type": "performance",
"recommendation": ""
},
{
"rule": "Demo_Rule_5_Condition_Short_String_At_Pos",
"id": "PA2",
"issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",< br/> "element": {
"name": "$mz",
"value": "MZ",
"type": "text",
"modifiers": [
"ascii"
]
},
"level": "warning",
"type": "performance",
"recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
},
{
"rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
"id": "PA1",
"issue": "This rule looks for a short string at a particular position. A short string represents a short atom and could be rewritten to an expression using uint(x) at position.",
"element": {
"condition_segment": "$mz at 0",
"string": "$mz",
"value": "{ 4d 5a }"
},
"level": "warning",
"type": "performance",
"recommendation": ""
},
{
"rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
"id": "PA2",
"issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",
"element": {
"name": "$mz",
"value": "{ 4d 5a }",
"type": "byte"
},
"level": "warning",
"type": "performance",
"recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
},
{
"rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
"id": "SM3",
"issue": "The rule uses a string with the modifier 'fullword' but it starts and ends with two backsla shes and thus the modifier could lead to a dysfunctional rule.",
"element": {
"name": "$s1",
"value": "\\\\Section\\\\in\\\\Path\\\\",
"type": "text",
"modifiers": [
"ascii",
"fullword"
]
},
"level": "warning",
"type": "logic",
"recommendation": "Remove the 'fullword' modifier"
}
]
Sandboxes are commonly used to analyze malware. They provide a temporary, isolated, and secure environment in which to observe whether a suspicious file exhibits any malicious behavior. However, malware developers have also developed methods to evade sandboxes and analysis environments. One such method is to perform checks to determine whether the machine the malware is being executed on is being operated by a real user. One such check is the RAM size. If the RAM size is unrealistically small (e.g., 1GB), it may indicate that the machine is a sandbox. If the malware detects a sandbox, it will not execute its true malicious behavior and may appear to be a benign file
The GetPhysicallyInstalledSystemMemory
API retrieves the amount of RAM that is physically installed on the computer from the SMBIOS firmware tables. It takes a PULONGLONG
parameter and returns TRUE
if the function succeeds, setting the TotalMemoryInKilobytes
to a nonzero value. If the function fails, it returns FALSE
.
The amount of physical memory retrieved by the GetPhysicallyInstalledSystemMemory
function must be equal to or greater than the amount reported by the GlobalMemoryStatusEx
function; if it is less, the SMBIOS data is malformed and the function fails with ERROR_INVALID_DATA
, Malformed SMBIOS data may indicate a problem with the user's computer .
The register rcx
holds the parameter TotalMemoryInKilobytes
. To overwrite the jump address of GetPhysicallyInstalledSystemMemory
, I use the following opcodes: mov qword ptr ss:[rcx],4193B840
. This moves the value 4193B840
(or 1.1 TB) to rcx
. Then, the ret instruction is used to pop the return address off the stack and jump to it, Therefore, whenever GetPhysicallyInstalledSystemMemory
is called, it will set rcx
to the custom value."
The CertVerify is a tool designed to detect executable files (exe, dll, sys) that have been signed with untrusted or leaked code signing certificates. The purpose of this tool is to identify potentially malicious files that have been signed using certificates that have been compromised, stolen, or are not from a trusted source.
Executable files signed with compromised or untrusted code signing certificates can be used to distribute malware and other malicious software. Attackers can use these files to bypass security controls and to make their malware appear legitimate to victims. This tool helps to identify these files so that they can be removed or investigated further.
As a continuous project of the previous malware scanner, i have created such a tool. This type of tool is also essential in the event of a security incident response.
The CertVerify cannot guarantee that all files identified as suspicious are necessarily malicious. It is possible for files to be falsely identified as suspicious, or for malicious files to go undetected by the scanner.
The scanner only targets code signing certificates that have been identified as malicious by the public community. This includes certificates extracted by malware analysis tools and services, and other public sources. There are many unverified malware signing certificates, and it is not possible to obtain the entire malware signing certificate the tool can only detect some of them. For additional detection, you have to extract the certificate's serial number and fingerprint information yourself and add it to the signatures.
The scope of this tool does not include the extraction of code signing information for special rootkits that have already preempted and operated under the kernel, such as FileLess bootkits, or hidden files hidden by high-end technology. In other words, if you run this tool, it will be executed at the user level. Similar functions at the kernel level are more accurate with antirootkit or EDR. Please keep this in mind and focus on the ideas and principles... To implement the principle that is appropriate for the purpose of this tool, you need to development a driver(sys) and run it into the kernel with NT\SYSTEM privileges.
Nevertheless, if you want to run this tool in the event of a Windows system intrusion incident, and your purpose is sys files, boot into safe mode or another boot option that does not load the extra driver(sys) files (load only default system drivers) of the Windows system before running the tool. I think this can be a little more helpful.
Alternatively, mount the Windows system disk to the Linux and run the tool in the Linux environment. I think this could yield better results.
datetime="2023-03-06 20:17:57",scan_id="87ea3e7b-dedc-4016-a43e-5c83f8d27c6e",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\chrome.exe",signature_hash="sha256",serial_number="0e4418e2dede36dd2974c3443afb5ce5",thumbprint="7d3d117664f121e592ef897973ef9c159150e3d736326e9cd2755f71e0febc0c",subject_name="Google LLC",issu er_name="DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1",file_created_at="2023-03-03 23:20:41",file_modified_at="2022-04-14 06:17:04"
datetime="2023-03-06 20:17:58",scan_id="87ea3e7b-dedc-4016-a43e-5c83f8d27c6e",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineLauncher.exe",signature_hash="sha256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumbprint="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-03-10 18:00:10"
datetime="2023-03-06 20:17:58",scan_id="87ea3e7b-dedc-4016-a43e-5c83f8d27c6e",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineUpdater.exe",signature_hash="sha256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumb print="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-06 10:06:28"
datetime="2023-03-06 20:17:59",scan_id="87ea3e7b-dedc-4016-a43e-5c83f8d27c6e",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\TWOD_Launcher.exe",signature_hash="sha256",serial_number="073637b724547cd847acfd28662a5e5b",thumbprint="281734d4592d1291d27190709cb510b07e22c405d5e0d6119b70e73589f98acf",subject_name="DigiCert Trusted G4 RSA4096 SHA256 TimeStamping CA",issuer_name="DigiCert Trusted Root G4",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-07 09:14:08"
datetime="2023-03-06 20:18:00",scan_id="87ea3e7b-dedc-4016-a43e-5c83f8d27c6e",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject \certverify\test\VBoxSup.sys",signature_hash="sha256",serial_number="2f451139512f34c8c528b90bca471f767b83c836",thumbprint="3aa166713331d894f240f0931955f123873659053c172c4b22facd5335a81346",subject_name="VirtualBox for Legacy Windows Only Timestamp Kludge 2014",issuer_name="VirtualBox for Legacy Windows Only Timestamp CA",file_created_at="2023-03-03 23:20:43",file_modified_at="2022-10-11 08:11:56"
datetime="2023-03-06 20:31:59",scan_id="f71277c5-ed4a-4243-8070-7e0e56b0e656",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\chrome.exe",signature_hash="sha256",serial_number="0e4418e2dede36dd2974c3443afb5ce5",thumbprint="7d3d117664f121e592ef897973ef9c159150e3d736326e9cd2755f71e0febc0c",subject_name="Google LLC",issuer_name="DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1",file_created_at="2023-03-03 23:20:41",file_modified_at="2022-04-14 06:17:04"
datetime="2023-03-06 20:32:00",scan_id="f71277c 5-ed4a-4243-8070-7e0e56b0e656",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineLauncher.exe",signature_hash="sha256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumbprint="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-03-10 18:00:10"
datetime="2023-03-06 20:32:00",scan_id="f71277c5-ed4a-4243-8070-7e0e56b0e656",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineUpdater.exe",signature_hash="sha256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumbprint="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-06 10:06:28"
datetime="2023-03-06 20:32:01",scan_id="f71277c5-ed4a-4243-8070-7e0e56b0e656",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\TWOD_Launcher.exe",signature_hash="sha256",serial_number="073637b724547cd847acfd28662a5e5b",thumbprint="281734d4592d1291d27190709cb510b07e22c405d5e0d6119b70e73589f98acf",subject_name="DigiCert Trusted G4 RSA4096 SHA256 TimeStamping CA",issuer_name="DigiCert Trusted Root G4",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-07 09:14:08"
datetime="2023-03-06 20:32:02",scan_id="f71277c5-ed4a-4243-8070-7e0e56b0e656",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\VBoxSup.sys",signature_hash="sha256",serial_number="2f451139512f34c8c528b90bca471f767b83c836",thumbprint="3aa166713331d894f240f0931955f123873659053c172c4b22facd5335a81346",subjec t_name="VirtualBox for Legacy Windows Only Timestamp Kludge 2014",issuer_name="VirtualBox for Legacy Windows Only Timestamp CA",file_created_at="2023-03-03 23:20:43",file_modified_at="2022-10-11 08:11:56"
datetime="2023-03-06 20:33:45",scan_id="033976ae-46cb-4c2e-a357-734353f7e09a",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\chrome.exe",signature_hash="sha256",serial_number="0e4418e2dede36dd2974c3443afb5ce5",thumbprint="7d3d117664f121e592ef897973ef9c159150e3d736326e9cd2755f71e0febc0c",subject_name="Google LLC",issuer_name="DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1",file_created_at="2023-03-03 23:20:41",file_modified_at="2022-04-14 06:17:04"
datetime="2023-03-06 20:33:45",scan_id="033976ae-46cb-4c2e-a357-734353f7e09a",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineLauncher.exe",signature_hash="sha 256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumbprint="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-03-10 18:00:10"
datetime="2023-03-06 20:33:45",scan_id="033976ae-46cb-4c2e-a357-734353f7e09a",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\LineUpdater.exe",signature_hash="sha256",serial_number="0d424ae0be3a88ff604021ce1400f0dd",thumbprint="b3109006bc0ad98307915729e04403415c83e3292b614f26964c8d3571ecf5a9",subject_name="DigiCert Timestamp 2021",issuer_name="DigiCert SHA2 Assured ID Timestamping CA",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-06 10:06:28"
datetime="2023-03-06 20:33:46",scan_id="033976ae-46cb-4c2e-a357-734353f7e09a",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192. 168.0.23",infected_file="F:\code\pythonProject\certverify\test\TWOD_Launcher.exe",signature_hash="sha256",serial_number="073637b724547cd847acfd28662a5e5b",thumbprint="281734d4592d1291d27190709cb510b07e22c405d5e0d6119b70e73589f98acf",subject_name="DigiCert Trusted G4 RSA4096 SHA256 TimeStamping CA",issuer_name="DigiCert Trusted Root G4",file_created_at="2023-03-03 23:20:42",file_modified_at="2022-04-07 09:14:08"
datetime="2023-03-06 20:33:47",scan_id="033976ae-46cb-4c2e-a357-734353f7e09a",os_version="Windows",hostname="DESKTOP-S5VJGLH",ip_address="192.168.0.23",infected_file="F:\code\pythonProject\certverify\test\VBoxSup.sys",signature_hash="sha256",serial_number="2f451139512f34c8c528b90bca471f767b83c836",thumbprint="3aa166713331d894f240f0931955f123873659053c172c4b22facd5335a81346",subject_name="VirtualBox for Legacy Windows Only Timestamp Kludge 2014",issuer_name="VirtualBox for Legacy Windows Only Timestamp CA",file_created_at="2023-03-03 23:20:43",file_modified_at="2022-10-11 08:11:56"
The tool is being tested in the beta phase, and it only gathers MacOS system information at this time.
The code is poorly organized and requires significant improvements.
Bash tool used for proactive detection of malicious activity on macOS systems.
I was inspired by Venator-Swift and decided to create a bash version of the tool.
curl https://raw.githubusercontent.com/ab2pentest/MacOSThreatTrack/main/MacOSThreatTrack.sh | bash
[+] System info
[+] Users list
[+] Environment variables
[+] Process list
[+] Active network connections
[+] SIP status
[+] GateKeeper status
[+] Zsh history
[+] Bash history
[+] Shell startup scripts
[+] PF rules
[+] Periodic scripts
[+] CronJobs list
[+] LaunchDaemons data
[+] Kernel extensions
[+] Installed applications
[+] Installation history
[+] Chrome extensions
Visually inspect all of the regex matches (and their sexier, more cloak and dagger cousins, the YARA matches) found in binary data and/or text. See what happens when you force various character encodings upon those matched bytes. With colors.
pipx install yaralyzer
# Scan against YARA definitions in a file:
yaralyze --yara-rules /secret/vault/sigmunds_malware_rules.yara lacan_buys_the_dip.pdf
# Scan against an arbitrary regular expression:
yaralyze --regex-pattern 'good and evil.*of\s+\w+byte' the_crypto_archipelago.exe
# Scan against an arbitrary YARA hex pattern
yaralyze --hex-pattern 'd0 93 d0 a3 d0 [-] 9b d0 90 d0 93' one_day_in_the_life_of_ivan_cryptosovich.bin
'/.+/'
and immediately get a window into all the bytes in the file that live between front slashes. Same story for quotes, BOMs, etc. Any regex YARA can handle is supported so the sky is the limit.chardet
library is a sophisticated library for guessing character encodings and it is leveraged here.chardet
will also be leveraged to see if the bytes fit the pattern of any known encoding. If chardet
is confident enough (configurable), an attempt at decoding the bytes using that encoding will be displayed.The Yaralyzer's functionality was extracted from The Pdfalyzer when it became apparent that visualizing and decoding pattern matches in binaries had more utility than just in a PDF analysis tool.
YARA, for those who are unaware1, is branded as a malware analysis/alerting tool but it's actually both a lot more and a lot less than that. One way to think about it is that YARA is a regular expression matching engine on steroids. It can locate regex matches in binaries like any regex engine but it can also do far wilder things like combine regexes in logical groups, compare regexes against all 256 XORed versions of a binary, check for base64
and other encodings of the pattern, and more. Maybe most importantly of all YARA provides a standard text based format for people to share their 'roided regexes with the world. All these features are particularly useful when analyzing or reverse engineering malware, whose authors tend to invest a great deal of time into making stuff hard to find.
But... that's also all YARA does. Everything else is up to the user. YARA's just a match engine and if you don't know what to match (or even what character encoding you might be able to match in) it only gets you so far. I found myself a bit frustrated trying to use YARA to look at all the matches of a few critical patterns:
\".+\"
and \'.+\'
)/.+/
). Front slashes demarcate a regular expression in many implementations and I was trying to see if any of the bytes matching this pattern were actually regexes.YARA just tells you the byte position and the matched string but it can't tell you whether those bytes are UTF-8, UTF-16, Latin-1, etc. etc. (or none of the above). I also found myself wanting to understand what was going in the region of the matched bytes and not just in the matched bytes. In other words I wanted to scope the bytes immediately before and after whatever got matched.
Enter The Yaralyzer, which lets you quickly scan the regions around matches while also showing you what those regions would look like if they were forced into various character encodings.
It's important to note that The Yaralyzer isn't a full on malware reversing tool. It can't do all the things a tool like CyberChef does and it doesn't try to. It's more intended to give you a quick visual overview of suspect regions in the binary so you can hone in on the areas you might want to inspect with a more serious tool like CyberChef.
Install it with pipx
or pip3
. pipx
is a marginally better solution as it guarantees any packages installed with it will be isolated from the rest of your local python environment. Of course if you don't really have a local python environment this is a moot point and you can feel free to install with pip
/pip3
.
pipx install yaralyzer
Run yaralyze -h
to see the command line options (screenshot below).
For info on exporting SVG images, HTML, etc., see Example Output.
If you place a filed called .yaralyzer
in your home directory or the current working directory then environment variables specified in that .yaralyzer
file will be added to the environment each time yaralyzer is invoked. This provides a mechanism for permanently configuring various command line options so you can avoid typing them over and over. See the example file .yaralyzer.example
to see which options can be configured this way.
Only one .yaralyzer
file will be loaded and the working directory's .yaralyzer
takes precedence over the home directory's .yaralyzer
.
Yaralyzer
is the main class. It has a variety of constructors supporting:
.yara
file in a directorybytes
Should you want to iterate over the BytesMatch
(like a re.Match
object for a YARA match) and BytesDecoder
(tracks decoding attempt stats) objects returned by The Yaralyzer, you can do so like this:
from yaralyzer.yaralyzer import Yaralyzer
yaralyzer = Yaralyzer.for_rules_files(['/secret/rule.yara'], 'lacan_buys_the_dip.pdf')
for bytes_match, bytes_decoder in yaralyzer.match_iterator():
do_stuff()
The Yaralyzer can export visualizations to HTML, ANSI colored text, and SVG vector images using the file export functionality that comes with Rich. SVGs can be turned into png
format images with a tool like Inkscape or cairosvg
. In our experience they both work though we've seen some glitchiness with cairosvg
.
PyPi Users: If you are reading this document on PyPi be aware that it renders a lot better over on GitHub. Pretty pictures, footnotes that work, etc.
chardet.detect()
thinks about the likelihood your bytes are in a given encoding/language:chardet
s behest
PersistenceSniper is a Powershell script that can be used by Blue Teams, Incident Responders and System Administrators to hunt persistences implanted in Windows machines. The script is also available on Powershell Gallery. |
Why writing such a tool, you might ask. Well, for starters, I tried looking around and I did not find a tool which suited my particular use case, which was looking for known persistence techniques, automatically, across multiple machines, while also being able to quickly and easily parse and compare results. Sure, Sysinternals' Autoruns is an amazing tool and it's definitely worth using, but, given it outputs results in non-standard formats and can't be run remotely unless you do some shenanigans with its command line equivalent, I did not find it a good fit for me. Plus, some of the techniques I implemented so far in PersistenceSniper have not been implemented into Autoruns yet, as far as I know. Anyway, if what you need is an easy to use, GUI based tool with lots of already implemented features, Autoruns is the way to go, otherwise let PersistenceSniper have a shot, it won't miss it :)
Using PersistenceSniper is as simple as:
PS C:\> git clone https://github.com/last-byte/PersistenceSniper
PS C:\> Import-Module .\PersistenceSniper\PersistenceSniper\PersistenceSniper.psd1
PS C:\> Find-AllPersistence
If you need a detailed explanation of how to use the tool or which parameters are available and how they work, PersistenceSniper's Find-AllPersistence
supports Powershell's help features, so you can get detailed, updated help by using the following command after importing the module:
Get-Help -Name Find-AllPersistence -Full
PersistenceSniper's Find-AllPersistence
returns an array of objects of type PSCustomObject with the following properties:
PS C:\> Find-AllPersistence | Where-Object "Access Gained" -EQ "System"
Of course, being PersistenceSniper a Powershell-based tool, some cool tricks can be performed, like passing its output to Out-GridView
in order to have a GUI-based table to interact with.
As already introduced, Find-AllPersistence
outputs an array of Powershell Custom Objects. Each object has the following properties, which can be used to filter, sort and better understand the different techniques the function looks for:
Find-AllPersistence
without a -ComputerName
parameter, PersistenceSniper will run only on the local machine. Otherwise it will run on the remote computer(s) you specify;Let's face it, hunting for persistence techniques also comes with having to deal with a lot of false positives. This happens because, while some techniques are almost never legimately used, many indeed are by legit software which needs to autorun on system boot or user login.
This poses a challenge, which in many environments can be tackled by creating a CSV file containing known false positives. If your organization deploys systems using something like a golden image, you can run PersistenceSniper on a system you just created, get a CSV of the results and use it to filter out results on other machines. This approach comes with the following benefits:
Find-AllPersistence
comes with parameters allowing direct output of the findings to a CSV file, while also being able to take a CSV file as input and diffing the results.
PS C:\> Find-AllPersistence -DiffCSV false_positives.csv
Β
One cool way to use PersistenceSniper my mate Riccardo suggested is to use it in an incremental way: you could setup a Scheduled Task which runs every X hours, takes in the output of the previous iteration through the -DiffCSV
parameter and outputs the results to a new CSV. By keeping track of the incremental changes, you should be able to spot within a reasonably small time frame new persistences implanted on the machine you are monitoring.
The topic of persistence, especially on Windows machines, is one of those which see new discoveries basically every other week. Given the sheer amount of persistence techniques found so far by researchers, I am still in the process of implementing them. So far the following 31 techniques have been implemented successfully:
The techniques implemented in this script have already been published by skilled researchers around the globe, so it's right to give credit where credit's due. This project wouldn't be around if it weren't for:
I'd also like to give credits to my fellow mates at @APTortellini, in particular Riccardo Ancarani, for the flood of ideas that helped it grow from a puny text-oriented script to a full-fledged Powershell tool.
This project is under the CC0 1.0 Universal license. TL;DR: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.