Thank you for following me! https://cybdetective.com
Name | Link | Description | Price |
---|---|---|---|
Shodan | https://developer.shodan.io | Search engine for Internet-connected hosts and devices | from $59/month |
Netlas.io | https://netlas-api.readthedocs.io/en/latest/ | Search engine for Internet-connected hosts and devices. Read more in the Netlas CookBook | Partly FREE |
Fofa.so | https://fofa.so/static_pages/api_help | Search engine for Internet-connected hosts and devices | ??? |
Censys.io | https://censys.io/api | Search engine for Internet-connected hosts and devices | Partly FREE |
Hunter.how | https://hunter.how/search-api | Search engine for Internet-connected hosts and devices | Partly FREE |
Fullhunt.io | https://api-docs.fullhunt.io/#introduction | Search engine for Internet-connected hosts and devices | Partly FREE |
IPQuery.io | https://ipquery.io | API for IP information such as IP risk, geolocation data, and ASN details | FREE |
Name | Link | Description | Price |
---|---|---|---|
Social Links | https://sociallinks.io/products/sl-api | Email info lookup, phone info lookup, individual and company profiling, social media tracking, dark web monitoring and more. Code example of using this API for face search in this repo | PAID. Price per request |
Name | Link | Description | Price |
---|---|---|---|
Numverify | https://numverify.com | Global Phone Number Validation & Lookup JSON API. Supports 232 countries. | 250 requests FREE |
Twilio | https://www.twilio.com/docs/lookup/api | Provides a way to retrieve additional information about a phone number | Free or $0.01 per request (for caller lookup) |
Plivo | https://www.plivo.com/lookup/ | Determine carrier, number type, format, and country for any phone number worldwide | from $0.04 per request |
GetContact | https://github.com/kovinevmv/getcontact | Find info about a user by phone number | from $6.89/month for 100 requests |
Veriphone | https://veriphone.io/ | Phone number validation & carrier lookup | 1000 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Global Address | https://rapidapi.com/adminMelissa/api/global-address/ | Easily verify, check or lookup address | FREE |
US Street Address | https://smartystreets.com/docs/cloud/us-street-api | Validate and append data for any US postal address | FREE |
Google Maps Geocoding API | https://developers.google.com/maps/documentation/geocoding/overview | convert addresses (like "1600 Amphitheatre Parkway, Mountain View, CA") into geographic coordinates | 0.005 USD per request |
Postcoder | https://postcoder.com/address-lookup | Find address by postcode | £130/5000 requests |
Zipcodebase | https://zipcodebase.com | Lookup postal codes, calculate distances and much more | 5000 requests FREE |
Openweathermap geocoding API | https://openweathermap.org/api/geocoding-api | Get geographical coordinates (lat, lon) from the name of a location (city or area name) | 60 calls/minute, 1,000,000 calls/month |
DistanceMatrix | https://distancematrix.ai/product | Calculate, evaluate and plan your routes | $1.25-$2 per 1000 elements |
Geotagging API | https://geotagging.ai/ | Predict geolocations by texts | Freemium |
Name | Link | Description | Price |
---|---|---|---|
Appruve.com | https://appruve.co | Allows you to verify the identities of individuals and businesses, and connect to financial account data across Africa | Paid |
Onfido.com | https://onfido.com | Onfido Document Verification lets your users scan a photo ID from any device before checking it's genuine. Combined with Biometric Verification, it's a seamless way to anchor an account to the real identity of a customer. | Paid |
Surepass.io | https://surepass.io/passport-id-verification-api/ | Passport, Photo ID and Driver License Verification in India | Paid |
Name | Link | Description | Price |
---|---|---|---|
OpenCorporates | https://api.opencorporates.com | Company information | Paid, price upon request |
Linkedin company search API | https://docs.microsoft.com/en-us/linkedin/marketing/integrations/community-management/organizations/company-search?context=linkedin%2Fcompliance%2Fcontext&tabs=http | Find companies using keywords, industry, location, and other criteria | FREE |
Mattermark | https://rapidapi.com/raygorodskij/api/Mattermark/ | Get companies and investor information | free 14-day trial, from $49 per month |
Name | Link | Description | Price |
---|---|---|---|
API OSINT DS | https://github.com/davidonzo/apiosintDS | Collect info about IPv4 addresses, FQDNs, URLs and file hashes (MD5, SHA1 or SHA256) | FREE |
InfoDB API | https://www.ipinfodb.com/api | The API returns the location of an IP address (country, region, city, zipcode, latitude, longitude) and the associated timezone in XML, JSON or plain text format | FREE |
Domainsdb.info | https://domainsdb.info | Registered Domain Names Search | FREE |
BGPView | https://bgpview.docs.apiary.io/# | Allows consumers to view all sorts of analytics data about the current state and structure of the internet | FREE |
DNSCheck | https://www.dnscheck.co/api | monitor the status of both individual DNS records and groups of related DNS records | up to 10 DNS records/FREE |
Cloudflare Trace | https://github.com/fawazahmed0/cloudflare-trace-api | Get IP Address, Timestamp, User Agent, Country Code, IATA, HTTP Version, TLS/SSL Version & More | FREE |
Host.io | https://host.io/ | Get info about domain | FREE |
Name | Link | Description | Price |
---|---|---|---|
BeVigil OSINT API | https://bevigil.com/osint-api | provides access to millions of asset footprint data points including domain intel, cloud services, API information, and third party assets extracted from millions of mobile apps being continuously uploaded and scanned by users on bevigil.com | 50 credits free/1000 credits/$50 |
Name | Link | Description | Price |
---|---|---|---|
WebScraping.AI | https://webscraping.ai/ | Web Scraping API with built-in proxies and JS rendering | FREE |
ZenRows | https://www.zenrows.com/ | Web Scraping API that bypasses anti-bot solutions while offering JS rendering and rotating proxies | FREE |
Name | Link | Description | Price |
---|---|---|---|
Whois freaks | https://whoisfreaks.com/ | Well-parsed and structured domain WHOIS data for all domain names, registrars, countries and TLDs since the birth of the internet | $19/5000 requests |
WhoisXMLApi | https://whois.whoisxmlapi.com | gathers a variety of domain ownership and registration data points from a comprehensive WHOIS database | 500 requests in month/FREE |
IP2WHOIS | https://www.ip2whois.com/developers-api | Get detailed info about a domain | 500 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Ipstack | https://ipstack.com | Detect country, region, city and zip code | FREE |
Ipgeolocation.io | https://ipgeolocation.io | provides country, city, state, province, local currency, latitude and longitude, company detail, ISP lookup, language, zip code, country calling code, time zone, current time, sunset and sunrise time, moonset and moonrise | 30 000 requests per month/FREE |
IPInfoDB | https://ipinfodb.com/api | Free Geolocation tools and APIs for country, region, city and time zone lookup by IP address | FREE |
IP API | https://ip-api.com/ | Free domain/IP geolocation info | FREE |
Name | Link | Description | Price |
---|---|---|---|
Mylnikov API | https://www.mylnikov.org | public API implementation of Wi-Fi Geo-Location database | FREE |
Wigle | https://api.wigle.net/ | get location and other information by SSID | FREE |
Name | Link | Description | Price |
---|---|---|---|
PeeringDB | https://www.peeringdb.com/apidocs/ | Database of networks, and the go-to location for interconnection data | FREE |
PacketTotal | https://packettotal.com/api.html | Analyze .pcap files | FREE |
Name | Link | Description | Price |
---|---|---|---|
Binlist.net | https://binlist.net/ | Get information about a bank by BIN | FREE |
FDIC Bank Data API | https://banks.data.fdic.gov/docs/ | institutions, locations and history events | FREE |
Amdoren | https://www.amdoren.com/currency-api/ | Free currency API with over 150 currencies | FREE |
VATComply.com | https://www.vatcomply.com/documentation | Exchange rates, geolocation and VAT number validation | FREE |
Alpaca | https://alpaca.markets/docs/api-documentation/api-v2/market-data/alpaca-data-api-v2/ | Realtime and historical market data on all US equities and ETFs | FREE |
Swiftcodesapi | https://swiftcodesapi.com | Verifying the validity of a bank SWIFT code or IBAN account number | $39 per month/4000 swift lookups |
IBANAPI | https://ibanapi.com | Validate IBAN number and get bank account information from it | Freemium/10$ Starter plan |
Name | Link | Description | Price |
---|---|---|---|
EVA | https://eva.pingutil.com/ | Measuring email deliverability & quality | FREE |
Mailboxlayer | https://mailboxlayer.com/ | Simple REST API measuring email deliverability & quality | 100 requests FREE, 5000 requests in month — $14.49 |
EmailCrawlr | https://emailcrawlr.com/ | Get key information about company websites. Find all email addresses associated with a domain. Get social accounts associated with an email. Verify email address deliverability. | 200 requests FREE, 5000 requests — $40 |
Voila Norbert | https://www.voilanorbert.com/api/ | Find anyone's email address and ensure your emails reach real people | from $49/month |
Kickbox | https://open.kickbox.com/ | Email verification API | FREE |
FachaAPI | https://api.facha.dev/ | Allows checking if an email domain is a temporary email domain | FREE |
Name | Link | Description | Price |
---|---|---|---|
Genderize.io | https://genderize.io | Instantly answers the question of how likely a certain name is to be male or female and shows the popularity of the name. | 1000 names/day free |
Agify.io | https://agify.io | Predicts the age of a person given their name | 1000 names/day free |
Nationalize.io | https://nationalize.io | Predicts the nationality of a person given their name | 1000 names/day free |
Name | Link | Description | Price |
---|---|---|---|
HaveIBeenPwned | https://haveibeenpwned.com/API/v3 | Returns a list of pwned accounts (email addresses and usernames) | $3.50 per month |
Psbdmp.ws | https://psbdmp.ws/api | Search in Pastebin dumps | $9.95 per 10000 requests |
LeakPeek | https://psbdmp.ws/api | Search in leaked databases | $9.99 per 4 weeks unlimited access |
BreachDirectory.com | https://breachdirectory.com/api_documentation | Search for a domain in data breach databases | FREE |
LeakLookup | https://leak-lookup.com/api | Search by domain, email address, full name, IP address, phone, password or username in leaked databases | 10 requests FREE |
BreachDirectory.org | https://rapidapi.com/rohan-patra/api/breachdirectory/pricing | Search by domain, email address, full name, IP address, phone, password or username in leaked databases (possible to view password hashes) | 50 requests/month FREE |
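For example, here is a minimal Python sketch of querying the HaveIBeenPwned v3 API listed above. It assumes you already have an API key and that the v3 endpoint and `hibp-api-key` header are still current — check the official docs before relying on it.

# Minimal sketch: query HaveIBeenPwned v3 for breaches tied to an account.
import requests

API_KEY = "YOUR_HIBP_API_KEY"  # placeholder
account = "test@example.com"

resp = requests.get(
    f"https://haveibeenpwned.com/api/v3/breachedaccount/{account}",
    headers={"hibp-api-key": API_KEY, "user-agent": "osint-api-demo"},
    params={"truncateResponse": "false"},  # include full breach details
    timeout=10,
)
if resp.status_code == 404:
    print("No breaches found for this account")
else:
    resp.raise_for_status()
    for breach in resp.json():
        print(breach["Name"], breach.get("BreachDate"))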
Name | Link | Description | Price |
---|---|---|---|
Wayback Machine API (Memento API, CDX Server API, Wayback Availability JSON API) | https://archive.org/help/wayback_api.php | Retrieve information about Wayback capture data | FREE |
TROVE (Australian Web Archive) API | https://trove.nla.gov.au/about/create-something/using-api | Retrieve information about TROVE capture data | FREE |
Archive-it API | https://support.archive-it.org/hc/en-us/articles/115001790023-Access-Archive-It-s-Wayback-index-with-the-CDX-C-API | Retrieve information about archive-it capture data | FREE |
UK Web Archive API | https://ukwa-manage.readthedocs.io/en/latest/#api-reference | Retrieve information about UK Web Archive capture data | FREE |
Arquivo.pt API | https://github.com/arquivo/pwa-technologies/wiki/Arquivo.pt-API | Allows full-text search and access preserved web content and related metadata. It is also possible to search by URL, accessing all versions of preserved web content. API returns a JSON object. | FREE |
Library Of Congress archive API | https://www.loc.gov/apis/ | Provides structured data about Library of Congress collections | FREE |
BotsArchive | https://botsarchive.com/docs.html | JSON formatted details about Telegram Bots available in database | FREE |
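As an illustration, a short Python sketch using the Wayback Availability JSON API mentioned above (no API key required). The `archived_snapshots`/`closest` fields follow the endpoint's documented response format.

# Minimal sketch: check whether a URL has an archived snapshot in the Wayback Machine.
import requests

resp = requests.get(
    "https://archive.org/wayback/available",
    params={"url": "cybdetective.com", "timestamp": "20200101"},
    timeout=10,
)
snapshot = resp.json().get("archived_snapshots", {}).get("closest")
if snapshot:
    print(snapshot["url"], snapshot["timestamp"])
else:
    print("No snapshot found")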
Name | Link | Description | Price |
---|---|---|---|
MD5 Decrypt | https://md5decrypt.net/en/Api/ | Search for decrypted hashes in the database | €1.99/day |
Name | Link | Description | Price |
---|---|---|---|
BTC.com | https://btc.com/btc/adapter?type=api-doc | Get information about addresses and transactions | FREE |
Blockchair | https://blockchair.com | Explore data stored on 17 blockchains (BTC, ETH, Cardano, Ripple etc) | $0.33 - $1 per 1000 calls |
Bitcoinabuse | https://www.bitcoinabuse.com/api-docs | Lookup bitcoin addresses that have been linked to criminal activity | FREE |
Bitcoinwhoswho | https://www.bitcoinwhoswho.com/api | Scam reports on the Bitcoin Address | FREE |
Etherscan | https://etherscan.io/apis | Ethereum explorer API | FREE |
apilayer coinlayer | https://coinlayer.com | Real-time Crypto Currency Exchange Rates | FREE |
BlockFacts | https://blockfacts.io/ | Real-time crypto data from multiple exchanges via a single unified API, and much more | FREE |
Brave NewCoin | https://bravenewcoin.com/developers | Real-time and historic crypto data from more than 200+ exchanges | FREE |
WorldCoinIndex | https://www.worldcoinindex.com/apiservice | Cryptocurrencies Prices | FREE |
WalletLabels | https://www.walletlabels.xyz/docs | Labels for 7.5 million Ethereum wallets | FREE |
Name | Link | Description | Price |
---|---|---|---|
VirusTotal | https://developers.virustotal.com/reference | Analyze files and URLs | Public API is FREE |
AbuseIPDB | https://docs.abuseipdb.com/#introduction | IP/domain/URL reputation | FREE |
AlienVault Open Threat Exchange (OTX) | https://otx.alienvault.com/api | IP/domain/URL reputation | FREE |
Phisherman | https://phisherman.gg | IP/domain/URL reputation | FREE |
URLScan.io | https://urlscan.io/about-api/ | Scan and Analyse URLs | FREE |
Web of Trust | https://support.mywot.com/hc/en-us/sections/360004477734-API- | IP/domain/URL reputation | FREE |
Threat Jammer | https://threatjammer.com/docs/introduction-threat-jammer-user-api | IP/domain/URL reputation | ??? |
Name | Link | Description | Price |
---|---|---|---|
Search4faces | https://search4faces.com/api.html | Search for people in social networks by facial image | $21 per 1000 requests |
## Face Detection
Name | Link | Description | Price |
---|---|---|---|
Face++ | https://www.faceplusplus.com/face-detection/ | Detect and locate human faces within an image and return high-precision face bounding boxes; also allows you to store metadata of each detected face for future use | from $0.03 per call |
BetaFace | https://www.betafaceapi.com/wpa/ | Can scan uploaded image files or image URLs, find faces and analyze them. API also provides verification (faces comparison) and identification (faces search) services, as well as maintaining multiple user-defined recognition databases (namespaces) | 50 images per day FREE/from 0.15 EUR per request |
## Reverse Image Search
Name | Link | Description | Price |
---|---|---|---|
Google Reverse images search API | https://github.com/SOME-1HING/google-reverse-image-api/ | This is a simple API built using Node.js and Express.js that allows you to perform Google Reverse Image Search by providing an image URL. | FREE (UNOFFICIAL) |
TinEyeAPI | https://services.tineye.com/TinEyeAPI | Verify images, Moderate user-generated content, Track images and brands, Check copyright compliance, Deploy fraud detection solutions, Identify stock photos, Confirm the uniqueness of an image | Start from $200/5000 searches |
Bing Images Search API | https://www.microsoft.com/en-us/bing/apis/bing-image-search-api | With Bing Image Search API v7, help users scour the web for images. Results include thumbnails, full image URLs, publishing website info, image metadata, and more. | 1,000 requests per month FREE |
MRISA | https://github.com/vivithemage/mrisa | MRISA (Meta Reverse Image Search API) is a RESTful API which takes an image URL, does a reverse Google image search, and returns a JSON array with the search results | FREE (unofficial) |
PicImageSearch | https://github.com/kitUIN/PicImageSearch | Aggregator for different Reverse Image Search APIs | FREE (unofficial) |
## AI Geolocation
Name | Link | Description | Price |
---|---|---|---|
Geospy | https://api.geospy.ai/ | Estimate the location of an uploaded photo | Access by request |
Picarta | https://picarta.ai/api | Estimate the location of an uploaded photo | 100 requests/day FREE |
Name | Link | Description | Price |
---|---|---|---|
Twitch | https://dev.twitch.tv/docs/v5/reference | ||
YouTube Data API | https://developers.google.com/youtube/v3 | ||
Reddit | https://www.reddit.com/dev/api/ | ||
Vkontakte | https://vk.com/dev/methods | ||
Twitter API | https://developer.twitter.com/en | ||
Linkedin API | https://docs.microsoft.com/en-us/linkedin/ | ||
All Facebook and Instagram API | https://developers.facebook.com/docs/ | ||
Whatsapp Business API | https://www.whatsapp.com/business/api | ||
Telegram and Telegram Bot API | https://core.telegram.org | ||
Weibo API | https://open.weibo.com/wiki/API文档/en | ||
Xing | https://dev.xing.com/partners/job_integration/api_docs | ||
Viber | https://developers.viber.com/docs/api/rest-bot-api/ | ||
Discord | https://discord.com/developers/docs | ||
Odnoklassniki | https://ok.ru/apiok | ||
Blogger | https://developers.google.com/blogger/ | The Blogger APIs allows client applications to view and update Blogger content | FREE |
Disqus | https://disqus.com/api/docs/auth/ | Communicate with Disqus data | FREE |
Foursquare | https://developer.foursquare.com/ | Interact with Foursquare users and places (geolocation-based checkins, photos, tips, events, etc) | FREE |
HackerNews | https://github.com/HackerNews/API | Social news for CS and entrepreneurship | FREE |
Kakao | https://developers.kakao.com/ | Kakao Login, Share on KakaoTalk, Social Plugins and more | FREE |
Line | https://developers.line.biz/ | Line Login, Share on Line, Social Plugins and more | FREE |
TikTok | https://developers.tiktok.com/doc/login-kit-web | Fetches user info and user's video posts on TikTok platform | FREE |
Tumblr | https://www.tumblr.com/docs/en/api/v2 | Read and write Tumblr Data | FREE |
[!WARNING] Use with caution! Accounts may be blocked permanently for using unofficial APIs.
Name | Link | Description | Price |
---|---|---|---|
TikTok | https://github.com/davidteather/TikTok-Api | The Unofficial TikTok API Wrapper In Python | FREE |
Google Trends | https://github.com/suryasev/unofficial-google-trends-api | Unofficial Google Trends API | FREE |
YouTube Music | https://github.com/sigma67/ytmusicapi | Unofficial API for YouTube Music | FREE |
Duolingo | https://github.com/KartikTalwar/Duolingo | Duolingo unofficial API (can gather info about users) | FREE |
Steam | https://github.com/smiley/steamapi | An unofficial object-oriented Python library for accessing the Steam Web API. | FREE |
Instagram | https://github.com/ping/instagram_private_api | Instagram Private API | FREE |
Discord | https://github.com/discordjs/discord.js | JavaScript library for interacting with the Discord API | FREE |
Zhihu | https://github.com/syaning/zhihu-api | FREE Unofficial API for Zhihu | FREE |
Quora | https://github.com/csu/quora-api | Unofficial API for Quora | FREE |
DNSDumpster | https://github.com/PaulSec/API-dnsdumpster.com | (Unofficial) Python API for DNSDumpster | FREE |
PornHub | https://github.com/sskender/pornhub-api | Unofficial API for PornHub in Python | FREE |
Skype | https://github.com/ShyykoSerhiy/skyweb | Unofficial Skype API for nodejs via 'Skype (HTTP)' protocol. | FREE |
Google Search | https://github.com/aviaryan/python-gsearch | Google Search unofficial API for Python with no external dependencies | FREE |
Airbnb | https://github.com/nderkach/airbnb-python | Python wrapper around the Airbnb API (unofficial) | FREE |
Medium | https://github.com/enginebai/PyMedium | Unofficial Medium Python Flask API and SDK | FREE |
Facebook | https://github.com/davidyen1124/Facebot | Powerful unofficial Facebook API | FREE |
LinkedIn | https://github.com/tomquirk/linkedin-api | Unofficial LinkedIn API for Python | FREE |
Y2mate | https://github.com/Simatwa/y2mate-api | Unofficial Y2mate API for Python | FREE |
Livescore | https://github.com/Simatwa/livescore-api | Unofficial Livescore API for Python | FREE |
Name | Link | Description | Price |
---|---|---|---|
Google Custom Search JSON API | https://developers.google.com/custom-search/v1/overview | Search in Google | 100 requests FREE |
Serpstack | https://serpstack.com/ | Google search results to JSON | FREE |
Serpapi | https://serpapi.com | Google, Baidu, Yandex, Yahoo, DuckDuckGo, Bing and many other search engines' results | $50/5000 searches/month |
Bing Web Search API | https://www.microsoft.com/en-us/bing/apis/bing-web-search-api | Search in Bing (+instant answers and location) | 1000 transactions per month FREE |
WolframAlpha API | https://products.wolframalpha.com/api/pricing/ | Short answers, conversations, calculators and many more | from $25 per 1000 queries |
DuckDuckgo Instant Answers API | https://duckduckgo.com/api | An API for some of our Instant Answers, not for full search results. | FREE |
Memex Marginalia | https://memex.marginalia.nu/projects/edge/api.gmi | An API for a new privacy-focused search engine | FREE |
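As a quick illustration, here is a hedged Python sketch of querying the Google Custom Search JSON API listed above. It assumes you have an API key and a Programmable Search Engine ID (`cx`); both values below are placeholders.

# Minimal sketch: search with the Google Custom Search JSON API and print results.
import requests

resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": "YOUR_API_KEY", "cx": "YOUR_SEARCH_ENGINE_ID", "q": "osint"},
    timeout=10,
)
resp.raise_for_status()
for item in resp.json().get("items", []):
    print(item["title"], item["link"])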
Name | Link | Description | Price |
---|---|---|---|
MediaStack | https://mediastack.com/ | News articles search results in JSON | 500 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Darksearch.io | https://darksearch.io/apidoc | Search websites in the .onion zone | FREE |
Onion Lookup | https://onion.ail-project.org/ | onion-lookup is a service for checking the existence of Tor hidden services and retrieving their associated metadata. onion-lookup relies on a private AIL instance to obtain the metadata | FREE |
Name | Link | Description | Price |
---|---|---|---|
Jackett | https://github.com/Jackett/Jackett | API for automate searching in different torrent trackers | FREE |
Torrents API PY | https://github.com/Jackett/Jackett | Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galaxy, Zooqle, Kickass, Bitsearch, MagnetDL,Libgen, YTS, Limetorrent, TorrentFunk, Glodls, Torre | FREE |
Torrent Search API | https://github.com/Jackett/Jackett | API for Torrent Search Engine with Extratorrents, Piratebay, and ISOhunt | 500 queries/day FREE |
Torrent search api | https://github.com/JimmyLaurent/torrent-search-api | Yet another node torrent scraper (supports iptorrents, torrentleech, torrent9, torrentz2, 1337x, thepiratebay, Yggtorrent, TorrentProject, Eztv, Yts, LimeTorrents) | FREE |
Torrentinim | https://github.com/sergiotapia/torrentinim | Very low memory-footprint, self hosted API-only torrent search engine. Sonarr + Radarr Compatible, native support for Linux, Mac and Windows. | FREE |
Name | Link | Description | Price |
---|---|---|---|
National Vulnerability Database CVE Search API | https://nvd.nist.gov/developers/vulnerabilities | Get basic information about CVE and CVE history | FREE |
OpenCVE API | https://docs.opencve.io/api/cve/ | Get basic information about CVE | FREE |
CVEDetails API | https://www.cvedetails.com/documentation/apis | Get basic information about CVE | partly FREE (?) |
CVESearch API | https://docs.cvesearch.com/ | Get basic information about CVE | by request |
KEVin API | https://kevin.gtfkd.com/ | API for accessing CISA's Known Exploited Vulnerabilities Catalog (KEV) and CVE Data | FREE |
Vulners.com API | https://vulners.com | Get basic information about CVE | FREE for personal use |
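For example, a minimal Python sketch of fetching one CVE from the NVD CVE API 2.0 listed above. No API key is required for low request volumes; the field names follow my reading of the 2.0 response schema, so double-check them against the official docs.

# Minimal sketch: look up a single CVE in the NVD CVE API 2.0.
import requests

resp = requests.get(
    "https://services.nvd.nist.gov/rest/json/cves/2.0",
    params={"cveId": "CVE-2021-44228"},
    timeout=30,
)
resp.raise_for_status()
for item in resp.json().get("vulnerabilities", []):
    cve = item["cve"]
    english = [d["value"] for d in cve["descriptions"] if d["lang"] == "en"]
    print(cve["id"], english[0][:120] if english else "")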
Name | Link | Description | Price |
---|---|---|---|
Aviation Stack | https://aviationstack.com | Get information about flights, aircraft and airlines | FREE |
OpenSky Network | https://opensky-network.org/apidoc/index.html | Free real-time ADS-B aviation data | FREE |
AviationAPI | https://docs.aviationapi.com/ | FAA Aeronautical Charts and Publications, Airport Information, and Airport Weather | FREE |
FachaAPI | https://api.facha.dev | Aircraft details and live positioning API | FREE |
Name | Link | Description | Price |
---|---|---|---|
Windy Webcams API | https://api.windy.com/webcams/docs | Get a list of available webcams for a country, city or geographical coordinates | FREE with limits or €9,990 without limits |
## Regex
Name | Link | Description | Price |
---|---|---|---|
Autoregex | https://autoregex.notion.site/AutoRegex-API-Documentation-97256bad2c114a6db0c5822860214d3a | Convert English phrase to regular expression | from $3.49/month |
Name | Link |
---|---|
API Guessr (detect API by auth key or by token) | https://api-guesser.netlify.app/ |
REQBIN Online REST & SOAP API Testing Tool | https://reqbin.com |
ExtendClass Online REST Client | https://extendsclass.com/rest-client-online.html |
Codebeautify.org Online API Test | https://codebeautify.org/api-test
SyncWith Google Sheet add-on. Link more than 1000 APIs with Spreadsheet | https://workspace.google.com/u/0/marketplace/app/syncwith_crypto_binance_coingecko_airbox/449644239211?hl=ru&pann=sheets_addon_widget |
Talend API Tester Google Chrome Extension | https://workspace.google.com/u/0/marketplace/app/syncwith_crypto_binance_coingecko_airbox/449644239211?hl=ru&pann=sheets_addon_widget |
Michael Bazzell's API search tools | https://inteltechniques.com/tools/API.html
Name | Link |
---|---|
Convert curl commands to Python, JavaScript, PHP, R, Go, C#, Ruby, Rust, Elixir, Java, MATLAB, Dart, CFML, Ansible URI or JSON | https://curlconverter.com |
Curl-to-PHP. Instantly convert curl commands to PHP code | https://incarnate.github.io/curl-to-php/ |
Curl to PHP online (Codebeatify) | https://codebeautify.org/curl-to-php-online |
Curl to JavaScript fetch | https://kigiri.github.io/fetch/ |
Curl to JavaScript fetch (Scrapingbee) | https://www.scrapingbee.com/curl-converter/javascript-fetch/ |
Curl to C# converter | https://curl.olsh.me |
Name | Link |
---|---|
Sheety. Create an API from a Google Sheet | https://sheety.co/
Postman. Platform for creating your own API | https://www.postman.com
Retool. REST API Generator | https://retool.com/api-generator/
Beeceptor. Rest API mocking and intercepting in seconds (no coding). | https://beeceptor.com |
Name | Link |
---|---|
RapidAPI. Market your API for millions of developers | https://rapidapi.com/solution/api-provider/ |
Apilayer. API Marketplace | https://apilayer.com |
Name | Link | Description |
---|---|---|
Keyhacks | https://github.com/streaak/keyhacks | Keyhacks is a repository which shows quick ways in which API keys leaked by a bug bounty program can be checked to see if they're valid. |
All about APIKey | https://github.com/daffainfo/all-about-apikey | Detailed information about API key / OAuth token for different services (Description, Request, Response, Regex, Example) |
API Guessr | https://api-guesser.netlify.app/ | Enter an API key and find out which service it belongs to
Name | Link | Description |
---|---|---|
APIDOG ApiHub | https://apidog.com/apihub/ | |
Rapid APIs collection | https://rapidapi.com/collections | |
API Ninjas | https://api-ninjas.com/api | |
APIs Guru | https://apis.guru/ | |
APIs List | https://apislist.com/ | |
API Context Directory | https://apicontext.com/api-directory/ | |
Any API | https://any-api.com/ | |
Public APIs Github repo | https://github.com/public-apis/public-apis |
If you don't know how to work with REST APIs, I recommend checking out the Netlas API guide I wrote for Netlas.io.
It briefly and accessibly explains how to automate requests in different programming languages (with a focus on Python and Bash) and how to process the resulting JSON data.
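As a quick illustration of that workflow, here is a minimal Python sketch (using the requests library) that queries the free IP API service listed above and processes the JSON response. The field names follow ip-api.com's documented JSON output.

# Minimal sketch: query a free REST API and process the JSON result.
import requests

resp = requests.get("http://ip-api.com/json/8.8.8.8", timeout=10)
resp.raise_for_status()
data = resp.json()
print(data.get("country"), data.get("city"), data.get("isp"))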
Thank you for following me! https://cybdetective.com
Dealing with failing web scrapers due to anti-bot protections or website changes? Meet Scrapling.
Scrapling is a high-performance, intelligent web scraping library for Python that automatically adapts to website changes while significantly outperforming popular alternatives. For both beginners and experts, Scrapling provides powerful features while maintaining simplicity.
>> from scrapling.defaults import Fetcher, AsyncFetcher, StealthyFetcher, PlayWrightFetcher
# Fetch websites' source under the radar!
>> page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
>> print(page.status)
200
>> products = page.css('.product', auto_save=True) # Scrape data that survives website design changes!
>> # Later, if the website structure changes, pass `auto_match=True`
>> products = page.css('.product', auto_match=True) # and Scrapling still finds them!
- Make simple HTTP requests with the `Fetcher` class.
- Fetch pages with the `PlayWrightFetcher` class through your real browser, Scrapling's stealth mode, Playwright's Chrome browser, or NSTbrowser's browserless!
- Bypass anti-bot protections with the `StealthyFetcher` and `PlayWrightFetcher` classes.

from scrapling.fetchers import Fetcher
fetcher = Fetcher(auto_match=False)
# Do http GET request to a web page and create an Adaptor instance
page = fetcher.get('https://quotes.toscrape.com/', stealthy_headers=True)
# Get all text content from all HTML tags in the page except `script` and `style` tags
page.get_all_text(ignore_tags=('script', 'style'))
# Get all quotes elements, any of these methods will return a list of strings directly (TextHandlers)
quotes = page.css('.quote .text::text') # CSS selector
quotes = page.xpath('//span[@class="text"]/text()') # XPath
quotes = page.css('.quote').css('.text::text') # Chained selectors
quotes = [element.text for element in page.css('.quote .text')] # Slower than bulk query above
# Get the first quote element
quote = page.css_first('.quote') # same as page.css('.quote').first or page.css('.quote')[0]
# Tired of selectors? Use find_all/find
# Get all 'div' HTML tags that one of its 'class' values is 'quote'
quotes = page.find_all('div', {'class': 'quote'})
# Same as
quotes = page.find_all('div', class_='quote')
quotes = page.find_all(['div'], class_='quote')
quotes = page.find_all(class_='quote') # and so on...
# Working with elements
quote.html_content # Get Inner HTML of this element
quote.prettify() # Prettified version of Inner HTML above
quote.attrib # Get that element's attributes
quote.path # DOM path to element (List of all ancestors from <html> tag till the element itself)
To keep it simple, all methods can be chained on top of each other!
Scrapling isn't just powerful - it's also blazing fast. Scrapling implements many best practices, design patterns, and numerous optimizations to save fractions of seconds. All of that while focusing exclusively on parsing HTML documents. Here are benchmarks comparing Scrapling to popular Python libraries in two tests.
# | Library | Time (ms) | vs Scrapling |
---|---|---|---|
1 | Scrapling | 5.44 | 1.0x |
2 | Parsel/Scrapy | 5.53 | 1.017x |
3 | Raw Lxml | 6.76 | 1.243x |
4 | PyQuery | 21.96 | 4.037x |
5 | Selectolax | 67.12 | 12.338x |
6 | BS4 with Lxml | 1307.03 | 240.263x |
7 | MechanicalSoup | 1322.64 | 243.132x |
8 | BS4 with html5lib | 3373.75 | 620.175x |
As you can see, Scrapling is on par with Scrapy and slightly faster than lxml, which both libraries are built on top of. These are the closest results to Scrapling. PyQuery is also built on top of lxml, but Scrapling is still 4 times faster.
Library | Time (ms) | vs Scrapling |
---|---|---|
Scrapling | 2.51 | 1.0x |
AutoScraper | 11.41 | 4.546x |
Scrapling can find elements with more methods and it returns full element `Adaptor` objects, not only the text like AutoScraper. So, to make this test fair, both libraries extract an element by text, find similar elements, and then extract the text content for all of them. As you can see, Scrapling is still 4.5 times faster at the same task.
All benchmarks' results are an average of 100 runs. See our benchmarks.py for methodology and to run your comparisons.
Scrapling is a breeze to get started with; starting from version 0.2.9, it requires at least Python 3.9 to work.
pip3 install scrapling
Then run this command to install the browser dependencies needed to use the Fetcher classes:
scrapling install
If you have any installation issues, please open an issue.
Fetchers are interfaces built on top of other libraries with added features that make requests or fetch pages for you in a single-request fashion and then return an `Adaptor` object. This feature was introduced because the only option we had before was to fetch the page however you wanted, then pass it manually to the `Adaptor` class to create an `Adaptor` instance and start playing around with the page.
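To make the difference concrete, here is a rough sketch showing both approaches side by side; requests is used here only as an example HTTP client, and the Scrapling calls are the same ones shown elsewhere in this document.

# The "old" way: fetch the page yourself, then build the Adaptor manually.
import requests
from scrapling.parser import Adaptor
from scrapling.fetchers import Fetcher

html = requests.get('https://quotes.toscrape.com/').text
page = Adaptor(html, url='https://quotes.toscrape.com/')

# The Fetcher way: one call does the request and returns the parsed page for you.
page = Fetcher().get('https://quotes.toscrape.com/')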
You might be slightly confused by now so let me clear things up. All fetcher-type classes are imported in the same way
from scrapling.fetchers import Fetcher, StealthyFetcher, PlayWrightFetcher
All of them can take these initialization arguments: `auto_match`, `huge_tree`, `keep_comments`, `keep_cdata`, `storage`, and `storage_args`, which are the same ones you give to the `Adaptor` class.
If you don't want to pass arguments to the generated `Adaptor` object and want to use the default values, you can use this import instead for cleaner code:
from scrapling.defaults import Fetcher, AsyncFetcher, StealthyFetcher, PlayWrightFetcher
then use it right away without initializing like:
page = StealthyFetcher.fetch('https://example.com')
Also, the `Response` object returned from all fetchers is the same as the `Adaptor` object except it has these added attributes: `status`, `reason`, `cookies`, `headers`, `history`, and `request_headers`. All `cookies`, `headers`, and `request_headers` are always of type `dictionary`.
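A small illustration of these attributes in practice; the attribute names are taken from the list above.

from scrapling.fetchers import Fetcher

page = Fetcher().get('https://httpbin.org/get')
print(page.status)           # e.g. 200
print(page.reason)           # e.g. 'OK'
print(page.headers)          # response headers as a dictionary
print(page.request_headers)  # the headers Scrapling sent with the request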
[!NOTE] The `auto_match` argument is enabled by default, and it is the one you should care about the most, as you will see later.
This class is built on top of httpx with additional configuration options; here you can do `GET`, `POST`, `PUT`, and `DELETE` requests.
For all methods, you have the `stealthy_headers` argument, which makes `Fetcher` create and use real browser headers, then create a referer header as if the request came from Google's search results for this URL's domain. It's enabled by default. You can also set the number of retries with the `retries` argument for all methods, which makes httpx retry requests if they fail for any reason. The default number of retries for all `Fetcher` methods is 3.
Note: all headers generated by the `stealthy_headers` argument can be overwritten by you through the `headers` argument.
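For example, a hedged sketch combining the `retries` argument and a header override as described above (the parameter names come from the text above, so verify them against the library's docs):

from scrapling.fetchers import Fetcher

page = Fetcher().get(
    'https://httpbin.org/headers',
    stealthy_headers=True,                      # generate realistic browser headers
    headers={'User-Agent': 'my-custom-agent'},  # ...but this value overrides the generated one
    retries=3,                                  # retry failed requests (3 is the stated default)
)
print(page.status)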
You can route all traffic (HTTP and HTTPS) to a proxy for any of these methods in this format http://username:password@localhost:8030
>> page = Fetcher().get('https://httpbin.org/get', stealthy_headers=True, follow_redirects=True)
>> page = Fetcher().post('https://httpbin.org/post', data={'key': 'value'}, proxy='http://username:password@localhost:8030')
>> page = Fetcher().put('https://httpbin.org/put', data={'key': 'value'})
>> page = Fetcher().delete('https://httpbin.org/delete')
For Async requests, you will just replace the import like below:
>> from scrapling.fetchers import AsyncFetcher
>> page = await AsyncFetcher().get('https://httpbin.org/get', stealthy_headers=True, follow_redirects=True)
>> page = await AsyncFetcher().post('https://httpbin.org/post', data={'key': 'value'}, proxy='http://username:password@localhost:8030')
>> page = await AsyncFetcher().put('https://httpbin.org/put', data={'key': 'value'})
>> page = await AsyncFetcher().delete('https://httpbin.org/delete')
This class is built on top of Camoufox, bypassing most anti-bot protections by default. Scrapling adds extra layers of flavors and configurations to increase performance and undetectability even further.
>> page = StealthyFetcher().fetch('https://www.browserscan.net/bot-detection') # Running headless by default
>> page.status == 200
True
>> page = await StealthyFetcher().async_fetch('https://www.browserscan.net/bot-detection') # the async version of fetch
>> page.status == 200
True
Note: all requests done by this fetcher are waiting by default for all JS to be fully loaded and executed so you don't have to :)
This list isn't final so expect a lot more additions and flexibility to be added in the next versions!
This class is built on top of Playwright which currently provides 4 main run options but they can be mixed as you want.
>> page = PlayWrightFetcher().fetch('https://www.google.com/search?q=%22Scrapling%22', disable_resources=True) # Vanilla Playwright option
>> page.css_first("#search a::attr(href)")
'https://github.com/D4Vinci/Scrapling'
>> page = await PlayWrightFetcher().async_fetch('https://www.google.com/search?q=%22Scrapling%22', disable_resources=True) # the async version of fetch
>> page.css_first("#search a::attr(href)")
'https://github.com/D4Vinci/Scrapling'
Note: all requests done by this fetcher are waiting by default for all JS to be fully loaded and executed so you don't have to :)
Using this Fetcher class, you can make requests with:
1. Vanilla Playwright without any modifications other than the ones you chose.
2. Stealthy Playwright with the stealth mode I wrote for it. It's still a WIP, but it bypasses many online tests like Sannysoft's. Some of the things this fetcher's stealth mode does include:
   - Patching the CDP runtime fingerprint.
   - Mimicking some of the real browsers' properties by injecting several JS files and using custom options.
   - Using custom flags on launch to hide Playwright even more and make it faster.
   - Generating real browser headers of the same type and same user OS, then appending them to the request's headers.
3. Real browsers, by passing the `real_chrome` argument or the CDP URL of your browser to be controlled by the Fetcher; most of the options can be enabled on it.
4. NSTBrowser's docker browserless option, by passing the CDP URL and enabling the `nstbrowser_mode` option.
Note that using the `real_chrome` argument requires that you have the Chrome browser installed on your device.
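For illustration, a rough, untested sketch of run option 3 using the `real_chrome` argument mentioned above:

from scrapling.fetchers import PlayWrightFetcher

page = PlayWrightFetcher().fetch(
    'https://www.browserscan.net/bot-detection',
    real_chrome=True,  # drive the locally installed Chrome instead of Playwright's bundled browser
)
print(page.status)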
Add that to a lot of controlling/hiding options as you will see in the arguments list below.
This list isn't final so expect a lot more additions and flexibility to be added in the next versions!
>>> quote.tag
'div'
>>> quote.parent
<data='<div class="col-md-8"> <div class="quote...' parent='<div class="row"> <div class="col-md-8">...'>
>>> quote.parent.tag
'div'
>>> quote.children
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<span>by <small class="author" itemprop=...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<div class="tags"> Tags: <meta class="ke...' parent='<div class="quote" itemscope itemtype="h...'>]
>>> quote.siblings
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
>>> quote.next # gets the next element, the same logic applies to `quote.previous`
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>
>>> quote.children.css_first(".author::text")
'Albert Einstein'
>>> quote.has_class('quote')
True
# Generate new selectors for any element
>>> quote.generate_css_selector
'body > div > div:nth-of-type(2) > div > div'
# Test these selectors on your favorite browser or reuse them again in the library's methods!
>>> quote.generate_xpath_selector
'//body/div/div[2]/div/div'
If your case needs more than the element's parent, you can iterate over the whole ancestors' tree of any element like below
for ancestor in quote.iterancestors():
    # do something with it...
You can search for a specific ancestor of an element that satisfies a function; all you need to do is pass a function that takes an `Adaptor` object as an argument and returns `True` if the condition is satisfied or `False` otherwise, like below:
>>> quote.find_ancestor(lambda ancestor: ancestor.has_class('row'))
<data='<div class="row"> <div class="col-md-8">...' parent='<div class="container"> <div class="row...'>
You can select elements by their text content in multiple ways, here's a full example on another website:
>>> page = Fetcher().get('https://books.toscrape.com/index.html')
>>> page.find_by_text('Tipping the Velvet') # Find the first element whose text fully matches this text
<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>
>>> page.urljoin(page.find_by_text('Tipping the Velvet').attrib['href']) # We use `page.urljoin` to return the full URL from the relative `href`
'https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html'
>>> page.find_by_text('Tipping the Velvet', first_match=False) # Get all matches if there are more
[<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>]
>>> page.find_by_regex(r'£[\d\.]+') # Get the first element that its text content matches my price regex
<data='<p class="price_color">£51.77</p>' parent='<div class="product_price"> <p class="pr...'>
>>> page.find_by_regex(r'£[\d\.]+', first_match=False) # Get all elements that matches my price regex
[<data='<p class="price_color">£51.77</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£53.74</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£50.10</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£47.82</p>' parent='<div class="product_price"> <p class="pr...'>,
...]
Find all elements that are similar to the current element in location and attributes
# For this case, ignore the 'title' attribute while matching
>>> page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title'])
[<data='<a href="catalogue/a-light-in-the-attic_...' parent='<h3><a href="catalogue/a-light-in-the-at...'>,
<data='<a href="catalogue/soumission_998/index....' parent='<h3><a href="catalogue/soumission_998/in...'>,
<data='<a href="catalogue/sharp-objects_997/ind...' parent='<h3><a href="catalogue/sharp-objects_997...'>,
...]
# You will notice that the number of elements is 19 not 20 because the current element is not included.
>>> len(page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title']))
19
# Get the `href` attribute from all similar elements
>>> [element.attrib['href'] for element in page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title'])]
['catalogue/a-light-in-the-attic_1000/index.html',
'catalogue/soumission_998/index.html',
'catalogue/sharp-objects_997/index.html',
...]
To increase the complexity a little bit, let's say we want to get all books' data using that element as a starting point for some reason
>>> for product in page.find_by_text('Tipping the Velvet').parent.parent.find_similar():
...     print({
...         "name": product.css_first('h3 a::text'),
...         "price": product.css_first('.price_color').re_first(r'[\d\.]+'),
...         "stock": product.css('.availability::text')[-1].clean()
...     })
{'name': 'A Light in the ...', 'price': '51.77', 'stock': 'In stock'}
{'name': 'Soumission', 'price': '50.10', 'stock': 'In stock'}
{'name': 'Sharp Objects', 'price': '47.82', 'stock': 'In stock'}
...
The documentation will provide more advanced examples.
Let's say you are scraping a page with a structure like this:
<div class="container">
<section class="products">
<article class="product" id="p1">
<h3>Product 1</h3>
<p class="description">Description 1</p>
</article>
<article class="product" id="p2">
<h3>Product 2</h3>
<p class="description">Description 2</p>
</article>
</section>
</div>
And you want to scrape the first product, the one with the `p1` ID. You will probably write a selector like this:
page.css('#p1')
When website owners implement structural changes like
<div class="new-container">
<div class="product-wrapper">
<section class="products">
<article class="product new-class" data-id="p1">
<div class="product-info">
<h3>Product 1</h3>
<p class="new-description">Description 1</p>
</div>
</article>
<article class="product new-class" data-id="p2">
<div class="product-info">
<h3>Product 2</h3>
<p class="new-description">Description 2</p>
</div>
</article>
</section>
</div>
</div>
The selector will no longer function and your code needs maintenance. That's where Scrapling's auto-matching feature comes into play.
from scrapling.parser import Adaptor
# Before the change
page = Adaptor(page_source, url='example.com')
element = page.css('#p1', auto_save=True)
if not element:  # One day website changes?
    element = page.css('#p1', auto_match=True)  # Scrapling still finds it!
# the rest of the code...
How does the auto-matching work? Check the FAQs section for that and other possible issues while auto-matching.
Let's use a real website as an example and use one of the fetchers to fetch its source. To do this, we need to find a website that will change its design/structure soon, take a copy of its source, and then wait for the website to make the change. Of course, that's nearly impossible to know unless I know the website's owner, but that would make it a staged test haha.
To solve this issue, I will use The Web Archive's Wayback Machine. Here is a copy of StackOverflow's website from 2010, pretty old huh? Let's test if the auto-match feature can extract the same button in the old design from 2010 and in the current design using the same selector :)
If I want to extract the Questions button from the old design, I can use a selector like this: `#hmenus > div:nth-child(1) > ul > li:nth-child(1) > a`
This selector is too specific because it was generated by Google Chrome. Now let's test the same selector in both versions
>> from scrapling.fetchers import Fetcher
>> selector = '#hmenus > div:nth-child(1) > ul > li:nth-child(1) > a'
>> old_url = "https://web.archive.org/web/20100102003420/http://stackoverflow.com/"
>> new_url = "https://stackoverflow.com/"
>>
>> page = Fetcher(automatch_domain='stackoverflow.com').get(old_url, timeout=30)
>> element1 = page.css_first(selector, auto_save=True)
>>
>> # Same selector but used in the updated website
>> page = Fetcher(automatch_domain="stackoverflow.com").get(new_url)
>> element2 = page.css_first(selector, auto_match=True)
>>
>> if element1.text == element2.text:
... print('Scrapling found the same element in the old design and the new design!')
'Scrapling found the same element in the old design and the new design!'
Note that I used a new argument called `automatch_domain`. This is because, for Scrapling, these are two different URLs, not the same website, so it isolates their data. To tell Scrapling they are the same website, we pass the domain we want to use for saving auto-match data for both, so Scrapling doesn't isolate them.
In a real-world scenario, the code will be the same except it will use the same URL for both requests, so you won't need to use the `automatch_domain` argument. This is the closest example I can give to real-world cases, so I hope it didn't confuse you :)
Notes:
1. For the two examples above, I used the `Adaptor` class one time and the `Fetcher` class the second time, just to show you that you can create the `Adaptor` object yourself if you have the source, or fetch the source using any `Fetcher` class and it will create the `Adaptor` object for you.
2. Passing the `auto_save` argument with the `auto_match` argument set to `False` while initializing the Adaptor/Fetcher object will only result in ignoring the `auto_save` argument value and the following warning message: Argument `auto_save` will be ignored because `auto_match` wasn't enabled on initialization. Check docs for more info. This behavior is purely for performance reasons, so the database gets created/connected only when you are planning to use the auto-matching features. The same applies to the `auto_match` argument.
The `auto_match` parameter works only for `Adaptor` instances, not `Adaptors`, so if you do something like `page.css('body').css('#p1', auto_match=True)` you will get an error, because you can't auto-match a whole list; you have to be specific and do something like `page.css_first('body').css('#p1', auto_match=True)`.
Inspired by BeautifulSoup's `find_all` function, you can find elements by using the `find_all`/`find` methods. Both methods can take multiple types of filters and return all elements in the page that all these filters apply to.
So the way it works is after collecting all passed arguments and keywords, each filter passes its results to the following filter in a waterfall-like filtering system.
It filters all elements in the current page/element in the following order:
Note: The filtering process always starts from the first filter it finds in the filtering order above so if no tag name(s) are passed but attributes are passed, the process starts from that layer and so on. But the order in which you pass the arguments doesn't matter.
Examples to clear any confusion :)
>> from scrapling.fetchers import Fetcher
>> page = Fetcher().get('https://quotes.toscrape.com/')
# Find all elements with tag name `div`.
>> page.find_all('div')
[<data='<div class="container"> <div class="row...' parent='<body> <div class="container"> <div clas...'>,
<data='<div class="row header-box"> <div class=...' parent='<div class="container"> <div class="row...'>,
...]
# Find all div elements with a class that equals `quote`.
>> page.find_all('div', class_='quote')
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Same as above.
>> page.find_all('div', {'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Find all elements with a class that equals `quote`.
>> page.find_all({'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Find all div elements with a class that equals `quote`, and contains the element `.text` which contains the word 'world' in its content.
>> page.find_all('div', {'class': 'quote'}, lambda e: "world" in e.css_first('.text::text'))
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>]
# Find all elements that have children.
>> page.find_all(lambda element: len(element.children) > 0)
[<data='<html lang="en"><head><meta charset="UTF...'>,
<data='<head><meta charset="UTF-8"><title>Quote...' parent='<html lang="en"><head><meta charset="UTF...'>,
<data='<body> <div class="container"> <div clas...' parent='<html lang="en"><head><meta charset="UTF...'>,
...]
# Find all elements that contain the word 'world' in its content.
>> page.find_all(lambda element: "world" in element.text)
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<a class="tag" href="/tag/world/page/1/"...' parent='<div class="tags"> Tags: <meta class="ke...'>]
# Find all span elements that match the given regex
>> page.find_all('span', re.compile(r'world'))
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>]
# Find all div and span elements with class 'quote' (No span elements like that so only div returned)
>> page.find_all(['div', 'span'], {'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Mix things up
>> page.find_all({'itemtype':"http://schema.org/CreativeWork"}, 'div').css('.author::text')
['Albert Einstein',
'J.K. Rowling',
...]
Here's what else you can do with Scrapling:
Get the `lxml.etree` object itself of any element directly:
>>> quote._root
<Element div at 0x107f98870>
Saving and retrieving elements manually to auto-match them outside the `css` and `xpath` methods, but you have to set the identifier by yourself.
To save an element to the database:
>>> element = page.find_by_text('Tipping the Velvet', first_match=True)
>>> page.save(element, 'my_special_element')
Then you can retrieve it later and relocate it inside the page:
>>> element_dict = page.retrieve('my_special_element')
>>> page.relocate(element_dict, adaptor_type=True)
[<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>]
>>> page.relocate(element_dict, adaptor_type=True).css('::text')
['Tipping the Velvet']
If you want to keep it as an `lxml.etree` object, leave out the `adaptor_type` argument:
>>> page.relocate(element_dict)
[<Element a at 0x105a2a7b0>]
Filtering results based on a function
# Find all products over $50
expensive_products = page.css('.product_pod').filter(
lambda p: float(p.css('.price_color').re_first(r'[\d\.]+')) > 50
)
# Find all the products with price '54.23'
page.css('.product_pod').search(
lambda p: float(p.css('.price_color').re_first(r'[\d\.]+')) == 54.23
)
Doing operations on element content is the same as Scrapy:
quote.re(r'regex_pattern')        # Get all strings (TextHandlers) that match the regex pattern
quote.re_first(r'regex_pattern')  # Get the first string (TextHandler) only
quote.json()  # If the content text is jsonable, convert it to json using `orjson`, which is 10x faster than the standard json library and provides more options
except that you can do more with them, like:
quote.re(
    r'regex_pattern',
    replace_entities=True,  # Character entity references are replaced by their corresponding character
    clean_match=True,       # Ignore all whitespaces and consecutive spaces while matching
    case_sensitive=False,   # Set the regex to ignore letter case while compiling it
)
Note: all of these methods are methods of the `TextHandler` object that contains the text content, so the same can be done directly if you call the `.text` property or the equivalent selector function.
Doing operations on the text content itself includes:
quote.clean()
The matches returned by these methods are `TextHandler` objects too, so in cases where, for example, you have a JS object assigned to a JS variable inside JS code and want to extract it with regex and then convert it to a JSON object, in other libraries this would take more than one line of code, but here you can do it in one line:
page.xpath('//script/text()').re_first(r'var dataLayer = (.+);').json()
Sort all characters in the string as if it were a list and return the new string:
quote.sort(reverse=False)
To be clear, `TextHandler` is a sub-class of Python's `str`, so all normal operations/methods that work with Python strings will work with it.
Any element's attributes are not exactly a dictionary but a sub-class of mapping called `AttributesHandler` that's read-only, so it's faster, and the string values returned are actually `TextHandler` objects, so all the operations above can be done on them, along with standard dictionary operations that don't modify the data, and more :)
Search an element's attributes by value:
>>> for item in element.attrib.search_values('catalogue', partial=True):
...     print(item)
{'href': 'catalogue/tipping-the-velvet_999/index.html'}
Get the attributes as a JSON string:
>>> element.attrib.json_string
b'{"href":"catalogue/tipping-the-velvet_999/index.html","title":"Tipping the Velvet"}'
Or convert them to a standard dictionary:
>>> dict(element.attrib)
{'href': 'catalogue/tipping-the-velvet_999/index.html', 'title': 'Tipping the Velvet'}
Scrapling is under active development so expect many more features coming soon :)
There are a lot of deep details skipped here to keep this as short as possible, so to take a deep dive, head to the docs section. I will try to keep it as up to date as possible and add complex examples. There I will explain points like how to write your own storage system, how to write spiders that don't depend on selectors at all, and more...
Note that implementing your storage system can be complex as there are some strict rules such as inheriting from the same abstract class, following the singleton design pattern used in other classes, and more. So make sure to read the docs first.
[!IMPORTANT] A website is needed to provide detailed library documentation.
I'm trying to rush creating the website, researching new ideas, and adding more features/tests/benchmarks but time is tight with too many spinning plates between work, personal life, and working on Scrapling. I have been working on Scrapling for months for free after all.
If you like Scrapling and want it to keep improving, then this is a friendly reminder that you can help by supporting me through the sponsor button.
This section addresses common questions about Scrapling, please read this section before opening an issue.
1. You need to run the selector at least once with `css` or `xpath` with the `auto_save` parameter set to `True` before structural changes happen.
2. Now, because everything about the element can be changed or removed, nothing from the element can be used as a unique identifier for the database. To solve this issue, I made the storage system rely on two things: the URL of the page and the `identifier` parameter you passed to the method while selecting. If you didn't pass one, then the selector string itself will be used as an identifier, but remember that you will have to use it as the identifier value later when the structure changes and you want to pass the new selector.
3. Together, both are used to retrieve the element's unique properties from the database later.
4. Later, when you enable the `auto_match` parameter for both the Adaptor instance and the method call, the element's properties are retrieved and Scrapling loops over all elements in the page, comparing each one's unique properties to the unique properties already stored for this element, and a score is calculated for each one.
5. Comparing elements is not exact but is about finding how similar these values are, so everything is taken into consideration, even the values' order, like the order in which the element's class names were written before and the order in which the same class names are written now.
6. The score for each element is stored in the table, and the element(s) with the highest combined similarity scores are returned.
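To make the flow above concrete, here is a minimal sketch of how those parameters fit together. The HTML snippets, URLs, selectors, and the exact `Adaptor` constructor arguments are assumptions for illustration; only the `auto_match`, `auto_save`, and `identifier` parameters come from the description above:
```python
from scrapling import Adaptor

old_html = '<html><body><span id="price">$10</span></body></html>'
new_html = '<html><body><span class="new-price">$10</span></body></html>'

# First run, while the old structure still works: store the element's unique properties.
old_page = Adaptor(old_html, url='https://example.com/product', auto_match=True)
price = old_page.css('#price', auto_save=True, identifier='product-price')

# Later run, after the site changed and '#price' no longer matches anything.
new_page = Adaptor(new_html, url='https://example.com/product', auto_match=True)

# The same identifier lets Scrapling retrieve the saved properties and score every
# element on the new page against them, returning the best match(es).
price = new_page.css('.new-price', auto_match=True, identifier='product-price')
```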
Not a big problem as it depends on your usage. The word default
will be used in place of the URL field while saving the element's unique properties. So this will only be an issue if you used the same identifier later for a different website that you didn't pass the URL parameter while initializing it as well. The save process will overwrite the previous data and auto-matching uses the latest saved properties only.
For each element, Scrapling will extract:
- Element tag name, text, attributes (names and values), siblings (tag names only), and path (tag names only).
- Element's parent tag name, attributes (names and values), and text.
If you passed the `auto_save`/`auto_match` parameter while selecting and it got completely ignored with a warning message, that's because passing the `auto_save`/`auto_match` argument without setting `auto_match` to `True` while initializing the Adaptor object will only result in the `auto_save`/`auto_match` argument value being ignored. This behavior is purely for performance reasons, so the database gets created only when you are planning to use the auto-matching features.
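In other words, the per-call flags only take effect when `auto_match` was enabled at initialization. A minimal sketch (same assumptions about the constructor as above):
```python
from scrapling import Adaptor

html = '<html><body><span id="price">$10</span></body></html>'

# auto_save is ignored here (with a warning) because auto_match was not
# enabled when the Adaptor was created, so no database exists to save into.
page = Adaptor(html)
page.css('#price', auto_save=True)

# Enable auto_match at initialization so the per-call flags actually work.
page = Adaptor(html, auto_match=True)
page.css('#price', auto_save=True)
```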
It could be one of these reasons:
1. No data were saved/stored for this element before.
2. The selector passed is not the one used while storing the element data. The solution is simple:
   - Pass the old selector again as an identifier to the method called.
   - Retrieve the element with the retrieve method using the old selector as the identifier, then save it again with the save method and the new selector as the identifier.
   - Start using the identifier argument more often if you are planning to use every new selector from now on.
3. The website had some extreme structural changes like a new full design. If this happens a lot with this website, the solution would be to make your code as selector-free as possible using Scrapling features.
Pretty much yeah, almost all features you get from BeautifulSoup can be found or achieved in Scrapling one way or another. In fact, if you see there's a feature in bs4 that is missing in Scrapling, please make a feature request from the issues tab to let me know.
Of course, you can find elements by text/regex, find similar elements in a more reliable way than AutoScraper, and finally save/retrieve elements manually to use later as the model feature in AutoScraper. I have pulled all top articles about AutoScraper from Google and tested Scrapling against examples in them. In all examples, Scrapling got the same results as AutoScraper in much less time.
Yes, Scrapling instances are thread-safe. Each Adaptor instance maintains its own state.
Everybody is invited and welcome to contribute to Scrapling. There is a lot to do!
Please read the contributing file before doing anything.
[!CAUTION] This library is provided for educational and research purposes only. By using this library, you agree to comply with local and international laws regarding data scraping and privacy. The authors and contributors are not responsible for any misuse of this software. This library should not be used to violate the rights of others, for unethical purposes, or to use data in an unauthorized or illegal manner. Do not use it on any website unless you have permission from the website owner or within their allowed rules like the
robots.txt
file, for example.
This work is licensed under BSD-3
This project includes code adapted from: - Parsel (BSD License) - Used for translator submodule
____ _ _
| _ \ ___ __ _ __ _ ___ _ _ ___| \ | |
| |_) / _ \/ _` |/ _` / __| | | / __| \| |
| __/ __/ (_| | (_| \__ \ |_| \__ \ |\ |
|_| \___|\__, |\__,_|___/\__,_|___/_| \_|
|___/
███▄ █ ▓█████ ▒█████
██ ▀█ █ ▓█ ▀ ▒██▒ ██▒
▓██ ▀█ ██▒▒███ ▒██░ ██▒
▓██▒ ▐▌██▒▒▓█ ▄ ▒██ ██░
▒██░ ▓██░░▒████▒░ ████▓▒░
░ ▒░ ▒ ▒ ░░ ▒░ ░░ ▒░▒░▒░
░ ░░ ░ ▒░ ░ ░ ░ ░ ▒ ▒░
░ ░ ░ ░ ░ ░ ░ ▒
░ ░ ░ ░ ░
PEGASUS-NEO is a comprehensive penetration testing framework designed for security professionals and ethical hackers. It combines multiple security tools and custom modules for reconnaissance, exploitation, wireless attacks, web hacking, and more.
This tool is provided for educational and ethical testing purposes only. Usage of PEGASUS-NEO for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to obey all applicable local, state, and federal laws.
Developers assume no liability and are not responsible for any misuse or damage caused by this program.
PEGASUS-NEO - Advanced Penetration Testing Framework
Copyright (C) 2024 Letda Kes dr. Sobri. All rights reserved.
This software is proprietary and confidential. Unauthorized copying, transfer, or
reproduction of this software, via any medium is strictly prohibited.
Written by Letda Kes dr. Sobri <muhammadsobrimaulana31@gmail.com>, January 2024
Password: Sobri
Social media tracking
Exploitation & Pentesting
Custom payload generation
Wireless Attacks
WPS exploitation
Web Attacks
CMS scanning
Social Engineering
Credential harvesting
Tracking & Analysis
# Clone the repository
git clone https://github.com/sobri3195/pegasus-neo.git
# Change directory
cd pegasus-neo
# Install dependencies
sudo python3 -m pip install -r requirements.txt
# Run the tool
sudo python3 pegasus_neo.py
sudo python3 pegasus_neo.py
This is a proprietary project and contributions are not accepted at this time.
For support, please email muhammadsobrimaulana31@gmail.com or visit https://lynk.id/muhsobrimaulana
This project is protected under proprietary license. See the LICENSE file for details.
Made with ❤️ by Letda Kes dr. Sobri
President Trump last week revoked security clearances for Chris Krebs, the former director of the Cybersecurity and Infrastructure Security Agency (CISA) who was fired by Trump after declaring the 2020 election the most secure in U.S. history. The White House memo, which also suspended clearances for other security professionals at Krebs’s employer SentinelOne, comes as CISA is facing huge funding and staffing cuts.
Chris Krebs. Image: Getty Images.
The extraordinary April 9 memo directs the attorney general to investigate Chris Krebs (no relation), calling him “a significant bad-faith actor who weaponized and abused his government authority.”
The memo said the inquiry will include “a comprehensive evaluation of all of CISA’s activities over the last 6 years and will identify any instances where Krebs’ or CISA’s conduct appears to be contrary to the administration’s commitment to free speech and ending federal censorship, including whether Krebs’ conduct was contrary to suitability standards for federal employees or involved the unauthorized dissemination of classified information.”
CISA was created in 2018 during Trump’s first term, with Krebs installed as its first director. In 2020, CISA launched Rumor Control, a website that sought to rebut disinformation swirling around the 2020 election.
That effort ran directly counter to Trump’s claims that he lost the election because it was somehow hacked and stolen. The Trump campaign and its supporters filed at least 62 lawsuits contesting the election, vote counting, and vote certification in nine states, and nearly all of those cases were dismissed or dropped for lack of evidence or standing.
When the Justice Department began prosecuting people who violently attacked the U.S. Capitol on January 6, 2021, President Trump and Republican leaders shifted the narrative, claiming that Trump lost the election because the previous administration had censored conservative voices on social media.
Incredibly, the president’s memo seeking to ostracize Krebs stands reality on its head, accusing Krebs of promoting the censorship of election information, “including known risks associated with certain voting practices.” Trump also alleged that Krebs “falsely and baselessly denied that the 2020 election was rigged and stolen, including by inappropriately and categorically dismissing widespread election malfeasance and serious vulnerabilities with voting machines” [emphasis added].
Krebs did not respond to a request for comment. SentinelOne issued a statement saying it would cooperate in any review of security clearances held by its personnel, which is currently fewer than 10 employees.
Krebs’s former agency is now facing steep budget and staff reductions. The Record reports that CISA is looking to remove some 1,300 people by cutting about half its full-time staff and another 40% of its contractors.
“The agency’s National Risk Management Center, which serves as a hub analyzing risks to cyber and critical infrastructure, is expected to see significant cuts, said two sources familiar with the plans,” The Record’s Suzanne Smalley wrote. “Some of the office’s systematic risk responsibilities will potentially be moved to the agency’s Cybersecurity Division, according to one of the sources.”
CNN reports the Trump administration is also advancing plans to strip civil service protections from 80% of the remaining CISA employees, potentially allowing them to be fired for political reasons.
The Electronic Frontier Foundation (EFF) urged professionals in the cybersecurity community to defend Krebs and SentinelOne, noting that other security companies and professionals could be the next victims of Trump’s efforts to politicize cybersecurity.
“The White House must not be given free reign to turn cybersecurity professionals into political scapegoats,” the EFF wrote. “It is critical that the cybersecurity community now join together to denounce this chilling attack on free speech and rally behind Krebs and SentinelOne rather than cowering because they fear they will be next.”
However, Reuters said it found little sign of industry support for Krebs or SentinelOne, and that many security professionals are concerned about potentially being targeted if they speak out.
“Reuters contacted 33 of the largest U.S. cybersecurity companies, including tech companies and professional services firms with large cybersecurity practices, and three industry groups, for comment on Trump’s action against SentinelOne,” wrote Raphael Satter and A.J. Vicens. “Only one offered comment on Trump’s action. The rest declined, did not respond or did not answer questions.”
On April 3, President Trump fired Gen. Timothy Haugh, the head of the National Security Agency (NSA) and the U.S. Cyber Command, as well as Haugh’s deputy, Wendy Noble. The president did so immediately after meeting in the Oval Office with far-right conspiracy theorist Laura Loomer, who reportedly urged their dismissal. Speaking to reporters on Air Force One after news of the firings broke, Trump questioned Haugh’s loyalty.
Gen. Timothy Haugh. Image: C-SPAN.
Virginia Senator Mark Warner, the top Democrat on the Senate Intelligence Committee, called it inexplicable that the administration would remove the senior leaders of NSA-CYBERCOM without cause or warning, and risk disrupting critical ongoing intelligence operations.
“It is astonishing, too, that President Trump would fire the nonpartisan, experienced leader of the National Security Agency while still failing to hold any member of his team accountable for leaking classified information on a commercial messaging app – even as he apparently takes staffing direction on national security from a discredited conspiracy theorist in the Oval Office,” Warner said in a statement.
On Feb. 28, The Record’s Martin Matishak cited three sources saying Defense Secretary Pete Hegseth ordered U.S. Cyber Command to stand down from all planning against Russia, including offensive digital actions. The following day, The Guardian reported that analysts at CISA were verbally informed that they were not to follow or report on Russian threats, even though this had previously been a main focus for the agency.
A follow-up story from The Washington Post cited officials saying Cyber Command had received an order to halt active operations against Russia, but that the pause was intended to last only as long as negotiations with Russia continue.
The Department of Defense responded on Twitter/X that Hegseth had “neither canceled nor delayed any cyber operations directed against malicious Russian targets and there has been no stand-down order whatsoever from that priority.”
But on March 19, Reuters reported several U.S. national security agencies have halted work on a coordinated effort to counter Russian sabotage, disinformation and cyberattacks.
“Regular meetings between the National Security Council and European national security officials have gone unscheduled, and the NSC has also stopped formally coordinating efforts across U.S. agencies, including with the FBI, the Department of Homeland Security and the State Department,” Reuters reported, citing current and former officials.
President Trump’s institution of 125 percent tariffs on goods from China has seen Beijing strike back with 84 percent tariffs on U.S. imports. Now, some security experts are warning that the trade war could spill over into a cyber conflict, given China’s successful efforts to burrow into America’s critical infrastructure networks.
Over the past year, a number of Chinese government-backed digital intrusions have come into focus, including a sprawling espionage campaign involving the compromise of at least nine U.S. telecommunications providers. Dubbed “Salt Typhoon” by Microsoft, these telecom intrusions were pervasive enough that CISA and the FBI in December 2024 warned Americans against communicating sensitive information over phone networks, urging people instead to use encrypted messaging apps (like Signal).
The other broad ranging China-backed campaign is known as “Volt Typhoon,” which CISA described as “state-sponsored cyber actors seeking to pre-position themselves on IT networks for disruptive or destructive cyberattacks against U.S. critical infrastructure in the event of a major crisis or conflict with the United States.”
Responsibility for determining the root causes of the Salt Typhoon security debacle fell to the Cyber Safety Review Board (CSRB), a nonpartisan government entity established in February 2022 with a mandate to investigate the security failures behind major cybersecurity events. But on his first full day back in the White House, President Trump dismissed all 15 CSRB advisory committee members — likely because those advisers included Chris Krebs.
Last week, Sen. Ron Wyden (D-Ore.) placed a hold on Trump’s nominee to lead CISA, saying the hold would continue unless the agency published a report on the telecom industry hacks, as promised.
“CISA’s multi-year cover up of the phone companies’ negligent cybersecurity has real consequences,” Wyden said in a statement. “Congress and the American people have a right to read this report.”
The Wall Street Journal reported last week Chinese officials acknowledged in a secret December meeting that Beijing was behind the widespread telecom industry compromises.
“The Chinese official’s remarks at the December meeting were indirect and somewhat ambiguous, but most of the American delegation in the room interpreted it as a tacit admission and a warning to the U.S. about Taiwan,” The Journal’s Dustin Volz wrote, citing a former U.S. official familiar with the meeting.
Meanwhile, China continues to take advantage of the mass firings of federal workers. On April 9, the National Counterintelligence and Security Center warned (PDF) that Chinese intelligence entities are pursuing an online effort to recruit recently laid-off U.S. employees.
“Foreign intelligence entities, particularly those in China, are targeting current and former U.S. government (USG) employees for recruitment by posing as consulting firms, corporate headhunters, think tanks, and other entities on social and professional networking sites,” the alert warns. “Their deceptive online job offers, and other virtual approaches, have become more sophisticated in targeting unwitting individuals with USG backgrounds seeking new employment.”
As Reuters notes, the FBI last month ended an effort to counter interference in U.S. elections by foreign adversaries including Russia, and put on leave staff working on the issue at the Department of Homeland Security.
Meanwhile, the U.S. Senate is now considering a House-passed bill dubbed the “Safeguard American Voter Eligibility (SAVE) Act,” which would order states to obtain proof of citizenship, such as a passport or a birth certificate, in person from those seeking to register to vote.
Critics say the SAVE Act could disenfranchise millions of voters and discourage eligible voters from registering to vote. What’s more, documented cases of voter fraud are few and far between, as is voting by non-citizens. Even the conservative Heritage Foundation acknowledges as much: An interactive “election fraud map” published by Heritage lists just 1,576 convictions or findings of voter fraud between 1982 and the present day.
Nevertheless, the GOP-led House passed the SAVE Act with the help of four Democrats. Its passage in the Senate will require support from at least seven Democrats, Newsweek writes.
In February, CISA cut roughly 130 employees, including its election security advisors. The agency also was forced to freeze all election security activities pending an internal review. The review was reportedly completed in March, but the Trump administration has said the findings would not be made public, and there is no indication of whether any cybersecurity support has been restored.
Many state leaders have voiced anxiety over the administration’s cuts to CISA programs that provide assistance and threat intelligence to election security efforts. Iowa Secretary of State Paul Pate last week told the PBS show Iowa Press he would not want to see those programs dissolve.
“If those (systems) were to go away, it would be pretty serious,” Pate said. “We do count on a lot those cyber protections.”
Pennsylvania’s Secretary of the Commonwealth Al Schmidt recently warned the CISA election security cuts would make elections less secure, and said no state on its own can replace federal election cybersecurity resources.
The Pennsylvania Capital-Star reports that several local election offices received bomb threats around the time polls closed on Nov. 5, and that in the week before the election a fake video showing mail-in ballots cast for Trump and Sen. Dave McCormick (R-Pa.) being destroyed and thrown away was linked to a Russian disinformation campaign.
“CISA was able to quickly identify not only that it was fraudulent, but also the source of it, so that we could share with our counties and we could share with the public so confidence in the election wasn’t undermined,” Schmidt said.
According to CNN, the administration’s actions have deeply alarmed state officials, who warn the next round of national elections will be seriously imperiled by the cuts. A bipartisan association representing 46 secretaries of state, and several individual top state election officials, have pressed the White House about how critical functions of protecting election security will perform going forward. However, CNN reports they have yet to receive clear answers.
Nevada and 18 other states are suing Trump over an executive order he issued on March 25 that asserts the executive branch has broad authority over state election procedures.
“None of the president’s powers allow him to change the rules of elections,” Nevada Secretary of State Cisco Aguilar wrote in an April 11 op-ed. “That is an intentional feature of our Constitution, which the Framers built in to ensure election integrity. Despite that, Trump is seeking to upend the voter registration process; impose arbitrary deadlines on vote counting; allow an unelected and unaccountable billionaire to invade state voter rolls; and withhold congressionally approved funding for election security.”
The order instructs the U.S. Election Assistance Commission to abruptly amend the voluntary federal guidelines for voting machines without going through the processes mandated by federal law. And it calls for allowing the administrator of the so-called Department of Government Efficiency (DOGE), along with DHS, to review state voter registration lists and other records to identify non-citizens.
The Atlantic’s Paul Rosenzweig notes that the chief executive of the country — whose unilateral authority the Founding Fathers most feared — has literally no role in the federal election system.
“Trump’s executive order on elections ignores that design entirely,” Rosenzweig wrote. “He is asserting an executive-branch role in governing the mechanics of a federal election that has never before been claimed by a president. The legal theory undergirding this assertion — that the president’s authority to enforce federal law enables him to control state election activity — is as capacious as it is frightening.”
A powerful Python script that allows you to scrape messages and media from Telegram channels using the Telethon library. Features include real-time continuous scraping, media downloading, and data export capabilities.
___________________ _________
\__ ___/ _____/ / _____/
| | / \ ___ \_____ \
| | \ \_\ \/ \
|____| \______ /_______ /
\/ \/
Before running the script, you'll need:
pip install -r requirements.txt
Contents of `requirements.txt`:
telethon
aiohttp
asyncio
- `api_id`: A number
- `api_hash`: A string of letters and numbers

Keep these credentials safe, you'll need them to run the script!
git clone https://github.com/unnohwn/telegram-scraper.git
cd telegram-scraper
pip install -r requirements.txt
python telegram-scraper.py
When scraping a channel for the first time, please note:
The script provides an interactive menu with the following options:
You can use either:
- Channel username (e.g., `channelname`)
- Channel ID (e.g., `-1001234567890`)
Data is stored in SQLite databases, one per channel:
- Location: `./channelname/channelname.db`
- Table: `messages`
  - `id`: Primary key
  - `message_id`: Telegram message ID
  - `date`: Message timestamp
  - `sender_id`: Sender's Telegram ID
  - `first_name`: Sender's first name
  - `last_name`: Sender's last name
  - `username`: Sender's username
  - `message`: Message text
  - `media_type`: Type of media (if any)
  - `media_path`: Local path to downloaded media
  - `reply_to`: ID of replied message (if any)
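As a quick illustration of working with that database (a minimal sketch assuming the schema above; the channel name in the path is a placeholder):
```python
import sqlite3

db_path = './channelname/channelname.db'  # replace with the channel you scraped

conn = sqlite3.connect(db_path)
cursor = conn.cursor()

# Ten most recent messages that have downloaded media attached
cursor.execute(
    """
    SELECT message_id, date, username, message, media_path
    FROM messages
    WHERE media_path IS NOT NULL
    ORDER BY date DESC
    LIMIT 10
    """
)
for message_id, date, username, message, media_path in cursor.fetchall():
    print(f"[{date}] {username} ({message_id}): {message!r} -> {media_path}")

conn.close()
```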
Media files are stored in:
- Location: `./channelname/media/`
- Files are named using the message ID or original filename
Data can be exported in two formats:
1. CSV: `./channelname/channelname.csv`
   - Human-readable spreadsheet format
   - Easy to import into Excel/Google Sheets
2. JSON: `./channelname/channelname.json`
The continuous scraping feature (`[C]` option) allows you to:
- Monitor channels in real-time
- Automatically download new messages
- Download media as it's posted
- Run indefinitely until interrupted (Ctrl+C)
- Maintain state between runs
The script can download:
- Photos
- Documents
- Other media types supported by Telegram

It automatically retries failed downloads and skips existing files to avoid duplicates.
The script includes:
- Automatic retry mechanism for failed media downloads
- State preservation in case of interruption
- Flood control compliance
- Error logging for failed operations
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Make sure to:
- Respect Telegram's Terms of Service
- Obtain necessary permissions before scraping
- Use responsibly and ethically
- Comply with data protection regulations
A Python script that allows you to automatically scrape and download stories from your Telegram friends using the Telethon library. The script continuously monitors and saves both photos and videos from stories, along with their metadata.
Due to Telegram API restrictions, this script can only access stories from:
- Users you have added to your friend list
- Users whose privacy settings allow you to view their stories
This is a limitation of Telegram's API and cannot be bypassed.
Before running the script, you'll need:
pip install -r requirements.txt
Contents of `requirements.txt`:
telethon
openpyxl
schedule
- `api_id`: A number
- `api_hash`: A string of letters and numbers

Keep these credentials safe, you'll need them to run the script!
git clone https://github.com/unnohwn/telegram-story-scraper.git
cd telegram-story-scraper
pip install -r requirements.txt
python TGSS.py
The script:
1. Connects to your Telegram account
2. Periodically checks for new stories from your friends
3. Downloads any new stories (photos/videos)
4. Stores metadata in a SQLite database
5. Exports information to an Excel file
6. Runs continuously until interrupted (Ctrl+C)
SQLite database containing:
- `user_id`: Telegram user ID of the story creator
- `story_id`: Unique story identifier
- `timestamp`: When the story was posted (UTC+2)
- `filename`: Local filename of the downloaded media
Export file containing the same information as the database, useful for:
- Easy viewing of story metadata
- Filtering and sorting
- Data analysis
- Sharing data with others
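If you need to rebuild such an export yourself, a minimal sketch using `openpyxl` (already in the requirements) might look like the following; the database path and table name are assumptions, since only the column names are listed above:
```python
import sqlite3
from openpyxl import Workbook

conn = sqlite3.connect('stories.db')  # placeholder path
cursor = conn.execute(
    'SELECT user_id, story_id, timestamp, filename FROM stories'  # table name assumed
)

wb = Workbook()
ws = wb.active
ws.append(['user_id', 'story_id', 'timestamp', 'filename'])  # header row

for row in cursor:
    ws.append(list(row))

wb.save('stories_export.xlsx')
conn.close()
```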
{user_id}_{story_id}.jpg
{user_id}_{story_id}.{extension}
The script includes:
- Automatic retry mechanism for failed downloads
- Error logging for failed operations
- Connection error handling
- State preservation in case of interruption
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Make sure to:
- Respect Telegram's Terms of Service
- Obtain necessary permissions before scraping
- Use responsibly and ethically
- Comply with data protection regulations
- Respect user privacy
Microsoft today issued more than 50 security updates for its various Windows operating systems, including fixes for a whopping six zero-day vulnerabilities that are already seeing active exploitation.
Two of the zero-day flaws include CVE-2025-24991 and CVE-2025-24993, both vulnerabilities in NTFS, the default file system for Windows and Windows Server. Both require the attacker to trick a target into mounting a malicious virtual hard disk. CVE-2025-24993 would lead to the possibility of local code execution, while CVE-2025-24991 could cause NTFS to disclose portions of memory.
Microsoft credits researchers at ESET with reporting the zero-day bug labeled CVE-2025-24983, an elevation of privilege vulnerability in older versions of Windows. ESET said the exploit was deployed via the PipeMagic backdoor, capable of exfiltrating data and enabling remote access to the machine.
ESET’s Filip Jurčacko said the exploit in the wild targets only older versions of Windows OS: Windows 8.1 and Server 2012 R2. Although still used by millions, security support for these products ended more than a year ago, and mainstream support ended years ago. However, ESET notes the vulnerability itself also is present in newer Windows OS versions, including Windows 10 build 1809 and the still-supported Windows Server 2016.
Rapid7’s lead software engineer Adam Barnett said Windows 11 and Server 2019 onwards are not listed as receiving patches, so are presumably not vulnerable.
“It’s not clear why newer Windows products dodged this particular bullet,” Barnett wrote. “The Windows 32 subsystem is still presumably alive and well, since there is no apparent mention of its demise on the Windows client OS deprecated features list.”
The zero-day flaw CVE-2025-24984 is another NTFS weakness that can be exploited by inserting a malicious USB drive into a Windows computer. Barnett said Microsoft’s advisory for this bug doesn’t quite join the dots, but successful exploitation appears to mean that portions of heap memory could be improperly dumped into a log file, which could then be combed through by an attacker hungry for privileged information.
“A relatively low CVSSv3 base score of 4.6 reflects the practical difficulties of real-world exploitation, but a motivated attacker can sometimes achieve extraordinary results starting from the smallest of toeholds, and Microsoft does rate this vulnerability as important on its own proprietary severity ranking scale,” Barnett said.
Another zero-day fixed this month — CVE-2025-24985 — could allow attackers to install malicious code. As with the NTFS bugs, this one requires that the user mount a malicious virtual hard drive.
The final zero-day this month is CVE-2025-26633, a weakness in the Microsoft Management Console, a component of Windows that gives system administrators a way to configure and monitor the system. Exploiting this flaw requires the target to open a malicious file.
This month’s bundle of patch love from Redmond also addresses six other vulnerabilities Microsoft has rated “critical,” meaning that malware or malcontents could exploit them to seize control over vulnerable PCs with no help from users.
Barnett observed that this is now the sixth consecutive month where Microsoft has published zero-day vulnerabilities on Patch Tuesday without evaluating any of them as critical severity at time of publication.
The SANS Internet Storm Center has a useful list of all the Microsoft patches released today, indexed by severity. Windows enterprise administrators would do well to keep an eye on askwoody.com, which often has the scoop on any patches causing problems. Please consider backing up your data before updating, and leave a comment below if you experience any issues applying this month’s updates.
Microsoft today issued security updates to fix at least 56 vulnerabilities in its Windows operating systems and supported software, including two zero-day flaws that are being actively exploited.
All supported Windows operating systems will receive an update this month for a buffer overflow vulnerability that carries the catchy name CVE-2025-21418. This patch should be a priority for enterprises, as Microsoft says it is being exploited, has low attack complexity, and no requirements for user interaction.
Tenable senior staff research engineer Satnam Narang noted that since 2022, there have been nine elevation of privilege vulnerabilities in this same Windows component — three each year — including one in 2024 that was exploited in the wild as a zero day (CVE-2024-38193).
“CVE-2024-38193 was exploited by the North Korean APT group known as Lazarus Group to implant a new version of the FudModule rootkit in order to maintain persistence and stealth on compromised systems,” Narang said. “At this time, it is unclear if CVE-2025-21418 was also exploited by Lazarus Group.”
The other zero-day, CVE-2025-21391, is an elevation of privilege vulnerability in Windows Storage that could be used to delete files on a targeted system. Microsoft’s advisory on this bug references something called “CWE-59: Improper Link Resolution Before File Access,” says no user interaction is required, and that the attack complexity is low.
Adam Barnett, lead software engineer at Rapid7, said although the advisory provides scant detail, and even offers some vague reassurance that ‘an attacker would only be able to delete targeted files on a system,’ it would be a mistake to assume that the impact of deleting arbitrary files would be limited to data loss or denial of service.
“As long ago as 2022, ZDI researchers set out how a motivated attacker could parlay arbitrary file deletion into full SYSTEM access using techniques which also involve creative misuse of symbolic links,” Barnett wrote.
One vulnerability patched today that was publicly disclosed earlier is CVE-2025-21377, another weakness that could allow an attacker to elevate their privileges on a vulnerable Windows system. Specifically, this is yet another Windows flaw that can be used to steal NTLMv2 hashes — essentially allowing an attacker to authenticate as the targeted user without having to log in.
According to Microsoft, minimal user interaction with a malicious file is needed to exploit CVE-2025-21377, including selecting, inspecting or “performing an action other than opening or executing the file.”
“This trademark linguistic ducking and weaving may be Microsoft’s way of saying ‘if we told you any more, we’d give the game away,'” Barnett said. “Accordingly, Microsoft assesses exploitation as more likely.”
The SANS Internet Storm Center has a handy list of all the Microsoft patches released today, indexed by severity. Windows enterprise administrators would do well to keep an eye on askwoody.com, which often has the scoop on any patches causing problems.
It’s getting harder to buy Windows software that isn’t also bundled with Microsoft’s flagship Copilot artificial intelligence (AI) feature. Last month Microsoft started bundling Copilot with Microsoft Office 365, which Redmond has since rebranded as “Microsoft 365 Copilot.” Ostensibly to offset the costs of its substantial AI investments, Microsoft also jacked up prices from 22 percent to 30 percent for upcoming license renewals and new subscribers.
Office-watch.com writes that existing Office 365 users who are paying an annual cloud license do have the option of “Microsoft 365 Classic,” an AI-free subscription at a lower price, but that many customers are not offered the option until they attempt to cancel their existing Office subscription.
In other security patch news, Apple has shipped iOS 18.3.1, which fixes a zero day vulnerability (CVE-2025-24200) that is showing up in attacks.
Adobe has issued security updates that fix a total of 45 vulnerabilities across InDesign, Commerce, Substance 3D Stager, InCopy, Illustrator, Substance 3D Designer and Photoshop Elements.
Chris Goettl at Ivanti notes that Google Chrome is shipping an update today which will trigger updates for Chromium based browsers including Microsoft Edge, so be on the lookout for Chrome and Edge updates as we proceed through the week.
Microsoft today unleashed updates to plug a whopping 161 security vulnerabilities in Windows and related software, including three “zero-day” weaknesses that are already under active attack. Redmond’s inaugural Patch Tuesday of 2025 bundles more fixes than the company has shipped in one go since 2017.
Rapid7‘s Adam Barnett says January marks the fourth consecutive month where Microsoft has published zero-day vulnerabilities on Patch Tuesday without evaluating any of them as critical severity at time of publication. Today also saw the publication of nine critical remote code execution (RCE) vulnerabilities.
The Microsoft flaws already seeing active attacks include CVE-2025-21333, CVE-2025-21334 and, you guessed it, CVE-2025-21335. These are sequential because all three reside in Windows Hyper-V, a component that is heavily embedded in modern Windows 11 operating systems and used for security features including device guard and credential guard.
Tenable’s Satnam Narang says little is known about the in-the-wild exploitation of these flaws, apart from the fact that they are all “privilege escalation” vulnerabilities. Narang said we tend to see a lot of elevation of privilege bugs exploited in the wild as zero-days in Patch Tuesday because it’s not always initial access to a system that’s a challenge for attackers as they have various avenues in their pursuit.
“As elevation of privilege bugs, they’re being used as part of post-compromise activity, where an attacker has already accessed a target system,” he said. “It’s kind of like if an attacker is able to enter a secure building, they’re unable to access more secure parts of the facility because they have to prove that they have clearance. In this case, they’re able to trick the system into believing they should have clearance.”
Several bugs addressed today earned CVSS (threat rating) scores of 9.8 out of a possible 10, including CVE-2025-21298, a weakness in Windows that could allow attackers to run arbitrary code by getting a target to open a malicious .rtf file, documents typically opened on Office applications like Microsoft Word. Microsoft has rated this flaw “exploitation more likely.”
Ben Hopkins at Immersive Labs called attention to the CVE-2025-21311, a 9.8 “critical” bug in Windows NTLMv1 (NT LAN Manager version 1), an older Microsoft authentication protocol that is still used by many organizations.
“What makes this vulnerability so impactful is the fact that it is remotely exploitable, so attackers can reach the compromised machine(s) over the internet, and the attacker does not need significant knowledge or skills to achieve repeatable success with the same payload across any vulnerable component,” Hopkins wrote.
Kev Breen at Immersive points to an interesting flaw (CVE-2025-21210) that Microsoft fixed in its full disk encryption suite Bitlocker that the software giant has dubbed “exploitation more likely.” Specifically, this bug holds out the possibility that in some situations the hibernation image created when one closes the laptop lid on an open Windows session may not be fully encrypted and could be recovered in plain text.
“Hibernation images are used when a laptop goes to sleep and contains the contents that were stored in RAM at the moment the device powered down,” Breen noted. “This presents a significant potential impact as RAM can contain sensitive data (such as passwords, credentials and PII) that may have been in open documents or browser sessions and can all be recovered with free tools from hibernation files.”
Tenable’s Narang also highlighted a trio of vulnerabilities in Microsoft Access fixed this month and credited to Unpatched.ai, a security research effort that is aided by artificial intelligence looking for vulnerabilities in code. Tracked as CVE-2025-21186, CVE-2025-21366, and CVE-2025-21395, these are remote code execution bugs that are exploitable if an attacker convinces a target to download and run a malicious file through social engineering. Unpatched.ai was also credited with discovering a flaw in the December 2024 Patch Tuesday release (CVE-2024-49142).
“Automated vulnerability detection using AI has garnered a lot of attention recently, so it’s noteworthy to see this service being credited with finding bugs in Microsoft products,” Narang observed. “It may be the first of many in 2025.”
If you’re a Windows user who has automatic updates turned off and haven’t updated in a while, it’s probably time to play catch up. Please consider backing up important files and/or the entire hard drive before updating. And if you run into any problems installing this month’s patch batch, drop a line in the comments below, please.
Further reading on today’s patches from Microsoft:
Microsoft today released updates to plug at least 70 security holes in Windows and Windows software, including one vulnerability that is already being exploited in active attacks.
The zero-day seeing exploitation involves CVE-2024-49138, a security weakness in the Windows Common Log File System (CLFS) driver — used by applications to write transaction logs — that could let an authenticated attacker gain “system” level privileges on a vulnerable Windows device.
The security firm Rapid7 notes there have been a series of zero-day elevation of privilege flaws in CLFS over the past few years.
“Ransomware authors who have abused previous CLFS vulnerabilities will be only too pleased to get their hands on a fresh one,” wrote Adam Barnett, lead software engineer at Rapid7. “Expect more CLFS zero-day vulnerabilities to emerge in the future, at least until Microsoft performs a full replacement of the aging CLFS codebase instead of offering spot fixes for specific flaws.”
Elevation of privilege vulnerabilities accounted for 29% of the 1,009 security bugs Microsoft has patched so far in 2024, according to a year-end tally by Tenable; nearly 40 percent of those bugs were weaknesses that could let attackers run malicious code on the vulnerable device.
Rob Reeves, principal security engineer at Immersive Labs, called special attention to CVE-2024-49112, a remote code execution flaw in the Lightweight Directory Access Protocol (LDAP) service on every version of Windows since Windows 7. CVE-2024-49112 has been assigned a CVSS (badness) score of 9.8 out of 10.
“LDAP is most commonly seen on servers that are Domain Controllers inside a Windows network and LDAP must be exposed to other servers and clients within an enterprise environment for the domain to function,” Reeves said. “Microsoft hasn’t released specific information about the vulnerability at present, but has indicated that the attack complexity is low and authentication is not required.”
Tyler Reguly at the security firm Fortra had a slightly different 2024 patch tally for Microsoft, at 1,088 vulnerabilities, which he said was surprisingly similar to the 1,063 vulnerabilities resolved in 2023 and the 1,119 vulnerabilities resolved in 2022.
“If nothing else, we can say that Microsoft is consistent,” Reguly said. “While it would be nice to see the number of vulnerabilities each year decreasing, at least consistency lets us know what to expect.”
If you’re a Windows end user and your system is not set up to automatically install updates, please take a minute this week to run Windows Update, preferably after backing up your system and/or important data.
System admins should keep an eye on AskWoody.com, which usually has the details if any of the Patch Tuesday fixes are causing problems. In the meantime, if you run into any problems applying this month’s fixes, please drop a note about it in the comments below.
Microsoft Corp. today released updates to fix at least 79 security vulnerabilities in its Windows operating systems and related software, including multiple flaws that are already showing up in active attacks. Microsoft also corrected a critical bug that has caused some Windows 10 PCs to remain dangerously unpatched against actively exploited vulnerabilities for several months this year.
By far the most curious security weakness Microsoft disclosed today has the snappy name of CVE-2024-43491, which Microsoft says is a vulnerability that led to the rolling back of fixes for some vulnerabilities affecting “optional components” on certain Windows 10 systems produced in 2015. Those include Windows 10 systems that installed the monthly security update for Windows released in March 2024, or other updates released until August 2024.
Satnam Narang, senior staff research engineer at Tenable, said that while the phrase “exploitation detected” in a Microsoft advisory normally implies the flaw is being exploited by cybercriminals, it appears labeled this way with CVE-2024-43491 because the rollback of fixes reintroduced vulnerabilities that were previously known to be exploited.
“To correct this issue, users need to apply both the September 2024 Servicing Stack Update and the September 2024 Windows Security Updates,” Narang said.
Kev Breen, senior director of threat research at Immersive Labs, said the root cause of CVE-2024-43491 is that on specific versions of Windows 10, the build version numbers that are checked by the update service were not properly handled in the code.
“The notes from Microsoft say that the ‘build version numbers crossed into a range that triggered a code defect’,” Breen said. “The short version is that some versions of Windows 10 with optional components enabled was left in a vulnerable state.”
Zero Day #1 this month is CVE-2024-38226, and it concerns a weakness in Microsoft Publisher, a standalone application included in some versions of Microsoft Office. This flaw lets attackers bypass Microsoft’s “Mark of the Web,” a Windows security feature that marks files downloaded from the Internet as potentially unsafe.
Zero Day #2 is CVE-2024-38217, also a Mark of the Web bypass affecting Office. Both zero-day flaws rely on the target opening a booby-trapped Office file.
Security firm Rapid7 notes that CVE-2024-38217 has been publicly disclosed via an extensive write-up, with exploit code also available on GitHub.
According to Microsoft, CVE-2024-38014, an “elevation of privilege” bug in the Windows Installer, is also being actively exploited.
June’s coverage of Microsoft Patch Tuesday was titled “Recall Edition,” because the big news then was that Microsoft was facing a torrent of criticism from privacy and security experts over “Recall,” a new artificial intelligence (AI) feature of Redmond’s flagship Copilot+ PCs that constantly takes screenshots of whatever users are doing on their computers.
At the time, Microsoft responded by suggesting Recall would no longer be enabled by default. But last week, the software giant clarified that what it really meant was that the ability to disable Recall was a bug/feature in the preview version of Copilot+ that will not be available to Windows customers going forward. Translation: New versions of Windows are shipping with Recall deeply embedded in the operating system.
It’s pretty rich that Microsoft, which already collects an insane amount of information from its customers on a near constant basis, is calling the Recall removal feature a bug, while treating Recall as a desirable feature. Because from where I sit, Recall is a feature nobody asked for that turns Windows into a bug (of the surveillance variety).
When Redmond first responded to critics about Recall, they noted that Recall snapshots never leave the user’s system, and that even if attackers managed to hack a Copilot+ PC they would not be able to exfiltrate on-device Recall data.
But that claim rang hollow after former Microsoft threat analyst Kevin Beaumont detailed on his blog how any user on the system (even a non-administrator) can export Recall data, which is just stored in an SQLite database locally.
As it is apt to do on Microsoft Patch Tuesday, Adobe has released updates to fix security vulnerabilities in a range of products, including Reader and Acrobat, After Effects, Premiere Pro, Illustrator, ColdFusion, Adobe Audition, and Photoshop. Adobe says it is not aware of any exploits in the wild for any of the issues addressed in its updates.
Seeking a more detailed breakdown of the patches released by Microsoft today? Check out the SANS Internet Storm Center’s thorough list. People responsible for administering many systems in an enterprise environment would do well to keep an eye on AskWoody.com, which often has the skinny on any wonky Windows patches that may be causing problems for some users.
As always, if you experience any issues applying this month’s patch batch, consider dropping a note in the comments here about it.
Microsoft today released updates to fix more than 50 security vulnerabilities in Windows and related software, a relatively light Patch Tuesday this month for Windows users. The software giant also responded to a torrent of negative feedback on a new feature of Redmond’s flagship operating system that constantly takes screenshots of whatever users are doing on their computers, saying the feature would no longer be enabled by default.
Last month, Microsoft debuted Copilot+ PCs, an AI-enabled version of Windows. Copilot+ ships with a feature nobody asked for that Redmond has aptly dubbed Recall, which constantly takes screenshots of what the user is doing on their PC. Security experts roundly trashed Recall as a fancy keylogger, noting that it would be a gold mine of information for attackers if the user’s PC was compromised with malware.
Microsoft countered that Recall snapshots never leave the user’s system, and that even if attackers managed to hack a Copilot+ PC they would not be able to exfiltrate on-device Recall data. But that claim rang hollow after former Microsoft threat analyst Kevin Beaumont detailed on his blog how any user on the system (even a non-administrator) can export Recall data, which is just stored in an SQLite database locally.
“I’m not being hyperbolic when I say this is the dumbest cybersecurity move in a decade,” Beaumont said on Mastodon.
In a recent Risky Business podcast, host Patrick Gray noted that the screenshots created and indexed by Recall would be a boon to any attacker who suddenly finds himself in an unfamiliar environment.
“The first thing you want to do when you get on a machine if you’re up to no good is to figure out how someone did their job,” Gray said. “We saw that in the case of the SWIFT attacks against central banks years ago. Attackers had to do screen recordings to figure out how transfers work. And this could speed up that sort of discovery process.”
Responding to the withering criticism of Recall, Microsoft said last week that it will no longer be enabled by default on Copilot+ PCs.
Only one of the patches released today — CVE-2024-30080 — earned Microsoft’s most urgent “critical” rating, meaning malware or malcontents could exploit the vulnerability to remotely seize control over a user’s system, without any user interaction.
CVE-2024-30080 is a flaw in the Microsoft Message Queuing (MSMQ) service that can allow attackers to execute code of their choosing. Microsoft says exploitation of this weakness is likely, enough to encourage users to disable the vulnerable component if updating isn’t possible in the short run. CVE-2024-30080 has been assigned a CVSS vulnerability score of 9.8 (10 is the worst).
Kevin Breen, senior director of threat research at Immersive Labs, said a saving grace is that MSMQ is not a default service on Windows.
“A Shodan search for MSMQ reveals there are a few thousand potentially internet-facing MSSQ servers that could be vulnerable to zero-day attacks if not patched quickly,” Breen said.
CVE-2024-30078 is a remote code execution weakness in the Windows WiFi Driver, which also has a CVSS score of 9.8. According to Microsoft, an unauthenticated attacker could exploit this bug by sending a malicious data packet to anyone else on the same network — meaning this flaw assumes the attacker has access to the local network.
Microsoft also fixed a number of serious security issues with its Office applications, including at least two remote-code execution flaws, said Adam Barnett, lead software engineer at Rapid7.
“CVE-2024-30101 is a vulnerability in Outlook; although the Preview Pane is a vector, the user must subsequently perform unspecified specific actions to trigger the vulnerability and the attacker must win a race condition,” Barnett said. “CVE-2024-30104 does not have the Preview Pane as a vector, but nevertheless ends up with a slightly higher CVSS base score of 7.8, since exploitation relies solely on the user opening a malicious file.”
Separately, Adobe released security updates for Acrobat, ColdFusion, and Photoshop, among others.
As usual, the SANS Internet Storm Center has the skinny on the individual patches released today, indexed by severity, exploitability and urgency. Windows admins should also keep an eye on AskWoody.com, which often publishes early reports of any Windows patches gone awry.
A 26-year-old Finnish man was sentenced to more than six years in prison today after being convicted of hacking into an online psychotherapy clinic, leaking tens of thousands of patient therapy records, and attempting to extort the clinic and patients.
On October 21, 2020, the Vastaamo Psychotherapy Center in Finland became the target of blackmail when a tormentor identified as “ransom_man” demanded payment of 40 bitcoins (~450,000 euros at the time) in return for a promise not to publish highly sensitive therapy session notes Vastaamo had exposed online.
Ransom_man announced on the dark web that he would start publishing 100 patient profiles every 24 hours. When Vastaamo declined to pay, ransom_man shifted to extorting individual patients. According to Finnish police, some 22,000 victims reported extortion attempts targeting them personally: targeted emails that threatened to publish their therapy notes online unless they paid a 500 euro ransom.
Finnish prosecutors quickly zeroed in on a suspect: Julius “Zeekill” Kivimäki, a notorious criminal hacker convicted of committing tens of thousands of cybercrimes before he became an adult. After being charged with the attack in October 2022, Kivimäki fled the country. He was arrested four months later in France, hiding out under an assumed name and passport.
Antti Kurittu is a former criminal investigator who worked on an investigation involving Kivimäki’s use of the Zbot botnet, among other activities Kivimäki engaged in as a member of the hacker group Hack the Planet (HTP).
Kurittu said the prosecution had demanded at least seven years in jail, and that the sentence handed down was six years and three months. Kurittu said prosecutors knocked a few months off of Kivimäki’s sentence because he agreed to pay compensation to his victims, and that Kivimäki will remain in prison during any appeal process.
“I think the sentencing was as expected, knowing the Finnish judicial system,” Kurittu told KrebsOnSecurity. “As Kivimäki has not been sentenced to a non-suspended prison sentence during the last five years, he will be treated as a first-timer, his previous convictions notwithstanding.”
But because juvenile convictions in Finland don’t count towards determining whether somebody is a first-time offender, Kivimäki will end up serving approximately half of his sentence.
“This seems like a short sentence when taking into account the gravity of his actions and the life-altering consequences to thousands of people, but it’s almost the maximum the law allows for,” Kurittu said.
Kivimäki initially gained notoriety as a self-professed member of the Lizard Squad, a mainly low-skilled hacker group that specialized in DDoS attacks. But American and Finnish investigators say Kivimäki’s involvement in cybercrime dates back to at least 2008, when he was introduced to a founding member of what would soon become HTP.
Finnish police said Kivimäki also used the nicknames “Ryan”, “RyanC” and “Ryan Cleary” (Ryan Cleary was actually a member of a rival hacker group — LulzSec — who was sentenced to prison for hacking).
Kivimäki and other HTP members were involved in mass-compromising web servers using known vulnerabilities, and by 2012 Kivimäki’s alias Ryan Cleary was selling access to those servers in the form of a DDoS-for-hire service. Kivimäki was 15 years old at the time.
In 2013, investigators going through devices seized from Kivimäki found computer code that had been used to crack more than 60,000 web servers using a previously unknown vulnerability in Adobe’s ColdFusion software. KrebsOnSecurity detailed the work of HTP in September 2013, after the group compromised servers inside data brokers LexisNexis, Kroll, and Dun & Bradstreet.
The group used the same ColdFusion flaws to break into the National White Collar Crime Center (NWC3), a non-profit that provides research and investigative support to the U.S. Federal Bureau of Investigation (FBI).
As KrebsOnSecurity reported at the time, this small ColdFusion botnet of data broker servers was being controlled by the same cybercriminals who’d assumed control over SSNDOB, which operated one of the underground’s most reliable services for obtaining Social Security Number, dates of birth and credit file information on U.S. residents.
Kivimäki was responsible for making an August 2014 bomb threat against former Sony Online Entertainment President John Smedley that grounded an American Airlines plane. Kivimäki also was involved in calling in multiple fake bomb threats and “swatting” incidents — reporting fake hostage situations at an address to prompt a heavily armed police response to that location.
Ville Tapio, the former CEO of Vastaamo, was fired and also prosecuted following the breach. Ransom_man bragged about Vastaamo’s sloppy security, noting the company had used the laughably weak username and password “root/root” to protect sensitive patient records.
Investigators later found Vastaamo had originally been hacked in 2018 and again in 2019. In April 2023, a Finnish court handed down a three-month sentence for Tapio, but that sentence was suspended because he had no previous criminal record.
TL;DR: Galah (/ɡəˈlɑː/ - pronounced 'guh-laa') is an LLM (Large Language Model) powered web honeypot, currently compatible with the OpenAI API, that is able to mimic various applications and dynamically respond to arbitrary HTTP requests.
Named after the clever Australian parrot known for its mimicry, Galah mirrors this trait in its functionality. Unlike traditional web honeypots that rely on a manual and limiting method of emulating numerous web applications or vulnerabilities, Galah adopts a novel approach. This LLM-powered honeypot mimics various web applications by dynamically crafting relevant (and occasionally foolish) responses, including HTTP headers and body content, to arbitrary HTTP requests. Fun fact: in Aussie English, Galah also means fool!
I've deployed a cache for the LLM-generated responses (the cache duration can be customized in the config file) to avoid generating multiple responses for the same request and to reduce the cost of the OpenAI API. The cache stores responses per port, meaning if you probe a specific port of the honeypot, the generated response won't be returned for the same request on a different port.
The prompt is the most crucial part of this honeypot! You can update the prompt in the config file, but be sure not to change the part that instructs the LLM to generate the response in the specified JSON format.
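For reference, the JSON structure that the prompt asks the model to return mirrors the generated responses shown in the example session further below; a rough illustration (the values here are made up, and the exact schema should be taken from the shipped config rather than from this sketch):
{"Headers": {"Content-Type": "text/html", "Server": "Apache/2.4.41 (Ubuntu)", "Status": "200 OK"}, "Body": "<html>...</html>"}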
Note: Galah was a fun weekend project I created to evaluate the capabilities of LLMs in generating HTTP messages, and it is not intended for production use. The honeypot may be fingerprinted based on its response time, non-standard, or sometimes weird responses, and other network-based techniques. Use this tool at your own risk, and be sure to set usage limits for your OpenAI API.
Rule-Based Response: The new version of Galah will employ a dynamic, rule-based approach, adding more control over response generation. This will further reduce OpenAI API costs and increase the accuracy of the generated responses.
Response Database: It will enable you to generate and import a response database. This ensures the honeypot only turns to the OpenAI API for unknown or new requests. I'm also working on cleaning up and sharing my own database.
Support for Other LLMs.
Galah's settings, including the prompt and the cache duration, are defined in the config.yaml file.
% git clone git@github.com:0x4D31/galah.git
% cd galah
% go mod download
% go build
% ./galah -i en0 -v
██████ █████ ██ █████ ██ ██
██ ██ ██ ██ ██ ██ ██ ██
██ ███ ███████ ██ ███████ ███████
██ ██ ██ ██ ██ ██ ██ ██ ██
██████ ██ ██ ███████ ██ ██ ██ ██
llm-based web honeypot // version 1.0
author: Adel "0x4D31" Karimi
2024/01/01 04:29:10 Starting HTTP server on port 8080
2024/01/01 04:29:10 Starting HTTP server on port 8888
2024/01/01 04:29:10 Starting HTTPS server on port 8443 with TLS profile: profile1_selfsigned
2024/01/01 04:29:10 Starting HTTPS server on port 443 with TLS profile: profile1_selfsigned
2024/01/01 04:35:57 Received a request for "/.git/config" from [::1]:65434
2024/01/01 04:35:57 Request cache miss for "/.git/config": Not found in cache
2024/01/01 04:35:59 Generated HTTP response: {"Headers": {"Content-Type": "text/plain", "Server": "Apache/2.4.41 (Ubuntu)", "Status": "403 Forbidden"}, "Body": "Forbidden\nYou don't have permission to access this resource."}
2024/01/01 04:35:59 Sending the crafted response to [::1]:65434
^C2024/01/01 04:39:27 Received shutdown signal. Shutting down servers...
2024/01/01 04:39:27 All servers shut down gracefully.
Here are some example responses:
% curl http://localhost:8080/login.php
<!DOCTYPE html><html><head><title>Login Page</title></head><body><form action='/submit.php' method='post'><label for='uname'><b>Username:</b></label><br><input type='text' placeholder='Enter Username' name='uname' required><br><label for='psw'><b>Password:</b></label><br><input type='password' placeholder='Enter Password' name='psw' required><br><button type='submit'>Login</button></form></body></html>
JSON log record:
{"timestamp":"2024-01-01T05:38:08.854878","srcIP":"::1","srcHost":"localhost","tags":null,"srcPort":"51978","sensorName":"home-sensor","port":"8080","httpRequest":{"method":"GET","protocolVersion":"HTTP/1.1","request":"/login.php","userAgent":"curl/7.71.1","headers":"User-Agent: [curl/7.71.1], Accept: [*/*]","headersSorted":"Accept,User-Agent","headersSortedSha256":"cf69e186169279bd51769f29d122b07f1f9b7e51bf119c340b66fbd2a1128bc9","body":"","bodySha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},"httpResponse":{"headers":{"Content-Type":"text/html","Server":"Apache/2.4.38"},"body":"\u003c!DOCTYPE html\u003e\u003chtml\u003e\u003chead\u003e\u003ctitle\u003eLogin Page\u003c/title\u003e\u003c/head\u003e\u003cbody\u003e\u003cform action='/submit.php' method='post'\u003e\u003clabel for='uname'\u003e\u003cb\u003eUsername:\u003c/b\u003e\u003c/label\u003e\u003cbr\u003e\u003cinput type='text' placeholder='Enter Username' name='uname' required\u003e\u003cbr\u003e\u003clabel for='psw'\u003e\u003cb\u003ePassword:\u003c/b\u003e\u003c/label\u003e\u003cbr\u003e\u003cinput type='password' placeholder='Enter Password' name='psw' required\u003e\u003cbr\u003e\u003cbutton type='submit'\u003eLogin\u003c/button\u003e\u003c/form\u003e\u003c/body\u003e\u003c/html\u003e"}}
% curl http://localhost:8080/.aws/credentials
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
region = us-west-2
JSON log record:
{"timestamp":"2024-01-01T05:40:34.167361","srcIP":"::1","srcHost":"localhost","tags":null,"srcPort":"65311","sensorName":"home-sensor","port":"8080","httpRequest":{"method":"GET","protocolVersion":"HTTP/1.1","request":"/.aws/credentials","userAgent":"curl/7.71.1","headers":"User-Agent: [curl/7.71.1], Accept: [*/*]","headersSorted":"Accept,User-Agent","headersSortedSha256":"cf69e186169279bd51769f29d122b07f1f9b7e51bf119c340b66fbd2a1128bc9","body":"","bodySha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},"httpResponse":{"headers":{"Connection":"close","Content-Encoding":"gzip","Content-Length":"126","Content-Type":"text/plain","Server":"Apache/2.4.51 (Unix)"},"body":"[default]\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\naws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\nregion = us-west-2"}}
Okay, that was impressive!
Now, let's do some sort of adversarial testing!
% curl http://localhost:8888/are-you-a-honeypot
No, I am a server.
JSON log record:
{"timestamp":"2024-01-01T05:50:43.792479","srcIP":"::1","srcHost":"localhost","tags":null,"srcPort":"61982","sensorName":"home-sensor","port":"8888","httpRequest":{"method":"GET","protocolVersion":"HTTP/1.1","request":"/are-you-a-honeypot","userAgent":"curl/7.71.1","headers":"User-Agent: [curl/7.71.1], Accept: [*/*]","headersSorted":"Accept,User-Agent","headersSortedSha256":"cf69e186169279bd51769f29d122b07f1f9b7e51bf119c340b66fbd2a1128bc9","body":"","bodySha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},"httpResponse":{"headers":{"Connection":"close","Content-Length":"20","Content-Type":"text/plain","Server":"Apache/2.4.41 (Ubuntu)"},"body":"No, I am a server."}}
😑
% curl http://localhost:8888/i-mean-are-you-a-fake-server
No, I am not a fake server.
JSON log record:
{"timestamp":"2024-01-01T05:51:40.812831","srcIP":"::1","srcHost":"localhost","tags":null,"srcPort":"62205","sensorName":"home-sensor","port":"8888","httpRequest":{"method":"GET","protocolVersion":"HTTP/1.1","request":"/i-mean-are-you-a-fake-server","userAgent":"curl/7.71.1","headers":"User-Agent: [curl/7.71.1], Accept: [*/*]","headersSorted":"Accept,User-Agent","headersSortedSha256":"cf69e186169279bd51769f29d122b07f1f9b7e51bf119c340b66fbd2a1128bc9","body":"","bodySha256":"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},"httpResponse":{"headers":{"Connection":"close","Content-Type":"text/plain","Server":"LocalHost/1.0"},"body":"No, I am not a fake server."}}
You're a galah, mate!
Multi-cloud OSINT tool. Enumerate public resources in AWS, Azure, and Google Cloud.
Currently enumerates the following:
Amazon Web Services:
- Open / Protected S3 Buckets
- awsapps (WorkMail, WorkDocs, Connect, etc.)
Microsoft Azure:
- Storage Accounts
- Open Blob Storage Containers
- Hosted Databases
- Virtual Machines
- Web Apps
Google Cloud Platform:
- Open / Protected GCP Buckets
- Open / Protected Firebase Realtime Databases
- Google App Engine sites
- Cloud Functions (enumerates project/regions with existing functions, then brute forces actual function names)
- Open Firebase Apps
See it in action in Codingo's video demo here.
Several non-standard libraries are required to support threaded HTTP requests and DNS lookups. You'll need to install the requirements as follows:
pip3 install -r ./requirements.txt
The only required argument is at least one keyword. You can use the built-in fuzzing strings, but you will get better results if you supply your own with -m and/or -b.
You can provide multiple keywords by specifying the -k argument multiple times.
Keywords are mutated automatically using strings from enum_tools/fuzz.txt or a file you provide with the -m flag. Services that require a second level of brute forcing (Azure Containers and GCP Functions) will also use fuzz.txt by default or a file you provide with the -b flag.
Let's say you were researching "somecompany" whose website is "somecompany.io" that makes a product called "blockchaindoohickey". You could run the tool like this:
./cloud_enum.py -k somecompany -k somecompany.io -k blockchaindoohickey
HTTP scraping and DNS lookups use 5 threads each by default. You can try increasing this, but eventually the cloud providers will rate limit you. Here is an example to increase to 10.
./cloud_enum.py -k keyword -t 10
IMPORTANT: Some resources (Azure Containers, GCP Functions) are discovered per-region. To save time scanning, there is a "REGIONS" variable defined in cloudenum/azure_regions.py and cloudenum/gcp_regions.py that is set by default to use only one region. You may want to look at these files and edit them to be relevant to your own work.
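For example, limiting the Azure checks to two regions might look like the edit below (a hypothetical sketch: the README only says a REGIONS variable exists in that file, so treat the exact contents and region values as illustrative):
# cloudenum/azure_regions.py (trimmed for a targeted scan)
REGIONS = ['eastus', 'westeurope']  # only these regions will be checked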
Complete Usage Details
usage: cloud_enum.py [-h] -k KEYWORD [-m MUTATIONS] [-b BRUTE]
Multi-cloud enumeration utility. All hail OSINT!
optional arguments:
-h, --help show this help message and exit
-k KEYWORD, --keyword KEYWORD
Keyword. Can use argument multiple times.
-kf KEYFILE, --keyfile KEYFILE
Input file with a single keyword per line.
-m MUTATIONS, --mutations MUTATIONS
Mutations. Default: enum_tools/fuzz.txt
-b BRUTE, --brute BRUTE
List to brute-force Azure container names. Default: enum_tools/fuzz.txt
-t THREADS, --threads THREADS
Threads for HTTP brute-force. Default = 5
-ns NAMESERVER, --nameserver NAMESERVER
DNS server to use in brute-force.
-l LOGFILE, --logfile LOGFILE
Will APPEND found items to specified file.
-f FORMAT, --format FORMAT
Format for log file (text,json,csv - defaults to text)
--disable-aws Disable Amazon checks.
--disable-azure Disable Azure checks.
--disable-gcp Disable Google checks.
-qs, --quickscan Disable all mutations and second-level scans
So far, I have borrowed from: - Some of the permutations from GCPBucketBrute
Pentest Muse is an AI assistant tailored for cybersecurity professionals. It can help penetration testers brainstorm ideas, write payloads, analyze code, and perform reconnaissance. It can also take actions, execute command line codes, and iteratively solve complex tasks.
In addition to this command-line tool, we are excited to introduce the Pentest Muse Web Application! The web app has access to the latest online information, and would be a good AI assistant for your pentesting job.
This tool is intended for legal and ethical use only. It should only be used for authorized security testing and educational purposes. The developers assume no liability and are not responsible for any misuse or damage caused by this program.
Clone the repository and install the dependencies listed in requirements.txt:
git clone https://github.com/pentestmuse-ai/PentestMuse
cd PentestMuse
pip install -r requirements.txt
Install Pentest Muse as a Python Package:
pip install .
In the chat mode, you can chat with pentest muse and ask it to help you brainstorm ideas, write payloads, and analyze code. Run the application with:
python run_app.py
or
pmuse
You can also give Pentest Muse more control by asking it to take actions for you with the agent mode. In this mode, Pentest Muse can help you finish a simple task (e.g., 'help me do sql injection test on url xxx'). To start the program in agent mode, you can use:
python run_app.py agent
or
pmuse agent
You can use Pentest Muse with our managed APIs after signing up at www.pentestmuse.ai/signup. After creating an account, you can simply start the Pentest Muse CLI, and the program will prompt you to log in.
Alternatively, you can choose to use your own OpenAI API key. To do this, simply add the argument --openai-api-key=[your openai api key] when starting the program.
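For example (the key value below is a placeholder):
pmuse --openai-api-key=sk-xxxxxxxxxxxxxxxx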
For any feedback or suggestions regarding Pentest Muse, feel free to reach out to us at contact@pentestmuse.ai or join our discord. Your input is invaluable in helping us improve and evolve.
It’s not unusual for the data brokers behind people-search websites to use pseudonyms in their day-to-day lives (you would, too). Some of these personal data purveyors even try to reinvent their online identities in a bid to hide their conflicts of interest. But it’s not every day you run across a US-focused people-search network based in China whose principal owners all appear to be completely fabricated identities.
Responding to a reader inquiry concerning the trustworthiness of a site called TruePeopleSearch[.]net, KrebsOnSecurity began poking around. The site offers to sell reports containing photos, police records, background checks, civil judgments, contact information “and much more!” According to LinkedIn and numerous profiles on websites that accept paid article submissions, the founder of TruePeopleSearch is Marilyn Gaskell from Phoenix, Ariz.
The saucy yet studious LinkedIn profile for Marilyn Gaskell.
Ms. Gaskell has been quoted in multiple “articles” about random subjects, such as this article at HRDailyAdvisor about the pros and cons of joining a company-led fantasy football team.
“Marilyn Gaskell, founder of TruePeopleSearch, agrees that not everyone in the office is likely to be a football fan and might feel intimidated by joining a company league or left out if they don’t join; however, her company looked for ways to make the activity more inclusive,” this paid story notes.
Also quoted in this article is Sally Stevens, who is cited as HR Manager at FastPeopleSearch[.]io.
Sally Stevens, the phantom HR Manager for FastPeopleSearch.
“Fantasy football provides one way for employees to set aside work matters for some time and have fun,” Stevens contributed. “Employees can set a special league for themselves and regularly check and compare their scores against one another.”
Imagine that: Two different people-search companies mentioned in the same story about fantasy football. What are the odds?
Both TruePeopleSearch and FastPeopleSearch allow users to search for reports by first and last name, but proceeding to order a report prompts the visitor to purchase the file from one of several established people-finder services, including BeenVerified, Intelius, and Spokeo.
DomainTools.com shows that both TruePeopleSearch and FastPeopleSearch appeared around 2020 and were registered through Alibaba Cloud, in Beijing, China. No other information is available about these domains in their registration records, although both domains appear to use email servers based in China.
Sally Stevens’ LinkedIn profile photo is identical to a stock image titled “beautiful girl” from Adobe.com. Ms. Stevens is also quoted in a paid blog post at ecogreenequipment.com, as is Alina Clark, co-founder and marketing director of CocoDoc, an online service for editing and managing PDF documents.
The profile photo for Alina Clark is a stock photo appearing on more than 100 websites.
Scouring multiple image search sites reveals Ms. Clark’s profile photo on LinkedIn is another stock image that is currently on more than 100 different websites, including Adobe.com. Cocodoc[.]com was registered in June 2020 via Alibaba Cloud Beijing in China.
The same Alina Clark and photo materialized in a paid article at the website Ceoblognation, which in 2021 included her at #11 in a piece called “30 Entrepreneurs Describe The Big Hairy Audacious Goals (BHAGs) for Their Business.” It’s also worth noting that Ms. Clark is currently listed as a “former Forbes Council member” at the media outlet Forbes.com.
Entrepreneur #6 is Stephen Curry, who is quoted as CEO of CocoSign[.]com, a website that claims to offer an “easier, quicker, safer eSignature solution for small and medium-sized businesses.” Incidentally, the same photo for Stephen Curry #6 is also used in this “article” for #22 Jake Smith, who is named as the owner of a different company.
Stephen Curry, aka Jake Smith, aka no such person.
Mr. Curry’s LinkedIn profile shows a young man seated at a table in front of a laptop, but an online image search shows this is another stock photo. Cocosign[.]com was registered in June 2020 via Alibaba Cloud Beijing. No ownership details are available in the domain registration records.
Listed at #13 in that 30 Entrepreneurs article is Eden Cheng, who is cited as co-founder of PeopleFinderFree[.]com. KrebsOnSecurity could not find a LinkedIn profile for Ms. Cheng, but a search on her profile image from that Entrepreneurs article shows the same photo for sale at Shutterstock and other stock photo sites.
DomainTools says PeopleFinderFree was registered through Alibaba Cloud, Beijing. Attempts to purchase reports through PeopleFinderFree produce a notice saying the full report is only available via Spokeo.com.
Lynda Fairly is Entrepreneur #24, and she is quoted as co-founder of Numlooker[.]com, a domain registered in April 2021 through Alibaba in China. Searches for people on Numlooker forward visitors to Spokeo.
The photo next to Ms. Fairly’s quote in Entrepreneurs matches that of a LinkedIn profile for Lynda Fairly. But a search on that photo shows this same portrait has been used by many other identities and names, including a woman from the United Kingdom who’s a cancer survivor and mother of five; a licensed marriage and family therapist in Canada; a software security engineer at Quora; a journalist on Twitter/X; and a marketing expert in Canada.
Cocofinder[.]com is a people-search service that launched in Sept. 2019, through Alibaba in China. Cocofinder lists its market officer as Harriet Chan, but Ms. Chan’s LinkedIn profile is just as sparse on work history as the other people-search owners mentioned already. An image search online shows that outside of LinkedIn, the profile photo for Ms. Chan has only ever appeared in articles at pay-to-play media sites, like this one from outbackteambuilding.com.
Perhaps because Cocodoc and Cocosign both sell software services, they are actually tied to a physical presence in the real world — in Singapore (15 Scotts Rd. #03-12 15, Singapore). But it’s difficult to discern much from this address alone.
Who’s behind all this people-search chicanery? A January 2024 review of various people-search services at the website techjury.com states that Cocofinder is a wholly-owned subsidiary of a Chinese company called Shenzhen Duiyun Technology Co.
“Though it only finds results from the United States, users can choose between four main search methods,” Techjury explains. Those include people search, phone, address and email lookup. This claim is supported by a Reddit post from three years ago, wherein the Reddit user “ProtectionAdvanced” named the same Chinese company.
Is Shenzhen Duiyun Technology Co. responsible for all these phony profiles? How many more fake companies and profiles are connected to this scheme? KrebsOnSecurity found other examples that didn’t appear directly tied to other fake executives listed here, but which nevertheless are registered through Alibaba and seek to drive traffic to Spokeo and other data brokers. For example, there’s the winsome Daniela Sawyer, founder of FindPeopleFast[.]net, whose profile is flogged in paid stories at entrepreneur.org.
Google currently turns up nothing else in a search for Shenzhen Duiyun Technology Co. Please feel free to sound off in the comments if you have any more information about this entity, such as how to contact it. Or reach out directly at krebsonsecurity @ gmail.com.
A mind map highlighting the key points of research in this story. Click to enlarge. Image: KrebsOnSecurity.com
It appears the purpose of this network is to conceal the location of people in China who are seeking to generate affiliate commissions when someone visits one of their sites and purchases a people-search report at Spokeo, for example. And it is clear that Spokeo and others have created incentives wherein anyone can effectively white-label their reports, and thereby make money brokering access to peoples’ personal information.
Spokeo’s Wikipedia page says the company was founded in 2006 by four graduates from Stanford University. Spokeo co-founder and current CEO Harrison Tang has not yet responded to requests for comment.
Intelius is owned by San Diego based PeopleConnect Inc., which also owns Classmates.com, USSearch, TruthFinder and Instant Checkmate. PeopleConnect Inc. in turn is owned by H.I.G. Capital, a $60 billion private equity firm. Requests for comment were sent to H.I.G. Capital. This story will be updated if they respond.
BeenVerified is owned by a New York City based holding company called The Lifetime Value Co., a marketing and advertising firm whose brands include PeopleLooker, NeighborWho, Ownerly, PeopleSmart, NumberGuru, and Bumper, a car history site.
Ross Cohen, chief operating officer at The Lifetime Value Co., said it’s likely the network of suspicious people-finder sites was set up by an affiliate. Cohen said Lifetime Value would investigate to determine if this particular affiliate was driving them any sign-ups.
All of the above people-search services operate similarly. When you find the person you’re looking for, you are put through a lengthy (often 10-20 minute) series of splash screens that require you to agree that these reports won’t be used for employment screening or in evaluating new tenant applications. Still more prompts ask if you are okay with seeing “potentially shocking” details about the subject of the report, including arrest histories and photos.
Only at the end of this process does the site disclose that viewing the report in question requires signing up for a monthly subscription, which is typically priced around $35. Exactly how and from where these major people-search websites are getting their consumer data — and customers — will be the subject of further reporting here.
The main reason these various people-search sites require you to affirm that you won’t use their reports for hiring or vetting potential tenants is that selling reports for those purposes would classify these firms as consumer reporting agencies (CRAs) and expose them to regulations under the Fair Credit Reporting Act (FCRA).
These data brokers do not want to be treated as CRAs, and for this reason their people search reports typically don’t include detailed credit histories, financial information, or full Social Security Numbers (Radaris reports include the first six digits of one’s SSN).
But in September 2023, the U.S. Federal Trade Commission found that TruthFinder and Instant Checkmate were trying to have it both ways. The FTC levied a $5.8 million penalty against the companies for allegedly acting as CRAs because they assembled and compiled information on consumers into background reports that were marketed and sold for employment and tenant screening purposes.
The FTC also found TruthFinder and Instant Checkmate deceived users about background report accuracy. The FTC alleges these companies made millions from their monthly subscriptions using push notifications and marketing emails that claimed that the subject of a background report had a criminal or arrest record, when the record was merely a traffic ticket.
The FTC said both companies deceived customers by providing “Remove” and “Flag as Inaccurate” buttons that did not work as advertised. Rather, the “Remove” button removed the disputed information only from the report as displayed to that customer; however, the same item of information remained visible to other customers who searched for the same person.
The FTC also said that when a customer flagged an item in the background report as inaccurate, the companies never took any steps to investigate those claims, to modify the reports, or to flag to other customers that the information had been disputed.
There are a growing number of online reputation management companies that offer to help customers remove their personal information from people-search sites and data broker databases. There are, no doubt, plenty of honest and well-meaning companies operating in this space, but it has been my experience that a great many people involved in that industry have a background in marketing or advertising — not privacy.
Also, some so-called data privacy companies may be wolves in sheep’s clothing. On March 14, KrebsOnSecurity published an abundance of evidence indicating that the CEO and founder of the data privacy company OneRep.com was responsible for launching dozens of people-search services over the years.
Finally, some of the more popular people-search websites are notorious for ignoring requests from consumers seeking to remove their information, regardless of which reputation or removal service you use. Some force you to create an account and provide more information before you can remove your data. Even then, the information you worked hard to remove may simply reappear a few months later.
This aptly describes countless complaints lodged against the data broker and people search giant Radaris. On March 8, KrebsOnSecurity profiled the co-founders of Radaris, two Russian brothers in Massachusetts who also operate multiple Russian-language dating services and affiliate programs.
The truth is that these people-search companies will continue to thrive unless and until Congress begins to realize it’s time for some consumer privacy and data protection laws that are relevant to life in the 21st century. Duke University adjunct professor Justin Sherman says virtually all state privacy laws exempt records that might be considered “public” or “government” documents, including voting registries, property filings, marriage certificates, motor vehicle records, criminal records, court documents, death records, professional licenses, bankruptcy filings, and more.
“Consumer privacy laws in California, Colorado, Connecticut, Delaware, Indiana, Iowa, Montana, Oregon, Tennessee, Texas, Utah, and Virginia all contain highly similar or completely identical carve-outs for ‘publicly available information’ or government records,” Sherman said.
BloodHound is a monolithic web application composed of an embedded React frontend with Sigma.js and a Go-based REST API backend. It is deployed with a PostgreSQL application database and a Neo4j graph database, and is fed by the SharpHound and AzureHound data collectors.
BloodHound uses graph theory to reveal the hidden and often unintended relationships within an Active Directory or Azure environment. Attackers can use BloodHound to easily identify highly complex attack paths that would otherwise be impossible to identify quickly. Defenders can use BloodHound to identify and eliminate those same attack paths. Both blue and red teams can use BloodHound to easily gain a deeper understanding of privilege relationships in an Active Directory or Azure environment.
BloodHound CE is created and maintained by the BloodHound Enterprise Team. The original BloodHound was created by @_wald0, @CptJesus, and @harmj0y.
The easiest way to get up and running is to use our pre-configured Docker Compose setup. The following steps will get BloodHound CE up and running with the least amount of effort.
curl -L https://ghst.ly/getbhce | docker compose -f - up
Visit http://localhost:8080/ui/login and log in with the username admin and the randomly generated password from the logs.
NOTE: going forward, the default docker-compose.yml example binds only to localhost (127.0.0.1). If you want to access BloodHound outside of localhost, you'll need to follow the instructions in examples/docker-compose/README.md to configure the host binding for the container.
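Purely as an illustration of what such a host-binding change can look like in a Compose override (the service name and port mapping below are assumptions, not BloodHound's actual compose file; the supported procedure is the one documented in examples/docker-compose/README.md):
services:
  bloodhound:
    ports:
      - "0.0.0.0:8080:8080"  # expose the UI beyond localhost; restrict access appropriately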
# Verify if Docker Engine is Running
docker info
# Attempt to stop Neo4j Service if running (on Windows)
Stop-Service "Neo4j" -ErrorAction SilentlyContinue
https://github.com/SpecterOps/BloodHound/assets/12970156/ea9dc042-1866-4ccb-9839-933140cc38b9
Please check out the Contact page in our wiki for details on how to reach out with questions and suggestions.
Microsoft Corp. today pushed software updates to plug more than 70 security holes in its Windows operating systems and related products, including two zero-day vulnerabilities that are already being exploited in active attacks.
Top of the heap on this Fat Patch Tuesday is CVE-2024-21412, a “security feature bypass” in the way Windows handles Internet Shortcut Files that Microsoft says is being targeted in active exploits. Redmond’s advisory for this bug says an attacker would need to convince or trick a user into opening a malicious shortcut file.
Researchers at Trend Micro have tied the ongoing exploitation of CVE-2024-21412 to an advanced persistent threat group dubbed “Water Hydra,” which they say has been using the vulnerability to execute a malicious Microsoft Installer File (.msi) that in turn unloads a remote access trojan (RAT) onto infected Windows systems.
The other zero-day flaw is CVE-2024-21351, another security feature bypass — this one in the built-in Windows SmartScreen component that tries to screen out potentially malicious files downloaded from the Web. Kevin Breen at Immersive Labs says it’s important to note that this vulnerability alone is not enough for an attacker to compromise a user’s workstation, and instead would likely be used in conjunction with something like a spear phishing attack that delivers a malicious file.
Satnam Narang, senior staff research engineer at Tenable, said this is the fifth vulnerability in Windows SmartScreen patched since 2022 and all five have been exploited in the wild as zero-days. They include CVE-2022-44698 in December 2022, CVE-2023-24880 in March 2023, CVE-2023-32049 in July 2023 and CVE-2023-36025 in November 2023.
Narang called special attention to CVE-2024-21410, an “elevation of privilege” bug in Microsoft Exchange Server that Microsoft says is likely to be exploited by attackers. Attacks on this flaw would lead to the disclosure of NTLM hashes, which could be leveraged as part of an NTLM relay or “pass the hash” attack, which lets an attacker masquerade as a legitimate user without ever having to log in.
“We know that flaws that can disclose sensitive information like NTLM hashes are very valuable to attackers,” Narang said. “A Russian-based threat actor leveraged a similar vulnerability to carry out attacks – CVE-2023-23397 is an Elevation of Privilege vulnerability in Microsoft Outlook patched in March 2023.”
Microsoft notes that prior to its Exchange Server 2019 Cumulative Update 14 (CU14), a security feature called Extended Protection for Authentication (EPA), which provides NTLM credential relay protections, was not enabled by default.
“Going forward, CU14 enables this by default on Exchange servers, which is why it is important to upgrade,” Narang said.
Rapid7’s lead software engineer Adam Barnett highlighted CVE-2024-21413, a critical remote code execution bug in Microsoft Office that could be exploited just by viewing a specially-crafted message in the Outlook Preview pane.
“Microsoft Office typically shields users from a variety of attacks by opening files with Mark of the Web in Protected View, which means Office will render the document without fetching potentially malicious external resources,” Barnett said. “CVE-2024-21413 is a critical RCE vulnerability in Office which allows an attacker to cause a file to open in editing mode as though the user had agreed to trust the file.”
Barnett stressed that administrators responsible for Office 2016 installations who apply patches outside of Microsoft Update should note the advisory lists no fewer than five separate patches which must be installed to achieve remediation of CVE-2024-21413; individual update knowledge base (KB) articles further note that partially-patched Office installations will be blocked from starting until the correct combination of patches has been installed.
It’s a good idea for Windows end-users to stay current with security updates from Microsoft, which can quickly pile up otherwise. That doesn’t mean you have to install them on Patch Tuesday. Indeed, waiting a day or three before updating is a sane response, given that sometimes updates go awry and usually within a few days Microsoft has fixed any issues with its patches. It’s also smart to back up your data and/or image your Windows drive before applying new updates.
For a more detailed breakdown of the individual flaws addressed by Microsoft today, check out the SANS Internet Storm Center’s list. For those admins responsible for maintaining larger Windows environments, it often pays to keep an eye on Askwoody.com, which frequently points out when specific Microsoft updates are creating problems for a number of users.
gssapi-abuse was released as part of my DEF CON 31 talk. A full write up on the abuse vector can be found here: A Broken Marriage: Abusing Mixed Vendor Kerberos Stacks
The tool has two features. The first is the ability to enumerate non Windows hosts that are joined to Active Directory that offer GSSAPI authentication over SSH.
The second feature is the ability to perform dynamic DNS updates for GSSAPI abusable hosts that do not have the correct forward and/or reverse lookup DNS entries. GSSAPI based authentication is strict when it comes to matching service principals, therefore DNS entries should match the service principal name both by hostname and IP address.
gssapi-abuse requires a working krb5 stack along with a correctly configured krb5.conf.
On Windows hosts, the MIT Kerberos software should be installed in addition to the Python modules listed in requirements.txt; it can be obtained from the MIT Kerberos Distribution Page. The Windows krb5.conf can be found at C:\ProgramData\MIT\Kerberos5\krb5.conf.
On Linux, the libkrb5-dev package needs to be installed prior to installing the Python requirements.
Once the requirements are satisfied, you can install the Python dependencies via the pip/pip3 tool:
pip install -r requirements.txt
The enumeration mode will connect to Active Directory and perform an LDAP search for all computers that do not have the word Windows within the Operating System attribute.
Once the list of non Windows machines has been obtained, gssapi-abuse will then attempt to connect to each host over SSH and determine if GSSAPI based authentication is permitted.
python .\gssapi-abuse.py -d ad.ginge.com enum -u john.doe -p SuperSecret!
[=] Found 2 non Windows machines registered within AD
[!] Host ubuntu.ad.ginge.com does not have GSSAPI enabled over SSH, ignoring
[+] Host centos.ad.ginge.com has GSSAPI enabled over SSH
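As a rough illustration of that first step (this is not gssapi-abuse's actual code), the LDAP query described above could be reproduced with the ldap3 library; the domain controller, base DN and credentials below are placeholders reusing the example values:
# Minimal sketch: list computer objects whose operatingSystem attribute does not mention Windows
from ldap3 import Server, Connection, NTLM, SUBTREE

server = Server("dc1.ad.ginge.com")  # placeholder domain controller
conn = Connection(server, user="AD\\john.doe", password="SuperSecret!",
                  authentication=NTLM, auto_bind=True)
conn.search(
    search_base="DC=ad,DC=ginge,DC=com",
    search_filter="(&(objectClass=computer)(!(operatingSystem=*Windows*)))",
    search_scope=SUBTREE,
    attributes=["dNSHostName", "operatingSystem"],
)
for entry in conn.entries:
    print(entry.dNSHostName, entry.operatingSystem)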
DNS mode utilises Kerberos and dnspython to perform an authenticated DNS update over port 53 using the DNS-TSIG protocol. Currently, dns mode relies on a working krb5 configuration with a valid TGT or DNS service ticket targeting a specific domain controller, e.g. DNS/dc1.victim.local.
Adding a DNS A record for host ahost.ad.ginge.com:
python .\gssapi-abuse.py -d ad.ginge.com dns -t ahost -a add --type A --data 192.168.128.50
[+] Successfully authenticated to DNS server win-af8ki8e5414.ad.ginge.com
[=] Adding A record for target ahost using data 192.168.128.50
[+] Applied 1 updates successfully
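Under the hood this is a standard dynamic DNS update; a stripped-down dnspython sketch of the same A-record update is shown below, deliberately omitting the Kerberos/GSS-TSIG signing that gssapi-abuse performs (the server IP and names reuse the example values above):
import dns.query
import dns.rcode
import dns.update

update = dns.update.Update("ad.ginge.com")         # zone to update
update.add("ahost", 300, "A", "192.168.128.50")    # relative name, TTL, record type, data
response = dns.query.tcp(update, "192.168.128.1")  # send the update to the DNS server (the DC)
print(dns.rcode.to_text(response.rcode()))         # NOERROR indicates the update was applied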
Adding a reverse PTR record for host ahost.ad.ginge.com. Notice that the data argument is terminated with a "."; this is important, because otherwise the record becomes a relative record within the zone, which we do not want. We also need to specify the target zone to update, since PTR records are stored in different zones to A records.
python .\gssapi-abuse.py -d ad.ginge.com dns --zone 128.168.192.in-addr.arpa -t 50 -a add --type PTR --data ahost.ad.ginge.com.
[+] Successfully authenticated to DNS server win-af8ki8e5414.ad.ginge.com
[=] Adding PTR record for target 50 using data ahost.ad.ginge.com.
[+] Applied 1 updates successfully
Forward and reverse DNS lookup results after execution
nslookup ahost.ad.ginge.com
Server: WIN-AF8KI8E5414.ad.ginge.com
Address: 192.168.128.1
Name: ahost.ad.ginge.com
Address: 192.168.128.50
nslookup 192.168.128.50
Server: WIN-AF8KI8E5414.ad.ginge.com
Address: 192.168.128.1
Name: ahost.ad.ginge.com
Address: 192.168.128.50
Pantheon is a GUI application that allows users to display information regarding network cameras in various countries as well as an integrated live-feed for non-protected cameras.
Pantheon allows users to execute an API crawler. There was originally functionality without the use of any APIs (like Insecam), but Google's TOS kept getting in the way of the original scraping mechanism.
git clone https://github.com/josh0xA/Pantheon.git
cd Pantheon
pip3 install -r requirements.txt
python3 pantheon.py
chmod +x distros/ubuntu_install.sh
./distros/ubuntu_install.sh
chmod +x distros/debian-kali_install.sh
./distros/debian-kali_install.sh
(Enter) on a selected IP:Port to establish a Pantheon webview of the camera. (Use this at your own risk)
(Left-click) on a selected IP:Port to view the geolocation of the camera.
(Right-click) on a selected IP:Port to view the HTTP data of the camera (Ctrl+Left-click for Mac).
Adjust the map as you please to see the markers.
The developer of this program, Josh Schiavone, is not responsible for misuse of this data gathering tool. Pantheon simply provides information that can be indexed by any modern search engine. Do not try to establish unauthorized access to live feeds that are password protected - that is illegal. Furthermore, if you do choose to use Pantheon to view a live-feed, do so at your own risk. Pantheon was developed for educational purposes only. For further information, please visit: https://joshschiavone.com/panth_info/panth_ethical_notice.html
MIT License
Copyright (c) Josh Schiavone
APIDetector is a powerful and efficient tool designed for testing exposed Swagger endpoints in various subdomains with unique smart capabilities to detect false-positives. It's particularly useful for security professionals and developers who are engaged in API testing and vulnerability scanning.
Before running APIDetector, ensure you have Python 3.x and pip installed on your system. You can download Python here.
Clone the APIDetector repository to your local machine using:
git clone https://github.com/brinhosa/apidetector.git
cd apidetector
pip install requests
Run APIDetector using the command line. Here are some usage examples:
Common usage: scan a list of subdomains with 30 threads, using a Chrome user-agent, and save the results in a file:
python apidetector.py -i list_of_company_subdomains.txt -o results_file.txt -t 30 -ua "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"
To scan a single domain:
python apidetector.py -d example.com
To scan multiple domains from a file:
python apidetector.py -i input_file.txt
To specify an output file:
python apidetector.py -i input_file.txt -o output_file.txt
To use a specific number of threads:
python apidetector.py -i input_file.txt -t 20
To scan with both HTTP and HTTPS protocols:
python apidetector.py -m -d example.com
To run the script in quiet mode (suppress verbose output):
python apidetector.py -q -d example.com
To run the script with a custom user-agent:
python apidetector.py -d example.com -ua "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"
-d, --domain: Single domain to test.
-i, --input: Input file containing subdomains to test.
-o, --output: Output file to write valid URLs to.
-t, --threads: Number of threads to use for scanning (default is 10).
-m, --mixed-mode: Test both HTTP and HTTPS protocols.
-q, --quiet: Disable verbose output (default mode is verbose).
-ua, --user-agent: Custom User-Agent string for requests.
Exposing Swagger or OpenAPI documentation endpoints can present various risks, primarily related to information disclosure. Below is the list of endpoints APIDetector scans, ordered by potential risk level and with similar endpoints grouped together:
- '/swagger-ui.html', '/swagger-ui/', '/swagger-ui/index.html', '/api/swagger-ui.html', '/documentation/swagger-ui.html', '/swagger/index.html', '/api/docs', '/docs', '/api/swagger-ui', '/documentation/swagger-ui'
- '/openapi.json', '/swagger.json', '/api/swagger.json', '/swagger.yaml', '/swagger.yml', '/api/swagger.yaml', '/api/swagger.yml', '/api.json', '/api.yaml', '/api.yml', '/documentation/swagger.json', '/documentation/swagger.yaml', '/documentation/swagger.yml'
- '/v2/api-docs', '/v3/api-docs', '/api/v2/swagger.json', '/api/v3/swagger.json', '/api/v1/documentation', '/api/v2/documentation', '/api/v3/documentation', '/api/v1/api-docs', '/api/v2/api-docs', '/api/v3/api-docs', '/swagger/v2/api-docs', '/swagger/v3/api-docs', '/swagger-ui.html/v2/api-docs', '/swagger-ui.html/v3/api-docs', '/api/swagger/v2/api-docs', '/api/swagger/v3/api-docs'
- '/swagger-resources', '/swagger-resources/configuration/ui', '/swagger-resources/configuration/security', '/api/swagger-resources', '/api.html'
Contributions to APIDetector are welcome! Feel free to fork the repository, make changes, and submit pull requests.
The use of APIDetector should be limited to testing and educational purposes only. The developers of APIDetector assume no liability and are not responsible for any misuse or damage caused by this tool. It is the end user's responsibility to obey all applicable local, state, and federal laws. Developers assume no responsibility for unauthorized or illegal use of this tool. Before using APIDetector, ensure you have permission to test the network or systems you intend to scan.
This project is licensed under the MIT License.
Microsoft today issued security updates for more than 100 newly-discovered vulnerabilities in its Windows operating system and related software, including four flaws that are already being exploited. In addition, Apple recently released emergency updates to quash a pair of zero-day bugs in iOS.
Apple last week shipped emergency updates in iOS 17.0.3 and iPadOS 17.0.3 in response to active attacks. The patch fixes CVE-2023-42724, which attackers have been using in targeted attacks to elevate their access on a local device.
Apple said it also patched CVE-2023-5217, which is not listed as a zero-day bug. However, as Bleeping Computer pointed out, this flaw is caused by a weakness in the open-source “libvpx” video codec library, which was previously patched as a zero-day flaw by Google in the Chrome browser and by Microsoft in Edge, Teams, and Skype products. For anyone keeping count, this is the 17th zero-day flaw that Apple has patched so far this year.
Fortunately, the zero-days affecting Microsoft customers this month are somewhat less severe than usual, with the exception of CVE-2023-44487. This weakness is not specific to Windows but instead exists within the HTTP/2 protocol used by the World Wide Web: Attackers have figured out how to use a feature of HTTP/2 to massively increase the size of distributed denial-of-service (DDoS) attacks, and these monster attacks reportedly have been going on for several weeks now.
Amazon, Cloudflare and Google all released advisories today about how they’re addressing CVE-2023-44487 in their cloud environments. Google’s Damian Menscher wrote on Twitter/X that the exploit — dubbed a “rapid reset attack” — works by sending a request and then immediately cancelling it (a feature of HTTP/2). “This lets attackers skip waiting for responses, resulting in a more efficient attack,” Menscher explained.
Natalie Silva, lead security engineer at Immersive Labs, said this flaw’s impact on enterprise customers could be significant, and could lead to prolonged downtime.
“It is crucial for organizations to apply the latest patches and updates from their web server vendors to mitigate this vulnerability and protect against such attacks,” Silva said. “In this month’s Patch Tuesday release by Microsoft, they have released both an update to this vulnerability, as well as a temporary workaround should you not be able to patch immediately.”
Microsoft also patched zero-day bugs in Skype for Business (CVE-2023-41763) and Wordpad (CVE-2023-36563). The latter vulnerability could expose NTLM hashes, which are used for authentication in Windows environments.
“It may or may not be a coincidence that Microsoft announced last month that WordPad is no longer being updated, and will be removed in a future version of Windows, although no specific timeline has yet been given,” said Adam Barnett, lead software engineer at Rapid7. “Unsurprisingly, Microsoft recommends Word as a replacement for WordPad.”
Other notable bugs addressed by Microsoft include CVE-2023-35349, a remote code execution weakness in the Message Queuing (MSMQ) service, a technology that allows applications across multiple servers or hosts to communicate with each other. This vulnerability has earned a CVSS severity score of 9.8 (10 is the worst possible). Happily, the MSMQ service is not enabled by default in Windows, although Immersive Labs notes that Microsoft Exchange Server can enable this service during installation.
Speaking of Exchange, Microsoft also patched CVE-2023-36778, a vulnerability in all current versions of Exchange Server that could allow attackers to run code of their choosing. Rapid7’s Barnett said successful exploitation requires that the attacker be on the same network as the Exchange Server host, and use valid credentials for an Exchange user in a PowerShell session.
For a more detailed breakdown on the updates released today, see the SANS Internet Storm Center roundup. If today’s updates cause any stability or usability issues in Windows, AskWoody.com will likely have the lowdown on that.
Please consider backing up your data and/or imaging your system before applying any updates. And feel free to sound off in the comments if you experience any difficulties as a result of these patches.
dynmx (spoken dynamics) is a signature-based detection approach for behavioural malware features based on Windows API call sequences. In a simplified way, you can think of dynmx as a sort of YARA for API call traces (so called function logs) originating from malware sandboxes. Hence, the data basis for the detection approach are not the malware samples themselves which are analyzed statically but data that is generated during a dynamic analysis of the malware sample in a malware sandbox. Currently, dynmx supports function logs of the following malware sandboxes:
- VMRay
- CAPEv2 (report.json file)
- Cuckoo (report.json file)
The detection approach is described in detail in the master thesis Signature-Based Detection of Behavioural Malware Features with Windows API Calls. This project is the prototype implementation of this approach and was developed in the course of the master thesis. The signatures are manually defined by malware analysts in the dynmx signature DSL and can be detected in function logs with the help of this tool. Features and syntax of the dynmx signature DSL can also be found in the master thesis. Furthermore, you can find sample dynmx signatures in the repository dynmx-signatures. In addition to detecting malware features based on API calls, dynmx can extract OS resources that are used by the malware (a so-called Access Activity Model). These resources are extracted by examining the API calls and reconstructing operations on OS resources. Currently, OS resources of the categories filesystem, registry and network are considered in the model.
In the following section, examples are shown for the detection of malware features and for the extraction of resources.
For this example, we choose the malware sample with the SHA-256 hash sum c0832b1008aa0fc828654f9762e37bda019080cbdd92bd2453a05cfb3b79abb3. According to MalwareBazaar, the sample belongs to the malware family Amadey. There is a public VMRay analysis report of this sample available which also provides the function log traced by VMRay. This function log will be our data basis which we will use for the detection.
If we would like to know if the malware sample uses an injection technique called Process Hollowing, we can try to detect the following dynmx signature in the function log.
dynmx_signature:
meta:
name: process_hollow
title: Process Hollowing
description: Detection of Process hollowing malware feature
detection:
proc_hollow:
# Create legit process in suspended mode
- api_call: ["CreateProcess[AW]", "CreateProcessInternal[AW]"]
with:
- argument: "dwCreationFlags"
operation: "flag is set"
value: 0x4
- return_value: "return"
operation: "is not"
value: 0
store:
- name: "hProcess"
as: "proc_handle"
- name: "hThread"
as: "thread_handle"
# Injection of malicious code into memory of previously created process
- variant:
- path:
# Allocate memory with read, write, execute permission
- api_call: ["VirtualAllocE x", "VirtualAlloc", "(Nt|Zw)AllocateVirtualMemory"]
with:
- argument: ["hProcess", "ProcessHandle"]
operation: "is"
value: "$(proc_handle)"
- argument: ["flProtect", "Protect"]
operation: "is"
value: 0x40
- api_call: ["WriteProcessMemory"]
with:
- argument: "hProcess"
operation: "is"
value: "$(proc_handle)"
- api_call: ["SetThreadContext", "(Nt|Zw)SetContextThread"]
with:
- argument: "hThread"
operation: "is"
value: "$(thread_handle)"
- path:
# Map memory section with read, write, execute permission
- api_call: "(Nt|Zw)MapViewOfSection"
with:
- argument: "ProcessHandle"
operation: "is"
value: "$(proc_handle)"
- argument: "AccessProtection"
operation: "is"
value: 0x40
# Resume thread to run injected malicious code
- api_call: ["ResumeThread", "(Nt|Zw)ResumeThread"]
with:
- argument: ["hThread", "ThreadHandle"]
operation: "is"
value: "$(thread_handle)"
condition: proc_hollow as sequence
Based on the signature, we can see some of the DSL features that make dynmx powerful:
- alternative API call names in a single detection step (e.g. CreateProcess[AW] or the (Nt|Zw) native call variants)
- storing argument and return values in variables (store/as) and referencing them later (e.g. $(proc_handle))
- alternative execution paths via variant and path blocks
- combining detection steps with logical operators (AND, OR, NOT)
If we run dynmx with the signature shown above against the function log of the sample c0832b1008aa0fc828654f9762e37bda019080cbdd92bd2453a05cfb3b79abb3, we get the following output indicating that the signature was detected.
$ python3 dynmx.py detect -i 601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9.json -s process_hollow.yml
|
__| _ _ _ _ _
/ | | | / |/ | / |/ |/ | /\/
\_/|_/ \_/|/ | |_/ | | |_/ /\_/
/|
\|
Ver. 0.5 (PoC), by 0x534a
[+] Parsing 1 function log(s)
[+] Loaded 1 dynmx signature(s)
[+] Starting detection process with 1 worker(s). This probably takes some time...
[+] Result
process_hollow c0832b1008aa0fc828654f9762e37bda019080cbdd92bd2453a05cfb3b79abb3.txt
We can get into more detail by setting the output format to detail. Now, we can see the exact API call sequence that was detected in the function log. Furthermore, we can see that the signature was detected in the process 51f0.exe.
$ python3 dynmx.py -f detail detect -i 601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9.json -s process_hollow.yml
|
__| _ _ _ _ _
/ | | | / |/ | / |/ |/ | /\/
\_/|_/ \_/|/ | |_/ | | |_/ /\_/
/|
\|
Ver. 0.5 (PoC), by 0x534a
[+] Parsing 1 function log(s)
[+] Loaded 1 dynmx signature(s)
[+] Starting detection process with 1 worker(s). This probably takes some time...
[+] Result
Function log: c0832b1008aa0fc828654f9762e37bda019080cbdd92bd2453a05cfb3b79abb3.txt
Signature: process_hollow
Process: 51f0.exe (PID: 3768)
Number of Findings: 1
Finding 0
proc_hollow : API Call CreateProcessA (Function log line 20560, index 938)
proc_hollow : API Call VirtualAllocEx (Function log line 20566, index 944)
proc_hollow : API Call WriteProcessMemory (Function log line 20573, index 951)
proc_hollow : API Call SetThreadContext (Function log line 20574, index 952)
proc_hollow : API Call ResumeThread (Function log line 20575, index 953)
In order to extract the accessed OS resources from a function log, we can simply run the dynmx command resources against the function log. An example of the detailed output is shown below for the sample with the SHA-256 hash sum 601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9. This is a CAPE sandbox report which is part of the Avast-CTU Public CAPEv2 Dataset.
$ python3 dynmx.py -f detail resources --input 601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9.json
|
__| _ _ _ _ _
/ | | | / |/ | / |/ |/ | /\/
\_/|_/ \_/|/ | |_/ | | |_/ /\_/
/|
\|
Ver. 0.5 (PoC), by 0x534a
[+] Parsing 1 function log(s)
[+] Processing function log(s) with the command 'resources'...
[+] Result
Function log: 601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9.json (/Users/sijansen/Documents/dev/dynmx_flogs/cape/Public_Avast_CTU_CAPEv2_Dataset_Full/extracted/601941f00b194587c9e57c5fabaf1ef11596179bea007df9bdcdaa10f162cac9.json)
Process: 601941F00B194587C9E5.exe (PID: 2008)
Filesystem:
C:\Windows\SysWOW64\en-US\SETUPAPI.dll.mui (CREATE)
API-MS-Win-Core-LocalRegistry-L1-1-0.dll (EXECUTE)
C:\Windows\SysWOW64\ntdll.dll (READ)
USER32.dll (EXECUTE)
KERNEL32.dll (EXECUTE)
C:\Windows\Globalization\Sorting\sortdefault.nls (CREATE)
Registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\OLEAUT (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup\SourcePath (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\DevicePath (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings\DisableImprovedZoneCheck (READ)
HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\CurrentVersion\Internet Settings (READ)
HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\CurrentVersion\Internet Settings\Security_HKLM_only (READ)
Process: 601941F00B194587C9E5.exe (PID: 1800)
Filesystem:
C:\Windows\SysWOW64\en-US\SETUPAPI.dll.mui (CREATE)
API-MS-Win-Core-LocalRegistry-L1-1-0.dll (EXECUTE)
C:\Windows\SysWOW64\ntdll.dll (READ)
USER32.dll (EXECUTE)
KERNEL32.dll (EXECUTE)
[...]
C:\Users\comp\AppData\Local\vscmouse (READ)
C:\Users\comp\AppData\Local\vscmouse\vscmouse.exe:Zone.Identifier (DELETE)
Registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\OLEAUT (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup (READ)
[...]
Process: vscmouse.exe (PID: 900)
Filesystem:
C:\Windows\SysWOW64\en-US\SETUPAPI.dll.mui (CREATE)
API-MS-Win-Core-LocalRegistry-L1-1-0.dll (EXECUTE)
C:\Windows\SysWOW64\ntdll.dll (READ)
USER32.dll (EXECUTE)
KERNEL32.dll (EXECUTE)
C:\Windows\Globalization\Sorting\sortdefault.nls (CREATE)
Registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\OLEAUT (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup\SourcePath (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\DevicePath (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings\DisableImprovedZoneCheck (READ)
HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\CurrentVersion\Internet Settings (READ)
HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\CurrentVersion\Internet Settings\Security_HKLM_only (READ)
Process: vscmouse.exe (PID: 3036)
Filesystem:
C:\Windows\SysWOW64\en-US\SETUPAPI.dll.mui (CREATE)
API-MS-Win-Core-LocalRegistry-L1-1-0.dll (EXECUTE)
C:\Windows\SysWOW64\ntdll.dll (READ)
USER32.dll (EXECUTE)
KERNEL32.dll (EXECUTE)
C:\Windows\Globalization\Sorting\sortdefault.nls (CREATE)
C:\ (READ)
C:\Windows\System32\uxtheme.dll (EXECUTE)
dwmapi.dll (EXECUTE)
advapi32.dll (EXECUTE)
shell32.dll (EXECUTE)
C:\Users\comp\AppData\Local\vscmouse\vscmouse.exe (CREATE,READ)
C:\Users\comp\AppData\Local\iproppass\iproppass.exe (DELETE)
crypt32.dll (EXECUTE)
urlmon.dll (EXECUTE)
userenv.dll (EXECUTE)
wininet.dll (EXECUTE)
wtsapi32.dll (EXECUTE)
CRYPTSP.dll (EXECUTE)
CRYPTBASE.dll (EXECUTE)
ole32.dll (EXECUTE)
OLEAUT32.dll (EXECUTE)
C:\Windows\SysWOW64\oleaut32.dll (EXECUTE)
IPHLPAPI.DLL (EXECUTE)
DHCPCSVC.DLL (EXECUTE)
C:\Users\comp\AppData\Roaming\Microsoft\Network\Connections\Pbk\_hiddenPbk\ (CREATE)
C:\Users\comp\AppData\Roaming\Microsoft\Network\Connections\Pbk\_hiddenPbk\rasphone.pbk (CREATE,READ)
Registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\OLEAUT (READ)
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Setup (READ)
[...]
Network:
24.151.31.150:465 (READ)
http://24.151.31.150:465 (READ,WRITE)
107.10.49.252:80 (READ)
http://107.10.49.252:80 (READ,WRITE)
Based on the shown output and the accessed resources, we can deduce some malware features:
In the process 601941F00B194587C9E5.exe (PID 1800), the Zone Identifier of the file C:\Users\comp\AppData\Local\vscmouse\vscmouse.exe is deleted.
vscmouse.exe (PID: 3036) connects to the network endpoints http://24.151.31.150:465 and http://107.10.49.252:80.
The accessed resources are interesting for identifying host- and network-based detection indicators. In addition, resources can be used in dynmx signatures. A popular example is the detection of persistence mechanisms in the Registry.
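As a quick triage shortcut (and not a replacement for a proper dynmx signature), the detail output of the resources command can simply be grepped for well-known Run-key persistence locations. This is only a hedged example of working with the shown output; flog.json is a placeholder for your function log.
# grep the resource listing for common Registry run-key persistence locations
python3 dynmx.py -f detail resources -i flog.json | grep -iE 'CurrentVersion\\(Run|RunOnce)'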
In order to use the software, Python 3.9 must be available on the target system. In addition, the following Python packages need to be installed: anytree, lxml, pyparsing, PyYAML, six and stringcase.
To install the packages, run the pip3 command shown below. It is recommended to use a Python virtual environment instead of installing the packages system-wide.
pip3 install -r requirements.txt
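For example, the recommended virtual environment can be set up as follows; this is a minimal sketch assuming a Unix-like shell and that requirements.txt sits in the repository root.
# create and activate an isolated environment, then install the requirements into it
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt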
To use the prototype, simply run the main entry point dynmx.py. The usage information can be viewed with the -h command line parameter as shown below.
$ python3 dynmx.py -h
usage: dynmx.py [-h] [--format {overview,detail}] [--show-log] [--log LOG] [--log-level {debug,info,error}] [--worker N] {detect,check,convert,stats,resources} ...
Detect dynmx signatures in dynamic program execution information (function logs)
optional arguments:
-h, --help show this help message and exit
--format {overview,detail}, -f {overview,detail}
Output format
--show-log Show all log output on stdout
--log LOG, -l LOG log file
--log-level {debug,info,error}
Log level (default: info)
--worker N, -w N Number of workers to spawn (default: number of processors - 2)
sub-commands:
task to perform
{detect,check,convert,stats,resources}
detect Detects a dynmx signature
check Checks the syntax of dynmx signature(s)
convert Converts function logs to the dynmx generic function log format
stats Statistics of function logs
resources Resource activity derived from function log
In general, as shown in the output, several command line parameters regarding the log handling, the output format for results or multiprocessing can be defined. Furthermore, a command needs to be chosen to run a specific task. Please note that the number of workers only affects commands that make use of multiprocessing. Currently, these are the commands detect and convert.
The commands have specific command line parameters that can be explored by giving the parameter -h to the command, e.g. for the detect command as shown below.
$ python3 dynmx.py detect -h
usage: dynmx.py detect [-h] --sig SIG [SIG ...] --input INPUT [INPUT ...] [--recursive] [--json-result JSON_RESULT] [--runtime-result RUNTIME_RESULT] [--detect-all]
optional arguments:
-h, --help show this help message and exit
--recursive, -r Search for input files recursively
--json-result JSON_RESULT
JSON formatted result file
--runtime-result RUNTIME_RESULT
Runtime statistics file formatted in CSV
--detect-all Detect signature in all processes and do not stop after the first detection
required arguments:
--sig SIG [SIG ...], -s SIG [SIG ...]
dynmx signature(s) to detect
--input INPUT [INPUT ...], -i INPUT [INPUT ...]
Input files
As a user of dynmx, you can decide how the output is structured. If you choose to show the log on the console by defining the parameter --show-log, the output consists of two sections (see listing below). The log is shown first and afterwards the results of the used command. By default, the log is neither shown in the console nor written to a log file (which can be defined using the --log parameter). Due to multiprocessing, the entries in the log file are not necessarily in chronological order.
|
__| _ _ _ _ _
/ | | | / |/ | / |/ |/ | /\/
\_/|_/ \_/|/ | |_/ | | |_/ /\_/
/|
\|
Ver. 0.5 (PoC), by 0x534a
[+] Log output
2023-06-27 19:07:38,068+0000 [INFO] (__main__) [PID: 13315] []: Start of dynmx run
[...]
[+] End of log output
[+] Result
[...]
The level of detail of the result output can be defined using the command line parameter --format, which can be set to overview for a high-level result or to detail for a detailed result. For example, if you set the output format to detail, detection results shown in the console will contain the exact API calls and resources that caused the detection. The overview output format will just indicate what signature was detected in which function log.
Detection of a dynmx signature in a function log with one worker process
python3 dynmx.py -w 1 detect -i "flog.txt" -s dynmx_signature.yml
Conversion of a function log to the dynmx generic function log format
python3 dynmx.py convert -i "flog.txt" -o /tmp/
Check a signature (only basic sanity checks)
python3 dynmx.py check -s dynmx_signature.yml
Get a detailed list of resources used by a malware sample based on the function log (access activity model)
python3 dynmx.py -f detail resources -i "flog.txt"
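Putting several of the detect options from the help output above together, a batch run might look like the sketch below. The signature names and paths are placeholders, and it assumes that --recursive lets --input point at a directory.
# scan a directory of function logs recursively with two signatures, keep scanning
# after the first hit and write the results to a JSON file, using four workers
python3 dynmx.py -w 4 detect -r -i /path/to/function_logs/ -s sig_persistence.yml sig_dropper.yml --detect-all --json-result results.json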
Please consider that this tool is a proof of concept which was developed alongside writing the master thesis. Hence, the code quality is not always the best and there may be bugs and errors. I tried to make the tool as robust as possible in the given time frame.
The best way to troubleshoot errors is to enable logging (on the console and/or to a log file) and set the log level to debug. Exception handlers should write detailed errors to the log, which can help troubleshooting.
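For instance, a troubleshooting run with full debug logging to both the console and a log file could look like this (file names are placeholders):
# show the log on stdout, write it to dynmx.log and raise the verbosity to debug
python3 dynmx.py --show-log --log dynmx.log --log-level debug detect -i "flog.txt" -s dynmx_signature.yml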
A Pin Tool for tracing:
Bypasses the anti-tracing check based on RDTSC.
Generates a report in a .tag format (which can be loaded into other analysis tools):
RVA;traced event
e.g.
345c2;section: .text
58069;called: C:\Windows\SysWOW64\kernel32.dll.IsProcessorFeaturePresent
3976d;called: C:\Windows\SysWOW64\kernel32.dll.LoadLibraryExW
3983c;called: C:\Windows\SysWOW64\kernel32.dll.GetProcAddress
3999d;called: C:\Windows\SysWOW64\KernelBase.dll.InitializeCriticalSectionEx
398ac;called: C:\Windows\SysWOW64\KernelBase.dll.FlsAlloc
3995d;called: C:\Windows\SysWOW64\KernelBase.dll.FlsSetValue
49275;called: C:\Windows\SysWOW64\kernel32.dll.LoadLibraryExW
4934b;called: C:\Windows\SysWOW64\kernel32.dll.GetProcAddress
...
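Since each line of the .tag report is just "RVA;traced event", standard shell tools are enough for a first pass over a trace. The sketch below (report.tag is a placeholder file name) counts which APIs the traced binary called most often:
# keep only the "called:" events, drop the RVA column and rank the APIs by frequency
grep 'called:' report.tag | cut -d';' -f2- | sort | uniq -c | sort -rn | head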
To compile the prepared project you need to use Visual Studio >= 2012. It was tested with Intel Pin 3.28.
Clone this repo into \source\tools that is inside your Pin root directory. Open the project in Visual Studio and build. Detailed description available here.
To build with Intel Pin < 3.26 on Windows, use the appropriate legacy Visual Studio project.
For now, the support for Linux is experimental. Yet it is possible to build and use Tiny Tracer on Linux as well. Please refer to tiny_runner.sh for more information. Detailed description available here.
Details about the usage you will find on the project's Wiki.
In install32_64 you can find a utility that checks if Kernel Debugger is disabled (kdb_check.exe, source), and it is used by the Tiny Tracer's .bat scripts. This utility sometimes gets flagged as malware by Windows Defender (it is a known false positive). If you encounter this issue, you may need to exclude the installation directory from Windows Defender scans.
Questions? Ideas? Join Discussions!
Language | Framework | URL | Method | Param | Header | WS |
---|---|---|---|---|---|---|
Go | Echo | ✅ | ✅ | X | X | X |
Python | Django | ✅ | X | X | X | X |
Python | Flask | ✅ | X | X | X | X |
Ruby | Rails | ✅ | ✅ | ✅ | X | X |
Ruby | Sinatra | ✅ | ✅ | ✅ | X | X |
Php | | ✅ | ✅ | ✅ | X | X |
Java | Spring | ✅ | ✅ | X | X | X |
Java | Jsp | X | X | X | X | X |
Crystal | Kemal | ✅ | ✅ | ✅ | X | ✅ |
JS | Express | ✅ | ✅ | X | X | X |
JS | Next | X | X | X | X | X |
Specification | Format | URL | Method | Param | Header | WS |
---|---|---|---|---|---|---|
Swagger | JSON | ✅ | ✅ | ✅ | X | X |
Swagger | YAML | ✅ | ✅ | ✅ | X | X |
brew tap hahwul/noir
brew install noir
# Install Crystal-lang
# https://crystal-lang.org/install/
# Clone this repo
git clone https://github.com/hahwul/noir
cd noir
# Install Dependencies
shards install
# Build
shards build --release --no-debug
# Copy binary
cp ./bin/noir /usr/bin/
docker pull ghcr.io/hahwul/noir:main
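A possible way to run the pulled image against a local code base is sketched below; this assumes the image's entrypoint is the noir binary and that mounting the project under /src is acceptable.
# mount the current directory into the container and scan it
docker run --rm -v "$(pwd):/src" ghcr.io/hahwul/noir:main -b /src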
Usage: noir <flags>
Basic:
-b PATH, --base-path ./app (Required) Set base path
-u URL, --url http://.. Set base url for endpoints
-s SCOPE, --scope url,param Set scope for detection
Output:
-f FORMAT, --format json Set output format [plain/json/markdown-table/curl/httpie]
-o PATH, --output out.txt Write result to file
--set-pvalue VALUE Specifies the value of the identified parameter
--no-color Disable color output
--no-log Displaying only the results
Deliver:
--send-req Send the results to the web request
--send-proxy http://proxy.. Send the results to the web request via http proxy
Technologies:
-t TECHS, --techs rails,php Set technologies to use
--exclude-techs rails,php Specify the technologies to be excluded
--list-techs Show all technologies
Others:
-d, --debug Show debug messages
-v, --version Show version
-h, --help Show help
Example
noir -b . -u https://testapp.internal.domains
JSON Result
noir -b . -u https://testapp.internal.domains -f json
[
...
{
"headers": [],
"method": "POST",
"params": [
{
"name": "article_slug",
"param_type": "json",
"value": ""
},
{
"name": "body",
"param_type": "json",
"value": ""
},
{
"name": "id",
"param_type": "json",
"value": ""
}
],
"protocol": "http",
"url": "https://testapp.internal.domains/comments"
}
]
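Because the JSON result is an array of endpoint objects with method, url, params and headers fields (as shown above), it pipes nicely into jq. The sketch below assumes jq is installed and uses --no-log to keep stdout clean:
# print one "METHOD URL" line per discovered endpoint
noir -b . -u https://testapp.internal.domains -f json --no-log | jq -r '.[] | "\(.method) \(.url)"'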
Toolkit demonstrating another approach to a QRLJacking attack, allowing an attacker to perform remote account takeover through sign-in QR code phishing.
It consists of a browser extension used by the attacker to extract the sign-in QR code and a server application, which retrieves the sign-in QR codes to display them on the hosted phishing pages.
Watch the demo video:
Read more about it on my blog: https://breakdev.org/evilqr-phishing
The parameters used by Evil QR are hardcoded into the extension and server source code, so it is important to change them to custom values before you build and deploy the toolkit.
parameter | description | default value |
---|---|---|
API_TOKEN | API token used to authenticate with REST API endpoints hosted on the server | 00000000-0000-0000-0000-000000000000 |
QRCODE_ID | QR code ID used to bind the extracted QR code with the one displayed on the phishing page | 11111111-1111-1111-1111-111111111111 |
BIND_ADDRESS | IP address with port the HTTP server will be listening on | 127.0.0.1:35000 |
API_URL | External URL pointing to the server, where the phishing page will be hosted | http://127.0.0.1:35000 |
Here are all the places in the source code where the values should be modified:
You can load the extension in Chrome through the Load unpacked feature: https://developer.chrome.com/docs/extensions/mv3/getstarted/development-basics/#load-unpacked
Once the extension is installed, make sure to pin its icon in Chrome's extension toolbar, so that the icon is always visible.
Make sure you have Go installed, version at least 1.20.
To build, go to the /server directory and run the command:
Windows:
build_run.bat
Linux:
chmod 700 build.sh
./build.sh
Built server binaries will be placed in the ./build/ directory.
./server/build/evilqr-server
https://discord.com/login
https://web.telegram.org/k/
https://whatsapp.com
https://store.steampowered.com/login/
https://accounts.binance.com/en/login
https://www.tiktok.com/login
http://127.0.0.1:35000 (default)
Evil QR is made by Kuba Gretzky (@mrgretzky) and it's released under the MIT license.
AiCEF is a tool implementing the accompanying framework [1] in order to harness the intelligence that is available from online resources, as well as threat groups' activities and arsenal (e.g. MITRE), to create relevant and timely cybersecurity exercise content. This way, we abstract the events from the reports in a machine-readable form. The produced graphs can be infused with additional intelligence, e.g. the threat actor profile from MITRE, also mapped in our ontology. While this may fill gaps that would be missing from a report, one can also manipulate the graph to create custom and unique models. Finally, we exploit transformer-based language models like GPT to convert the graph into text that can serve as the scenario of a cybersecurity exercise. We have tested and validated AiCEF with a group of experts in cybersecurity exercises, and the results clearly show that AiCEF significantly augments the capabilities in creating timely and relevant cybersecurity exercises in terms of both quality and time.
We used Python to create a machine-learning-powered Exercise Generation Framework and developed a set of tools to perform a set of individual tasks which would help an exercise planner (EP) to create a timely and targeted Cybersecurity Exercise Scenario, regardless of her experience.
Problems an Exercise Planner faces:
Our Main Objective: Build an AI powered tool that can generate relevant and up-to-date Cyber Exercise Content in a few steps with little technical expertise from the user.
The updated project, AiCEF v.2.0, is planned to be publicly released by the end of 2023, pending heavy code review and functionality updates. Submodules with reduced functionality will start being released by early June 2023. Thank you for your patience.
The most convenient way to install AiCEF is by using the docker-compose command. For production deployment, we advise you to deploy MySQL manually in a dedicated environment and then start the other components using Docker.
First, make sure you have docker-compose installed in your environment:
$ sudo apt-get install docker-compose
Then, clone the repository:
$ git clone https://github.com/grazvan/AiCEF/docker.git /<choose-a-path>/AiCEF-docker
$ cd /<choose-a-path>/AiCEF-docker
Import the MySQL file into your database:
$ mysql -u <your_username> --password=<your_password> AiCEF_db < AiCEF_db.sql
Before running the docker-compose
command, settings must be configured. Copy the sample settings file and change it accordingly to your needs.
$ cp .env.sample .env
Note: Make sure you have an OpenAI API key available. Load the environment settings (including your MySQL connection details):
set -a ; source .env
Finally, run docker-compose in detached (-d) mode:
$ sudo docker-compose up -d
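Once the stack is up, a quick sanity check of the containers can be done with the standard docker-compose subcommands:
# list the running containers and follow their logs
$ sudo docker-compose ps
$ sudo docker-compose logs -f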
A common usage flow consists of generating a Trend Report to analyze patterns over time, parsing relevant articles and converting them into Incident Breadcrumbs using the MLTP module, and storing them in a knowledge database called KDB. Incidents are then generated using the IncGen component and can be enhanced using the Graph Enhancer module to simulate known APT activity. The incidents come with injects that can be edited on the fly. The CSE scenario is then created using CEGen, which defines various attributes like CSE name, number of Events, and Incidents. MLCESO is a crucial step in the methodology where dedicated ML models are trained to extract information from the collected articles with over 80% accuracy. The Incident Generation & Enhancer (IncGen) workflow can be automated, generating a variety of incidents based on filtering parameters and the existing database. The knowledge database (KDB) consists of almost 3000 articles classified into six categories that can be augmented using the APT Enhancer, either by using the activity of known APT groups from MITRE or manually.
Find below some sample usage screenshots:
AiCEF is a product designed and developed by Alex Zacharis, Razvan Gavrila and Constantinos Patsakis.
[1] https://link.springer.com/article/10.1007/s10207-023-00693-z
[2] https://oasis-open.github.io/cti-documentation/stix/intro.html
Contributions are welcome! If you'd like to contribute to AiCEF v2.0, please follow these steps:
Create your feature branch (git checkout -b feature/your-branch-name)
Commit your changes (git commit -m 'Add some feature')
Push to the branch (git push origin feature/your-branch-name)
AiCEF is licensed under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. See the license for more information.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
The VX-API is a collection of malicious functionality to aid in malware development. It is recommended you clone and/or download this entire repo then open the Visual Studio solution file to easily explore functionality and concepts.
Some functions may be dependent on other functions present within the solution file. Using the solution file provided here will make it easier to identify which other functionality and/or header data is required.
You're free to use this in any manner you please. You do not need to use this entire solution for your malware proof-of-concepts or Red Team engagements. Strip, copy, paste, delete, or edit this project's contents as much as you'd like.
Function Name | Original Author |
---|---|
AdfCloseHandleOnInvalidAddress | Checkpoint Research |
AdfIsCreateProcessDebugEventCodeSet | Checkpoint Research |
AdfOpenProcessOnCsrss | Checkpoint Research |
CheckRemoteDebuggerPresent2 | ReactOS |
IsDebuggerPresentEx | smelly__vx |
IsIntelHardwareBreakpointPresent | Checkpoint Research |
Function Name | Original Author |
---|---|
HashStringDjb2 | Dan Bernstein |
HashStringFowlerNollVoVariant1a | Glenn Fowler, Landon Curt Noll, and Kiem-Phong Vo |
HashStringJenkinsOneAtATime32Bit | Bob Jenkins |
HashStringLoseLose | Brian Kernighan and Dennis Ritchie |
HashStringRotr32 | T. Oshiba (1972) |
HashStringSdbm | Ozan Yigit |
HashStringSuperFastHash | Paul Hsieh |
HashStringUnknownGenericHash1A | Unknown |
HashStringSipHash | RistBS |
HashStringMurmur | RistBS |
CreateMd5HashFromFilePath | Microsoft |
CreatePseudoRandomInteger | Apple (c) 1999 |
CreatePseudoRandomString | smelly__vx |
HashFileByMsiFileHashTable | smelly__vx |
CreatePseudoRandomIntegerFromNtdll | smelly__vx |
LzMaximumCompressBuffer | smelly__vx |
LzMaximumDecompressBuffer | smelly__vx |
LzStandardCompressBuffer | smelly__vx |
LzStandardDecompressBuffer | smelly__vx |
XpressHuffMaximumCompressBuffer | smelly__vx |
XpressHuffMaximumDecompressBuffer | smelly__vx |
XpressHuffStandardCompressBuffer | smelly__vx |
XpressHuffStandardDecompressBuffer | smelly__vx |
XpressMaximumCompressBuffer | smelly__vx |
XpressMaximumDecompressBuffer | smelly__vx |
XpressStandardCompressBuffer | smelly__vx |
XpressStandardDecompressBuffer | smelly__vx |
ExtractFilesFromCabIntoTarget | smelly__vx |
Function Name | Original Author |
---|---|
GetLastErrorFromTeb | smelly__vx |
GetLastNtStatusFromTeb | smelly__vx |
RtlNtStatusToDosErrorViaImport | ReactOS |
SetLastErrorInTeb | smelly__vx |
SetLastNtStatusInTeb | smelly__vx |
Win32FromHResult | Raymond Chen |
Function Name | Original Author |
---|---|
AmsiBypassViaPatternScan | ZeroMemoryEx |
DelayedExecutionExecuteOnDisplayOff | am0nsec and smelly__vx |
HookEngineRestoreHeapFree | rad9800 |
MasqueradePebAsExplorer | smelly__vx |
RemoveDllFromPeb | rad9800 |
RemoveRegisterDllNotification | Rad98, Peter Winter-Smith |
SleepObfuscationViaVirtualProtect | 5pider |
RtlSetBaseUnicodeCommandLine | TheWover |
Function Name | Original Author |
---|---|
GetCurrentLocaleFromTeb | 3xp0rt |
GetNumberOfLinkedDlls | smelly__vx |
GetOsBuildNumberFromPeb | smelly__vx |
GetOsMajorVersionFromPeb | smelly__vx |
GetOsMinorVersionFromPeb | smelly__vx |
GetOsPlatformIdFromPeb | smelly__vx |
IsNvidiaGraphicsCardPresent | smelly__vx |
IsProcessRunning | smelly__vx |
IsProcessRunningAsAdmin | Vimal Shekar |
GetPidFromNtQuerySystemInformation | smelly__vx |
GetPidFromWindowsTerminalService | modexp |
GetPidFromWmiComInterface | aalimian and modexp |
GetPidFromEnumProcesses | smelly__vx |
GetPidFromPidBruteForcing | modexp |
GetPidFromNtQueryFileInformation | modexp, Lloyd Davies, Jonas Lyk |
GetPidFromPidBruteForcingExW | smelly__vx, LLoyd Davies, Jonas Lyk, modexp |
Function Name | Original Author |
---|---|
CreateLocalAppDataObjectPath | smelly__vx |
CreateWindowsObjectPath | smelly__vx |
GetCurrentDirectoryFromUserProcessParameters | smelly__vx |
GetCurrentProcessIdFromTeb | ReactOS |
GetCurrentUserSid | Giovanni Dicanio |
GetCurrentWindowTextFromUserProcessParameter | smelly__vx |
GetFileSizeFromPath | smelly__vx |
GetProcessHeapFromTeb | smelly__vx |
GetProcessPathFromLoaderLoadModule | smelly__vx |
GetProcessPathFromUserProcessParameters | smelly__vx |
GetSystemWindowsDirectory | Geoff Chappell |
IsPathValid | smelly__vx |
RecursiveFindFile | Luke |
SetProcessPrivilegeToken | Microsoft |
IsDllLoaded | smelly__vx |
TryLoadDllMultiMethod | smelly__vx |
CreateThreadAndWaitForCompletion | smelly__vx |
GetProcessBinaryNameFromHwndW | smelly__vx |
GetByteArrayFromFile | smelly__vx |
Ex_GetHandleOnDeviceHttpCommunication | x86matthew |
IsRegistryKeyValid | smelly__vx |
FastcallExecuteBinaryShellExecuteEx | smelly__vx |
GetCurrentProcessIdFromOffset | RistBS |
GetPeBaseAddress | smelly__vx |
LdrLoadGetProcedureAddress | c5pider |
IsPeSection | smelly__vx |
AddSectionToPeFile | smelly__vx |
WriteDataToPeSection | smelly__vx |
GetPeSectionSizeInByte | smelly__vx |
ReadDataFromPeSection | smelly__vx |
GetCurrentProcessNoForward | ReactOS |
GetCurrentThreadNoForward | ReactOS |
Function Name | Original Author |
---|---|
GetKUserSharedData | Geoff Chappell |
GetModuleHandleEx2 | smelly__vx |
GetPeb | 29a |
GetPebFromTeb | ReactOS |
GetProcAddress | 29a Volume 2, c5pider |
GetProcAddressDjb2 | smelly__vx |
GetProcAddressFowlerNollVoVariant1a | smelly__vx |
GetProcAddressJenkinsOneAtATime32Bit | smelly__vx |
GetProcAddressLoseLose | smelly__vx |
GetProcAddressRotr32 | smelly__vx |
GetProcAddressSdbm | smelly__vx |
GetProcAddressSuperFastHash | smelly__vx |
GetProcAddressUnknownGenericHash1 | smelly__vx |
GetProcAddressSipHash | RistBS |
GetProcAddressMurmur | RistBS |
GetRtlUserProcessParameters | ReactOS |
GetTeb | ReactOS |
RtlLoadPeHeaders | smelly__vx |
ProxyWorkItemLoadLibrary | Rad98, Peter Winter-Smith |
ProxyRegisterWaitLoadLibrary | Rad98, Peter Winter-Smith |
Function Name | Original Author |
---|---|
MpfGetLsaPidFromServiceManager | modexp |
MpfGetLsaPidFromRegistry | modexp |
MpfGetLsaPidFromNamedPipe | modexp |
Function Name | Original Author |
---|---|
UrlDownloadToFileSynchronous | Hans Passant |
ConvertIPv4IpAddressStructureToString | smelly__vx |
ConvertIPv4StringToUnsignedLong | smelly__vx |
SendIcmpEchoMessageToIPv4Host | smelly__vx |
ConvertIPv4IpAddressUnsignedLongToString | smelly__vx |
DnsGetDomainNameIPv4AddressAsString | smelly__vx |
DnsGetDomainNameIPv4AddressUnsignedLong | smelly__vx |
GetDomainNameFromUnsignedLongIPV4Address | smelly__vx |
GetDomainNameFromIPV4AddressAsString | smelly__vx |
Function Name | Original Author |
---|---|
OleGetClipboardData | Microsoft |
MpfComVssDeleteShadowVolumeBackups | am0nsec |
MpfComModifyShortcutTarget | Unknown |
MpfComMonitorChromeSessionOnce | smelly__vx |
MpfExtractMaliciousPayloadFromZipFileNoPassword | Codu |
Function Name | Original Author |
---|---|
CreateProcessFromIHxHelpPaneServer | James Forshaw |
CreateProcessFromIHxInteractiveUser | James Forshaw |
CreateProcessFromIShellDispatchInvoke | Mohamed Fakroud |
CreateProcessFromShellExecuteInExplorerProcess | Microsoft |
CreateProcessViaNtCreateUserProcess | CaptMeelo |
CreateProcessWithCfGuard | smelly__vx and Adam Chester |
CreateProcessByWindowsRHotKey | smelly__vx |
CreateProcessByWindowsRHotKeyEx | smelly__vx |
CreateProcessFromINFSectionInstallStringNoCab | smelly__vx |
CreateProcessFromINFSetupCommand | smelly__vx |
CreateProcessFromINFSectionInstallStringNoCab2 | smelly__vx |
CreateProcessFromIeFrameOpenUrl | smelly__vx |
CreateProcessFromPcwUtil | smelly__vx |
CreateProcessFromShdocVwOpenUrl | smelly__vx |
CreateProcessFromShell32ShellExecRun | smelly__vx |
MpfExecute64bitPeBinaryInMemoryFromByteArrayNoReloc | aaaddress1 |
CreateProcessFromWmiWin32_ProcessW | CIA |
CreateProcessFromZipfldrRouteCall | smelly__vx |
CreateProcessFromUrlFileProtocolHandler | smelly__vx |
CreateProcessFromUrlOpenUrl | smelly__vx |
CreateProcessFromMsHTMLW | smelly__vx |
Function Name | Original Author |
---|---|
MpfPiControlInjection | SafeBreach Labs |
MpfPiQueueUserAPCViaAtomBomb | SafeBreach Labs |
MpfPiWriteProcessMemoryCreateRemoteThread | SafeBreach Labs |
MpfProcessInjectionViaProcessReflection | Deep Instinct |
Function Name | Original Author |
---|---|
IeCreateFile | smelly__vx |
CopyFileViaSetupCopyFile | smelly__vx |
CreateFileFromDsCopyFromSharedFile | Jonas Lyk |
DeleteDirectoryAndSubDataViaDelNode | smelly__vx |
DeleteFileWithCreateFileFlag | smelly__vx |
IsProcessRunningAsAdmin2 | smelly__vx |
IeCreateDirectory | smelly__vx |
IeDeleteFile | smelly__vx |
IeFindFirstFile | smelly__vx |
IEGetFileAttributesEx | smelly__vx |
IeMoveFileEx | smelly__vx |
IeRemoveDirectory | smelly__vx |
Function Name | Original Author |
---|---|
MpfSceViaImmEnumInputContext | alfarom256, aahmad097 |
MpfSceViaCertFindChainInStore | alfarom256, aahmad097 |
MpfSceViaEnumPropsExW | alfarom256, aahmad097 |
MpfSceViaCreateThreadpoolWait | alfarom256, aahmad097 |
MpfSceViaCryptEnumOIDInfo | alfarom256, aahmad097 |
MpfSceViaDSA_EnumCallback | alfarom256, aahmad097 |
MpfSceViaCreateTimerQueueTimer | alfarom256, aahmad097 |
MpfSceViaEvtSubscribe | alfarom256, aahmad097 |
MpfSceViaFlsAlloc | alfarom256, aahmad097 |
MpfSceViaInitOnceExecuteOnce | alfarom256, aahmad097 |
MpfSceViaEnumChildWindows | alfarom256, aahmad097, wra7h |
MpfSceViaCDefFolderMenu_Create2 | alfarom256, aahmad097, wra7h |
MpfSceViaCertEnumSystemStore | alfarom256, aahmad097, wra7h |
MpfSceViaCertEnumSystemStoreLocation | alfarom256, aahmad097, wra7h |
MpfSceViaEnumDateFormatsW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumDesktopWindows | alfarom256, aahmad097, wra7h |
MpfSceViaEnumDesktopsW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumDirTreeW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumDisplayMonitors | alfarom256, aahmad097, wra7h |
MpfSceViaEnumFontFamiliesExW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumFontsW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumLanguageGroupLocalesW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumObjects | alfarom256, aahmad097, wra7h |
MpfSceViaEnumResourceTypesExW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumSystemCodePagesW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumSystemGeoID | alfarom256, aahmad097, wra7h |
MpfSceViaEnumSystemLanguageGroupsW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumSystemLocalesEx | alfarom256, aahmad097, wra7h |
MpfSceViaEnumThreadWindows | alfarom256, aahmad097, wra7h |
MpfSceViaEnumTimeFormatsEx | alfarom256, aahmad097, wra7h |
MpfSceViaEnumUILanguagesW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumWindowStationsW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumWindows | alfarom256, aahmad097, wra7h |
MpfSceViaEnumerateLoadedModules64 | alfarom256, aahmad097, wra7h |
MpfSceViaK32EnumPageFilesW | alfarom256, aahmad097, wra7h |
MpfSceViaEnumPwrSchemes | alfarom256, aahmad097, wra7h |
MpfSceViaMessageBoxIndirectW | alfarom256, aahmad097, wra7h |
MpfSceViaChooseColorW | alfarom256, aahmad097, wra7h |
MpfSceViaClusWorkerCreate | alfarom256, aahmad097, wra7h |
MpfSceViaSymEnumProcesses | alfarom256, aahmad097, wra7h |
MpfSceViaImageGetDigestStream | alfarom256, aahmad097, wra7h |
MpfSceViaVerifierEnumerateResource | alfarom256, aahmad097, wra7h |
MpfSceViaSymEnumSourceFiles | alfarom256, aahmad097, wra7h |
Function Name | Original Author |
---|---|
ByteArrayToCharArray | smelly__vx |
CharArrayToByteArray | smelly__vx |
ShlwapiCharStringToWCharString | smelly__vx |
ShlwapiWCharStringToCharString | smelly__vx |
CharStringToWCharString | smelly__vx |
WCharStringToCharString | smelly__vx |
RtlInitEmptyUnicodeString | ReactOS |
RtlInitUnicodeString | ReactOS |
CaplockString | simonc |
CopyMemoryEx | ReactOS |
SecureStringCopy | Apple (c) 1999 |
StringCompare | Apple (c) 1999 |
StringConcat | Apple (c) 1999 |
StringCopy | Apple (c) 1999 |
StringFindSubstring | Apple (c) 1999 |
StringLength | Apple (c) 1999 |
StringLocateChar | Apple (c) 1999 |
StringRemoveSubstring | smelly__vx |
StringTerminateStringAtChar | smelly__vx |
StringToken | Apple (c) 1999 |
ZeroMemoryEx | ReactOS |
ConvertCharacterStringToIntegerUsingNtdll | smelly__vx |
MemoryFindMemory | KamilCuk |
Function Name | Original Author |
---|---|
UacBypassFodHelperMethod | winscripting.blog |
Function Name | Original Author |
---|---|
InitHardwareBreakpointEngine | rad98 |
ShutdownHardwareBreakpointEngine | rad98 |
ExceptionHandlerCallbackRoutine | rad98 |
SetHardwareBreakpoint | rad98 |
InsertDescriptorEntry | rad98 |
RemoveDescriptorEntry | rad98 |
SnapshotInsertHardwareBreakpointHookIntoTargetThread | rad98 |
Function Name | Original Author |
---|---|
GenericShellcodeHelloWorldMessageBoxA | SafeBreach Labs |
GenericShellcodeHelloWorldMessageBoxAEbFbLoop | SafeBreach Labs |
GenericShellcodeOpenCalcExitThread | MsfVenom |
Microsoft Corp. today released software updates to quash 130 security bugs in its Windows operating systems and related software, including at least five flaws that are already seeing active exploitation. Meanwhile, Apple customers have their own zero-day woes again this month: On Monday, Apple issued (and then quickly pulled) an emergency update to fix a zero-day vulnerability that is being exploited on MacOS and iOS devices.
On July 10, Apple pushed a “Rapid Security Response” update to fix a code execution flaw in the Webkit browser component built into iOS, iPadOS, and macOS Ventura. Almost as soon as the patch went out, Apple pulled the software because it was reportedly causing problems loading certain websites. MacRumors says Apple will likely re-release the patches when the glitches have been addressed.
Launched in May, Apple’s Rapid Security Response updates are designed to address time-sensitive vulnerabilities, and this is the second month Apple has used it. July marks the sixth month this year that Apple has released updates for zero-day vulnerabilities — those that get exploited by malware or malcontents before there is an official patch available.
If you rely on Apple devices and don’t have automatic updates enabled, please take a moment to check the patch status of your various iDevices. The latest security update that includes the fix for the zero-day bug should be available in iOS/iPadOS 16.5.1, macOS 13.4.1, and Safari 16.5.2.
On the Windows side, there are at least four vulnerabilities patched this month that earned high CVSS (badness) scores and that are already being exploited in active attacks, according to Microsoft. They include CVE-2023-32049, which is a hole in Windows SmartScreen that lets malware bypass security warning prompts; and CVE-2023-35311, which allows attackers to bypass security features in Microsoft Outlook.
The two other zero-day threats this month for Windows are both privilege escalation flaws. CVE-2023-32046 affects a core Windows component called MSHTML, which is used by Windows and other applications, like Office, Outlook and Skype. CVE-2023-36874 is an elevation of privilege bug in the Windows Error Reporting Service.
Many security experts expected Microsoft to address a fifth zero-day flaw — CVE-2023-36884 — a remote code execution weakness in Office and Windows.
“Surprisingly, there is no patch yet for one of the five zero-day vulnerabilities,” said Adam Barnett, lead software engineer at Rapid7. “Microsoft is actively investigating the publicly disclosed vulnerability, and promises to update the advisory as soon as further guidance is available.”
Barnett notes that Microsoft links exploitation of this vulnerability with Storm-0978, the software giant’s name for a cybercriminal group based out of Russia that is identified by the broader security community as RomCom.
“Exploitation of CVE-2023-36884 may lead to installation of the eponymous RomCom trojan or other malware,” Barnett said. “[Microsoft] suggests that RomCom / Storm-0978 is operating in support of Russian intelligence operations. The same threat actor has also been associated with ransomware attacks targeting a wide array of victims.”
Microsoft’s advisory on CVE-2023-36884 is pretty sparse, but it does include a Windows registry hack that should help mitigate attacks on this vulnerability. Microsoft has also published a blog post about phishing campaigns tied to Storm-0978 and to the exploitation of this flaw.
Barnett said it’s while it’s possible that a patch will be issued as part of next month’s Patch Tuesday, Microsoft Office is deployed just about everywhere, and this threat actor is making waves.
“Admins should be ready for an out-of-cycle security update for CVE-2023-36884,” he said.
Microsoft also today released new details about how it plans to address the existential threat of malware that is cryptographically signed by…wait for it….Microsoft.
In late 2022, security experts at Sophos, Trend Micro and Cisco warned that ransomware criminals were using signed, malicious drivers in an attempt to evade antivirus and endpoint detection and response (EDR) tools.
In a blog post today, Sophos’s Andrew Brandt wrote that Sophos identified 133 malicious Windows driver files that were digitally signed since April 2021, and found 100 of those were actually signed by Microsoft. Microsoft said today it is taking steps to ensure those malicious driver files can no longer run on Windows computers.
As KrebsOnSecurity noted in last month’s story on malware signing-as-a-service, code-signing certificates are supposed to help authenticate the identity of software publishers, and provide cryptographic assurance that a signed piece of software has not been altered or tampered with. Both of these qualities make stolen or ill-gotten code-signing certificates attractive to cybercriminal groups, who prize their ability to add stealth and longevity to malicious software.
Dan Goodin at Ars Technica contends that whatever Microsoft may be doing to keep maliciously signed drivers from running on Windows is being bypassed by hackers using open source software that is popular with video game cheaters.
“The software comes in the form of two software tools that are available on GitHub,” Goodin explained. “Cheaters use them to digitally sign malicious system drivers so they can modify video games in ways that give the player an unfair advantage. The drivers clear the considerable hurdle required for the cheat code to run inside the Windows kernel, the fortified layer of the operating system reserved for the most critical and sensitive functions.”
Meanwhile, researchers at Cisco’s Talos security team found multiple Chinese-speaking threat groups have repurposed the tools—one apparently called “HookSignTool” and the other “FuckCertVerifyTimeValidity.”
“Instead of using the kernel access for cheating, the threat actors use it to give their malware capabilities it wouldn’t otherwise have,” Goodin said.
For a closer look at the patches released by Microsoft today, check out the always-thorough Patch Tuesday roundup from the SANS Internet Storm Center. And it’s not a bad idea to hold off updating for a few days until Microsoft works out any kinks in the updates: AskWoody.com usually has the lowdown on any patches that may be causing problems for Windows users.
And as ever, please consider backing up your system or at least your important documents and data before applying system updates. If you encounter any problems with these updates, please drop a note about it here in the comments.
ReconAIzer is a powerful Jython extension for Burp Suite that leverages OpenAI to help bug bounty hunters optimize their recon process. This extension automates various tasks, making it easier and faster for security researchers to identify and exploit vulnerabilities.
Once installed, ReconAIzer adds a contextual menu and a dedicated tab to see the results:
Follow these steps to install the ReconAIzer extension on Burp Suite:
Navigate to the ReconAIzer.py file from Step 3.1, select the file, and click "Open."
Congratulations! You have successfully installed the ReconAIzer extension in Burp Suite. You can now start using it to enhance your bug bounty hunting experience.
Once it's done, you must configure your OpenAI API key on the "Config" tab under "ReconAIzer" tab.
Feel free to suggest prompts improvements or anything you would like to see on ReconAIzer!
Happy bug hunting!
HardHat is a multiplayer C# .NET-based command and control framework designed to aid in red team engagements and penetration testing. HardHat aims to improve quality-of-life factors during engagements by providing an easy-to-use but still robust C2 framework.
It contains three primary components: an ASP.NET teamserver, a Blazor .NET client, and C#-based implants.
Alpha Release - 3/29/23 NOTE: HardHat is in Alpha release; it will have bugs, missing features, and unexpected things will happen. Thank you for trying it, and please report back any issues or missing features so they can be addressed.
Discord: Join the community to talk about HardHat C2, programming, red teaming, and general cyber security things. The Discord community is also a great way to request help, submit new features, stay up to date on the latest additions, and submit bugs.
Documentation can be found at docs.
To configure the team server's starting address (where clients will connect), edit HardHatC2\TeamServer\Properties\LaunchSettings.json, changing the "applicationUrl": "https://127.0.0.1:5000" to the desired location and port. Start the teamserver with dotnet run from its top-level folder ../HrdHatC2/Teamserver/.
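In practice that boils down to something like the following sketch, assuming you are in the repository root and have the .NET SDK installed:
# edit Properties/LaunchSettings.json first, then start the teamserver
cd HardHatC2/TeamServer
dotnet run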
Code contributions are welcome; feel free to submit feature requests, pull requests, or send me your ideas on Discord.