AttackGen is a cybersecurity incident response testing tool that leverages the power of large language models and the comprehensive MITRE ATT&CK framework. The tool generates tailored incident response scenarios based on user-selected threat actor groups and your organisation's details.
If you find AttackGen useful, please consider starring the repository on GitHub. This helps more people discover the tool. Your support is greatly appreciated! ⭐
What's new? | Why is it useful? |
---|---|
Mistral API Integration | - Alternative Model Provider: Users can now leverage the Mistral AI models to generate incident response scenarios. This integration provides an alternative to the OpenAI and Azure OpenAI Service models, allowing users to explore and compare the performance of different language models for their specific use case. |
Local Model Support using Ollama | - Local Model Hosting: AttackGen now supports the use of locally hosted LLMs via an integration with Ollama. This feature is particularly useful for organisations with strict data privacy requirements or those who prefer to keep their data on-premises. Please note that this feature is not available for users of the AttackGen version hosted on Streamlit Community Cloud at https://attackgen.streamlit.app |
Optional LangSmith Integration | - Improved Flexibility: The integration with LangSmith is now optional. If no LangChain API key is provided, users will see an informative message indicating that the run won't be logged by LangSmith, rather than an error being thrown. This change improves the overall user experience and allows users to continue using AttackGen without the need for LangSmith. |
Various Bug Fixes and Improvements | - Enhanced User Experience: This release includes several bug fixes and improvements to the user interface, making AttackGen more user-friendly and robust. |
What's new? | Why is it useful? |
---|---|
Azure OpenAI Service Integration | - Enhanced Integration: Users can now choose to utilise OpenAI models deployed on the Azure OpenAI Service, in addition to the standard OpenAI API. This integration offers a seamless and secure solution for incorporating AttackGen into existing Azure ecosystems, leveraging established commercial and confidentiality agreements. - Improved Data Security: Running AttackGen from Azure ensures that application descriptions and other data remain within the Azure environment, making it ideal for organizations that handle sensitive data in their threat models. |
LangSmith for Azure OpenAI Service | - Enhanced Debugging: LangSmith tracing is now available for scenarios generated using the Azure OpenAI Service. This feature provides a powerful tool for debugging, testing, and monitoring of model performance, allowing users to gain insights into the model's decision-making process and identify potential issues with the generated scenarios. - User Feedback: LangSmith also captures user feedback on the quality of scenarios generated using the Azure OpenAI Service, providing valuable insights into model performance and user satisfaction. |
Model Selection for OpenAI API | - Flexible Model Options: Users can now select from several models available from the OpenAI API endpoint, such as gpt-4-turbo-preview . This allows for greater customization and experimentation with different language models, enabling users to find the most suitable model for their specific use case. |
Docker Container Image | - Easy Deployment: AttackGen is now available as a Docker container image, making it easier to deploy and run the application in a consistent and reproducible environment. This feature is particularly useful for users who want to run AttackGen in a containerised environment, or for those who want to deploy the application on a cloud platform. |
What's new? | Why is it useful? |
---|---|
Custom Scenarios based on ATT&CK Techniques | - For Mature Organisations: This feature is particularly beneficial if your organisation has advanced threat intelligence capabilities. For instance, if you're monitoring a newly identified or lesser-known threat actor group, you can tailor incident response testing scenarios specific to the techniques used by that group. - Focused Testing: Alternatively, use this feature to focus your incident response testing on specific parts of the cyber kill chain or certain MITRE ATT&CK Tactics like 'Lateral Movement' or 'Exfiltration'. This is useful for organisations looking to evaluate and improve specific areas of their defence posture. |
User feedback on generated scenarios | - Collecting feedback is essential to track model performance over time and helps to highlight strengths and weaknesses in scenario generation tasks. |
Improved error handling for missing API keys | - Improved user experience. |
Replaced Streamlit st.spinner widgets with new st.status widget | - Provides better visibility into long running processes (i.e. scenario generation). |
Initial release.
Before running AttackGen, you will need the following:

- Python packages: `langchain` and `mitreattack`
- Data files: `enterprise-attack.json` (the MITRE ATT&CK dataset in STIX format) and `groups.json`
```
git clone https://github.com/mrwadams/attackgen.git
cd attackgen
pip install -r requirements.txt
```
```
docker pull mrwadams/attackgen
```
If you would like to use LangSmith for debugging, testing, and monitoring of model performance, you will need to set up a LangSmith account and create a `.streamlit/secrets.toml` file that contains your LangChain API key. Please follow the instructions here to set up your account and obtain your API key. You'll find a `secrets.toml-example` file in the `.streamlit/` directory that you can use as a template for your own `secrets.toml` file.

If you do not wish to use LangSmith, you must still have a `.streamlit/secrets.toml` file in place, but you can leave the `LANGCHAIN_API_KEY` field empty.
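For illustration, a minimal `secrets.toml` might look like the sketch below. The `LANGCHAIN_API_KEY` field name comes from the instructions above; the other key name is an assumption based on the model providers AttackGen supports, so check the bundled `secrets.toml-example` for the authoritative names:

```toml
# .streamlit/secrets.toml (sketch only; OPENAI_API_KEY is an assumed name,
# see secrets.toml-example for the real template)
OPENAI_API_KEY = "sk-..."
LANGCHAIN_API_KEY = ""  # leave empty to skip LangSmith logging
```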
Download the latest version of the MITRE ATT&CK dataset in STIX format from here, and make sure to place this file in the `./data/` directory within the repository.
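To give a feel for the STIX data AttackGen consumes, here is a minimal standard-library sketch (not AttackGen's own code; in practice the `mitreattack` package does this work) that pulls threat actor group names, stored as STIX `intrusion-set` objects, out of a bundle shaped like `enterprise-attack.json`:

```python
def list_groups(bundle: dict) -> list:
    """Return sorted names of threat actor groups (STIX 'intrusion-set'
    objects) found in an ATT&CK STIX bundle."""
    return sorted(
        obj["name"]
        for obj in bundle.get("objects", [])
        if obj.get("type") == "intrusion-set"
    )

# Tiny inline bundle standing in for ./data/enterprise-attack.json; a real
# bundle would be loaded with json.load(open("data/enterprise-attack.json"))
sample_bundle = {
    "type": "bundle",
    "objects": [
        {"type": "intrusion-set", "name": "APT29"},
        {"type": "intrusion-set", "name": "FIN7"},
        {"type": "attack-pattern", "name": "Phishing"},
    ],
}
print(list_groups(sample_bundle))  # ['APT29', 'FIN7']
```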
After the data setup, you can run AttackGen with the following command:
```
streamlit run 👋_Welcome.py
```
You can also try the app on Streamlit Community Cloud.
```
streamlit run 👋_Welcome.py
```
1. Run the following command to start the container:

   ```
   docker run -p 8501:8501 mrwadams/attackgen
   ```

   This command will start the container and map port 8501 (the default for Streamlit apps) from the container to your host machine.
2. Open your web browser and navigate to `http://localhost:8501`.
3. Use the app to generate standard or custom incident response scenarios (see below for details).
To generate standard scenarios, use the Threat Group Scenarios page, with your API key(s) configured in the `.streamlit/secrets.toml` file. To generate custom scenarios, use the Custom Scenario page, again with your API key(s) configured in the `.streamlit/secrets.toml` file.

Please note that generating scenarios may take a minute or so. Once a scenario is generated, you can view it in the app and also download it as a Markdown file.
I'm very happy to accept contributions to this project. Please feel free to submit an issue or pull request.
This project is licensed under GNU GPLv3.
A deep dive into the latest updates from Secure Network and Cloud Analytics that show Cisco's leadership in the security industry.
The year 2022 has been rather hectic for many reasons, and as the world faces its various challenges and opportunities, we at Cisco Security have buckled up and focused on improving the world in the way we know best: by making it more secure.
In an increasingly vulnerable Internet environment, where attackers rapidly develop new techniques to compromise organizations around the world, ensuring a robust security infrastructure becomes ever more imperative. Across the Cisco Security Portfolio, Secure Network Analytics (SNA) and Secure Cloud Analytics (SCA) have continued to add value for their customers since their inception by innovating their products and enhancing their capabilities.
In the latest SNA 7.4.1 release, four core features have been added to target important milestones in our roadmap. As a first addition, SNA has widely expanded on its Data Store deployment options by introducing the single node Data Store; supporting existing Flow Collector (FC) and new Data Store expansion by the Manager; and the capacity to mix and match virtual and physical appliances to build a Data Store deployment.
The SNA Data Store started as a simple concept, and while it has maintained that simplicity, it has become increasingly robust and performant over recent releases. In essence, it represents a new and improved database architecture design that can be made up of virtual or physical appliances to provide industry-leading horizontal scaling, with telemetry and event retention for over a year. Additionally, flow ingest from the Flow Collectors is now separate from data storage, which allows them to scale to more than 500K flows per second (FPS). With this new database design, performance is now optimized and has improved considerably across all metrics.
For the second major addition, SNA now supports multi-telemetry collection within a single deployment. Such data encompasses network telemetry, firewall logging, and remote worker telemetry. Now, Firewall logs can be stored on premises with the Data Store, making data available to the Firepower Management Center (FMC) via APIs to support remote queries. From the FMC, users can pivot directly to the Data Store interface and look at detailed events that optimize SecOps workflows, such as automatically filtering on events of interest.
On the topic of interfaces, users can now benefit from an intelligent viewer which provides all Firewall data. This feature allows users to select custom timeframes, apply unique filters on Security Events, create custom views based on relevant subsets of data, visualize trends from summary reports, and finally export any such view in CSV format for archiving or further forensic investigation.
With respect to VPN telemetry, the AnyConnect Secure Mobility Client can now store all network traffic even when users are not connected to their VPN at a given moment. Once a VPN connection is restored, the data is sent to the Flow Collector, and, with a Data Store deployment, off-network flow updates can bypass FC flow caches, allowing NVM historical data to be stored correctly.
Continuing down the Data Store journey (and, what a journey indeed), users can now monitor and evaluate its performance in a simple and intuitive way. This is achieved with charts and trends directly available in the Manager, which can now support traditional non-Data Store FCs and one singular Data Store. The division of Flow Collectors is made possible by SNA Domains, where a Data Store Domain can be created, and new FCs added to it when desired. This comes as part of a series of robust enhancements to the Flow Collector, where the FC can now be made up of a single image (NetFlow + sFlow) and its image can be switched between the two options. As yet another perk of the new database design, any FC can send its data to the Data Store.
As can be seen, the Data Store has been the star of the latest SNA release, and for obvious good reason. Before we wrap up, though, it has one more feature up its sleeve: Converged Analytics. This SNA feature brings a simplified, intuitive, and clear analytics experience to Secure Network Analytics users. It comes with out-of-the-box detections mapped to MITRE with clearly defined tactics and techniques, self-taught baselining and graduated alerting, and the ability to quiet non-relevant alerts, leading to more relevant detections.
This new Analytics feature is a strong step forward in giving users the confidence of network security awareness thanks to an intuitive workflow and 43 new alerts. It also gives them a deep understanding of each alert with observations and mappings related to the industry-standard MITRE tactics and techniques. Just when you think it couldn't get any better, the Secure Network and Cloud Analytics teams have worked hard to add even more value to this release and ensured the same workflows, functionality, and user experience are also available in the SCA portal. Yes, this is the first step towards a more cohesive experience across both SNA and SCA, where users of either platform will start to benefit from more consistent outcomes regardless of their deployment model. As some would say, it's like a birthday coming early.
Pivoting to Secure Cloud Analytics: as with its Network sibling, the product has received several enhancements over the last months of development. The core additions revolve around additional detections and context, as well as usability and integration enhancements, including those in Secure Cloud Insights. In parallel with SNA's Converged Analytics, SCA benefits from detections mapped to the MITRE ATT&CK framework. Additionally, several detections underwent algorithm improvements, while four new ones were added, such as Worm Propagation, which was native to SNA. Regarding the backbone of SCA's alerts, a multitude of new roles and observations were added to the platform, to further optimize and tune the alerts for users.
Additionally, alerts now offer a pivot directly to AWS’ load balancer and VPC, as well as direct access to Azure Security Groups, to allow for further investigation through streamlined workflows. The two Public Cloud Providers are now also included in coverage reports that provide a gap analysis to gain insight as to what logs may have potentially gone missing.
Focusing more on the detection workflows, the Alert Details view also gained additional information pertaining to device context, which gives insight into hostnames, subnets, and role metrics. The ingest mechanism has also become more robust thanks to data now coming from the Talos intelligence feed and ISE, shown in the Event Viewer for expanded forensics and visibility use cases.
While dealing with integrations, the highly requested SecureX integration can now be enabled in 1 click, with no API keys needed and a workflow that is seamless across the two platforms. Among some of the other improvements around graphs and visualizations, the Encrypted Traffic widget now allows an hourly breakdown of the data, while the Event Viewer now displays bi-directional session traffic, to bring even greater context to SCA flows.
In the context of pivots, as a user is navigating through devices that, for example, have raised an alert, they will now also see the new functionality to pivot directly into the Secure Cloud Insights (SCI) Knowledge Graph, to learn more about how various sources are connected to one another. Another SCI integration is present within the Device Outline of an Alert, to gain more posture context, and as part of a configuration menu, it’s now possible to run cloud posture assessments on demand, for immediate results and recommendations.
With all this said, we on the Secure Analytics team are extremely excited about the adoption and usage of these features, so that we can keep improving the product and iterating to solve even more use cases. As we look ahead, the world has never needed a comprehensive solution more than it does now for one of the most pressing problems in our society: cyber threats in the continuously evolving Internet space. And Secure Analytics will be there, to pioneer and lead the effort for a safer world.
We’d love to hear what you think. Ask a Question, Comment Below, and Stay Connected with Cisco Secure on social!
Cisco Secure Social Channels
The introduction of the MITRE ATT&CK evaluations is a welcome addition to the third-party testing arena. The ATT&CK framework, and the evaluations in particular, have gone a long way in helping advance the security industry as a whole, and the individual security products serving the market.
The insight garnered from these evaluations is incredibly useful. But let's admit it: for everyone except those steeped in the analysis, it can be hard to understand. The information is valuable, but dense. There are multiple ways to look at the data and even more ways to interpret and present the results (as no doubt you've already come to realize after reading all the vendor blogs and industry articles!). We have been looking at the data for the week since it was published, and still have more to examine over the coming days and weeks.
The more we assess the information, the clearer the story becomes, so we wanted to share with you Trend Micro’s 10 key takeaways for our results:
1. Looking at the results of the first run of the evaluation is important:
2. There is a hierarchy in the type of main detections – Techniques is most significant
https://attackevals.mitre.org/APT29/detection-categories.html
3. More alerts does not equal better alerting – quite the opposite
4. Managed Service detections are not exclusive
5. Let’s not forget about the effectiveness and need for blocking!
6. We need to look through more than the Windows
7. The evaluation shows where our product is going
8. This evaluation is helping us make our product better
9. MITRE is more than the evaluation
10. It is hard not to get confused by the FUD!
The post Trend Micro’s Top Ten MITRE Evaluation Considerations appeared first on .
Full disclosure: I am a security product testing nerd*.
I’ve been following the MITRE ATT&CK Framework for a while, and this week the results were released of the most recent evaluation using APT29 otherwise known as COZY BEAR.
First, here’s a snapshot of the Trend eval results as I understand them (rounded down):
91.79% on overall detection. That’s in the top 2 of 21.
91.04% without config changes. The test allows for config changes after the start – that wasn’t required to achieve the high overall results.
107 Telemetry. That’s very high. Capturing events is good. Not capturing them is not-good.
28 Alerts. That's in the middle, where it should be. Not too noisy, not too quiet. Telemetry, I feel, is critical, whereas alerting is configurable, but only on top of detections and telemetry.
So our Apex One product ran into a mean and ruthless bear and came away healthy. But that summary is a simplification and doesn’t capture all the nuance to the testing. Below are my takeaways for you of what the MITRE ATT&CK Framework is, and how to go about interpreting the results.
Takeaway #1 – ATT&CK is Scenario Based
The MITRE ATT&CK Framework is intriguing to me as it mixes real world attack methods by specific adversaries with a model for detection for use by SOCs and product makers. The ATT&CK Framework Evaluations do this but in a lab environment to assess how security products would likely handle an attack by that adversary and their usual methods. There had always been a clear divide between pen testing and lab testing and ATT&CK was kind of mixing both. COZY BEAR is super interesting because those attacks were widely known for being quite sophisticated and being state-sponsored, and targeted the White House and US Democratic Party. COZY BEAR and its family of derivatives use backdoors, droppers, obfuscation, and careful exfiltration.
Takeaway #2 – Look At All The Threat Group Evals For The Best Picture
I see the tradeoff: ATT&CK evals only look at that one scenario, but that scenario is very reality-based, and with enough evals across enough scenarios there is a narrative from which to better understand a product. Trend did great on the most recently released APT29/COZY BEAR evaluation, but my point is that a product is only as good as all its evaluations. I always advised Magic Quadrant or NSS Value Map readers to look at older versions in order to paint a picture of a product's trajectory over time.
Takeaway #3 – It’s Detection Focused (Only)
The APT29 test, like most ATT&CK evals, is testing detection, not prevention nor other parts of products (e.g. support). The downside is that a product's ability to block the attacks isn't evaluated, at least not yet. In fact, blocking functions have to be disabled for parts of the test to be done. I get that: you can't test the upstairs alarm with the attack dog roaming the downstairs. Starting with poor detection never ends well, so the test methodology seems to be built on "if you can detect it you can block it". Some pen tests are criticized because a specific scenario isn't realistic, since A would stop it before B could ever occur. IPS signature writers everywhere should nod in agreement on that one. I support MITRE on how they constructed the methodology, because there have to be limitations and scope on every lab test, but readers too need to understand those limitations and scopes. I believe that the next round of tests will include protection (blocking) as well, so that is cool.
Takeaway #4 – Choose Your Own Weather Forecast
ATT&CK is no magazine-style review. There is no final grade or comparison of products. To fully embrace ATT&CK, imagine being provided dozens of very sound yet complex meteorological measurements and being left to decide what the weather will be. Or having vendors carpet-bomb you with press releases of their interpretations. I've been deep into the numbers of the latest eval scores, and when looking at some of the blogs and press releases out there, they almost had me convinced they did well even when the data at hand showed they didn't. I guess a less jaded view is that the results can be interpreted in many ways, some of them quite creative. It brings to mind the great quote from the Lockpicking Lawyer review: "the threat model does not include an attacker with a screwdriver".
Josh Zelonis at Forrester provides a great example of the level of work required to parse the test outcomes, and he provides extended analysis on Github here that is easier on the eyes than the above. Even that great work product requires the context of what the categories mean. I understand that MITRE is taking the stance of “we do the tests, you interpret the data” in order to pick fewer fights and accommodate different use cases and SOC workflows, but that is a lot to put on buyers. I repeat: there’s a lot of nuance in the terms and test report categories.
In the absence of Josh's work, if I have to pick one metric, Detection Rate is likely the best one. Note that Detection Rate isn't 100% for any product in the APT29 test, because of the meaning of that metric. The best secondary metrics, in my view, are Techniques and Telemetry. Tactics sounds like a good thing, but in the framework it is lesser than Techniques, as Tactics are generalized bad things ("Something moving outside!") and Techniques are more specific detections ("Healthy adult male lion seen outside door"), so a higher score in Techniques combined with a low score in Tactics is a good thing. Alert scoring is, to me, best right in the middle: not too many alerts (noisy/fatiguing) and not too few ("about that lion I saw 5 minutes ago").
Here's an example of the interpretations that are valuable to me. Looking at the Trend Micro eval source page here, I get info on detections in the steps, or how many of the 134 total steps in the test were detected. I'll start by excluding any human involvement, excluding the MSSP detections, and looking at unassisted results only. But the numbers are spread across all 20 test steps, so I'll use Josh's spreadsheet, which shows 115 of 134 steps visible, or 85.82%. I do some averaging on the visibility scores across all the products evaluated and get 66.63%, which is almost 30% less. Besides the lesson that the data needs gathering and interpretation, this highlights that no product spotted 100% across all steps, and the spread was wide. I'll now look at the impact of human involvement: add in the MSSP detections and the Trend number goes to 91%. Much clinking of glasses heard from the endpoint dev team. But if I'm not using an MSSP service then... you see my point about context/use-case/workflow. There's effectively some double counting of the MSSP factor (i.e. a penalty, so that removing MSSP inordinately drops the detection rate) when removing it in the analyses, but I'll leave that to a future post. There's no shortage of fodder for security testing nerds.
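The arithmetic here is easy to reproduce. A quick sketch (the 115-of-134 figure is from Josh's spreadsheet as cited above; the per-product numbers in the averaging example are made up purely for illustration):

```python
def visibility_pct(steps_visible: int, total_steps: int) -> float:
    """Share of the evaluation's test steps where the product had visibility."""
    return round(steps_visible / total_steps * 100, 2)

# Trend's unassisted visibility in the APT29 eval: 115 of 134 steps
print(visibility_pct(115, 134))  # 85.82

# Field average across evaluated products (hypothetical per-product step
# counts here; the actual average cited in the text is 66.63%)
field = [115, 100, 80, 62]
avg = sum(visibility_pct(v, 134) for v in field) / len(field)
print(round(avg, 2))
```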
Takeaway #5 – Data Is Always Good
Security test nerdery aside, this eval is a great thing and the data from it is very valuable. Having this kind of evaluation makes security products and the uses we put them to better. So dig into ATT&CK and read it considering not just product evaluations but how your organization’s framework for detecting and processing attacks maps to the various threat campaigns. We’ll no doubt have more posts on APT29 and upcoming evals.
*I was a Common Criteria tester in a place that also ran a FIPS 140-2 lab. Did you know that at Level 4 of FIPS a freezer is used as an exploit attempt? I even dipped my toe into the arcane area of Formal Methods using the GYPSY methodology and ran from it screaming “X just equals X! We don’t need to prove that!”. The deepest testing rathole I can recall was doing a portability test of the Orange Book B1 rating for MVS RACF when using logical partitions. I’m never getting those months of my life back. I’ve been pretty active in interacting with most security testing labs like NSS and ICSA and their schemes (that’s not a pejorative, but testing nerds like to use British usages to sound more learned) for decades because I thought it was important to understand the scope and limits of testing before accepting it in any product buying decisions. If you want to make Common Criteria nerds laugh point out something bad that has happened and just say “that’s not bad, it was just mistakenly put in scope”, and that will then upset the FIPS testers because a crypto boundary is a very real thing and not something real testers joke about. And yes, Common Criteria is the MySpace of tests.
The post Getting ATT&CKed By A Cozy Bear And Being Really Happy About It: What MITRE Evaluations Are, and How To Read Them appeared first on .