Leveraging STIX and TAXII for better Cyber Threat Intelligence (Part 1)

Modern cyberspace, with its increasingly complex attack scenarios and sophisticated modus operandi, is becoming ever more difficult to defend and secure. Given the evolving threat landscape, the speed at which events occur, and the vast quantities of data involved, defenders need a machine-readable and easily automatable way of sharing Cyber Threat Intelligence (CTI) data.

This is where STIX and TAXII come into the picture.

STIX is a structured representation of threat information that is expressive, flexible, extensible, automatable, and readable. Using STIX feeds with TAXII enables organizations to exchange cyber threat intelligence in a more structured and standardized manner, allowing for deeper collaboration against threats.

In this article, we will explore the basics of STIX and TAXII and some of their applications in the cybersecurity space.

What is STIX?

As per the OASIS guide, “Structured Threat Information Expression (STIX™) is a language and serialization format used to exchange cyber threat intelligence (CTI)”.

In short, it is a community-defined standard for sharing threat intel across organizations. Using STIX, all aspects of a potential threat, such as suspicion, compromise, and attack attribution, can be represented clearly with objects and descriptive relationships. STIX is easy to read and consume because it is serialized as JSON, and it can be integrated with popular threat intel platforms such as QRadar, ThreatConnect, etc.
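To make the idea of objects and descriptive relationships concrete, here is a minimal sketch in Python that assembles a small STIX 2.1 bundle as plain JSON: an Indicator, a Malware object, and a Relationship linking the two. The IP address, the object names, and the helper functions are illustrative assumptions and do not come from any real feed; only the property names are meant to follow the STIX 2.1 specification.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(object_type):
    # STIX identifiers take the form "<object-type>--<UUID>".
    return f"{object_type}--{uuid.uuid4()}"


def utc_now():
    # STIX timestamps are RFC 3339 strings in UTC, ending in "Z".
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def common(object_type, timestamp):
    # Properties shared by the three objects built below.
    return {
        "type": object_type,
        "spec_version": "2.1",
        "id": stix_id(object_type),
        "created": timestamp,
        "modified": timestamp,
    }


def build_example_bundle():
    ts = utc_now()
    indicator = {
        **common("indicator", ts),
        "name": "Suspicious IP address (hypothetical)",
        "pattern": "[ipv4-addr:value = '198.51.100.7']",
        "pattern_type": "stix",
        "valid_from": ts,
    }
    malware = {
        **common("malware", ts),
        "name": "example-dropper",  # made-up name for illustration
        "is_family": False,
    }
    # The relationship states that the indicator "indicates" the malware.
    relationship = {
        **common("relationship", ts),
        "relationship_type": "indicates",
        "source_ref": indicator["id"],
        "target_ref": malware["id"],
    }
    return {
        "type": "bundle",
        "id": stix_id("bundle"),
        "objects": [indicator, malware, relationship],
    }


if __name__ == "__main__":
    print(json.dumps(build_example_bundle(), indent=2))
```

In practice a dedicated STIX library would also validate the objects; plain dictionaries are used here only to show what the JSON looks like.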

Applications of STIX

(UC1) Analyzing Cyber Threats

A security analyst analyses a variety of cyber threats from different sources every day. In doing so, it is important to examine various aspects of a threat, such as its behaviour, modes of operation, capabilities, and threat actors. STIX objects make it easier to represent all the data required for this analysis.

(UC2) Specifying Indicator Patterns for Cyber Threats

An analyst often looks out for patterns in a cyber attack or a threat feed. This includes assessing the characteristics of the threat, the relevant set of observables (Indicators of Compromise (IOCs), attachments, files, IP addresses etc.), and a suggested course of action. This data, too, can be represented well by assigning the required STIX objects to a threat.
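As a rough illustration of how observables turn into an indicator pattern, the sketch below builds STIX pattern strings from a list of IOCs. The helper names, the small lookup table, and the sample values are assumptions made for this example; the object paths (such as ipv4-addr:value) are meant to follow the STIX 2.1 patterning language.

```python
# Maps a short IOC kind (our own labels) to a STIX object path.
OBJECT_PATHS = {
    "ipv4": "ipv4-addr:value",
    "domain": "domain-name:value",
    "url": "url:value",
    "sha256": "file:hashes.'SHA-256'",
}


def quote(value):
    # String literals are single-quoted; backslashes and quotes are escaped.
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def observation(kind, value):
    # One observation expression, e.g. [ipv4-addr:value = '198.51.100.7'].
    return f"[{OBJECT_PATHS[kind]} = {quote(value)}]"


def any_of(iocs):
    # Matches if any one of the listed observables is seen.
    return " OR ".join(observation(kind, value) for kind, value in iocs)


if __name__ == "__main__":
    print(any_of([("ipv4", "198.51.100.7"), ("domain", "example.com")]))
```

The resulting string would go into the pattern property of an Indicator object, with pattern_type set to "stix".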

(UC3) Managing Cyber Threat Response Activities

Remediating or preventing a cyber attack is the most important role of a security professional. After analysing the threat data, the analyst is expected to draw up a remedial action plan that safeguards the organization from future attacks. STIX enables analysts to capture and share such remedial actions in a structured form.

What is TAXII?

As per the OASIS guide, “Trusted Automated Exchange of Intelligence Information (TAXII™) is an application protocol for exchanging CTI over HTTPS.”

TAXII is a standard that defines a set of protocols for Clients and Servers to exchange CTI, along with a RESTful API (a set of services and message exchanges).

TAXII defines two primary services to support a variety of common sharing models:

Collection: A server-provided repository of objects where TAXII Clients and Servers exchange information in a request-response model.

Channel: A server-maintained channel onto which multiple producers push objects that are then consumed by many TAXII Clients; here, TAXII Clients exchange information with each other in a publish-subscribe model.

Note: The TAXII 2.1 specification reserves the keywords required for Channels but does not specify Channel services. Channels and their services will be defined in a later version of TAXII.

TAXII was specifically designed to support the exchange of CTI represented in STIX, and support for exchanging STIX 2.1 content is mandatory to implement. It is important to note, however, that STIX and TAXII are independent standards, and TAXII can also be used to transport non-STIX data.
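To give a feel for the request-response model of a Collection, the sketch below prepares (but does not send) an HTTPS request for the objects in a collection. The server address, API root, and collection ID are made up for illustration, and the function name is our own; the URL layout, the Accept media type, and the added_after and limit query parameters are meant to follow the TAXII 2.1 specification.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Media type that TAXII 2.1 clients send in the Accept header.
TAXII_MEDIA_TYPE = "application/taxii+json;version=2.1"


def objects_request(api_root, collection_id, added_after=None, limit=None):
    # Builds a GET request for <api-root>/collections/<id>/objects/.
    url = f"{api_root.rstrip('/')}/collections/{collection_id}/objects/"
    query = {}
    if added_after is not None:
        query["added_after"] = added_after  # only objects added after this time
    if limit is not None:
        query["limit"] = str(limit)  # maximum number of objects per response
    if query:
        url += "?" + urlencode(query)
    return Request(url, headers={"Accept": TAXII_MEDIA_TYPE}, method="GET")


if __name__ == "__main__":
    request = objects_request(
        "https://taxii.example.com/api1",  # hypothetical API root
        "00000000-0000-4000-8000-000000000000",  # hypothetical collection ID
        added_after="2024-01-01T00:00:00Z",
        limit=100,
    )
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen, with whatever authentication the server requires, would return a JSON envelope whose objects list carries the STIX content.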

The three principal models for TAXII

1. Hub and spoke – one repository of information
2. Source/subscriber – one single source of information
3. Peer-to-peer – multiple groups share information

Upcoming…

In Part 2, we will delve deeper into STIX architecture, implementation, and usage, and take a closer look at the different versions of TAXII and their Client and Server implementations.

References: 

  1. https://oasis-open.github.io/cti-documentation/taxii/intro.html
  2. https://oasis-open.github.io/cti-documentation/stix/intro 
  3. https://www.first.org/resources/papers/munich2016/wunder-stix-taxii-Overview.pdf
  4. https://stixproject.github.io

Weaponizing AI to orchestrate cyber attacks

Introduction

Since the term was coined in 1956, Artificial Intelligence (AI) has evolved considerably. From the idea foreshadowed metaphorically in Mary Shelley’s Frankenstein to its popular recent application in autonomous cars, AI has progressed steadily over the years. It influences all major industries, such as transportation, communication, banking, education, healthcare, and media.

When it comes to cybersecurity, AI is changing how we detect and respond to threats. However, along with the benefits comes the risk of AI capabilities being misused. Is the primary catalyst for cybersecurity also a threat to it?

How do we use AI in our daily life?

Social media users encounter AI on a daily basis and probably don’t recognize it at all. Online shopping recommendations, image recognition, personal assistants such as Siri and Alexa, and smart email replies are the most popular examples.

For instance, Facebook identifies individual faces in a photo and helps users “tag” and notify them. Businesses often embed chatbots in their websites and applications. These AI-driven chatbots detect words in the questions entered by customers to predict and deliver prompt responses.

How do malicious actors abuse and weaponize AI?

To orchestrate attacks, cyber criminals often tinker with existing AI systems, instead of developing new AI programs and tools. Some common attacks that exploit Artificial Intelligence include: 

  • Misusing the nature of AI algorithms/systems: AI capabilities such as efficiency, speed, and accuracy can be used to devise precise, hard-to-detect attacks such as targeted phishing or the delivery of fake news.
  • Input attacks/adversarial attacks: Attackers can feed altered inputs into AI systems to trigger unexpected or incorrect results.
  • Data poisoning: Malicious actors corrupt AI training data sets by poisoning them with bad data, affecting the system’s accuracy.
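The input-attack idea from the list above can be sketched with a toy example. The code below uses a tiny linear classifier with made-up weights and nudges every input feature by a small amount in the direction that lowers the model's score, in the spirit of the fast gradient sign method (for a linear model, the gradient of the score with respect to the input is simply the weight vector). The weights, input values, and step size are assumptions chosen for illustration, not a real model.

```python
def score(weights, bias, x):
    # Linear model: positive score means class 1, otherwise class 0.
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, x, epsilon):
    # Move each feature by at most epsilon against the sign of its weight,
    # which lowers the score as much as possible for that budget.
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


WEIGHTS = [2.0, -1.0, 0.5]  # hypothetical trained weights
BIAS = -0.25
ORIGINAL = [0.4, 0.3, 0.2]  # input the model assigns to class 1

if __name__ == "__main__":
    adversarial = perturb(WEIGHTS, ORIGINAL, epsilon=0.15)
    print("original prediction:", predict(WEIGHTS, BIAS, ORIGINAL))
    print("adversarial prediction:", predict(WEIGHTS, BIAS, adversarial))
```

No feature changes by more than 0.15, yet the predicted class flips, which is the essence of an adversarial input.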

Examples of how AI can be weaponized

GPT-2 text generator/ language models 

In November 2019, OpenAI released the latest and largest version of GPT-2 (Generative Pretrained Transformer 2). This language model is trained to generate unique textual content based on a given input. It even tailors the style and subject of the output to the input. So, if you input a specific topic or theme, GPT-2 will yield a few lines of text. GPT-2 is exceptional in that it doesn’t reproduce pre-existing strings, but generates original content that didn’t exist before the model created it.

Drawbacks of GPT-2

The language model is built with 1.5 billion parameters and has a “credibility score” of 6.9 out of 10. The model was trained on 8 million text documents. As a result, OpenAI claims that “GPT-2 outperforms other language models.” Text generated by GPT-2 can be hard to tell apart from text composed by a human. Since detecting this synthetic text is challenging, it becomes easier to create spam emails and messages, spread fake news, or perform targeted phishing attacks, among other things.

Image recognition software

Image recognition is the process of identifying pixels and patterns to detect objects in digital images. The latest smartphones (for biometric authentication), social networking platforms, Google reverse image search, etc. use facial recognition. AI-based face recognition software detects faces in the camera’s field of vision. Given its multiple uses across industries and domains, researchers expect the image recognition software market to reach USD 39 billion by 2021.

Drawbacks of image recognition software

Major smartphone brands are now using facial recognition instead of fingerprint recognition in their biometric authentication systems. Since this cutting-edge technology is popular among consumers, cyber criminals have found ways to exploit it.

  • Tricking facial recognition: It has been demonstrated that Apple’s Face ID can be duped using 3D masks. There are also other instances of deceiving facial recognition with infrared lights, glasses, etc. Identical twins, such as myself, can swap smartphones to trick even the most efficient algorithms currently available.
  • Blocking automated facial recognition: As facial recognition depends on key features of the face, altering those features can block automated facial recognition. Researchers are exploring various ways in which automated facial recognition can be blocked.
Altering facial features (by CVDazzle)

For example, researchers found that minor modifications to a stop sign can confuse autonomous cars. If exploited in real life, such techniques could have severe consequences.

Subtle alterations to the sign come at a cost (by securityintelligence)

Poisoned training sets

The machine learning algorithms that power Artificial Intelligence learn from data sets (training sets) by extracting patterns from them.

Poisoning Machine Learning models

Drawbacks of Machine Learning algorithms

Attackers can poison training sets with bad data, to alter a system’s accuracy. They can even “teach” the model to behave differently, through a backdoor or otherwise. As a result, the model fails to work in the intended way, and will remain corrupted.
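A toy example makes the effect of poisoning visible. The sketch below trains a very simple nearest-centroid classifier on one-dimensional data, first on clean samples and then on the same samples plus a handful of deliberately mislabeled points. The data values and the choice of classifier are assumptions made purely for illustration; real attacks target far more complex models, but the mechanism is the same.

```python
def train(samples):
    # Nearest-centroid "model": the mean value of each class.
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, value):
    # Assign the class whose centroid is closest to the value.
    return min(model, key=lambda label: abs(value - model[label]))


# Clean training set: class 0 clusters around 1.0, class 1 around 3.0.
CLEAN = [(0.8, 0), (1.0, 0), (1.2, 0), (2.8, 1), (3.0, 1), (3.2, 1)]
# Poison: points that look like class 0 but carry the label 1.
POISON = [(1.4, 1), (1.5, 1), (1.6, 1), (1.7, 1)]

if __name__ == "__main__":
    clean_model = train(CLEAN)
    poisoned_model = train(CLEAN + POISON)
    print("clean model says:", predict(clean_model, 1.8))
    print("poisoned model says:", predict(poisoned_model, 1.8))
```

The four mislabeled points drag the class 1 centroid towards class 0, so an input the clean model assigns to class 0 is assigned to class 1 by the poisoned model.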

In one unusual case, Microsoft’s AI chatbot, Tay, was corrupted by Twitter trolls. Microsoft released the chatbot as an experiment, to engage people in “playful conversations.” However, Twitter users deluged the chatbot with racist, misogynistic, and anti-semitic tweets, turning Tay into a mouthpiece for a terrifying ideology in under a day.

What next?

AI is here to stay. So, as we build Artificial Intelligence systems that can efficiently detect and respond to cyber threats, we should take small steps to ensure they are not exploited:

  1. Focus on basic cybersecurity hygiene including network security and anti-malware systems.
  2. Ensure there is some human monitoring/intervention even for the most advanced AI systems.
  3. Teach AI systems to detect foreign data based on timestamps, data quality, etc.
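The third step can be sketched as a simple pre-training filter. The example below keeps only records that were collected inside a trusted time window and whose feature values fall within an expected range. The record layout, the field names, the window, and the range are all assumptions made for this illustration; a real pipeline would use checks suited to its own data.

```python
from datetime import datetime, timezone


def is_trusted(record, window_start, window_end, valid_range=(0.0, 1.0)):
    # Timestamp check: the record must come from the trusted collection window.
    in_window = window_start <= record["collected_at"] <= window_end
    # Quality check: every feature must lie inside the expected range.
    low, high = valid_range
    in_range = all(low <= value <= high for value in record["features"])
    return in_window and in_range


def filter_training_data(records, window_start, window_end):
    return [r for r in records if is_trusted(r, window_start, window_end)]


WINDOW_START = datetime(2024, 1, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 1, 31, tzinfo=timezone.utc)

RECORDS = [
    {"id": "a", "collected_at": datetime(2024, 1, 10, tzinfo=timezone.utc), "features": [0.2, 0.7]},
    # Arrived after the trusted window closed.
    {"id": "b", "collected_at": datetime(2024, 3, 5, tzinfo=timezone.utc), "features": [0.3, 0.6]},
    # Inside the window, but a feature value is out of range.
    {"id": "c", "collected_at": datetime(2024, 1, 12, tzinfo=timezone.utc), "features": [0.4, 9.5]},
]

if __name__ == "__main__":
    kept = filter_training_data(RECORDS, WINDOW_START, WINDOW_END)
    print([record["id"] for record in kept])
```

Checks like these will not stop a careful attacker on their own, which is why they belong alongside the human monitoring mentioned in the second step.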