Weaponizing AI to orchestrate cyber attacks

Introduction

Since the coinage of the term in 1956, Artificial Intelligence (AI) has evolved considerably. From its metaphorical reference in Mary Shelley’s Frankenstein to its most popular recent application in autonomous cars, AI has made a progressive shift over the years. It influences all the major industries, such as transportation, communication, banking, education, healthcare, and media.

When it comes to cybersecurity, AI is changing how we detect and respond to threats. However, with the benefits comes the risk of potential misuse of AI capabilities. Is the primary catalyst for cybersecurity also a threat to it?

How do we use AI in our daily life?

Social media users encounter AI on a daily basis and probably don’t recognize it at all. Online shopping recommendations, image recognition, personal assistants such as Siri and Alexa, and smart email replies are the most popular examples.

For instance, Facebook identifies individual faces in a photo and helps users “tag” and notify them. Businesses often embed chatbots in their websites and applications. These AI-driven chatbots detect key words in customers’ questions to predict and deliver prompt responses.

How do malicious actors abuse and weaponize AI?

To orchestrate attacks, cyber criminals often tinker with existing AI systems instead of developing new AI programs and tools. Some common attacks that exploit Artificial Intelligence include:

  • Misusing the nature of AI algorithms/systems: AI capabilities such as efficiency, speed, and accuracy can be used to devise precise, hard-to-detect attacks such as targeted phishing and fake-news delivery.
  • Input attacks/adversarial attacks: Attackers feed subtly altered inputs into AI systems to trigger unexpected or incorrect results.
  • Data poisoning: Malicious actors corrupt AI training data sets by poisoning them with bad data, degrading the system’s accuracy.
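The second category can be illustrated with a toy linear classifier (all weights below are random stand-ins, not a real detection system). For a linear model, nudging every input feature slightly against the sign of its weight is enough to flip the decision while keeping the perturbation small:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=20)          # weights of a toy linear classifier
b = 0.0
x = rng.normal(size=20)          # a clean input
if w @ x + b < 0:
    x = -x                       # ensure the clean input is classified positive

# Adversarial input: step each feature against the sign of its weight,
# using just enough magnitude to cross the decision boundary.
score = w @ x + b
eps = 1.1 * score / np.abs(w).sum()
x_adv = x - eps * np.sign(w)

clean_pred = (w @ x + b) > 0       # True: original decision
adv_pred = (w @ x_adv + b) > 0     # False: flipped by a small perturbation
```

Real attacks against deep models use the same principle, following the gradient of the model instead of the raw weights.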

Examples of how AI can be weaponized

GPT-2 text generator/ language models 

In November 2019, OpenAI released the latest and largest version of GPT-2 (Generative Pretrained Transformer 2). This language model is trained to generate original text based on a given input, tailoring the style and subject of the output to that input. So, if you feed it a specific topic or theme, GPT-2 will yield a few lines of text on it. GPT-2 is exceptional in that it doesn’t reproduce pre-existing strings, but generates content that didn’t exist before the model created it.

Drawbacks of GPT-2

The language model is built with 1.5 billion parameters, was trained on 8 million text documents, and achieved a “credibility score” of 6.9 out of 10. As a result, OpenAI claims that “GPT-2 outperforms other language models.” The text generated by GPT-2 can be hard to distinguish from text composed by a human. Since detecting this synthetic text is challenging, creating spam emails and messages, spreading fake news, or performing targeted phishing attacks becomes easier.

Image recognition software

Image recognition is the process of identifying pixels and patterns to detect objects in digital images. The latest smartphones (for biometric authentication), social networking platforms, Google reverse image search, and more use facial recognition. AI-based face recognition software detects faces in the camera’s field of vision. Given its many uses across industries and domains, researchers expect the image recognition software market to reach a whopping USD 39 billion by 2021.

Drawbacks of image recognition software

Major smartphone brands now use facial recognition instead of fingerprint recognition in their biometric authentication systems. Since this cutting-edge technology is popular among consumers, cyber criminals have found ways to exploit it.

  • Tricking facial recognition: It has been demonstrated that Apple’s Face ID can be duped using 3D masks. There are also other instances of deceiving facial recognition with infrared lights, glasses, etc. Identical twins, such as myself, can swap our smartphones to trick even the most efficient algorithms currently available.
  • Blocking automated facial recognition: Since facial recognition depends on key features of the face, altering those features can block automated recognition, and researchers are exploring various ways to do exactly that.
Altering facial features (by CVDazzle)

For example, researchers found that minor modifications to a stop sign confuse autonomous cars. If carried out in real life, such attacks could have severe consequences.

Subtle alterations to the sign come at a cost (by securityintelligence)

Poisoned training sets

Machine learning algorithms, which power Artificial Intelligence, learn by extracting patterns from data sets (training sets).

Poisoning Machine Learning models

Drawbacks of Machine Learning algorithms

Attackers can poison training sets with bad data to degrade a system’s accuracy. They can even “teach” the model to behave differently, through a backdoor or otherwise. As a result, the model fails to work in the intended way and remains corrupted.
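To make this concrete, here is a toy sketch of data poisoning (NumPy only; the two-cluster data and the nearest-centroid “model” are illustrative stand-ins for a real training pipeline). The attacker injects points deep inside class 0’s region but labels them class 1, dragging class 1’s learned centroid off target:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Toy training set: class 0 clustered around (-1,-1), class 1 around (+1,+1).
X = np.vstack([rng.normal(-1, 1, (n, 2)), rng.normal(+1, 1, (n, 2))])
y = np.array([0] * n + [1] * n)
X_test = np.vstack([rng.normal(-1, 1, (n, 2)), rng.normal(+1, 1, (n, 2))])
y_test = y.copy()

def nearest_centroid_accuracy(X_tr, y_tr):
    """Fit a nearest-centroid classifier and score it on the held-out set."""
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    pred = (np.linalg.norm(X_test - c1, axis=1)
            < np.linalg.norm(X_test - c0, axis=1)).astype(int)
    return (pred == y_test).mean()

clean_acc = nearest_centroid_accuracy(X, y)

# Poisoning: inject points far inside class 0's territory, labeled as class 1.
X_bad = rng.normal(-2, 0.3, (250, 2))
X_poisoned = np.vstack([X, X_bad])
y_poisoned = np.concatenate([y, np.ones(250, dtype=int)])

poisoned_acc = nearest_centroid_accuracy(X_poisoned, y_poisoned)
```

After the injection, the poisoned model’s test accuracy drops measurably below the clean model’s, even though the test data itself is untouched.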

In the most unusual of ways, Microsoft’s AI chatbot, Tay, was corrupted by Twitter trolls. The smart chatbot was released on an experimental basis, to engage people in “playful conversations.” However, Twitter users deluged the chatbot with racist, misogynistic, and antisemitic tweets, turning Tay into a mouthpiece for a terrifying ideology in under a day.

What next?

AI is here to stay. So, as we build Artificial Intelligence systems that can efficiently detect and respond to cyber threats, we should take small steps to ensure they are not exploited:

  1. Focus on basic cybersecurity hygiene, including network security and anti-malware systems.
  2. Ensure there is some human monitoring/intervention, even for the most advanced AI systems.
  3. Teach AI systems to detect foreign data based on timestamps, data quality, etc.

Hierarchical Attention Neural Networks: New Approaches for Text Classification

by Bofin Babu, Machine Learning Lead

Text classification is an important task in Natural Language Processing in which predefined categories are assigned to text documents. In this article, we will explore recent approaches for text classification that consider document structure as well as sentence-level attention.

In general, a text classification workflow is like this:

  labeled texts → feature extraction → features + labels → learning algorithm → saved classifier model

You collect a lot of labeled sample texts for each class (your dataset). Then you extract some features from these text samples. The text features from the previous step along with the labels are then fed into a machine learning algorithm. After the learning process, you’ll save your classifier model for future predictions.
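The workflow above can be sketched end to end with scikit-learn (assuming it is available; the four-document dataset, labels, and query are purely illustrative):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Step 1: collect labeled sample texts for each class.
texts = [
    "great movie loved the acting",
    "wonderful film superb cast",
    "terrible plot boring scenes",
    "awful movie waste of time",
]
labels = ["pos", "pos", "neg", "neg"]

# Step 2: extract features (here, bag-of-words counts).
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(texts)

# Step 3: feed features and labels into a learning algorithm.
clf = MultinomialNB().fit(features, labels)

# Step 4: reuse the fitted model for future predictions.
pred = clf.predict(vectorizer.transform(["superb acting loved it"]))
# pred[0] == "pos": every query word seen in training appears only in positive examples
```

The fitted `vectorizer` and `clf` together play the role of the “saved classifier model” in the last step.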

One difference between classical machine learning and deep learning when it comes to classification is that in deep learning, feature extraction and classification are carried out together, whereas in classical machine learning they are usually separate tasks.

 

Proper feature extraction is an important part of machine learning for document classification, perhaps even more important than choosing the right classification algorithm. If you don’t pick good features, you can’t expect your model to work well. Before discussing feature extraction further, let’s talk about some methods of representing text.

A text document is made of sentences, which in turn are made of words. Now the question is: how do we represent them in a way that features can be extracted efficiently?

Document representation can be based on two models:

  1. Standard Boolean models: These models use Boolean logic and set theory for information retrieval from text. They are inefficient compared to the vector space models below for many reasons and are not our focus.

  2. Vector space models: Almost all existing text representation methods are based on VSMs. Here, documents are represented as vectors.

Let’s look at two techniques by which vector space models can be implemented for feature extraction.

Bag of words (BoW) with TF-IDF

In the bag-of-words model, a text is represented as the bag (multiset) of its words. The example below will make this clear.

Here are two sample text documents:

(i) Bofin likes to watch movies. Rahul likes movies too.

(ii) Bofin also likes to play tabla.

Based on the above two text documents, we can construct a list of words in the documents as follows.

[ “Bofin”, “likes”, “to”, “watch”, “movies”, “Rahul”, “too”, “also”, “play”, “tabla” ]

Now, if we consider a simple bag-of-words model with term frequency (the number of times a term appears in the text), the feature lists for the two documents will be:

(i) [1, 2, 1, 1, 2, 1, 1, 0, 0, 0]

(ii) [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

A simple term frequency like this is not always a good way to characterize text. For large text documents, we use something called term frequency – inverse document frequency (tf-idf). tf-idf is the product of two statistics: term frequency (tf) and inverse document frequency (idf). We’ve just seen from the above example what tf is; now let’s understand idf. In simple terms, idf measures how common a word is across all the documents. If a word occurs frequently inside a document, that word will have a high term frequency; if it also occurs frequently in the majority of the documents, it will have a low inverse document frequency. Basically, idf helps us filter out words like the, i, and an that occur frequently but are not important for determining a document’s distinctiveness.
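The counts and weights above can be reproduced in a few lines of plain Python (the regex tokenizer is an assumption that is good enough for this example):

```python
import math
import re

docs = [
    "Bofin likes to watch movies. Rahul likes movies too.",
    "Bofin also likes to play tabla.",
]
vocab = ["Bofin", "likes", "to", "watch", "movies",
         "Rahul", "too", "also", "play", "tabla"]

def term_freq(text):
    """Count how many times each vocabulary word appears in the text."""
    tokens = re.findall(r"\w+", text)
    return [tokens.count(w) for w in vocab]

tf = [term_freq(d) for d in docs]
# tf[0] == [1, 2, 1, 1, 2, 1, 1, 0, 0, 0]
# tf[1] == [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

# idf(w) = log(N / df(w)): a word appearing in every document gets weight 0.
N = len(docs)
df = [sum(1 for row in tf if row[j] > 0) for j in range(len(vocab))]
idf = [math.log(N / d) for d in df]

tfidf = [[t * i for t, i in zip(row, idf)] for row in tf]
# "Bofin", "likes", and "to" occur in both documents, so their tf-idf weight is 0.
```

Notice how the shared words are zeroed out while document-specific words like “watch” or “tabla” keep a positive weight, which is exactly the filtering effect described above.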

Word embeddings with word2vec

Word2vec is a popular technique for producing word embeddings. It is based on the idea that a word’s meaning can be inferred from its surrounding words (like the proverb: “Tell me who your friends are and I’ll tell you who you are”). It produces a vector space from a large corpus, with each unique word in the corpus assigned a corresponding vector in that space. The heart of word2vec is a two-layer neural network trained to model the linguistic context of words, such that words sharing common contexts in a document lie in close proximity in the vector space.

Word2vec uses two architectures for representation. You can (loosely) think of word2vec model creation as processing every word in a document with either of these methods:

  1. Continuous bag-of-words (CBOW): In this architecture, the model predicts the current word from its context (the surrounding words within a specified window size).
  2. Skip-gram: In this architecture, the model uses the current word to predict its context.
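The two architectures differ mainly in how (input, target) training pairs are cut from the text. A small pure-Python sketch (the window size and example sentence are illustrative):

```python
def cbow_pairs(tokens, window=2):
    """(context words, target) pairs as CBOW frames the task:
    predict the current word from its surroundings."""
    return [
        (tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window], target)
        for i, target in enumerate(tokens)
    ]

def skipgram_pairs(tokens, window=2):
    """(current word, context word) pairs as skip-gram frames the task:
    predict each surrounding word from the current one."""
    return [
        (center, ctx)
        for i, center in enumerate(tokens)
        for ctx in tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
    ]

tokens = "bofin likes to play tabla".split()
print(cbow_pairs(tokens)[2])   # (['bofin', 'likes', 'play', 'tabla'], 'to')
```

The neural network is then trained on these pairs; the learned input weights become the word vectors.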

Word2vec models trained on large text corpora (like the entire English Wikipedia) have been shown to capture some interesting relations among words, as shown below.

An example of word2vec models trained on large corpora, capturing interesting relationships among words. Here, not only do the countries and their capitals cluster into two groups, the distances in vector space between them are also similar. Image source: DL4J

Following these vector space models, deep learning approaches have made progress in text representation. They can be broadly categorized as either convolutional neural network based approaches or recurrent neural network (and its successors LSTM/GRU) based approaches.

Convolutional neural networks (ConvNets) have proven impressive for computer vision applications, especially image classification. Recent research exploring ConvNets for natural language processing has shown promising results for text classification, such as charCNN, in which text is treated as a kind of raw signal at the character level and temporal (one-dimensional) ConvNets are applied to it.

Recurrent neural networks and their derivatives are perhaps the most celebrated neural architectures at the intersection of deep learning and natural language processing. RNNs can use their internal memory to process input sequences, making them a good fit for several natural language processing tasks.

Although plain neural network based approaches to text classification have been quite effective, it has been observed that better representations can be obtained by including knowledge of document structure in the model architecture. This idea stems from two common-sense observations:

  • Not all parts of a document are equally relevant for answering a query about it.
  • Finding the relevant sections in a document involves modeling the interactions of the words, not just their presence in isolation.

We’ll explore one such approach where word and sentence level attention mechanisms are incorporated for better document classification.

Applying hierarchical attention to texts

Two basic insights that distinguish hierarchical attention methods from the traditional methods we’ve discussed so far are:

  1. Words form sentences, and sentences form documents. Documents thus have a hierarchical structure, and a representation capturing this structure can be more effective.
  2. Different words and sentences in a document are informative to different extents.

To make this clear, consider a restaurant review in which the third sentence delivers strong positive sentiment, with words like amazing and superb contributing the most in defining the sentiment of that sentence.

Now let’s look at how hierarchical attention networks are designed for document classification. As mentioned earlier, these models include two levels of attention: one at the word level and one at the sentence level. This allows the model to pay more or less attention to individual words and sentences when constructing the representation of the document.

The hierarchical attention network (HAN) consists of several parts:

  1. a word sequence encoder
  2. a word-level attention layer
  3. a sentence encoder
  4. a sentence-level attention layer

Before exploring them one by one, let’s understand a bit about the GRU-based sequence encoder, which is the core of both the word and sentence encoders in this architecture.

Gated Recurrent Units (GRUs) are a variation of LSTMs (Long Short-Term Memory networks), which are in turn a kind of recurrent neural network. If you are not familiar with LSTMs, I suggest reading an introductory article on them first.

Unlike the LSTM, the GRU uses a gating mechanism to track the state of sequences without using separate memory cells. There are two types of gates, the reset gate and the update gate, which together control how information is updated in the state.
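A single GRU update can be sketched in NumPy from the standard gate equations (the parameter names, sizes, and random smoke run below are illustrative, not a trained model):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h_prev, p):
    """One GRU update: gates squash to (0,1) and blend old state with a candidate."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])       # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])       # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1 - z) * h_prev + z * h_cand                         # new hidden state

# Tiny smoke run with random parameters.
rng = np.random.default_rng(0)
E, H = 3, 4   # input size, hidden size
p = {k: rng.normal(size=(H, E)) for k in ("Wz", "Wr", "Wh")}
p.update({k: rng.normal(size=(H, H)) for k in ("Uz", "Ur", "Uh")})
p.update({k: np.zeros(H) for k in ("bz", "br", "bh")})

h = np.zeros(H)
for x in rng.normal(size=(5, E)):   # run the unit over a 5-step sequence
    h = gru_step(x, h, p)
```

Because the candidate state passes through tanh and the state is always a convex blend of old and new, the hidden values stay bounded in (-1, 1).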

In the notation that follows, w_it denotes the t-th word in the i-th sentence, T is the number of words in a sentence, and L is the number of sentences in the document.

1. Word Encoder

A bidirectional GRU is used to get annotations of words by summarizing information from both directions, thereby incorporating the contextual information.

  x_it = W_e w_it
  →h_it = →GRU(x_it),  t = 1, …, T
  ←h_it = ←GRU(x_it),  t = T, …, 1

where x_it is the word vector corresponding to the word w_it, and W_e is the embedding matrix.

We obtain an annotation for a given word w_it by concatenating the forward hidden state and the backward hidden state:

  h_it = [→h_it ; ←h_it]
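The bidirectional encoding and concatenation can be sketched as follows; to keep it short, a plain tanh recurrence stands in for the GRU cell, and all weights are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
T, E, H = 6, 10, 4   # words in the sentence, embedding size, hidden size

x = rng.normal(size=(T, E))   # embedded words x_it = We · w_it (toy values)

# Stand-in recurrences (a full implementation would use GRU cells here).
Wf, Uf = rng.normal(size=(H, E)), rng.normal(size=(H, H))
Wb, Ub = rng.normal(size=(H, E)), rng.normal(size=(H, H))

def run(seq, W, U):
    """Run a simple recurrence over the sequence, returning every hidden state."""
    h, states = np.zeros(H), []
    for v in seq:
        h = np.tanh(W @ v + U @ h)
        states.append(h)
    return states

forward = run(x, Wf, Uf)                 # reads words 1..T
backward = run(x[::-1], Wb, Ub)[::-1]    # reads words T..1, then re-align
annotations = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
# annotations[t] is h_it = [forward state ; backward state], of length 2H
```

Each annotation thus summarizes the whole sentence centered around word t, which is exactly what the attention layer operates on next.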

2. Word Attention

Not all words contribute equally to a sentence’s meaning. Therefore, we need an attention mechanism to extract the words that are important to the meaning of the sentence and aggregate their representations to form a sentence vector.

  u_it = tanh(W_w h_it + b_w)
  α_it = exp(u_it^T u_w) / Σ_t exp(u_it^T u_w)
  s_i = Σ_t α_it h_it

At first, we feed the word annotation h_it through a one-layer MLP to get u_it as a hidden representation of h_it. We then measure the importance of the word as the similarity of u_it with a word-level context vector u_w, and obtain the normalized importance weight α_it through a softmax function. Finally, the sentence vector s_i is computed as a weighted sum of the word annotations. The context vector u_w is basically a high-level representation of how informative a word is in the given sentence, and is learned during the training process.
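The word-attention step translates almost line by line into NumPy (the annotations and parameters below are random placeholders; in the real model they come from the BiGRU and from training):

```python
import numpy as np

rng = np.random.default_rng(0)
T, H = 5, 8                      # words in the sentence, annotation size

h = rng.normal(size=(T, H))      # word annotations h_it from the BiGRU
W_w = rng.normal(size=(H, H))    # one-layer MLP weights (learned in practice)
b_w = rng.normal(size=H)
u_w = rng.normal(size=H)         # word-level context vector (learned in practice)

u = np.tanh(h @ W_w.T + b_w)                     # hidden representations u_it
scores = u @ u_w                                 # similarity with u_w
alpha = np.exp(scores) / np.exp(scores).sum()    # softmax -> importance weights
s = alpha @ h                                    # sentence vector: weighted sum
```

The softmax guarantees the weights are positive and sum to one, so the sentence vector is a convex combination of the word annotations.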

3. Sentence Encoder

Similar to the word encoder, here we use a bidirectional GRU to encode sentences.

  →h_i = →GRU(s_i),  i = 1, …, L
  ←h_i = ←GRU(s_i),  i = L, …, 1

The forward and backward hidden states are computed as in the word encoder, and the sentence annotation h_i is obtained by concatenating them:

  h_i = [→h_i ; ←h_i]

Now the annotation h_i summarizes the neighboring sentences around sentence i, but still keeps the focus on sentence i.

4. Sentence Attention

To reward sentences that are clues to correctly classifying a document, we again use an attention mechanism, this time at the sentence level:

  u_i = tanh(W_s h_i + b_s)
  α_i = exp(u_i^T u_s) / Σ_i exp(u_i^T u_s)
  v = Σ_i α_i h_i

Similar to the word-level context vector, here we introduce a sentence-level context vector u_s, which is likewise learned during training.

Now the document vector v is a high-level representation of the document and can be used as features for document classification.
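As a final sketch, the document vector can be passed through a softmax layer to produce class probabilities (the weights here are random placeholders; in practice they are trained jointly with the attention layers):

```python
import numpy as np

rng = np.random.default_rng(0)
H, C = 8, 5                      # document vector size, number of classes

v = rng.normal(size=H)           # document vector from sentence-level attention
W_c = rng.normal(size=(C, H))    # classifier weights (learned in practice)
b_c = np.zeros(C)

logits = W_c @ v + b_c
p = np.exp(logits) / np.exp(logits).sum()   # softmax over document classes
predicted_class = int(p.argmax())
```

Training minimizes the negative log-likelihood of the correct class, which back-propagates through both attention levels and both encoders.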

So we’ve seen how document structure and attention can be exploited for classification. In benchmark studies, this method outperformed popular existing methods by a decent margin on datasets such as Yelp Reviews, IMDB, and Yahoo Answers.

Let us know what you think of attention networks in the discussion section below and feel free to ask your queries.