With Herbie, humanoid robots are finally here

 

The concept of anthropomorphic robots has always captured the human imagination, from playwright Karel Čapek, who introduced the word “robot” in the 1920s, to Isaac Asimov’s incredible science fiction. In reality, however, robots have largely been relegated to menial tasks such as lifting objects or executing repetitive actions, as is the case with industrial robots.

However, recent developments in artificial intelligence (AI) and advanced automation could give us humanoid robots capable of carrying out multiple functions: not only replacing manual labour in manufacturing industries, but also serving significant domestic, everyday uses. While AI enables machines to mimic remarkable human cognitive functions, developments in automation have widened the scope of its usage across areas such as agriculture, manufacturing, business, analytics, and more.

More significantly, AI has transformed elementary robots into intelligent humanoids capable of interpreting voice commands and gestures, interacting with human beings and their environment, moving around, and performing other computational tasks.

In this article we will discuss one such humanoid known as Herbie and delve into its mechanical structure and sensory faculties. 

 

Herbie

This humanoid model is a type of Socially Assistive Robot (SAR), developed to facilitate the rehabilitation of children with Cerebral Palsy (CP). CP is a congenital disorder caused by abnormal brain development. It affects muscular movement and motor skills, and results in various disabilities. While Herbie is not the first attempt to provide assistance in the field of medicine and therapy, it is a multi-functional humanoid built to address real-time challenges by leveraging different faculties such as vision, speech and hearing, and smell.

 

Mechanical Structure of Herbie

The structure of Herbie is made of acrylic sheets which are light-weight. Herbie consists of a skeleton and various chambers, as well as a base and a head. It also has different subsections that hold electronic circuits and enable power supply. The skeletal structure consists of an acrylic tube that is 200mm in diameter and 400mm in height. The base of Herbie is made up of a circular acrylic sheet that has a radius of 192mm and a thickness of 5mm.

Herbie: Humanoid Robots

The base also holds two clamps for the motors, with two castor wheels attached beneath it. The motors are DC motors that run at 60rpm. Each motor is individually powered and controlled by a driver through a Raspberry Pi, which functions as Herbie’s brain.

Herbie base

 

The latest upgraded version of Herbie is more efficient and stable, with comparatively low power consumption. 

Humanoid Robot Herbie

Application of Intelligence

The artificial cognition of Herbie facilitates:

  • Image processing, through the faculty of vision
  • Speech processing, through the faculties of hearing and speaking
  • Gas detection, through the faculty of smell
  • Ultrasonic distance sensing, through the faculty of touch

 

Faculty of vision and image processing

Herbie’s faculty of vision is used particularly for image processing that can help with:

  • Surveillance or obstacle detection
  • Navigation or path detection
  • Face recognition

 

Obstacle detection using Herbie

Herbie: Detection

The camera affixed to Herbie captures images and identifies obstacles as well as the path. The image captured is then mapped and labeled using a Planner. Based on the label, the subsequent motion commands are configured. For instance, if the label is “X,” then a motion command to move forward is configured. And if the label is “Y,” it should move to the right. 

 

Algorithm used for this purpose

Herbie Algorithm for obstacle detection

  • The first step in the algorithm is to detect obstacles from a captured image.
  • The image is then converted to grayscale. A grayscale image has a single channel, so it carries less information to process than a colour (RGB) image.
  • The image is then resized and divided into M x N matrices.
  • The edge in the image represents the outline of the obstacle. The surface and the shadow in the image are composed of darker or brighter pixels, which helps Herbie tell the difference.
  • The edge of the obstacle is extracted by calculating the variance in a small square window of width w pixels.
  • The variance value V(p,q) is calculated for each pixel, and a threshold (Th) is fixed for the calculation:

Th = b[V(p,q)] + tσ, where σ is the standard deviation of V(p,q) and t is determined experimentally.

  • If the variance at a pixel is greater than the threshold, Herbie identifies it as part of an obstacle. If it is lower than the threshold, Herbie continues to process images while moving in the default direction.
  • The surface of obstacles is determined by extracting bright and dark pixels.
  • Areas that do not generate edge or surface data are considered to be minute areas of irregularity on the surface. Herbie rejects this as it does not detect any obstacle. 
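The variance-and-threshold step above can be sketched in a few lines of plain JavaScript. This is only an illustration, not Herbie's actual code: the function names are hypothetical, the image is a plain 2D array of grayscale values, and the first term of the threshold (written as b[V(p,q)] in the article) is assumed here to be the mean of the variance map.

```javascript
// Local variance V(p,q) of a grayscale image inside a w x w window.
function localVariance(gray, w) {
  const rows = gray.length;
  const cols = gray[0].length;
  const half = Math.floor(w / 2);
  const V = [];
  for (let p = 0; p < rows; p++) {
    const row = [];
    for (let q = 0; q < cols; q++) {
      let sum = 0, sumSq = 0, n = 0;
      // The window is clipped at the image borders.
      for (let i = Math.max(0, p - half); i <= Math.min(rows - 1, p + half); i++) {
        for (let j = Math.max(0, q - half); j <= Math.min(cols - 1, q + half); j++) {
          const v = gray[i][j];
          sum += v;
          sumSq += v * v;
          n++;
        }
      }
      const mean = sum / n;
      row.push(sumSq / n - mean * mean);
    }
    V.push(row);
  }
  return V;
}

// Marks pixels whose variance exceeds Th = mean(V) + t * sigma.
// ASSUMPTION: the first term of Th is the mean of V; t is tuned by experiment.
function edgeMask(gray, w, t) {
  const V = localVariance(gray, w);
  const flat = V.flat();
  const mean = flat.reduce((a, b) => a + b, 0) / flat.length;
  const sigma = Math.sqrt(
    flat.reduce((a, b) => a + (b - mean) ** 2, 0) / flat.length
  );
  const threshold = mean + t * sigma;
  return V.map(row => row.map(v => v > threshold));
}
```

On an image with a sharp dark-to-bright step, only the pixels whose window straddles the step have non-zero variance, so only they are flagged as edge pixels.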

  Humanoid robots: ROI

 

Humanoid robots: grayscale ROI

 

Image after variance

 

Result of obstacle detection

 

Faculty of Smell and Gas Sensors

Gas sensors are capable of detecting various gases as well as smoke. Such sensors consist of a sensing material that has low conductivity when the air is clear of high concentrations of gases such as LPG, propane, hydrogen, methane, smoke, and carbon monoxide.

Conversely, the sensing material becomes highly conductive when the concentration of such gases in the atmosphere is higher.

Humanoid Robots: Gas sensors

  • In the case of Herbie, an MQ-2 gas sensor is used. 
  • The detection range is 300-10,000ppm. 
  • MQ-2 can function between the temperature range of -20℃ to 50℃.
  • The sensing material of MQ-2 is SnO2. 
  • The response time is less than 10 seconds. 
  • The analog reading is then converted to sensor voltage using the following equation:

Sensor Voltage = Analog Reading × 3.3 V / 4095

  • The concentration of gas in PPM is then calculated using the equation:

PPM = 10.938 × e^(1.7742 × Sensor Voltage)
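The two conversions can be chained as below. This is a minimal sketch that uses the constants from the equations above; the function names are hypothetical, and the 4095 full-scale value is taken to mean a 12-bit ADC.

```javascript
const ADC_FULL_SCALE = 4095; // full-scale reading used in the equation above
const REFERENCE_VOLTS = 3.3; // reference voltage used in the equation above

// Converts a raw analog reading to the sensor voltage.
function sensorVoltage(analogReading) {
  return (analogReading * REFERENCE_VOLTS) / ADC_FULL_SCALE;
}

// Converts the sensor voltage to a gas concentration in PPM,
// using the exponential fit quoted in the article.
function gasPpm(voltage) {
  return 10.938 * Math.exp(1.7742 * voltage);
}

// Example: convert a mid-scale reading.
const volts = sensorVoltage(2048);
console.log(volts.toFixed(3), "V ->", gasPpm(volts).toFixed(1), "ppm");
```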

 

Faculty of touch and ultrasonic sensor

Ultrasonic sensors are used to measure the distance to an object, and they are usually environmentally independent. The sensor module transmits sound pulses and picks up the echo generated. It measures the time lapse between sending and receiving these pulses to calculate the distance to an object or obstacle.

Humanoid Robots: Ultrasonic sensors

  • Herbie uses HC-SR04 ultrasonic sensors. 
  • It has a resolution of 3mm and a ranging distance of 2-50cm.
  • As soon as the module detects an object, it transmits the information to a remote station using a wireless transceiver.
  • A pair of ZigBee modules is used to transmit and receive data.
  • When a high pulse of 10μs is applied to the trigger pin, the sensor transmits a burst of 8 pulses at 40kHz.
  • The range is then calculated from the time between the signal leaving and its echo returning, as shown in the equation below:

Distance (cm) = Time (μs) / 58
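The divisor 58 follows from the speed of sound. The sketch below assumes the echo time is measured in microseconds, as the HC-SR04 datasheet specifies, and a speed of sound of roughly 343 m/s at room temperature; the function names are hypothetical.

```javascript
// Datasheet shortcut: echo time in microseconds divided by 58 gives centimetres.
function echoToCentimeters(echoMicroseconds) {
  return echoMicroseconds / 58;
}

// Same result from first principles: the pulse travels to the object and back,
// so distance = (time * speed of sound) / 2.
// ASSUMPTION: 343 m/s, i.e. air at about 20 degrees Celsius.
function echoToCentimetersFromSpeed(echoMicroseconds, speedMetersPerSecond = 343) {
  const centimetersPerMicrosecond = (speedMetersPerSecond * 100) / 1e6;
  return (echoMicroseconds * centimetersPerMicrosecond) / 2;
}

console.log(echoToCentimeters(580));          // shortcut formula
console.log(echoToCentimetersFromSpeed(580)); // physical formula, nearly the same
```

One centimetre of range corresponds to a 2cm round trip, which takes about 58.3μs at 343 m/s, which is where the constant comes from.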

Faculty of speech and hearing and speech recognition

The speech recognition process is composed of three major modules: 

  1. Acoustic analysis
  2. Training
  3. Testing

Architecture for speech processing

  • Multiple utterances of the same words that are part of the vocabulary are recorded. 
  • These acoustic signals are then processed using the acoustic analysis module. 
  • A knowledge base for the speech recognition system is developed using the training module.
  • And the testing module is employed for system testing.
  • A multilingual speech recognition system for associated words is developed using Hidden Markov Model Toolkit (HTK).
  • The system uses Mel Frequency Cepstral Coefficients (MFCC) as feature vectors for each speech unit. 
  • The speech samples are recorded at a sampling rate of 16,000Hz and are represented using 16 bits. 
  • For Herbie, we chose three languages for recognition: Hindi, English, and Kannada.
  • The system’s vocabulary consists of 75 different commands, 25 from each of these three languages.
  • Commands include words such as front, backward, left, right, etc.
  • The confusion matrix for speech recognition in English shows an accuracy of 98%.
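As a small aside, this is how an accuracy figure is typically read off a confusion matrix: correct predictions sit on the diagonal, and accuracy is their share of all predictions. The counts below are made up for illustration and are not Herbie's results; the function name is hypothetical.

```javascript
// Rows are the spoken commands, columns are the recognized commands.
// Accuracy = sum of the diagonal / sum of all cells.
function accuracyFromConfusion(matrix) {
  let correct = 0;
  let total = 0;
  matrix.forEach((row, i) =>
    row.forEach((count, j) => {
      total += count;
      if (i === j) correct += count;
    })
  );
  return correct / total;
}

// Hypothetical two-command example ("left" vs "right").
const exampleMatrix = [
  [49, 1],
  [1, 49],
];
console.log(accuracyFromConfusion(exampleMatrix));
```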

 

Herbie: Confusion Matrix

 

Conclusion

Soon, we plan to carry out the training and testing of Herbie in challenging environments and possibly extend its recognition capabilities to more languages. The estimated market size of assistive technology has expanded over the past couple of years. There is a growing demand for Socially Assistive Robots in the field of healthcare. And although it is not entirely possible to replace human beings, humanoid robots are here to stay and to transform the future of mankind.

ML model

Here’s how we proactively monitor XVigil’s 50+ Machine Learning models

 

Machine Learning (ML) models are essential to identifying patterns and making reliable predictions. And at CloudSEK, our models are trained to derive such predictions across data collected from more than 1000 sources. With over 50 different models running in production, monitoring these Machine Learning models is a daunting task and yet indispensable. 

The ML development life cycle consists of training and testing models, their deployment to production, and monitoring them for improved accuracy. A lack of adequate monitoring could lead to inaccurate predictions, obsolete models, and the presence of unnoticed bugs in them.

CloudSEK’s Data Engineering team works together with Data Scientists to deploy ML models and track their performances continuously. To achieve this, we ensure that the following requirements are fulfilled:

  • Model versioning: Enable multiple versions of the same models
  • Initializing observer patterns using incoming data 
  • Comparing results from different versions

At CloudSEK, different Machine Learning models and their multiple versions classify a document across various stages. The client is then alerted only to the most accurate results from efficient models, or to an ensemble of results that combines different versions.

 

What constitutes a version upgrade? 

At its core, every machine learning module is composed of two parts, and its output depends on both of them:

  • The core ML model weights file which is generated upon training a model.
  • The surrounding code statements that represent preprocessing, feature extraction, post-processing, etc. 

As a rule of thumb, any significant modifications made to these two components qualify as a version upgrade. However, minor changes or bug fixes or even static rules additions don’t lead to an upgrade, and are simply considered as regular code updates, which we track via Git. 

 

Deploying and Monitoring Models

Generally, Machine Learning models are hosted on stateless Docker containers. As soon as a container runs on a system, the containerized model starts listening to queues for messages. The container maintains a configuration file with information about the type of models, their versions, and whether these models are meant for production.

When the Docker container is built, you can pass it the latest Git commit hash of the repository, to be set as an environment variable. The diagram explains the data flow between ML models and their different versions:

Machine Learning models diagram

 

When the container is run, data is consumed from a message queue. The model name present in the configuration file determines the data that is consumed. Once it is processed, the predictions are returned as a dictionary which is then persisted into a database. 

The ML modules can also possibly return optional metadata that contains information such as the actual prediction scores, functions triggered inside, etc.

Given below is a sample of a document after processing the results from all the models:

 

{
    "document_id" : "root-001#96bfac5a46",
    "classifications_stage_1_clf_v0" : {
        "answer" : "suspected-defaced-webpage",
        "content_meta" : null,
        "hit_time" : ISODate("2019-12-24T14:54:09.892Z"),
        "commit_hash" : "6f8e8033"
    },
    "classifications_stage_2_clf_v0" : {
        "answer" : {
            "reason" : null,
            "type" : "nonthreat",
            "severity" : null
        },
        "content_meta" : null,
        "hit_time" : ISODate("2019-12-24T15:40:46.245Z"),
        "commit_hash" : null
    },
    "classifications_stage_2_clf_v1" : {
        "answer" : {
            "reason" : null,
            "type" : "nonthreat",
            "severity" : null
        },
        "content_meta" : null,
        "hit_time" : ISODate("2019-12-24T15:40:46.245Z"),
        "commit_hash" : null
    }
}
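The per-model keys in the sample document suggest how each container can write its result without touching the output of other versions. The sketch below is one possible way to do that, not CloudSEK's implementation: the configuration fields and function names are hypothetical, and the commit hash would come from the environment variable set at build time.

```javascript
// Hypothetical container configuration; the field names are illustrative.
const containerConfig = { stage: 2, model: "clf", version: 1, production: false };

// Builds a key such as "classifications_stage_2_clf_v1".
function resultKey(config) {
  return `classifications_stage_${config.stage}_${config.model}_v${config.version}`;
}

// Returns a copy of the document with this model version's result attached.
function attachPrediction(document, config, prediction, commitHash, now = new Date()) {
  return {
    ...document,
    [resultKey(config)]: {
      answer: prediction.answer,
      content_meta: prediction.meta ?? null, // optional metadata from the module
      hit_time: now.toISOString(),
      commit_hash: commitHash ?? null,       // e.g. read from an environment variable
    },
  };
}

const updated = attachPrediction(
  { document_id: "root-001#96bfac5a46" },
  containerConfig,
  { answer: { reason: null, type: "nonthreat", severity: null } },
  undefined
);
console.log(Object.keys(updated));
```

Because every version writes under its own key, several versions can classify the same document side by side, and the stored commit hash records the exact code state behind each result.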

 

How this helps us

This process allows us to find, for any given document, the exact state of all the models that classified it. We can roll back between model versions, and a minor change to the value in the configuration file lets us set the main production model apart from the test models.

A Metabase instance can be leveraged to visualize key metrics and the performance of each classifier on a dashboard. It may also contain details about the documents processed by each model, such as how many documents were classified as category X, category Y, etc. (in the case of classification tasks).

 

 

Screenshot of the internal dashboard which helps in visualizing key metrics

Monitoring also allows data scientists to study and compare the results of the various versions of the models, given that the particulars of version outputs are retrieved. This data provides them with a set of documents that reveal which output may have been influenced by a new model. This data is then added to the training data to calibrate the models.
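Comparing two versions can be as simple as listing the documents on which their answers differ. The following is a minimal sketch under the assumption that documents are stored in the shape of the sample shown earlier; the function name is hypothetical.

```javascript
// Returns the IDs of documents where two model versions gave different answers.
// Documents that only one of the versions has processed are skipped.
function disagreements(documents, keyA, keyB) {
  return documents
    .filter(doc => {
      const a = doc[keyA];
      const b = doc[keyB];
      if (!a || !b) return false;
      // Simple structural comparison; assumes both answers use the same key order.
      return JSON.stringify(a.answer) !== JSON.stringify(b.answer);
    })
    .map(doc => doc.document_id);
}

const sampleDocs = [
  {
    document_id: "d1",
    classifications_stage_2_clf_v0: { answer: { type: "nonthreat" } },
    classifications_stage_2_clf_v1: { answer: { type: "nonthreat" } },
  },
  {
    document_id: "d2",
    classifications_stage_2_clf_v0: { answer: { type: "threat" } },
    classifications_stage_2_clf_v1: { answer: { type: "nonthreat" } },
  },
  {
    document_id: "d3",
    classifications_stage_2_clf_v0: { answer: { type: "threat" } },
  },
];

console.log(
  disagreements(sampleDocs, "classifications_stage_2_clf_v0", "classifications_stage_2_clf_v1")
);
```

The resulting list is a natural candidate set for manual review before anything is added back to the training data.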

 

Neural networks

A peek into the black-box: Debugging deep neural networks for better predictions

 

Deep learning models are often criticized for being complex and opaque. They are called black-boxes because they deliver predictions and insights, but the logic behind their outputs is hard to comprehend. Because these models are built from complex, multilayer, nonlinear neural networks, Data Scientists find it difficult to ascertain the factors or reasons behind a certain prediction.

This unintelligibility makes people wary of taking important decisions based on the models’ outputs. As human beings, we trust what we understand; what we can verify. And over time, this has served us well. So, being able to show how models go about solving a problem to produce insights, will help build trust, even among people with cursory knowledge of data science. 

To achieve this it is imperative to develop computational methods that can interpret, audit, and debug such models. Debugging is essential to understanding how models identify patterns and generate predictions. This will also help us identify and rectify bugs and defects.

In this article we delve into the different methods used to debug machine learning models. 

 

 

Source: https://christophm.github.io/interpretable-ml-book/terminology.html

 

Permutation Importance

Also known as permutation feature importance, it is an algorithm that computes the sensitivity of a model to permutations/alterations in a feature’s values. In essence, feature importance evaluates each feature of your data and scores it based on its relevance or importance towards the output. Permutation feature importance, in turn, measures how much the model’s performance changes after a feature’s values have been altered, and scores the feature on that basis.

For instance, let us randomly permute or shuffle the values of a single column in the validation dataset with all the other columns intact. If the model’s accuracy drops substantially and causes an increase in error, that feature is considered “important”. On the other hand, a feature is considered ‘unimportant’ if shuffling its values doesn’t affect the model’s accuracy.
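The shuffle-and-rescore idea can be written out without any library. The sketch below is a plain JavaScript illustration of the technique, not ELI5's implementation: all names are hypothetical, the model is any function that maps a row to a label, and accuracy is used as the score.

```javascript
// Fisher-Yates shuffle that returns a new array.
function shuffled(values, random = Math.random) {
  const out = values.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Fraction of rows for which the model predicts the right label.
function accuracyOf(predict, X, y) {
  let hits = 0;
  X.forEach((row, i) => {
    if (predict(row) === y[i]) hits++;
  });
  return hits / y.length;
}

// For each feature: permute that column only, rescore, and report the drop.
// A large drop means the model relies on the feature.
function permutationImportance(predict, X, y, permute = shuffled) {
  const baseline = accuracyOf(predict, X, y);
  const importances = [];
  for (let f = 0; f < X[0].length; f++) {
    const column = permute(X.map(row => row[f]));
    const permutedX = X.map((row, i) => {
      const copy = row.slice();
      copy[f] = column[i];
      return copy;
    });
    importances.push(baseline - accuracyOf(predict, permutedX, y));
  }
  return importances;
}

// Toy model that only looks at the first feature.
const toyX = [[0, 5], [0, 3], [0, 9], [0, 1], [1, 2], [1, 8], [1, 4], [1, 7]];
const toyY = [0, 0, 0, 0, 1, 1, 1, 1];
console.log(permutationImportance(row => row[0], toyX, toyY));
```

Since a single random shuffle is noisy, implementations usually repeat the shuffle several times per feature and average the drops.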

 

Debugging ML models using ELI5

ELI5 is a Python library that supports several ML frameworks and helps to easily visualize and debug black-boxes using a unified API. It helps to compute permutation importance. But it should be noted that permutation importance is only computed on test data after the model is built.

Debugging ML models using ELI5

 

After our model is ready, we import ELI5 to calculate permutation importance.

Importing ELI5 for debugging ML models

The output for above code is shown below:

output

The features that are on top are the most important, which means that any alterations made to these values will reduce the accuracy of the model significantly. The features that are at the bottom of the list are unimportant in that any permutation made to their values will not reduce the accuracy of the model. In this example, OverallQual was the most important feature.

 

Debugging CNN-based models using Grad-CAM (Gradient-weighted Class Activation Mapping)

Grad-CAM is a technique that produces visual explanations for the outputs of Convolutional Neural Network (CNN) based models, making them more transparent. It examines the gradient information that flows into the final convolutional layer of the neural network to understand the output. Grad-CAM can be used for image classification, image captioning, and visual question answering. The output provided by Grad-CAM is a heatmap visualization, which is used to visually verify that your model is trained to look at the right patterns in an image.
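The heatmap itself comes from a simple combination step: each feature map of the convolutional layer is weighted by the average of its gradients, the weighted maps are summed, and negative values are clipped to zero. The sketch below shows only that step on small hand-written arrays. In practice the activations and gradients come from a deep learning framework, and the names used here are hypothetical.

```javascript
// activations[k][i][j]: value of feature map k at position (i, j)
// gradients[k][i][j]:   gradient of the class score with respect to that value
function gradCamHeatmap(activations, gradients) {
  // One weight per feature map: the global average of its gradients.
  const weights = gradients.map(map => {
    const flat = map.flat();
    return flat.reduce((a, b) => a + b, 0) / flat.length;
  });
  const rows = activations[0].length;
  const cols = activations[0][0].length;
  const heatmap = [];
  for (let i = 0; i < rows; i++) {
    const row = [];
    for (let j = 0; j < cols; j++) {
      let sum = 0;
      for (let k = 0; k < activations.length; k++) {
        sum += weights[k] * activations[k][i][j];
      }
      row.push(Math.max(0, sum)); // ReLU keeps only positive evidence
    }
    heatmap.push(row);
  }
  return heatmap;
}

// Two tiny 2x2 feature maps: the first supports the class, the second opposes it.
const exampleActivations = [
  [[1, 0], [0, 0]],
  [[0, 0], [0, 2]],
];
const exampleGradients = [
  [[1, 1], [1, 1]],
  [[-1, -1], [-1, -1]],
];
console.log(gradCamHeatmap(exampleActivations, exampleGradients));
```

The resulting low-resolution map is then upsampled to the input image size and overlaid on it as the heatmap.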

Debugging ML models using Grad-CAM

 

Debugging ML models using Grad-CAM2

 

Debugging ML models using SHAP (SHapley Additive exPlanations)

SHAP is a game-theoretic approach that aims to explain a prediction by calculating the importance of each feature towards that prediction. The SHAP library uses Shapley values at its core and explains individual predictions. Lloyd Shapley introduced the concept of Shapley values in 1953, and it was later applied to the field of machine learning.

Shapley values are derived from Game theory, where each feature in the data is a player, and the final reward is the prediction. Depending on their contribution towards the reward, Shapley values tell us how to distribute this reward fairly among the players.
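For a handful of players, the fair split can be computed exactly by averaging each player's marginal contribution over all coalitions. The sketch below does this by brute force. It illustrates the definition rather than the SHAP library, which relies on faster approximations; the function name and the toy games are hypothetical.

```javascript
// value(members) returns the reward earned by a coalition, given as an
// array of player indices. Returns one Shapley value per player.
function shapleyValues(nPlayers, value) {
  const factorial = n => (n <= 1 ? 1 : n * factorial(n - 1));
  const nFactorial = factorial(nPlayers);
  const phi = new Array(nPlayers).fill(0);
  // Enumerate every coalition as a bitmask.
  for (let mask = 0; mask < 1 << nPlayers; mask++) {
    const members = [];
    for (let p = 0; p < nPlayers; p++) {
      if (mask & (1 << p)) members.push(p);
    }
    for (let p = 0; p < nPlayers; p++) {
      if (mask & (1 << p)) continue; // player p must join from outside
      const weight =
        (factorial(members.length) * factorial(nPlayers - members.length - 1)) /
        nFactorial;
      phi[p] += weight * (value([...members, p]) - value(members));
    }
  }
  return phi;
}

// Toy game: the reward of 10 is only earned when players 0 and 1 cooperate.
const teamGame = members =>
  members.includes(0) && members.includes(1) ? 10 : 0;
console.log(shapleyValues(2, teamGame));
```

The values always add up to the reward of the full coalition minus that of the empty one, which is the property that lets SHAP split a prediction among its features.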

We use SHAP quite often, especially for the models in which explainability is critical. The results are really quite accurate.

SHAP can explain:

  • General feature importance of the model by using all the data
  • Why a model calculates a particular score for a specific row/ record
  • The more dominant features for a segment/ group of data

Shapley values calculate the feature importance by comparing two predictions, one with the feature included and the other without it. The positive SHAP values affect the prediction/ target variable positively whereas the negative SHAP values affect the target negatively. 

Here is an example to explain the same. For this purpose, I am taking a red wine quality dataset from kaggle.

Debugging ML models using Shapley

 

Now, we produce a variable importance plot, which lists the variables in descending order of significance. The top variables contribute more to the model.

Importing SHAP for debugging ML models

 

Mapping the plot

In the above figure, all the variables are plotted in descending order of importance. The color of each point represents the feature value, whether it is high (in red) or low (in blue) in that observation. A high level of “sulphates” content has a high and positive impact on the quality rating. The position on the X-axis represents the impact on the prediction, with “positive” impact to the right. Similarly, we can say that “chlorides” is negatively correlated with the target variable.

Now, I would like to show you how SHAP values are computed in individual cases. We compute these values for several observations and choose a few of them at random.

Interpreting SHAP values

 

After choosing random observations, we initialize our notebook with initjs().

Explanation for some of the terms seen in the plot above: 

  1. Output value: Prediction for that observation, which in this case is 5.35
  2. Base value: Base value is the mean prediction, or mean (yhat), here it is 5.65
  3. Blue/ red: Features that push the prediction higher are shown in red, and those that push it lower are shown in blue.

For more information and to test your skills, check out kaggle.

[Quiz] Weekly Cyber Trivia Friday #1

Cyber Trivia Friday is here!

In our first-ever cyber quiz, we want to find out if you’re up-to-date on your cybersecurity news from across the world.

If you’re behind on the news, fret not. We’ve sprinkled in some hints to help you along.

The winners will be the first 3 people to submit the quiz and get all 5 questions right.


GraphQL 101: Here’s everything you need to know about GraphQL (Part 2)

 

Read Part 1 on GraphQL here.

In last week’s blog we learnt about the basics of GraphQL and how to configure a GraphQL server. We also discussed some key differences between GraphQL and REST API. 

Here’s a brief recap:

GraphQL is an open source query language for APIs that helps to load data from a server to the client in a much simpler way. Comparing multiple aspects of GraphQL and REST API, we have seen that GraphQL operations are superior to those of REST API. Although the benefits of this open source language are plenty, it comes with security vulnerabilities that developers tend to encounter from time to time.

Picking up from there, in the second part, we will discuss:

  • GraphQL: Common misconfigurations that enable hackers
  • Testing common misconfigurations in GraphQL

 

GraphQL: Common misconfigurations that enable hackers

GraphQL is one of the most efficient and flexible alternatives to REST API. However, it is vulnerable to the same attacks that REST API is prone to. 

GraphQL depends on API developers to implement its schema for data validation. During this process they could inadvertently introduce errors. In addition to this, new features and functionalities, which meet client requirements, are added to web applications by the hour. This also increases the chance of developers committing errors.

Here we highlight some of the common misconfigurations and issues with GraphQL, that allow hackers to exploit it.

  • Improper Access Control or Authorization Checks

Flaws in authorization checks can potentially grant unauthorized access to anyone. When these checks are missing or defective, the web application fails to restrict access to an object or resource, and delivers the data of another user without verifying that the requester is allowed to see it. This flaw allows attackers to bypass the intended access restriction, exposing user data to abuse, deletion, or alteration.

Let’s consider the scenario where a user, with the numeric object ID 5002, wants to retrieve his/her PII such as email ID, password, name, mobile number, etc. If this user knowingly or unknowingly requests a different user ID (say, 5005) that belongs to another active user, and the query returns that user’s data, it shows that the GraphQL resolver allows unauthorized access.

The following query consists of a user ID representing a logged in user and fetches the user’s PII:

 

query userData {
  users(id: 5002) {
    name
    email
    id
    password
    mobileNumber
    authorizationKey
    bankAccountNumber
    cvvNumber
  }
}

 

However, if the same authenticated user is able to fetch the data of ID 5005, that means there is improper access control to an object or resource. It enables hackers/ attackers to access the data of multiple users. It also allows an attacker to perform malicious activities like editing the user’s data, deleting the user from the database, etc.
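One way to close this gap is to check ownership inside the resolver before any data is read. The sketch below is a framework-independent illustration: the shape of the context object, the role name, and the in-memory store are all assumptions made for the example.

```javascript
// args:    arguments of the GraphQL field, e.g. { id: 5005 }
// context: per-request data set by the authentication layer (assumed shape)
// db:      any store with a get(id) method; a Map is used below
function resolveUser(args, context, db) {
  if (!context.user) {
    throw new Error("Not authenticated");
  }
  const isOwner = context.user.id === args.id;
  const isAdmin = context.user.role === "admin";
  if (!isOwner && !isAdmin) {
    throw new Error("Not authorized");
  }
  const record = db.get(args.id);
  if (!record) return null;
  // Secrets such as passwords and CVV numbers should never leave the server.
  const { password, cvvNumber, ...safeFields } = record;
  return safeFields;
}

const users = new Map([
  [5002, { id: 5002, name: "User A", password: "secret-a", cvvNumber: "000" }],
  [5005, { id: 5005, name: "User B", password: "secret-b", cvvNumber: "111" }],
]);

// The logged-in user 5002 can read their own record, but not that of 5005.
console.log(resolveUser({ id: 5002 }, { user: { id: 5002, role: "user" } }, users));
```

With this check in place, a request for ID 5005 made by user 5002 fails with an error instead of leaking another user's data.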

 

  • Rate Limit Issue and Nested GraphQL queries leading to DoS attacks

A single GraphQL query can perform multiple actions that would otherwise require multiple HTTP requests. This feature increases the complexity of GraphQL APIs, which makes it difficult for API developers to rate-limit by the number of HTTP requests alone. If requests are not properly handled by the resolver at the GraphQL API layer, it opens a door for threat actors to attack the API and perform denial-of-service (DoS) attacks.

To avoid this, only a specific number of requests per minute should be allowed to the API, and if the number of requests exceeds the limit, the API should return an error or reject the request.
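A per-client request limit can be sketched with a fixed time window, as below. This is a minimal in-memory illustration with hypothetical names; a real deployment would usually keep the counters in a shared store and might also weigh each query by its cost.

```javascript
// Allows at most `limit` requests per client in each window of `windowMs`.
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // clientId -> { start, count }
  return function allow(clientId, now = Date.now()) {
    const current = windows.get(clientId);
    if (!current || now - current.start >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count++;
      return true;
    }
    return false; // over the limit: the caller should reject the request
  };
}

// Example: 100 requests per minute per client.
const allowRequest = createRateLimiter(100, 60 * 1000);
console.log(allowRequest("client-1"));
```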

Let’s look at the following example of such nested queries:

 

query nestedQuery {
  allUsers {
    posts {
      follower {
        author {
          posts {
            follower {
              author {
                posts {
                  follower {
                    author {
                      posts {
                        id
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
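A common defence against such queries is to reject anything nested deeper than a fixed limit before it is executed. The sketch below estimates depth by counting braces in the raw query text, which is enough to show the idea. It is a rough approximation with hypothetical names: input objects in arguments also use braces, so a production check should walk the parsed query instead.

```javascript
// Returns the deepest level of brace nesting in a query string.
// Braces inside quoted strings are ignored.
function queryDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (ch === '"' && query[i - 1] !== "\\") inString = !inString;
    if (inString) continue;
    if (ch === "{") {
      depth++;
      maxDepth = Math.max(maxDepth, depth);
    } else if (ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}

// Throws if the query is nested deeper than the allowed limit.
function assertMaxDepth(query, limit) {
  const depth = queryDepth(query);
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`);
  }
}

console.log(queryDepth("query { allUsers { posts { id } } }"));
```

The limit is a policy choice: it should be just high enough for the deepest query that legitimate clients need.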

 

  • Default introspection system reveals internal information

Some API endpoints enable server-to-server communications and are not meant for the general public. And yet, GraphQL’s introspection feature makes it possible to discover them without much difficulty.

Consider the case where someone sends a query against an internal endpoint to gain access to admin credentials and thereby obtain admin privileges.

Here is an example of a single endpoint which has the potential to allow attackers to access hidden API calls from the backend. 

Hidden API calls in GraphQL

 

There are several websites such as https://apis.guru/graphql-voyager/ that display the entire list of API calls available at the backend. This provides a better understanding of its interface and also demonstrates ways to gather sensitive information from the server.

Entire list of GraphQL API calls

We have covered a few of the misconfigurations here, but there could be many others. Also, it is vulnerable to all bugs that affect any other API.
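Introspection relies on the reserved fields __schema and __type, so a server can spot such queries before executing them. The sketch below shows a simple text-based check and the kind of probe a tester might send. The function names are hypothetical, and the check is deliberately crude: it would also match these words inside a string argument, so a real server would inspect the parsed query or disable introspection in its GraphQL library.

```javascript
// Matches the reserved introspection fields, but not __typename.
const INTROSPECTION_FIELD = /\b__(schema|type)\b/;

function isIntrospectionQuery(query) {
  return INTROSPECTION_FIELD.test(query);
}

// Request body a tester could POST to check whether introspection is enabled.
function buildIntrospectionProbe() {
  return JSON.stringify({ query: "{ __schema { types { name } } }" });
}

console.log(isIntrospectionQuery("{ __schema { types { name } } }"));
console.log(isIntrospectionQuery("{ users { __typename name } }"));
```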

 

Testing common misconfigurations in GraphQL

Here we’ll be exploring two ways to test these common misconfigurations, further exploited to gain access to sensitive data or information:

  • Burp Suite – Advanced intercepting proxy tool
  • Altair GraphQL – GraphQL client available as desktop software and as a browser extension.

 

Testing GraphQL misconfigurations with Burp Suite

Burp Suite is a popular pentesting framework that works as a great proxy tool to intercept and visualize requests. Assuming that readers are already aware of how to configure it, we’ll be focusing on testing alone. 

  • This is what a normal HTTP request in GraphQL looks like:
    Normal HTTP request on GraphQL
  • Modify the POST query per your requirements and send it to the server. 
  • If it is misconfigured, it will fetch sensitive/ internal data. 
  • However, if you find it difficult to visualize or modify the query, you can use a Burp plugin called “GraphQL” to achieve the same results.

Burp plugin for GraphQL

 

But the major drawback of testing it in Burp Suite is the inadequate visualization of the entire schema documentation. This could result in several common misconfigurations being overlooked, unless we locate the proper documentation of API endpoints or calls implemented at the backend. Another issue related to using Burp is that we have to modify or debug the query manually which makes the process more complex. All these issues can be resolved with the help of the following method.

 

Debugging GraphQL with Altair GraphQL 

Altair helps with debugging GraphQL queries and server implementations. It rectifies the problem of modifying the queries manually and instead helps to focus on other aspects. 

Altair GraphQL is available as software for Mac, Linux, and Windows, and also as an extension for almost all browsers. It provides proper documentation of the entire schema that is available at the backend. All we need to do is configure the web application we are testing.

Three operations of GraphQL

 

All three operations of GraphQL can be seen in the documentation section as shown above. You can further explore these options to get other fields, API endpoints, etc.

Adding supported values

 

Altair is capable of solving one of the most challenging tasks in this process: Adding a query manually. It comes with the feature that enables automatic query generation. So, all we need to do is pass supported type values.

Add Query

 

  • Click on the ADD QUERY as shown above, to automatically add a query along with its arguments and fields. 
  • Now, provide the argument values to test for any bugs or misconfigurations.

Altair GraphQL

 

Altair makes bug hunting on any web application quite easy. You can test GraphQL queries and server implementations easily, as Altair performs the complex part of the process and lets you focus on the results.

 

Conclusion

In comparison with REST API, GraphQL offers developers improved standards of API development. And yet, it has its own security misconfigurations that can be exploited relatively easily. However, it is capable of bridging the gaps created by REST API. Alternatively, REST can help address the drawbacks of GraphQL as well. They need not be referred to as two incompatible technologies. Moreover, we believe these technologies can coexist with each other.

GraphQL 101: Here’s everything you need to know about GraphQL

 

An API, or Application Programming Interface, is a family of protocols, commands, and functions used to build and integrate application software. In essence, an API is the intermediary that enables two applications to connect. And for years, REST API was recognized as the standard protocol for web APIs. However, there is a visible trend that could potentially upend this inclination towards REST API. The State of JavaScript 2018 report finds that, out of the developers who were surveyed in 2016, only 5% used GraphQL, whereas in 2018 the number rapidly increased to a massive 20.4%. Although these numbers do not represent a preference for GraphQL over REST API, they clearly indicate significant growth in the number of developers who opt for GraphQL in web applications.

In a two-part series about GraphQL, we discuss the following topics:

  • Introduction to GraphQL 
  • GraphQL vs. REST
  • GraphQL’s declarative data fetching
  • GraphQL: Common misconfigurations that enable hackers
  • Testing common misconfigurations in GraphQL

In this part, we explore:

  • What is GraphQL?
  • How to identify a GraphQL instance?
  • How to configure the GraphQL server?
  • Root types of GraphQL schema
  • GraphQL vs. REST (focused on data fetching)
  • GraphQL: Smooth, declarative data fetching

 

Introduction to GraphQL

What is GraphQL?

It is an open source query language for APIs that helps to load data from a server to the client in a much simpler way. Unlike REST APIs, GraphQL APIs are organized in terms of types and fields, not endpoints. This ensures that applications can request only what is available, and receive clear, predictable responses. A GraphQL API fetches exactly the data requested; nothing less, nothing more.

It was developed by Facebook in 2012 and was later publicly released in 2015. Today, GraphQL is rapidly adopted by clients, big and small, from all over the world. Prominent websites and mobile apps such as Facebook, Twitter, GitHub, Pinterest, Airbnb, etc. use GraphQL.

REST API suffers from the lack of a robust API documentation making it difficult for developers to know the specific operations that the API supports and how to use them efficiently. But the GraphQL schema properly defines its operations, input values, and possible outcomes. When a client sends a query request, GraphQL’s resolver function fetches the result from the source.

GraphQL resolver

GraphQL allows frontend developers to retrieve data from the backend with unparalleled ease, which explains why it is generally described as Frontend Directed API technology. Evidently, it is more efficient, powerful, and flexible, and is a better alternative to REST API. 

 

How to identify a GraphQL instance?

  • A GraphQL HTTP request is typically sent to the following endpoints:
    • /graphql
    • /graphql/console
    • /graphiql
  • The GraphQL request looks similar to JSON but it’s different, as seen in the following request:

While pentesting a web application, if you come across any of the above-mentioned attributes, then it most probably uses GraphQL.
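To make the difference between a GraphQL query and plain JSON concrete, here is a minimal Python sketch. It only shows the common convention of wrapping the query string in a JSON body under a "query" key; the sample query, the `user` and `name` fields, and the helper names are assumptions made for illustration, and the check is a rough heuristic rather than a reliable detector.

```python
import json

# Endpoint paths commonly associated with GraphQL, as listed above.
COMMON_GRAPHQL_PATHS = ("/graphql", "/graphql/console", "/graphiql")

# A hypothetical GraphQL query: it uses braces like JSON, but has no quoted
# keys or key/value pairs, so it is not valid JSON on its own.
SAMPLE_QUERY = "{ user(id: 1) { name } }"


def build_graphql_body(query):
    """Wrap a GraphQL query string in the JSON body usually sent over HTTP."""
    return json.dumps({"query": query})


def is_valid_json(text):
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def looks_like_graphql(path, body):
    """Heuristic: a known endpoint path, or a JSON body with a string 'query'."""
    if path.rstrip("/") in COMMON_GRAPHQL_PATHS:
        return True
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and isinstance(payload.get("query"), str)
```

A body such as `{"query": "shoes"}` sent to an ordinary search API would also match this heuristic, so treat a positive result as a hint that deserves a closer look.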

 

How to configure the GraphQL server?

We can run the GraphQL API server on localhost with the use of Express, a web application framework for Node.js. 

  • For this, after you have installed “npm”, proceed to install the required dependencies using the following command:
npm install express express-graphql graphql --save
  • For more clarity of the concept, this is how we fetch “hello world” data from the GraphQL server using a query. 

 

var express = require('express');
// Older releases of express-graphql export the middleware directly, as used here.
// Newer releases expose it as a named export: var { graphqlHTTP } = require('express-graphql');
var graphqlHTTP = require('express-graphql');
var { buildSchema } = require('graphql');

// Construct a schema, using GraphQL schema language
var schema = buildSchema(`
  type Query {
    hello: String
  }
`);

// The root provides a resolver function for each API endpoint
var root = {
  hello: () => {
    return 'Hello world!';
  },
};

var app = express();
app.use('/graphql', graphqlHTTP({
  schema: schema,
  rootValue: root,
  graphiql: true,
}));
app.listen(4000);
console.log('Running a GraphQL API server at http://localhost:4000/graphql');
  • Save the above code as GraphQL_Server.js.
  • Now, run this GraphQL server with: 
node GraphQL_Server.js
  • If you don’t encounter an error, that means we have successfully configured our GraphQL server and we are good to proceed.
  • Navigate to the endpoint http://localhost:4000/graphql in a web browser where you will come across the GraphQL interface. You can now enter your queries here.
  • Once the query “hello” is issued, it fetches the following data from the server:

Hello World on GraphQL

Root types of GraphQL schema

A GraphQL schema has three root types. Requests against a GraphQL endpoint should start with one of these root types while communicating with the server:

  • Query: This type allows reading or fetching data from the backend server.
  • Mutation: This type is used to create, edit, or delete data.
  • Subscription: This root type enables real-time communication, so that clients automatically receive updates about particular events.

GraphQL Subscription
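As a rough illustration of the three root types, the Python sketch below holds one sample operation of each kind and reads the leading keyword to tell them apart. The field names (`hello`, `addPost`, `postAdded`) are hypothetical, and the keyword check is a simplification: it does not parse GraphQL and ignores comments and fragments.

```python
import re

# Hypothetical operations, one per root type.
QUERY_OP = "query { hello }"
MUTATION_OP = 'mutation { addPost(title: "Hi") { id } }'
SUBSCRIPTION_OP = "subscription { postAdded { id title } }"

_ROOT_KEYWORD = re.compile(r"^\s*(query|mutation|subscription)\b")


def root_type(operation):
    """Return the root type an operation string starts with."""
    match = _ROOT_KEYWORD.match(operation)
    if match:
        return match.group(1)
    # The shorthand form "{ hello }" omits the keyword and is a query.
    if operation.lstrip().startswith("{"):
        return "query"
    raise ValueError("not a recognizable GraphQL operation")
```

For example, `root_type(MUTATION_OP)` reports that the request would change data rather than just read it.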

GraphQL is said to be client-driven, as clients ultimately decide the exact response they require for their queries. New fields and types can be added to the GraphQL API, each field backed by a function (a resolver), and clients simply ask for the ones they need. Contrary to how a REST API operates, GraphQL permits clients to retrieve only the essential data instead of the complete data.

 

GraphQL vs. REST

Both REST and GraphQL help developers design how APIs function and how applications access data from them. Both send data over HTTP. However, there are quite a few differences between them that work in GraphQL’s favour.

A key differentiator is the method of fetching data from the backend server. While a typical REST API sends multiple requests to load data from multiple URLs, GraphQL has to send only a single request to get exactly the data it needs, leaving unwanted data behind. This is called declarative data fetching. Here’s an example to help you understand this better.

In a REST API, multiple requests are sent to multiple endpoints to fetch specific data from the server. For instance, in order to fetch a user’s ID, their posts on a social media platform, and their followers on the same channel, we have to send separate requests to paths such as /users/<id>, /users/<id>/posts, and /users/<id>/followers respectively, and we get multiple responses in return.

REST API multiple endpoints

 

However, in the case of GraphQL, a single query, that includes all three requests, at a single endpoint will be enough to retrieve accurate responses to the corresponding requests. All we need to do is send a query through the interface, which gets interpreted against the entire schema, and returns the exact data that the client requested.

 

Smooth, declarative data fetching

In the instance of REST API mentioned above, we saw that 3 requests were made at 3 different endpoints to retrieve the corresponding response from the server. But in GraphQL, this process is relatively easier. A query that includes the exact data requirements like IDs, posts, and followers at a single endpoint would do the job. 

Let’s have a look at the query that is sent to the server to fetch a user’s ID, posts, and followers:

GraphQL endpoint

The above image shows how a query that includes the data requirements is sent to the backend server, and how its response is retrieved at the client-side. 
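The contrast between the two styles can be sketched in a few lines of Python. This is only an illustration: the REST paths, the `/graphql` path, and the `user`, `posts`, and `followers` fields are assumed names, and no request is actually sent.

```python
import json


def rest_urls(user_id):
    """REST style: one URL per resource, so three separate requests."""
    base = f"/users/{user_id}"
    return [base, f"{base}/posts", f"{base}/followers"]


def graphql_request(user_id):
    """GraphQL style: one request whose query names every field needed."""
    query = """
    query ($id: ID!) {
      user(id: $id) {
        id
        posts { title }
        followers { name }
      }
    }
    """
    body = json.dumps({"query": query, "variables": {"id": user_id}})
    return {"path": "/graphql", "body": body}
```

Calling `rest_urls(7)` yields three paths to request one after another, while `graphql_request(7)` yields a single body that declares all three data requirements at once.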

The dissimilarities between REST and GraphQL are quite significant and can be understood by comparing how they function. 

Difference between GraphQL and REST API

 

These differences do not necessarily mean that REST is not efficient or flexible; after all, it has been the standard for APIs for several years now. Still, GraphQL is gaining popularity because it addresses the shortcomings of REST APIs. 

In next week’s blog, we will explore the common GraphQL misconfigurations that enable hackers, and how you can test them. 


6 major quality metrics that will optimize your web app

 

As more businesses migrate to cloud environments, making it easier for customers to access their services/ products, we have witnessed a sharp rise in the number of online businesses employing web applications. Also known as web apps, they have assumed great significance in this digital era, allowing businesses to develop and achieve their objectives, expeditiously.

Well-designed web apps allow organizations to gain a competitive advantage and appeal to more customers. Hence, it is essential to have measurable or quantifiable metrics to gauge the quality of a web app.

 

What is a web application?

Web apps are software programs that require a web browser for interaction. And unlike other applications, users need not install the software to run web applications; all they require is a web browser. Web applications include everything from small-scale online games to video streaming applications like Netflix.

 

What are Software Quality Metrics?

Software quality metrics gauge the quality of the software, its development and maintenance, and the execution of the project itself. In essence, software quality metrics record not only the number of defects or security flaws in the software, but also the entire process of development of the project, as well as the product.  

 

Classification of Quality Metrics

Based on the components and features, software quality metrics can be classified into:

  • Product quality metrics
  • In-process quality metrics
  • Project quality metrics

A user grades the quality of an application based on their experience with its features/functionalities, the value it provides, and after-sales services such as maintenance, upgrades, etc. However, the quality of the software is also measured based on the project, the teams involved, project cost, etc.  

 

Six major quality metrics to consider for better web applications

 

  1. Usability of the web application:

Usability testing assesses the ease with which end-users use the application. It ensures effective interaction between the user and the app. Web applications that have a complicated design or interface are least preferred by users.

To test the usability of a web app, its navigation, content, and other user-facing features should be tested.

For example:

  • Images and other non-text content should be placed appropriately, so as to avoid distractions.
  • The options “Search” and “Contact us” should be easy to find. 

 

  2. Performance of the web application:

Performance testing determines the behaviour of the application under different settings and configurations. For example: Performance during high usage vs normal usage. Performance of a web app contributes to its adoption, continued usage, and overall success.  

Types of performance testing

  • Load testing
  • Web stress testing

In load testing, we evaluate the performance of the web app when multiple users access it concurrently. This helps to ascertain if the app can sustain peak hours, handle large user requests or simultaneous database access requests, etc.

In web stress testing, the system is tested beyond the limits of standard conditions. The objective of web stress testing is to assess the behaviour of the app under volatile conditions, such as when web pages time out or when there is a delay between requests and responses, and to see how it recovers from crashes.
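A load test can be approximated with a thread pool that plays the role of concurrent users. The sketch below is a simplified stand-in for a dedicated load testing tool: `fake_request` just sleeps, and in practice it would be swapped for a real HTTP call to the app under test. All names and default values here are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request_fn):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def run_load_test(request_fn, concurrent_users=10, requests_per_user=5):
    """Issue requests from several worker threads and summarize latencies."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, [request_fn] * total))
    return {
        "requests": total,
        "average_s": sum(latencies) / total,
        "p95_s": latencies[int(0.95 * (total - 1))],
        "max_s": latencies[-1],
    }


def fake_request():
    """Placeholder for a real request to the web app."""
    time.sleep(0.01)
```

Raising `concurrent_users` well beyond the expected peak turns the same harness into a crude stress test.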

 

  3. Compatibility on different platforms and browsers:

The quality of the software also depends on whether the application is compatible with different browsers, hardware, operating systems, applications, network environments, and devices.

For instance,

  • If developers intend to have a mobile version of a web application, they ought to address and resolve any issues that may arise in that scenario.
  • While performing various actions such as printing or downloading, from a web application, the elements on the page, including text, images, etc., should be fixed in place, and properly aligned to fit on the page. 

 

  4. Requirements Traceability:

This parameter traces and maps each user requirement throughout its life (from its source, through the stages of its development and deployment), using test cases. It checks whether every user requirement is met, and defines the purpose of each requirement and the factors it depends on.

 

Modes of requirement traceability

Based on the direction of tracing, requirement traceability can be classified into:

  • Forward traceability: Tracing the requirement sources to the resulting requirements, to ensure coherence.
  • Backward traceability: Tracing the various components of design or implementation back to their source, to verify that requirements are up to date.
  • Bidirectional traceability: Tracing both backward and forward.

 

  5. Reliability:

A web application is not reliable if it does not produce consistent results. In an ideal situation, the application must operate failure-free, for a specified period of time, in a particular environment.

For example, a medical thermometer is only reliable if it gives an accurate reading every time it is used.

 

  6. Security testing for the web application:

The security implementation of a web application is another factor that determines its success. As a study shows, hackers can attack users in 9 out of 10 web applications. These attacks include redirecting users to a malicious site, stealing credentials, and spreading malware. So, ignoring this factor could cause serious damage to users and their businesses.

For example,

  • To test the security of web applications, we test URLs that a user can and cannot access. If an online document has an ID/ identifier such as ID="456" or identifier="zm9vdC0xNl8yMDE5…" at the end of its URL, the user should only be able to access that document. If the user alters the ID/ identifier in the URL, they should receive an appropriate error message.
  • Automated traffic can be prevented by using CAPTCHA.
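The URL check described in the first example can be sketched as a small test helper. Everything below is hypothetical: `DOCUMENT_OWNERS` and `fetch_document` stand in for the application under test, and the status codes are the conventional HTTP ones rather than anything specific to a given app.

```python
# Hypothetical ownership table for the application under test.
DOCUMENT_OWNERS = {"456": "alice", "457": "bob"}


def fetch_document(user, document_id):
    """Stand-in for requesting /document?ID=<document_id> as a given user."""
    owner = DOCUMENT_OWNERS.get(document_id)
    if owner is None:
        return 404, "Document not found"
    if owner != user:
        return 403, "You are not allowed to view this document"
    return 200, f"Contents of document {document_id}"


def find_exposed_ids(user, own_ids, candidate_ids, fetch=fetch_document):
    """Return IDs the user does not own but can still open (should be empty)."""
    exposed = []
    for document_id in candidate_ids:
        status, _ = fetch(user, document_id)
        if status == 200 and document_id not in own_ids:
            exposed.append(document_id)
    return exposed
```

An empty result means every altered ID was met with an error, which is the behaviour the test expects.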

Types of security testing

  • Dynamic Application Security Testing (DAST): It detects indicators of security vulnerabilities in applications that are running.
  • Static Application Security Testing (SAST): It analyzes the application source code and/or compiled versions of the code for patterns that are indicative of security vulnerabilities.
  • Application Penetration Testing: It assesses how applications defend against possible attacks.

 

Additional components to be considered

To ensure that the web application is fully functional in all aspects, the following components should be inspected:

Links

  • Internal links
  • Outgoing links
  • Links that direct users to another section on the same page
  • Orphan pages in web applications
  • Broken links
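A first pass over the link checks above can be automated by extracting every anchor from a page and grouping it by type. The sketch below uses only the Python standard library; the page URL and HTML are assumed inputs, and finding broken links or orphan pages would additionally require requesting each URL, which is left out here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(page_url, html):
    """Group links into internal, outgoing, and same-page anchors."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    groups = {"internal": [], "outgoing": [], "same_page": []}
    for href in collector.hrefs:
        if href.startswith("#"):
            groups["same_page"].append(href)
            continue
        absolute = urljoin(page_url, href)
        kind = "internal" if urlparse(absolute).netloc == site else "outgoing"
        groups[kind].append(absolute)
    return groups
```

Each URL in the internal and outgoing groups can then be requested in turn to look for broken links.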

Forms or other input fields

  • Verify all validations
  • Check default values
  • Wrong input
  • Links to update forms, edit forms, delete forms, etc. (if any)

Database 

  • Review data integrity while editing, deleting, and updating forms
  • Check if data is being retrieved and updated correctly

Cookies 

  • Check whether the cookies are encrypted or not
  • Evaluate application behavior after deleting cookies

How to bypass CAPTCHAs easily using Python and other methods

 

Internet service providers generally face the risk of authentication-related attacks, spam, Denial-of-Service attacks, and data mining bots. The Completely Automated Public Turing test to tell Computers and Humans Apart, popularly known as CAPTCHA, is a challenge-response test created to selectively restrict access to computer systems. As a type of Human Interaction Proof, or human authentication mechanism, CAPTCHA generates challenges to distinguish human users from machines. This has led to heightened adoption of CAPTCHAs across various online businesses and services.

The concept of CAPTCHA depends on human sensory and cognitive skills. These skills enable humans to read distorted text in an image or to pick specific images out of a set. Generally, computer programs such as bots cannot interpret a CAPTCHA, because the challenge is a distorted image of text or numbers that most Optical Character Recognition (OCR) technologies fail to make sense of. However, with the help of Artificial Intelligence, algorithms are getting smarter and bots are now capable of cracking these tests. For instance, there are bots that can solve a text CAPTCHA through letter segmentation mechanisms. That said, there aren’t a lot of automated CAPTCHA solving algorithms available. 

This article outlines the various methods of generating and verifying CAPTCHAs, their application, and multiple ways to bypass CAPTCHAs.

 

Reasons for using CAPTCHA

Web developers deploy CAPTCHAs on websites to ensure that they are protected against bots. CAPTCHAs are generally used to prevent:

  • Bots from registering for services such as free email.
  • Scraper bots from gathering your credentials or personal information, upon logging in or while making online payments.
  • Bots from submitting online responses.
  • Brute-force bot attacks.
  • Search engine bots from indexing pages with personal/ sensitive information.

 

General flow of CAPTCHA generation and verification

The image below represents the common method of generating and verifying CAPTCHAs:

Form Submission

Application of different types of CAPTCHA and how to bypass them

 

I. reCAPTCHA and the protection of websites

recaptcha

Google reCAPTCHA is a free service offered to prevent spam and abuse of websites. It uses advanced risk analysis techniques and allows only valid users to proceed. 

Process flow diagram of Google reCAPTCHA

 

How to bypass reCAPTCHA?

Verification using browser extensions

Browser extensions such as Buster help solve CAPTCHA verification challenges. Buster, for instance, uses speech recognition software to bypass reCAPTCHA audio challenges. reCAPTCHA allows users to download audio files. Once it is downloaded, Google’s own Speech Recognition API can be used to solve the audio challenge.

CAPTCHA solving services

Online CAPTCHA solving services are human-based: they hire actual human beings to solve CAPTCHAs. 

 

II. Real person CAPTCHA and automated form submissions

The jQuery real person CAPTCHA plugin prevents automated form submissions by bots. The plugin offers text-based CAPTCHAs in a dotted font. This solves the problem of fake form submissions. 

 

text in dotted font

 

How to bypass real person CAPTCHA?

The following steps can be used to solve real person CAPTCHAs:

A. Create data set

In this one-time process:

  1. Collect texts from real person HTML tags
  2. Group the texts based on the words
  3. Create data set model for A-Z words (training data)

B. Testing to predict the solutions

After successfully completing process A, set up a process to:

  1. Collect texts from real person HTML tags
  2. Group the texts based on the words
  3. Fetch the word from the data set model created in process A.

 

Example: 



from selenium import webdriver
import time


dataset = {'     *       *      *      *      *      ******* ': 'J',
           '*******      *      *      *      *      *      *': 'L',
           '********  *  **  *  **  *  **  *  **  *  * ** ** ': 'B',
           '*       *       *       ****  *     *     *      ': 'Y',
           '*      *      *      ********      *      *      ': 'T',
           ' ***** *     **     **     **     **     * *   * ': 'C',
           '********  *  **  *  **  *  **     **     **     *': 'E',
           '********     **     **     **     **     * ***** ': 'D',
           '*     **     **     *********     **     **     *': 'I',
           ' ***** *     **     **     **     **     * ***** ': 'O',
           '******* *       *       *     *     *     *******': 'M',
           '******* *       *       *       *       * *******': 'N',
           '********  *   *  *   *  *   *      *      *      ': 'F',
           ' **  * *  *  **  *  **  *  **  *  **  *  * *  ** ': 'S',
           ' ***** *     **     **     **   * **    *  **** *': 'Q',
           '*******   *     * *    * *   *   *  *   * *     *': 'K',
           '     **   **   ** *  *   *   ** *     **       **': 'A',
           '******       *      *      *      *      ******* ': 'U',
           '*******   *      *      *      *      *   *******': 'H',
           '**       **       **       *    **   **   **     ': 'V',
           '*     **    ***   * **  *  ** *   ***    **     *': 'Z',
           '********  *   *  *   *  *   *  *   *  *    **    ': 'P',
           '*     * *   *   * *     *     * *   *   * *     *': 'X',
           ' ***** *     **     **     **   * **   * * *  ** ': 'G',
           '********  *   *  *   *  *   *  **  *  * *  **   *': 'R',
           '*******     *     *     *       *       * *******': 'W'}


def group_captcha_string(word_pos):
    captcha_string = ''
    for i in range(len(word_pos[0])):
        temp_list = []
        temp_string = ''

        for j in range(len(word_pos)):
            val = word_pos[j][i]
            temp_string += val

            if val.strip():
                temp_list.append(val)

        if temp_list:
            captcha_string += temp_string
        else:
            captcha_string += 'sp'

    return captcha_string.split("spsp")


# create client
client = webdriver.Chrome()
client.get("http://keith-wood.name/realPerson.html")
time.sleep(3)

# indexing text
_get = lambda _in: {index: val for index, val in enumerate(_in)}

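# get text from html tag (the 'css selector' locator string works across Selenium versions,
# unlike find_element_by_css_selector, which newer releases removed)
captcha = client.find_element('css selector', 'form [class="realperson-text"]').text.split('\n')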

word_pos = list(map(_get, captcha))

# group text
text = group_captcha_string(word_pos)

# get text(test)
captcha_text = ''.join(list(map(lambda x: dataset[x] if x else '', text)))
print("captcha:", captcha_text)

III. Text-in-image CAPTCHA

Text-based/ text-in-image CAPTCHAs are the most commonly deployed kind and they use distorted text rendered in an image. There are two types of text-based CAPTCHAs:

 

Simple CAPTCHA

Simple CAPTCHAs can be bypassed using the Optical Character Recognition (OCR) technology that recognizes the text inside images, such as scanned documents and photographs. This technology converts images containing written text into machine-readable text data.

simple

Example:



import argparse
from subprocess import check_output

import pytesseract
from PIL import Image


def resolve(path):
    # Resample the image in place with ImageMagick's "convert" to help the OCR engine
    print("Resampling the Image")
    check_output(['convert', path, '-resample', '600', path])
    # Run Tesseract OCR on the resampled image and return the recognized text
    return pytesseract.image_to_string(Image.open(path))


if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='Captcha file path')
    args = argparser.parse_args()
    path = args.path
    print('Resolving Captcha')
    captcha_text = resolve(path)
    print('Extracted Text', captcha_text)



# command to run script
python3 captcha_resolver.py cap.jpg

 

Complicated CAPTCHA

These text-in-image CAPTCHAs are too complex to be solved using OCR technology. Instead, the following measures can be considered:

  • Build machine learning models such as Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN)
  • Resort to CAPTCHA solving services

 

 

IV. Sum of integers or logical operations

This unique challenge involves solving mathematical problems, particularly, finding the sum of integers.

logical captcha

To bypass this challenge, one can:

  1. Extract text from HTML tags or images
  2. Identify the operator
  3. Perform the logic
  4. Get the result
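The four steps above can be sketched in Python as follows. This assumes the challenge text has already been extracted (from an HTML tag or via OCR) and that it contains two integers around a single operator; the supported operators and the function name are illustrative choices.

```python
import operator
import re

# Operators the sketch understands; "x" is treated as multiplication.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "x": operator.mul,
}

# Two integers separated by one operator, e.g. "7 + 5".
CHALLENGE = re.compile(r"(-?\d+)\s*([+\-*x])\s*(-?\d+)")


def solve_arithmetic_captcha(text):
    """Find '<int> <operator> <int>' in the text and compute the answer."""
    match = CHALLENGE.search(text)
    if match is None:
        raise ValueError("no arithmetic challenge found")
    left, symbol, right = match.groups()
    return OPERATORS[symbol](int(left), int(right))
```

Challenges written out in words ("seven plus five") would need an extra step to map words to digits and operators first.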

 

V. Mitigating DDoS attacks using CAPTCHAs

In distributed denial-of-service (DDoS) attacks, cyber criminals target network resources and render them inaccessible to users. These attacks temporarily or indefinitely slow down the target resource by flooding it with incoming traffic from several hosts. To prevent such attacks, businesses use CAPTCHAs. 

DDoS

The following methods or programs can be used to bypass DDoS protected sites:

  1. Use a JavaScript-enabled browser (Chrome/ Firefox)
  2. Derive the logic used to generate the DDoS challenge answers
  3. Fetch the DDoS challenge from the site and execute it using Node.js

How are Python modules used for web crawling?

 

Let us assume search engines, like Google, never existed! How would you find what you need from across 4.2 billion web pages? Web crawlers are programs written to browse the internet, to gather information, index and parse the collected data, to facilitate quick searches. Crawlers are, thus, a smart solution to big data sets and a catalyst to major advancements in the field of cyber security.

In this article, we will learn:

  1. What is crawling?
  2. Applications of crawling
  3. Python modules used for crawling
  4. Use-case: Fetching downloadable URLs from YouTube using crawlers
  5. How do CloudSEK Crawlers work?

web crawler

What is crawling?

Crawling refers to the process of scraping/ extracting data from websites/ the internet using web crawlers. For instance, Google uses spider bots (crawlers) to read the content of billions of web pages and posts. Then, it gathers data from these sites and arranges them in the Google Search index.

Basic stages of crawling: 
  1. Scrape data from the source
  2. Parse the collected data
  3. Clean the data of any noise or duplicate entries
  4. Structure the data as per requirement
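These four stages can be sketched as a small pipeline. To keep the example self-contained, the "scrape" stage below reads an inline HTML string instead of fetching a live page; the sample page, the choice of `<h2>` headings, and all function names are assumptions for illustration.

```python
from html.parser import HTMLParser

# Stand-in for a fetched page; a real crawler would download this.
SAMPLE_PAGE = """
<html><body>
  <h2> Data leak reported </h2>
  <h2>New phishing kit</h2>
  <h2>Data leak reported</h2>
  <h2>   </h2>
</body></html>
"""


class HeadingParser(HTMLParser):
    """Collect the text inside every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            self.headings[-1] += data


def scrape(source=SAMPLE_PAGE):
    """Stage 1: obtain the raw page (here, just the inline sample)."""
    return source


def parse(html):
    """Stage 2: pull the headings out of the raw HTML."""
    parser = HeadingParser()
    parser.feed(html)
    return parser.headings


def clean(items):
    """Stage 3: strip whitespace, drop empty and duplicate entries."""
    seen, result = set(), []
    for item in (text.strip() for text in items):
        if item and item not in seen:
            seen.add(item)
            result.append(item)
    return result


def structure(items):
    """Stage 4: shape the cleaned data into records."""
    return [{"id": index, "title": title} for index, title in enumerate(items, start=1)]
```

Chaining the stages as `structure(clean(parse(scrape())))` turns the raw page into a short list of records.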

 

Applications of crawling

Organizations crawl and scrape data off of web pages for various reasons that may benefit them or their customers. Here are some lesser known applications of crawling: 

  • Comparing data for market analysis
  • Monitoring data leaks
  • Preparing data sets for Machine Learning algorithms
  • Fact-checking information on social media

 

Python modules used for crawling

  • Requests – Allows you to send HTTP requests to web pages
  • Beautiful Soup – Python library that retrieves data from HTML and XML files, and parses their elements into the required format
  • Selenium – Open source testing suite used for web applications. It also performs browser actions to retrieve data.

 

Use-case: Fetching downloadable URLs from YouTube using crawlers

A single YouTube video may have several downloadable URLs, based on: its content, resolution, bitrate, range and VR/3D. Here is a sample API and CLI code to get downloadable URLs on YouTube along with their Itags:

 

Project structure 

youtube
|
|---- app.py
|---- cli.py
`---- core.py

The project will contain three files:

app.py: For the API interface, using the Flask microframework

cli.py: For command line interface, using argparse module

core.py: Contains all the core (common) functionalities which act as helper functions for app.py and cli.py.

# youtube/app.py
import flask
from flask import jsonify, request

import core

app = flask.Flask(__name__)
app.config["DEBUG"] = True


@app.route('/', methods=['GET'])
def get_downloadable_urls():
    # The YouTube URL is passed as the "url" query parameter
    if 'url' not in request.args:
        return "Error: No url field provided. Please specify a YouTube url.", 400
    url = request.args['url']
    urls = core.get_downloadable_urls(url)
    return jsonify(urls)


if __name__ == '__main__':
    app.run()

The flask interface code to get downloadable URLs through API.

Request url - localhost:<port>/?url=https://www.youtube.com/watch?v=FIVPlraNgXs

# youtube/cli.py
import argparse

import core

my_parser = argparse.ArgumentParser(description='Get youtube downloadable video from url')
my_parser.add_argument('-u', '--url', metavar='', required=True, help='youtube url')
args = my_parser.parse_args()

urls = core.get_downloadable_urls(args.url)
print(f'Got {len(urls)} urls\n')
for index, url in enumerate(urls, start=1):
    print(f'{index}. {url}\n')

Code snippet to get downloadable URLs through the command line interface (using argparse to parse command line arguments)

Command line interface - python cli.py -u 'https://www.youtube.com/watch?v=aWPYw7iVBg0'

# youtube/core.py
import json
import re

import requests

# Matches the JSON object assigned to ytplayer.config in the page source
PLAYER_CONFIG_RE = re.compile(r'ytplayer[.]config\s*=\s*(\{.*?\});')


def get_downloadable_urls(url):
    html = requests.get(url).text
    conf = json.loads(PLAYER_CONFIG_RE.search(html).group(1))
    player_response = json.loads(conf['args']['player_response'])
    data = player_response['streamingData']
    return [{'itag': frmt['itag'], 'url': frmt['url']} for frmt in data['adaptiveFormats']]

This is the core (common) function for both API and CLI interface. 

The execution of these commands will:

  1. Take YouTube url as an argument
  2. Gather page source using the Requests module
  3. Parse it and get streaming data
  4. Return response objects: url and itag

How to use these URLs?

  • Build your own YouTube downloader (web app)
  • Build an API to download YouTube video

Sample result 

[{
'itag': 251,
'url': 'https://r2---sn-gwpa-h55k.googlevideo.com/videoplayback?expire=1585225812&ei=9Et8Xs6XNoHK4-EPjfyIiA8&ip=157.46.68.124&id=o-AGeDi3DVtAbmT5GiuGsDU7-NPLk23fOXNnY16gGQcHWu&itag=251&source=youtube&requiressl=yes&mh=Av&mm=31%2C26&mn=sn-gwpa-h55k%2Csn-cvh76ned&ms=au%2Conr&mv=m&mvi=1&pl=18&initcwndbps=112500&vprv=1&mime=audio%2Fwebm&gir=yes&clen=14933951&dur=986.761&lmt=1576518368612802&mt=1585204109&fvip=2&keepalive=yes&fexp=23882514&c=WEB&txp=5531432&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cgir%2Cclen%2Cdur%2Clmt&sig=ADKhkGMwRAIgK4L4VVHAlWMPVPEcmdkhnb2u8UM6eYhFz16kGruxZjUCIFXZJM9ejVK7OZJFqx7YwBqa3CrDvVakuU86vcIyMv-a&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=ABSNjpQwRAIgKBhJytjv73-c7eMWbVkb-X8_rNb7_xApZvaPfw7wGcMCIHqJ405fQ3Kr-e_5fV8gokMUNi0rrrLG8T85sLGTQ17W'
}]

What is ITag?

ITag gives us more details about the video such as the type of video content, resolution, bitrate, range and VR/3D. A comprehensive list of YouTube format code ITags can be found here.

How do CloudSEK Crawlers work?

 

cloudsek crawlers

 

CloudSEK’s digital risk monitoring platform, XVigil, scours the internet across surface web, dark web, and deep web, to automatically detect threats and alert customers. After configuring a list of keywords suggested by the clients, CloudSEK Crawlers:

  1. Fetch data from various sources on the internet
  2. Push the gathered data to a centralized queue
  3. ML Classifiers group the data into threats and non-threats
  4. Threats are immediately reported to clients as alerts, via XVigil. Non-threats are simply ignored.

How do you achieve concurrency with Python threads?

Introduction

Threading allows multiple parts of a program to make progress at the same time. Languages that support multi-threading, such as Python, offer this technique. When several I/O operations run one after another, the program slows down; threading helps to achieve concurrency so that these operations can overlap.

In this article, we will explore:

  1. Types of concurrency in Python
  2. Global Interpreter Lock
  3. Need for Global Interpreter Lock
  4. Thread execution model in Python 2
  5. Global Interpreter Lock in Python 3

Types of Concurrency In Python

In general, concurrency is the execution of different units of a program in overlapping time periods, which helps optimize and speed up the overall process. In Python, there are three methods to achieve concurrency:

  1. Multi-Threading
  2. Asyncio
  3. Multi-processing

We will be discussing the fundamentals of thread-execution model. Before going into the concepts directly, we shall first discuss Python’s Global Interpreter Lock (GIL).

Global Interpreter Lock

Python threads are real system threads (POSIX threads). The host operating system fully manages the POSIX threads, also known as p-threads.

On multi-core systems, the Global Interpreter Lock (GIL) prevents the parallel execution of p-threads of a multi-threaded Python process, thus ensuring that only one thread runs in the interpreter at any given time.

Why do we need Global Interpreter Lock?

The GIL helps to simplify the implementation of the interpreter and its memory management. To understand how the GIL does this, we need to understand reference counting.

For example: In the code below, ‘b’ is not a new list. It is just a reference to the previous list ‘a.’

>>> a = []
>>> b = a
>>> b.append(1)
>>> a
[1]
>>> a.append(2)
>>> b
[1, 2]

 

Python uses a reference count to track the number of references that point to an object. The memory occupied by the object is released when its reference count drops to zero. If threads of a process, which share the same memory, try to increment and decrement this count simultaneously, the result can be leaked memory that is never released, or memory that is released while still in use. And this leads to crashes.
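The reference count can be observed directly in CPython with `sys.getrefcount`. The short sketch below assumes the CPython interpreter; the reported number is always one higher than expected, because passing the object to the function creates a temporary reference, so only the differences between readings are meaningful.

```python
import sys

a = []
before = sys.getrefcount(a)        # references to the list so far

b = a                              # a second name for the same list
after_alias = sys.getrefcount(a)   # one more reference than before

del b                              # drop the second name again
after_delete = sys.getrefcount(a)  # back to the original count

print(before, after_alias, after_delete)
```

It is this per-object counter that would be corrupted if two threads updated it at the same moment without any lock.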

One solution is to protect each object's reference count with its own lock (for example, a semaphore), so that it is not modified simultaneously. However, adding a lock to every object increases the performance overhead of acquiring and releasing locks. Hence, Python has a GIL, which gives access to all resources of a process to only one thread at a time. Apart from the GIL, there are other solutions for memory management, such as the garbage collection used in the Jython interpreter.
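The per-object locking idea can be illustrated at the Python level with `threading.Lock`. This is only an analogy for what the interpreter would have to do internally for every object: `SafeCounter` and `run_workers` are hypothetical names, and the counter stands in for a reference count.

```python
import threading


class SafeCounter:
    """A counter guarded by its own lock, like one lock per object."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may perform the read-modify-write step.
        with self._lock:
            self.value += 1


def run_workers(counter, workers=4, increments=10000):
    """Increment the shared counter from several threads at once."""

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Every increment pays for one lock acquisition and one release, which hints at the overhead of giving each object its own lock instead of using a single global one.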

So, the primary outcome of GIL is, instead of parallel computing you get pre-emptive (threading) and co-operative multitasking (asyncio).

Thread Execution Model in Python 2

 

# seq.py
import time

def countdown(n):
    while n > 0:
        n -= 1

count = 50000000
start = time.time()
countdown(count)
end = time.time()
print('Time taken in seconds -', end - start)

 

# par.py
import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

t1 = Thread(target=countdown, args=(COUNT//2,))
t2 = Thread(target=countdown, args=(COUNT//2,))
start = time.time()
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()
print('Time taken in seconds -', end - start)

 

~/Pythonpractice/blog❯ python seq.py
('Time taken in seconds -', 1.2412900924682617)
~/Pythonpractice/blog❯ python par.py
('Time taken in seconds -', 1.8751227855682373)

Ideally, the par.py execution time should be half of the seq.py execution time. However, in the example above, par.py takes about 50% longer than seq.py (1.88 seconds versus 1.24 seconds). To understand why performance drops even though the work is shared between two threads, we first need to discuss CPU-bound and I/O-bound threads.

CPU-bound threads are threads performing CPU-intensive operations such as matrix multiplication or nested loop operations. Here, the speed of program execution depends on CPU performance.

 

 

I/O-bound threads are threads performing I/O operations such as listening to a socket or waiting for a network connection. Here, the speed of program execution depends on factors including external file systems and network connections.

 

Scenarios in a Python multi-threaded program

When all threads are I/O-bound

If a thread is running, it holds the GIL. When the thread hits an I/O operation, it releases the GIL, and another thread acquires it and runs. This alternating execution of threads is called multitasking.
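This behaviour can be seen with a rough sketch in which time.sleep() stands in for a blocking I/O call, since sleeping also releases the GIL. The function names fake_io, run_sequential, and run_threaded are made up for this illustration, and the timings will vary from machine to machine.

```python
import time
from threading import Thread

def fake_io(seconds):
    # Stand-in for blocking I/O: the thread gives up the GIL while it waits
    time.sleep(seconds)

def run_sequential(n, seconds):
    start = time.perf_counter()
    for _ in range(n):
        fake_io(seconds)
    return time.perf_counter() - start

def run_threaded(n, seconds):
    threads = [Thread(target=fake_io, args=(seconds,)) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print('Sequential -', run_sequential(3, 0.1))
    print('Threaded   -', run_threaded(3, 0.1))
```

Because each waiting thread releases the GIL, the waits overlap and the threaded run finishes in roughly the time of a single wait. This is the opposite of the CPU-bound result in seq.py and par.py.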

 

 

When one thread is CPU-bound and another is I/O-bound

A CPU-bound thread is a special case in thread execution. It releases the GIL after every few ticks (100 by default in Python 2) and then tries to acquire it again. A tick corresponds roughly to one interpreter (bytecode) instruction. When releasing the GIL, the thread also signals the next thread in the execution queue (the ready queue of the operating system) that the GIL has been released. The CPU-bound and I/O-bound threads are now in a race to acquire the GIL, and the operating system decides which thread runs. This model, in which the next thread in the execution queue runs before the previous thread has completed, is called pre-emptive multitasking.
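The release-and-reacquire cycle can be imitated with an ordinary lock. The following is a toy simulation, not CPython's implementation: toy_gil is a plain threading.Lock, each loop iteration stands in for one tick, and CHECK_INTERVAL, cpu_bound, and run are names chosen for this sketch.

```python
import threading

CHECK_INTERVAL = 100   # ticks between releases, matching the Python 2 default

def cpu_bound(toy_gil, total_ticks, releases, name):
    """Run total_ticks of pretend work, letting go of the lock periodically."""
    toy_gil.acquire()
    for tick in range(1, total_ticks + 1):
        # ... one unit of interpreter work would happen here ...
        if tick % CHECK_INTERVAL == 0:
            releases[name] += 1      # counted while still holding the lock
            toy_gil.release()        # give waiting threads a chance
            toy_gil.acquire()        # immediately race to get the lock back
    toy_gil.release()

def run(total_ticks=10000):
    toy_gil = threading.Lock()
    releases = {"t1": 0, "t2": 0}
    threads = [threading.Thread(target=cpu_bound,
                                args=(toy_gil, total_ticks, releases, name))
               for name in releases]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return releases

if __name__ == "__main__":
    print(run())
```

Nothing in this scheme guarantees that a waiting thread wins the race after a release. The releasing thread is free to grab the lock again straight away.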

 

 

 

In most cases, the operating system gives preference to the CPU-bound thread and allows it to reacquire the GIL, leaving the I/O-bound thread starving. In the diagram below, the CPU-bound thread has released the GIL and signaled thread 2. But even before thread 2 tries to acquire the GIL, the CPU-bound thread has reacquired it. This issue has been resolved in the Python 3 interpreter’s GIL.

 

 

On a single-core system, if the CPU-bound thread reacquires the GIL, the second thread is pushed back to the ready queue and the operating system assigns it some priority. Python has no control over the priorities assigned by the operating system.

On a multicore system, if the CPU-bound thread reacquires the GIL, the second thread is not pushed back. Instead, it runs on another core and continuously tries to acquire the GIL. This is called thrashing. Thrashing reduces performance when many threads, running on different cores, try to acquire the GIL. Python 3 also addresses the issue of thrashing.

 

 

Global Interpreter Lock in Python 3

The Python 3 thread execution model has a new GIL. If only one thread is running, it continues to run until it hits an I/O operation or another thread requests that it drop the GIL. A global variable (gil_drop_request) is used to implement this.

If gil_drop_request = 0, running thread can continue until it hits I/O

If gil_drop_request = 1, running thread is forced to give up the GIL

Instead of the CPU-bound thread checking after every few ticks, the second thread waits for a timeout (5 milliseconds by default) and then sends a GIL drop request by setting gil_drop_request = 1. The first thread then drops the GIL at the next opportunity. Additionally, the first thread stays suspended until the second thread signals that it has acquired the GIL. This helps to avoid thrashing. This mechanism is not available in Python 2.
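The hand-off can be modelled with a condition variable. The sketch below is a toy model written for this article, not CPython's actual code: ToyGIL, drop_request, worker, and run are made-up names, and each loop iteration stands in for one interpreter instruction. The only real API used is sys.getswitchinterval(), which returns the interpreter's timeout in seconds.

```python
import sys
import threading

class ToyGIL:
    """Toy model of the timeout-based hand-off; not CPython's implementation."""

    def __init__(self, timeout):
        self._cond = threading.Condition()
        self._locked = False
        self._switches = 0           # number of successful acquisitions so far
        self.drop_request = False    # plays the role of gil_drop_request
        self.timeout = timeout

    def acquire(self):
        with self._cond:
            while self._locked:
                # Timed wait: if the holder has not let go in time, ask it to.
                if not self._cond.wait(self.timeout):
                    self.drop_request = True
            self._locked = True
            self.drop_request = False
            self._switches += 1
            self._cond.notify_all()  # tell a suspended releaser we now hold the lock

    def release(self):
        with self._cond:
            forced = self.drop_request
            self._locked = False
            self._cond.notify_all()
            if forced:
                # Stay suspended until another thread has actually taken over.
                seen = self._switches
                while self._switches == seen:
                    self._cond.wait()

def worker(gil, ticks, done, name):
    gil.acquire()
    for _ in range(ticks):
        if gil.drop_request:         # checked between pretend instructions
            gil.release()
            gil.acquire()
        done[name] += 1
    gil.release()

def run(ticks=50000):
    gil = ToyGIL(timeout=sys.getswitchinterval())
    done = {"t1": 0, "t2": 0}
    threads = [threading.Thread(target=worker, args=(gil, ticks, done, name))
               for name in done]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done

if __name__ == "__main__":
    print(run())
```

In this model the releasing thread cannot reacquire the lock until the requesting thread has taken it, which is the property that prevents thrashing. The real timeout can be changed with sys.setswitchinterval().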

 

 

Missing Bits in the New GIL

While the new GIL does address issues such as thrashing, it still has room for improvement in some areas:

Waiting for timeout

Waiting for the timeout can make the I/O response slow, especially when there are many recurrent I/O operations. To get the GIL back from a CPU-bound thread, an I/O-bound thread has to wait for the timeout. This happens after every I/O operation, and before the next input/output is ready.

Unfair GIL acquiring

As seen below, the thread that makes the GIL drop request is not necessarily the one that gets the GIL. This type of situation can reduce I/O performance where response time is critical.

 

 

Prioritizing threads

There is a need for the GIL to distinguish between CPU-bound and I/O-bound threads and then assign priorities. High-priority threads must be able to immediately preempt low-priority threads. This will help improve the response time considerably. This problem has already been solved in operating systems. Operating systems use timeouts to automatically adjust task priorities. If a thread is preempted by a timeout, it is penalized with a lower priority. Conversely, if a thread suspends early, it is rewarded with a raised priority. Incorporating this in Python will help improve thread performance.
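The priority rule described above can be sketched in a few lines. This is purely illustrative: adjust_priority, pick_next, and the 0 to 10 priority range are assumptions made for this example, and they do not correspond to any existing Python or operating system API.

```python
def adjust_priority(priority, preempted_by_timeout, lowest=0, highest=10):
    """Return the new priority of a thread after it stops running."""
    if preempted_by_timeout:
        # Used its whole time slice: likely CPU-bound, so penalize it
        return max(lowest, priority - 1)
    # Suspended early (for example, on I/O): reward it
    return min(highest, priority + 1)

def pick_next(priorities):
    """Choose the runnable thread with the highest priority."""
    return max(priorities, key=priorities.get)

if __name__ == "__main__":
    priorities = {"cpu_thread": 5, "io_thread": 5}
    priorities["cpu_thread"] = adjust_priority(priorities["cpu_thread"], True)
    priorities["io_thread"] = adjust_priority(priorities["io_thread"], False)
    print(pick_next(priorities))
```

Under such a rule, a thread that keeps blocking on I/O gradually rises above a thread that keeps using up its time slice, so it would be scheduled first when it becomes ready.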