
Could an explainable model be inherently less secure?

By ajitjaokar
Machine Learning Is Not Proof Against Hacking

In a week when privacy is very much on the agenda, we ask how we can protect AI models. This is already a mature field, but it is still not on the radar of most developers. Recently, ENISA published a report called Securing Machine Learning Algorithms (link below), which gives a good summary of the key threats involved.

We first explain the threats and then list the vulnerabilities mapped to each of them.

Evasion

A type of attack in which the attacker works on the ML algorithm’s inputs to find small perturbations leading to large modifications of its outputs (e.g. decision errors). It is as if the attacker created an optical illusion for the algorithm. Such modified inputs are often called adversarial examples.
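
The vulnerability list later in this post mentions FGSM (the Fast Gradient Sign Method), one of the simplest ways to craft such adversarial examples. The sketch below is purely illustrative and assumes a differentiable PyTorch classifier; the model, loss and epsilon value are placeholders.

```python
# A rough FGSM sketch, assuming a differentiable PyTorch classifier `model`,
# a batched input tensor `x` with values in [0, 1], and integer label tensor.
import torch
import torch.nn.functional as F

def fgsm_example(model, x, label, epsilon=0.03):
    """Craft an adversarial example by stepping along the sign of the gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # A small, bounded perturbation in the direction that increases the loss.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixel values in a valid range
```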

Oracle

A type of attack in which the attacker explores a model by providing a series of carefully crafted inputs and observing its outputs. These attacks can be a precursor to more harmful ones, such as evasion or poisoning. Example: an attacker studies the set of input-output pairs and uses the results to retrieve training data.
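
As a rough sketch of this idea, the snippet below queries a black-box model, records the input-output pairs and fits a local surrogate that the attacker can study offline. The `query_model` function is hypothetical, and random probes are used only for brevity; real attacks choose their queries far more carefully.

```python
# Sketch of an oracle-style attack, assuming a hypothetical `query_model(x)`
# function that stands in for the victim's prediction API and returns a label.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_surrogate(query_model, n_queries=1000, n_features=20):
    X = np.random.uniform(-1.0, 1.0, size=(n_queries, n_features))
    y = np.array([query_model(x) for x in X])       # observed input-output pairs
    surrogate = DecisionTreeClassifier().fit(X, y)  # local copy to study offline
    return surrogate
```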

Poisoning

A type of attack in which the attacker alters the data or the model to modify the ML algorithm’s behaviour in a chosen direction (e.g. to sabotage its results, or to insert a backdoor). It is as if the attacker conditioned the algorithm according to its motivations. Example: massively indicating to an image recognition algorithm that images of dogs are in fact cats, so that it learns to interpret them this way.
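
A minimal sketch of what backdoor-style poisoning could look like in code, assuming grayscale training images of shape (N, H, W) with values in [0, 1] and integer labels; the trigger pattern, target class and poisoning fraction are arbitrary choices for illustration.

```python
# Sketch of backdoor-style poisoning: stamp a small trigger on a fraction of
# the training images and relabel them as the target class, so the trained
# model learns to associate the trigger with that class.
import numpy as np

def poison_dataset(images, labels, target_class=0, fraction=0.05):
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * fraction)
    idx = np.random.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = 1.0     # 3x3 white trigger patch in a corner
    labels[idx] = target_class      # mislabel the poisoned samples
    return images, labels
```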

Label modification

An attack in which the attacker corrupts the labels of training data.
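
For illustration only, here is a sketch of random label flipping on a NumPy label array; the number of classes and the flip rate are assumed values.

```python
# Sketch of label modification: randomly reassign a fraction of training labels.
import numpy as np

def flip_labels(labels, n_classes=10, flip_rate=0.1):
    labels = labels.copy()
    n_flip = int(len(labels) * flip_rate)
    idx = np.random.choice(len(labels), size=n_flip, replace=False)
    labels[idx] = np.random.randint(0, n_classes, size=n_flip)
    return labels
```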

Model or data disclosure

This threat refers to the possibility of leakage of all or partial information about the model. Example: the outputs of an ML algorithm are so verbose that they give away information about its configuration (or leak sensitive data).
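
To make the "verbose outputs" example concrete, the sketch below contrasts a response that returns the full probability vector with one that returns only the top-1 label; the API shape is an assumption of mine, not something taken from the report.

```python
# Returning every class probability reveals far more about the model's
# decision surface than a hardened top-1 response does.
import numpy as np

def verbose_response(probs, classes):
    # Leaky: every class probability is exposed to the caller.
    return {"probabilities": dict(zip(classes, probs.tolist()))}

def hardened_response(probs, classes):
    # Less leaky: only the predicted label is returned.
    return {"label": classes[int(np.argmax(probs))]}
```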

Data disclosure

This threat refers to a leak of data manipulated by ML algorithms. Such a leak can be explained by inadequate access control, a handling error by the project team, or simply by the fact that the entity that owns the model and the entity that owns the data are sometimes distinct.

Model disclosure

This threat refers to a leak of the internals (i.e. parameter values) of the ML model. This model leakage could occur because of human error or a contract with a third party whose security level is too low.

Compromise of ML application components

This threat refers to the compromise of a component or development tool of the ML application.

Example: compromise of one of the open-source libraries used by the developers to implement the ML algorithm.
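
One common mitigation for this kind of supply-chain risk, offered here as my own illustration rather than a recommendation from the report, is to verify the checksum of any downloaded artefact (for example a pre-trained model file) against a value recorded out of band before using it.

```python
# Verify the SHA-256 checksum of a downloaded artefact before loading it.
# The file path and expected hash are placeholders.
import hashlib

def verify_artifact(path, expected_sha256):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        raise RuntimeError(f"Checksum mismatch for {path}; refusing to load it")
```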

Failure or malfunction of ML application

This threat refers to ML application failure (e.g. denial of service due to bad input, unavailability due to a handling error). For example, if the service level of the third-party infrastructure hosting the ML application is too low for the business needs, the application may be regularly unavailable.
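
A simple guard against the "denial of service due to bad input" case, sketched here under an assumed input shape and value range, is to validate requests before they reach the (more expensive) inference step.

```python
# Minimal input guard; the expected shape and range are placeholders for
# whatever the model was actually trained on.
import numpy as np

EXPECTED_SHAPE = (28, 28)
VALUE_RANGE = (0.0, 1.0)

def validate_input(x: np.ndarray) -> np.ndarray:
    if x.shape != EXPECTED_SHAPE:
        raise ValueError(f"Unexpected input shape {x.shape}")
    if not np.isfinite(x).all():
        raise ValueError("Input contains NaN or infinite values")
    lo, hi = VALUE_RANGE
    if x.min() < lo or x.max() > hi:
        raise ValueError("Input values outside the expected range")
    return x
```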

Human error

The different stakeholders of the model can make mistakes that result in a failure or malfunction of the ML application. For example, due to a lack of documentation, they may use the application in use cases not initially foreseen.

Having outlined the threats, let us now look at the vulnerabilities that map to each of them.


Evasion               

  • Lack of detection of abnormal inputs (a simple detection sketch follows this list)
  • Poor consideration of evasion attacks in the model design and implementation
  • Lack of training based on adversarial attacks
  • Using a widely known model allowing the attacker to study it
  • Inputs totally controlled by the attacker, which allows input-output pairs to be collected
  • Use of adversarial examples crafted in white or grey box conditions (e.g. FGSM…)
  • Too much information available on the model
  • Too much information about the model given in its outputs
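
To make the first item in the list above concrete, here is a deliberately crude sketch of abnormal-input detection: flag queries whose top-class probability falls below a threshold. The probability interface and the threshold are assumptions, and real deployments would use stronger out-of-distribution or adversarial-input detectors.

```python
# Flag predictions that are too uncertain to act on blindly.
import numpy as np

def is_suspicious(probs, threshold=0.6):
    """Return True if the top-class probability is below the chosen threshold."""
    return float(np.max(probs)) < threshold
```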

Oracle  

  • The model allows private information to be retrieved
  • Too much information about the model given in its outputs
  • Too much information available on the model
  • Lack of consideration of the attacks to which ML applications could be exposed
  • Lack of security process to maintain a good security level of the components of the ML application
  • Weak access protection mechanisms for ML model components

Poisoning             

  • Lack of data for increasing robustness to poisoning
  • Poor access rights management
  • Poor data management
  • Undefined indicators of proper functioning, making compromise identification complex
  • Lack of consideration of the attacks to which ML applications could be exposed
  • Use of uncontrolled data
  • Use of unsafe data or models (e.g. with transfer learning)
  • Lack of control for poisoning
  • No detection of poisoned samples in the training dataset
  • Weak access protection mechanisms for ML model components
  • Use of unreliable sources to label data

Model or data disclosure              

  • Existence of unidentified disclosure scenarios
  • Weak access protection mechanisms for ML model components
  • Lack of security process to maintain a good security level of the components of the ML application
  • Unprotected sensitive data on test environments

Data disclosure

  • Too much information about the model given in its outputs
  • The model can allow private information to be retrieved
  • Disclosure of sensitive data for ML algorithm training
  • Too much information available on the model
  • Too much information about the model given in its outputs

Compromise of ML application components        

  • Too much information available on the model
  • Existence of several vulnerabilities because the ML application was not included in the process for integrating security into projects
  • Use of vulnerable components (among the whole supply chain)
  • Too much information about the model given in its outputs
  • Existence of unidentified compromise scenarios
  • Undefined indicators of proper functioning, making compromise identification complex
  • Bad practices due to a lack of cybersecurity awareness
  • Lack of security process to maintain a good security level of the components of the ML application
  • Weak access protection mechanisms for ML model components
  • Existence of several vulnerabilities because ML specificities are not integrated into existing policies
  • Existence of several vulnerabilities because the ML application does not comply with security policies
  • Contract with a low security third party

Failure or malfunction of ML application               

  • ML application not integrated in the cyber-resilience strategy
  • Existence of unidentified failure scenarios
  • Undefined indicators of proper functioning, making malfunction identification complex
  • Lack of explainability and traceability of decisions taken
  • Lack of security process to maintain a good security level of the components of the ML application
  • Existence of several vulnerabilities because ML specificities are not integrated in existing policies
  • Contract with a low security third party
  • Application not compliant with applicable regulations

Human error

  • Poor access rights management
  • Lack of documentation on the ML application
  • Denial of service due to inconsistent data or a sponge example (an input crafted to maximise the model’s energy or latency cost)
  • Use of uncontrolled data
  • Cybersecurity incident not reported to incident response teams
  • Lack of cybersecurity awareness

But all this raises a curious question for me: could an explainable model be inherently less secure? In other words, does the model become easier to deceive the more that is known about its internal workings and data?

Maybe a topic for a Ph.D. researcher!

The full report is available here: Securing Machine Learning Algorithms.