The protection of personal data in the case of an Artificial Intelligence — Gowper

Gowper
9 min read · Nov 2, 2020

How the personal data processed by an Artificial Intelligence should be protected

Artificial Intelligence (AI), or Machine Learning, is a tool currently used by third parties (programmers, product developers, etc.) to build solutions across technology and innovation sectors: Internet search engines, personal assistants, robotics, video games and industrial control, as well as fields such as healthcare, optimization of public services, defence systems and environmental management. AI accesses and processes data as part of other technologies and processing operations, such as big data, IoT and 5G. Because of their characteristics and special sensitivity, these data are protected by authorities and governments, and they are handled not only by the AI itself but also by the human and technological elements that develop it.

AI is “the science and engineering of creating intelligent machines”; its legal configuration and some risks of machine learning were the subject of an earlier post.

On this occasion, we would like to explain how the personal data that an AI may process are protected. Such processing, especially since it involves AI that is still evolving, must be legitimized, carried out transparently according to standardized protection measures, and with respect for the exercise of rights by the data subjects, sampling accuracy, minimization of impact and a proportionality filter.

In order to standardize this protection, the Spanish Data Protection Agency (AEPD) has published a document entitled Adaptation to the GDPR of processing operations that incorporate Artificial Intelligence. An introduction. Based on the General Data Protection Regulation (GDPR), the AEPD has prepared this guide on how data should be treated in products and services that include AI components, covering the design and implementation of the processing operations that incorporate AI.

TYPE OF (PERSONAL) DATA PROCESSED BY THE AI

There are three different categories of AI: general, strong and weak artificial intelligence. General AI would solve any intellectual task on an equal footing with a human being; strong or super-intelligent AI would go beyond human capabilities; weak AI, in contrast to the other two, focuses on developing solutions to concrete, delimited problems.

How are personal data affected? AI may process data of natural persons (e.g. personal data linked to search preferences and the personal accounts connected to online search engines) or data that do not involve the processing of personal data (e.g. weather forecasting models).

The data collected or processed (whether personal or not) are used in decision making by the AI in two ways: to assist in the decision process but without making the final decision (which is up to the human being), or to make (and execute) the decision autonomously.

Another important element is the life cycle of AI solutions.

  • Conception and analysis

The functional and non-functional requirements of the solution are established

  • Development

It includes sub-stages such as research, prototyping, design, testing, training and validation, etc.

  • Exploitation

It includes sub-steps such as integration, production, deployment, inference, decision, maintenance and evolution, etc.

  • Conclusion

End of the processing / AI component

Ethical filter. The AEPD also identifies some ethical challenges of AI applicable to its entire life cycle, including discriminatory biases or lack of critical evaluation, etc.

HOW THE AI SHOULD CARRY OUT THE PROCESSING OF PERSONAL DATA

Types of processing

Development/Training

Definition / search and retrieval of the data set / pre-processing (unstructured data processing, cleaning, balancing, selection, transformation) / splitting or partitioning of the data set for verification / traceability and audit.
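As a purely illustrative sketch (not part of the AEPD guide), the partitioning and traceability/audit steps above might look like this in Python; the function name and the idea of fingerprinting each partition for later audit are our own assumptions:

```python
import hashlib
import random

def split_dataset(records, train_frac=0.8, seed=42):
    """Partition a data set into training and validation subsets,
    recording a fingerprint of each partition for traceability/audit."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    train, valid = shuffled[:cut], shuffled[cut:]

    def fingerprint(rows):
        # A hash of the (sorted) partition lets a later audit verify
        # exactly which records went where, without copying them.
        return hashlib.sha256(repr(sorted(rows)).encode()).hexdigest()

    return train, valid, {"train": fingerprint(train), "valid": fingerprint(valid)}

train, valid, audit_log = split_dataset(range(100))
```

In a real pipeline the records would of course be pre-processed (cleaned, balanced, pseudonymized) before this step.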

Validation

Only if real data are used to determine the validity of the model.

Deployment

Data will be communicated whenever the AI solution is distributed to or obtained from third parties.

Exploitation

  • Inference: when the data is intended to obtain a result, or if it comes from third parties, or in the case of storage.
  • Decision: any decision on a data subject will involve processing.
  • Evolution: where the data and results of the data subjects are intended to improve the AI model.

Withdrawal

Through local, centralized or distributed data suppression treatments, as well as service portability.

Withdrawal or completion of the processing, whether partial at some stage of the AI life cycle or at the end of the AI's life (e.g. through deletion or anonymization procedures), should be real, and include verification of the re-identification risk.
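One common way to quantify re-identification risk is k-anonymity: the size of the smallest group of records that share the same combination of quasi-identifiers. The AEPD guide does not prescribe a specific metric, so this sketch is our own illustrative assumption:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    quasi-identifier values; k = 1 means someone is uniquely identifiable."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

people = [
    {"zip": "28001", "age_band": "30-40", "diagnosis": "A"},
    {"zip": "28001", "age_band": "30-40", "diagnosis": "B"},
    {"zip": "28002", "age_band": "50-60", "diagnosis": "C"},
]
# The third record is unique on (zip, age_band), so k = 1: high risk.
```

A low k after a supposed anonymization signals that the deletion or anonymization was not "real" in the sense required above.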

Roles

The controller is the entity that decides to use an AI solution in the framework of a personal data processing operation, being the entity that “determines the means and purposes of the processing”. The AI itself will never be the controller.

The roles may vary according to the different stages.

STEP: Development / Training

CONTROLLER: Entity that defines the objectives of AI and decides the data to be used for AI training.

PROCESSOR: Third party contracted for AI development or training (it does not matter whether the data are provided to the contractor or the contractor obtains them itself).

STEP: Validation

CONTROLLER: Entity that decides which data are used to validate the AI model.

STEP: Deployment

CONTROLLER: Entities that market the AI solution (private or domestic use by individuals is excluded).

PROCESSOR: Entity that assigns the use of the AI solution to a third party through a service provision and does not process the data for its own purposes.

STEP: Inference / profiling

CONTROLLER: Entity that processes the AI solution data for its own purposes (private or domestic use by a natural person is excluded).

PROCESSOR: Entity that assigns the use of the AI solution to a third party through a service provision and does not process the data for its own purposes.

STEP: Decision

CONTROLLER: Entity that makes automated decisions about data subjects for its own purposes.

PROCESSOR: Entity that assigns the use of the AI solution to a third party through a service provision and does not process the data for its own purposes.

STEP: Evolution

CONTROLLER: Entity that communicates AI solution data to third parties; entity that arranges for the evolution of the AI.

PROCESSOR: Entity contracted to provide the service of data processing in the AI system.

WHAT RULES MUST BE FOLLOWED?

The AEPD imposes a series of conditions to verify that the processing by the AI respects the data protection regulations: accountability, legitimacy, information and rights of the data subjects.

Legal basis of the processing

Ordinary grounds:

  • Need to process data for the execution of a contract or for the implementation of pre-contractual measures.
  • Legitimate interest.
  • Consent, taking into account that the subsequent withdrawal of consent will not affect the processing carried out until that moment, nor will it have a retroactive effect in relation to the results previously obtained.

Exceptional grounds:

  • Protection of vital interests.
  • Reasons of public interest established in EU or Member States’ law.
  • Compliance with legal obligations laid down in a rule of the EU or of the Member States.

Data from third parties for AI training must have been legitimately acquired (e.g. by a contract justifying it, identifying the origin and ensuring the legality of such data).

Special categories of data may not be used for automated decision-making, unless the data subject's explicit consent is given or substantial public interest reasons apply.

The purposes of the processing may change from those initially envisaged, but the lawfulness of the processing must be maintained at all times.

The requirement for transparency

Data subjects should be aware of the impact of the use of AI solutions on the processing of their data, for the duration of the processing.

Information in the training stage: the data subject must know whether he/she could be re-identified from the model's data.

Certification: Certification mechanisms increase transparency and guarantee greater confidentiality in order to preserve industrial property.

First Information Layer: data subjects will be informed in accordance with Articles 13 and 14 of the GDPR at each stage of the AI life cycle.

Relevant information for the data subject:

  • Details and relevance of the data used for decision making.
  • Quality of training data and patterns.
  • Profiles carried out.
  • Precision or error values according to the metric applied to determine the inference.
  • Qualified human supervision (when applicable).
  • Audits and certification of the AI system.
  • Existence of third party data, prohibitions and sanctions provided.

Internal figures of the controller

Personnel. Information, training and audits to manage the personnel involved.

Data Protection Officer (“DPO”, “DPD” in Spanish). Although not always mandatory, the appointment of a DPO is recommended by the AEPD for AI solutions.

Exercise of rights by the persons concerned

  • Data deletion.
  • Determination of the legal basis for the communication of data to third parties and information to data subjects.
  • Application of privacy measures by default, from the design.
  • Risk impact assessment.
  • Blocking of data from the inference process (including inputs and results obtained) in the event of a complaint by data subjects.

Specific rights:

  • Rectification. Inaccurate data is only exceptionally allowed if it is to make the data subjects anonymous and avoid re-identification.
  • Portability. The data controller must determine the possibility of portability of the data.
  • Exclusion of automated individual decisions. A human operator must always be able to override the algorithm and be ready to intervene.
  • Enhanced protection of children’s data. Decisions about children may be automated only where this is essential to protect the welfare of the child and with appropriate safeguards.

THE RISKS OF PROCESSING BY THE AI

There are two phases:

Identification of the threats

The controller must identify the risks inherent in AI, including data protection impact assessments (DPIA, “EIPD” in Spanish) in the case of profiling or automated individual decisions.

Technical and organizational measures

The controller must implement the necessary technical and organizational measures to eliminate or reduce the risk. These include:

Transparency.

Accuracy. Taking into account the following factors:

  • Errors in AI systems due to internal (programming or design) or external (biometric readers, etc.) elements.
  • Errors of the AI itself in the training or validation data.
  • Biased evolution of the AI model.
  • Specialty of biometric information (e.g., facial recognition, fingerprints, voice, etc.).

Minimization

  • Limit the degree of detail or accuracy of the information.
  • Limit the number of people affected.
  • Limit the accessibility of the different categories of data to staff or end users at each stage.
  • Use of standard techniques: suppression of unstructured data or information not needed during the pre-processing of the information; aggregation of data; anonymization and pseudonymization.
  • Specialization of third party data.
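Two of the standard techniques listed above, suppression of unneeded fields and pseudonymization, can be sketched as follows. This is an illustration only: the key handling, function names and field names are our own assumptions, not the AEPD's.

```python
import hmac
import hashlib

# Hypothetical secret key: in practice it must be stored and rotated
# separately from the AI system, so the mapping cannot be reversed there.
PSEUDONYM_KEY = b"keep-me-out-of-the-training-environment"

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (a pseudonym).
    Without the key, the pseudonym cannot be linked back to the person."""
    return hmac.new(PSEUDONYM_KEY, identifier.encode(), hashlib.sha256).hexdigest()

def minimize(record: dict, needed: set) -> dict:
    """Suppress every field the current processing stage does not need."""
    return {k: v for k, v in record.items() if k in needed}

raw = {"email": "ana@example.com", "age": 34, "city": "Madrid",
       "phone": "+34 600 000 000"}
training_row = minimize(raw, {"age", "city"})          # drop email and phone
training_row["subject_id"] = pseudonymize(raw["email"])  # stable pseudonym
```

Note that keyed hashing is pseudonymization, not anonymization: whoever holds the key can still re-identify the data subject, so GDPR obligations continue to apply.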

Security

Specific threats:

  • Access and manipulation of the training data set.
  • Trojans or backdoors in the code or development tools.
  • Manipulation of the user API to access the model and tamper with its parameters, leading to leaks or attacks.
  • Leakage of, or unauthorized access to, the logs resulting from inferences generated in the interaction with data subjects.

Logs or activity records:

  • Who and under what circumstances accesses the personal data included.
  • Traceability of the update of the inference models, the communications of the user’s API with the model, and the detection of abuse or intrusion attempts.
  • Monitoring of the quality parameters of the inference.
  • Legal basis: Article 6 of the GDPR. E.g. network and information security, compliance with legal obligations (e.g. prevention of money laundering and terrorist financing).
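The activity-record requirements above amount to an append-only structured log. A minimal sketch, with field names of our own invention, might be:

```python
import io
import json
import time

def log_access(sink, user, purpose, record_ids, legal_basis):
    """Append one structured entry per access: who, when, for what purpose,
    which records, and the Article 6 GDPR legal basis invoked."""
    entry = {
        "timestamp": time.time(),
        "user": user,
        "purpose": purpose,
        "records": list(record_ids),
        "legal_basis": legal_basis,
    }
    sink.write(json.dumps(entry) + "\n")  # one JSON object per line

# Example usage with an in-memory sink; a real system would use an
# append-only, access-controlled file or logging service.
log = io.StringIO()
log_access(log, "analyst-7", "model evaluation",
           ["p-001", "p-002"], "Art. 6(1)(f) legitimate interest")
```

One JSON object per line keeps the record machine-auditable, which is exactly what the traceability and intrusion-detection points above require.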

Audit

The compliance with data protection regulations of the processing carried out by the AI must be verified during its entire life cycle.

International transfers

There are often cross-border data flows in an AI system (e.g. cloud computing, etc.), including data transfer to third parties for model evaluation or development. These must at all times comply with Chapter V of the GDPR.

WHAT SOLUTIONS DO WE OFFER?

At Gowper we encourage the use of AI models that are beneficial for the different technological sectors, industry and society in general. In the case of sensitive information and personal data of great value to all, we must always verify compliance with all necessary security measures and current legislation. In order to provide our customers with the legal tools that allow them to carry out a safe processing of personal data by an AI or Machine Learning system, we put at your disposal:

  • Design of models of regulatory compliance in the field of personal data adapted to AI.
  • Review and adaptation of models of regulatory compliance in the field of personal data processing to the requirements of the GDPR.
  • Examination of personal data susceptible to discrimination based on the specific AI model.
  • Negotiation and drafting of contracts for the development and marketing of Machine Learning systems, including the protection of personal data that may be processed by the AI.

Learn more about our offer of Individualized Solutions Plans (ISPs), especially our Orange Solutions, or our range of Services & Industries.

Originally published at https://gowper.com


Gowper

A disruptive, technology-powered provider of specialist legal services for entrepreneurs & businesses all over the world.