The Transparency Problem with AI

The use of artificial intelligence (AI) in high-stakes decision-making continues to grow. It now informs decisions about loan-worthiness, emergency response, medical diagnosis, job candidate selection, parole determination, criminal punishment, and educator performance. But a critical question keeps coming up in these areas: how are the decisions being made? What factors did the AI look at, and what weight did it give each of them? How comprehensive were the training cases that prepared the AI model for this decision-making? These are not trivial questions, given that decisions about lives and livelihoods are being made based on AI output.

The “Black Box”

The issue is typically described as the "black box" problem: the inability of people to understand exactly what machines are doing when they are teaching themselves. Put simply, data goes into the computer or cloud server, a training algorithm processes the data and "learns" from it, and the resulting AI algorithms process a query and produce a decision or an output that feeds a decision-making process. The AI algorithms change based on what they learn, and their outputs or decisions change as a result. But how did the machine (the black box) make a given decision? With the AI model constantly changing based on its training data, how does a user know how the model arrived at its decisions? As AI is used in more and more decision-making scenarios where people's lives or livelihoods are at stake, transparency into how the machine made its decisions is going to become more and more important. For example, recent changes in EU privacy law require that certain decisions made by automated processing be explainable.
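The point that a model's decisions shift as it keeps learning can be shown with a deliberately tiny sketch. The `RunningMeanModel` class and its numbers are hypothetical and stand in for no real system; real models are far more complex, which is exactly why their changes are harder to trace.

```python
class RunningMeanModel:
    """Toy 'learning' model: approves a value if it is at or above
    the mean of all training values seen so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def train(self, values):
        # Each new batch of training data moves the decision threshold.
        for value in values:
            self.total += value
            self.count += 1

    def decide(self, value):
        if self.count == 0:
            raise ValueError("model has not been trained")
        return value >= self.total / self.count


model = RunningMeanModel()

model.train([50, 60, 70])        # threshold is now the mean, 60
before = model.decide(65)        # 65 clears the threshold

model.train([90, 95, 100])       # more data raises the mean to 77.5
after = model.decide(65)         # the same input is now rejected

print(before, after)
```

Nothing about the applicant's input changed between the two calls; only the training data did. In this toy the reason is easy to read off, but in a model with millions of learned parameters it generally is not.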

AI and GDPR


The European Union's new privacy law, the General Data Protection Regulation (better known by its acronym, GDPR), came into effect in May 2018. Several of its requirements affect the ability of entities to collect large amounts of personal data and to use artificial intelligence (AI).
One significant requirement is that individuals have the right not to be subject to a decision based solely on automated processing where that decision has legal effects on them or similarly significantly affects them (GDPR Article 22). It is still too early to understand the full legal ramifications but, at a minimum, companies using AI in ways that significantly affect people will have to be prepared to explain how or why an automated process arrived at a decision. This involves understanding the data and algorithms used for training, as well as explaining the AI algorithms involved in making the decision. All of this may be nearly impossible given the large amounts of data used in training and the ever-changing (learning) nature of the algorithms used in the decision-making process. Without the ability to explain how decisions are arrived at, entities must obtain explicit consent from each user to process their personal data, must not process the data of individuals who refuse to consent, and must provide an alternative process, for users who request it, that allows for human intervention.

The GDPR's direct reach is the European Union's 28 countries, 500 million people, and GDP of almost fifteen trillion dollars. Its potential reach is much bigger, in that it applies to all companies located in the EU regardless of where their customers live. For companies located outside the EU, the GDPR can still apply when they process the personal data of individuals in the EU. Thus, the ability to use AI tools legally under the GDPR is going to be crucial. AI tools need to be able to explain how their decisions were arrived at, or the companies using them can be subject to fines for non-compliance.

The lack of AI transparency is not an issue unique to the EU. How long will it be before it becomes a focus in the US and other parts of the world?

AI and Transparency

Most AI raises four important issues: 1) it requires significant amounts of training data to create a learned model (build its algorithms); 2) the algorithms tend to absorb the biases of the developers, the trainers, and/or the data they learn from; 3) the decision-support algorithms change continuously as they receive additional training data and feedback; and 4) there are limited means to understand how a particular algorithm makes its decisions or recommendations, meaning there is little or no visibility into how a decision was arrived at.

Imagine you are seeking a job. You are one of a thousand applicants who turn in a resume for a position defined in a detailed job description. The employer could engage screeners to look at all the resumes and narrow them down to a more reasonable number of applicants to contact for interviews. The screeners could do this by looking for specific words, phrases, or positions listed in each resume that best match key attributes from the job description. The decision as to which resumes make the cut is then based on the best match to those key attributes.

To save time and costs, the employer might instead digitize all of the resumes and run them through a program that looks for these same words or phrases. Again, the resumes that make the cut are the ones containing the most words (often called "keywords") or phrases relevant to the position. The problem with this approach is that applicants quickly figure out that their best chance of making the cut is to customize each resume to the job description by including as many of the most relevant keywords as possible. Website services have even been developed that provide lists of keywords to include in resumes to improve one's chances of being selected. If you are an applicant and don't include the keywords, your chances of making the cut are likely to decrease even when your qualifications are very relevant to the position. The result for the employer is that many people who should not make the cut do, and many who should, and who are likely the best candidates, do not.
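The keyword approach described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular screening product: the `keyword_score` function, the keyword list, and both resume snippets are made up for the example.

```python
import re


def keyword_score(resume_text, keywords):
    """Count how many target keywords appear in the resume (case-insensitive)."""
    words = set(re.findall(r"[a-z0-9+#]+", resume_text.lower()))
    return sum(1 for keyword in keywords if keyword.lower() in words)


# Hypothetical keywords pulled from a job description.
KEYWORDS = ["python", "sql", "forecasting", "tableau"]

# A well-qualified applicant who describes the work in their own words.
qualified = (
    "Built demand prediction models and reporting dashboards "
    "for a retail chain using Python."
)

# An applicant who copied the keyword list straight into the resume.
stuffed = "Skills: Python, SQL, forecasting, Tableau."

scores = {
    "qualified": keyword_score(qualified, KEYWORDS),
    "stuffed": keyword_score(stuffed, KEYWORDS),
}
print(scores)
```

The first resume describes forecasting and dashboard work without using the exact terms, so it matches only one keyword, while the keyword-stuffed resume matches all four. Exact-match counting rewards vocabulary rather than experience.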

Now bring AI with contextual analysis capabilities into the equation. Instead of looking for keywords, algorithms can be trained to look for context in the resumes that best fits a particular position. The algorithms can be run against the resumes to develop a shortlist of candidates and even rank them by relevancy. The problem to date has been a lack of transparency in the process: how or why was the ranking done the way it was? In other words, what were the factors that led to that particular sequencing of candidates? The algorithm made the decisions, but why those choices? Without understanding the why, one is putting “blind” faith in an algorithm. Without visibility, the questions for the employer and the applicants are:

•    Was the decision made fairly?
•    On what basis was the decision made and at what confidence level?
•    Was there bias in how AI looked at the data or in the actual decision it made?
•    If there was bias, was it conscious, unconscious, or a combination of both?
•    Are the decisions being made consistent between data sets?
•    Are the decisions being made consistent over time?
•    And, on a personal level, the issue may be as simple as, why wasn't I chosen?
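To make "on what basis was the decision made" concrete, here is a minimal sketch of what a transparent answer could look like for a very simple scoring model. It is an illustration only and is not Vettd's method; the `explain_score` function, the factor names, and the weights are all hypothetical. Deep learning models generally do not break down this cleanly, which is the heart of the transparency problem.

```python
def explain_score(features, weights):
    """Return the total score and each factor's contribution (weight * value)."""
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    return sum(contributions.values()), contributions


# Hypothetical weights a simple linear model might have learned.
WEIGHTS = {
    "years_experience": 2.0,
    "relevant_projects": 3.0,
    "employment_gap_years": -1.0,
}

# Hypothetical applicant.
applicant = {
    "years_experience": 5,
    "relevant_projects": 2,
    "employment_gap_years": 1,
}

total, contributions = explain_score(applicant, WEIGHTS)

# List the factors from most to least influential on this decision.
ranked_factors = sorted(
    contributions.items(), key=lambda item: abs(item[1]), reverse=True
)

print(total)
for name, contribution in ranked_factors:
    print(name, contribution)
```

With a breakdown like this, an employer can see which factors drove a ranking, and a reviewer can spot a questionable one, such as the penalty for an employment gap here, which could be a source of bias.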


In part 2 of the Transparency Problem, learn about one methodology Vettd developed for approaching the issue of transparency.

About Vettd

One of the biggest economic data challenges of our time is this: How can organizations become more competitive by better leveraging technology to identify skill gaps and the star talent to fill them?

Vettd recognized how this challenge creates massive inefficiencies throughout the HR process. Only by understanding the true value of applicants and employees, at scale, can talent management ever be aligned with the goals of the organization. 

We founded Vettd to solve this problem using artificial intelligence. Our talent classification approach quickly distinguishes star talent qualities that are impossible for humans to recognize. By leveraging deep learning applied to natural language processing, we can help organizations interpret masses of profiles and understand the value of individuals.

Vettd’s AI-driven talent classification is a quantum leap improvement in the human resource decisions that will affect the future of your organization.
