Is AI Fair? A Simple Guide to Understanding Bias and Privacy

Today, I want to talk about a topic that has rapidly moved from geek forums to newspaper front pages and government agendas. We'll be discussing the ethics of artificial intelligence.

We'll talk about why a "smart" algorithm might deny you a loan because of your zip code, why an autopilot system struggles to recognize pedestrians with darker skin tones, and why your gender could be the reason your resume never even reaches a human's desk.

This isn't science fiction or horror stories. This is a reality we've already encountered. I've analyzed dozens of discussions, technical reports, and real-world cases to delve into the two main demons of modern AI: bias and privacy. And, most importantly, to understand how we plan to combat them.

[Image: Automated Bouncer]

The Machine Said "No"

Let's start with a story that has already become a classic example. This isn't an internet anecdote, but a documented failure of a multi-million dollar project by one of the world's largest corporations — Amazon.

In the mid-2010s, the company set an ambitious task for its engineers: to create an AI recruiter. The idea was brilliant. A machine, devoid of human biases, would impartially analyze hundreds of thousands of resumes and select the best of the best. No nepotism, no favoritism, no fatigue. Just pure data and objective evaluation.

The team spent several years on development. They "fed" the neural network a giant archive of resumes the company had received over the past 10 years, showing it which candidates were ultimately hired and became successful employees. The AI was supposed to learn success patterns and apply them to screen new applicants.

And it learned. Too well, in fact. When the system was launched in test mode, engineers noticed a troubling pattern: the algorithm systematically "rejected" resumes from women, especially for technical roles.

How could this happen? Was the machine taught to be sexist? No, it's far more prosaic and dangerous. The AI simply analyzed historical data and made a logical conclusion, from its perspective: since in the past most successful candidates for technical positions were men, then "male" resumes were a sign of success. The algorithm began to penalize resumes that contained words associated with women (for example, the name of a women's college or the word "women's" in the context of "captain of a women's chess team").

To better understand this logic, imagine a bouncer at an elite club. He was given a very simple, but flawed, instruction: "Don't let anyone in who doesn't look like our regular guests." Since the club's regulars historically were men in business suits, the bouncer automatically starts rejecting anyone who doesn't fit this image. He's not a misogynist and holds no personal animosity. He's simply perfectly executing the rule given to him.

Amazon spent years and millions of dollars to create an objective recruiter, only to end up with an automated "glass ceiling." The project had to be shelved. But, according to the data, this isn't an isolated glitch but a systemic issue. A recent study by the University of Washington, which analyzed the performance of three modern LLMs on 550 real resumes, revealed shocking figures: in 85% of cases, algorithms favored resumes with names associated with white men. Female names were preferred in only 11% of cases, and Black male names never outperformed white male names. This is particularly alarming given that, as of 2024, 62% of Australian organizations already use AI in recruiting, and in the US, approximately 99% of Fortune 500 companies employ some form of hiring automation.

[Image: Genius Chef and Outdated Recipe]

Bad Textbook for a Genius Student

When we hear the phrase "biased AI," our imagination conjures up some evil racist robot from a Hollywood movie. But the reality, as I've already said, is far more prosaic. And that's precisely why it's far more dangerous.

Artificial intelligence, by its nature, is neither "evil" nor "good." It is an incredibly diligent and capable student. But it has one characteristic: it learns only from the materials we provide it. And if these materials — the "textbooks" — are full of the errors, stereotypes, and injustices of our world, then the AI will learn precisely those. And it will reproduce them with the soulless efficiency of a machine.

Imagine AI as a genius chef. It can master any culinary technique in seconds and prepare a dish with perfect precision. But there's one problem: it was taught to cook from a cookbook written by Julia Child in the 1960s. It will skillfully prepare any dish from that book for you, but all of them will reflect the gastronomic tastes, dietary norms, and social realities of that era. It doesn't "want" to cook outdated, fatty, and not very diverse food. It simply doesn't know how else to cook. It had no other textbooks.

The problem with Amazon's AI recruiter wasn't that it hated female programmers. The problem was that in the data it was trained on, there were ten times fewer of them than men. For the AI, this wasn't sexism; it was cold, irrefutable statistics. A pattern to be reproduced.

And this brings us to the core of the problem, which the community calls algorithmic bias. AI does not create prejudices. It finds them in our data, amplifies them, and automates them.
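
To make this mechanism concrete, here is a minimal sketch in Python (the data, the "proxy" feature, and all the numbers are invented for illustration; this is not Amazon's system or any real dataset). A plain logistic regression trained on historical hiring decisions that happened to penalize a gender-correlated attribute dutifully learns a negative weight for that attribute, even though it carries no information about competence:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Hypothetical historical resumes: a binary proxy feature that merely
# correlates with gender (say, "mentions a women's chess team"), plus a
# genuine skill signal. Past hiring decisions penalized the proxy.
proxy = rng.binomial(1, 0.2, n)        # 1 = resume contains the proxy term
skill = rng.normal(0.0, 1.0, n)        # actual qualification signal
hired = (skill + rng.normal(0.0, 1.0, n) - 1.5 * proxy) > 0.5

X = np.column_stack([skill, proxy])
model = LogisticRegression().fit(X, hired)

# The model reproduces the historical bias: a clearly negative weight on a
# feature that says nothing about competence.
print("skill weight:", round(model.coef_[0][0], 2))  # positive, as expected
print("proxy weight:", round(model.coef_[0][1], 2))  # negative: learned bias
```

The model never sees a "gender" column at all. It simply reproduces the statistical shadow that gender cast on the historical labels, which is exactly what algorithmic bias looks like in practice.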

[Image: Digital Echo Chamber]

Two Horsemen of the Apocalypse: Bias and Privacy

Analyzing discussions on this topic, I saw that the entire tangle of problems breaks down into two large, closely interconnected parts. These are bias, which we've already started discussing, and privacy — its dark and complex counterpart. Let's examine them in order.

The Echo Chamber of Our Past

Data bias is not just an abstract problem. Today, it is already producing dangerous and unfair systems that operate in a vicious cycle.

Here are a few real examples that frequently emerge in discussions:

  • Predictive Policing. In the U.S., systems like PredPol (now Geolitica) were used to predict crimes. But an investigation by The Markup showed their catastrophic ineffectiveness. Out of 23,631 predictions made by the system for the Plainfield police in 2018, fewer than half a percent were successful. The accuracy of robbery predictions was only 0.6%, and of burglaries a paltry 0.1%. No wonder Police Captain David Guarino eventually stated: "I don't know what it [PredPol] gave us... We ended up getting rid of it." The department paid $20,500 for virtually useless software, which also risked increasing patrols in already stigmatized areas.
  • Medical Diagnostics. Many algorithms for detecting skin cancer were trained on data where patients with light skin predominated. A study of the HAM10000 dataset showed that the DenseNet121 model, while achieving an overall accuracy of 91.9%, significantly underperformed on darker skin types. For the darkest group, accuracy was only 78.9% — a drop of more than 13 percentage points. The problem lies in the source data: an analysis of 21 datasets with over 100,000 images revealed that out of nearly 2500 images specifying skin color, only 10 belonged to people with brown skin and only one to those with dark brown or black skin.
  • Credit Scoring. Banking AI can lower your credit score because of your zip code. The problem is serious enough that regulators have taken notice. For example, the Bank of Russia published a report in November 2023 outlining approaches to regulating such models. In particular, it proposed obliging banks to check their models for correlations with gender, age, and region, and to provide the client with a clear explanation for a denial. This is a direct response to the risk that AI might reproduce historical injustice by denying loans to residents of "disadvantaged" areas.
  • Facial Recognition. Testing by the U.S. National Institute of Standards and Technology (NIST) found that facial recognition error rates can be 10 to 100 times higher for certain demographic groups. False positives were more frequent for Black individuals and people of East Asian descent, as well as for women compared to men. Under ideal conditions, accuracy can exceed 99%, but in real-world conditions it can drop to 36-87%.

In all these cases, AI acts like a giant digital echo chamber. We shout our old prejudices into it, and it returns them to us — amplified and stamped with "Data Confirmed."
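
How is this caught in practice? The usual first step is the kind of per-group audit NIST performs: compute the same error metric separately for each demographic group and compare. Here is a minimal sketch, with made-up predictions and group labels purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Hypothetical audit set: true labels, model predictions, and a demographic
# attribute collected only for the purposes of the audit.
group = rng.choice(["A", "B"], size=n, p=[0.8, 0.2])
y_true = rng.binomial(1, 0.1, n)

# Simulate a model that produces more false positives for group B.
fp_rate = np.where(group == "B", 0.15, 0.03)
y_pred = np.where(y_true == 1, 1, rng.binomial(1, fp_rate))

for g in ("A", "B"):
    negatives = (group == g) & (y_true == 0)
    fpr = y_pred[negatives].mean()     # false positive rate within the group
    print(f"group {g}: false positive rate = {fpr:.3f}")
```

If one group's false positive rate is several times higher than another's, the "Data Confirmed" stamp is not worth much, no matter how impressive the overall accuracy looks.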

Dr. House's Dilemma

At first glance, the solution seems obvious: if the problem lies in "crooked" data, we need to "straighten" it. But here we run into the second horseman of the apocalypse — privacy.

Let me present you with a difficult choice. To make an AI recruiter fair, it needs to know the gender and race of candidates in order to track and correct imbalances. But do you want this sensitive information to be collected and stored, as a matter of course, in the database of every company you send your resume to?

This is what I would call "Dr. House's Dilemma." To make an accurate diagnosis, the brilliant doctor needs to know everything about you, and you trust him with that data for your own good. Now imagine that this "doctor" is a faceless corporation whose goals are not entirely clear to you.

Adding to this dilemma is a new headache: so-called Shadow AI. According to recent reports, at 77% of companies employees use AI tools without the knowledge or approval of the IT department, and the use of such applications grew by 156% over the year. Employees paste confidential corporate data into public chatbots, creating colossal risks. It's no surprise that Gartner predicted that over $100 billion of AI spending in 2024 would go towards risk reduction and security.

Here lies the main paradox: to protect people from discrimination, we must classify them. And while we ponder how to do this ethically, our data leaks through uncontrolled channels.

[Image: Designing Safe AI]

Brakes and Airbags for AI

When delving into these problems, it's easy to fall into despair. Fortunately, there is no need to. Right now, the IT industry is going through a very important maturation phase.

By way of analogy, AI development can be compared to car manufacturing in the early 20th century. At first, everyone was thrilled with the speed. Then they realized they urgently needed brakes, seatbelts, and airbags. Today, the exact same thing is happening in the world of AI. We are moving from the "let's make the model more powerful" stage to the "let's make it safe and fair" stage. This new approach has been termed Responsible AI.

And this is a global trend. In Russia, for example, as early as 2021, leading tech companies, including Sber, Yandex, VK, and Gazprom Neft, signed the national "AI Ethics Code", committing to strive to ensure that their systems do not give some groups undue advantages at the expense of others. By April 2024, over 360 Russian organizations had joined the code.

Moreover, the industry is moving from declarations to practice. In April 2025, Sberbank presented Russia's first comprehensive threat model for AI systems, describing 70 potential threats for generative and predictive AI. This document, based on global best practices (OWASP, MITRE, NIST), has become a practical tool for vulnerability assessment, accessible to any company.

"Responsible AI" is a whole set of practices and tools:

  1. Algorithm Audits. A new role of "AI auditor" is emerging: specialists who deliberately hunt for bias and test a system's behavior on different population groups.
  2. Ethics Committees and "Red Teams." Internal teams whose job is to ask uncomfortable questions and probe the system for weaknesses before a product is released to the market.
  3. De-biasing Techniques. Mathematical methods for "cleaning" training data or adjusting the behavior of an already trained model (see the sketch after this list).
  4. Transparency and Explainability (Explainable AI, XAI). Creating systems that can explain why they made a particular decision. "I denied the loan because...", rather than just "Deny. Error code: 734."
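
To give points 3 and 4 a bit of substance, here is a minimal sketch with synthetic loan data and invented feature names. It uses a deliberately simplified version of the classic "reweighing" pre-processing idea (not any particular library's implementation): training samples are reweighted so that group membership and historical outcomes are decoupled, and the decision comes back with a human-readable reason rather than a bare error code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 4000

# Hypothetical loan data: income is a legitimate signal, "group" is a
# protected attribute that historical approvals were biased against.
group = rng.binomial(1, 0.3, n)                   # 1 = historically disadvantaged group
income = rng.normal(50 + 5 * (group == 0), 10, n)
approved = (income + rng.normal(0, 5, n) - 8 * group) > 50  # biased historical labels

features = income.reshape(-1, 1)  # the model sees income only; group is used solely for the weights

# De-biasing by reweighting (simplified "reweighing"): give each combination
# of group and outcome the weight it would have if they were independent.
weights = np.ones(n)
for g in (0, 1):
    for y in (0, 1):
        cell = (group == g) & (approved == y)
        expected = (group == g).mean() * (approved == y).mean()
        weights[cell] = expected / cell.mean()

model = LogisticRegression().fit(features, approved, sample_weight=weights)

# A crude "explainable" answer instead of "Deny. Error code: 734".
applicant = np.array([[42.0]])
prob = model.predict_proba(applicant)[0, 1]
print(f"Approval probability: {prob:.2f}")
print(f"Explanation: the decision is driven by income (model weight "
      f"{model.coef_[0][0]:+.2f}); the applicant's income of {applicant[0, 0]:.0f} "
      f"is below the training average of {income.mean():.0f}.")
```

Dedicated toolkits such as IBM's AI Fairness 360 implement this and far more sophisticated techniques, but the principle is the same: measure the imbalance, correct for it explicitly, and be able to say why a decision was made.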

The question "Is this algorithm ethical?" should become as natural for an engineer as the question "Will this bridge withstand the load?" is for a builder. This is no longer abstract philosophy. It's a new safety protocol.

[Image: AI as Fire - A Tool in Human Hands]

Don't Be Afraid, But Be Vigilant

So what's the conclusion? Is AI fair? After all I've analyzed, my answer is this: the question itself is fundamentally flawed. Asking if AI is fair is like asking if a hammer is fair. Artificial intelligence is not a moral agent. It is the most honest and ruthless mirror humanity has ever created.

It doesn't flatter or lie. It impassively reflects data on whom we have historically hired, to whom we have granted loans, and in which areas people were more frequently arrested. It shows us the imprints of our own systemic errors and unconscious biases, amplified to industrial scales. And if we don't like the reflection, the problem isn't with the mirror.

The real challenge isn't teaching AI not to be sexist or racist. It's about stopping the supply of data from which such conclusions can be drawn. All the ethical codes, audits, and de-biasing techniques we've discussed are, in essence, our first attempts not to smash the mirror in anger, but to learn to work with the reflection: to clean it in some places, adjust the focus in others, and in some, acknowledge that we need to change the reality it reflects.

AI doesn't ask us about its future. It poses a question about our present. Looking into this digital mirror, are we ready not just to be horrified by the reflection, but to start changing what it shows?

Stay curious.
