In 2016, Microsoft released its chatbot Tay onto Twitter to engage in "playful conversation" with users. In less than 24 hours, Tay began spouting racist and sexist comments. More recently, judges in U.S. courtrooms have increasingly used algorithms (instead of cash bail) to predict which criminal defendants will flee or commit another crime, even though a 2016 ProPublica investigation found that such algorithms may be biased against black defendants.

Tech companies have made big advances in building artificially intelligent software that gets smarter over time and potentially makes life and work easier. But these examples reveal an uncomfortable reality about A.I.: even the most intelligent bots can be biased.

Who is building the robots--and how--will only become more important questions in the future. In 2016, sales of consumer robots reached $3.8 billion. They're expected to reach $13.2 billion by 2022, according to market research firm Tractica. Add to that the coming wave of self-driving cars and virtual assistants in the workplace, and you see a future in which A.I. will play an ever bigger role in the culture and the economy.

"One source of urgency is simply due to the fact that we're about to experience a proliferation of robots in society," says Jim Boerkoel, a robotics professor at Harvey Mudd. 

Boerkoel says removing all biases from robots is an enormous and difficult task. Bots and A.I. are built by human beings who have implicit biases. Even if you could design an A.I. algorithm to be completely agnostic to race, gender, religion, and orientation, "[for robots] to be effective, they will inevitably need to learn from experience, either through interactions with the world or from data provided to them," he says.

A number of startups are working on this problem. Their goal? Minimize bias problems from the get-go by designing A.I. systems that reflect a broader range of human experiences.

The invisible human hand

Founded in 2009 at the M.I.T. Media Lab, Affectiva makes software that claims to detect and understand human emotions and facial expressions just by scanning a person's face. The software's applications vary widely: Learning apps have employed it to better understand how students use them, Giphy uses it to tag GIFs with emotions, and global research firms use it to measure audience responses to TV ads and movie trailers. Recently, Affectiva began shipping software to auto companies that can monitor a driver's face and emotions to create a better car experience. For example, if the car detects that you're tired or frustrated, it may suggest that you skip the stop at the grocery store and go tomorrow instead.

The company says that avoiding bias in its algorithms starts with accumulating an enormous set of diverse data. To that end, Affectiva says that it has analyzed over six million faces in 87 countries--a process that involves hundreds of millions of different data points, says Affectiva director of applied A.I. Jay Turcot. The optical sensors that work with Affectiva's software to capture details of facial expressions, such as the movements of an eyebrow or the corners of a mouth, are placed in the background in an effort to analyze people acting naturally.

Such heavy data collection requires a team of human annotators, who help feed the algorithms by manually tagging what they see in the data. This process helps Affectiva's scientists, like Turcot, look for potential biases hidden in the algorithms. For instance, Turcot says that their data showed that in everyday conversations women tend to laugh more than men, a pattern that could perpetuate gender bias if the software learned it. If the human annotators didn't make sure to balance the data appropriately, the algorithm might inadvertently conclude that laughter is essentially a "woman's thing"--which is not so helpful for software that aims to read people's emotions to provide a better driving experience.
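To make the idea of "balancing the data" concrete, here is a minimal Python sketch of one common approach: downsampling so that every combination of group and label appears equally often. It is an illustration only, not Affectiva's actual pipeline; the function name `balance`, the field names `group` and `label`, and the toy counts are all assumptions invented for this example.

```python
import random
from collections import defaultdict


def balance(samples, seed=0):
    """Downsample so every (group, label) cell that appears has equal size.

    Hypothetical helper for illustration; real pipelines may reweight
    examples instead of discarding them.
    """
    cells = defaultdict(list)
    for sample in samples:
        cells[(sample["group"], sample["label"])].append(sample)

    # The smallest cell sets the size every other cell is cut down to.
    target = min(len(rows) for rows in cells.values())

    rng = random.Random(seed)  # seeded so the sketch is repeatable
    balanced = []
    for key in sorted(cells):
        balanced.extend(rng.sample(cells[key], target))
    return balanced


# Made-up toy data in which laughter is over-represented among women.
samples = (
    [{"group": "women", "label": "laugh"}] * 60
    + [{"group": "women", "label": "neutral"}] * 40
    + [{"group": "men", "label": "laugh"}] * 30
    + [{"group": "men", "label": "neutral"}] * 70
)

balanced = balance(samples)
laughs_by_group = {
    group: sum(1 for s in balanced if s["group"] == group and s["label"] == "laugh")
    for group in ("women", "men")
}
```

After balancing, laughter is equally common in both groups, so a model trained on this data has no statistical reason to tie laughing to gender. The trade-off is that downsampling throws data away, which is one reason a very large and diverse starting set matters.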

A robot that's anything you want it to be

It's not just the data powering the A.I. that can be biased. The physical design of robots can reflect certain prejudices as well--particularly when engineers anthropomorphize them, says Boerkoel. Consider, for example, the sheer number of bot assistants that are given female names, voices, and (in some cases) bodies: Siri, Alexa, and Sophia are just a few examples.

Furhat Robotics doesn't want to determine the gender--or for that matter, species--of its robots for you. The Stockholm-based social A.I. robotics startup aims to build robots that use language and gestures to converse with people in a natural manner. The robot comes as a head and stand (and optional fur hat). Furhat says the robot learns how to be more conversational by speaking with humans; it is equipped with microphones and cameras, and it uses machine learning to convert the speech it picks up into text.

The signature feature of Furhat is its "projection mask," a plastic mask that can look like a man, woman, animal, or even a Disney-inspired avatar, with the help of computer animation. Having secured $2.5 million in seed funding last September, Furhat has partnered with such companies as Honda, Intel, Disney, and KPMG, which use the robot for various social purposes: a job interview trainer, a robot that tells stories to kids, a conversation tool for the elderly. Furhat will be piloting robots in March at Frankfurt Airport to communicate with international travelers.

"Once [the founders] had this tool, what made this special was the ability to represent any human or non-human or any gender and that started becoming a founding philosophy of the company," says Furhat's senior business developer Joe Mendelson. 

Robots that are specialists instead of generalists

Another potential way to limit bias in A.I. systems is to narrow the focus of what they're designed to do. One New York-based startup offers a good example of a bot with a single-minded purpose. It created an A.I. virtual assistant--which goes by the name of Amy or Andrew--that can manage your schedule: You ping it on Slack or CC it on an email and the bot can arrange meetings, send out calendar invites, and handle any rescheduling or cancellations. The assistant is fully autonomous, so it doesn't need additional input or control after being asked to do a job.

Founder Dennis Mortensen argues that, in the future, the most effective virtual assistants will "do one thing and one thing very well."

"I'm not convinced that we'll end up with a single A.I. or one of the personal assistants, whether it'll be Google Assistant or Siri, that will be an entity that can answer all your questions--that just doesn't sound realistic," he says.

He suggests that in the same way that there isn't just one app on your phone that can do all things, virtual assistants shouldn't be expected to overstep their job descriptions. Amy and Andrew are designed to care only about dates, times, locations, and names of people. The algorithm simply doesn't process "bad" input--for instance, racist or sexist language--Mortensen says.
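A narrow scope like this can be pictured as an allowlist: the program pulls out only the fields it was built to handle and never looks at anything else. The Python sketch below is a rough illustration of that idea, not a description of how Amy and Andrew actually work; the function `extract_scheduling_fields`, the two patterns, and the sample message are all hypothetical, and a real assistant would need far richer language handling.

```python
import re

# Assumed, deliberately tiny patterns: clock times and day words only.
TIME_PATTERN = re.compile(r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.IGNORECASE)
DAY_PATTERN = re.compile(
    r"\b(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday"
    r"|today|tomorrow)\b",
    re.IGNORECASE,
)


def extract_scheduling_fields(message):
    """Return only the day and time mentions; all other words are dropped."""
    return {
        "days": [m.group(0).lower() for m in DAY_PATTERN.finditer(message)],
        "times": [m.group(0).lower() for m in TIME_PATTERN.finditer(message)],
    }


# The insult never reaches the output because no pattern matches it.
fields = extract_scheduling_fields(
    "You are an idiot, but let's meet Tuesday at 3:30 pm."
)
```

Because the rest of the message is never stored or learned from, hostile language has no path into the assistant's behavior. The cost is that the bot cannot respond to anything outside its job description.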

A $23 million investment led by Two Sigma Ventures in 2016 has allowed the company to build its dataset from scratch. The company has about 70 A.I. trainers--who make up two-thirds of the startup's team--who assemble and label data that comes from email conversations with Amy and Andrew. Despite the female and male distinction, the virtual assistants behave in exactly the same way. And as with most A.I. systems, nuanced language is still an ongoing challenge for the algorithms. (If someone sends a note at 12:30 a.m. and says she's free "tomorrow," does she really mean today?)
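The "tomorrow" puzzle is easy to state in code, even if the right answer is not. The Python sketch below shows one possible heuristic, offered purely as an assumption for illustration: treat a note sent in the small hours as if the sender's day had not yet ended. The function `resolve_tomorrow` and the 4 a.m. cutoff are invented here and are not drawn from the company's system.

```python
from datetime import datetime, timedelta

# Assumed cutoff: notes sent before 4 a.m. count as "still last night."
LATE_NIGHT_CUTOFF_HOUR = 4


def resolve_tomorrow(sent_at, cutoff_hour=LATE_NIGHT_CUTOFF_HOUR):
    """Guess which calendar date the sender means by "tomorrow"."""
    if sent_at.hour < cutoff_hour:
        # Sent after midnight but before the cutoff: the sender probably
        # means the calendar day that has technically already begun.
        return sent_at.date()
    return sent_at.date() + timedelta(days=1)


late_night = resolve_tomorrow(datetime(2018, 3, 6, 0, 30))  # sent at 12:30 a.m.
afternoon = resolve_tomorrow(datetime(2018, 3, 6, 14, 0))   # sent at 2 p.m.
```

Any fixed cutoff will be wrong for some senders, such as night-shift workers or travelers in another time zone, which is part of why this kind of language remains hard for algorithms.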

Despite the progress being made, bias will likely continue to creep into technology in ways that tech companies haven't even thought of yet. 

"I'd argue that it is at least as hard as trying to fix implicit biases within ourselves and society," Boerkoel says. "While implicit bias training can help designers of technology attempt to keep biases in check, it is impossible to be truly blind to all of the ways a culture has shaped our views about gender, race, and religion."