Google Translate Has Dinosaur Ideas About Gender Roles
Here's a riddle: A boy and his dad get into a horrible car accident. The father dies on the spot, the boy is in critical condition. When brought into the ER, the surgeon exclaims: "I cannot perform the operation. That's my son!" Also, the surgeon is smoking a pipe and fixing a pothole in front of the hospital. How is this possible? If you can't figure out that the surgeon is, of course, the boy's female woman-mother, you might be a victim of institutional sexism. Or someone who really sucks at context clues. Or you're Google trying to translate this riddle into Italian, which means you're both.
Despite some languages' obsession with slapping sex chromosomes on everything from an airbag to a zucchini (looking at you, Hindi), a majority of human languages do not use gendered words, not even pronouns. This poses a unique problem for AI translators such as Google Translate when they have to morph a grammatically genderless language into a gendered one, forcing the bot to do the single riskiest thing one can do on the internet: assume someone's gender. And it sure seems like our future AI daddies have developed some pretty Mad Men-era presumptions about gender roles.
It turns out that Google Translate has a bad habit of systematically assigning male pronouns to sentences involving high-competence job titles like doctor or lawyer while figuring anything involving auxiliary jobs, traditional housework, or soft descriptors like "pretty" has to be referring to the fairer sex. Of course, machine translators aren't the kind of sexist that needs an HR-mandated trip to the nearest workplace harassment seminar. Like your sweet grandma who cannot comprehend what's wrong with saying "lady postman," ole Google Translate is simply a product of a different time -- that time being the entire sexist history of human civilization.
Like other artificial intelligences, online translators make use of machine learning. Instead of studying languages like a human would (by obsessively using Duolingo for five weeks and then giving up), they absorb and analyze billions of existing writing samples to get so good at matching human gibberish to other human gibberish that they can "guess" the right translation with remarkable consistency. But internalizing these human examples also means internalizing their authors' biases. And sweet gormless Google Translate doesn't realize it's being a tool of the patriarchy by clinging to dated stereotypes. All it knows is that millions of examples prove it's statistically the safest to identify lumberjacks as men, nurses as women, and attack helicopters as conservative comedians.
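For the curious, the "statistically safest bet" logic above can be boiled down to a toy sketch. This is not Google's actual system (which is a neural model trained on billions of sentences); it's a deliberately tiny, hypothetical illustration of how a skewed corpus produces a skewed pronoun guess. The `corpus` data and the `guess_pronoun` function are made up for demonstration:

```python
from collections import Counter

# Toy stand-in for a training corpus: (occupation, pronoun) pairs.
# The skew IS the point -- the data itself over-represents
# "doctor ... he" and "nurse ... she", so the model inherits that bias.
corpus = [
    ("doctor", "he"), ("doctor", "he"), ("doctor", "she"),
    ("nurse", "she"), ("nurse", "she"), ("nurse", "he"),
]

def guess_pronoun(occupation: str) -> str:
    """Pick whichever pronoun co-occurs most often with the occupation
    in the corpus -- the 'statistically safest' choice, bias included."""
    counts = Counter(p for occ, p in corpus if occ == occupation)
    return counts.most_common(1)[0][0]
```

Run on this corpus, `guess_pronoun("doctor")` comes back "he" and `guess_pronoun("nurse")` comes back "she" -- not because either is correct, but because that's what the numbers say.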
Of course, Google has been thoroughly informed that its AI is spouting more institutionalized sexism in a day than an entire construction crew can manage in a lifetime. But how to solve this ancient language problem? Machine translators could accept the future of writing and, when in doubt, switch to a gender-neutral pronoun. But since not all languages have easy access to a "they," and some have gender so deeply ingrained that going neutral would mean neutering their entire grammar, this fix would require a significant overhaul of how machine translators handle language -- something Google isn't interested in doing. Instead, they've been busy as male bees making sure every language that needs it gets two simultaneous translations, one with "he's" and one with "she's." But not only will this process take decades and merely "reduce" gender bias, it'll leave the AI wholly unprepared for the upcoming age of subjects who don't identify with either established gender or prefer to be referred to by non-traditional pronouns. At that rate, there's a good chance that Google Translate will be able to pass the Turing Test before it can pass the Bechdel Test.
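The "show both translations" workaround can also be sketched in a few lines. Again, purely hypothetical: the `PRONOUNS` table and the `{pronoun}` template slot are assumptions for illustration, not how Google's pipeline actually represents translations.

```python
# Hypothetical sketch of the dual-output workaround: rather than
# committing to one gender, emit every gendered variant of the sentence.
PRONOUNS = {"en": ["he", "she"]}

def translate_all_variants(template: str, target_lang: str = "en") -> list:
    """Fill a translated template containing a {pronoun} slot once per
    gendered pronoun in the target language, returning all variants."""
    return [template.format(pronoun=p) for p in PRONOUNS[target_lang]]

# translate_all_variants("{pronoun} is a doctor")
# → ["he is a doctor", "she is a doctor"]
```

Note the obvious scaling problem the article points out: every pronoun added to the table multiplies the output, and any pronoun missing from it simply doesn't exist as far as the translator is concerned.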
For more untranslatable tangents, do follow Cedric on Twitter.
Top Image: Google