This article was written by Kate Crawford & Vladan Joler. Below is an extract featuring the first three sections of this long article (21 sections in total). A link to the full article is provided at the bottom.
The Amazon Echo as an anatomical map of human labor, data and planetary resources.
A cylinder sits in a room. It is impassive, smooth, simple and small. It stands 14.8 cm high, with a single blue-green circular light that traces around its upper rim. It is silently attending. A woman walks into the room, carrying a sleeping child in her arms, and she addresses the cylinder.
‘Alexa, turn on the hall lights’
The cylinder springs into life. ‘OK.’ The room lights up. The woman makes a faint nodding gesture, and carries the child upstairs.
This is an interaction with Amazon’s Echo device. A brief command and a response is the most common form of engagement with this consumer voice-enabled AI device. But in this fleeting moment of interaction, a vast matrix of capacities is invoked: interlaced chains of resource extraction, human labor and algorithmic processing across networks of mining, logistics, distribution, prediction and optimization. The scale of this system is almost beyond human imagining. How can we begin to see it, to grasp its immensity and complexity as a connected form? We start with an outline: an exploded view of a planetary system across three stages of birth, life and death, accompanied by an essay in 21 parts. Together, this becomes an anatomical map of a single AI system.
The scene of the woman talking to Alexa is drawn from a 2017 promotional video advertising the latest version of the Amazon Echo. The video begins, “Say hello to the all-new Echo” and explains that the Echo will connect to Alexa (the artificial intelligence agent) in order to “play music, call friends and family, control smart home devices, and more.” The device contains seven directional microphones, so the user can be heard at all times even when music is playing. The device comes in several styles, such as gunmetal grey or a basic beige, designed to either “blend in or stand out.” But even the shiny design options maintain a kind of blankness: nothing will alert the owner to the vast network that subtends and drives its interactive capacities. The promotional video simply states that the range of things you can ask Alexa to do is always expanding. “Because Alexa is in the cloud, she is always getting smarter and adding new features.”
How does this happen? Alexa is a disembodied voice that represents the human-AI interaction interface for an extraordinarily complex set of information processing layers. These layers are fed by constant tides: the flows of human voices being translated into text questions, which are used to query databases of potential answers, and the corresponding ebb of Alexa’s replies. For each response that Alexa gives, its effectiveness is inferred by what happens next:
Is the same question uttered again? (Did the user feel heard?)
Was the question reworded? (Did the user feel the question was understood?)
Was there an action following the question? (Did the interaction result in a tracked response: a light turned on, a product purchased, a track played?)
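The three questions above can be read as a simple feedback rule. What follows is only an illustrative sketch, not a description of how Amazon's systems actually work: the `Turn` record, the `infer_signal` function, the word-overlap measure and the 0.5 threshold are all assumptions introduced here to make the logic concrete.

```python
import string
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    """One hypothetical user utterance and the tracked action it triggered, if any."""
    text: str
    action: Optional[str] = None


def _tokens(text: str) -> list:
    # Lowercase and strip ASCII punctuation so "Alexa," matches "alexa".
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def infer_signal(current: Turn, following: Optional[Turn], threshold: float = 0.5) -> str:
    """Guess what the next turn says about how well `current` was handled."""
    if following is not None:
        a, b = _tokens(current.text), _tokens(following.text)
        if a == b:
            return "repeated"  # same question uttered again
        union = set(a) | set(b)
        overlap = len(set(a) & set(b)) / len(union)  # Jaccard word overlap
        if overlap >= threshold:  # assumed cut-off for "reworded"
            return "reworded"
    if current.action is not None:
        return "action_followed"  # a light turned on, a track played
    return "no_signal"


print(infer_signal(Turn("Alexa, turn on the hall lights"),
                   Turn("Alexa, switch on the hall lights")))
```

A real system would presumably rely on far richer signals than word overlap; the point is only that each follow-up behavior can be turned into a label that feeds back into training.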
With each interaction, Alexa is training to hear better, to interpret more precisely, to trigger actions that map to the user’s commands more accurately, and to build a more complete model of their preferences, habits and desires. What is required to make this possible? Put simply: each small moment of convenience – be it answering a question, turning on a light, or playing a song – requires a vast planetary network, fueled by the extraction of non-renewable materials, labor, and data. The scale of resources required is many orders of magnitude greater than the energy and labor it would take a human to operate a household appliance or flick a switch. A full accounting for these costs is almost impossible, but it is increasingly important that we grasp the scale and scope if we are to understand and govern the technical infrastructures that thread through our lives.
The Salar de Uyuni, the world's largest flat surface, is located in southwest Bolivia at an altitude of 3,656 meters above sea level. It is a high plateau, covered by a few meters of salt crust which is exceptionally rich in lithium, containing 50% to 70% of the world's lithium reserves. The Salar, alongside the neighboring Atacama regions in Chile and Argentina, is a major site for lithium extraction. This soft, silvery metal is currently used to power mobile connected devices, as a crucial material in the production of lithium-ion batteries. It is known as ‘grey gold.’ Smartphone batteries, for example, usually have less than eight grams of this material. Each Tesla car needs approximately seven kilograms of lithium for its battery pack. All these batteries have a limited lifespan, and once consumed they are thrown away as waste. Amazon reminds users that they cannot open up and repair their Echo, because this will void the warranty. The Amazon Echo is wall-powered, and also has a mobile battery base. This battery base also has a limited lifespan, after which it must be thrown away as waste.
According to the Aymara legends about the creation of Bolivia, the volcanic mountains of the Andean plateau were creations of tragedy. Long ago, when the volcanos were alive and roaming the plains freely, Tunupa – the only female volcano – gave birth to a baby. Stricken by jealousy, the male volcanos stole her baby and banished it to a distant location. The gods punished the volcanos by pinning them all to the Earth. Grieving for the child that she could no longer reach, Tunupa wept deeply. Her tears and breast milk combined to create a giant salt lake: Salar de Uyuni. As Liam Young and Kate Davies observe, "your smart-phone runs on the tears and breast milk of a volcano. This landscape is connected to everywhere on the planet via the phones in our pockets; linked to each of us by invisible threads of commerce, science, politics and power."
To read the full article, with illustrations, click here.