Depth Sensing: The Connotations of Body Data and the Microsoft Kinect
by Chaz Evans (School of the Art Institute of Chicago, DePauw University)
April 25, 2014 – 09:52
New inventions are often received in terms of the science fiction texts that foreshadowed them. This was the case when Microsoft released the Xbox Kinect camera in 2010. The Kinect is a depth sensor that hooks up to your home entertainment system and converts the appearance and motion of the user’s body and physical surroundings into spatial data that your Xbox can interpret. In this it resembled the “telescreen” from George Orwell’s 1984, a compulsory entertainment device that observed Orwell’s dystopian subjects in their own homes. The connection was not lost on voices in blogging and the popular press, who characterized the Kinect as a machination of “Big Brother.”
The telescreen from a film adaptation of 1984 (Virgin Films, 1984)
At the same time, the device was received as an example of great progress in Human-Computer Interaction (HCI), ushering in a new era of relatively inexpensive “natural interfaces” that would free users of personal devices from the tyranny of the keyboard, mouse, or other hand-held controller. This characterization was advanced by other voices in the popular press as well as by Microsoft.
After the device’s proprietary drivers were reverse engineered, it was also characterized as a utopian hacker triumph. The great innovation was no longer strictly the property of Microsoft, but could work with a number of devices and applications. Soon thereafter, Microsoft reversed its proprietary stance and released a software development kit (SDK) so that outside technologists could experiment with the device in a sanctioned manner. Microsoft then characterized the device as an embrace of techno-utopian openness and experimentation.
Throughout this whole narrative, the Kinect was still a toy: a family-friendly play-object that resembles Disney Pixar’s Wall-E, especially when its enclosure is removed, as Greg Borenstein pointed out in Making Things See (Maker Media, 2012).
A Kinect camera without an enclosure. (Photo: iFixit)
The Kinect camera converts the shape, motion, and depth of corporeal bodies into digital information a computer can understand and reproduce. It works using a principle called “structured light.” Before receiving any light, the device first projects an infrared pattern onto the space in front of it. An infrared receiver on the device then observes how that pattern falls on objects in the space, and from the way the pattern is displaced the device estimates how far each object is from the sensor. These distances can then be combined with the image from a more standard RGB camera, giving the image that the device captures more detail. As a consumer-grade depth sensor, the Kinect is a household appliance for making digital copies of an individual’s body and personal space. There have been other video game peripherals that captured some aspect of physical motion (such as the Nintendo Power Glove or the Activator for Sega Genesis), but none that digitally recorded so much corporeal space at once, making the stakes seem much higher than with any other motion controller in history.
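For readers curious about the arithmetic behind structured light, the distance estimate can be sketched as simple triangulation: the farther away a surface is, the less a projected dot appears to shift between the projector and the infrared receiver. The snippet below is a simplified illustration, not Microsoft's implementation. The function name `depth_from_disparity` and the default focal length and baseline are hypothetical placeholder values chosen for the example, and the shift is assumed to be measured against where the dot would appear on an infinitely distant surface.

```python
def depth_from_disparity(disparity_px, focal_length_px=580.0, baseline_m=0.075):
    """Estimate depth in metres from the apparent shift of a projected dot.

    All three parameters are illustrative assumptions, not device specifications:
      disparity_px    -- how far the dot appears shifted, in pixels
      focal_length_px -- focal length of the infrared receiver, in pixels
      baseline_m      -- distance between projector and receiver, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    # Similar triangles: depth = focal length * baseline / disparity.
    return focal_length_px * baseline_m / disparity_px


# A larger shift means a closer surface.
for shift in (10.0, 20.0, 40.0):
    print(f"shift {shift:5.1f} px -> depth {depth_from_disparity(shift):.3f} m")
```

Repeating this estimate for every dot in the projected pattern yields a grid of distances, which is the kind of spatial data the essay describes the sensor handing to the console.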
The extreme meanings circulating upon the Kinect’s release reflect the emotional and cultural impact of these stakes. Microsoft and others described the device as “revolutionary,” but if these revolutionary narratives differ from or outright contradict each other, how do we know which revolution this device is allegedly waging? And for whom?
The narrative of what the device means continued to unfold when Microsoft revealed that the device’s second version (for the Xbox One) would always be on. After this announcement struck another chord with critics sensitive to privacy issues, Microsoft once again changed position, stating that the device could be turned off at any time, in an effort to deflect further associations with surveillance technology.
A toy, a surveillance device, a revolution in HCI: the Kinect opens up new issues of personal space as politicized space, the commodification of personal data, and the viewer/user’s physical participation as a constitutive element of a media text. What additional meanings can we expect from this device’s relatively short but unfolding history?