Augmented Reality, “Her”, and the Story of You

Her pixel art by QuickHoney

Her is a story about people-centric technology. Spike Jonze shows us a near future where it’s all about you. This is our new Augmented Reality (AR), and it’s not science fiction.

I’ve been working with AR as a PhD researcher and designer for the past decade. The second wave of AR will surpass the current gimmickry and extend our human capacities to better understand, engage with, and experience our world in new ways. It will be human-centered and help to make our lives better. Driven by the one thing that is central and unique to AR – context – our devices will be highly cognizant of our constantly changing environments, continually deciphering, translating, analyzing, and navigating to anticipate our specific needs, predicting and delivering personalized solutions with highly relevant content and experiences. Our smart devices will act on our behalf. This next wave of AR is adaptive; it is live and always on, working quietly in the background and presenting itself when necessary, with the user forever at the center. It works for you, and you alone. It knows you very well: your behaviours, your likes and dislikes, your family and friends, even your vital statistics. The next wave of AR combines elements like Artificial Intelligence (A.I.), machine learning, sensors, calm computing, and data, all to tell the unique story of you.

Meet Samantha, the world’s first intelligent operating system. Samantha is not real yet, only imagined in Jonze’s film Her; however, she gives us a glimpse of our soon-to-be-augmented life, when our devices come to learn and grow with us. Dr. Genevieve Bell, Director of Interaction and Experience Research at Intel, describes a world of computing in which we enter a much more reciprocal relationship with technology, one where it begins to look after us, anticipating our needs and doing things on our behalf. Dr. Bell’s predictions are echoed by Carolina Milanesi, Gartner’s Research Vice President. Milanesi states that by 2017, your smartphone will be smarter than you. “If there is heavy traffic, it will wake you up early for a meeting with your boss, or simply send an apology if it is a meeting with your colleague. The smartphone will gather contextual information from its calendar, its sensors, the user’s location and personal data.” Gartner’s research claims initial services will be performed “automatically” to assist with menial, time-consuming tasks, such as calendaring time-bound events or responding to mundane email messages. Confidence in outsourcing these tasks to the smartphone will build gradually, with the expectation that consumers will become more accustomed to smartphone apps and services taking control of other aspects of their lives.

Images from Intel’s video interview with Dr. Genevieve Bell: What Will Personal Computers Be Like in 2020?

Gartner calls this the era of cognizant computing and identifies its four stages as: Sync Me, See Me, Know Me, Be Me. ‘Sync Me’ and ‘See Me’ are currently occurring, with ‘Know Me’ and ‘Be Me’ just ahead, as we see Samantha perform. ‘Sync Me’ stores copies of your digital assets, which are kept in sync across all contexts and endpoints. ‘See Me’ knows where you are currently and where you have been, in both the real world and on the Internet, as well as understanding your mood and context to best provide services. ‘Know Me’ understands what you need and want and proactively presents it to you, with ‘Be Me’ as the final step, where the smart device acts on your behalf based on what it has learned. Samantha comes to know Theodore very well: with access to all of his emails, files, and other personal information, her tasks range from managing his calendar to gathering some of the love letters he ghostwrites and sending them to a publisher, acting on his behalf.

Milanesi states, “Phones will become our secret digital agent, but only if we are willing to provide the information they require.” Privacy issues will certainly come into play, as will a user’s level of comfort in sharing information. Dr. Bell observes that we will go beyond “an interaction” with technology to entering a trusting “relationship” with our devices. She reflects that a great deal of work goes into “getting goodness” out of our computing technology today and that we “have to tell it a tremendous amount.” She continues that 10 years from now, our devices will know us in a very different way, being intuitive about who we are.

Still from Spike Jonze’s film Her

The world is filled with AR markers, no longer clearly distinguishable as black-and-white glyphs or QR-code triggers; the world itself and everything in it is now one giant trackable: people, faces, emotions, voices, eye movement, gesture, heart rate, and more. The second wave of AR presents a brave new digital frontier, where the objects in our world are shape-shifting, invoked, and on-demand. This will be an era of new interaction design and user experiences in AR, moving towards natural user interfaces with heightened immediacy; we will be in the presence of the ‘thing’, more deeply immersed, yet simultaneously with both feet rooted in our physical reality. Our devices will not only get smaller, faster, and closer to (perhaps even implanted inside) our bodies; they will be smarter in how they connect with and speak to each other and to multiple sensors to present a multi-modal AR experience across all devices.

Samantha is just this. She is a universal operating system that seamlessly and intelligently connects everything in her user Theodore’s world to help him be more human.

In a telephone conversation with Intel’s Futurist Brian David Johnson, he described to me how, for decades, our relationship with technology has been based on an input-output model of command and control: if commands aren’t communicated correctly, or dare we have an accent, it breaks. Today, we are entering into intelligent relationships with technology. The computer knows you and how you are doing on any particular day and can deliver a personalized experience to increase your productivity. Johnson says this can “help us to be more human” and comments on how Samantha nurses Theodore back to having more human relationships. Johnson states that technology is just a tool: we design our tools and imbue them with our sense of humanity and our values. We have the ability to design our machines to take care of the people we love, allowing us to extend our humanity. He calls this designing “our better angels”. Johnson says the question we need to ask is, “What are we optimizing for?” The answer needs to be to make people’s lives better, and I wholeheartedly agree.

My personal hopes for the new AR are that by entering into this more intelligent relationship with technology, we are freed to get back to human relationships and to doing what we love in the real world with real people, without our heads buried in screens. There is a whole beautiful tactile reality out there that AR can help us to explore and ‘see’ better, engaging with each other in more human ways. Get ready for a smarter, more human, and augmented you.

Let’s continue the conversation on Twitter: I’m @ARstories.

READ: “Augmented Human: How Technology is Shaping the New Reality.”

Google Glass & Augmented Reality Eyewear: “Oh, the Places You’ll Go!” Defining a New Era in Visual Culture

Augmented Reality eyewear and Google’s Glass will take us to new heights, quite literally: the first sequence in the “How It Feels [through Glass]” video was shot via Glass in a hot air balloon.


It was 155 years ago, in 1858, that the first aerial photograph was taken on a balloon flight over Paris, France by artist Nadar (born Gaspard-Félix Tournachon, 1820-1910). A pioneer in the newly emerging medium of photography, Nadar also attempted underground photography, using artificial light to produce pictures of the catacombs and sewers of Paris. Nadar’s technical experiments and innovation took us, via his camera, to places that were previously inaccessible to photography, inspiring new ways of seeing and capturing our world.

AR eyewear and Glass offer this same opportunity at a time when AR is emerging as a new medium, which will give way to novel conventions, stylistic modes, and genres. Referencing Dr. Seuss’s book in the title of this article, AR also promises to transport us to wondrous, magical places we’ve yet to see.

This article is a follow up to a post I wrote a year ago posing the questions: Will Google’s Project Glass change the way we see & capture the world in Augmented Reality (AR) and what kind of new visual space will emerge?

As both a practitioner and PhD researcher specializing in AR for nearly a decade, my interests are in how AR will come to change the way we see, experience, and interact with our world, with a focus on the emergence of a new media language of AR and storytelling.

I’ve previously identified Point-of-View (PoV) as one of “The 4 Ideas That Will Change AR”, noting the possibilities for new stylistic motifs to emerge based on this principle. I’d like to revisit the significance of PoV in AR at this time, particularly with the release of Google Glass Explorer Edition. PoV, more specifically, “Point-of-Eye”, is a characteristic of AR eyewear that is beginning to impact and influence contemporary visual culture in the age of AR.


Image: Google Glass

AR eyewear like “Glass” (2013) and Steve Mann’s “Digital Eye Glass” (EyeTap) (1981) are worn in front of the human eye, serving as a camera to both record the viewer’s environment and superimpose computer-generated imagery atop the present environment. With the position of the camera, such devices present a direct ‘Point-of-Eye’ (PoE), as Mann calls it, providing the ability to see through someone else’s eyes.

AR eyewear like Glass remediates the traditional camera, aligning our eye once again with the viewfinder, enabling hands-free PoE photography and videography. Eye am the camera.

Contemporary mass-market digital photography has us forever looking at a screen as we document an event, rather than seeing or engaging with the actual event. As comedian Louis C.K. so facetiously points out, we are continually holding up a screen to our faces, blocking our vision of the actual event with our digital devices. “Everyone’s watching a shitty movie of something that’s happening 10 feet away,” he says, while the ‘resolution on the actual thing is unbelievable’.

Glass presents an opportunity where your experience in that moment is documented as is without having to stop and grab your camera. Glass captures what you are seeing as you see it through PoE, very close to how you are seeing it. Google Co-Founder Sergey Brin states, “I think this can bring on a new style of photography that allows you to be more intimate with the world you are capturing, and doesn’t take you away from it.”


Image: Recording video with Google Glass, “Record What You See. Hands Free.”

I agree with Brin; Glass will bring on new stylistic modes and conventions through PoE, which also appears to be influencing other mediums outside of AR.

Take for instance the viral Instagram series “Follow Me” by Murad Osmann featuring photographs of his hand being led by his girlfriend to some of the world’s most iconic landmarks.


Photographs by Murad Osmann, “Follow Me” series, 2013.

(Similar in style to the Google Glass video-recording visual above, of a ballerina taking and leading the viewer’s hand.)

The article “How Will Google Glass Change Filmmaking?” identifies two other examples in contemporary music videos: the viral first-person music video by Biting Elbows and the award-winning music video for Cinnamon Chasers’ song “Luv Deluxe”.

In “The Cinema as a Model for the Genealogy of Media” (2002), André Gaudreault and Philippe Marion state, “The history of early cinema leads us, successively, from the appearance of a technological process, the apparatus, to the emergence of an initial culture, that of ‘animated pictures’, and finally to the constitution of an established media institution” (14). AR is currently in a transition period from a technological process to the emergence of an initial AR culture, one of ‘superimposed pictures’, with PoE as a characteristic of the AR apparatus that will impact stylistic modes, both inside and outside the medium, contributing to a larger Visual Culture.

Gaudreault and Marion identify the key players in this process as: the inventors responsible for the medium’s appearance, camera operators for its emergence, and the first film directors for its constitution. ‘Camera operators’ around the world are beginning to contribute to AR’s emergence as a medium and, through this process, towards an articulation of a media language of AR. Mann, described as the father of wearable computing, has been a ‘camera operator’ since the 1990s. In 2013, Google Glass’s early adopter program selected 8,000 ‘camera operators’ to explore these possibilities, and Kickstarter proposals for PoE film projects, including both documentaries and dramas, have since come from directors. What new stories will the AR apparatus enable? Like cinema before it, what novel genres, conventions, and tropes will emerge in this new medium towards its constitution?

Let’s continue the conversation on Twitter: I’m @ARstories.

Google’s Project Glass & Defining a New Aesthetics in Augmented Reality

Will Google’s Project Glass change the way we see & capture the world in Augmented Reality (AR)? What kind of new visual space will emerge?

Google recently shared first-person perspective photographs and video captured directly from the Project Glass wearable AR eyewear prototype, featuring video of a Glass team member doing a backflip on a trampoline (above) and photographs that would be quite difficult to take while holding a camera in your hands. In a presentation at the Google+ Photographer’s Conference, Tech Lead Max Braun stated, “I think this can bring on a new style of photography that allows you to be more intimate with the world you are capturing.” Braun also pointed out how Glass is a connected device and how the moments you capture can be immediately shared. While going through some of the photographs team members had taken using Glass, he noted, “Some of the shots make you really feel like you’re there.” He referred to Glass as “an evolution of cell phone photography. It’s the next step of the camera that’s always with you.”

Having worked with AR as both a PhD researcher and designer for the past 7 years, my interests are in how AR, as a new medium, will come to change the way we see, experience, and interact with our world. Although Project Glass is still in an early prototyping phase, and the photography and video that have been shared are not AR, they do offer intriguing possibilities for how such AR eyewear can alter how we come to capture and share images and video of our surroundings and what new aesthetic styles of imagery this may generate.

We are at a moment in AR’s emergence as a new medium when we can look both to the future and to the past: still seeing the previous forms that are shaping AR while paving new paths, contributing to novel styles and conventions. It is imperative for artists, designers and storytellers to work collaboratively with computer engineers in industry and academia to steer AR forward and contribute to a new aesthetics and language of AR.


Image: David Hockney, “Merced River, Yosemite Valley, Sept. 1982”, Photographic Collage.

The photographs (and even video) from Project Glass, for me, recall artist David Hockney’s photographic collage work, known as his “joiners”, in which multiple individually shot photographs were layered to compose a larger image. In conversation with writer and friend Lawrence Weschler, Hockney states of his photocollages, “I realized that this sort of picture came closer to how we actually see, which is to say, not all-at-once but rather in discrete, separate glimpses which we then build up into our continuous experience of the world” (Weschler 11). Hockney continues, “There are a hundred separate looks across time from which I synthesize my living impression of you” (11). The act of seeing is a process of synthesis akin to Hockney’s combination of photographs, each square documenting a separate look to compose a totality of the cumulative experience of seeing “across time” and forming, as Hockney states, a “living impression”, shaping and growing continually. Hockney’s joiners were greatly influenced by the Cubist artists’ sense of multiple angles and movement.

Image: David Hockney, “The Scrabble Game, 1983”, Photographic Collage.

Hockney states an “ordinary photograph” is missing “lived time”, referring to photography as “looking at the world through the point of view of a paralyzed Cyclops – for a split second”, and hence not conveying a true “experience of living in the world” (9). He notes to Weschler, “If instead, I caught all of you in one frozen look, the experience would be dead – it would be like looking at an ordinary photograph” (11).

Is there something extraordinary, then, about the photographs captured with Project Glass? Yes, I believe there is, and we’re only beginning to see what might be possible. How, then, do these images differ from “ordinary photographs”? It’s interesting to think of the Project Glass photos as being captured by a “Cyclops”, to refer to Hockney, because, well, that’s basically what they are: a single lens attached to your head that sees and captures the world from a first-person perspective. Yet, to continue with Hockney’s conceptualization, I believe these images get closer to conveying “a true experience of living in the world”, one where your experience in that moment is documented as is, without your having to stop and grab your camera. Braun comments on “how effortless and natural it is to do so”, with Google Co-Founder Sergey Brin adding, “I think this can bring on a new style of photography that allows you to be more intimate with the world you are capturing, and doesn’t take you away from it.” Project Glass captures what you are in fact seeing in that moment, and very close to how you are seeing it. The experience is still very much alive, not dead, as Hockney argued of ordinary photographs, and ‘living’ also in the sense that these experiences can immediately be shared with others over a network.

Hockney’s collages were an inspiration for some of my early creative experiments in AR, in a series I refer to as the AR Joiners, 2008-2009. The AR Joiners extended Hockney’s concepts to use 2D video clips in AR, in a tactile composite form of individual paper markers overlapping to create one larger AR collaged scene. Each of the short video clips that compose the AR Joiners was recorded over a series of separate moments, as opposed to one take that was cut into multiple fragments running on the same timeline. This was a conscious design choice: the AR Joiners were about the body moving in time (in both capturing the video footage and in later having the viewer reassemble it as an AR experience, piecing together the separate video clips across time with paper markers, akin to Hockney’s photocollage process), in distinct moments and views, which accumulate to combine a total memory of the space or experience across time. (The AR Joiners are discussed in the ISMAR 2009 paper “Augmented Reality (AR) Joiners, A Novel Expanded Cinematic Form”, published by IEEE.)

Image: “Rome Colosseum”, Screen Capture of AR Joiner, Helen Papagiannis, 2009. Each square shows 2D planar video playing atop AR markers to create a composite of a total scene across time.

When I was working on the AR Joiners from 2008-2009, Microsoft’s Photosynth had also recently launched. Photosynth, for me, recalled the aesthetic of Hockney’s Joiners. At the time, Photosynth was a web-based photo visualization tool (now available as a panorama app on smartphones) that could generate a three-dimensional (3D) representation from a collection of two-dimensional (2D) photos of a place or object. Software analyzed the photos for similarities and then constructed a 3D layered display of the photos through which viewers could navigate and delve further into the scene. “Synths”, as they were referred to, created a totality of the cumulative experience of seeing “across time”, comparable to Hockney’s Joiners (1970-1986).

Image: Photosynth of the Colosseum in Rome by Photosynth user Protec.

So what does all of this have to do with Google’s Project Glass?

I believe Hockney’s Joiners, the AR Joiners, and Photosynth each contribute to an aesthetic that Project Glass, in documenting our lived experiences of the world, has the potential to extend into new visual conventions. I’d like to propose that each of the above projects applies a (neo)baroque aesthetic, one which I think is very important for AR and which we will see more of as AR continues to evolve into a new medium, beyond just a technology.

In the article “Architectures of the Senses: Neo-baroque Entertainment Spectacles” (2003), Angela Ndalianis writes,

“The baroque’s difference from classical systems lies in the refusal to respect the limits of the frame. Instead, it intends to invade spaces in every direction, to perforate it, to become as one with all its possibilities” (Ndalianis 360).

This description of the baroque aligns quite nicely with Hockney’s Joiners, AR Joiners and Photosynth, which each demonstrate ways of moving beyond the limits of the single frame, expanding in multiple directions, puncturing conventional space.

Image: “The Glorification of Urban VIII”, Painting, Pietro da Cortona, Rome, 1633-1639.

Ndalianis identifies Pietro da Cortona’s ceiling painting “The Glorification of Urban VIII” (Rome, 1633-1639) in the Palazzo Barberini as baroque, where “the narrative from one panel literally spills into the narrative of another” and “the impression is such that, in order to spill into the next visual and narrative space the figures and objects perceptually appear to enter into our own space within the Palazzo Barberini” (361). In contrast, she notes, “A strictly classically aligned composition would, instead, have enclosed and kept discrete the separate narrative borders” (361). AR enters into “our own space”, with the narrative of the augmented environment spilling into our physical surroundings.

Ndalianis discusses how the spectator is central in (neo)baroque space and vision. She writes,

“With borders continually being rewritten, (neo)baroque vision provides models of perception that suggest worlds of infinity that lose the center that is traditionally associated with classically ordered space. Rather the center is to be found in the position of the spectator, with the representational center changing depending on the spectator’s focus. Given that (neo)baroque spectacle provides polycentric and multiple shifting centers, the spectator, in a sense, remains the only element in the image/viewer scenario that remains centered and stable. It is the audience’s perception and active engagement with the image that orders the illusion” (358).

With AR, the position of the spectator is the center, with the possibility of the AR experience changing depending on the spectator’s focus, position, and context. Just as Ndalianis writes, it is the spectator’s “perception and active engagement” with the AR “that orders the illusion.” In a previous article, I described AR as primarily a lean-back model; however, AR has great potential to become an interactive lean-forward model, one in which “active engagement” will make the spectator’s context, interests, and motivations even more central to ordering and defining the illusion.

Although the photographs Google shared are 2D and not interactive in this early stage, Google’s Project Glass has the potential to impact a “lean forward” model in AR and contribute to a (neo)baroque style in AR where the spectator, through customized eyewear, will really be at the center in a new ordered visual space. The possibilities for capturing first-person perspective photography and video directly via the eyewear may come to help define a new set of aesthetics and stylistic tendencies, perhaps one closer to Hockney’s vision of “lived time” and a “record of human looking” in his joiners. I am intrigued to see how the current 2D images and video will expand beyond the frame once Project Glass is AR enabled, to quote Ndalianis again, extending “space in every direction, to perforate it, to become as one with all its possibilities.”

Image: Haptics Demo from the Magic Vision Lab, University of South Australia.

But, dear readers, I cannot leave you here on just a visual note. We shall not limit AR to strictly a visual experience; it must be fully sensorial as we push ahead as an industry and community of researchers to grow the medium. Ndalianis writes, “When discussing the neo-baroque we also need to consider an architecture and regime that engages the sensorium” (367). She refers to the “haptic, gustatory, olfactory, to the auditory and the visual”, all of which AR as a new medium needs to experiment with and extend into, beyond computer vision and tracking. Adrian Cheok’s keynote at ISMAR 2011 in Basel, Switzerland addressed the need for AR to engage the other senses. With projects exploring taste and smell in AR, like Meta Cookie from the University of Tokyo, and work being done in AR haptics, such as at the Magic Vision Lab, University of South Australia, AR will continue to expand in new ways, beyond visual frames and into the full human sensorium, to truly become “one with all its possibilities”.

Let’s continue the conversation on Twitter (I’m @ARstories), or in the comments section below.


Ndalianis, Angela. “Architectures of the Senses: Neo-baroque Entertainment Spectacles.” Rethinking Media Change: The Aesthetics of Transition. Cambridge, Mass.: MIT Press, 2003.

Weschler, Lawrence. Cameraworks. New York: Knopf, 1984.

*Update: May 30, 2012: Thanks very much to Bruce Sterling for picking up my article and posting it today!

*Update Feb 21, 2013: New video from Google showing how Glass will work