The mind behind Minority Report is giving PowerPoint a sci-fi overhaul

John Underkoffler made a career out of dreaming up wild human-computer interfaces for Hollywood movies – until real-world businesses asked if he could actually build them

On November 23, 2000, John Underkoffler packed a bag, set off for the airport, and left Boston for good. Underkoffler was an MIT PhD research student obsessed with data representation and user interfaces, and he was about to join his dream project as the official science advisor for a movie called Minority Report. The director was Steven Spielberg and Tom Cruise played the lead role. The story was a long-gestating Philip K Dick adaptation set in the Washington DC of 2054, a future where crime has all but been obliterated, thanks to precogs – psychics who can predict crimes before they happen – and a specialist police unit, PreCrime, which is licensed to preemptively arrest criminals.

Embedded with the art department, Underkoffler spent a year helping to design the film’s futurescape. The brief from Spielberg was to make the film’s tech immediately visually legible. The director also didn’t want typical “sci-fi gadgets”. In his mind, the film was a noir, and the tech should be based on devices contemporary to 2001.

Underkoffler took the then brand-new E-Ink technology and extrapolated it into the foldable digital newspapers seen in the film’s subway scene. Maglev transportation tech, then in development in Germany and Japan, became the basis for the self-driving cars that could move up the sides of buildings and dock with individual apartments. New technologies like fMRI – functional magnetic resonance imaging – inspired the complex headgear by which the dreams of the precogs were recorded.

The most challenging task facing the production, however, was the question of how to visually represent the precog visions, and how this data could be manipulated by Cruise, playing Chief of PreCrime John Anderton. For that purpose, Underkoffler decided to adapt a technology he had been developing called g-speak, a spatial computing program that allows the user to control on-screen pixels with simple manual gestures.

G-speak had been inspired by the research of Colombian-American neurophysiologist Rodolfo Llinás. Llinás writes that as Homo sapiens evolved, our awareness of our surroundings increased so that we knew where to hunt, when to eat, and when to run. Everything we see, taste, hear, sense, and feel is new information, and as such, the more of these inputs we can glean from our environment, the better equipped we are to deal with it. Underkoffler believes the same rules apply to user interfaces. The more a user interface is able to replicate how humans interact with the world, the more our interactions with computers will come to feel natural and intuitive. G-speak allows data to be shared across multiple machines and surfaces – from screens to tables – and accessed by multiple users at once.

For Minority Report, Underkoffler used his early research around g-speak to develop a unique gestural interface unlike anything previously seen on screen. To demonstrate to Spielberg and the cast how g-speak would work in practice, Underkoffler excused himself from the set for a few days, jerry-rigged a green screen in a friend’s back garden and then filmed himself executing sequences of gestural commands. After watching Underkoffler’s video, Spielberg was enthused, ordering a script rewrite to allow more characters to use Underkoffler’s interface.

G-speak takes centre stage in the opening scenes, in which Cruise needs to locate a would-be murderer as the clock ticks down. Donning a pair of interface-enabling gloves, Cruise raises his arms like a conductor, and the precog’s vision appears on a clear, curved screen in front of him. We see snatches of a woman in bed, an angry man raising his arm, then stabbing downwards. Cruise and his PreCrime team know who will die, and when – but not where the murder will take place.

Holding his left hand up, palm towards him, thumb and first two fingers outstretched like a pistol, Cruise is able to pause the video. By sweeping his hands to the right, as though he is trying to toss a paper ball into a bin, he dismisses the image on screen. By extending two fingers, then tracing a loop in the air, he spreads out a digital file containing mugshots of possible perpetrators. It’s a flurry of movements, at once fluid and natural, each bearing Underkoffler’s signature.

Cruise’s character is required to zoom in on a newspaper left on the lawn of the soon-to-be murder victim’s home. Underkoffler had not designed a specific zoom movement, so he and Cruise pondered what to do. “What if I did this?” Cruise said, extending his left arm toward the room’s giant curved screen and bending his wrist so that his left hand formed a “stop” at the end of the arm. By sliding his right hand along his arm, he could access different zoom levels. “My arm is like a UI slider.”

After the critical and box office success of Minority Report, Underkoffler was recruited to work on a number of blockbusters. For Hulk (2003), he was asked to conceive the gamma-radiation accident that leaves Bruce Banner with an angry green giant residing inside of him. On Aeon Flux (2005), Underkoffler pondered what possible building material might be used in an isolated city in an otherwise ruined world – bamboo, it turned out.

Following the release of Minority Report in 2002, Underkoffler began receiving calls from Fortune 500 companies, including Accenture, Wells Fargo and Fujitsu, enquiring whether the tech they’d seen on the cinema screen was real. And if g-speak wasn’t real, well, could Underkoffler develop it for them? After the fourth or fifth phone call asking the same question, Underkoffler realised that it might be a good idea to get back in the lab and give it a go.

After six years in Hollywood, Underkoffler was also beginning to feel the urge to return to his unfinished research projects. “To spend a year of one’s life [working on a film]… that commitment should be more directly meaningful and more personal,” he says. “I’m a designer and engineer, and obsessed with user interface. I felt like I had to get back to it.”

His mission wouldn’t be restricted to developing data-sharing technology; he wanted to change how we interact with computers entirely. More than 30 years after the advent of the Apple Mac operating system, interfaces have mostly remained unchanged. Yet since 1984, memory, graphics power, processor speeds, and disk capacity have increased by between 10,000 and one million per cent.

The mistake, Underkoffler argues, wasn’t that we got user interfaces wrong back in 1984, but that they stopped evolving. An upgrade was long overdue, he believed. In 2006, he left Hollywood, founded Oblong Industries, and dedicated himself to bringing the g-speak interface to the world.

Born on the last day of June 1967, Underkoffler spent his childhood on a farm 55 kilometres outside Philadelphia. His mother had trained as a nurse, and his father worked in a family-owned company manufacturing synthetic tights after the Second World War.

Growing up, Underkoffler and his two brothers had free rein over 44 hectares of fields and woodland – in the middle of which he came across an old dump, housing the refuse of the previous half-century. “There were these beautiful old glass bottles,” Underkoffler recalls. “It was a decaying record of life from the past five decades. It was an amazing history lesson.” To Underkoffler, these bottles represented a record of a past not restricted to a museum display case or a textbook, but something he could hold and feel in his hand.

In 1980, his parents invested in their first home computer, an Apple II Plus, and he began writing code in every spare moment. “There was a fantastic computer enthusiast magazine called Softalk,” he explains. “It would publish programs in machine language and you’d type them in. The beautiful thing about these machines is that you could become the custodian of this entire, infinitely expandable world.”

Five years later, Underkoffler enrolled at the prestigious MIT Media Lab. An academic incubator with its origins in MIT’s School of Architecture, the Media Lab had been set up that year to encourage collaborative research across a range of disciplines, from technology to media, art, and design. “There was a perpetual amount of excitement,” Underkoffler says of the Lab. “It was like being in the inside of a neutron star.”

One of Underkoffler’s lecturers at the Media Lab was Muriel Cooper, a graphic designer who ran the Visible Language Workshop. Cooper believed that society was moving away from a focus on the mechanised processes prevalent in the early part of the century, and was now placing a new value on raw information. Such a shift, Cooper argued, required new ways of visualising and communicating data.

Cooper’s design philosophy was the inspiration behind the early versions of g-speak. In 1998, Underkoffler created the Luminous Room, a project in which the ordinary lightbulb was replaced with internet-connected projector-cameras, dubbed “I/O Bulbs”. The idea was that by enabling data to be projected onto any surface in a room, this data was liberated from the computer screen, and, for the first time, situated in the real world. This also meant that data could be manipulated without a mouse or keyboard. As such, it was one of the earliest hints of the capabilities of what would eventually become g-speak.

One exploration of the Luminous Room was the “Chess & Bottle system”, which allowed text, images, and live video to be displayed on screen. With a particular gesture (in this instance, turning a vase 180 degrees), the data would be incorporated into a vessel, transported across the screen, and unpacked on the far side. If the glass bottles of Underkoffler’s youth brought him information from another time, the g-speak glass vessels were able to transport data in many different media in real time.

Another project, Urp, was an architectural design tool that used the I/O Bulb to project digital shadows onto a workbench. The shadows would lengthen and shorten depending on the placement of small architectural models. This allowed designers to see the shadow a building would cast at any particular latitude, season, or time of day. The simulated material could also be changed so that in one instant a shadow formed by a brick wall might be displayed, and in the next, the reflection from a glass partition of the same size.

Inspired by the Media Lab’s ethos, Underkoffler’s ideology around user interfaces drew from various sources. He cites science fiction author William Gibson’s writing on the “metaverse”, a shared space that melds virtual environments with virtually enhanced physical objects. The 1981 Atari arcade game Tempest, in which the player shoots geometric shapes, was another influence. Both share the distinction of rejecting the “real world” we recognise every day in favour of surreal visuals that can be manipulated in unconventional ways, something Underkoffler made reality with the Luminous Room. Who had previously thought of storing videos in a vase, after all?

Oblong’s London showroom is located in an unassuming first-floor office off Shoreditch High Street. Today, Oblong employs some 120 people, and provides software to 150 of the Fortune 500 companies. Padraig Scully, Oblong’s technical account manager for Europe, the Middle East, and Africa, leads me into a conference room, where three screens occupy the back wall, with three more set up on the perpendicular wall. This, Scully explains, is Oblong’s prime product, Mezzanine, video-conferencing software that runs on g-speak and allows any and all team members to share and manipulate each other’s on-screen data live. Today, Mezzanine is employed by more than 150 customers on six continents, including JLL and Inmarsat in London, and Boeing and Nasa in the US.

The three 211 x 116 x 28 mm screens in front of us are arranged horizontally, end to end, while the secondary set of screens, on the wall to the left, are arranged vertically side by side. (This is Mezzanine 600, the second largest set-up after the nine-screen Mezzanine 650.)

Scully dials Oblong’s LA headquarters where John Underkoffler is waiting to take our call in a similar conference room, outfitted with its own Mezzanine set-up. Dressed in grey chinos and a light blue button-down, he reclines in an office chair as his image appears on screen. He wears his hair short, a grey goatee on his chin, owl-like eyebrows framing inquisitive eyes.

Mezzanine, Underkoffler tells me, is the antidote to irksome corporate meetings in which a single person hogs the only USB port, subjecting their colleagues to a dry PowerPoint presentation that they alone control. To demonstrate that, Scully pulls up fake architectural blueprints of an office building. They appear on the horizontal screens, to the right of our live link to Underkoffler, who can now also see them on his screen in LA.

Mezzanine runs on g-speak, but instead of Cruise’s sensor-embedded gloves, it is controlled by a wand – a sleek remote which uses infrared sensors in the ceiling to interact with the screen. In order to use the wand to manipulate the data on the screen, each pixel is given an x, y, and z coordinate instead of the usual numerical code, allowing it to be controlled via three-dimensional movements. By pressing a button on the wand and moving the device towards the screen, I am able to zoom in on a section of the blueprints. Holding the same button down and drawing a square around the image produces a screen grab. The grab appears in a bank of saved images at the bottom of our screen. By clicking on it again, I’m able to drag it onto the left-most screen.

The three screens now display my screen grab, the live link with Underkoffler, and the original blueprints. In LA, Underkoffler can see the exact same information. Not only that, but using his own wand, he’s able to draw on top of my screen grab, highlighting a particular section of wall or floorspace.

Next, Scully turns to a whiteboard at the back of the room and writes a message to Underkoffler. Using the wand again, he’s able to photograph the board and transport the message to the screen. In LA, Underkoffler is able to use his own whiteboard to write on top of Scully’s words or scrub them out entirely.

Scully then pulls up a series of mock designs for a range of fictional fruit drinks. On screen, he is able to underlay different logos beneath mock iPhone, web and billboard adverts. He creates three options in total, using the wand to move each across to the three vertical side screens, where we’re able to instantly compare them, the electronic equivalent of pinning printouts to a cork board.

By allowing information to be drawn on, crossed out, zoomed in on and otherwise manipulated by an entire creative team at once, Mezzanine seeks to make meetings more efficient. There is no need to wait while designers go off to rework an idea, or while latecomers wait for an extra copy of a printout. As Underkoffler explains, “Projects that previously took five weeks can now be completed in five hours.”

Not only that, Underkoffler argues, but it allows corporations to share ideas in a 3D space. In LA, Underkoffler holds up a smartphone. “It’s horrible that everything I want to see has to be on this little screen,” he says. “Some things require more seeing.”

Los Angeles’ Skid Row is still very much the home of the down-and-outs. Oblong Industries has had its headquarters in the neighbouring Arts District since 2008, but it is only recently that other startups have moved to the area, along with sushi restaurants and a brewing company. It is an area frequently used by film crews, with fake shoot-outs and car chases taking place on a near-weekly basis. Recently, Underkoffler explains, a gas station appeared on a nearby street corner, much to the elation of his staff. A week later, it blew up. It had been a movie set all along.

The Oblong HQ is a warehouse space with open-plan workstations and beautiful, 100-year-old wooden roof beams. In previous lives, Underkoffler has discovered, it was a sweatshop and a pornographer’s set. Today it is a bright, modern space with swathes of coloured paint on the walls, and a stack of employee cycles next to a dining area that has borrowed its interior design from a trendy coffee shop. Banks of computer monitors hold the attention of young employees in chinos and T-shirts, code scrolling across their screens.

In Underkoffler’s office, a small space on the top floor, he discusses the gradual transition from being an academic researcher to the leader of a company developing usable tech for a real-world market.

“As a researcher, it’s your job to invent new ways of looking at the world, and you do so with a set of theories about what’ll be useful,” he says as he pours coffee. “But you’re not actually limited by on-the-ground details that would affect usability. In the commercial realm, those details are intensely critical.”

In other words, should your groundbreaking new tech design not fit the consumer requirements, it’s likely to be of little use. By the same token, Underkoffler believes that too much focus on real-world applications can limit creativity. As in most things, a balance between optimism and pragmatism may be the best approach.

A case in point is Oblong’s work for Saudi Aramco. The company’s GigaPOWERS system is the world’s most sophisticated oil and gas reservoir simulator. The problem was that the stakeholders, drilling foremen, and engineers all needed to access and interact with the system in real time, and on a massive visual scale. Three abutted high-definition projectors were used to visualise the relevant reservoir, allowing developers to ask questions like “What happens to production if we move that wellhead 500 metres north?” The system would let anyone pick up and move the well on screen, and actually see what would happen.

For General Electric, Oblong constructed an interactive map of a smart grid energy management system. Problems to be solved included the best way to navigate the space, how best to allow users to zoom in from a national level down to street level to view downed power lines in real time, and how to allow a variety of workers to use the system at once, with each working on different tasks and using different input modalities including smart wands, tablets and web browsers.

The majority of such problems are tackled at Oblong’s R&D and Prototyping Warehouse, a 15-minute walk from the Arts District office. Two-thirds of the warehouse space operates as a traditional warehouse, with stacks of wooden crates ready to be loaded into lorries and shipped around the world; the rest is where the developers live.

On a typical day, ten engineers work at the back of the warehouse space. In the corner of this zone stands a vast semi-circle comprising 45 screens, reaching some two metres high and all but enclosing a user inside. The set-up boasts over 90 million pixels.

To demonstrate its capabilities, engineer Pete Hawkins pulls up a representation of the Earth, with coloured dots hovering around the surface. These, he explains, denote seismic data. They are arranged by magnitude: dots further out are the less common, larger earthquakes; those closer in are smaller, more frequent quakes. Different colours are associated with the depth of the quakes. Blue is shallow, red is something to worry about. The potential life-saving applications of such a system are immediately obvious, promoting analysis in a way that’s difficult to do with a spreadsheet.

“Our goal is to get beyond rows and columns of data,” Hawkins explains. “In an Excel spreadsheet, our experience with the data is severely limited. By putting this in human terms we get more of a human take.”
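The encoding Hawkins describes, magnitude as distance from the globe and depth as colour, can be sketched in a few lines of Python. This is only an illustration of the idea, not Oblong’s code: the `Quake` record, the function names, the 500 km-per-magnitude spacing, the 700 km depth ceiling and the choice to put red at the deep end of the scale are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Quake:
    """Hypothetical record for one seismic event."""
    lat_deg: float
    lon_deg: float
    depth_km: float
    magnitude: float


def dot_position(q: Quake, km_per_magnitude: float = 500.0) -> tuple:
    """Place a dot above the epicentre; larger magnitudes sit further out.

    The spacing per unit of magnitude is an assumed display constant.
    """
    r = EARTH_RADIUS_KM + q.magnitude * km_per_magnitude
    lat = math.radians(q.lat_deg)
    lon = math.radians(q.lon_deg)
    return (
        r * math.cos(lat) * math.cos(lon),
        r * math.cos(lat) * math.sin(lon),
        r * math.sin(lat),
    )


def dot_colour(q: Quake, max_depth_km: float = 700.0) -> tuple:
    """Blend from blue (shallow) to red as an (R, G, B) triple.

    Mapping red to the deep end is an assumption; the article only
    says that blue means shallow.
    """
    t = min(max(q.depth_km / max_depth_km, 0.0), 1.0)
    return (round(255 * t), 0, round(255 * (1 - t)))


if __name__ == "__main__":
    sample = Quake(lat_deg=35.0, lon_deg=139.0, depth_km=30.0, magnitude=6.5)
    print(dot_position(sample), dot_colour(sample))
```

Because position and colour are derived independently, a viewer can read frequency, size and depth at a glance, which is the advantage over rows and columns that Hawkins is pointing to.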

To date, Oblong’s most successful collaboration has been with IBM, building a visual face for its abstract Watson technology. The solution was the circular bank of 45 screens, with visuals displaying stock market data in real time as a swirl of brightly coloured pixels, each representing a particular market trend. Geometry had again afforded an elegant solution.

“People ask about developing a portable VR version [of the software], but that wouldn’t be a shared experience,” explains B Cavello, from Watson. “When you’re making strategic decisions, and checking people’s facial expressions to check everyone is on the same page, that level of disconnect doesn’t really work for us. Having a space where you can have a conversation and navigate the content immersively is really valuable.”

Underkoffler believes that for g-speak to fully realise its capabilities – and ours – new user interface technology needs to appear everywhere, not just in conference rooms. Should larger tech companies get on board, he believes g-speak could become ubiquitous in as little as two years. Underkoffler mentions Microsoft in passing, but cannot discuss specific companies or details of discussions they may – or may not – have had.

“I’ve been wondering for a while what’s the right place to ignite the conversation around user interface and extending human capability,” he muses. “I’m not convinced that the place to do it is in a computer science context. It occurred to me that experimental architects are the minds that are [best] set up to talk about spatial interface, [and the] social and cognitive interactions that architecture already designs for.”

Whether or not we’re all using g-speak by 2021, the solution to the future of user interfaces is likely already in front of us, and may be far simpler than it seems. “If we don’t know how to design something, we ask what people would do in the real world, with other people?” Underkoffler asks. “That is always the answer.”

This article was originally published by WIRED UK