Remix, rewind, reinvent: predicting the future of media

WIRED US founding executive editor Kevin Kelly makes his predictions for the future of the creative industries

We are in a period of productive remixing

Paul Romer, an economist at New York University who specialises in the theory of economic growth, says real sustainable economic growth does not stem from new resources, but from existing ones that are rearranged to make them more valuable. Growth comes from remixing. Brian Arthur, an economist at the Santa Fe Institute who specialises in the dynamics of technological growth, says that all new technologies derive from a combination of existing technologies.

Modern technologies are combinations of earlier, primitive technologies that have been rearranged and remixed. Since one can combine hundreds of simpler technologies with hundreds of thousands of more complex technologies, there is an unlimited number of possible new technologies - but they are all remixes. What is true for economic and technological growth is also true for digital growth. We are in a period of productive remixing. Innovators recombine simple earlier media genres with later complex genres to produce an unlimited number of new media genres. The more new genres, the more possible newer ones can be remixed from them. The rate of possible combinations grows exponentially, expanding the culture and the economy.

We live in a golden age of new mediums. In recent decades, hundreds of media genres have been born, remixed out of old genres. Older forms such as the newspaper article, the TV sitcom or the four-minute pop song still persist. But digital technology unbundles those forms into their elements so they can be recombined in new ways. Recent newborn forms include a web list article (a listicle) or a 140-character tweet storm. Some of these recombined forms are now so robust that they serve as a new genre. These new genres themselves will be remixed.

For instance, behind every bestselling book are legions of fans who write their own sequels or fanfic. They may mix elements from more than one book or author. Their chief audience is other fans. One fanfic archive lists 1.5m fan-created works to date.

Extremely short snips of video quickly recorded on a phone can easily be shared and re-shared with the Vine app. Six seconds is enough for a joke or a disaster to spread virally. These brief snips may be highly edited for maximum effect. In 2013, 12 million Vine clips were posted to Twitter every day and, in 2015, viewers racked up 1.5 billion daily loops. There are stars on Vine with a million followers. But there is another kind of video that is even shorter. An animated GIF is a graphic that loops through its small motion again and again and again. The endless repetition encourages it to be studied closely until it transcends into something bigger.

These examples can only hint at the outburst and sheer frenzy of new forms appearing in the coming decades. Take any one of these genres and multiply it. Then marry and cross-breed them. We can see the nascent outlines of the new ones that might emerge. With our fingers we will drag objects out of films and remix them into our own photos. A click of our phone camera will capture a landscape, then display its history in words, which we can use to annotate the image. With the coming new tools we'll be able to create our visions on demand.

The fungibility of digital bits allows forms to morph easily, to mutate and hybridise. The quick flow of bits permits one program to emulate another. To simulate another form is a native function of digital media. There's no retreat from this multiplicity. The variety of genres and subgenres will continue to explode. Some will rise in popularity while others wane, but few will disappear entirely. There will still be opera lovers a century from now. But there will be a billion video game fans and a hundred million virtual reality worlds. The accelerating fluidity of bits will continue to overtake media for the next 30 years, furthering a great remixing.

Hollywood's mashup future

At the same time, the cheap and universal tools of creation (phone cameras, YouTube Capture, iMovie) are quickly reducing the effort needed to create moving images and upsetting a great asymmetry inherent in all media. That is: it is easier to read a book than to write one, easier to listen to a song than to compose one, easier to attend a play than to produce one. Feature-length movies in particular have long suffered from this user asymmetry. A Hollywood blockbuster can take a million person-hours to produce and only two hours to consume.

To the utter bafflement of the experts who confidently claimed that viewers would never rise from their reclining passivity, tens of millions of people have in recent years spent uncountable hours making movies of their own design. Having a ready and reachable audience of potential billions helps, as does the choice of multiple modes in which to create. Because of new consumer devices, training, peer encouragement and fiendishly clever software, the ease of making video now approaches the ease of writing.

This is not how Hollywood makes films. A blockbuster film is a gigantic creature, custom built by hand. Like a Siberian tiger, it demands our attention - but it is also very rare. Every year about 600 feature films are released in North America, or about 1,200 hours of moving images. As a percentage of the hundreds of millions of hours of moving images produced annually today, 1,200 hours is minuscule.

The handcrafted Hollywood film won't go away, but if we want to see the future of motion pictures, we need to study the swarming critters below - the jungle of YouTube, indie films, TV serials, documentaries, commercials, infomercials and insect-scale supercuts and mashups - and not just the tiny apex of tigers. YouTube videos are viewed more than 12 billion times in a single month; some videos have been watched several billion times each, more than any blockbuster movie. Over 100 million short clips with very small audiences are shared to the net every day.

Judged merely by volume and the amount of attention the videos collectively garner, these clips are now the centre of our culture. Some are made with the glossiness of a Hollywood movie; most are made by kids in their kitchen with a phone. If Hollywood is at the apex of the pyramid, the bottom is where the swampy action is, and where the future of the moving image begins.

The vast majority of these non-Hollywood productions rely on remixing, because remixing makes it much easier to create. Amateurs take soundtracks found online, or recorded in their bedrooms, cut and reorder scenes, enter text and then layer in a new story or novel point of view. Each genre often follows a set format. For example: remixed movie trailers. An unknown amateur may turn a comedy into a horror flick, or vice versa. Some fans create music videos made by matching and mixing a pop song soundtrack with edited clips from obscure cult movies. Or they clip scenes from a favourite movie or movie star, which are then edited to fit an unlikely song. These become music videos for a fantasy universe. Rabid fans of pop bands take their favourite songs on video and add the song's lyrics in large type. These lyric videos became so popular that some bands have started releasing official music videos with lyrics.

Remixing video can even become a kind of collective sport. Hundreds of thousands of passionate anime fans remix Japanese animations. They clip the cartoons into tiny pieces, some only a few frames long, then rearrange them with video editing software and give them new soundtracks and music, often with English dialogue. The new anime vids tell completely new stories. The real achievement in this subculture is to win the Iron Editor challenge. Just as in the TV cook-off contest Iron Chef, the Iron Editor must remix videos in real time in front of an audience while competing with other editors to demonstrate superior visual literacy. The best editors can remix video as fast as you might type.

An image stored on a memory disk instead of celluloid film has a liquidity that allows it to be manipulated as if the picture were words rather than a photo. Hollywood mavericks such as George Lucas embraced digital technology early (Pixar began as a division of his company, Lucasfilm) and pioneered a more fluent way of film-making. In his Star Wars films, Lucas devised a method of movie making that has more in common with the way books and paintings are made than with traditional cinematography.

Typically, a film is planned out in scenes; the scenes are filmed (usually more than once); and from a surfeit of these, a movie is assembled. Sometimes a director must go back and shoot "pickup" shots if the final story cannot be told with the available film. With the new screen fluency enabled by digital technology, a movie scene is something more malleable - it is like a writer's paragraph, constantly being revised. Scenes are not captured (as in a photo) but built up incrementally, like paint, or text. Layers of visual and audio refinement are added over a crude sketch of the motion, the mix constantly in flux, always changeable. George Lucas's final Star Wars movie was layered up in this writerly way; his films were written pixel by pixel.

In the great hive mind of image creation, something similar is already happening with still photographs. Every minute, thousands of photographers are uploading their latest photos to Instagram, Snapchat, WhatsApp, Facebook and Flickr. The more than 1.5 trillion photos posted so far cover any subject you can imagine; I have not yet been able to stump the sites with an image request that cannot be found. Flickr offers more than half a million images of the Golden Gate Bridge alone. If you want to use an image of the bridge, there is really no reason to take a new picture. It's been done. All you need is a really easy way to find it.

The rise of database cinema

Similar advances have taken place with 3D models. On the archive for 3D models generated in the software SketchUp, you can find insanely detailed three-dimensional virtual models of most major building structures of the world. Need a street in New York? Here's a filmable virtual set. Out of these ready-made "phrases" a film can be assembled, mashed up from available clips or virtual sets.

Media theorist Lev Manovich calls this "database cinema", where component images form a whole new grammar for moving images. It's how authors work. We dip into a database of established words, called a dictionary, and reassemble these found words into articles, novels and poems that no one has ever seen before. The joy is recombining them. What we do now with words, we'll soon do with images.

For directors who speak this new cinematographic language, even the most photorealistic scenes are written over frame by frame. Film-making is thus liberated from the stranglehold of photography. Gone is trying to capture reality with one or two takes of expensive film and then creating your fantasy from whatever you get. Photography exalts the world as it is, whereas this new screen mode, like writing and painting, is engineered to explore the world as it might be.

But merely producing movies with ease is not enough, just as producing books with ease on Gutenberg's press did not fully unleash text. Real literacy also required a long list of innovations and techniques that permitted ordinary readers and writers to manipulate text in ways that made it useful. For instance, quotation symbols make it simple to indicate where one has borrowed text from another writer. We don't have a parallel notation in film yet, but we need one. Once you have a large text document, you need a table of contents to find your way through it. That requires page numbers. Somebody invented them in the 13th century. What is the equivalent in video? Longer texts require an alphabetic index, devised by the Greeks and later developed for libraries of books.

With AI we'll have a way to index the full content of a film. Footnotes, invented in the 12th century, allow tangential information to be displayed outside the linear argument of the main text. That would be useful in video. And bibliographic citations (invented in the 13th century) enable scholars and sceptics to systematically consult sources that influence or clarify the content. Imagine a video with citations. These days we have hyperlinks, which connect one piece of text to another, and tags, which categorise using a selected word or phrase for later sorting.

All these permit any literate person to cut and paste ideas, annotate them with her own thoughts, link them to related ideas, search through vast libraries of work, browse subjects quickly, resequence texts, re-find material, remix ideas, quote experts and sample bits of beloved artists. These tools, more than just reading, are the foundations of literacy.

If text literacy meant being able to parse and manipulate texts, then the new media fluency means being able to parse and manipulate moving images with the same ease. But so far, these "reader" tools of visuality have not made their way to the masses. We don't yet have the equivalent of a hyperlink for film. With true screen fluency, I'd be able to cite specific frames of a film or specific items in a frame. Perhaps I am a historian interested in oriental dress, and I want to refer to a fez worn by someone in the movie Casablanca. I should be able to refer to the fez itself (and not the head it is on) by linking to its image as the hat "moves" across many frames, just as I can easily link to a printed reference of the fez in text; I'd like to annotate the fez in the film with other film clips of fezzes as references.

With full-blown visuality, I should be able to annotate any object, frame, or scene in a motion picture with any other object, frame, or clip. I should be able to search the visual index of a film, or peruse a visual table of contents, or scan a visual abstract of its full length. But how do you do all these things? How can we browse a film the way we browse a book?

The first visual literacy tools are already emerging in research labs and on the margins of digital culture. Take, for example, the problem of browsing a feature-length movie. One way to scan a movie would be to super-fast-forward through the two hours in a few minutes. Another way would be to digest it into an abbreviated version in the way a theatrical movie trailer might. Both these methods can compress the time from hours to minutes. But is there a way to reduce the contents of a movie into imagery that could be grasped quickly, as we might see in a table of contents for a book?

Some popular websites with huge selections of movies (such as porn sites) have devised a way for users to scan through the content of full movies quickly in a few seconds. When a user clicks the title frame of a movie, the window skips from one key frame to the next, like a flip-book of the movie. The holy grail of visuality is findability - the ability to search the library of all movies the same way Google can search the web. You want to be able to type key terms, or simply say, "bicycle plus dog", and then retrieve scenes in any film featuring a dog and a bicycle. In an instant you could locate the moment in The Wizard of Oz when the witchy Miss Gulch rides off with Toto. Even better, you want to be able to ask Google to find all the other scenes in all movies similar to that scene. That ability is almost here.

Google's cloud AI is gaining visual intelligence rapidly. Give it a picture of a boy riding a motorbike on a dirt road and the AI will label it "boy riding a motorbike on a dirt road". Both Google's and Facebook's AIs can look at a photo and tell you the names of the people in it.

Now, what can be done for one image can also be done for moving images. We'll be able to search video via AI. As we do, we'll begin to explore the Gutenberg possibilities within moving images. "I consider the pixel data in images and video to be the dark matter of the internet," says Fei-Fei Li, director of the Stanford AI Laboratory. "We're now starting to illuminate it."

As moving images become easier to create, store, annotate and combine into complex narratives, they also become easier to be re-manipulated by the audience. This gives images a liquidity similar to words. Fluid images flow rapidly on to new screens, ready to migrate into new media and seep into the old. Like alphabetic bits, they can be squeezed into links or stretched to fit search engines and databases. Flexible images invite the same satisfying participation in both creation and consumption that the world of text does.

From findability to rewindability

In addition to findability, another ongoing revolution within media can be considered "rewindability". In the oral age, when someone spoke, you needed to listen carefully, because once the words were uttered, they were gone. The great shift from oral to written communications gave the audience (readers) the possibility to scroll back to the beginning of a "speech", by rereading it. One of the revolutionary qualities of books is their ability to repeat themselves for the reader.

In fact, to write a book that is reread is the highest praise for an author. And in many ways authors have exploited this characteristic. They may add plot points that gain meaning on second reading, hide irony that is only revealed on rereading, or pack a book full of details that require close study and rereading to decipher. Vladimir Nabokov once claimed, "One cannot read a book: one can only reread it."

Our screen-based media in the last century had much in common with books. Movies, like books, are narrative driven and linear. But unlike books, movies were rarely rewatched. In the century before videotape, there was no replaying. Television was much the same. A show was broadcast on a schedule. You either watched it at the time or you never saw it. Because of this "oral" characteristic, shows were engineered with the assumption they would be seen only once, which forced the narrative to convey as much as possible in the first impression. But this also diminished the narrative, because so much more could have been crafted to deliver on second and third encounters.

First VHS, then DVDs, later TiVos and now streaming video make it easy to scroll back screenworks. If you want to see something again, you do. Often. If you want to see only a snippet of a movie or television programme, you do, at any time. This ability to rewind also applies to commercials, news, documentaries, clips - anything online, in fact. More than anything else, rewindability is what has turned commercials into a new art form.

We are now witnessing the same inevitable rewindability of screen-based news. TV news was once an ephemeral stream of stuff that was never meant to be recorded or analysed. Now, when we scroll back news, we can compare its veracity, its motives, its assumptions. We can share it, fact-check it, mix it. Because the crowd can rewind what was said earlier, this changes the posture of politicians, of pundits, of anyone making a claim.

The rewindability of film is what makes 120-hour movies such as Lost, or The Wire, or Battlestar Galactica possible and enjoyable. They brim with too many details ingeniously moulded into them to be apparent on initial viewing; scrolling back at any point is essential. Music was transformed when it became recorded, rewindable. The ability to scroll back to the beginning and hear music again - that exact performance - changed music forever. Songs became shorter, more melodic and repeatable.

Games now have scroll-back functions that allow replays, redos, or extra lives, a related concept. All major software packages have an undo button. Complex pieces of consumer software, such as Photoshop or Illustrator, employ what is called nondestructive editing, which means you can rewind to any particular previous point. The genius of Wikipedia is that it also employs nondestructive editing - all previous versions of an article are kept forever. This "redo" function encourages creativity.

We are likely to get impatient with all the experiences that don't have undo buttons, such as eating a meal. We can't really replay the taste and smells of a meal. But if we could, that would certainly alter cuisine.

But the perfect replication of media in terms of rewinding is less explored. As we begin to lifelog our daily activities, to capture our live streams, more of our lives will be scrollable. This will shift what we do the first time. The ability to scroll back easily, precisely and deeply might change how we live in the future.

In our near future we'll have the option to record as much of our conversations as we care to. Some people will record everything as an aid to their memory. The social etiquette around recall will be in flux; private conversations are likely to be off-limits. But more and more of what happens in public will be recorded - and re-viewable - via phone cams, dashboard-mounted webcams on every car and streetlight-mounted surveillance cams. Police will be required by law to record all activity from their wearables. Rewinding police logs will shift public opinion, just as often vindicating police as not.

Rewindability and findability are just two Gutenberg-like transformations that moving images are undergoing. These and many other factors of remixing apply to all newly digitised media, such as VR, music, radio and so on.

Transformation trumps copyright

Remixing - the rearrangement and reuse of existing pieces - plays havoc with traditional notions of property and ownership. If a melody is a piece of property you own, like your house, then my right to use it without permission or compensation is very limited. But digital bits are closer to ideas than to real estate. How does one "own" a melody? When you give me a melody, you still have it. Yet in what way is it even yours to begin with if it is one note different from a melody a thousand years old? Can one own a note? If you sell me a copy of it, what counts as a copy? What about a backup? These are not esoteric theoretical questions. Music is a multibillion-dollar industry, and the dilemma of what aspect of intangible music can be owned and how it can be remixed is at the front and centre of culture today.

Legal tussles over the right to sample - to remix - snippets of music, particularly when either the sampled song or the borrowing song makes a lot of money, are ongoing. The appropriateness of remixing, of reusing material from one news source for another, is a major restraint for new journalistic media.

Many aspects of IP laws are out of whack with the reality of how the underlying technology works. So what should the new laws favour in a world of remixing?

Appropriation of existing material is a venerable and necessary practice. As the economists Paul Romer and Brian Arthur remind us, recombination is really the only source of innovation and wealth.

I suggest we ask, "Has it been transformed by the borrower?" Did Andy Warhol transform the Campbell's soup can? If yes, then the derivative is not really a "copy"; it's been transformed, mutated, improved, evolved. The answer each time is still a judgment call, but it is the right question.

"Transformation" is another term for becoming. It acknowledges that the creations we make today will become something else tomorrow. Nothing can remain untouched, unaltered. By that I mean every creation that has any value will eventually and inevitably be transformed - in some version - into something different. Sure, the version of Harry Potter that JK Rowling published in 1997 will always be available, but it is inevitable that another thousand fan fiction versions of her book will be penned by avid amateurs. The more powerful the invention or creation, the more likely and more important it is that it will be transformed by others.

In 30 years, the most important cultural works and the most powerful mediums will be those that have been remixed the most.

This extract was taken from Kevin Kelly's new book, The Inevitable, out now (Viking)

This article was originally published by WIRED UK