Neural Networks and New Regularities

In my first post, I argued reality is full of regularities. Exploiting these regularities requires information about them that can be conveyed and communicated. I call this information representations, of which there are five categories: rules, probabilistic statements, metaphors, neural networks, and instantiations. Today I want to talk about neural networks.

Neural Networks: A Very Basic Primer

If you are already familiar with neural networks, feel free to skip this section. This is a really simplified explanation, and I’m omitting a lot of detail. If you want to learn more, the chapter on neural networks in The Master Algorithm by Pedro Domingos is a recent overview. The Medium series “Machine Learning is Fun!” is an excellent primer if you want to get your hands a bit dirty.

Neural networks were originally inspired by the structure of our brains. Somehow, these arrangements of biological matter are capable of thinking, computing, and learning. What is it about our brains that makes that possible?

Our brains are a network of neurons. Neurons are complicated creatures, but three of their most important features are:

  • They are connected to each other.
  • They can send signals of varying intensity (including negative or inhibitory signals) to each other.
  • They can turn more or less “on” as a function of the incoming signals.

Neural networks jump off from there, replacing actual cells with simplified virtual counterparts. In a modern neural network, the “neurons” can belong to one of three layers: input, output, and hidden.

Input layers code for “features” of something. For example, if the neural network is designed to classify black and white images, each pixel in the image could be a feature associated with an input neuron. The brighter the pixel, the more “on” the input neuron. If the neural network recognizes speech, the intensity of sound waves at a certain frequency might be a feature. More intensity at that frequency might correspond to being “on” for one of the neurons. If the neural network is designed to play Go, the presence of a white stone at each point on the game board might be a feature. The neuron is “on” if a white stone is present at the input neuron’s associated point on the game board.

Output layers are how the neural network communicates with its users. If the network is an image classifier, there could be an output neuron for every class of photo. How “on” the neuron is could convey the confidence that the image belongs to that class. If the neural network is a speech recognizer, the output neurons could be text (words or letters). If the neural network is designed to play Go, the output neurons communicate the chosen move (there may be a neuron for every space, and the one the network would like to place a stone turns on).

Hidden layers lie between the input and output layers, and do the “thinking” of the network. In modern neural networks there can be many hidden layers of neurons. Together, they form a complex web of connections, each with potentially positive or negative weights, and with each neuron having a potentially different threshold for activation. The operation of a neural network is about the propagation of signals forward through the network. The input neurons corresponding to the data’s features turn on and send signals to connected hidden neurons. Each of those neurons adds up the strength of the incoming signals (which can each be positive or negative), and depending on how much the sum exceeds a threshold, turns more or less on. The hidden layer neurons then send signals to the neurons connected to them. This process propagates forward until some of the output neurons are activated, communicating the “thoughts” of the neural network.
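To make the forward propagation described above concrete, here is a minimal sketch in Python. It assumes a sigmoid function for how “on” a neuron gets (one common choice, not the only one), and the tiny network, its weights, and its thresholds are made up purely for illustration.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1): how "on" a neuron is."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, thresholds):
    """Compute how "on" each neuron in one layer is.

    weights[j][i] is the strength of the connection from input i to neuron j
    (positive or negative). thresholds[j] is the amount of incoming signal
    at which neuron j is exactly half on.
    """
    activations = []
    for neuron_weights, threshold in zip(weights, thresholds):
        total = sum(w * x for w, x in zip(neuron_weights, inputs))
        activations.append(sigmoid(total - threshold))
    return activations

def network_forward(inputs, layers):
    """Propagate a signal forward through every layer in turn."""
    signal = inputs
    for weights, thresholds in layers:
        signal = layer_forward(signal, weights, thresholds)
    return signal

# Hypothetical toy network: 2 input neurons -> 2 hidden neurons -> 1 output neuron.
hidden_layer = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
output_layer = ([[2.0, 2.0]], [1.0])

result = network_forward([1.0, 0.0], [hidden_layer, output_layer])
print(result)  # a single output neuron, roughly 0.73 "on"
```

Every step is just a weighted sum and a squashing function. Real networks differ mainly in scale, with millions of these small steps instead of three.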

That’s it. Nothing magical is happening. The propagation of signals at each step happens according to relatively simple rules. The big picture is complicated, but only because there are so many of these small steps, and because they build on each other.

A useful thing about neural networks is that you don’t actually have to set the connections, signal strengths, and thresholds. If you have enough data, there are algorithms that can do all that automatically. Give the neural network an example set of features and see what its output neurons say. If they are wrong, propagate back from the outputs adjustments to the signals and thresholds so that the network is more likely to make the correct identification the next time it sees similar features.
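Here is a hedged sketch of that learning loop, shrunk down to a single neuron so it fits on a page. It is not full backpropagation through hidden layers, but the adjustment step is the same idea: compare the output to the right answer and nudge the signal strengths and threshold in the direction that reduces the error. The training data (the logical OR of two inputs), the learning rate, and the number of passes are all assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, threshold, features):
    """How "on" the neuron is for a given set of features."""
    total = sum(w * x for w, x in zip(weights, features)) - threshold
    return sigmoid(total)

def train_neuron(examples, passes=5000, learning_rate=0.5):
    """Adjust the weights and threshold of one neuron from labeled examples."""
    weights = [0.0, 0.0]
    threshold = 0.0
    for _ in range(passes):
        for features, target in examples:
            output = predict(weights, threshold, features)
            # How wrong the output was, scaled by how sensitive the
            # output is to a change in the incoming signal.
            gradient = (output - target) * output * (1.0 - output)
            # Nudge each weight against the error it contributed to.
            weights = [w - learning_rate * gradient * x
                       for w, x in zip(weights, features)]
            # Raising the threshold lowers the output, so it moves the other way.
            threshold += learning_rate * gradient
    return weights, threshold

# Hypothetical training data: the neuron should turn on if either input is on.
OR_EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, threshold = train_neuron(OR_EXAMPLES)
for features, target in OR_EXAMPLES:
    print(features, target, round(predict(weights, threshold, features), 2))
```

Nobody told the neuron what weights to use. They emerged from repeated small corrections, which is exactly why the final numbers can be hard to interpret.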

Do our brains really work this way? I certainly don’t sense neurons signalling to each other and adding up thresholds. To the extent our brains work this way, the hidden layers are unconscious and we are only conscious of the output. We see a face, and we experience the thought “oh, that’s Grandma” leaping into our head unbidden. But underlying this recognition is (possibly) a biological neural network trained from birth to identify and categorize faces. The light coming into our eyes gets shunted off to various input neurons, which then propagate that information through hidden layers until the correct output neuron (or set of neurons) lights up. At that point, I have the conscious thought “Grandma!”

Neural Networks and Representations

At a more abstract level, neural networks represent local regularities in the data they are trained on with the structure of the network (including its signal strengths and each neuron’s threshold). This turns out to be a very flexible way of representing regularities. Indeed, neural networks can represent regularities that are difficult to concisely express in alternative schemes such as rules, probabilistic statements, and metaphors.

There are two sides to this coin. On the one hand, neural networks are capable of representing regularities that simply can’t be concisely represented any other way. On the other hand, the very nature of these regularities is such that they defy translation into alternative forms of representation. There is no metaphor, rule, or probability that “explains” the decision-making of the neural network (except a rule that tediously describes the neural network’s complex inner structure).

Take this excerpt from Part IV of “Machine Learning is Fun!”, which describes how to train a neural network to identify faces:

…the neural network learns to reliably generate 128 measurements for each [face]. Any ten different pictures of the same person should give roughly the same measurements.
…So what parts of the face are these 128 numbers measuring exactly? It turns out that we have no idea. It doesn’t really matter to us. All that we care is that the network generates nearly the same numbers when looking at two different pictures of the same person.

The tradeoff is that neural networks allow us to learn regularities that can’t easily be represented in our favored modes, but at the cost of unintelligibility.
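To see how measurements like these can be useful without being intelligible, here is a small sketch of how two sets of measurements might be compared. The vectors are invented and only four numbers long (instead of 128) to keep the example readable, and the 0.6 cutoff is an assumed value, not something taken from the excerpt.

```python
import math

def distance(a, b):
    """Straight-line (Euclidean) distance between two lists of measurements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(a, b, cutoff=0.6):
    """Guess that two faces match if their measurements are close together.

    The cutoff is a hypothetical value chosen for this illustration.
    """
    return distance(a, b) < cutoff

# Made-up measurements standing in for a network's output.
grandma_photo_1 = [0.10, -0.32, 0.45, 0.08]
grandma_photo_2 = [0.12, -0.30, 0.43, 0.10]
stranger_photo = [-0.40, 0.25, -0.10, 0.60]

print(same_person(grandma_photo_1, grandma_photo_2))  # True
print(same_person(grandma_photo_1, stranger_photo))   # False
```

Notice that nothing in the comparison needs to know what any individual number means. The regularity is exploited without ever being explained.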

Neural networks are also capable of innovating, in the sense of exploiting these regularities to step into the unknown and make a choice. But since the regularities they exploit are foreign to us and our very way of thinking, the innovations they come up with may seem mysterious and almost magical.

This is demonstrated really well in the Netflix documentary AlphaGo.

AlphaGo and Unknown Regularities

AlphaGo’s 2016 match against Lee Sedol, as captured in the documentary AlphaGo, is a great example of how neural networks can exploit local regularities we don’t understand. AlphaGo is a program designed to play the board game “Go.” The game is relatively straightforward: two players take turns laying stones on a 19×19 grid. The goal is to enclose more territory than the opponent. However, because the board is so large, the set of possible games is enormous. There are 361 possible positions for the first stone, 360 possible positions for the second, 359 for the third, and so on. There are over 5.9 trillion possible ways to place the first five stones. This enormous space of possible games has long meant that brute force calculation doesn’t work very well, even for computers. Go has been played for so long, and by so many, though, that a large number of local regularities related to the game have been identified.
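The arithmetic behind that figure is easy to check. This sketch counts ordered sequences of opening moves on an empty board, under the simplifying assumption that no stones are captured (so each move just uses up one more point).

```python
def opening_sequences(points, stones):
    """Count the ordered ways to place `stones` stones on `points` empty points."""
    total = 1
    for already_placed in range(stones):
        total *= points - already_placed
    return total

first_five = opening_sequences(19 * 19, 5)  # 361 * 360 * 359 * 358 * 357
print(f"{first_five:,}")  # 5,962,870,725,840
```

Each additional stone multiplies the count by a few hundred more, which is why searching every possible game quickly becomes hopeless.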

AlphaGo does not rely completely on neural networks, but they are a prominent component of its programming. In 2016 AlphaGo faced off against Lee Sedol, a legendary Go player considered to be the greatest of the last decade. Throughout the five-game match, AlphaGo made surprising moves that baffled commentators, but later paid off. Move 37 in game 2 is a wonderful illustration:

Commentator #1: Oh, wow.
Commentator #2: Oh, it’s a totally unthinkable move.
Commentator #1: Yes.
Commentator #3: The value… that’s a very… that’s a very surprising move.
Commentator #4: (chuckling) I thought it was a mistake.
Fan: When I see this move, for me, it’s just a big shock. What? Normally, humans, we never play this one because it’s bad. It’s just bad. We don’t know why, it’s bad.

Fan (a European Go champion) makes my point very well. Humans have played Go a long time, and they have internalized certain regularities (indeed, in this case, players know the move is bad without understanding why!). AlphaGo is playing a move based on regularities unappreciated by human players. In fact, one of the creators later peers inside AlphaGo’s program and learns that AlphaGo put the probability that a human would play this move at 1 in 10,000.

As the game unfolds, the brilliance of the move becomes clear. Lee Sedol later discusses this move:

Lee Sedol: I thought AlphaGo was based on probability calculation and it was merely a machine. But when I saw this move, I changed my mind. Surely, AlphaGo is creative. This move was really creative and beautiful… This move made me think about Go in a new light. What does creativity mean in Go? It was a really meaningful move.

AlphaGo goes on to win the game, with move 37 eventually seen as a turning point.

I don’t know the nature of the regularity that AlphaGo exploited. It might have been the kind of thing easily explained but something human players had simply missed for millennia. That strikes me as unlikely. It might have been something easy to explain (if AlphaGo had been trained to do so), but only exploitable if you have the capacities of AlphaGo (e.g., the ability to track dozens of positions in parallel). I prefer to believe it was something new: a regularity impossible to express in our favored forms.

Uncanny Genius

While neural networks are inspired by the brain, it’s not clear to what extent our brains actually work that way. I don’t have the expertise to weigh in on this debate. But, to conclude, let’s assume the picture painted above is broadly applicable to the human brain. Doing so can provide a plausible explanation for the mysterious judgments of geniuses, when they simply intuit an answer with uncanny precision, unable to provide an explanation for their insight.

Earlier, I gave the example of how brain structures, organized like neural networks, could identify your grandma from a sea of faces. The interesting thing here is that we experience this as automatic. We just “know” that’s grandma, without access to the underlying categorization process in our own heads.

Our ability to just “know” Grandma’s face doesn’t strike us as particularly mysterious. But when well-trained neural networks in our brains do less common things, it can seem mysterious and magical. An expert in a particular domain – mathematics, engineering, science – sees countless examples in their domain over a career. Each example trains their internal representation of the domain. Then, one day, facing a novel situation, they just know what to do. And their inability to explain themselves leaves observers dumbstruck and in awe.

I’ll close with an example of this from Gary Klein’s study of firefighters (recounted in Superforecasting by Philip Tetlock and Dan Gardner). An experienced firefighting commander is combatting a routine kitchen fire that is behaving a bit strangely. The commander is seized with an uneasy feeling. He orders everyone out of the house. Moments later, the floor collapses. It turns out the true source of the fire had been the basement.

How did he know trouble was afoot? We can imagine the neurons and connections in the firefighter’s head were tuned by countless experiences with fire, until their structure encoded regularities in fires impossible to clearly express in rules, metaphor, or probabilities. Just like AlphaGo, the commander was unable to explain how he knew to get out. He described it as ESP.

 

What is innovation? Fundamentals

 

“The thing that hath been, it is that which shall be; and that which is done is that which shall be done: and there is no new thing under the sun.” – Ecclesiastes 1:9 (King James Bible)

Ecclesiastes is wrong. New things happen all the time these days. Airplanes, iPhones, Wikipedia, nuclear weapons, chemotherapy, and skyscrapers were new things. Evolution, quantum mechanics, general relativity, Marxism, and computer science were new things. Facebook, Amazon, SpaceX, Microsoft, and Walmart were new things. Star Wars, Harry Potter, Guernica, and YMCA (the song) were new things. Shakespeare would have been new to the cynical narrator of Ecclesiastes. And new things had been emerging for millions of years when Ecclesiastes was written. True, at the time it would have been hard to see novelty. But mammals, fish, insects, and multi-celled life were all new long before Ecclesiastes was new.

Did these things bear no resemblance to that which came before? Of course not. Do they resemble various antecedents? Certainly. Much of this series will explicitly explore these resemblances. But to say these are therefore not “really” new is to miss something vital.

This is a blog about the new things under the sun. It is a blog about innovation and the emergence of reproducible novelty. This is the first in a planned series of posts answering the question “how does innovation happen?” This introduction sets the series up and lays out the way I think about innovation. With that foundation established, I hope the remaining posts in the series will stand on their own.

Defining innovation

By innovation, I mean something quite broad. For sure I mean to include new physical technologies, such as the iPhone. But I want innovation to also encompass a much wider class of human creation. By innovation, I also mean to include academic theories. And art. And I want to include new ways of organizing people – e.g., into armies, churches, and companies. And when I say innovation, I don’t even want to limit it to humans. Nature innovates in the proliferation of countless new species and life forms. And one day humans might outsource their innovation to artificial intelligence.

For the purposes of this blog, I adopt a big tent definition: innovation is the emergence of things that are novel, interesting, and reproducible. “Novel” and “interesting” are self-explanatory, albeit a bit in the eye of the beholder. In contrast, it may not be immediately obvious why reproducibility ought to matter in a definition of innovation. Reproducibility implies two main things.

First, it limits us to things that can, in principle, be classes rather than specific instances. If the thing is a complete one-off then it’s not an innovation, even if it’s new and interesting. For example:

  • The iPhone was an innovation because Apple can (and does) churn them out by the millions. Steve Jobs’ personal iPhone might be interesting (for example, a museum might want to buy it). It was new at one time. But we can’t make more of them, so it’s not an innovation. It’s an artifact.
  • The “ride-share platform firm” is an innovation with several examples (Uber and Lyft being most prominent). Uber itself isn’t an innovation because it’s a singular firm. An exact copy isn’t possible and wouldn’t be Uber… it would be a ride-sharing firm. But if Uber develops a new strategy that others can copy, at least in principle, then that new strategy would be an innovation.
  • The human species was an innovation when it emerged. Jeff Bezos is not an innovation because we can’t make more of him, even if he was new at one time and is an interesting individual.

So when I speak of innovation, I am talking about the creation of a new blueprint for a category of things. I’m not talking about a new and interesting singular instance. However, if I were to leave it at that, I would be leaving in a lot of things we usually don’t think of as innovation. Specifically, I would be leaving in anything new and interesting that happens repeatedly. Things like car crashes, volcanic eruptions, and sunsets.

This is where the second implication of reproducibility comes in. When I say reproducible, I mean the replication requires access to information embodied in the thing. It requires access to a blueprint, or, failing that, access to the thing itself. In this specific (idiosyncratic?) sense, car crashes, volcanic eruptions and sunsets are not reproducible. Instead, they are just things that happen whenever physical conditions are right. This might be often, so that they happen repeatedly. But in each instance, there is no reference to earlier instances or blueprints. Reproducing a new thing means using information to recreate it without rediscovering it.

In this series, the primary objects of study will be organisms, technologies, organizations, and ideas. Sometimes the boundaries between these objects are fuzzy. Don’t get hung up on that. The distinctions aren’t really important, so long as they are new things under the sun.

What’s the challenge?

So innovation requires creating something new, interesting, and reproducible. Of these, the second is the really hard part.

It’s not very hard to come up with something new. Kick a bucket of blocks, scatter paint on a wall, or string letters together in random gibberish. With enough stuff, almost any bit of disorder is an unprecedented configuration of atoms. Neither is it very hard to make things reproducible. If you make the incentives strong enough, a dedicated person can carefully place the blocks, paint the scattered pattern with a brush, or type out letters. Indeed, with enough care and description, lots of disorder can be reproduced. But if it’s not interesting, why would anyone want to?

“Interesting” things are rare in the universe. A blind leap into the unknown will frequently be new, and if you take good notes, reproducible. But a blind leap into the unknown will almost certainly fail to discover something interesting.

This intuition underlies a common (but misguided) argument against evolution by natural selection and for intelligent design (https://en.wikipedia.org/wiki/Junkyard_tornado): the chance that a randomly swirling cloud of atoms will settle into the shape of DNA (much less a living cell) is lower than the probability of a tornado whirling through a junkyard and assembling a Boeing 747 by chance. As a basic premise, I think that’s probably right. And it seems to be a fundamental operating principle of the universe. A bunch of monkeys banging on keyboards won’t produce Shakespeare by chance. Well, unless you have a long time to wait and a lot of monkeys. And only children think you can invent something useful by randomly connecting disparate bits of technology.
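Just how long a wait are we talking about? Here is a rough back-of-the-envelope sketch. It assumes a hypothetical keyboard with 27 keys (26 letters plus a space bar) and a monkey that hits each key with equal probability, and it asks how many tries it would take, on average, to type a single short phrase.

```python
def expected_attempts(phrase, keys=27):
    """Average number of random tries before one specific phrase appears.

    Each try types len(phrase) keys, each chosen uniformly from `keys`
    options, so a try succeeds with probability (1 / keys) ** len(phrase).
    """
    return keys ** len(phrase)

attempts = expected_attempts("to be or not to be")
print(f"{attempts:.2e}")  # on the order of 10**25 tries for 18 characters
```

That is for eighteen characters. A full play is hopeless, which is the point: blind leaps almost never land on anything interesting.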

Thus the challenge of innovation is finding a way to take a leap into the unknown and to find something interesting when you land. The key is, innovation is not a random leap into the unknown. Innovation is a considered leap into the unknown.

What guides our leaps is knowledge of regularities in the world.

A regularity is a pattern in reality. The laws of physics are regularities, but most regularities do not rise to this universal level. A regularity may only hold in your local environment. It may be temporary. It may be inconsistent. To be useful to an innovator, it just needs to be exploitable. With information about regularities, innovators do far better than random chance.

This will be clearer with some examples.

Regularities in technologies, organisms, organizations, and ideas

Let’s start with physical technologies. In The Nature of Technology: What it is and How it Evolves, economist Brian Arthur makes the observation that all technologies are built from sub-components. These sub-components are themselves composed of sub-sub-components, which are in turn composed of sub-sub-sub-components and so on. The F-35 jet, for example, is composed of:

the wings and empennage; the powerplant (or engine); the avionics suite (or aircraft electronic systems); landing gear; flight control systems; hydraulic system; and so forth. If we single out the powerplant (in this case a Pratt & Whitney F135 turbofan) we can decompose it into the usual jet-engine subsystems: air inlet system, compressor system, combustion system, turbine system, nozzle system. If we follow the air inlet system it consists of two boxlike supersonic inlets mounted on either side of the fuselage, just ahead of the wings. The supersonic inlets… (Arthur 2009, pg. 40)

And on and on until we arrive at some “raw element” of technology. Arthur argues that these fundamental elements are “captured natural phenomena.” I prefer the term exploited regularities. Arthur provides a non-exhaustive list of regularities (pg. 52) that are commonly exploited in technology (I have broken them up into bullets):

  • A fluid flow alters in velocity and pressure when energy is transferred to it (used in the compressor);
  • certain carbon-based molecules release energy when combined with oxygen and raised to a high temperature (the combustion system);
  • a greater temperature difference between source and sink produces greater thermal efficiency (again the combustion system);
  • a thin film of certain molecules allows materials to slide past each other easily (the lubrication systems);
  • a fluid flow impinging on a movable surface can produce “work” (the turbine);
  • load forces deflect materials (certain pressure-measuring devices);
  • load forces can be transmitted by physical structure (load bearing and structural components);
  • increased fluid motion causes a drop in pressure (the Bernoulli effect, used in flow-measuring instruments);
  • mass expelled at a velocity produces an equal and opposite reaction (the fan and exhaust propulsion systems).

Technological elements exploit these and many other regularities to do something interesting. In Arthur’s framing, the interesting thing technologies do is “fulfill a human purpose.” These elements are then combined with others, and built up into modules and assemblies and other subcomponents, which are themselves rearranged and recombined with others, until a set of interdependent regularities are coordinated and arranged to perform some complex desired task. The exploited regularities in a technology are like the disparate voices of instruments in an orchestra, coordinated by score and conductor to produce music.
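The hierarchy Arthur describes can be sketched as a nested data structure. The component names below come from the excerpt above. The structure is deliberately partial, and the empty entries simply mark parts the excerpt does not decompose further.

```python
# Partial hierarchy from the Arthur excerpt. An empty dict means
# "not broken down any further in this sketch", not "has no parts".
f35 = {
    "wings and empennage": {},
    "powerplant": {
        "air inlet system": {"supersonic inlets": {}},
        "compressor system": {},
        "combustion system": {},
        "turbine system": {},
        "nozzle system": {},
    },
    "avionics suite": {},
    "landing gear": {},
    "flight control systems": {},
    "hydraulic system": {},
}

def depth(component):
    """Number of levels of sub-components below this component."""
    if not component:
        return 0
    return 1 + max(depth(sub) for sub in component.values())

def walk(component, path=()):
    """Yield the path to every part with no listed sub-components."""
    for name, sub in component.items():
        if sub:
            yield from walk(sub, path + (name,))
        else:
            yield path + (name,)

print(depth(f35))  # 3
for path in walk(f35):
    print(" > ".join(path))
```

In a complete version, every path would keep going until it bottomed out in an exploited regularity.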

It isn’t only the “atomic elements” of technologies that exploit regularities. Frequently, a suite of components collectively allows for new regularities to be exploited. For example, the atomic bomb taps into regularities about the behavior of highly concentrated uranium isotopes. At a certain density, naturally occurring atomic decay can trigger a chain reaction, with the release of tremendous energy as a side-effect. Accessing this regularity, however, requires an entire mini-industry of other technologies to create the uranium and push its density past the critical point.

And so technologies fundamentally rely on the exploitation of regularities to do something interesting. But this applies to organisms as well as physical technologies. After all, in a sense, organisms are nothing but very complicated machines. They even share the hierarchical nature of technologies. Our bodies are built from a set of organ systems, which are themselves composed of differentiated tissues, which are themselves built from an army of distinct cell types, which are themselves assembled from a huge collection of molecular machines.

And these tiny molecular machines run on regularities in nature, just like human technologies. Life’s Ratchet: How Molecular Machines Extract Order from Chaos by Peter Hoffmann provides an overview of the regularities exploited by the smallest units in our bodies. These regularities differ from the kind exploited by human technology, because they exist only at the tiny scale within a cell. Yet, in large numbers, carefully orchestrated, they produce us. Examples include:

  • electric charges of the same sign repel, opposite ones attract (used to “lock” proteins and molecular machines into stable configurations)
  • thermal energies are random (used to jostle molecular machines out of stable configurations)
  • at the nano-scale, thermal and electrostatic forces are approximately equal (used by molecular machines to move from one stable configuration to another)
  • events that individually occur with very small probability almost certainly occur at least once, given many chances (used by cells to ensure desirable small probability events happen at least once)
  • In a positive feedback loop, a small initial cause can have a large final effect (used by cells to amplify desirable events that happen rarely)
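The fourth regularity in the list above is worth a quick calculation. If an event has probability p on each independent try, the chance it happens at least once in n tries is 1 − (1 − p)^n. The numbers below are made up for illustration, and the sketch assumes the tries are independent.

```python
def at_least_once(p, n):
    """Probability that an event with per-try probability p
    happens at least once in n independent tries."""
    return 1.0 - (1.0 - p) ** n

# A hypothetical one-in-a-thousand event, given more and more chances.
for tries in (1, 100, 1_000, 10_000):
    print(tries, round(at_least_once(0.001, tries), 5))
```

With ten thousand chances, a one-in-a-thousand event becomes a near certainty. Cells have chances to spare.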

The regularities listed above have the character of universal natural laws. But organisms also rely on mere local regularities. For example, many plants cannot grow well when moved to new latitudes. They use the length of day as a clock to coordinate their growing cycles, and moving them to a new latitude, where the length of day changes differently over the year, disrupts these signals. As an aside, note again that the large-scale organism exploits regularities (e.g., length of days over a year) that are not used by the individual molecular machines that comprise it. The whole is more than the sum of its parts.

And neither are organisms the only things that rely on local regularities. Arthur, for instance, points out that a whole host of technologies break down when moved into space. They require gravity to function as intended. More broadly, there is a rich literature in the economics of growth about “local technologies.” Most technologies are invented in wealthy countries, and are often less productive when deployed in poorer ones. Tacit in their successful operation are all sorts of assumptions about regularities in the prices of inputs, the knowledge of users, the tasks they will be asked to perform, and the availability of complementary goods and services.

Next, what we have said already about technologies and organisms can also be applied to human organizations. They too organize themselves in branches, divisions, phalanxes, units, and so on. And they too exploit regularities. It’s just that these regularities pertain to human society. More so than natural phenomena, the regularities in society are local. Here are a few regularities in human society, ordered roughly from the more to less universal:

  • humans are adept at learning to copy new tasks, if shown how
  • humans can work with attention to detail for 4-8 hour blocks
  • more of a good or service will be demanded if the price declines
  • human desires for specific goods and services are relatively stable
  • people will accept fiat currency as a unit of exchange
  • money can safely be invested and earn a 2% return
  • the electricity will only rarely go out
  • most consumers know how to use the Apple App Store

Lastly, art too relies on regularities. Being “interesting” in art can mean being “thought-provoking”, “beautiful”, “novel”, and other things. Art relies on regularities at several levels. Audio-visual forms exploit regularities in physical laws to transmit sound and light. They may also exploit regularities in human psychology to elicit reactions in viewers. For example, certain techniques in horror movies can regularly elicit feelings of tension and surprise in audiences. At another level, art can exploit cultural regularities. Certain symbols, motifs, and themes may be more or less recognizable to the audience.

I’ll stop the tour at this point. We’ve seen briefly how technologies, organisms, organizations, and art all exploit regularities. Some of these regularities hold fairly universally. Others are highly local. Later, I will assert that innovations in the narrative arts and knowledge itself rely on regularities. But first, we need to introduce one more concept: representations.

Representations and Regularities

Regularities are “out there” in nature, waiting to be exploited. But to reproduce an innovation, you need a way to communicate the regularity to the reproducer. A representation is information about a regularity that can be communicated and conveyed.

There are several kinds of representation. In this series I focus on five. Their borders are fuzzy, rather than sharp. In later posts, I’ll talk about each in more detail. Here I briefly introduce them.

Rules

This includes logical statements, causal assertions and quantitative equations. At bottom rules insist things are related in a certain, definite way. The discussion of regularities in the preceding section frequently used rules as representations. Examples included:

  • “a fluid flow alters in velocity and pressure when energy is transferred to it”
  • “electric charges of the same sign repel, opposite ones attract”
  • “more of a good or service will be demanded if the price declines”

Other examples of rules include:

  • Tuesday comes after Monday.
  • All mammals are warm-blooded, have hair of some sort, and mammary glands.
  • Every action has an equal and opposite reaction.
  • The Pythagorean theorem.
  • E = mc^2
  • Light travels at 299,792,458 meters per second.

Humans seem to prefer expressing regularities as rules whenever possible. Rules are clear and straightforward. They can be chained together to form new rules. Alas, many of the regularities in nature cannot easily or concisely be expressed as rules.

Probabilistic statements

In a host of domains definite ironclad rules don’t exist. Things “usually” work one way, or “sometimes” work. Some notion of uncertainty, randomness, and probability is necessary to communicate many regularities. Probability allows us to exploit regularities in domains where we lack the information or computational power to develop rules. We also used probabilistic statements in our preceding discussion of regularities. Examples included:

  • “technologies [invented in rich countries] are often less productive when deployed in poorer ones”
  • “events that individually occur with very small probability almost certainly occur at least once, given many chances”
  • “the electricity will only rarely go out”

Other examples of probabilistic statements include:

  • There is a 50% chance of getting heads when you flip a fair coin.
  • Hillary Clinton has a 70% probability of winning the 2016 presidential election.
  • Thunderstorms sometimes generate tornadoes.
  • Most people are less than 7 feet tall.
  • The probability of a given quantum state is equal to its amplitude squared.

Despite their utility, humans are often resistant to probabilistic thinking, and frequently compress probabilistic statements into rules (for example, many interpreted a 70% probability that Hillary Clinton would win simply as “Hillary is going to win”). The book Superforecasting by Philip Tetlock and Dan Gardner provides a good overview of human resistance to thinking probabilistically.

Nonetheless, rules and probabilistic statements are the preferred representation in the quantitative sciences. There are a whole host of rules and probabilistic statements (the field of statistics) that can be applied to them. We can use these meta-rules to combine, transform and derive new probabilistic statements from old ones. But rules and probabilistic statements are relatively new in the history of the universe. We now turn to older forms of representing regularities.

Metaphors and analogies

A fuzzier but even more widespread form of representation is the metaphor or analogy: this thing you don’t know so well is like this other thing you do know. Metaphors and analogies tie together bundles of attributes and properties. They allow us to map them from one setting to another that shares some of the same attributes and properties. There are some who argue metaphors and analogies are the basic structure of thought (Surfaces and Essences: Analogy as the Fuel and Fire of Thinking by Douglas Hofstadter and Emmanuel Sander is one example I hope to write about later).

I too have used metaphor and analogy to represent regularities in this post. Examples include:

  • “the challenge of innovation is finding a way to take a leap into the unknown and to find something interesting when you land.”
  • “like the disparate voices of instruments in an orchestra, coordinated by score and conductor to produce music.”
  • “After all, in a sense, organisms are nothing but very complicated machines.”

Innovation is not literally about jumping into strange places. But that is an idea we have some intuitions about, and those intuitions can be transferred. Technologies aren’t literally orchestras. But our intuitions about the scale and complexity of how an orchestra unifies disparate sounds are useful to transfer. There are differences between organisms and machines, but they do not matter for the purposes of the discussion at hand.

Metaphors are natural to us and capable of capturing complex regularities not easily expressed as rules and probabilistic statements. And they are useful for guiding our excursions into the unknown. For example, someone who is familiar with dogs knows they generally have a bundle of certain attributes: hair, four legs, sharp teeth, etc. Suppose that person encounters an unknown animal with many of these same attributes, perhaps a wolf. If it starts growling at them, they can use the analogy of dog behavior to infer what might happen next (they might get bitten). But the analogy gets less useful the farther the new example is from the category’s archetypes. Do lions growl before they strike? What about alligators? What about a man in a gorilla suit?

Neural Networks

The previous three types of representation are familiar to us, because they can all be conveyed in language, and this is the primary way humans convey information to each other. But there are other ways to encode regularities as information. Our brains, for example, give us intuitions and “feelings” we cannot always express in language. And babies and animals can learn regularities without language too.

It turns out regularities can also be represented in the structure of special kinds of interconnected networks. The brain probably operates, at least in part, on these principles. Even if it does not, this method of representing regularities has recently been highly successful in artificial intelligence. We will come back to this form of representation, because it is not easy to describe succinctly if one is not already familiar with the concept.

Instantiations

Finally, we come to the oldest, simplest, and most robust way of representing regularities: as an instantiation of whatever is exploiting the regularity.

For example, suppose I discover an alien technology operating on principles completely foreign to me, and lacking any explanatory text. I may still be able to reproduce it by exactly replicating the technology (atom by atom if necessary). More prosaically, there are frequently techniques and processes that are very difficult to convey in language. For example, teaching someone how to shoot a basketball is probably much easier to do by demonstration than by any form of written or verbal communication. Complicated lab techniques may also need to be demonstrated rather than communicated. It may even be that the performer can’t explain why this technique works. They simply know “if you copy this technique, it will work.”

This is the type of representation most frequently used by nature itself. Nature stores the regularities it has discovered in the physical design of its organisms, which pass on instructions about how to replicate themselves, rather than instructions about the regularities they exploit.

Narratives and Knowledge; Representations and Regularities

We now return to knowledge and narratives, which I earlier asserted also rely on regularities. Let’s begin with knowledge. When I speak of innovation in knowledge, I am essentially referring to new explanations and theoretical constructs, rather than the documentation of new phenomena. For my purposes, “explanations” are essentially a more informal form of “theories” (the kind of thing often laid out in an academic article or book).

The unusual thing about explanations and theories is that the raw materials they are made from are themselves representations. Explanations and theories in their simplest form essentially are representations. In more complex forms, they are large sets of ordered and interacting representations. They deploy rules, probabilistic statements, and metaphors to convey information about regularities.

Inventing new explanations and theories is a matter of discovering new representations, or more likely, combining previously known representations in novel and interesting ways. What makes an explanation or theory interesting? Here, being “interesting” might mean the explanation “allows one to make predictions about the world” (the classic Popperian model of a good theory). But it could also mean the explanation is “plausible,” “provocative,” “beautiful,” “thought-provoking,” “widely applicable,” and so on.

Moreover, just as a random collection of technological components is unlikely to do something interesting, a random collection of representations is also unlikely to be interesting. Interesting explanations and theories exploit meta-regularities about the relationships between different representations. The rules of logical inference, for example, form a set of meta-regularities about representations. One of these meta-regularities, represented as a rule, might say “If you have a representation of the form ‘if A then B’ and a representation of the form ‘if B then C’, then the representation ‘if A then C’ corresponds to a regularity as well.” Mathematical and statistical operations of all stripes fall under a similar meta-regularity. Other examples include:

  • Occam’s Razor: the simplest representation is most likely to best express a regularity.
  • Accordance with data: representations that generate predictions that match data are more likely to accurately represent regularities.
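
The chaining rule quoted above (“if A then B” plus “if B then C” gives “if A then C”) is mechanical enough to sketch in code. In this minimal illustration, each rule is stored as an (antecedent, consequent) pair; that encoding, the function name, and the example rules are all my own assumptions rather than anything canonical.

```python
def chain_rules(rules: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Repeatedly apply 'if A then B' + 'if B then C' => 'if A then C'
    until no new rules appear. Each rule is an (antecedent, consequent) pair."""
    derived = set(rules)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so we can add to `derived` inside the loops.
        for a, b in list(derived):
            for b2, c in list(derived):
                if b == b2 and (a, c) not in derived:
                    derived.add((a, c))
                    changed = True
    return derived


if __name__ == "__main__":
    # Hypothetical starting rules.
    known = {("rain", "wet ground"), ("wet ground", "slippery road")}
    # Print only the rules that chaining added.
    print(chain_rules(known) - known)
```

Starting from two rules, the meta-rule yields a third (“if rain then slippery road”) that was never stated directly, which is the sense in which old representations can be combined into new ones.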

Finally, what of narrative art? As noted, to be interesting in art can mean a lot of things: “thought-provoking”, “revelatory”, “beautiful”, “thrilling”, “surprising”, etc. Narrative arts can use a variety of techniques to elicit these responses, but a plausible match with regularities in the world forms a unique aspect of narrative art. Do the characters’ motivations “make sense” in the context of the world? That is, do the characters’ actions and thoughts align with the audience’s pre-existing internal representations about how people act? In some cases, a book explores unfamiliar settings, in which the audience knows few regularities with which to judge the plausibility of the narrative. In these kinds of works, the interest sometimes lies in uncovering new regularities in the fictional world. It is a regularity in our nature that we enjoy discovering regularities! (Indeed, I’m banking on that to find an audience for these posts!)

A Theory of Innovation

To summarize.

Things tend to be interesting when they are hierarchical systems of interacting components that exploit regularities in the world. Creating new interesting things (innovation) requires leaping into the unknown and assembling a new system. If we had nothing to guide us, these leaps would rarely pay off. Innovation is possible largely because it uses regularities about the way the world works to guide its leaps. It assumes the regularities that hold in the world we know will continue to hold in the unknown (a leap of faith not always validated!). Innovators do not directly perceive regularities, but rather representations of them. Representations can take several forms, and I focus on five: rules, probabilistic statements, metaphors, neural networks, and instantiations. Finally, representations can themselves be woven together into hierarchical systems called explanations and theories. These “innovations” are themselves interesting when they exploit meta-regularities about representations.