Innovation: Why do Metaphors Work?

The world is full of regularities. One way we encode information about these regularities is with metaphor. When I write about “metaphor”, I don’t mean poetic comparisons. In this post, metaphors or analogies (I use the terms interchangeably) encode information about regularities. They assert that something we don’t understand is similar to something we do. Innovators then use these metaphors as maps to guide their travels in the unknown.

Some famous examples:

  • Using Uber as a template for a new kind of business (“uber for X”)
  • Applying the lessons of lean manufacturing to startups (the “lean startup” model)
  • Viewing the atom as a miniature solar system (the Bohr model)
  • Extending the principle of “no privileged frame of reference” to travel near the speed of light and to accelerating observers (special and general relativity)

In each case, a more familiar domain (existing business or scientific models) is assumed to apply in a domain that differs in fundamental ways.

This is utterly commonplace. But if you step back a bit and think about it, it’s puzzling. Why does this leap of faith ever work? More specifically, why does it work better than chance? In some cases, we can perhaps assert that the same phenomena underlie the two cases. For example, maybe all “Uber-type businesses” rely on the same underlying regularities. In that case, using Uber as a metaphor for another Uber-type business is really just a way of drawing lessons from the broader category of “Uber-type businesses.”

But I think there are many more cases where the examples do not appear to draw on the same underlying phenomena in a meaningful way. The Bohr model is particularly egregious; why should the behavior of planets have anything to do with the behavior of the atom? In fact, they differ in really important ways; yet it was a fruitful metaphor.

Before we answer these questions, let’s take a detour.

Economic Modeling

I’m an economist. Much of my professional life has revolved around little mathematical models of social phenomena. These models are simplistic. They are hard to understand without training. And they don’t give us predictive ability with anything like the accuracy of physics. So why do we bother?

In a wonderful little book on economic methodology, Dani Rodrik provides some reasons. First off, the reason economists use math is not because we are so clever, but rather because we are so dumb. Math forces a model to have internal consistency. You have your assumptions, you have your conclusions, and there are unambiguous rules for deriving the one from the other. Many things that seem obviously true when expressed in language are revealed to be internally inconsistent when expressed in math.

That explains the math, but not the simplicity. Why not more complicated and realistic models, built from math? There are a few reasons.

First, let’s discuss why simple models work at all in a complex world. Let’s assume the outcomes of any social process are derived from the interaction of underlying factors. There might be a huge number of these factors, and they can interact in all sorts of different ways. However, there’s no reason to believe these factors are equally important to the outcome. In all cases, some features will matter more than others. If a small number of interacting factors play a big role relative to the others, then understanding those goes a long way to understanding the situation. If not, we label the problem “chaotic” or “random” or “complex.”

So there’s a selection issue at play. In most cases, a small number of factors will matter more than the rest. If we model those and say the rest is random, then we will do about as well as we can. In cases where a small number of factors is not sufficient to make meaningful predictions, we simply don’t model the situation, and we make our decisions more or less at random.
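
As a rough illustration of this point, here is a minimal sketch in Python. It rests on assumptions of my own choosing rather than anything in Rodrik: the outcome is a weighted sum of independent factors with equal variance, and the weights are made-up numbers. Under those assumptions, each factor contributes its squared weight to the variance of the outcome, so we can ask how much a model with only the top few factors captures.

```python
def variance_share_of_top_factors(weights, k):
    """Share of outcome variance captured by the k largest-weight factors.

    Assumes outcome = sum(w_i * x_i) with independent, unit-variance factors,
    so factor i contributes w_i ** 2 to the total variance.
    """
    contributions = sorted((w * w for w in weights), reverse=True)
    return sum(contributions[:k]) / sum(contributions)


# Hypothetical weights: two dominant factors and five minor ones.
skewed = [5.0, 3.0, 0.5, 0.4, 0.3, 0.2, 0.1]
# Hypothetical "complex" case: every factor matters equally.
flat = [1.0] * 7

for name, weights in [("skewed", skewed), ("flat", flat)]:
    share = variance_share_of_top_factors(weights, k=2)
    print(f"{name}: top 2 of {len(weights)} factors capture {share:.1%}")
```

With the skewed weights, a two-factor model captures roughly 98% of the variance; with the flat weights it captures under a third. The first is the kind of situation we bother to model, and the second is the kind we label “complex.”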

So simple models can be useful. But shouldn’t a more complex model be even more useful? The answer would be yes, if we started with the correct simple model and then added some complexity. The problem is that as models get more complicated, it gets harder and harder for practitioners to identify which features matter most. The second reason economists use simple models (according to Rodrik) is to isolate important causal mechanisms. If you are going to get something right, it had better be the most important part. You want to get the skeleton right, so to speak, not the elasticity of the skin.

What do you want to get right? Rodrik uses the term “critical assumptions.” For Rodrik, the “critical assumptions” in an economic model have a specific meaning. They are the assumptions whose modification produces substantive differences in the model’s conclusions. For example, if you want to know what will happen to employment when you raise the minimum wage, you can choose between at least two models. In a perfectly competitive model, an increase in the minimum wage will lower employment. But in a model where firms have market power over hiring (a monopsony model), an increase in the minimum wage may raise employment. Both models are “correct” so long as their critical assumptions are met. In this case, the critical assumptions pertain to the degree of market power for hirers.
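
To make the contrast concrete, here is a minimal sketch in Python of the two textbook models. The functional forms and numbers are my own assumptions for illustration, not Rodrik’s: the marginal revenue product of labor is linear (a - b*L), the labor supply curve is linear (w = c + d*L), and the function names are hypothetical.

```python
def competitive_employment(a, b, c, d, min_wage=None):
    """Employment when firms take the wage as given (no market power)."""
    l_star = (a - c) / (b + d)   # where labor demand meets labor supply
    w_star = c + d * l_star      # market-clearing wage
    if min_wage is None or min_wage <= w_star:
        return l_star            # the minimum wage does not bind
    # A binding minimum wage moves firms up their labor demand curve.
    return max(0.0, (a - min_wage) / b)


def monopsony_employment(a, b, c, d, min_wage=None):
    """Employment when a single employer faces the whole supply curve."""
    # The firm equates marginal revenue product with the marginal cost
    # of labor, c + 2*d*L, which exceeds the wage it actually pays.
    l_m = (a - c) / (b + 2 * d)
    w_m = c + d * l_m
    if min_wage is None or min_wage <= w_m:
        return l_m               # the minimum wage does not bind
    # With a binding floor the wage is fixed, so employment is limited by
    # whichever is smaller: workers willing to work, or workers wanted.
    labor_supplied = (min_wage - c) / d
    labor_demanded = max(0.0, (a - min_wage) / b)
    return min(labor_supplied, labor_demanded)


params = dict(a=20.0, b=1.0, c=2.0, d=1.0)  # made-up parameter values
for model in (competitive_employment, monopsony_employment):
    before = model(**params)
    after = model(**params, min_wage=12.0)
    print(f"{model.__name__}: {before} -> {after}")
```

With these made-up numbers, a minimum wage of 12 lowers employment from 9 to 8 in the competitive model and raises it from 6 to 8 in the monopsony model. The only thing that differs between the two functions is the assumption about the employer’s market power, which is what makes it a critical assumption in Rodrik’s sense.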

Unlike particle physics, the social world is too complex to capture with “the one true model.” Instead, theoretical economics advances by adding many simple models to the library of economic knowledge. Simple models are preferred because they are frequently good enough and because the art of an applied economist is knowing which model to use. Simplicity makes it easier to choose the right model. A good economist understands the critical assumptions that must apply for a model’s predictions to play out in the real world.

Metaphors as Amateur Models

Back to metaphors. What makes for a useful metaphor? A metaphor needs to match features of the object/event to be explained. However, in any real world situation, there are a huge number of features that you could use to match. There are a correspondingly large number of possible metaphors. Good metaphors match the deep/structural features, rather than the surface ones. For example, suppose we encounter the following animal:

Figure 1. An Unknown Animal. What is the best analogy?

We’ve never encountered this creature before. We’re in the unknown. But we can make some reasonable predictions about its behavior by drawing analogies with things we do know. And we have a lot of features to choose from when building a metaphor/analogy. Surface level analogies such as the following may lead us astray:

  • Size: The animal is as big as a whale. Inference: like a whale, the animal is probably harmless.
  • Covering: The animal is feathered like a bird. Inference: like a bird, the animal does not view us as a food source.
  • Color: The animal is brown like my dog. Inference: like my dog, the animal is a friendly ally.

In contrast, the deep/structural feature that matters most is:

  • Predator: The animal is a large predator. Inference: like a bear/lion/alligator, I am in danger!

What makes a feature “structural?” In their exhaustive book Surfaces and Essences: Analogy as the Fuel and Fire of Thought, Hofstadter and Sander write:

In the case of problems to be solved, structural features are those whose alteration would change the goal of the problem or the pathways to solving the problem. [p. 340]

So the deep features are the ones that, if different, would make a big difference. An alternative perspective on what makes a good metaphor emphasizes the relations among features rather than the features themselves. In The Stuff of Thought, Steven Pinker argues:

the power of analogy doesn’t come from noticing a mere similarity of parts… It comes from noticing relations among parts, even if the parts themselves are very different. [p.254]

Again, the key is to match the “deep” features, not surface similarities.

This language is remarkably similar to Rodrik’s thoughts on good economic modeling. Just as economists have a large set of models to choose from, we all have a practically boundless set of possible metaphors. And just as the key to a good economic model is getting the critical assumptions right, the key to a good metaphor is getting the structural features (and their interactions) right. Hofstadter and Sander even define structural features in a way quite similar to Rodrik’s critical assumptions. Both are the features that can’t be changed without significantly impacting your inference.

Indeed, modeling and metaphor appear to be part of the same extended family. Metaphors are like amateur models. And the utility of models in economics is that they serve as useful metaphors for complicated real-world social phenomena.

Why do metaphors work at all?

If modeling and metaphor belong to the same family, then this provides some insight into why metaphors can be so useful. Economic modeling is a practice designed to give us (limited) insights in very complex settings. It does this by stripping things down to a small number of important causal mechanisms.

Just as with social phenomena, in any real world situation where we are searching for a metaphor, there will be a huge number of potentially relevant factors. By chance, some of these will matter more than others in determining the outcome we care about. Those are the ones we should get right. We do that by matching the deep features of the object to something we already understand and which shares the same deep features.

Metaphors work because in situations that we don’t shrug off as fundamentally unpredictable, a small number of features interact and drive the outcome. When metaphor works, I suspect it’s because the number of situations in the universe where a small number of features matter is much larger than the number of qualitatively different ways a small set of features can interact. For any possible way a small set of features can interact, there are probably a large number of corresponding examples. Each of these is a candidate for a useful metaphor. Each captures the way the small set of features can interact.

Take the Bohr model as an example. If we care about the outcome “stability of an atom,” there are many possible features we could investigate: position of the atom in the universe, duration of the atom, number of constituent elements, size of elements, mass of elements, velocity of elements, etc. Some of these matter (number, mass, velocity), and others do not (position in the universe, duration of the atom). The set of features that drives the outcome is small, and so finding another example where similar features drive the outcome may be fruitful. The solar system is one such example, but there could have been others.

Why not models?

Economic modeling and the utility of metaphors both rely on a small enough set of features and interactions for our brains to track. However, unlike modeling, metaphors bring with them the baggage of a large number of other “irrelevant” features. So why do we bother at all with metaphors? Why don’t we leap straight to models of feature sets, as we seem to in economics? Why do we add the extra confusion of a second real world example, and all its baggage of irrelevant features?

Again, I think economic modeling has some lessons. When using a model to make inferences, there are two different mistakes we can make:

  1. Our model can be internally inconsistent
  2. Our model can be internally consistent, but we choose the wrong one.

Metaphors limit our possible mistakes to the second case.

Economists use math to ensure their models are internally consistent. I suspect metaphors perform the same function. A metaphor based on real world phenomena must be internally consistent, if only because it’s happened in the real world. If I’ve encountered A, then all the features of A must be able to fit together without contradicting each other. The existence of A in the real world has proven those features and their hypothesized interactions are internally consistent. Going forward, if I use A as a metaphor for a new situation, it’s possible that I’ve chosen the wrong metaphor, but at least I haven’t chosen something that’s impossible.

Metaphors have a couple of other useful features. First, models and scientific theories are built up from the orderly interaction of different assumptions. With the exception of computer simulations and other black box techniques, we usually understand exactly what is happening in the model. This is necessary to maintain internal consistency and helps us identify critical assumptions. But it’s also limiting. The need to keep models tractable imposes severe constraints on the kinds of assumptions we can make, at least in the case of economics.

In contrast, metaphors are possible to use without true understanding. We can be familiar with something (e.g., the human body, love, traffic jams) and use it as a metaphor without having a deep understanding of how it actually works. This vastly expands our set of tools for inference, but at the cost of making it harder to identify which features are deep/structural.

Second, the need to maintain internal consistency means economic modeling is done in the language of math. More generally, models are built from the interplay of rule-like propositions. Again, this is useful to ensure internal consistency, but is simultaneously limiting. There are many other regularities and interactions in nature that are awkward to express in terms of rules. Metaphor lets us encode information about these regularities.

The predatory dinosaur from above is one such example. We can make inferences about how it might act, even though it would be awkward (I don’t say impossible) to derive these inferences from a mathematical model.


To sum up: metaphors are useful because the outcome of many situations is largely determined by the interaction of a small number of factors. There aren’t that many ways a small number of factors can interact, so there are frequently real-world examples we can draw on that exhibit similar underlying interactions. Using real world examples (instead of models) is useful because it (1) ensures our model is internally consistent, (2) lets us use examples even when we don’t understand exactly how they work, and (3) frees us from the awkwardness of writing everything up in rules/math/logic.

