Are Horror Movies Getting Scarier?

[Figure: Number of films released in the previous 10 years mentioned on any “Scariest Movies of All Time” list (black line), and number of top 25 such films (red line), 1929–2017]

A few weeks ago, I ran across the following argument from Scott Sumner’s blog:

“The film industry is in long-term decline, which happens to all art forms after they express their most potent ideas.  Painting peaked in the 1500s and 1600s.  Pop music in the 1960s and 1970s.  So on film I’m a pessimist.”

Sumner isn’t the only one who seems to think film has peaked. The BFI Sight and Sound top movies poll gives only 4 of its 100 spots to movies made since 1990(!).

This is a bit puzzling for two reasons. First, technology in general continues to improve year after year. Movies are sort of a technology, so why don’t they also get better? After all, surely film-makers are learning all the time from what worked in the past. And they have much improved technical tools. And there are just a lot more movies being made by a lot more people around the world. Why doesn’t that translate to steady improvement in film?

Second, some movie genres don’t seem to have peaked. In particular, one of my favorite genres – horror – appears to be in the midst of a golden age. If horror movies are getting better, why not the rest?

The problem, of course, is that producing great art is about both craft and originality. Craft may indeed be improving all the time, but it gets harder to be original with every passing year. The need to be original is a handicap not faced by technology in general, and likely accounts for the difference between film and technology.

But it would still be interesting to see if the craft of film-making is improving. How do you measure the advance of film-making craft, though? Well, with horror, I think it’s pretty obvious. A well-executed horror movie should be scary. There are lots of techniques related to framing, sound design, and editing that can put an audience on edge or have them jump in fright. If these techniques have to be discovered, then later film-makers have advantages over earlier ones, since they can copy them.

So I decided to see if there’s any evidence horror movies are getting scarier. (Of course this is not a decisive proof about anything, but it was a fun evening project).

Ideally, I would have some human subjects watch horror movies across different eras while being hooked up to instruments measuring their physiological responses: heart rate, sweat, goosebumps, etc. As far as I can tell, no one has done this. And since I doubt the NSF will go for it if I submit a proposal, I’m taking a simpler approach.

I spent my lunch hour searching the internet for lists of the “scariest movies of all time.” I restricted my attention to lists published in the last two years (so I can capture the new golden age of horror), and put together by staff at publications. So no fan lists. I found 8 such lists from Harper’s Bazaar, Esquire, Reader’s Digest, NME, Complex, Newsday, Consequence of Sound, and Hollywood Reporter. These lists ranked (or simply listed) between 10 and 100 movies each, for a total of 302 rankings.

There are 144 unique movies included in these 8 lists. These are all movies that someone thought were scary enough to merit inclusion on a list of top scariest movies of all time. In the figure up top, I plot the number of such movies released in the previous 10 years, from 1929 to 2017. This is the line in black.
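The trailing-window tally behind the black line can be sketched in a few lines of Python. The movie data below is a tiny hypothetical sample for illustration only, not my actual dataset of 144 films:

```python
from collections import Counter

# Hypothetical sample of (movie, release_year) pairs; the real dataset
# has 144 unique movies drawn from the 8 published lists.
movies = [
    ("Psycho", 1960), ("The Exorcist", 1973), ("Halloween", 1978),
    ("The Shining", 1980), ("Hereditary", 2018),
]

releases = Counter(year for _, year in movies)

# For each year y from 1929 to 2017, count list-worthy movies released
# in the previous 10 years (a trailing window over years y-9 through y).
trailing = {
    y: sum(releases[yr] for yr in range(y - 9, y + 1))
    for y in range(1929, 2018)
}
```

With the real data, plotting `trailing` against the year gives the black line in the figure.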

By this metric, there was relatively steady progress in making scary movies up through the 1980s, followed by a major retrenchment in the 1990s. While progress has not been steady (at all), we are indeed in a new golden age of scary movies. The previous peak of 1982 (35 scariest movies released in the previous 10 years) was surpassed in 2010 (37 scariest movies released in the previous 10 years), and the trend shows no sign of abating.

By another measure, though, horror has indeed peaked. To identify the top 25 scariest films, I gave each film a vote for each list on which it was ranked in the top 25. Two lists named 32 and 35 films but did not rank them, so in these cases I assumed each film on the list had a 25/32 or 25/35 probability, respectively, of being in the top 25, and gave it a vote weighted accordingly. I took the 25 films with the most votes to assemble a list of the top 25 scariest movies of all time.
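The voting scheme can be sketched as follows. The rankings and list names here are hypothetical stand-ins for the 8 real lists; only the weighting logic (a full vote for a top-25 ranking, a 25/32 or 25/35 fractional vote for the two unranked lists) comes from the method described above:

```python
from collections import defaultdict

# Hypothetical appearances: film -> list of (list_name, rank or None).
# None means the list did not rank its films.
appearances = {
    "The Exorcist": [("ListA", 1), ("ListB", 3), ("UnrankedC", None)],
    "Hereditary":   [("ListA", 24), ("UnrankedD", None)],
    "Deep Cut":     [("ListB", 40)],
}
# The two unranked lists named 32 and 35 films, so a film on them has
# roughly a 25/32 or 25/35 chance of belonging in that list's top 25.
unranked_weight = {"UnrankedC": 25 / 32, "UnrankedD": 25 / 35}

votes = defaultdict(float)
for film, entries in appearances.items():
    for list_name, rank in entries:
        if rank is None:
            votes[film] += unranked_weight[list_name]  # fractional vote
        elif rank <= 25:
            votes[film] += 1.0  # full vote for a top-25 ranking

# The top 25 scariest films are the 25 with the most votes.
top = sorted(votes, key=votes.get, reverse=True)[:25]
```

Note that a film ranked below 25 everywhere (like the hypothetical “Deep Cut”) collects no votes at all.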

I then repeated the same exercise, plotting the number of top 25 films released in the previous decade. That’s the red line above.

Here we see that the first golden age of horror has not been surpassed. Twelve of the films most likely to appear in the top 25 were released between 1973 and 1982. That’s nearly half in one decade! While there has been a minor resurgence in the 2000s, we are far below the peak.

So which is it? Are movies getting scarier or not? I actually think these figures support the general idea that the craft of making horror movies has risen. To see why, let’s take a look at these top 25 horror movies.

Psycho (1960)
Night of the Living Dead (1968)
Rosemary’s Baby (1968)
The Exorcist (1973)
Black Christmas (1974)
The Texas Chainsaw Massacre (1974)
Jaws (1975)
Carrie (1976)
The Omen (1976)
Halloween (1978)
Alien (1979)
The Shining (1980)
The Evil Dead (1981)
Poltergeist (1982)
The Thing (1982)
A Nightmare on Elm Street (1984)
The Fly (1986)
The Silence of the Lambs (1991)
Ringu (1998)
Audition (1999)
The Blair Witch Project (1999)
The Descent (2005)
Paranormal Activity (2007)
The Strangers (2008)
Hereditary (2018)

These are pretty scary movies! But they’re not just well crafted movies. Many of them are also original or ground-breaking in an important way. Even though these lists are supposedly about what is scary, not what constitutes great art, I don’t think the presence of so much originality is necessarily (or only) because “great” movies are sneaking onto a list of movies that are just supposed to be scary. Instead, I think it’s because we are scared by the unknown, the unfamiliar, and the shocking. The scariest stuff tends to have something original that keeps us on the edge of our seats.

It may be that film-makers today have an edge over their peers in the past, when it comes to making a pretty scary movie. The craft has improved, and it’s never been more likely that a movie released in the recent past is very scary. That’s the black line above. But to make the scariest movies, good craft isn’t enough. You need an original and frightening idea, something the audience hasn’t grown used to. And as time goes on, the threshold for originality keeps getting raised. Hence, the red line.


As we approach the end of the 2010s, here is my list of the greatest (not scariest) horror movies released during the golden age of the last 10 years:

  1. It Follows
  2. Midsommar
  3. Under the Skin
  4. Hereditary
  5. The Wailing
  6. The Invitation
  7. The Witch
  8. Cabin in the Woods
  9. Get Out
  10. The Babadook

Happy Halloween!

Economics of Innovation: Detailed Reading List

This semester I’m teaching a new class on the economics of innovation, targeted at interested undergraduates from a wide range of backgrounds (the only prerequisite is Econ 101), and I thought others might find the reading list interesting. The readings are usually anchored to recent research in the economics of innovation, but to make the course accessible to people without degrees in economics, I’ve tried to find and create accessible overviews of the research. So I hope this is useful for anyone interested in the topic, regardless of their background.

Caveat: these readings are selected to illustrate a big concept in the economics of innovation and to lead to a good discussion. They aren’t necessarily the “best” or “most convincing” paper on the topic and of course this list is far from exhaustive. Needless to say, I don’t necessarily endorse all the arguments made either.


Why study innovation?

Where do ideas come from?

Is necessity the mother of invention?

If people randomly come up with new ideas, does more people = more ideas?

Do new ideas come from new knowledge?

Do better ideas come from experience and learning?

Is innovation just another form of evolution?

Is innovation about combining different things in a new way?

Putting it all together:

What background factors matter for innovation?

Why do big cities generate more ideas?

How is innovation affected by institutions?

Why do so many people accept things as they are?

How can we incentivize innovation?

What are spillovers and what challenges do they present to businesses doing R&D?

What if we just innovate without using the market at all?

Which is better: secrecy or intellectual property rights?

What about innovation prizes?

Can we use tax policy to steer innovation where we want it to go?

Can we boost innovation by increasing the supply of scientists?

How can we encourage scientists to take bigger risks?

Do we need to stop thinking of individual incentives to innovate?

How should we organize to make innovation happen?

Putting it all together:

What’s the past and future of innovation?

How has the USA organized research in the past?

What have been the big inventions?

Is innovation getting harder?

Or is the singularity approaching?

  • “The AI Revolution: The Road to Superintelligence: Parts 1 and 2” (2015) by Tim Urban.

A Beginner’s Guide to #EconTwitter

My #1 advice to new economists is to get on twitter and plug into the #EconTwitter community. This document is a guide on how to do that. It’s a bit geared towards academics, since that’s what I know best, but #EconTwitter is a lot more than academics and I’ve tried to write so it’s useful to people outside academia too.

What’s #EconTwitter? 

Literally? Twitter is a website that lets users broadcast 280 characters of text to other users. These are called tweets, and they can also include photos and other media. You can “follow” other users so that you automatically see their tweets. #EconTwitter is a subnetwork of twitter users who tend to be economists (academic, professional, aspiring), who tweet about economics, and who follow each other.

Metaphorically? Twitter is the hallway of a global economics department. Conversations break out spontaneously in the hall; you overhear them and go over to listen or join. People post job and seminar announcements on the department corkboard. But it’s a lot more than academia, so maybe a better metaphor is a pub or coffeeshop where users hang out to talk about whatever interests them; but it’s a pub where there’s always an open chair for you.

Why join #EconTwitter? 

You’ll learn a ton. It’s a forum people use to share interesting research (their own and others’), job notices, grant opportunities, calls for papers, conferences, and other opportunities. It’s a place people rigorously discuss methodological questions. It’s a place for experts to talk about current events. And it’s a place where the hidden curriculum of economics – the advice about how to thrive, not usually disseminated in articles – is freely discussed.

You can publicize your own work. It’s common for #EconTwitter people to announce publication of their papers and write short summaries. When other users find your work interesting, it gets “retweeted” (re-broadcast to the followers of your followers), expanding awareness of your work to an audience who might otherwise never encounter it. 

You’ll be part of a community. Being an economist has its own set of unique frustrations, whether those relate to grad school, research, publishing, teaching, the academic hierarchy, public sector employment, private sector employment, outright discrimination; you name it. #EconTwitter is full of people who know what it’s like, who are friendly, and who support each other. And there’s nothing stopping connections that start on #EconTwitter from moving into the real world. The conference circuit, to take one example, is a great place to meet up in real life. You might even find a collaborator!

Why is there a “#”? 

Twitter lets you attach keywords to your tweets by putting a hashtag (#) in front of them. In theory, you can find tweets about #EconTwitter by searching for it in twitter. It doesn’t actually work that well (people don’t usually attach the #EconTwitter keyword). But don’t worry, you’ve got this guide! 

Setting up a Twitter Profile 

Joining twitter is like joining any other social media website. It’s free. 

One of your first choices is your choice of handle and name. Like an email address, your handle (or username) is unique to you. It’s preceded by an “@” sign. Mine is @mattsclancy. It’s public so choose one that you don’t mind associating with your professional identity. Your name is not unique and can be anything. Mine is just my real name: “Matt Clancy.”  You can change your name with little consequence (and people sometimes do for whimsical reasons).

The norm on #EconTwitter is to use your real name, since you are usually intending to link the account to your professional identity as an economist. There are exceptions (@pseudoerasmus being the most prominent example to my mind), but they’re rare. 

Next, pick a picture you like for your profile. Again, the norm is to use a real picture of yourself, though it’s not nearly as universal as the norm of using your real name. Your picture will be small and attached to every tweet you make, so if you use a full-body shot, your face will be hard to see.

You also have space to write a little bit about yourself and to link to a website. This is the main way people who don’t know you are going to find out about you, so tell them what you want them to know as succinctly as possible. Most people aren’t going to scroll through your tweets to learn who you are. Some archetypal examples:

“Economist studying [topic]. Asst. prof at [short name for university]. [one other thing]”

“[Country] treasury economist. Tweets about [list of areas of particular interest/expertise]”

“PhD Econ student at [university]. [Areas of interest]. [One more thing]”

“Researcher at [company]. [non-professional identity (e.g., “Dad”)]. [topics]”

Following People 

Once you set up your account, the one thing that will most impact your twitter experience is who you follow. When you sign into twitter, three kinds of content will form the bulk of what you see: 

  1. The tweets of people you follow. This is the majority of the content you’ll see, so who you follow will significantly shape your perception of the site. 
  2. Tweets of people you do not follow, but which have been retweeted by people you do follow. 
  3. Tweets the twitter algorithm thinks you would like.  

Getting Started

When you first join, twitter is going to recommend a bunch of popular people to follow after it asks you some general questions. You do you, but my advice is to ignore all of twitter’s suggestions for now. #EconTwitter is an unusually warm and welcoming part of twitter; the rest of the site is… not. It’s frequently called “this hell site” by its own users. Better to avoid all the toxicity and drama until you find your feet. Start with #EconTwitter. 

You might think you should just follow economists you’ve heard of who happen to be on twitter. Your Paul Krugmans, Ben Bernankes, the top people in your field, etc. There’s nothing wrong with that, but you may be surprised that this won’t get you very far into the #EconTwitter community. One of the most interesting things about #EconTwitter is that, to the extent it has a hierarchy, it’s a hierarchy with very low correlation with the traditional econ hierarchy. The people everyone follows aren’t necessarily the same people that are on the short list for this year’s Nobel.

Who should you follow? This was recently asked on #EconTwitter, so I stole the list of recommendations (which I definitely endorse) and made a public list. To find it, navigate to my profile by searching for @mattsclancy, then go to my lists. Click on the #EconTwitter Starter Set list to see a good group of people to get you started. This is very much a non-exhaustive list, but I don’t want to get into the game of picking and choosing who belongs on it, so I’m just going to leave it as is (unless anyone wants off). There is no easy way to “add all” with twitter, so you’ll have to follow each person by hovering over their name and clicking follow. I’ve also listed everyone on the list at the bottom of this post, in case the list ever goes down.

Naturally, you should also follow anyone you know in real life if they’re on twitter and you want to get in touch. 

Some additional lists:

After You’re Set Up

Once you have a community started, the main way you’ll learn about new people is: 

  1. Retweets from people you follow 
  2. Replies to your tweets and the tweets of people you follow 

Whenever you run across someone who seems interesting, follow them. You can unfollow at any time, so if someone’s tweets aggravate you, just unfollow. They can figure out that you unfollowed if they notice your profile no longer says “following you” or if they see you are missing from their list of followers, but twitter won’t give them any kind of specific notification that you have unfollowed them.

Finally, twitter at its best is a serendipity machine that lets you encounter ideas you wouldn’t normally see. The best kinds of innovation are frequently born from seeing a connection between previously unrelated ideas. So follow widely, outside your field and, indeed, outside economics, so long as the people you follow enhance rather than detract from your twitter experience.

Getting Followers 

You can get plenty out of #EconTwitter just by following other people. You don’t have to interact. But if you want to participate in the discussions or toss your own thoughts out there, you’ll probably want some followers – otherwise, you’re just talking into a void.  

What follows are some common ways to get followers. It’s worth knowing about them, but to be honest, I wouldn’t recommend being too calculating or strategic about using this information. People can usually tell when you’re only doing something to get a follower; best to be genuine! Still, on the margin…

One way to get followers is simply to follow other people. Anytime someone follows me on twitter, I get a little notification. When you only have a couple hundred followers like me, a new follow is interesting, so I’ll click over to see who they are. If they seem interesting, I’ll follow them back. I can always unfollow in the future if they turn out to be annoying (almost never happens). This is one reason it’s important to be clear about who you are in your profile, since that’s how people will learn who you are and decide if they want to follow back. 

I also get notifications whenever someone “likes” one of my tweets or replies to it, so I’ll usually check out who is responding and follow if they seem interesting. 

Note that this strategy only really works for people with a limited number of followers. For me, a new follow is interesting. But for someone with an order of magnitude (or two) more followers, follow requests come so fast and furious that I’m sure most are never investigated. Other people never check their notifications, or just don’t use twitter much. So don’t be offended if no one follows you back. It’s not personal! 

The other thing to do when you get started is to send out a first tweet. Something like “Hello! I’m an economist working on [topic] at [org], excited to join #EconTwitter!” If you have some followers already (from above), they might retweet you to their own followers, essentially introducing you to the community. If you want some help, tack an @mattsclancy to the end of your first tweet, so that I’ll see it, and I can retweet you. 

After you get set up, the main way to get followers is to tweet interesting stuff. If your tweet gets retweeted by people who follow you, your tweet is exposed to a new audience of potential followers.  

Another avenue is to reply to other people’s tweets. Your reply is viewable by other people reading the conversation. For example, suppose I tweet out a link to my latest publication. You and I follow each other, so you see my tweet. You reply to it (see how below) “Cool article. I’ve been thinking about this topic too.”  

Anyone who follows me also sees my original tweet. They’ll also see a notification that there are replies to my tweet. If they click on my tweet, they will see all the replies, including yours. And maybe they are also interested in the same topic, so they decide to follow you. 

Finally, if applicable, you can get yourself added to the RePEc lists of twitter users. Follow the instructions when you click “How to get listed” on this page.

How to Tweet 

There are four main types of tweet. 

  1. The most common is a standard text tweet. You have 280 characters of text. Twitter uses a URL shortener that reduces the number of characters in any URL to 23 characters. You can add up to 4 pictures to a tweet, or 1 gif. 
  2. If you want to make a larger point, you can thread several tweets together. When you do this, your followers will typically see the first tweet in your thread, along with an indicator that there are more tweets in the thread. If they click on the first tweet, the thread is expanded and they see all your tweets in sequence. It’s kind of like reading a post sentence by sentence. Works surprisingly well!
    There are two ways to make threads. One option is to hit the little “plus” button in the corner when you compose your first tweet. You can do this as much as you like, drafting all the tweets in your thread at once and editing them before you hit “tweet all.” Alternatively, you can just compose your thread on the fly by hitting “reply” to the latest tweet in your thread. If you do this, followers will see each of your tweets as they come, instead of just the first tweet with an indicator that there is a thread below. But when you thread by replying to your own tweets, you can’t go back and edit the earlier tweets in the thread. 
  3. If you like what someone else said and want to share it with your own followers, you can retweet it by hitting the retweet button under the tweet. When you do this, you’ll have the choice to simply retweet or to retweet with comment. If you simply retweet, your followers will see the tweet in its original state, with an indicator that it was retweeted by you. If you retweet with comment, your followers will see the original tweet in a little box, embedded in your own tweet, where you have the standard 280 characters to make a comment on it. 
  4. You can also reply to other people’s tweets. Just click on the “reply” icon under the tweet. You’ll have the usual 280 characters to respond. Twitter will notify the person whose tweet you are replying to that you have replied, but it will also notify anyone “tagged” in that tweet. The etiquette of this is discussed a bit in the “best practices” section. In general, your followers will not see your reply unless they also follow the tweet you are replying to (but if they are really curious, they can find them by going to your profile and looking at your “tweets and replies”). People who follow the tweet you are replying to will not see your reply either, unless they choose to read the replies to that tweet (which is common). 

There’s also an option to conduct a poll with your tweet. This lets you ask your followers a multiple-choice question with up to 4 answers. Respondents have between 5 minutes and 7 days to respond (your choice, but default is 24 hours). After that interval the results of the poll are visible to all, but individual responses are anonymous (except you can infer they come from the population of people who see your tweet). 

It’s important to realize that twitter does not allow you to edit your tweets after they are posted. This can be really annoying if you make a typo on a tweet that becomes really popular, but it’s designed to prevent various abuses. Twitter (not #EconTwitter) is full of trolls who would likely abuse the ability to edit tweets, for example, by changing a benign message to an abusive one. Everyone who retweeted or liked the tweet would then be seen to be liking or retweeting an abusive message. If you really want to rephrase your tweet, you can always delete it and try again. 

Lastly, it’s also possible to “like” tweets. Socially, this has roughly the same function as liking on any other social media. As noted above, users will usually be alerted that you like their tweet. 

What to Tweet 

Whatever you want! If you want to get an idea of what kinds of things people tweet on #EconTwitter, check out the tweets from people in the #EconTwitter Starter Set.

Best Practices 

Adhering to a couple best practices will also make your experience and the experience of everyone else better. 


Probably the most important thing to realize is that #EconTwitter is not Econ Job Market Rumors (EJMR). It’s much friendlier and more supportive. The community is quite diverse and everything you say is publicly viewable. Don’t be a jerk. Don’t punch down. Think carefully about how you phrase critical comments and tweets that could be misinterpreted. Bigoted comments will be called out.

Two things differentiate #EconTwitter from EJMR. First, everyone knows who you are. Yes, you can use a pseudonym, but it’s hard to get many followers that way. Anonymous trolls don’t get much traction. Second, everyone on twitter curates their own feed by choosing whom they follow. If they don’t like what you’re putting out, they can unfollow, mute, or block you (see below). Nobody has to follow anyone.

Blocking and Muting 

As noted above, if another user irritates you, there is a hierarchy of responses.

Suppose @troll is bothering @economist. First, @economist can unfollow @troll. They’ll no longer see @troll’s posts, but this does nothing to dissuade @troll from bothering @economist if they are persistent. @troll can still see all of @economist’s posts, and can reply and tag @economist, which brings up a notification for @economist each time. 

The next step up the chain is to mute @troll. This blocks all communication from @troll to @economist. @economist will no longer see @troll’s tweets, even when @troll tags @economist or replies to their tweets. And @troll is not told they’ve been muted, though there are third-party apps they can use to figure this out.  

The problem with muting is that a particularly nasty troll can still pollute your threads for other followers. By muting a troll, you don’t see what they’re doing, but all your followers do and @troll can still interact with them. This is why many people prefer to block irritating users. Blocking a person means they can no longer see your tweets, and therefore cannot reply to them either. This means your followers do not see @troll either, unless they follow @troll. 

That said, really nasty trolls command followers that they can ask to harass you on their behalf. It can be exhausting to individually block all these followers, and twitter doesn’t have great tools yet for dealing with this. 

For a variety of reasons, I haven’t had to deal with any kind of harassment: I’m not that popular on twitter, I’m not part of a demographic that people are bigoted against, and the stuff I study isn’t too controversial. But those who do have to deal with twitter harassment seem to recommend a policy of fast and easy blocking to keep the experience positive. Basically, it’s fine to block for the slightest infraction: e.g., a single rude comment. There are also lists of twitter users who are known trolls, and some people will just automatically block everyone who appears on these lists. You can always unblock later if you’ve made a mistake (this usually happens via an intermediary – the blocked party asks another user to ask you to unblock them). 

Finally, one last trick: if you want someone to stop following you, you can block and then immediately unblock them. This drops them from your follower list (and does not alert them to this fact), though they can re-follow you if they want.

Quote Tweeting vs. Replies 

Retweeting with a comment (also called quote tweeting) and replying are both ways of responding to a tweet, but there are norms about how to use them. Quote tweeting posts your comment to all your followers, but not the followers of the person you are retweeting. Replying does not post your comment to your followers (unless they also follow the tweet you are responding to). This makes it tempting to respond with quote tweets by default. But actually, quote tweeting as a default is considered kind of rude.  

There are a few reasons. First, it’s much harder to follow a conversation where everyone responds by quote tweeting. Replying keeps the conversation in one relatively easy-to-navigate place, while quote tweeting splinters it into many separate tweets that are hard to follow. Also, your quote tweet is posted to your followers, but not to the followers of the original tweet (unless they also follow you). So they can’t easily respond to you, and you end up hijacking the conversation.  

Often, though, quote tweeting is perfectly appropriate. If you are not really looking to engage with the existing conversation and just want to make a quick comment, then quote retweeting is great. But if your goal is to actually have a conversation with people interested in the first tweet, replies are best. If you really want to bring your own followers into the conversation, you can reply and then retweet your own reply.

Tagging etiquette 

When you “tag” someone in your tweet, that person will receive a notification that they have been tagged. You can tag someone by adding their username to your tweet (for example, I suggested new users introduce themselves and tag me by adding @mattsclancy so I would see the tweet and retweet it). These kinds of tags count towards your 280 character limit. 

However, when you reply to a tweet, your reply automatically tags the person who posted the original tweet, and anyone tagged in that tweet. These tags do not count towards your 280 character limit. 

Tagging creates two issues. First, if you are talking about someone else’s work, should you tag them? I don’t think there is unanimity of opinion on this, but my view is that when you are saying something nice about someone, it’s probably slightly better to tag them. Compliments about academic work tend to be rarer than criticisms (even friendly criticisms), so people appreciate hearing good things about their work. The issue is much stickier when you are being critical. Some people don’t like to be notified by twitter that someone out there on the internet is criticizing them. Others think being critical without tagging is tantamount to talking about someone behind their back. I don’t know what’s best here, but be aware of the issue. 

The second issue with tagging is that when you are replying to tweets in a conversation, sometimes people will be automatically tagged who aren’t interested. For example, suppose economist A tweets a new article evaluating some health policy intervention. Economist B replies to the tweet saying they wished this article had used a DAG (directed acyclic graph) approach. Economist C responds to B’s comment asking where they can learn more about DAGs. Economist D responds to C with a list of textbooks. Economist E jumps in with some comments on which of these textbooks is their favorite. And so on. The thing with replies is that economist B’s reply tags A; C’s reply tags A and B; D’s reply tags A, B, and C; and E’s reply tags A, B, C, and D. If a long conversation then breaks out between D and E, every one of their back-and-forth will also tag A, B, and C. This can get annoying, especially if A doesn’t even care about DAGs.  

To avoid this problem, you can untag people. When you are jumping into a conversation based on replies to a tweet, it can be a good idea to take a look at who is tagged and to untag anyone who doesn’t seem particularly engaged in your corner of the conversation. 

Pinned Tweets 

Twitter lets you choose one of your tweets to be a pinned tweet. This tweet is always displayed at the top of your list of tweets, so it is a useful supplement to your profile. It can tell people about whatever it is you most want them to know about. It can also be a way to get traction on a tweet that didn’t get much attention the first time around, since everyone who checks out your profile will likely see the tweet. In academia, announcements of a recent publication are a popular kind of pinned tweet. To pin a tweet, click on the little icon at the top-right corner of your tweet.

Direct Messages 

Twitter also lets you communicate privately with other users via direct messages (DMs). Normally you can only send direct messages to users who have followed you, though it’s possible to enable the ability to receive DMs from anyone (under settings; “privacy and safety”; receive direct messages from anyone). If you really want to talk privately with someone who isn’t following you, you can tag them in a tweet that requests they follow you.


To close, Gray Kimbrough (@graykimbrough) had this advice: “My biggest suggestion is to spend some time thinking about what you want your contribution to be. Many people here feel like they need to comment on every issue and event. Focusing on areas where you have value to add (your area of research? Baking?) is helpful, in my opinion.”

I think that advice is particularly useful for someone new to twitter. When you don’t know anyone, focusing on a particular area can help potential new followers figure out who you are and give them a reason to follow you, especially if the “one thing” is your strong suit. Speaking for myself, I get way more interest in my tweets on the economics of innovation (my area) than my attempts at jokes (even when I think they’re quite good).

Some people go as far as to set up separate twitter accounts that each focus on different topics (e.g., professional and personal, economics and politics). On the other hand, as Gray notes, plenty (plenty) of people don’t follow anything like this kind of rule and it works for them.

Additional Resources

Justin Wolfers (@JustinWolfers) made a presentation about using twitter as an economist back in 2015.

Sarah Jacobson (@SarahJacobsonEc) has written a similar guide to using twitter professionally. See pages 21-24. Updated set of slides available here.

The End 

Hope you found this helpful! If you have any suggestions, questions or comments, you can email me. Better yet, contact me on twitter! 

The #EconTwitter Starter Set (link to twitter list here)

Rehabilitating the Death of Distance to Revitalize Rural Economies

People are writing about how to revitalize flagging rural economies, so it seems a good time to share my thoughts (a bit of background – I recently moved back to Iowa after living in Washington, and lived in London for a few years before that). At the most fundamental level, the dynamic driving rural decline is the benefits of physical agglomeration. People are more productive and innovative when clustered together, and this effect favors cities. Higher productivity and more innovation lead to more and better paying jobs in cities. These, in turn, draw people out of rural economies and into cities, further depressing opportunity in rural economies. This is not to say other factors don’t matter at all, but the biggest challenge for rural economies is how to thrive in a world with strong agglomeration benefits.

Greetings Iowa Skyline Metal Magnet

Source: “Raygun, the greatest store in the universe”

The best thing we could do for rural economies is to kill the link between physical proximity and agglomeration benefits by using information technology to push agglomeration benefits out of physical space and into the digital realm. My vision is a world where remote work is common (if not the norm), and where better online social networks and socializing options substitute for the invisible networks of social connection and information exchange which are currently based on physical proximity. If people everywhere can enjoy the benefits of agglomeration without the need to physically cluster together there will no longer be a need to flee the country for the city. If we succeed, not only will we bring rural economies into the virtuous cycle of agglomeration benefits, but we may even accelerate innovation and growth by connecting up the entire country into one digital city.

An Idea Ahead of its Time?

This is hardly a new idea. At the dawn of the internet era, it was conventional wisdom that the world was flat and the internet would spell the death of distance. It didn’t happen. Instead, the rising importance of knowledge work favored returns to agglomeration more than ever before, accelerating the clustering of knowledge workers into cities. So what went wrong?

I think we all made the classic mistake of underestimating how long it would take for us to figure out how to effectively use a new general purpose technology. I’m reminded of how it took decades for the benefits of electrification to become fully manifest. It wasn’t particularly effective to simply substitute electric motors for the old gigantic steam ones; new distributed production systems had to be invented and built. Why should it be any different for the internet?

In this post, I’ll point to three benefits of physical agglomeration, and speculate on how people can use the internet to retain these benefits without the need for physical proximity (today or in the future). What’s changed is not so much technical advance (although that has happened as well), but social innovations in how we interact with each other online. It may yet turn out that the death of distance is right, but was just ahead of its time.

Making Remote Work Work

Most obviously, pushing agglomeration benefits out of meat space and online will require making remote work work. I have some experience with working remotely. At my previous job in DC I worked remotely two days a week. Most people I socialized with worked remotely at least some of the time too. After moving back to Iowa, I continued to work remotely both at home and out of a regional field office. In my current position, I collaborate with researchers who are geographically separated.

The biggest challenge to remote work is the perception that a remote work force is not as productive as a physically present one. This may be because collaboration is hobbled when workers are not physically together or because individual workers shirk when working remotely. In my experience, remote workers are quite heterogeneous, but these criticisms are true of some.

Let’s take collaboration and communication first. The trouble with remote work is the increase in transaction costs to transmit information. This isn’t a problem for important information because the value of the communication exceeds transaction costs. When something important needs to be communicated, a conference call, emailed slide deck, or memo will get the job done.

The problem is instead the heap of minor communications, too small to individually exceed transaction costs but cumulatively significant. It’s the absence of spontaneously popping by a co-worker’s cubicle to ask for a clarification, advice, or simply to chat. It’s the absence of body language cues in a teleconference call. I’ve even heard people say the lack of peripheral vision and pheromones impedes communication in meetings. Collectively, these gaps in information add up: problems and opportunities are not identified as early as they could be; corporate culture is harder to sustain; trust between employees is harder to build.

But I think a lot of these problems are already being solved by a combination of technological and social innovations. Things like Slack substitute for “popping in to ask for a clarification.” Better video technology adds back visual cues in a teleconference. But perhaps most important are cultural adaptations. It’s been interesting to see gifs and emojis emerge as an increasingly robust substitute for body language and vocal tone. Video chatting is something we’re all getting experience with now. And there is at least one highly selective university where all student collaboration happens online.

Similarly, I think we’ll find ways to solve the problem of worker shirking. In my experience, some workers do take advantage of reduced monitoring to shirk their duties, but this tends to be because old systems of monitoring effort (like walking around and keeping an eye on people) are ill suited to remote work. In principle, it’s actually easier to monitor the work of remote workers than physical ones, since you can make a comprehensive record of their digital work (for example, by continuously recording what’s on the screen). Now, that may not end up being necessary. Maybe giving managers the option to call up a live view of their worker’s screens will suffice to prevent shirking. Or maybe a working norm will evolve where you can see your whole team all the time, muted in a ribbon at the bottom of your screen. Or something else entirely.

Thick Labor Markets

A second benefit of urban economies is thick labor markets. If you want to make a movie, you can be confident there will be a plethora of people with just the skills you need if you live in Los Angeles. If you want to make a new digital product, there will be no shortage of qualified engineers if you locate in the Bay Area. If you need the capacity to rapidly scale up manufacturing, qualified workers and managers can be found in Shenzhen.

But it’s not just the presence of these workers that matters. It’s also the network of informal social connections that helps match people to the right jobs. You can ask a trusted colleague for advice on a candidate or the firm they work at. Once we can make remote work work, the biggest problem will be building a comparable system for matching remote workers to employers.

In the long run, if we succeed in pushing a lot of work online, this won’t be an issue because the same informal networks that exist in cities will exist in online communities. To some extent then, we just need to be patient. If remote work works, then there will probably be cost advantages to using it (a lower cost of living in rural economies will mean remote workers can out-compete urban ones in terms of wages). Industries most suited to remote work will lead the way, and informal professional networks will follow.

This process can be accelerated with online services that match employers with employees, like LinkedIn or Upwork. But to really substitute for thick labor markets, we need the digital equivalent of person-to-person networking events and other mixers. Offline, it may be enough to serve drinks, put on some kind of notional speaker, and then let people socialize. But new kinds of events will be needed online. The best candidates to my mind are online games and collaborative educational workshops.

Instead of an after-work mixer at the bar, there could be an after-work Fortnite party (or some more professionally appropriate game). We’ll have to experiment to find what works best, but I suspect a game with low cognitive burden (so you can focus on socializing) and frequent shuffling of the players would be ideal.

Instead of an educational workshop, imagine an online workshop that uses active learning pedagogy, so participants have to interact to learn the material. As with remote work, a generational shift may make existing technology much more effective in the future. The combination of better technology and the ascent of “nerd culture” means more young people today have extensive experience socializing over large online multiplayer games. At the same time, the rise of online active learning education (though nascent) means the next generation may have experience doing school work in a collaborative online environment.

Knowledge Spillovers

Most subtle of all, but perhaps most important in the long run, are the agglomeration benefits related to innovation. Density appears to promote innovation. Why?

Part of it is thick networks. Most innovation happens in teams, and if you know more people, you can put together a better team. For ideas in a very preliminary stage, big networks make it easier to find someone trustworthy to talk things over with. The same kinds of digital activities that promote professional network development could also promote innovation supporting networks.

But cities are also probably better for innovation stemming from serendipity and unexpected connections. Websites like twitter are probably the best online substitute for the unexpected connections and encounters that occur in dense cities. Though there is a strain of toxicity running through much of the site, it also houses wonderfully supportive online communities gathered around common interests.

My experience is with #EconTwitter, a collection of professors, researchers, private sector economists, students and enthusiastic amateurs. The forum genuinely facilitates the open exchange of ideas and knowledge, including much of the “hidden curriculum” that is traditionally communicated in person-to-person contact, and is not typically codified in handbooks and academic articles. I’ve met people in real life that I first encountered on twitter, and even started a research collaboration with someone I’ve never met in the real world. And I’m sure many others have had similar experiences.

I have a hunch that one reason for the advantage of physical meetings over video conferences (so far) has been that their high bandwidth nature makes it easier to build trust and openness. This, in turn, gets people to let down their guard and share some of their wilder ideas. Many of these are likely to be duds, but wilder ideas are also more likely to lead to major innovations.

If online norms can evolve to make people comfortable taking a risk and sharing ideas that might be bad (e.g., maybe this whole post?), then it may turn out the internet is even better at promoting fruitful connections leading to innovation, since the audience for such ideas is so potentially large. The trouble is, with the internet (especially twitter), unlike at the bar, a screenshot can make a permanent record of any bad take. There may be some technical fixes, but again I think (hope?) in the end we’ll just develop new social norms to deal with it. Part of that may be thick skin, part of it may be liberal uses of block/mute features, and part of it may be an increased threshold before we get outraged about something. Indeed, I see all three of these being hashed out online even now.

Policy Implications

To some extent, all of this is just going to take time. We need to learn to communicate as well online as we do face-to-face, and how to run organizations online. We’re on our way, but it may take a generation raised in this environment. But there are some obvious things we can do to hasten its arrival.

First off, it’s crucial to promote rural broadband and other IT infrastructure, if this policy is to get off the ground. In addition to simply laying down the cable necessary to connect rural communities, this could include low-interest loans to individuals to purchase computer equipment they would need to work remotely. Alternatively, it may turn out to be better to build small communal co-working spaces with necessary equipment and an on-site IT technician.

Second, to promote the use of remote work, it may be desirable for states and small towns to offer wage subsidies and other incentives for distant firms to hire local remote workers. This would be a micro version of the much larger tax breaks that are used today to try and lure businesses to invest locally. The argument for subsidies for remote work from the perspective of an economist is that there is a positive externality from remote work. As more firms experiment with remote work, best practices for its conduct will emerge and this knowledge will diffuse, tipping other firms to increase their use of remote workers.

Third, local universities should offer online degrees tailored to the needs of remote workers. By being online, the courses would be accessible to the people most likely to subsequently seek remote work. The degrees would also provide an opportunity to develop the soft skills necessary to be a successful remote worker. They could even practice collaborative team-based pedagogy, to accelerate the development of norms for online collaboration and work.

Fourth, further research on what kinds of social networks facilitate the free exchange of information and formation of social ties is needed.

Fifth and finally, we can promote online gaming and other communal digital activities. Promotion could be cultural, in that we encourage kids to join e-gaming leagues in the same way we now encourage physical sports leagues. Or it could be financial, with subsidies to these industries until they reach a level of maturity where they are a common mode of meeting new people.

There are other aspects of agglomeration economies that also warrant discussion, but for which I have little to add. Capital and finance tend to be clustered in certain geographic regions. Transport infrastructure may be another avenue worth exploring, since it will probably be desirable to continue to maintain some physical interaction. And it’s certainly true that not all industries are amenable to remote work.

I hope we can figure out a way to make it work, because as others have noted, there are few other options. Industries where rural areas have a comparative advantage, such as agriculture, are rare, and increasing productivity means they are shedding workers. And the industries of the future appear to be ones where agglomeration matters. It’s just too powerful a force to be ignored.

How Minerva University Teaches Habits of Mind

Minerva is a new university best known for its global campus. Students spend their first year in San Francisco, and then each class moves en masse through Seoul, Hyderabad, Buenos Aires, Berlin, London, and Taipei over the remaining six semesters. During my undergraduate years, studying abroad was an important experience for me, so I’ve been interested in Minerva since I heard about it a few years ago. Earlier this year I read their semi-official book Building the Intentional University: Minerva and the Future of Higher Education (eds. Stephen M. Kosslyn and Ben Nelson). But after reading the book, I’ve come to think the global campus is a bit of a sideshow. Minerva’s really innovative idea is actually in the considerably less flashy domain of curriculum design.

In this post, I’ll briefly explain why I think their curriculum is so interesting and what problem it is trying to solve. But up front I want to make it clear that I don’t work for Minerva or even know anyone who does. I just read their book, some other books, and teach at a university myself.

Minerva’s curriculum is designed to foster certain habits of thought. These are habits we would associate with critical reasoning, problem solving, communication, leadership, and teamwork skills. So far, so bland. What curriculum isn’t designed to foster good habits of thought? But as we will see, most universities do a fairly poor job of teaching these skills. The difference may well be that where most universities are laissez-faire about how they teach good habits of thought, Minerva is intentional.

Before turning to the Minerva approach, it’s necessary to take a minute to establish the disappointing track record of the traditional approach, and to suggest some reasons why it has disappointed. This is all laid out quite well in Bryan Caplan’s The Case Against Education: Why the Education System is a Waste of Time and Money.

What Does Education Do?

Caplan’s book is about explaining why an extra year of education is correlated with a 10% increase in wages. The traditional explanation for this fact is that education builds valuable knowledge and skills, called “human capital”, which makes graduates more productive employees. Caplan’s central argument is that education is primarily (80%) about signaling, rather than the accumulation of useful skills and knowledge. By “signaling,” Caplan means that the primary purpose of education is to certify students as smart, diligent, focused, submissive, and conforming to social conventions. This basket of characteristics is highly desirable to employers and employees with these characteristics are compensated with higher wages. The key assertion is that education merely certifies these traits; it does not build them.

While he makes several arguments that signaling rather than human capital explains the educational wage premium, the most convincing to me is simply the observation that higher education does not appear to build many skills useful to employers.

I think of myself. I did a double-major in Physics and Religious Studies for my undergraduate degree and was a very strong student. For the most part I loved the experience. But when I look back at my transcripts, I count five classes for which I would struggle to remember more than a sentence’s worth of content. I count four more classes that I don’t even remember taking. As for the rest, while I often found the material interesting, I think the professional applicability of the specific facts, models, and tools I learned is quite limited. Indeed, I spent two years as an economic analyst after graduating and never used any physics (actually, the religious studies mattered more!). This is not to say nothing was useful. But Caplan’s estimate that only 20% of what we learn in school directly contributes to work productivity seems plausible on its face.

And I don’t think my experiences are that unusual. Naturally, there are majors and courses that build a lot of skills and knowledge directly relevant to certain career paths. But as Caplan points out, even here the human capital story faces challenges, because students frequently fail to get jobs in the field closest to their major. Few science majors become scientists, few psychology majors become psychologists, few economics majors become economists, and so on. And then there are all the other majors that do not even try to build skills for specific professions. This is not to say such degrees are a bad idea (wages aren’t everything!), only that the human capital story as an explanation for the positive education-wage correlation looks weaker when we look more closely at what is studied. (Caplan also argues education does a bad job of building wisdom and other non-pecuniary forms of human capital, but I’m not going to go into that.)

Deeper Knowledge?

The natural retort to this line of argument is that the specifics of curriculum do not matter because education develops deeper capacities. What is “really” taught are habits of thought: how to construct and evaluate arguments, how to weigh evidence, how to communicate effectively, etc. Whether we study English, economics, engineering, or entomology isn’t particularly relevant. They just give us the raw material for us to practice and hone critical thinking and communication skills.

For Caplan, this is wishful thinking. There are a host of studies that look for domain general forms of learning, and the results tend to disappoint. Caplan describes a study measuring the quality of arguments and reasoning given by students at different points in their education. Fourth year undergraduates did no better than first year undergraduates, and fourth year graduate students did only marginally better than first year graduate students. Another study looks at how students apply statistical reasoning to real life examples, again finding the vast majority fail to do so.

Another way to think about this issue is through the lens of knowledge transfer. It may not matter that people retain specific facts and surface details as long as they do retain the deep structure of theories and ideas. They can then apply these deep structures to contexts with different surface details. Unfortunately, a body of work by educational psychologists finds that people do not naturally transfer models and ideas from one domain to another. Most of the things we learn get “stuck” in the context in which we learn them. If we don’t use those skills frequently, they fade away within a few years. This is the fate of most of what we learn in school.

The problem is this: students tend to learn only what will be on the test and domain general thinking is not on the test. Again taking myself as an example, I grade my students on how well they do in my class; I do not follow them out into the world (or even into other classes) and adjust their grade based on how well they apply the models I teach them in novel contexts. The end result is that knowledge remains bottled up in the context of the specific domain it was learned and unless a student builds explicitly on that domain, it is lost through lack of use.

Reorganizing Undergraduate College

Caplan’s solution to this problem is to scale back the signaling arms race by cutting education subsidies and to expand curriculum that explicitly teaches skills employers want – namely, vocational training. Minerva, in contrast, thinks it can actually teach those habits of thought that we want college to develop. But to do that, it recognizes that it needs to change the way college is traditionally organized. It does this in three main ways:

  1. The curriculum explicitly teaches and assesses domain-general habits of thought
  2. It teaches these habits of thought in varied contexts
  3. It assesses student mastery over all four years of education

Let’s take these in turn.

The curriculum explicitly teaches and assesses domain-general habits of thought

Today’s universities spend the majority of their time explicitly teaching and assessing students on their mastery of domain-specific content. We trust this effort will also lead to the development of domain-general critical thinking and communication skills as a useful by-product. For example, if you take an economics class, you are taught economics and assessed on your command of it. However, to the extent mastering economics requires deeper critical thinking skills, you will pick those up too.

Minerva flips this around, and explicitly teaches and assesses students on their mastery of 100 domain-general “habits of mind.” When I say “explicit” I mean it; the 100 habits of mind are enumerated and spelled out in appendix A of Building the Intentional University. Course content is selected to illustrate and practice the use of these principles.

What are these habits of mind? Examples include:

  • Evaluate whether hypotheses lead to testable predictions.
  • Identify and minimize bias that results from searching for or interpreting information to confirm preconceptions.
  • Apply effective strategies to teach yourself specific types of material.
  • Tailor oral and written work for the context and the audience.

Students are assessed on their successful use of these and the other 96 habits of thought both through coursework and during class time. Classes are delivered online with students participating via webcam, and so there is a video record of all class discussions and activities. Classes are designed so that no more than 25% of class time is spent passively absorbing material, meaning students spend the majority of class time on activities in which they can practice or demonstrate habits of thought. Typically a few habits will be emphasized per week. After classes, professors go back through the footage and assess students’ use of habits of mind, offering feedback to students.

If you want to teach deep thinking habits, it’s probably best to explicitly teach them, rather than to trust they will be extracted from course content. In Caplan’s discussion of the literature on critical thinking skills, he describes a study where students were taught a solution concept either explicitly as an algebraic technique, or in the guise of a structurally equivalent physics technique. Students were then asked to solve a problem using the same solution technique, but in the guise opposite to what they learned (so algebra students solved a physics problem and physics students solved an algebra problem). Most of the students who studied the algebraic version of the technique used it on the physics problem, but few of the students who studied the physics technique used it on the algebra. It seems to be easier to transfer knowledge from the general to the specific than vice-versa.

It teaches habits of thought in varied contexts

While Minerva teaches and assesses domain-general “habits of mind,” some concrete context is still necessary. You can’t really teach someone how to “evaluate whether hypotheses lead to testable predictions” in some kind of abstract Platonic ideal. All these habits of thought are instead practiced in the context of more traditional course content. The problem is that anytime you introduce context, you run the risk of trapping the habit of mind in that specific context. As noted above, far transfer is hard. If you practice the “testability” habit of mind in a psychology class, it might never occur to you to use the same habit in a business context.

To address this problem, all Minerva students take the same courses in their first year. These courses give students a foundational liberal arts and sciences education. This naturally provides many different contexts for students to practice the same habits of thought, helping prevent these habits from getting trapped in the context where they were first taught. And in all of these courses, students are graded on their mastery of these 100 habits of thought.

Because of this scheme, far transfer itself becomes a skill you learn to cultivate and practice. You will learn a habit of thought in one class and be graded on your ability to use it in a different one. I can’t grade a student on her ability to apply the concepts I teach her in her other courses, but in Minerva something like this is the norm. This is only possible because the first year curriculum and assessment criteria are collaboratively developed by the Minerva faculty.

It assesses student mastery over all four years of education

A final challenge to learning these habits of mind is that we quickly forget what we don’t use. To avoid this fate, Minerva adopts an unusual grading system. As noted above, students are graded in their first year on their mastery of 100 habits of thought. But this grade is retroactively adjusted over the following three years, based on how well students continue to use habits of thought. Rather than learning something for a semester and then forgetting it, students must maintain (or try to improve) their first-year grades by continuing to demonstrate mastery over these habits. Again, something like this is only possible when the curriculum is collaboratively developed by the entire Minerva faculty. It only works because professors teaching later courses “buy in” to it.

Can Existing Universities Adopt these Ideas?

To summarize, Minerva tries to teach habits of thought by explicitly teaching them, assessing them, and then forcing students to practice far transfer and retention by requiring the use of the habits across all classes for four years. I think this is all pretty cool, and if it works it’s exciting to think we have found a better way to teach domain-general thinking.

However, it works for Minerva because curriculum design has become a collective rather than an individual endeavor. It’s hard for existing institutions to completely adopt this format, since professors have so much autonomy in what and how they teach. But to close, I’ll toss out a few ideas for how a traditional university might experiment with some of the same ideas.

One incremental step could be pairing up complementary classes, which then share assessment. For example, a writing class could pair up with a philosophy class where students have to write essays. Students would learn explicit habits of thought – in this case about writing – in one of the classes, and practice that knowledge in another. The essays would give students a specific context in which to practice their writing skills, and could be jointly graded by both instructors: the philosophy instructor grading for philosophy, the writing instructor for writing.

A step beyond this would be the creation of an explicit “habits of thought” class that is taught every semester. Like Minerva, it would explicitly teach and assess mastery of habits of thought. But the class could be structured so that students are assessed on how they use these habits in their other classes, which would be taught in traditional ways. For example, perhaps students write reflections at the end of each day on how they used the habit of thought in their other classes. Or perhaps students taking the habits of thought class are organized into small groups of students who are taking the same classes, and these could be a forum for discussing how the habits of thought were applicable (or not).

A further step would be the creation of a “habits of thought” minor with its own set of core classes organized much like Minerva’s first year. This minor would lock all students into a required set of classes in their first year, but these courses could be carefully honed so that students can slip into the second-year stream of traditional majors. For example, a student could become an economics student with a minor in “habits of thought” by taking a foundational year that covers a broad set of classes satisfying normal general education requirements. Ideally, these classes would also include enough of the material taught in introductory economics courses so that students could jump into an intermediate course in the following year. The minor could continue to meet throughout the next four years to keep the habits of thought in practice as well.

2018: My Year in Books

Between the new baby, new house, and new job, I didn’t read as much this year as last year. I made it through 45 books. Of those, 8 were plays by Shakespeare. If I ever read all of his works, I’ll make a post about them. But here’s a rough ranking of the remaining 37, with a sentence on each.

Caveat: I liked all of these. I abandon books I don’t like.

Conceptual Non-Fiction

  1. The Second World Wars by Victor Davis Hanson: Wonderfully organized opus on World War II as a contest of national productive and organizational capacity.
  2. The Book of Why by Judea Pearl and Dana Mackenzie: Correlation can imply causation (when you combine data with models).
  3. The Enigma of Reason by Hugo Mercier and Dan Sperber: Reason as a social influence tool that, as a bonus, helps you figure out how the world works.
  4. Old Masters and Young Geniuses by David Galenson: Aesthetic innovation comes from either evolutionary processes (tinker and evaluate) or reason (plan and execute).
  5. The Son Also Rises by Gregory Clark: Social status is really sticky across generations.
  6. The Measure of Reality by Alfred W. Crosby: Between 1250 and 1600 in Europe, numbers and measurement colonized new domains, potentially setting up our current paradigm of continuous technological progress.
  7. On Writing by Stephen King: Moving story of King’s own entry into the writing life, and his intuitive story-first method of writing (he’s an evolutionary creator).
  8. The Great Leveler by Walter Scheidel: The final two sentences sum it up: “All of us who prize greater economic equality would do well to remember that, with the rarest of exceptions, it was only ever brought forth in sorrow. Be careful what you wish for.”
  9. How to Read a Book by Mortimer J. Adler and Charles Van Doren: Very useful, but I need to read it again correctly.
  10. Radical Markets by E. Glen Weyl and Eric Posner: A feast of ideas to wrestle with. (related post)
  11. Cognitive Gadgets by Cecilia Heyes: Pushing cultural evolution even farther; the ways we think and learn are themselves cultural products.
  12. Surfaces and Essences by Douglas R. Hofstadter and Emmanuel Sander: I wasn’t a fan of the presentation, but it’s an impressive feast of ideas to wrestle with. (short review)
  13. The Allure of Battle by Cathal J. Nolan: Using the history of (mostly) Western war to argue warmakers endlessly underestimate the cost and duration of their wars (a good companion to my #1 pick). (related post)
  14. Capitalism without Capital by Jonathan Haskel and Stian Westlake: Goes a long way towards explaining several contemporary economic puzzles.
  15. Misbehaving by Richard Thaler: How to shift a paradigm.
  16. The Disruption Dilemma by Joshua Gans: It’s more complicated than Christensen claims.
  17. Free Innovation by Eric Von Hippel: Neat little book on innovation outside the market system (also, it practices what it preaches).
  18. Improbable Destinies by Jonathan B. Losos: Evolution happens faster than you think.
  19. Zero to One by Peter Thiel: Efficiently communicates a lot of original ideas.
  20. The Hungry Brain by Stephan Guyenet: Good overview of the brain and will convince you that dieting is complicated.
  21. The Lean Startup by Eric Ries: More like “The Lea(r)n Startup.”
  22. Theory and Reality by Peter Godfrey-Smith: Very good overview of the basics of philosophy of science.
  23. True Stories and Other Essays by Francis Spufford: Probably for Spufford fans only (I count myself one).

Fiction and Narrative Non-Fiction

  1. Spring by Karl Ove Knausgaard: A left turn in the seasons quartet that completely recontextualizes Autumn and Winter.
  2. The Dispossessed by Ursula K. Le Guin: A masterpiece of worldbuilding, exploring how an anarchist society would work.
  3. The Worst Journey in the World by Apsley Cherry-Garrard: Who knew Earth could be so inhospitable to its children?
  4. Winter by Karl Ove Knausgaard: See the world new again.
  5. The Age of Innocence by Edith Wharton: The impossibility of shrugging off society’s constraints and retaining its gifts (a reread).
  6. Educated by Tara Westover: A stew of ideas – gaslighting, abuse, cultural immigration, the limits of familial reconciliation, and the founding of religions.
  7. Pet Sematary by Stephen King: A heatseeking missile to the heart of this new parent.
  8. The Fifth Season by N.K. Jemisin: Exploring grief and prejudice in a geologically active fantasy world.
  9. Bad Blood by John Carreyrou: Riveting account of Theranos’ rise and fall.
  10. Golden Hill by Francis Spufford: Pre-Revolutionary War New York is a wonderful setting.
  11. The Player of Games by Iain M. Banks: Explores similar terrain to The Dispossessed.
  12. The White Darkness by David Grann: The strange siren call of the Antarctic.
  13. The Everything Store by Brad Stone: Solid story of the rise of Amazon.
  14. Louis Riel by Chester Brown: Nuanced story of a complicated revolutionary.

Why is knowledge transfer hard in neural nets but easy with metaphor?

Neural networks (NNs) and metaphors are both ways of representing regularities in nature. NNs pass signals about data features through a complex network and spit out a decision. Metaphors take as a given that we know something, and then assert something else is “like” that. In this post, I am thinking of NNs as a form of representation belonging to computers (even if they were initially inspired by the human brain), and metaphors as belonging to human brains.

These forms of representation have very different strengths and weaknesses.

Within some narrow domains, NNs reign supreme. They have spooky-good representations of regularities in these domains, best demonstrated by superhuman abilities to play Go and classify images. On the other hand, step outside the narrow domain and they completely fall apart. To master other games, the learning algorithms AlphaGo used to master Go would essentially have to start from scratch. It can’t condense the lessons of Go down to abstract principles that apply to chess. And its algorithms might be useless for a non-game problem such as image classification.

In contrast, a typical metaphor has opposite implications: great at transferring knowledge to new domains, but of more limited value within any one domain.  Anytime someone tells a parable, they are linking two very different sets of events in a way I doubt any NN could do. But metaphors are often too fuzzy and imprecise to be much help for a specific domain. For instance, Einstein’s use of metaphor in developing general relativity (see Hofstadter and Sander, chapter 8) pointed him in the right direction, but he still needed years of work to deliver the final theory.

This is surprising, because at some level, both techniques operate on the same principles.

Feature Matching

Metaphor asserts two or more different things share important commonalities. As argued by Hofstadter and Sander, one of the most important forms of metaphorical thinking is the formation of categories. Categories assert that certain sets of features “go together.” For example, “barking,” “hairy,” and “four legs” are features that tend to go together. We call this correlated set of features the category “dog.” Categories are useful because they let us fill in gaps when something has some features, but we can’t observe them all.

This kind of categorization via feature tabulation was actually one of the first applications of NNs. As described by Steven Pinker in How the Mind Works, a simple auto-associator model is a NN where each node is connected to every other node. These kinds of NNs easily “fill in the gaps” when given access to some but not all of the features in a category. For example, if barking, hairy, and four legs are three connected nodes, then an auto-associator is likely to activate the nodes for “hairy” and “four legs” when it observes “barking.” Even better, these simple NNs are easy to train. And if such simple NNs can approximate categorization, then we would expect modern NNs with hidden layers to do that much better.
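An auto-associator of this kind can be sketched in a few lines. This is my own minimal Hopfield-style toy (not Pinker’s exact model): it stores a “dog” pattern with Hebbian weights, then completes the pattern from the “barking” cue alone. The feature names come from the example above.

```python
import numpy as np

# Features (nodes) in our toy auto-associator.
features = ["barking", "hairy", "four_legs", "meowing"]

# One stored pattern: the "dog" category (+1 = feature present, -1 = absent).
dog = np.array([1, 1, 1, -1])

# Hebbian learning: strengthen connections between co-occurring features.
W = np.outer(dog, dog).astype(float)
np.fill_diagonal(W, 0)  # no self-connections

def recall(cue, steps=5):
    """Repeatedly update nodes from their neighbors until the pattern settles."""
    state = cue.astype(float)
    for _ in range(steps):
        state = np.sign(W @ state)
    return state

# Partial cue: we observe only "barking"; the other features are unknown (0).
completed = recall(np.array([1, 0, 0, 0]))
print(dict(zip(features, completed)))
# "hairy" and "four_legs" switch on; "meowing" switches off.
```

Given one observed feature, the network recovers the whole stored category, which is exactly the gap-filling behavior described above.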

Now, as I’ve argued elsewhere, proper use of metaphor isn’t as simple as matching features. The “deep features” of a metaphor are the ones that really matter. Typically there will be only a small number of these, but if you get them right, the metaphor is useful. Get it wrong, and the metaphor leads you astray.

But this isn’t so different from NNs either. NNs implement a variety of methods to prune and condense the set of features, almost as if they too are trying to zero in on a smaller set of “deep features.”

  • Stochastic gradient descent (a major tool in the training of NNs) involves optimizing on a random subset of your data in each period, rather than all the data. In essence, we throw some information away each iteration (although we throw away different information each time). Now, this is partially done to speed up training times, but it also seems to improve the robustness of the NN (i.e., it is less sensitive to small changes in the data set).
  • Dropout procedures involve randomly setting some parameters to zero during the optimization process. If the parameter isn’t actually close to zero, the optimization will re-discover this fact, but it turns out you get better results if you frequently ask your NN to randomly ignore some features of its data.
  • Information bottlenecks are NN layers with fewer nodes than the incoming layer. They force the NN to find a more compact way to represent its data, again, forcing it to zero in on the most important features.
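For the curious, here is roughly what these three tricks look like in code: a minimal numpy sketch of a single forward pass, with made-up sizes and random data purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up toy data: 200 examples, 16 features each.
X = rng.normal(size=(200, 16))

# Stochastic gradient descent: each iteration trains on a random minibatch,
# deliberately ignoring the rest of the data for that step.
batch = X[rng.choice(len(X), size=32, replace=False)]

# Information bottleneck: the hidden layer has fewer nodes (4) than the
# input (16), forcing a more compact representation.
W1 = 0.1 * rng.normal(size=(16, 4))
hidden = np.maximum(batch @ W1, 0)  # ReLU activations, shape (32, 4)

# Dropout: randomly zero about half the hidden units, so the network can't
# lean too heavily on any one feature; rescale to preserve expected magnitude.
keep = rng.random(hidden.shape) > 0.5
hidden_dropped = hidden * keep / 0.5

print(batch.shape, hidden.shape, hidden_dropped.shape)
```

Each mechanism discards or compresses information, which is the “zeroing in on deep features” behavior described in the bullets above.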

So, to summarize: using metaphor involves matching the deep features between two different situations. NNs are also trained to seek out the “deep features” of training data, the ones that are most robustly correlated with various outcomes. So why don’t NNs transfer knowledge to new domains as well as metaphors?

What are the Features?

It may come down to the kinds of features each picks out. As discussed in another post, the representations of NNs are difficult (impossible?) to concisely translate into forms of representation humans prefer. It’s hard to describe what they’re doing. So we can’t directly compare the deep features that a NN picks out and compare them to the deep features we humans would select.

However, image classification NNs give us strong clues that NNs are picking up things very different from what we would select. There is an interesting literature on finding images that are incorrectly classified by NNs. In this literature, you start with some image and tweak as few pixels, as little as possible, to fool the NN into an incorrect classification. For example, this image from the above link is incorrectly classified as a toaster:

Figure 1. Fooling image classification neural networks (source)

How can this be? Whatever the NN thinks a toaster looks like, it’s obviously different from what you or I would think. The huge gap between the deep features we identify and those identified by a NN is best illustrated by the following images from the blog of Filip Piękniewski.

Figure 2. Filip Piękniewski trained a NN to tweak gray images until they were classified with high confidence (source)

Filip starts with gray images and trains a NN to modify pixels until a second NN gives a confident classification. The top left image is classified as a goldfish with 96% probability. The bottom right is classified as a horned viper with 98% probability. The results are kind of creepy, as they highlight the huge gulf between how “we” and NNs “see.” Even though metaphor and NN both involve zeroing in on the deep features of a problem, the features selected are really different.
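The pixel-tweaking procedure behind these fooling images can be sketched with a linear “classifier” standing in for the NN. This is a hypothetical stand-in of my own: real attacks use the same gradient-sign idea, just against a deep network, where the gradient must be computed by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny logistic "classifier" stands in for the image NN; the weights and
# the "image" are random, pretend-trained values for illustration only.
w = rng.normal(size=100)  # fixed classifier weights
x = rng.normal(size=100)  # the original "image", flattened

def p_toaster(img):
    """The model's confidence that the image is a toaster."""
    return 1 / (1 + np.exp(-img @ w))

# Gradient-sign tweak: nudge every pixel by a tiny eps in the direction that
# most increases the "toaster" score. For a linear model, the gradient of the
# score with respect to the input is just w.
eps = 0.05
x_adv = x + eps * np.sign(w)

# Each pixel changed by at most 0.05, yet the score reliably goes up.
print(p_toaster(x), p_toaster(x_adv))
```

The perturbation is imperceptibly small per pixel, but because it is aligned with the model’s own sensitivities, the classification swings, which is why the fooling images look unremarkable to us.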

Different Data, Different Features

[Warning: this isn’t my area but it is my blog so I’m going there anyway]

One reason figure 2 is so alien to us is that it comes from a very alien place. Compared to a human being, a NN’s training data is extremely constrained. Yes, they see millions of images, and that seems like a lot. But if we see a qualitatively different image every three seconds, and we’re awake 16 hours a day, then we see a million distinct images every 52 days. And unlike most image classification NNs, we see those images in sequence, which is additional information. Add to that inputs from the rest of our senses, plus intuitions we get from being embodied in the world, plus feedback we get from social learning, plus the ability to try and physically change the world, and it starts to become obvious why we zero in on different things from NNs.
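The 52-day figure checks out:

```python
# One qualitatively different image every 3 seconds, 16 waking hours per day.
images_per_day = 16 * 60 * 60 / 3
days_for_a_million = 1_000_000 / images_per_day
print(round(days_for_a_million))  # ≈ 52 days
```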

In particular, NNs are (today) trained to perform very well on narrow tasks. Human beings navigate far more diverse problems, many of which are one-of-a-kind. That kind of diverse experience gives us a better framework for understanding “how the world works” on the whole, but less expertise with any one problem. When faced with a novel problem, we can use our blueprint for “how the world works” to find applicable knowledge from other domains (figure 3). And this skill of transferring knowledge across domains is one that we get better at with practice, but which requires knowledge of many domains before you can even begin to practice.

Figure 3. “I gave it a cold.”

My earlier post on the use of metaphor in alchemy and chemistry illustrates how a better blueprint for “how the world works” can dramatically improve feature selection. Prior to 1550, alchemists used metaphor extensively to guide their efforts, but it mostly led them astray. They chose metaphors on the basis of theological and symbolic similarities, rather than underlying interactions and processes. This isn’t a bad idea, if you think the world is run by supernatural entities with a penchant for communicating revelations and other hidden knowledge to mankind. But a better understanding of “how the world works” (i.e., according to impersonal laws) allowed later chemists to choose more fruitful metaphors than the alchemists.

When I see something like Figure 2, I see an intelligence that hasn’t learned how the world “really is.” Animals and physical objects are clumps of matter, not diffuse color patterns, no matter how much those color patterns align with previously seen pixel combinations. But I can see how it would be harder to know that if you hadn’t handled animals, seen them from different angles, and been embodied in physical space.

So I think one reason human metaphor transfers knowledge so well is that it has so much more diverse training data to draw on. We pick deep features with an eye on “how the world works.” So why don’t AI companies just give their own NNs more diverse training data? One reason is that important parts of the structure of NNs still have to be hand-tuned to the kind of training data involved. You can’t just let an image classification architecture loose on the game of Go and expect to get comparable results. There seems to be a big role for the architecture of NNs.

Whatever the “right” architecture is for the diverse training data humans encounter, evolution seems to have found it. But it took a long time. Evolution worked on the problem for hundreds of millions of years in parallel over billions of life forms. By contrast, AlphaGoZero played 21 million games of Go to train itself. At one hour per game, that works out to a bit under 2,400 years, if the games were played at human speed one at a time.
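Checking the arithmetic:

```python
# 21 million self-play games, one hour each, played one at a time.
games = 21_000_000
hours = games * 1
years = hours / (24 * 365)
print(round(years))  # ≈ 2397, a bit under 2,400 years
```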

In a sense, I think this makes NNs more impressive – look how much they’ve done with the equivalent of a paltry 2,400 years of evolution! But I also think it provides a warning that matching broadly human performance might be a lot harder than recent advances have suggested.


Faking Genius

Geniuses are rare in life, but common in fiction. No offense to our writing class, but I suspect a lot of these fictional geniuses are written by smart-but-not-genius writers. But how can this be? How does a non-genius author write a genius character? If the character is smarter than the author, then their thoughts and decisions are, by definition, the kind of things the author wouldn’t think of when in that situation!

How do you fake genius? I’ve noticed three strategies authors use.

House, M.D.

The Genius Who Knows Lots of Stuff

This is the most common and, to me, most annoying strategy. It treats geniuses as little more than people who know lots of facts. I haven’t watched that much House M.D., but from what I’ve seen he’s an archetype of this format. Someone comes in with weird symptoms and House is the only one who knows about the rare disease that matches the symptoms. He is a walking storehouse of weird disease trivia (I know, I know, there’s more to him than that, it’s an illustration not a criticism of the show).

This is a pretty easy strategy for a writer to implement. The writer just uses google and a bookshelf to give the genius a torrent of factoids to say. But it’s also the strategy that leaves me cold, precisely because it’s easy to implement. It’s no more illuminating than flipping through an old set of Trivial Pursuit cards.

A twist on this type is the genius who knows which facts are the right ones. In this case, the author lays down the “real” clues, but then buries them under a pile of extraneous detail and red herrings. The author then makes the genius (usually a detective here) able to sniff out the real clues from the red herrings. The veracity of the “real” clues is proved when they solve the problem. Maybe they point to a villain who confesses or tries to kill the protagonist when outed. Or maybe they point to a treatment that cures the patient. Afterwards, the audience is satisfied that the real clues were there, ready to be seen, but we stand in awe of the genius’ “ability” to see what we had missed.

Geordi La Forge

The Genius Only Intelligible via Metaphor

The next type of genius is so much smarter than us that his speech is incomprehensible. We, the audience, are like dogs trying to understand humans. We recognize some of the words (frequently the word is quantum), but their connections are baffling. Frustrated, the genius then explains the gist of his idea with a simple metaphor that we can understand. Often the genius has to be prompted by someone saying “in English please!”

Star Trek’s tendency to do this was lampooned on Futurama (episode 412, “Where No Fan Has Gone Before”):

Leela: I didn’t wanna leave them either, Fry, but what are we supposed to do?
Fry: Well, usually on the show someone would come up with a complicated plan then explain it with a simple analogy.
Leela: Hmm. If we can re-route engine power through the primary weapons and reconfigure them to Melllvar’s frequency, that should overload his electro-quantum structure.
Bender: Like putting too much air in a balloon!
Fry: Of course! It’s so simple!

Star Trek is hardly the only party guilty of this trick. The Marvel movies do this when Bruce Banner and Tony Stark talk, for example. It’s not absent from more high-brow stuff either (e.g., Dr. Shevek’s explanations of his new physics in Ursula K. Le Guin’s The Dispossessed). This strategy seems to be used a lot in science fiction, precisely because in that genre we are dealing with technologies that haven’t been invented. If the author could explain exactly how they worked, then it wouldn’t be science fiction!

I think this method can actually be very effective. I am fond of this example, from the independent movie Travelling Salesman. In the movie, one of the protagonists has cracked the P versus NP problem. In brief, a proof related to the problem would allow us to solve difficult problems (like finding the factors of large numbers) at super speed. This is an open problem, so of course the writers can’t describe how it would really be solved. Instead, they use the following metaphor:

Tim: What if I took something like a quid coin, ok, and I buried it in the [desert]? It’s buried, you have no idea where it is, and I ask you to find it. How long would that take you?
Hugh: (scoffs) well-
Tim: Years, right, I mean millions of years if the desert were big enough.
Hugh: Sure
Tim: What if I melted the sand? Took all the sand in the desert and melted it. Glass. The whole desert becomes one big sheet of glass. So now finding the coin is easy, right? You just– you see it floating there. Change the sand to glass and finding the coin is trivial.

The metaphor conveys the idea that the genius has found a way to peer through all the complexity of a problem and see straight to the answer. But the writer doesn’t actually have to explain how it’s done.

This strategy is easy enough to write. You need a lot of complicated technical-scientific-literary buzzwords. You need a metaphor for the genius’ idea, but you don’t have to have the details worked out. Then you just alternate between the two modes of expression as needed. It’s kind of the empty calories of insight, because it gives the feeling of understanding without the reality, but I still prefer it to “the genius who knows lots of stuff.”

Ender Wiggin

The Handicapped Genius

A final type of genius is more satisfying, at least to me. In this case, the genius operates under a handicap so that exhibiting high (but not genius) intelligence by modern standards is itself proof of genius. A great example is Ender Wiggin, from Ender’s Game. In the book, Ender has a number of cool insights about warfare in three dimensions, and in general exhibits adult level intelligence. But he’s only six years old! A six year old exhibiting adult level reasoning is believable as a genius.

Another common twist is to put your genius in the past and have them be ahead of their time. The character of Thomasina in Arcadia is an example. In the 1800s and without the aid of computers, Thomasina discovers fractals (actually discovered by Benoit Mandelbrot in the 1970s). When the character is ahead of their time, the author has an excuse to illustrate the derivation of actually brilliant things (like fractals). But because these things have been discovered, with a bit of research, the author can learn how they were initially derived and then just copy that for their genius.

This strategy is more satisfying to me than the others because it usually exhibits reasoning from A to B, illustrates the connections between ideas, and so on. The facts aren’t just a torrent, but form a web of relationships. And for characters who are ahead of their time, you might get an idea of what it’s like to be inside the mind of genius. The catch is, you are actually reading a sort of disguised biography of whatever genius discovered the thing (like fractals) we are pretending was discovered by the fictional genius.

How to Fake Genius

So that’s how I’ve seen it done. Despite the tone of the above, I actually don’t think these are bad places to start.  These strategies do get at some truths about geniuses: they do know a lot of stuff and they frequently are unintelligible unless they talk down to us using simple metaphors. Read a pop-physics book for copious examples.

But I would love it if we could go further. If I could feel what it’s like to really be in the head of a first class mind, that would be great. Can we do better? I’m not a writer of fiction, a genius, or even a psychologist, but I have done some research on “innovation,” so I’m not 100% unqualified to make some suggestions. Specifically, I think a good genius character should have the following characteristics:

  1. Geniuses know many things.
  2. Geniuses think with both speed and endurance.
  3. Geniuses think clearly.
  4. Geniuses have a lot of working memory.
  5. Geniuses make unusual connections between disparate concepts.

Most of these traits are not that hard to fake with time and tools. Let’s take them in turn.

1. Geniuses know many things

This is the one nearly everyone gets right, so I’ll be brief. Google and libraries are your friend. A team of writers can pretend all their accumulated knowledge fits in one genius’ head.

2. Geniuses think with both speed and endurance

Another easy one to fake. The writer can ponder the perfect witty retort for their genius for an hour, a week, or a year. But when they put pen to paper, it will seem as if it was instantly on the genius’ lips.

Geniuses are also frequently capable of intense focus for long periods of time. The author can afford to be scatter-brained, as long as they have more time than the genius to ponder. The audience need not know that one day of focused attention by the fictional genius took the writer a few months of scattered attention.

3. Geniuses think clearly

By this, I mean that geniuses don’t make weak arguments and logical errors. Unfortunately, as laid out at length in Mercier and Sperber’s The Enigma of Reason, individuals have a hard time objectively evaluating the strength of their own arguments. This is bad news, because Mercier and Sperber also present evidence that humans are quite good at objectively evaluating the arguments of others. If the author gives their genius a bad argument, the audience is more apt to spot it than the author.

Fortunately, we can use the same trick to our advantage. A good way to ensure your genius’ arguments are strong is to find a partner (or several) to talk them over with. A group of debaters, each of whom is individually biased towards their own argument, can nonetheless form a very clever collective intelligence because they can objectively evaluate each other. An author can, however, bring these disparate voices into the head of a lone genius to make the collective mind a singular one.

4. Geniuses have a lot of working memory

On average, people can hold about 7 pieces of information in their head at the same time (some more, some less). Geniuses, I presume, can hold more. This is important because it’s much easier to see connections between ideas that are held in working memory. Thus, geniuses can perhaps see how larger sets of facts are connected to each other.

Now we are getting into terrain a bit harder to fake. Pen and paper are useful tools for keeping facts close to hand if not in your brain. You can work out the genius’ idea with lots of time and paper (including a lot of paper that is discarded) and then pretend it all happened in their head. Another possible technique is to “chunk” several pieces of information into a broader concept, so it can be worked with. This takes longer than it would for a genius (you have to spend time understanding the chunked concept), but it’s a price of faking genius.

5. Geniuses Make Unusual Connections Between Disparate Concepts

This is hardest to fake. One possibility is to mine your own life for the top 2 or 3 epiphanies and then to reverse engineer a scenario for them to emerge. It helps if you keep a record of your thoughts. Alternatively, you might pick a few disparate subject areas, read deeply in them, and attempt to harvest a surprising connection or two. Again, reverse engineer a setting for those connections to emerge. In either case, the goal would be to make it seem as if these kinds of realizations are ordinary events for the fictional genius.

Fake Geniuses Among Us

It’s irresistible to wonder if we can’t use similar tricks to fake genius in our real lives. I think it’s not only possible but common. Indeed, this is the kind of thing academics and scientists do all the time. We cite things we haven’t read carefully. At seminars, only one person presents, even if the work came from many. Our papers omit the missteps, dead-ends, and other frustrations of research. There’s no place in a paper’s methods section to write “then I thought about the problem off-and-on for two years.” We talk our ideas through at length with our colleagues. We use computers and paper to augment our paltry memory. And we pick and choose research questions that are well suited to the weird ideas we want to explore.

If there’s a larger point, it’s this: I am suspicious of the notion that the difference between us and geniuses is one of kind and not merely of degree. I am suspicious that they can ever be incomprehensible, so long as we give ourselves sufficient time and tools to work out their thoughts. Brainpower, time, and thinking tools are all inputs into great ideas, but to a large degree I think we can substitute the latter two for the first.


Finding the Right Metaphor

Last post introduced the hypothesis that having more metaphors equips people to innovate, because it expands their set of tools for thinking about novel settings. It presented some reasons to think this is true. This post pushes back on that simplistic hypothesis. Having the right metaphor is only half the battle; you also have to find it.

Could it be that having more metaphors is sometimes a handicap? First off, having more metaphors to search through could make it harder to find the right one. Second, having more metaphors might cause a problem analogous to over-fitting. You might have a metaphor that fits many of the surface details of the problem at hand, but not the smaller number of crucial “deep features.”

Metaphors in Alchemy and Chemistry

This is an interesting idea tossed out by David Wootton in The Invention of Science.

For classical and Renaissance authors every well-known animal or plant came with a complex chain of associations and meanings. Lions were regal and courageous; peacocks were proud; ants were industrious; foxes were cunning. Descriptions moved easily from the physical to the symbolic and were incomplete without a range of references to poets and philosophers. [p. 81]

Wootton suggests this formed a sort of reasoning trap. Every potential metaphor had so much baggage in the form of irrelevant features that it became hard to use them well.

I think the use of metaphors in alchemy provides a good illustration of this phenomenon. Gentner and Jeziorski (1990) compare the use of metaphor/analogy in alchemy (prior to 1550) and chemistry (post 1600s). Compared to chemistry, alchemy seems to be endlessly led astray by metaphors chosen on the basis of red herring associations.

As explained by Gentner and Jeziorski, a goal of alchemists was to transmute base metals into more valuable ones via the “philosopher’s stone.” This stone was often called an egg, since eggs symbolized the “limitless generativity of the universe.” From Gentner and Jeziorski, a sampling of alchemical thought:

1. It has been said that the egg is composed of the four elements because it is the image of the world and contains in itself the four elements…
2. The shell of the egg is an element like earth, cold and dry; it has been called copper, iron, tin, lead. The white of the egg is the water divine, the yellow of the egg is couperose [sulfate], the oily portion is fire.
3. The egg has been called the seed and its shell the skin; its white and its yellow the flesh, its oily part, the soul, its aqueous, the breath or the air.

As an egg is composed of three things, the shell, the white, and the yolk, so is our Philosophical Egg composed of a body, soul, and spirit. Yet in truth it is but one thing [one mercurial genus], a trinity in unity and unity in trinity – Sulphur, Mercury, and Arsenic. [p. 12]

Even as brilliant a mind as Isaac Newton got lost in the alchemical thickets of meaning. Mercier and Sperber give us this sampling of his alchemical musings:

Neptune, with his trident leads philosophers into the academic garden. Therefore Neptune is a mineral, watery solvent and the trident is a water ferment like the Caduceus of Mercury, with which Mercury is fermented, namely, two dry Doves with dry ferrous copper. [p. 326]

Gentner and Jeziorski contrast this style of thinking with the use of metaphor/analogy by later chemists. For example, Robert Boyle, trying to illustrate how individually minute effects can aggregate into large-scale ones, uses metaphors of ants moving a heap of eggs and wind tugging on the leaves and twigs of a branch until it snaps off (Gentner and Jeziorski p. 8). Sadi Carnot uses the metaphor of water falling over a waterfall to understand the flow of heat through an engine.

What’s interesting is that neither Boyle nor Carnot used novel metaphors that had been unavailable to the alchemists (although the math associated with Carnot’s model was new). It wasn’t the availability of new metaphors that differentiated chemists from alchemists; it was the selection and use of existing ones.

It’s About Selection

Metaphors are an effective form of amateur modeling. When facing a novel situation, we take a leap of faith that a similar and known situation will serve as a useful map of the unexplored terrain. Having more metaphors at your fingertips increases the chances one of them will be a good map for the situation. But the ability to find this metaphor matters nearly as much as having it. We need a library that is large but also organized.

Hofstadter and Sander argue metaphor selection is the real test of domain expertise. There are a lot of dimensions along which a metaphor can match the thing to be explained. I’ve written before that it’s the deep features that need to be matched, but there aren’t universal guidelines for differentiating a deep feature from a surface feature. Alchemists thought surface similarity to an egg was the important feature for proto-chemical work. That seems silly to us now, but I suspect that’s because we assume the world operates according to impersonal laws. If you believe the world is instead run by supernatural entities with a penchant for communicating revelations and other hidden knowledge to mankind, then any features with theological symbolism probably seem like the deepest features of a problem.

I can’t resist one more example. Crosby, in his excellent book on quantification in Europe over 1250-1600, makes a similar point about something as seemingly association-free as numbers:

Western mathematics seethed with messages… in the Middle Ages and Renaissance. Even in the hands of an expert – or, especially, in the hands of an expert – it was a source of extraquantitative news. Roger Bacon, for instance, tried very hard to predict the downfall of Islam numerically. He searched through the writings of Abu Ma’shar, the greatest of the astrologers who wrote in Arabic, and found that Abu Ma’shar had discovered a cycle in history of 693 years. That cycle had raised up Islam and would carry it down 693 years later, which should be in the near future, Bacon thought. The cycle was validated in the Bible in the Revelation of St. John the Divine 13:18, which Bacon thought disclosed that “the number” of the Beast or Antichrist was 663, a number certain to be linked to other radical changes. [p. 121]

Never mind that the number of the beast is 666 – Bacon’s Bible apparently had a typo – and that neither 666 nor 663 is equal to 693! In this era, those were not “deep features” of metaphorical similarity.

Hofstadter and Sander essentially say practice and subtle skills determine how good an expert is at selecting the right metaphor. For many domains, that is surely correct. But I see two other tricks for organizing our library of metaphors.


The first trick is unification. If we can subsume individual cases under more and more universal ones, then we reduce the number of metaphors we have to search through. We also reduce the risk that we overfit to a metaphor with many surface similarities. To stick with our metaphor of a metaphor library, this is like replacing a whole shelf with a single book.

For Hofstadter and Sander, this is essentially what categorization is. They give the example of the category mother emerging from a child’s encounter with more and more people who are somebody’s mother. Whereas the child may initially have to remember a large set of disconnected facts – Rachel is a mother, Sarah is a mother, Thomas is not a mother, Rebecca is not a mother, etc. – over time the category of mother emerges. When a new person is encountered, the child no longer has to mentally compare them one by one with Rachel, Sarah, Thomas, Rebecca, and others. Instead, the child can quickly fill in a lot of gaps in its knowledge about the person upon learning she fits into the mother category.

This is common in science, where models belong to the same family as metaphors (link). For example, many results in economics might initially be derived using a specific functional form. We might assume demand is given by the equation q = A − Bp, where p is a price and A and B are positive numbers. Or we might assume it’s given by q = A − B·log(p), or by q = A·p^(−B). This requires us to carry around all these different equations in our minds. Life is much simpler when we are able to generalize our result to any continuous function where demand is falling in price.
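The unification point can be checked numerically. The sketch below (parameter values are arbitrary illustrations, not from any real model) verifies that each of the three specific demand curves is just a special case of the general statement “demand falls in price”:

```python
import math

# Three specific demand curves, all special cases of "demand falls in price".
# A and B are arbitrary positive constants chosen for illustration.
A, B = 100.0, 2.0

demand_curves = {
    "linear": lambda p: A - B * p,
    "log":    lambda p: A - B * math.log(p),
    "power":  lambda p: A * p ** (-B),
}

# Check that each curve is strictly decreasing over a grid of prices.
prices = [1.0, 2.0, 5.0, 10.0, 20.0]
for name, q in demand_curves.items():
    values = [q(p) for p in prices]
    assert all(a > b for a, b in zip(values, values[1:])), name
    print(f"{name}: decreasing in price")
```

A result proved once for “any continuous, decreasing demand function” covers all three of these forms, and any others, without carrying each equation around separately.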

You can see the same push for unification across most domains. It’s probably taken to its extreme in physics, where the quest for a single unified theory for the universe is taken to be a holy grail. From my outsider’s perspective (correct me if I’m wrong), historians seem to lie on the opposite end of the spectrum. In that field, detail matters.

Source: XKCD

With unification, we sacrifice details but we hopefully get the big picture right. It’s usually worse to get the big picture wrong than to get the details wrong, and since unification helps us zero in on the right “big picture” metaphor, it’s valuable. But unification becomes a problem when the knowledge domain resists it. This can happen when the details matter.

Systemization and Dani Rodrik’s Growth Diagnostics

When you can’t subsume everything under one case, you have to organize the cases. It’s fairly common to organize metaphors into big categories and then leave it at that. I’ve done that myself, grouping the representations used by innovators into five categories: rules, probabilistic statements, metaphors, neural networks, and instantiations. But you can also create rigorous processes for sorting through these categories and pinpointing the right metaphor for a given situation. The best example of this that I know of is Dani Rodrik’s growth diagnostics.

A bit of context is necessary to explain this. Rodrik is (among other things) a development economist. His growth diagnostics is a process for finding the correct economic model in that context. The problem development economists try to solve is why some national economies fail to grow at desired rates. Economic growth is a very complex and poorly understood phenomenon, and there are many competing models of the process. Each of these models is a simplification, and each provides different kinds of policy advice. One might emphasize the rule of law and secure property rights, another investment in education, and a third subsidies for favored industries. Rodrik’s growth diagnostics helps you pinpoint the model most applicable to the setting. Access to a good model then allows you to think through what the impacts of various policies might be.

Figure 2. Growth Diagnostics (from One Economics, Many Recipes by Dani Rodrik)

Growth diagnostics is basically a decision tree (figure 2). It starts at the top with the problem: insufficient private investment and entrepreneurship. Rodrik divides the possible causes of this into two categories: a low return to economic activity, or a high cost of finance. He then provides some suggestions for what kinds of evidence to look for to determine which is the case (e.g., “are interest rates low?”). Suppose we have decided there is a low return to economic activity. Moving down the tree, is the problem that there aren’t socially beneficial things to do (low social return), or merely that the private sector cannot find a way to make useful things profitable (low appropriability)? Again, Rodrik suggests specific things to look for to help you determine which branch of the tree you should descend to.

Following the tree gets you down to a simple economic model of what is constraining growth. Economics has failed to discover a single model to explain everything, but with a procedure for finding the right model, it remains useful. It illustrates how a well-stocked library of metaphors can be made maximally useful when combined with a framework for zeroing in on the right one.
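Rodrik’s diagnostic procedure is, at heart, an algorithm, and it can be sketched as a small decision tree in code. To be clear, everything below is a loose illustration: the node labels follow figure 2, but the question wording, the example diagnoses, and the yes/no structure are my own simplification, not Rodrik’s.

```python
# A minimal decision-tree sketch of growth diagnostics.
# Node labels loosely follow Rodrik's figure 2; wording is illustrative.
tree = {
    "question": "Is the problem a low return to economic activity?",
    "yes": {
        "question": "Are there few socially valuable things to do?",
        "yes": "Diagnosis: low social returns (e.g., poor infrastructure)",
        "no": "Diagnosis: low appropriability (e.g., weak property rights)",
    },
    "no": "Diagnosis: high cost of finance (e.g., poor intermediation)",
}

def diagnose(node, answers):
    """Walk the tree with a list of yes/no answers until a diagnosis is reached."""
    for answer in answers:
        if isinstance(node, str):  # already at a leaf (a diagnosis)
            break
        node = node["yes" if answer else "no"]
    return node

# Answering yes (low returns) then no (things could be socially valuable)
# walks down to the low-appropriability branch.
print(diagnose(tree, [True, False]))
```

The value of the structure is that each branch point tells you what evidence to gather next, so the search through candidate models is systematic rather than a scan of the whole library.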


Innovation, by definition, requires stepping out of the familiar and into the unknown. These sojourns go better for us when we have a map of the territory. Metaphors can serve this function; they assert the unknown is “like” the known. Having more of these maps is useful, because it’s more likely one of the maps is a good fit (link). But this is only true if we can find that map. Maps with too many details can lead us astray, because the details might fit really well but the big picture is off. If having more maps is going to be useful, we need to organize them. One way is to prune our collection to a small number of maps with only the most important details. The other is to create a meta-map over our maps: a process for determining the deep features of the situation and matching them to the corresponding metaphor.


Do Metaphors Make Us Smarter?

One of the ways we navigate a world full of novel situations and problems is by using metaphors. When facing a new situation or problem, we take a leap of faith that things we don’t know are like things we do know. We search through our memory and find a situation that is “similar” to the one at hand. We use that previous experience as a metaphor for the current one. If it’s a good metaphor, it’s a roadmap through the unknown.

An implication of this is that having more metaphors, and more diverse metaphors, is a powerful asset for thinking. All else equal, having more metaphors increases the chances one of them will be a good fit for your current problem. In this post, I’ll present some arguments that suggest this is the case. The next post adds some nuance: our ability to navigate our personal library of metaphors matters as much as (or more than) its size.

Three Cheers for Metaphors

I’m unaware of any study that directly compares innovation and the number of metaphors (drop me a line if you know of one). But there are a few lines of evidence that strike me as at least consistent with the theory.

1. Metaphors to Solve Problems

The closest thing we have to a direct test of the theory are psychology experiments.

  1. Give some subjects a metaphor well suited to solving a problem.
  2. Don’t give it to a control.
  3. See if the metaphor-equipped group is better at solving the problem.

Gick and Holyoak (1980) is an early example of the type. The authors asked study participants to solve the following problem. A patient has a stomach tumor that must be destroyed. No surgery is permitted, but a beam of radiation can destroy the tumor without operating. The problem is that any beam strong enough to kill the tumor is also strong enough to kill the tissue it must penetrate to reach the tumor. Any beam weak enough to leave the healthy tissue unharmed is also too weak to destroy the tumor. What should the doctor do?

While you ponder that, let me tell you another story. Totally unrelated, I swear. Once upon a time there was a general who wanted to capture a city. His army was large enough for the task, and there were many roads to the city. Alas, each road was mined. Any force large enough to take the city would detonate the mines as it moved down the road. A smaller force could move down the road without detonating the mines, but would be too small to take the city. What to do?

Fortunately, the general came up with a solution. He divided his army into many smaller divisions, and sent each down a separate road. Each division was too small to detonate the mines, and so they all converged on the city at the same time, and captured it.

Wow, what a neat story.

Now, have you figured out the tumor problem?

The trick is to use many weak beams of radiation, each pointing to the tumor from different directions. They should all intersect at the tumor’s location but nowhere else. Their combined energy will destroy only the tumor, and not the healthy tissue that must be penetrated.

Figure 1. Converging armies of radiation

The preceding is basically the experiment that Gick and Holyoak perform. They give people the tumor problem. Only 2 in 30 people were able to solve it on their own. Then they tell them the story of the general. This story’s solution is a tailor-made analogy for the tumor problem (converging weak forces), and 14 of 35 people were able to solve the tumor problem when given the general’s story. An additional 12 people came up with the solution after being given a hint to use the story to figure out the tumor problem.
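For concreteness, the solve rates implied by those figures work out as follows (this is just arithmetic on the numbers reported above):

```python
# Solve rates implied by the reported figures (arithmetic only).
no_story       = 2 / 30          # solved with no analogy available
story          = 14 / 35         # solved after hearing the general story
story_and_hint = (14 + 12) / 35  # solved after the story plus a hint to use it

print(f"no story:     {no_story:.0%}")        # ~7%
print(f"story:        {story:.0%}")           # 40%
print(f"story + hint: {story_and_hint:.0%}")  # ~74%
```

Roughly a six-fold improvement from having the metaphor, and a ten-fold improvement from having the metaphor plus a nudge to actually use it.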

A meta-analysis of 57 similar experiments obtains similar results. The intervention reliably has a small-to-medium sized effect. Giving people a solution wrapped up in a metaphor helps them find the solution. A bit.

2. Metaphors and Forecasting

Do these lab results carry over into real world settings? Forecasting forms a set of problems that are not contrived but do have definite “correct” answers. Phil Tetlock has been asking people to make political and economic forecasts for decades, and tracking the results. Drawing on a very old dichotomy, Tetlock classifies his forecasters as either “foxes” or “hedgehogs.” The idea is that “the fox knows many things and the hedgehog knows one big thing.” Tetlock consistently finds that foxes are better forecasters than hedgehogs.

Now, this isn’t really a direct test of the hypothesis. Tetlock isn’t “counting metaphors” for his forecasters. He classifies his thinkers as foxes or hedgehogs based on a series of questions about cognitive thinking style. These include agreement with statements such as “even after making up my mind, I am always eager to consider a different opinion” and “I prefer interacting with people whose opinions are very different from my own.” In general, Tetlock uses language like “hedgehogs view all situations through the same lens” and “foxes aggregate information, sometimes contradictory information, from many different sources.”

However, there is evidence that foxes make use of more and different analogies. Tetlock notes “foxes were more disposed than hedgehogs to invoke multiple analogies for the same case (Tetlock, p. 92).” And it seems to help them navigate novel situations.

3. Metaphors and the Individual

There are also some observational studies consistent with the idea that more metaphors make us smarter.

  • Highly creative people (which we can use as a proxy for innovation) tend to be open to new experiences and curious (Sawyer, p. 64). These are two channels through which people may acquire additional concepts that can be used as metaphors.
  • Multicultural individuals, such as people who have lived abroad, show more creativity. Living in a different culture is, of course, a rich source of new metaphors.
  • Scientific “geniuses” tend to have broad interests (Simonton, p. 112): they have more diverse hobbies (painting, art collecting, drawing, poetry, photography, crafts, music) and are voracious readers, including extensive reading outside their main discipline. Again, this is hardly a direct measure of the size of their metaphor libraries. But broad interests would tend to foster a larger and more diverse set of potentially useful metaphors.

Of course, alternative and additional explanatory factors are also possible. But these threads are at least consistent with the story that having access to more metaphors facilitates innovation.

4. Metaphors over Time

It’s not hard to establish that the set of metaphors has grown over time. In their book on metaphor, Hofstadter and Sander compile an illustrative list of concepts unavailable to most generations of humanity. Each of these is available as a metaphor to people alive today, but not to people living, say, 100 years ago. Here are 25 examples from their list of over 100 (p. 129):

  • DNA
  • virus
  • chemical bond
  • catalyst
  • cloning
  • email
  • phishing
  • six degrees of separation
  • uploading and downloading
  • video game
  • data mining
  • instant replay
  • galaxy
  • black hole
  • atom
  • antimatter
  • X-ray
  • heart transplant
  • space station
  • bungee jumping
  • channel-surfing
  • stock market crash
  • placebo
  • wind chill factor
  • greenhouse effect

If these concepts are a better metaphor for some novel situations, then denizens of the modern world have a leg up on their ancestors. They are equipped with a bigger toolbox for handling novelty. This could be one factor explaining the Flynn effect, whereby IQ scores on standardized tests rise in each generation.

Additionally, there are many scientific theories whose discovery depended on metaphors that were not always available. The proliferation of clocks in Europe may have made Europeans amenable to thinking of the universe as a machine following strict natural laws, instead of the whims of spirits (Wootton link). Niels Bohr used the heliocentric model of the solar system as a metaphor for the atom – a metaphor that would have been basically unavailable before Copernicus. And Einstein used principles derived from Newton’s classical physics to guide his hunt for the theories of special and general relativity. Again, the evidence is at least consistent with more metaphors being an asset to thinking.

5. Metaphors across Geography

More controversially, some have speculated that similar channels explain the correlation between economic prosperity and test performance (IQ, standardized math and others). GDP per capita is positively correlated with the average performance of a nation on standardized tests. A lot of people argue that this is because human capital/intelligence/IQ (whatever you want to call it) leads to economic prosperity (more innovation, better policies, more cooperation, more patience, etc.). But the causal arrow could just as well point in the opposite direction. Countries with more economic prosperity tend to have more complex economies (link to Hidalgo), more literacy, and greater access to digital information. All three of these channels may well expose the typical citizen to a more diverse set of concepts and processes. And these concepts and processes will then be available as metaphors. This bigger library of metaphors could then be a reason people in these countries do better on standardized tests.

Finally, just as there is some evidence that multicultural individuals are more creative, countries with populations drawn from many different places tend to have higher patent intensity and greater economic prosperity. Again, this is hardly a direct test, but we can imagine that people from different countries bring different sets of metaphors with them. A country with people from many different countries might have a more diverse set of metaphors, which could partially account for its higher performance in innovation.

A Virtuous Circle?

Item #1 above provides the most direct evidence that access to metaphors facilitates problem solving. The remaining items all show that more diverse people, information, interests, and concepts tend to lean in the same direction as various metrics for innovation. Correlation isn’t causation, but it’s possible the set of available metaphors is a causal link between the two. If that’s the case, then as society gets more metaphors it gets better at innovating. Maybe we are living in a virtuous circle where innovation leads to social complexity, social complexity leads to a wider set of metaphors, and access to a wider set of metaphors leads to innovation!

Figure 2. A Virtuous Circle?

I think there is some truth in that story, but also that it’s only part of the story. Maybe a small part. For items #2-#5 above, the evidence is pretty indirect. We don’t know if people really do expand their set of metaphors via the hypothesized channels, and we don’t know if they use those metaphors to innovate. Lots of other things are going on, and we don’t know how much those other factors matter. We also can’t be sure this isn’t a spurious correlation wherein “smart” people have diverse interests, but these don’t inform their ability to innovate.

Item #1 provides the most direct evidence that metaphors are causally related to innovation. However, even when we give people a perfect metaphor right before we test them, the effect is not large. And if we wait a day to test them, performance declines. It appears that there is a lot more to innovation than simply having access to the right metaphor.

Staying with the metaphor of having a library of metaphors, a major problem might be our ability to search the library. Which metaphor is the right one for a given problem? This is the problem we turn to in the next post…