Are Markets for Art Criticism Efficient?
Part 2 of a series on subjectivity
I’ve recently become really interested in what makes an intellectual tradition feel subjective or objective, or some shade in between.
It’s so interesting that there is remarkable convergence in film reviews and ethics or even ice cream flavours, yet sometimes two people will walk out of a theatre and have completely opposite takes on a movie. Why does this never happen in math? What does the dreaded phrase “Well, it’s just an opinion” even mean? I really desired to systematize this whole realm of debate without appealing to some hubristic “objective” truth, like many moral or aesthetic philosophers do. If you haven’t yet read it, here is my post where I sketch my theory of what’s happening here, appealing to a coherence view of truth and a 3 variable model of operationalizability, verification infrastructure, and incentives. At first I thought it was so cool to explain it all through just incentives, but I needed to add measurability for it to be a good model. Anyway, if successful, I can explain all of these phenomena simply through conflicting assumptions within the tradition!
However, I’ve hit a snag with my theory. Thus, part 2.
Back 7 months ago, Tyler Cowen recorded a podcast about Bach with Evan Goldfine. Tyler said something I thought was really really interesting, but it was quickly glossed over with another question. I frequently have this problem with podcasts. Interesting lines of questioning are just incessantly washed away by the tide of witty questions. Anyway, this is what he said:
Evan Goldfine: Okay, so, so 15, 20 minutes, you get the deal. 40 minutes maybe? So, so how does that play into your music listening, especially with serious music?
Do you, do you get a flavor and leave, do repeats make you nuts? What’s your consumption function on those pieces?
Tyler Cowen: Well, there’s a big difference. So I think you more or less know in advance what is good. So one belief I hold is that every composer considered to be great is indeed great, that the critical market for classical music is remarkably efficient.
And the people who would be called the best, Bach, Beethoven, Mozart, they are the best. And you really just can’t argue it that much. So someone like Schoenberg who took me a while to appreciate, I knew he was the most important and best composer of atonal music, so I just kept on listening. I knew I’d get there, I did.
And there’s no uncertainty. Or as a movie, you can read reviews, but it’s like you know more than the reviewers often. And if you don’t like it, you don’t like it. Don’t necess. You don’t usually have some historical reason for wanting to know what the movie’s about. You would with some films like Citizen [00:14:00] Kane, but say, that’s great anyway.
The critical market for classical music is remarkably efficient! What a delightfully weird and fresh statement. (Yes I know, it is an Outrageous Fortune cliche at this point to profess our love for Tyler Cowen.)
But as lovely as he is, Tyler has now created a problem for my model! (Unfortunately: Cowen’s 1st law)
If you forget from my first post, I explain convergence in traditions like so:
1. Operationalizability: Can many parts of the tradition have testable criteria for truth? Mathematics and physics score high here, film criticism scores low.
2. Verification infrastructure: (Interestingly wine tasting scores higher than ice cream tasting, simply because there are more adjudication bodies.)
3. And finally stakes: Do incentives push traditions to build and maintain the first two parts?
Classical music criticism scores low on 1, 2, and 3. Below, you can see it slots nicely into point 4.
High measurability + high stakes: strong convergence (aviation safety, formal engineering).
High measurability + low stakes: marginally lower convergence without pressure (puzzles, chess).
Low measurability + high stakes: polarization and use of authority (law, ethics, policy).
Low measurability + low stakes: pluralism and taste (film, food & fashion).
Yet it’s true, classical music has remarkable convergence, and feels more objective than food or fashion or film. However, another puzzle: modern film criticism might seem pretty subjective, but classical film criticism feels much different, and doesn’t seem like it deserves to be in 4 either. It feels like it should be in the company of classical music and ethics rather than fashion, for example. I still agree with Cowen that it is marginally less efficient than classical though. So we have three problems. (1) I can’t explain why pre-modern classical and film shouldn’t be in 4, (2) why their modern equivalents seem so much more subjective, and (3) why even between the pre-modern markets, classical music seems marginally more efficient than classic film.
So what explains all of this?
I think there’s two things going on here. One can be explained by my model, the second cannot.
For the first part, let’s compare 3 people as an example. A modern film critic, a pre-modern classical music critic, and an investor. First, what does it even mean for a market to be efficient? The unhelpful answer is when the price of an asset = the expected long run return. Okay let’s toss that out. Not sufficiently general. New definition: A market is efficient if the price an investor assigns to an asset always = it’s long run value. So investors are always correct about the true value of an asset.
Now let’s look at their incentive structures.
The investor’s survival is on the line. This is this guy’s life. Perhaps if he isn’t solely on the finance bro self improvement grindset he even has a wife and kids that depend on him. He’s also probably obsessed with money and reputation, which are of course more causally associated with his job than the other two.
The film critic is quite different. If he, like the investor, makes money in his practice, then he is steeped (too conventional, hmmm. How about marinated? Pickled? impregnated with? my bad) in the attention market of release cycles. On the margin, he’s optimizing way more for clickbait and virality. As a result, the signal to noise ratio is lower than a snake’s belly in a wagon rut. (I learned this phrase from an old man recently.)
The classical critic, on the other hand, is not employed in his business of interest. He probably just does this out of pure enjoyment on his sunday afternoons. You would think that financial incentives would increase the quality of a market, not decrease it, but referring to my model above, high stakes (such as financial incentives) only have a positive relationship with objectivity when there is high measurability. Also, I guess I am partly arguing that a hobbyist that has reputational stakes within a small community of peers might actually be better optimizing for finding value than someone with mass-market financial stakes, which is an interesting conclusion.
So, that makes classic film and music criticism actually fall into (4), because both those writers are probably hobbyists. But again, we don’t want it in 4! It’s in bad company.
So, we’ve explained through incentives why there is a difference between modern and classical art criticism for the two markets. Argument (2) from above. But we haven’t yet figured out (1) and (3). And you’ll quickly see that even for (2), incentives don’t explain the whole thing.
To answer (1) and (3) and even partly (2), I realized I needed to add a mundane but essential variable. New model:
1. Measurability. (I fused operationalizability and verification infrastructure for simplicity).
2. Stakes. Are there serious incentives for traditions to build and maintain the first two parts?
3. And finally: Iterability of judgment. Can the same object be revisited and compared across generational audiences?
So, some traditions converge even without proof because repeated judgment can partly substitute for measurability! The conflicting assumptions get washed out almost as much as if they had high measurability. Hallelujah. I feel like the vocal melody at 0:34 of In the Meantime by Spacehog.
A combination of 3 (the hobbyist vs clickbait incentive argument from above) and 4 now explain why film and classical music criticism shows remarkable convergence, and even explains why, as Cowen said, classical feels marginally more objective than even classic film. (because of classical music scoring higher in 4, though it’s marginal because I think there are diminishing returns to 4)
My theory thus predicts that the markets for modern classical music and film would produce low convergence, comparatively, because they would score low on everything besides stakes, which often just increases polarization. This seems obviously true! They do have much lower convergence. If you think there are any other gaps in my theory, you are welcome to comment. I want it to be as strong as possible.


I think this is the view: convergence in a judgmental tradition (that is, a domain where we make truth-gesturing claims) is based on (1) measurability, (2) stakes, and (3) iterability of judgment. You can let me know if I'm applying this right.
So I want to predict how much convergence there will be in judgments about the best novels and I turn to the model. My own view is that quality is tied to a notion of evocation, but everyone I talk to has a pet theory and they invoke all kinds of different criteria. Low measurability on that dimension. However, there are well-respected prize committees, like the Man Booker, Pulitzer, and Nobel (which awards authors not novels), which can add some commonly-acknowledged authority to the process. In my view they do little to bridge the divide between judgments of 'classic' novels and more recent novels, leaving measurability quite low.
The stakes are high, I'd argue. One's investment in taste is generally increasing in their time spent with art and those able to make articulate judgments about the best novels are, by-and-large, going to be the sorts who care quite a bit about their taste being good. As an older prestige artform, novels are often taken to be one of the primary pillars of taste, so you're getting more considered views than on posters or baked goods.
This spits out the prediction that judgments about novels will be polarized and appeal to authority. To the contrary, I find these judgments often harmonious and lacking appeal to prizes won or a particular critic's praise. Certainly most, including myself, are relatively inarticulate and gnomic about why one novel is working itself better upon the soul, but it is so often a project of each arriving in Rome by different roads.
Now we also have this iterability mechanism, where we take it to be true that judgments of the same things over the long-run will have their idiosyncrasies cancel each other out, leaving behind signal without noise. Novels stay fixed over time, so they should be highly iterable. Perhaps a bit less than music, given the background knowledge the reader is often expected to have, but not so much less. If it were some other way then it would be far more difficult to read Korean novels in translation than it actually is.
The model says that in lieu of high measurability, high iterability can substitute. Therefore, we should anticipate convergence in judgments about novels. That convergence should come with some lag for newer works, given there hasn't been much time for different people at different times to bring their analysis to bear. That's why the jury is still out on the enduring legacy of the postmodernist novelists, for instance.
An interesting ramification of this, though, is that given high iterability, only stakes matter. So absent any measurables, persistence across time with an audience that cares should be enough to achieve convergence. Maybe that's why natural beauty is so universally agreed upon, as well as certain moral norms? A simpler model might be stakes (rooted in values) and iterability, where measurability is a scalar on how much iteration is required. We do begin to stray towards evolutionary story-telling with this, I fear.
I might add that the model seems misapplied in the case of, say, chess. It would be properly applied to 'chess problems' as objects of assessment, but improperly applied to predict convergence on the answer to any given problem. Given the premises of chess, the problem is deductively solvable. The model is inductive and I see no reason to over-extend it to such areas.