Tipping Points, Epistemic Integrity, and the True Scandal Behind ‘AusterityGate’

These are beautiful times for philosophers of economics. The recent financial crisis has caused about as much turmoil in the economics profession as it has in the financial markets, the economy and government balance sheets. Last week we learned that economists fail not only with respect to the theoretical, methodological and ethical foundations of their discipline: at least some of the discipline’s most prominent and influential members appear not to possess even the most basic epistemic integrity one would expect any scientist to have, and all the more so scientists whose public policy recommendations have far-reaching consequences for the welfare of millions of people.

To briefly recap the story (slightly more detailed ones here and here), it all started with a 2010 paper in which two Harvard economists, Carmen Reinhart and Kenneth Rogoff, purported to show that a country’s level of debt and GDP growth are negatively correlated. Moreover, their evidence seemed to indicate an important non-linearity: ‘the relationship between government debt and real GDP growth is weak for debt/GDP ratios below a threshold of 90 percent of GDP. Above 90 percent, median growth rates fall by one percent, and average growth falls considerably more’. 90 percent debt/GDP thus looked like a tipping point beyond which growth drops sharply.

In their paper, Reinhart and Rogoff were careful not to draw strong policy conclusions from their findings or to read them causally. But other statements lend themselves to causal interpretation (‘In a series of academic papers with Carmen Reinhart… we find that very high debt levels of 90% of GDP are a long-term secular drag on economic growth that often lasts for two decades or more’, see here), and they certainly regard the 90 percent threshold as an important indicator for policy (e.g., ‘Our analysis, based on these cases and the 23 others we identify, suggests that the long term risks of high debt are real.’, NBER Working Paper 18015, p. 23).

The timing of this research could hardly have been better. Many governments ran huge budget deficits to finance fiscal stimulus packages in the aftermath of the recent financial crisis. As a consequence, public debt/GDP ratios (in percent) soared all over the world between 2008 and 2012: from 64.8 to 101.6 in the US, from 44.5 to 89.8 in the UK, from 66.2 to 93.1 in the Eurozone, from 64.9 to 81.6 in Germany, and from 105.4 to 161.6 in Greece. Alas, not all countries could handle the increased levels of debt equally well: the US steered dangerously close to a ‘fiscal cliff’, several European countries such as Greece and Cyprus had to be bailed out by the IMF and the EFSF, and, probably, the best is yet to come. The IMF and the EFSF grant financial assistance only after a ‘country programme’ with the requesting government has been agreed on, and these programmes invariably contain numerous austerity measures. The Reinhart-Rogoff findings appear to justify austerity. Until an Amherst grad student cooked their goose.

Thomas Herndon tried to replicate the Reinhart-Rogoff findings as an exercise for a term paper in an econometrics class. But no matter what he tried, his results kept deviating from those published by the prominent economists. So he asked them for their spreadsheet, received it and found the sources of discrepancy: a stupid coding error, mysterious data exclusions and dubious methodological choices (download the paper, co-authored by Herndon’s supervisors, here). When corrected for the mistakes, growth remains slightly lower for countries with a debt/GDP ratio above 90 percent but the difference is not at all dramatic. I haven’t seen any evidence to the effect that Reinhart-Rogoff deliberately tweaked results. Thus far, they have admitted to the coding error but defended other aspects of their study. It is pretty clear, however, that the original analysis did not quite receive as much attention as it should have, especially considering the likely policy consequences of research of this kind.

And now? Two Harvard professors toppled from their pedestal, anti-austerity protests vindicated, a triumph of the left. The world makes sense again. At least if we are to trust the many commentators in the blogosphere.

The true scandal behind this episode in my view is that evidence of a negative correlation between the public debt/GDP ratio and growth should never have been taken to support austerity measures. Likewise, the absence of evidence for such a correlation certainly does not support public spending sprees.

As we all know, and as Paul Krugman reminds us (here), correlation does not imply causation. But this is an obvious and not very interesting point. My own main worry is that pooling data from many very heterogeneous countries and historical periods is likely to produce nonsensical results – or at least results that are not very useful for policy considerations.
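To make the pooling worry concrete, here is a minimal sketch with entirely invented numbers (two hypothetical countries, three years each). It is not a reconstruction of the Reinhart-Rogoff data or method. It only shows that a pooled correlation can be negative even when, within each country, more debt goes together with more growth.

```python
# Hypothetical illustration: every figure below is invented for this sketch.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation(observations):
    """Correlation between debt/GDP and growth for (debt, growth) pairs."""
    debt, growth = zip(*observations)
    return pearson(debt, growth)


# (debt/GDP in percent, real GDP growth in percent), one pair per year.
country_a = [(30, 4.0), (40, 4.2), (50, 4.4)]     # low debt, fast growth
country_b = [(100, 1.0), (110, 1.2), (120, 1.4)]  # high debt, slow growth

within_a = correlation(country_a)            # positive within country A
within_b = correlation(country_b)            # positive within country B
pooled = correlation(country_a + country_b)  # negative once pooled

print(within_a > 0, within_b > 0, pooled < 0)
```

In this toy case the negative pooled correlation reflects nothing but the difference between the two countries, which is exactly the kind of information a country-specific story would have to supply.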

Take an analogy. What determines whether additional debt is problematic in case of an individual? This will certainly depend on his or her existing level of indebtedness, whether collateral is available and of what quality it is, what his or her employment situation is currently and likely to be in the future, what the money will be spent on and so on. Since these factors differ from person to person, it would be silly to use a rule such as ‘Do not incur a debt greater than x times your annual income’. But this is exactly the kind of recommendation some economists and policy analysts seem to have taken from the Reinhart-Rogoff research.

Countries (or their governments) do not secure public debt with collateral. Their ability to raise money in capital markets depends mainly on their potential to tax their citizens in the future. And this potential in turn depends on factors such as population growth, median age, private debt and so on. (Would you rather loan money to a 25-year-old or a 65-year-old? To a double-income-no-kids household or to one that has a growing number of dependents, many of whom have either already left the labour market or may never enter it? To a household that is already much indebted or one that is not? Etc.)

With respect to these factors there are enormous differences among the countries and periods Reinhart-Rogoff pool, differences that should matter for the analysis. For instance, population growth in the U.S. is among the highest among industrialised countries, mainly due to high immigration. Immigrants are mostly young and keen to work. By contrast, Japan, Germany and most Eastern European countries have negative population growth (Slovenia, a likely candidate for EFSF assistance, shrinks by 0.2% per year, for instance). The median Japanese is 44.6, the German 43.7, the Greek 42.2. U.S. Americans are a lot younger at 36.9.

Differences over time matter too. In 1945 the U.S. public debt to GDP ratio was 113%, a lot higher than even today. But in the years after, public debt was reduced dramatically (to a post WWII low of 24.6% in 1974) without negative effects on growth, which remained at a solid 3% in that period. Private sector debt, however, was only 43% of GDP in 1945. That figure lies at a stunning 250% today. Thus, a decrease in public spending will be much harder for the private sector to absorb. On the contrary, the private sector will want to reduce rather than increase its debt. Moreover, in the period 1945-1974 the U.S. population became younger on average whereas it is ageing today. (See here for these points.)

Without telling a country-specific and historically specific story about the financial health of a nation, arguments to the effect that austerity is ‘good’ or ‘bad’ should be unconvincing, quite independently of whether Reinhart and Rogoff’s figures were fudged.

Causation and the Financial Crisis

Last Wednesday’s Durham Castle Lecture was given by Martin Wolf, Associate Editor and Chief Economics Commentator at the Financial Times. The lecture was wonderfully engaging and impressive in the amount of detail, if somewhat depressing in its implications. Summarised in a slogan, Wolf argued that ‘This time is different’. (I’m alluding to Carmen Reinhart and Kenneth Rogoff’s book which, of course, claims just the opposite.) The crisis has changed the global economic system, fundamentally and in un- or rarely precedented ways. Some examples he gave were that economic power permanently shifted from the U.S./Europe to Asia, that European and U.S. government debt increased at rates previously unseen except during the great wars and the U.S. civil war, and that Europe (no doubt under the influence of the Bundesbank) had, prior to the crisis, never experienced a sustained period of zero-interest rates.

From a philosophy of economics point of view a fascinating aspect of the lecture was that Wolf did not tell us which among the several factors he discussed were causes and which were effects. And yet, more than half of the factors he focused on are well-known possible causes of the crisis – the credit crunch, the savings glut, zero interest rates, shift of global economic power – and the sovereign debt crisis in Europe is most certainly among its effects. So why didn’t he talk about causes and effects? Is one of Europe’s foremost economics journalists still in the grip of logical positivism? Does he reject the metaphysical categories of ‘cause’ and ‘effect’? Does he, like Milton Friedman, use the notions only in scare quotes to draw attention to their metaphorical status?

I don’t know whether Wolf is in the grip of logical positivism, and frankly, it’s not quite clear to me what that might mean. So let me try a different explanation. Metaphysical or not, ‘cause’ and ‘effect’ are not the most useful analytical categories for an economist. Economists are often well advised to avoid them. Here are the reasons why.

How do we find out about the causes of a singular event such as the financial crisis? We perform a thought experiment in which a factor of interest (the savings glut, zero interest rates, failed regulation, what have you) is absent or modified and then ask whether the outcome would still have occurred or would have been modified. The idea goes back at least to Max Weber’s seminal ‘Objective Possibility and Adequate Causation in Historical Explanation’, and it has made it into the law as the so-called ‘but-for’ test for causation. Would the harm have occurred but for the defendant’s action (or negligence)? If not, the action (or negligence) was a cause-in-fact of the harm and the defendant is liable.

It is well known that the but-for test is flawed. Most obviously, an action, negligence or, generally, a ‘factor’, can cause an outcome without the outcome depending on the factor. Two campers, A and B, negligently leave their campfires unattended, and a forest fire ensues. As it happened, it was camper A’s action that led to the wildfire. But if it hadn’t been for his action, the wildfire would have occurred anyway, perhaps a moment later or in a slightly altered manner. Nevertheless, A’s negligence caused the fire.
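The campers case can be put as a toy counterfactual check. The sketch below is a deliberately crude model of my own, not a statement of any legal doctrine: the forest burns if at least one campfire is left unattended, and timing (which fire actually spread first) is ignored. The function and variable names are made up for the illustration.

```python
def forest_fire(fire_a_unattended, fire_b_unattended):
    """Toy model: the forest burns if at least one campfire is unattended."""
    return fire_a_unattended or fire_b_unattended


def passes_but_for(outcome, actual, factor):
    """True if the outcome occurs in the actual scenario but would not
    have occurred had `factor` been absent (everything else held fixed)."""
    counterfactual = dict(actual, **{factor: False})
    return outcome(**actual) and not outcome(**counterfactual)


both_negligent = {"fire_a_unattended": True, "fire_b_unattended": True}
only_a_negligent = {"fire_a_unattended": True, "fire_b_unattended": False}

# With both campers negligent, A's negligence fails the but-for test,
# although (in the story) it was A's fire that led to the wildfire.
print(passes_but_for(forest_fire, both_negligent, "fire_a_unattended"))    # False

# With only A negligent, the same action passes the test.
print(passes_but_for(forest_fire, only_a_negligent, "fire_a_unattended"))  # True
```

The same action by A comes out as a cause or not depending solely on what B did, which is the flaw the example is meant to expose.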

There is a more fundamental problem with the but-for test, however, or rather with the notion of ‘cause’ that underlies it. According to a longstanding philosophical tradition, to which Max Weber contributed and more recently John Mackie and David Lewis among many others, a cause is a factor that makes a difference to the outcome in the context in which the factor obtains. In a seminar on causation I teach at Durham a student aptly likened this conception of ‘cause’ to the proverbial ‘straw that broke the camel’s back’.

The last straw is, according to Weber, Mackie, Lewis and others, a or the ‘cause’ of the camel’s back being broken. But unlike the legal scholar whose goals include the attribution of responsibility in an individual case, the economist tends not to care much about last straws, which appear incidental from a systematic point of view. An economist would look at structural features that made the camel vulnerable: its physiology, the saddle and the material it was made of, the weight of the load it has already been carrying and the way it was distributed on the camel’s back, and so on.

The U.S. Treasury’s decision to let Lehman Brothers fail in September 2008 may have been a last straw in the run-up to the crisis. From an economist’s point of view the event is not of much interest, however. Tons of similar events could have triggered the crisis at that point in time. Perhaps this crisis wouldn’t have occurred but for Lehman Brothers’ failure. But it is mainly the structural features of the situation that allowed the Treasury’s decision to make a big difference which arouse the economist’s curiosity, not so much the accidental features that describe how it actually happened.

Here, then, is an important distinction: that between triggers and other incidental events on the one hand and underlying, structural factors on the other. This distinction is obscured by the use of the language of ‘cause’ and ‘effect’ because both triggers and other incidental events in the causal history of the outcome as well as genuinely structural factors are all causes. But they are different kinds of causes and, arguably, it is sometimes more important to know whether or not a factor is of a certain kind (causal or not) than whether or not it is a cause.

Suppose for the sake of the argument (if nothing else) that the financial crisis would have happened with or without the failure of Lehman Brothers, but without the failure it would have happened some time later and perhaps in a slightly different manner. In the Lewis tradition, which near-dominated philosophical thinking in the decades after the publication of his 1973 article ‘Causation’, events that move outcomes forward in time such as the Lehman failure are called ‘hasteners’.

Are hasteners causes? That depends. Given that, as Keynes noted, in the long run we’re all dead, the worst even the most evil of slayers can do to you is hasten your demise. But surely murderers cause deaths just as much as doctors (and seat belts) save lives. Other hasteners are not causes. I had two large cups of coffee while I was writing this post which, I’m sure, hastened its publication. We would not, however, answer the question, ‘What caused you to write/publish this blog?’ with ‘Two cups of coffee’.

Some causes bring about an event (in the sense that the event would not have happened if it hadn’t been for the cause), others hasten or delay an event, yet others modify the manner of the event without changing its timing. To give an example of a ‘modifier’, suppose that Europe’s current sovereign debt crisis would have happened with or without the financial crisis that originated in the U.S., and at a similar point in time, but it would have been less severe. We might then say that the financial crisis modified or, more specifically, that it aggravated the sovereign debt crisis.
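The typology can be written down as a comparison between what actually happened and what would have happened without the factor. This is only a rough sketch: the `Outcome` record, the `classify` function, and all dates and severity scores are hypothetical devices for the illustration, and the counterfactual outcomes are simply stipulated, as in the suppositions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    time: Optional[float]   # when the event happens; None if it never does
    severity: float = 0.0   # stylised measure of the manner of the event


def classify(actual: Outcome, without_factor: Outcome) -> str:
    """Label a factor by comparing the actual outcome with the
    (stipulated) outcome in the factor's absence. Timing is checked
    before manner, so a factor that does both is labelled by timing."""
    if actual.time is None:
        return "event did not occur"
    if without_factor.time is None:
        return "brings about"
    if actual.time < without_factor.time:
        return "hastener"
    if actual.time > without_factor.time:
        return "delayer"
    if actual.severity != without_factor.severity:
        return "modifier"
    return "no difference"


# Supposition: without the Lehman failure the crisis happens, but later.
lehman = classify(Outcome(time=2008.7, severity=1.0),
                  Outcome(time=2009.5, severity=1.0))

# Supposition: the sovereign debt crisis happens at the same time
# with or without the financial crisis, but less severely without it.
financial_crisis = classify(Outcome(time=2010.0, severity=1.0),
                            Outcome(time=2010.0, severity=0.5))

print(lehman, financial_crisis)  # hastener modifier
```

All the philosophical work, of course, lies in justifying the second argument to `classify`; the code only makes explicit that each label answers a different counterfactual question.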

These distinctions too are obscured by the (unqualified) use of the language of ‘cause’ and ‘effect’. There are many more important distinctions I cannot discuss in detail here: some factors can be manipulated by us, others cannot; some factors that made a difference to the outcome constitute a violation of a norm, others don’t; some factors are of a kind that is likely to recur on future occasions, others aren’t. And so on.

Two lessons emerge. First, economists often use causal language without using the language of ‘cause’ and ‘effect’. They may use the language of types of causes such as triggers, precipitators, structural factors, hasteners, delayers, modifiers. Or they may use more concrete causatives such as aggravate, squeeze, dry up, loosen, swamp. Second, economists are often ill-advised to use the language of ‘cause’ and ‘effect’. ‘Cause’ is too coarse a term to be useful to them, especially in the analysis of causally complex events such as the financial crisis.

Murder Ink (or How to Write a Philosophy Essay)

Writing a philosophy essay is much like writing a murder mystery. In preparing your essay, therefore, think like a detective. What’s the crime? Who dunnit? Where’s the evidence? Does it suffice for a conviction?

Like a crime investigation, a philosophical query begins with a problem. This should be obvious, and in a crime investigation it is obvious. But in philosophy it is often overlooked. No dead body, no crime. No philosophical problem, no philosophy essay. I frequently encounter papers or student essays that discuss the virtues and vices of ideas or methods or claims abstracted from the context of a philosophical problem. Don’t. It’s like talking about the perpetrator’s character and vita when there hasn’t been a crime.

When you have your problem – your crime – begin asking who might have done it. Draw up a list of suspects. That is, make an inventory of possible solutions to your problem. In some cases, this step is simplified to a great extent. If you’re writing a term paper, for instance, your teacher might ask you to discuss a specific answer to some problem. Do epistemic values help with underdetermination? Does comparative process tracing help with extrapolation? Is causation regularity? The list of suspects is down to one and you can proceed to evidence gathering. That’s like boring crime cases such as Pistorius’. A single suspect. But then prove that he did not really think that an intruder had entered his home to take a wee. No small feat.

In other cases, the structure of the problem suggests potential solutions. One way of putting the ‘problem of evil’ is in the form of a paradox:

God is good and powerful (and hence would/could prevent evil).

God exists.

And yet, there’s evil.

So here are three solutions. God isn’t so good after all. God is dead. People mistakenly think there’s evil – cancer, genocide and Jim Carrey only make them stronger. If you don’t like any of these, challenge classical logic. A fourth solution.

In many cases it won’t be as easy as this because the list of suspects is open-ended. There certainly isn’t a set catalogue of responses to the problem of induction or what causation is or the mind-body problem. But there are standard solutions. The standard solutions always go on your list. Perhaps it’s worth trying and coming up with something new. If you decide not to consider a standard solution, have a good reason for not doing so. Prevent your reader from thinking you don’t know better.

Once you enter evidence gathering, note that there are two kinds of evidence: direct and indirect. Both are indispensable. Direct evidence addresses the question: ‘What would we expect to be the case if the hypothesis were true?’ Murderers leave traces on murder weapons, and murder weapons leave traces on murderers. Perpetrators leave footprints and DNA. Jealousy, a potential motive, will issue in a range of behaviours, not just the murder, before and after the crime. The same goes for philosophical theses. What would we expect to be the case if the Irenaean response to the problem of evil (which says that evil is necessary for and promotes spiritual growth) were true? For starters, we’d expect not to find cases of spiritualness in the absence of evil (‘Before the fall, were Adam and Eve spiritual?’). We’d expect evil to hit especially spiritualness-free zones. We’d expect some people to grow spiritual after a blow of fate. And so on. Do you find these things? Good for you.

But the job isn’t done until the indirect evidence is in. Indirect evidence concerns alternative explanations of the direct evidence. Perhaps there are fingerprints on the murder weapon. Well, perhaps that’s not so surprising if it’s the suspect’s kitchen knife. Perhaps the suspect did have rust particles under his fingernails (the victim’s head had been smashed with a rusty iron rod). But perhaps they stem from climbing up an iron ladder, which, as luck would have it, was rusty, too.

Indirect evidence serves to rule out such alternative accounts for the direct evidence. Perhaps we know that the maid washes all the dishes and cutlery including the kitchen knives on Tuesday mornings and the stabbing occurred round about that time, a time at which the suspect should have been at work. His prints therefore couldn’t have been on the knife except if he came back (to do what?). Perhaps we can ‘fingerprint’ the rust from the iron rod. True, the ladder was just as rusty, but the forensics made sure that the particles we found on the suspect could unambiguously be traced back to the rod.

The leading detective in the Pistorius case did collect direct evidence, but he forgot about the indirect evidence. And that in the face of an obvious (albeit majestically implausible) alternative explanation of the events: Pistorius mistook his wife (girlfriend) for a hat (intruder having a wee on the loo). Quite rightly they took the case away from Botha (if for a different reason). Ignoring indirect evidence is an unforgivable sin for a detective.

Just as it is an unforgivable sin for philosophy writers. Suppose you’ve found some direct evidence for your hypothesis. Some people, indeed, become spiritual after a stroke of fate. What might be reasons for this other than God’s intervention? If there are any, collect indirect evidence to rule them out. Otherwise your philosophical investigation won’t succeed, your suspect will run free.

In an investigation, philosophical or criminal, the three stages ‘make an inventory of potential solutions/suspects’, ‘collect direct evidence’, ‘rule out alternative explanations by means of indirect evidence’ are interrelated. You start with a single suspect. But the suspect has an alibi. You think harder and come up with suspects two and three. Interrogating them brings to light something that casts doubt on the first suspect’s alibi. But perhaps they lied because they’d profit from doing so?

When writing a philosophy essay, considerations such as these happen before you start writing the actual piece. Sketch your case in an outline. Write down the list of suspects, the direct evidence for each, what else might account for it, and the indirect evidence to rule out the alternatives. Nail down your case. A more experienced writer can do that in the head, but it has to be done.

But this is not the way you write your essay. The essay that makes it onto paper is more like the final report the detective presents to the prosecution. He weaves the pieces of evidence together into a convincing narrative. Good narratives usually focus on a single suspect. Defend your choice of suspect, of course, but immediately move on to presenting the evidence. Focus on those pieces of the evidence that are most important to make the overall narrative compelling. Your case should not leave a reasonable doubt. Make sure that there is no other plausible narrative that is equally consistent with the evidence. But most important of all, tell a good story.

This post is dedicated to Murder Ink, a bookstore on Broadway, New York City, once my favourite bookstore in the whole wide world. Sadly, it closed in 2006.

John Broome, Climate Change and Babies

Last Thursday (Feb 7) John Broome gave a talk to the Durham Philosophical Society entitled, ‘The Private Morality of Climate Change’. His thesis was that everyone living in Western societies has a ‘duty of justice’ to leave a zero carbon footprint. His reasoning was essentially the following. The average person living in a developed country such as the UK adds a large amount of carbon dioxide to the atmosphere in the course of his or her life. The figure for the UK was about 8.5 metric tonnes per capita in 2008, see here. With a life expectancy of 80.5 years this comes to nearly 700 tonnes. The emissions cause considerable harm to individuals living in the future. Broome believes that the emissions ‘caused’ by an individual living today add half a billionth degree to future temperature, and rising temperatures have known adverse effects.
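For the record, the ‘nearly 700 tonnes’ is just the product of the two figures quoted above, on the simplifying assumption (mine, for this back-of-the-envelope check) that per capita emissions stay at the 2008 level over a whole lifetime:

```python
ANNUAL_TONNES_PER_CAPITA = 8.5  # UK, 2008, as quoted in the text
LIFE_EXPECTANCY_YEARS = 80.5    # as quoted in the text

# Assumes constant annual emissions over the entire lifetime.
lifetime_tonnes = ANNUAL_TONNES_PER_CAPITA * LIFE_EXPECTANCY_YEARS

print(lifetime_tonnes)  # 684.25
```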

He argues, however, that we have a duty not to harm others, except in rare situations such as self-defence. Carbon emissions do not constitute justified harm. Therefore, we ought to prevent them. Since it is impossible for individuals not to cause some emissions – we cannot but eat food that has a positive carbon footprint, for instance – Broome proposes to invest in ‘offsetting strategies’. We could plant trees. But that has a variety of disadvantages, so he suggests buying carbon offsets. Carbon offsets are reductions in emissions of greenhouse gases made in order to compensate for emissions made elsewhere. These can be purchased, for any number of tonnes of emissions the customer wishes, from companies engaging in emission-reducing projects, for instance in the third world. So everyone living in the UK should buy such offsets for 8.5 tonnes per year or 700 tonnes for life.

Broome’s argument, and I hope the account just given is fair, could be criticised from a variety of angles. One might wonder, for instance, whether an obligation to buy carbon offsets is universalisable or whether climate calculations with a precision of half a billionth degree centigrade are meaningful. But let’s suppose Broome is right in saying that we have a duty not to harm future generations by greenhouse gas emissions. What Broome overlooks is that neither the existence of future generations nor reproductive behaviour should be taken as given because they are under our control. That means that we have to take them into our calculations.

It is quite obvious that the largest carbon footprint an individual could cause is by having children. Carbon emissions per capita have declined in many Western countries in recent years but it would be foolish to assume they will approach zero in the next generation or the generation after that. Moreover, my children will probably have children of their own and they will have children, so having a child causes carbon emissions several times that of an individual living today. Here, then, is an alternative offsetting strategy: have fewer children.
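How many times over? That depends entirely on assumptions about fertility and about how fast per capita emissions fall. The sketch below is a toy calculation, not a forecast: the function name, the fertility figure, the 20 percent decline per generation and the three-generation horizon are all hypothetical choices of mine, and the lifetime figure is simply 8.5 tonnes times 80.5 years.

```python
def lineage_emissions(lifetime_tonnes, children_per_person,
                      decline_per_generation, generations):
    """Total lifetime emissions of one child and that child's descendants.

    Hypothetical model: each person is attributed `children_per_person`
    children, and lifetime emissions per person shrink by the fraction
    `decline_per_generation` from one generation to the next.
    """
    total = 0.0
    cohort_size = 1.0             # start with the one additional child
    per_person = lifetime_tonnes
    for _ in range(generations):
        total += cohort_size * per_person
        cohort_size *= children_per_person
        per_person *= 1.0 - decline_per_generation
    return total


LIFETIME_TONNES = 8.5 * 80.5  # figures quoted in the post

# Assumed: one child attributed per person, emissions falling by 20%
# per generation, and a horizon of three generations.
total = lineage_emissions(LIFETIME_TONNES, 1.0, 0.2, 3)
ratio = total / LIFETIME_TONNES

print(round(ratio, 2))  # 2.44
```

Under these made-up parameters the child’s lineage emits about two and a half times what one individual emits; a longer horizon or a slower decline pushes the multiple higher.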

A fringe benefit of that strategy is that with fewer people in the world less harm can be done. The strategy is two-pronged: fewer expected emissions and fewer individuals to suffer from them. Best of all worlds.

Perhaps this is all a little absurd. But so long as no-one has given a convincing argument to the effect that we have a duty to make babies, which would, perhaps, be a little absurd in its own right, I don’t see why reproductive behaviour shouldn’t be a variable in our climate calculations.


P.S. I asked Broome about this in the discussion period. He said, essentially, that it wasn’t that easy because even though we cause future emissions by having children, our responsibility would be mediated by another individual who will be (co-?) responsible for his or her own emissions. But he agreed that these issues couldn’t be treated apart from population ethics (which is a can of worms anyway).