
Give Later

One of the more important things I’ve changed my mind about recently is the best cause to donate to. I now put the most credence on the possibility that the best option is donating to a fund that invests the money and disburses it strategically in the future. I will refer to this as “giving later”, though I actually support giving now to a donor-advised fund set up to disburse in the future, both because donating now can encourage others to donate and because of the risk that even someone who intends to donate later will at some point change their mind.

There are several reasons why I prefer a fund that disburses in the future. First, I believe people currently discount the future too much (see hyperbolic discounting, climate change). If people discount the future, the rate of return on investments will always be higher than the growth rate (otherwise people would not be willing to invest). In economics, the Ramsey equation is often used to determine how much a social planner should discount future consumption. It is specified by r = ηg + δ, where r is the real rate of return on investment, η is the extent to which marginal utility decreases with consumption, g is the growth rate, and δ represents pure time preference. Unless one personally places a particularly high value on δ, it makes sense to invest today and spend later, taking advantage of the gap between the real rate of return on investment (~7%) and the growth rate (~3-4%).

How should one set δ? This is a huge open question. Like most effective altruists, I do not believe one should treat people today any differently from people tomorrow. But one might still wish to place a non-zero value on δ because of the risk that people will simply not exist in the future – that nuclear war or other disasters will wipe them out. Economists tend to respect people’s pure time preferences and so end up with rather higher values than effective altruists: the Stern report famously set δ=0.1, while Nordhaus prefers δ=3. The current Trump administration set δ as high as 7, which can be used to justify doing nothing about climate change (see also this nice figure). With a modest δ, the Ramsey equation says it makes sense to invest now and give later.
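To make the gap concrete, here is a toy calculation with purely illustrative numbers of my own (a $1,000 gift, a 7% real return, a 3.5% growth rate, a 30-year horizon), comparing a gift that compounds at the rate of return against one that merely tracks the growth rate:

```python
# Toy comparison: invest a gift at the rate of return vs. letting it
# track the growth rate. All numbers are illustrative assumptions.

def future_value(amount: float, rate: float, years: int) -> float:
    """Compound `amount` at `rate` for `years` years."""
    return amount * (1 + rate) ** years

donation = 1_000.0
r = 0.07   # real rate of return on investment (~7%)
g = 0.035  # growth rate (~3-4%)
years = 30

invested = future_value(donation, r, years)  # gift invested, given later
baseline = future_value(donation, g, years)  # gift scaled by growth alone

print(f"Invested gift after {years} years: ${invested:,.0f}")
print(f"Scaled by growth alone:           ${baseline:,.0f}")
print(f"Advantage of giving later:        {invested / baseline:.1f}x")
```

Nothing here depends on the Ramsey machinery beyond the r–g gap itself; the sketch simply shows what that gap compounds to over a few decades.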

A second reason I prefer a fund that disburses in the future is that our knowledge today is very limited and is increasing. I am concerned about the problem that research results do not generalize all that well, but with respect to economic development I am optimistic that the situation can improve. With respect to technological change, which could bring huge benefits or risks, we know even less about the problems future generations will face and may be able to understand them better in the future. It seems unlikely to me that this exact moment, out of all periods from now into the future, is the one at which we have the best opportunity to do good. We may not recognize the best moment when it comes, but that just pushes the argument back a step: I also think it unlikely that we are at the moment, out of the whole foreseeable future, with the best combination of knowledge and opportunity to do good.

These are not novel arguments; versions of them have been made in several other blog posts. Two criticisms are commonly raised: that donations today can help to improve the long-run growth rate, and that it is not feasible to design and maintain a fund that disburses later without value drift. There are sadly few long-run follow-ups of development interventions, but it seems prima facie unlikely that interventions will have a long-run effect on the growth rate, given that the growth rate is a function of many, many things. I expect most effects to taper off over time, but acknowledge that further research in this area is needed. With regard to the difficulty of building a persistent and safe institution, I agree that this is challenging, but not altogether impossible, and I know several people working on this right now.

There are several reasons to be optimistic. First, this institution could take into consideration the risk of e.g. nuclear war or value drift in setting its disbursement scheme, disbursing more aggressively as the risks go up (in the extreme case, disbursing everything right away).

Second, it is easy to think of a “lower-bound” version of this that would not be at much risk of value drift. For example, suppose a fund existed that disbursed the minimum amount possible every year (U.S. charities, for example, are required to disburse 5% per year) and then disbursed the rest in year 10. In the simplest possible version, think of a cash transfer charity like GiveDirectly, which gives out cash to people in developing countries via mobile money transfers. One could set up the institution to automatically make these payments over time without any deviations allowed (say, through a smart contract). Unless mobile money is no longer in use 10 years from now, this option would seem to strictly dominate giving cash transfers today.

What about other types of transfers, like to some of GiveWell’s top-rated charities, the Against Malaria Foundation or Deworm the World? It is possible that interventions are particularly cheap now and will be more expensive (for the same benefit) in the future. For example, most of the gains in life expectancy have come from improvements in sanitation and basic healthcare reducing under-5 mortality; it is a lot harder to increase life expectancy from 79 to 80. There are some arguments that can be made against this. I won’t get into them too much, though I will note that under some conditions this situation could be addressed by letting the investments compound for longer before using them. In any case, my assumption is that if the calculus really works out this way, we are back in the world in which the organization disburses everything right away.
Further, if one considers the farther future and cares about future potential lives, one may wish to place more emphasis on avoiding existential or extinction risks, and it is not clear that we are at a particularly good time in history to do that.
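A minimal sketch of the “lower-bound” fund described above, assuming a hypothetical 7% annual return and a $1m principal (the 5% figure is the minimum-disbursement requirement mentioned earlier; the 10-year horizon follows the example):

```python
# Sketch of the "lower-bound" fund: disburse the 5% legal minimum each
# year while the rest compounds, then pay out everything in year 10.
# The 7% return and $1m principal are illustrative assumptions.

def simulate_fund(principal: float, annual_return: float = 0.07,
                  min_payout: float = 0.05, horizon: int = 10):
    """Return (yearly minimum disbursements, final lump sum)."""
    balance = principal
    disbursements = []
    for _ in range(horizon - 1):
        balance *= 1 + annual_return   # investments compound
        payout = balance * min_payout  # required minimum disbursement
        balance -= payout
        disbursements.append(payout)
    balance *= 1 + annual_return       # final year's growth
    return disbursements, balance      # everything goes out in year 10

payouts, lump_sum = simulate_fund(1_000_000)
total_given = sum(payouts) + lump_sum
print(f"Minimum payouts, years 1-9: ${sum(payouts):,.0f}")
print(f"Lump sum in year 10:        ${lump_sum:,.0f}")
print(f"Total disbursed:            ${total_given:,.0f}")
```

On these assumptions the fund gives away substantially more in nominal terms than the original principal, which is the whole point of the comparison with giving everything away in year one.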

I think it appeals psychologically to many people – myself included – to think that we are living at a particularly important time. However, I recognize that people have thought this throughout history. As more time has passed, I have become increasingly confident that my gut antipathy to the idea that it’s better to “give later” is just a cognitive bias.

Workshop Link

For those wanting to attend the Forecasts of Social Science Results workshop on Dec. 11 remotely, please use this link to join via Zoom. (If you haven’t used Zoom before, you will be asked to install software on your computer or mobile device before joining.)

An updated agenda is posted here.

Workshop Program and Research Assistant Position

Here is the preliminary program for the workshop on forecasting research results mentioned earlier.

I am also currently seeking a research assistant for work relating to this theme. If you know someone who might be interested, please pass on this job advert. Deadline Oct. 26.

An Altruistic Proposal

I recently proposed to Gabriel Carroll. Some friends helped out, and here is the video! (Song starts at 2:51.)

Apparently, we each separately had the same idea that rather than a ring we would make a donation to charity. I tried to beat him to it, and the “ring” is a cheque made out to GiveDirectly, folded into an origami ring.

This kind of approach isn’t for everyone, but for us it’s more meaningful than a ring would be.

Thanks to everyone who helped out!

Variance neglect: Testing a novel bias with policymakers and others

Aidan Coville and I have a new working paper in which we look at how policymakers, development practitioners (such as international organization staff), and researchers update in response to evidence from impact evaluations. In particular, we test whether they are subject to two biases: updating more on “good news” than “bad news” and not taking the variance of the estimates adequately into consideration (here, we mean confidence intervals, but you could imagine testing for neglect of other kinds of variance, such as inter-study variance).

The first bias is sometimes called overconfidence (not to be confused with the kind of overconfidence in which one is simply too certain of one’s beliefs), and in our experiment we are able to distinguish it from confirmation bias. The second bias is novel to my knowledge, and we call it “variance neglect”. It is related to “extension neglect”, in which people may, for example, neglect sample size when considering how much to trust a new finding; I think it is distinct, however, since sample size is not all that matters when considering variance. It is also reminiscent of Kahneman and Tversky’s prospect theory, in which people overweight low-probability events and underweight high-probability events. But overweighting low-probability events and underweighting high-probability events should increase the dispersion of one’s beliefs, and what we find in supplemental tests is instead more consistent with a fundamental misunderstanding of confidence intervals. In those tests, we give respondents one interval and ask them to provide a different one (e.g. giving a 95% interval and asking for the interquartile range). For small ranges, they report overly dispersed distributions; for large ranges, overly narrow ones. Variance neglect is perhaps more closely related to the hot hand fallacy and the gambler’s fallacy, which also involve incorrect treatment or perception of variance. I would distinguish it from these too, both because variance neglect does not require as restrictive a functional form and because those two biases conceptually have to do with seeing repeated streaks, while I am more concerned with seeing noisy data once – especially since I know from past work how rare it is to have multiple studies on the same intervention covering the same outcome. (Instead, studies often “run away from each other” as authors seek to make them unique.)
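As an illustration of the consistency those supplemental tests check, here is what a given 95% interval implies for the interquartile range when beliefs are normally distributed (my own illustrative numbers, not from the paper):

```python
# Convert a 95% interval into the interquartile range implied by a
# normal distribution. A respondent with coherent normal beliefs
# should report something close to this.
from statistics import NormalDist

def iqr_from_ci95(lo: float, hi: float) -> tuple[float, float]:
    """Map a 95% interval to the implied 25th-75th percentile range."""
    mean = (lo + hi) / 2
    z95 = NormalDist().inv_cdf(0.975)  # ~1.96
    z75 = NormalDist().inv_cdf(0.75)   # ~0.67
    sigma = (hi - lo) / (2 * z95)
    return mean - z75 * sigma, mean + z75 * sigma

q25, q75 = iqr_from_ci95(0.0, 10.0)
print(f"IQR implied by a 95% interval of [0, 10]: [{q25:.2f}, {q75:.2f}]")
```

Under normality the implied IQR is roughly a third as wide as the 95% interval, so respondents who report something much wider for small ranges, or much narrower for large ones, are being internally inconsistent.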

We show how these biases can be easily fit into a quasi-Bayesian model. Testing for them is also straightforward. First, we elicit respondents’ priors. Then we randomly vary whether we show them high or low point estimates (relative to their priors) and large or small confidence intervals. Finally, we elicit their posteriors. That allows us to cleanly estimate whether they update more on “good news” than “bad news”. Testing for variance neglect is a little more complicated: someone suffering from variance neglect need not update equally on large and small confidence intervals, nor perversely update more on large ones. All we need to show is that their updates differ less between small and large confidence intervals than they would if they were Bayesian. Since we know their priors and what data we showed them, we can determine how a Bayesian would have updated and compare their responses.
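With normal priors and a normally distributed estimate, the Bayesian benchmark is a precision-weighted average, which makes it easy to see how much smaller the update should be when the confidence interval is wide. A minimal version, with hypothetical numbers:

```python
# Normal-normal Bayesian benchmark: the posterior mean weights the
# prior and the new estimate by their precisions (1/variance), so a
# wider confidence interval should mean a smaller update. Variance
# neglect means updates respond too little to the interval width.

def bayesian_posterior_mean(prior_mean: float, prior_sd: float,
                            estimate: float, estimate_sd: float) -> float:
    """Posterior mean under a normal prior and a normal signal."""
    w_prior = 1 / prior_sd ** 2     # precision of the prior
    w_data = 1 / estimate_sd ** 2   # precision of the new estimate
    return (w_prior * prior_mean + w_data * estimate) / (w_prior + w_data)

prior_mean, prior_sd = 0.0, 1.0
estimate = 2.0  # "good news" relative to the prior

tight = bayesian_posterior_mean(prior_mean, prior_sd, estimate, 0.5)
wide = bayesian_posterior_mean(prior_mean, prior_sd, estimate, 2.0)
print(f"Bayesian posterior mean, tight CI: {tight:.2f}")
print(f"Bayesian posterior mean, wide CI:  {wide:.2f}")
```

A respondent whose updates look the same in both cases is neglecting the variance of the estimate.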

The setting is also pretty cool. We can’t bring policymakers to a lab, but we can bring the lab to policymakers. We leverage a series of World Bank and Inter-American Development Bank impact evaluation workshops in various countries. These workshops tend to be one week long, and over the course of the week policymakers working on a particular program learn about impact evaluation and try to design one for their program with the help of a researcher. Several different types of participants attend these meetings: “policymakers”, or officials working in developing country government agencies (both those in charge of the particular programs and monitoring and evaluation specialists); “practitioners”, who are mostly international organization staff (like World Bank operational staff) and NGO partners; and researchers. Apart from the workshops, we also conducted some surveys at the headquarters of the World Bank and the Inter-American Development Bank. Finally, we ran the experiment on Amazon’s Mechanical Turk (where workers take surveys for cash) to obtain another comparison group.

Quick summary of results: we found significant evidence of overconfidence and variance neglect, and no respondent type (policymakers, practitioners, researchers, or MTurk workers) updated significantly better or worse than any other. We also disaggregated results by gender; a few results indicated that women suffered more from these biases, but this is difficult to interpret given that (a) other results did not indicate it and (b) the distribution of women in the sample varied by respondent type. Finally, we found that respondents updated more on more granular data, even when the data should theoretically have provided the same informational content (e.g. providing more or fewer quantiles of the same distribution). This suggests that when you have bad news, you should come bearing a lot of data.

The working paper has more details. There is still some work to do (e.g. non-parametric tests, robustness checks), but we are pretty happy with the initial results. We also ran several follow-up experiments on MTurk that try to parse the ultimate cause of the variance neglect we observe. We are running these experiments on a pre-specified larger sample and will add them to the next draft. Preliminary results suggest that misunderstanding confidence intervals (as opposed to inattentiveness or lack of trust in the experiment) is the crux of the issue.

This is not the only experiment that we have run in this space. We also ran a set of experiments on a similar sample that tried to determine how people weight different aspects of a study’s context or research design. For example, would a policymaker, practitioner or researcher prefer to see results from an RCT done in a different region or results from a quasi-experimental study done in their setting? This was a simple discrete choice experiment in which respondents were asked to repeatedly pick one of two studies with different attributes. Other attributes tested (in various permutations) include program implementer (government or NGO), sample size, point estimates, confidence intervals, and whether the program was recommended by a local expert. You would need far stronger assumptions than we would be comfortable making in order to say that anyone was making a “correct” or “incorrect” choice, but their preferences were interesting nonetheless. Tune in next time to find out how respondents answered.