Blog | Eva Vivalt

Recently I encountered the phrase “strong opinions, weakly held” — something advocated in the rationalist community. Some backstory for it is here. I am interested in considering the first part of the phrase and will ignore the “weakly held” portion, as I trust everyone agrees on the importance of being able to change their minds in the face of new evidence.

What could “strong opinions” mean? I see four possibilities:

Definition 1) Narrow priors (or posteriors, if you will, depending on which point in time you are considering)

Definition 2) Strongly stated opinions, in the sense of making a point forcefully

Definition 3) Strongly stated opinions, in the sense of making a point with precise language that accurately conveys one’s beliefs

Definition 4) Having an opinion at all, even if one’s beliefs entertain a wide range of possible outcomes (e.g. a uniform distribution over the entire space)

I can see several possible arguments for or against “strong opinions” in the sense of each of those definitions. Nonetheless, it is wholly unclear to me which arguments are typically made, and under which definitions. If, at the bare minimum, one would like statements to be made clearly, in the sense of Definition 3, presumably there are better ways of putting that. Given the sheer number of things it could mean, the phrase is an ironic one. Perhaps it is better put as “clear opinions, weakly held”.

Banerjee, Chassang and Snowberg have an under-appreciated paper, “Decision-Theoretic Approaches to Experiment Design and External Validity”, that anyone who designs experiments should think about.

Some highlights:

1. Bayesians making policy decisions themselves do not randomize

Suppose you were a Bayesian and trying to maximize expected utility. There exists some set of ways to assign individuals to the treatment group that would maximize your expected utility. Randomizing could sometimes deviate from that set of ways (e.g. if you are unlucky enough to have imbalance between the treatment and control group along some observable characteristics). Therefore, randomizing would not be optimal. This is in the same spirit as Kasy (2013).
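To make the balance point concrete, here is a minimal sketch. It is not the paper's model: it assumes a single observed covariate, ten hypothetical units, and uses the absolute difference in covariate means as a stand-in for the experimenter's loss. The function names and simulated data are mine.

```python
import itertools
import random
import statistics


def imbalance(covariate, treated):
    """Absolute difference in covariate means between treated and control units."""
    treated = set(treated)
    t = [x for i, x in enumerate(covariate) if i in treated]
    c = [x for i, x in enumerate(covariate) if i not in treated]
    return abs(statistics.mean(t) - statistics.mean(c))


def best_assignment(covariate, n_treated):
    """Deterministic design: search every assignment and keep the most balanced one."""
    units = range(len(covariate))
    return min(
        itertools.combinations(units, n_treated),
        key=lambda s: imbalance(covariate, s),
    )


def random_assignment(covariate, n_treated, rng):
    """Randomized design: draw the treated units uniformly at random."""
    return tuple(rng.sample(range(len(covariate)), n_treated))


rng = random.Random(0)
covariate = [rng.gauss(0, 1) for _ in range(10)]  # hypothetical observed covariate

best = best_assignment(covariate, 5)
draws = [
    imbalance(covariate, random_assignment(covariate, 5, rng))
    for _ in range(1000)
]

print("imbalance, deterministic design:", imbalance(covariate, best))
print("imbalance, average random draw: ", statistics.mean(draws))
```

By construction, no random draw can beat the exhaustively chosen assignment on this criterion, and an unlucky draw can do much worse. Exhaustive search is only feasible here because the sample is tiny.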

2. Priors matter

Apart from sometimes failing to achieve balance, randomizing can also be suboptimal under some priors. They give the example of a superintendent who believes that a student’s background, poor or privileged, is the main determinant of educational outcomes, and that students at private schools do better because they tend to come from privileged backgrounds, but who is open to testing whether private schools help in and of themselves. The superintendent has the chance to enroll a single student in a private school. Clearly, they would not learn much by enrolling a privileged student in a private school; to learn the most, they should enroll a poor student.
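One way to put numbers on the superintendent example is to compute the expected information gain about the hypothesis “private schools help” from each possible enrollment. This is my own rough sketch, not the paper’s formalization: the binary success outcome, the 50% prior, and all of the success probabilities below are made-up assumptions chosen to match the story.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a yes/no outcome that occurs with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def expected_information_gain(prior_helps, p_success_if_helps, p_success_if_not):
    """Mutual information between 'private school helps' and the student's outcome."""
    p_success = (
        prior_helps * p_success_if_helps + (1 - prior_helps) * p_success_if_not
    )
    remaining = (
        prior_helps * binary_entropy(p_success_if_helps)
        + (1 - prior_helps) * binary_entropy(p_success_if_not)
    )
    return binary_entropy(p_success) - remaining


PRIOR_HELPS = 0.5  # assumed prior that private schools help in and of themselves

# Assumed beliefs: a privileged student very likely succeeds either way,
# while a poor student's outcome hinges on whether private schools help.
gain_privileged = expected_information_gain(PRIOR_HELPS, 0.95, 0.90)
gain_poor = expected_information_gain(PRIOR_HELPS, 0.60, 0.20)

print("expected bits learned, privileged student:", gain_privileged)
print("expected bits learned, poor student:      ", gain_poor)
```

Under these assumed beliefs the privileged student’s outcome is nearly the same whether or not private schools help, so observing it barely moves the superintendent’s posterior, whereas the poor student’s outcome is far more diagnostic.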

3. The optimal experimental design depends on how the decisions are made

The experimenter may not be making the policy decision themselves. Rather, they may be trying to convince others (or trying to convince some small part of themselves that is uncertain, in the Knightian sense). Thus, when designing the experiment they need to place some weight on how much they want to convince themselves, given their own priors, vs. how much they want to convince someone else (or themselves, under ambiguity), given the other person’s priors. This depends on how the decisions are being made, e.g., whether it is a group of people with quite varied priors making a decision.

4. Randomizing is optimal when faced with an adversarial audience (given a sufficiently large sample size and assuming a maximin objective)

Suppose you care about the worst-case scenario: a decision-maker whose priors, given the experimenter’s chosen design, give them a greater chance of picking the wrong policy than anyone with different priors.

In this situation, a randomized experiment is best (so long as the sample size is sufficiently large). Because it is not targeted towards people with any particular priors, it leaves less room for error: optimizing for some priors generally means making decisions worse for people with other priors. The sample-size qualification matters because, in small samples, randomization loses more power relative to the optimal deterministic experiment (again, think of covariate balance).
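The maximin logic can be seen in a deliberately stylized toy, which is my own construction rather than the paper’s setup. Assume two units, one treated and one control, and a skeptic who believes one of the units has a hidden advantage worth +1 on the outcome. The experimenter’s design is the probability of treating unit 0, and the loss is the absolute expected bias under the least favorable skeptic.

```python
def expected_bias(p_treat_unit0, favoured_unit):
    """Expected bias of treated-minus-control if `favoured_unit` has a hidden +1 advantage."""
    # If the favoured unit ends up treated the estimate is biased up by 1,
    # and if it ends up in control the estimate is biased down by 1.
    bias_if_unit0_treated = 1.0 if favoured_unit == 0 else -1.0
    return (
        p_treat_unit0 * bias_if_unit0_treated
        + (1 - p_treat_unit0) * (-bias_if_unit0_treated)
    )


def worst_case_bias(p_treat_unit0):
    """Largest absolute expected bias over the two priors a skeptic could hold."""
    return max(abs(expected_bias(p_treat_unit0, unit)) for unit in (0, 1))


# Candidate designs: probability of treating unit 0, from 0 to 1 in steps of 0.1.
designs = [i / 10 for i in range(11)]
maximin_design = min(designs, key=worst_case_bias)

print("worst case, always treat unit 0:", worst_case_bias(1.0))
print("maximin design (P[treat unit 0]):", maximin_design)
print("worst case under maximin design: ", worst_case_bias(maximin_design))
```

Any deterministic choice hands the skeptic a prior under which the result is fully confounded, while the coin flip leaves no prior under which the expected bias is nonzero. The toy ignores sample size and power, which is exactly where the “sufficiently large sample” qualification comes in.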

This paper is nice because, among other things, it helps explain why academics, who face skeptical audiences, randomize, while firms, which do not face an adversarial audience and merely wish to learn for the sake of their own decision-making, experiment in a smaller and more targeted way, especially when the costs per participant are high. A key assumption in the current framework is the maximin objective, which may not always be what we care about.

Eva Vivalt

Clear opinions, weakly held


Priors matter for optimal design of experiments