urban institute nonprofit social and economic policy research

Experimental Design

by Stephen H. Bell

What is it used to measure?

Impact evaluations, including those with experimental designs, measure the impact of government programs and policies or other "treatments" on, for example, people, families, neighborhoods, or firms. In most contexts, experimental designs provide the most reliable impact evaluations (though other approaches, including quasi-experimental methods, are available when experiments are not feasible). Experimental findings tell us whether, and by how much, an intervention affects the treated group. The magnitude of that effect then indicates how worthwhile the intervention is and, ultimately, whether its benefits justify its cost.

How does it work?

Measuring an intervention’s impact poses difficult challenges for evaluators. Not only must one collect data on outcomes under the intervention; one must also estimate what the outcomes would have been without it. Experimental designs do this by dividing subjects at random into two groups: one that participates and one that does not. The second group, the control group, shows what outcomes look like absent the program. It is the ideal comparison group, since it differs from the participant sample only by chance.
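As a rough illustration of the logic described above, the following Python sketch splits a list of subjects at random into two groups and estimates impact as the difference in mean outcomes between them. The function names (randomly_assign, estimate_impact), the even 50/50 split, and the dictionary of outcomes are all assumptions made for this example; they are not taken from any particular evaluation.

```python
import random
import statistics


def randomly_assign(subject_ids, seed=0):
    """Split subjects at random into a treatment group and a control group.

    Hypothetical helper: an even split is assumed here for simplicity.
    """
    rng = random.Random(seed)          # seeded so the assignment is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimate_impact(outcomes, treatment_ids, control_ids):
    """Estimated impact: mean outcome of the treatment group minus that of the control group."""
    treated_mean = statistics.mean([outcomes[i] for i in treatment_ids])
    control_mean = statistics.mean([outcomes[i] for i in control_ids])
    return treated_mean - control_mean


if __name__ == "__main__":
    ids = list(range(1000))
    treatment, control = randomly_assign(ids, seed=42)
    treated = set(treatment)

    # Made-up follow-up outcomes: a noisy baseline plus an assumed effect of 2.0
    # for subjects who received the intervention.
    rng = random.Random(7)
    outcomes = {i: rng.gauss(20.0, 5.0) + (2.0 if i in treated else 0.0) for i in ids}

    print("Estimated impact:", estimate_impact(outcomes, treatment, control))
```

Because the two groups differ only by chance, the control group's mean stands in for what the treatment group's mean would have been without the intervention, so the estimate should land near the assumed effect, with some chance error.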

The keys to successfully evaluating social policies using random assignment are

  • identifying enough potential subjects that chance differences between those assigned to the intervention and those excluded cancel out, leaving no meaningful difference other than the program’s impact; 
  • arranging, with government agencies and field staff, an "interception point" at which subjects are randomized when they first reach the threshold of policy or program involvement;
  • ensuring that no case assigned to the control group receives the intervention;
  • studying the program or policy in many different settings, all of which implement random assignment (and collect follow-up data) consistently, so that findings reflect the range and diversity of factors that characterize most national policies at the local level; and
  • conducting follow-up surveys and other data collection that reach the highest possible share of subjects, since residential mobility and other factors make it increasingly difficult to keep the initially matched groups intact, and hence fully comparable, over the several years often required to observe an intervention’s full impact.
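The first point in the list above, that a large enough sample lets chance differences between the groups cancel out, can be illustrated with a small simulation. The Python sketch below is an assumption-laden example rather than a procedure from any actual evaluation: it draws a made-up baseline characteristic for each subject, splits the subjects at random, and reports the average absolute gap in group means across many repetitions. The function name average_chance_gap and the standard normal baseline are hypothetical choices.

```python
import random


def average_chance_gap(n_subjects, n_replications=200, seed=0):
    """Average absolute difference in a baseline characteristic between
    randomly formed treatment and control groups (no intervention applied)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_replications):
        # Hypothetical baseline characteristic (for example, a standardized score).
        baseline = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        rng.shuffle(baseline)                      # random assignment
        half = n_subjects // 2
        treatment, control = baseline[:half], baseline[half:]
        gap = abs(sum(treatment) / len(treatment) - sum(control) / len(control))
        gaps.append(gap)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(n, "subjects: average chance gap =", round(average_chance_gap(n), 3))
```

Under these assumptions the average gap shrinks as the number of subjects grows, which is why small experiments can mistake a chance difference for a program effect.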

Research example

"Case-Managed Reentry and Employment: Lessons from the Opportunity to Succeed Program"