Evidence at USAID

Sarah Rose at the Center for Global Development recently published an excellent note on how to make USAID programming more evidence-based. As a former member of one of the groups mentioned in the article (the Evaluation and Impact Assessment group at the erstwhile Global Development Lab) and a long-time evaluator, this is a topic dear to my cold, data-driven heart. I realize that probably marks me as a member of a very small fraternity, but people really should care more about making donors more evidence-based! As Abhijit Banerjee recently pointed out, funding from donors like USAID may make up a small share of overall development financing in most countries but, in contrast to domestic financing, is quite flexible. Politicians typically have little fiscal space after taking into account funding mandated by previous legislation and naturally seek to use this money for quick wins like new roads or bridges. Donors, not subject to those constraints, could use their money for higher-impact projects like early child development or to figure out what programs are most cost-effective over the long term. Unfortunately, for reasons we’ll get to, USAID doesn’t often do this.

Back to Rose’s note. Rose argues that USAID has made significant progress in becoming more evidence-based over the past decade. She points out that USAID adopted a new evaluation policy in 2011 and created several new units within the agency focused on evaluation. She goes on to cite two internal reviews and an external GAO review which found that, overall, USAID has done a decent job in executing the vision outlined in the evaluation policy. The first internal review, for instance, found that most evaluations are high-quality, relevant, and used to inform programming.

Rose argues that despite these gains, there are still gaps in how USAID generates and uses evidence. To fill these gaps, Rose recommends that USAID:

  1. Nominate an administrator who will champion evidence-based programming
  2. Create a new evidence and evaluation unit which consolidates the various groups within USAID responsible for evaluation work
  3. Hire or train a cadre of impact evaluation specialists to oversee impact evaluations and “evidence brokers” to digest, synthesize, and communicate results from new studies
  4. Build evidence use and generation into program design and procurement. This would mean that where evidence is weak, programs should seek to generate new evidence. Where evidence is strong, program design should take into account the latest evidence
  5. Develop new methods for faster, less expensive impact evaluations

I agree with much in Rose’s article, but I think she pulls way too many punches, and I disagree with some of her recommendations.

The State of Evaluations at USAID

Rose’s assessment that USAID has become far more evidence-based in the past decade is a very charitable take. USAID may have superficially met many of the requirements of its own evaluation policy but, in my experience, these evaluations rarely serve any purpose other than to shuffle funding between DC and Maryland / Virginia. To give one concrete (though hopefully not representative) example: A friend of mine recently met with a USAID contractor to discuss potentially conducting an evaluation of one of their programs. As he was leaving the contractor’s office, the CEO pointed out that the CEO’s job, and the job of all the other employees he had met in the office, depended on my friend’s “independent” evaluation coming to the right conclusions. My friend didn’t take the consultancy but, if he had, it’s likely that his final evaluation report would have been deemed to be “high-quality, relevant, and useful” according to the first internal review cited above. To meet this bar, according to the review, the evaluation need only ensure that the executive summary “accurately reflects the most critical aspects of the report,” that the “basic characteristics of the program, project or activity are described” and that a dozen other equally superficial criteria are met. None of the criteria attempt to ensure that the evaluation is truly independent or that the recommendations are not influenced by program implementers.

Outright fraud of the type my friend encountered is (probably) rare but so are useful evaluations. Over the past decade or so, I have read hundreds of evaluations. (Yes, I should probably find better things to do with my time.) The best evaluations have shaped my thinking on what works and what doesn’t or provided helpful insight into what went right or wrong with a program. Excluding those evaluations commissioned by DIV, I can only recall a handful of evaluations that I found useful that were conducted by USAID or its partners. More worryingly, program managers often seem to ignore the latest evidence when designing programs. A recent SSIR article found that the big USAID contractors almost never scale up proven solutions from smaller non-profits even in cases where rigorous evidence exists.

As Rose points out, this isn’t really the fault of USAID staff who spend most of their time navigating the minutiae of federal regulations to get money out the door and then ensuring compliance with various reporting requirements once it is out. An additional challenge, not mentioned in Rose’s note, is that many in USAID senior leadership have a naïve understanding of what constitutes “evidence-based development.” Political appointees often come from business backgrounds, where measuring success (i.e. profits) is relatively easy, and have little experience in international development, where measuring impact is much, much harder. Faced with the task of prioritizing different programs, these political appointees often push staff to come up with numbers on program impact (“metrics” in business-speak) that simply don’t exist. The result is a set of absurdly ambitious claims about “number of lives saved” and program staff with an understandable aversion to any further measurement.

Ensuring Generation of More High-quality Evidence

How do we fix this situation? Many of Rose’s recommendations focus on increasing USAID’s capacity to conduct impact evaluations. Superficially, this makes sense – if USAID wants to be more evidence-based, making sure that it generates more high-quality evidence seems like a good first step. Yet, to a first approximation, USAID doesn’t really need to generate its own evidence. The type of programming USAID funds is very similar to the type of programming other donors fund and outside researchers already evaluate. Academics, who conduct the majority of impact evaluations, are incentivized to create interesting evidence, not policy-relevant evidence, but observers have probably made more of this difference than it deserves and, in any case, this problem is not unique to USAID.

Even if USAID has a unique evidence need, USAID staff probably aren’t the best people to lead an impact evaluation. Identifying opportunities for useful impact evaluations requires a deep understanding of the latest evidence in a sector and carrying out an impact evaluation requires a very specific set of skills. These tasks are best left to professors or outside organizations specializing in this work. (An added bonus of working with professors is that their time comes heavily subsidized.) This holds doubly true for efforts to develop new impact evaluation methodologies – these tasks are best left to much nimbler organizations.

This is not to say that USAID should not be involved in impact evaluations. USAID funds a lot of interesting, cutting-edge programs. Impact evaluations of these programs could potentially add a lot to the global evidence base. So while USAID doesn’t necessarily need to take the lead in evidence generation, it would be great if it could collaborate with outside researchers to ensure that the most innovative programs are evaluated for impact. Unfortunately, this is rarely the case. The decision of whether to subject a program to an impact evaluation typically falls to individual program managers. Thus, whether or not a program receives an impact evaluation is more dependent on an individual manager’s enthusiasm for impact evaluations in general than it is on the program’s suitability for an impact evaluation. In addition, since impact evaluations add administrative hassle to program implementation, only the most enthusiastic managers sign up for their programs to receive an impact evaluation.

I’m not sure what the best way to fix this is, but a good start would be to modify the rule for when impact evaluations are required. USAID’s current evaluation policy states that all innovative programs should be subject to an impact evaluation (unless it is not possible to do so). If, instead, bureau heads were given a target of being involved in at least 3 or so high-quality impact evaluations (where, for lack of a better metric, quality would be measured by where the results were published), bureaus would have an incentive to work with researchers to conduct useful impact evaluations rather than be forced to justify individual decisions on whether or not to evaluate the impact of a specific project. This is a bit similar to Rose’s recommendation to “focus impact evaluations more strategically.” Where I differ from Rose is that I don’t see a need for bureaus to plan out in advance what impact evaluations they will be involved in. For various reasons, it is often hard to plan impact evaluations too far in advance and, as mentioned above, I don’t think USAID staff are well positioned to identify gaps in the evidence base.

Another promising fix would be to create a new internal fund dedicated to impact evaluations which could be mobilized rapidly and which would cover the cost of both the evaluation and any additional implementation hassle. This would allow bureaus and outside evaluators to rapidly respond to interesting impact evaluation opportunities and reduce the negative incentive for program managers to be involved in impact evaluations.

Alternatively, USAID could just allocate more funding to DIV, perhaps with some slight changes to its mission. While USAID as a whole hasn’t done very well when it comes to impact evaluations, DIV has been a shining exception to this rule. DIV’s mission is not to fund impact evaluations per se but this has been a positive side effect of its work.

Ensuring More Effective Use of Evidence

Rose also makes several recommendations for how USAID could better use existing evidence. Here, I wholeheartedly agree with the overall aim. USAID doesn’t necessarily have to generate new evidence, but it absolutely does need to use existing evidence. Rose’s key recommendations here are to a) consolidate the various entities responsible for evaluation, b) create a new cadre of “evidence brokers” responsible for digesting and communicating the latest evidence to other staff, and c) add rules to ensure greater attention to evidence in program design and procurement.

All of these recommendations make sense, but I have minor quibbles with each. My issue with the first recommendation is only that the timing absolutely sucks. In normal times, the first recommendation would make a lot of sense. Less than a year after an extremely painful reorganization though, yet another shuffling of official titles might just break the severely weakened will of many USAID staff. Still, this is probably something that should happen relatively soon.

The second recommendation – to create a cadre of “evidence brokers” – is intriguing but would face many practical hurdles. Existing staff probably wouldn’t be too keen on the idea that they need “evidence brokers” to help them figure out which programs are most effective. Without any formal authority over programming decisions, these “evidence brokers,” no matter how talented, could easily be relegated to a cold dark room within RRB never to be seen or heard from. Still, this is an interesting approach and deserves further exploration.

My lukewarm reaction to the third recommendation really comes down to the fact that it would add yet more rules for already overburdened bureaucrats. I would love to see program solicitations be more informed by evidence, yet the last thing program staff need is yet another box to check when designing programs. The existing set of bureaucratic boxes related to program design, the ADS 201, is already over a hundred pages (of very dense type).

Whew, that was much longer than I originally anticipated and I haven’t even gotten to my own recommendations for how to ensure USAID makes better use of existing evidence. That will have to wait for another post.