Are we making a difference? If so, how, and for whom?
These are questions we often ask about our policies, programmes and projects. Yet they are often extremely challenging to answer. Understanding the ‘black box’ between ‘what we do’ and ‘what impact happens’ has puzzled evaluators for years.
It is now over a decade since we gathered to establish the Centre for Development Impact (CDI): a community of like-minded academics and evaluators who sought to improve the way we evaluate impact, particularly in complex settings. CDI is a collaboration between Itad, the Institute of Development Studies (IDS) and the University of East Anglia (UEA).
As we turn 10, I reflect on CDI’s origins, its work and why there is still more to do.
In the early 2010s, evaluation in international development was heavily critiqued, with powerful advocates for experimental designs coming to the fore. Off the back of the CGD’s seminal report When will we ever learn?, the 2009 Cairo Conference on impact evaluation, and the rise of organisations like J-PAL and 3ie, the future seemed clear.
Randomised controlled trials, complemented by quasi-experimental methods, would usher in a new era of rigour. Combined with cost-effectiveness analysis, they would enable policymakers to take better evidence-based decisions, mirroring similar advances in evidence-based medicine.
Yet, in international development, many of those who gathered to establish CDI felt that our real-world experiences did not bear this out. Social and natural systems were more complex and fragile, with multiple causal pathways operating in challenging contexts.
So, around 2012/13, we set up CDI to explore a more pluralistic agenda. Our interest was in a wider range of perspectives on causality, rigour and what counts as evidence – or, as Robert Chambers would put it, ‘whose evidence counts?’.
The work of Elliot Stern and others (later captured in this Bond guidance on impact evaluation) provided inspiration, but we were also influenced by our own evaluations in areas as diverse as climate change, advocacy, governance, slavery, migration, microfinance, nutrition and child poverty – where experimental evaluations were often just not appropriate or feasible.
This was the start of a community of practice that coalesced around:
- The appropriate use of methods, rather than assuming a hierarchy of methods
- Answering not just ‘what works’ but ‘how it works’ and ‘under what conditions’
- Using evidence to improve, and not just prove, impact by learning and adaptation
- Exploring issues of inclusion and power in evaluation
An emerging and enduring agenda
In those early days, we initiated many different strands of work through seminars, events and publications, encouraging a wide range of views along the way – and despite limited resources, the community of practice is still active some ten years later.
Over that period, we’ve convened over 50 CDI seminars, created a series of 25-and-counting CDI Practice Papers, held several major events and conferences, and published numerous other publications. Out of this breadth of work, two main strands have emerged:
- Exploring the range of alternatives to using counterfactuals for causal analysis, such as with contribution analysis, process tracing, realist evaluation, and complex-aware and systems approaches.
- Exploring power through both the process and methods of evaluation. This has included work on ethics, specifically the relationship between evaluators and the powerful and powerless, as well as inclusive designs, such as PIALA and participatory statistics.
This agenda is every bit as relevant today. Indeed, perhaps more so, with a greater recognition of complexity and systems, and an increased focus on learning and adaptation. Beyond CDI, we now see some of these elements enshrined in guidance for the UK government, as well as new initiatives, such as Causal Pathways in the US.
CDI’s current work on ‘bricolage’ takes our work to the next level. Whereas in the early days we were experimenting with the use of separate theory-based methods (process tracing, realist evaluation and so on), we are now working on the ‘craft’ of combining methods – and how to do so in a deliberate and transparent way. This also integrates notions of inclusive rigour, including how we undertake causal analysis that takes in diverse perspectives, especially those of the less powerful and marginalised.
Of course, there are still areas where we’ve seen slower progress, and where there is more to do. Three come to mind. First, among commissioners and evaluators alike, we still lack a sufficiently shared framing and language for what counts as good practice in most areas, although a few exist, such as the RAMESES standards and work on strength-of-evidence rubrics. This contrasts with the likes of economics and statistics, the bedrock of experimental designs, where the norms and standards of rigour and statistical significance are more widely accepted.
Second, generalisation remains challenging – and we can’t always offer the neat, single answer that is so seductive to policymakers. Indeed, many theory-based methods are better at finding out what works in one specific context than at producing more ‘portable’ findings about what will work in multiple contexts (though some, like the recent UKES rubrics on the quality of evidence, are starting to explore related issues of ‘transferability’).
And finally, ethics and power dynamics are important but often overlooked. While in international development there is now a growing recognition of unconscious bias and decolonisation, and a greater desire for equitable working, our practice still has some way to go.