What is an Outcomes Evaluation?
For the Platform, an Outcomes Evaluation (OE) is one that shows whether, and to what extent, an intervention has changed the behavior and characteristics of its beneficiaries (CEDE, 2015). That is, the OE seeks to collect information and evidence on the actual results for the program’s participants. This type of evaluation covers different methodologies and types of assessment, including both qualitative and quantitative strategies. They differ in the level of information required, the representativeness of the results, and the confidence with which the observed changes can be attributed to the program.
The Results Evaluation (RE) and the Impact Evaluation by RCT (Randomized Control Trial) are the two Outcomes Evaluation methodologies that the Platform implements to assess Graduation cases. The two methods differ in their ability to establish attribution (i.e., causality). An RE collects information on the population participating in the program at two points, before and after the intervention, in order to quantify changes in that population and relate them to the implementation of the program. Impact Evaluations, meanwhile, use program design features or statistical tools to gather information not only from the participants but also from a group of people that resembles the participating households as closely as possible but that, for some reason (e.g., geographic targeting), did not receive the intervention. This group is known as the control group. Impact Evaluations can thus isolate the causal effect of the program by comparing the group that received the program (treatment group) with the group that did not (control group). In particular, the RCT is the most robust methodology because it “builds” the control group before the intervention by randomizing who receives the program and who does not.
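The difference between the two estimates can be illustrated with a minimal Python sketch. The function names and the consumption figures are hypothetical and chosen only for illustration; they are not Platform data or the Platform's own procedure.

```python
from statistics import mean

def before_after_change(baseline, endline):
    """RE-style estimate: mean change among participants between the two rounds."""
    return mean(endline) - mean(baseline)

def treatment_control_difference(treated_endline, control_endline):
    """RCT-style estimate: difference in end line means between the two groups."""
    return mean(treated_endline) - mean(control_endline)

# Hypothetical monthly consumption per household (illustrative values only)
participants_baseline = [100, 120, 90, 110]
participants_endline = [130, 150, 120, 140]
control_endline = [115, 135, 105, 125]

re_estimate = before_after_change(participants_baseline, participants_endline)
rct_estimate = treatment_control_difference(participants_endline, control_endline)

print("Before-after change among participants:", re_estimate)
print("Treatment-control difference at end line:", rct_estimate)
```

In this made-up example the before-after change is larger than the treatment-control difference, because part of the change would have happened even without the program. Only the comparison with a control group separates the two.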
Overall, the Platform promotes the use of mixed methodologies, where quantitative information is complemented by qualitative data. In this way, the Platform can better understand quantitative results and have more elements for discussion on the policy implications. Qualitative strategies include, among others, interviews, focus groups and autobiographical accounts (Life Stories).
What is the purpose of an Outcomes Evaluation?
Outcomes Evaluations are particularly important for justifying the existence of a program and, therefore, its continuity or scaling up. They are very useful to designers and implementers for demonstrating the usefulness of the program and its compliance with its objectives. The information they produce can also guide design changes, where these are needed, and support testing new procedures to determine which option works best. Likewise, information on results is crucial for assessing the economic sustainability of the project, since it is one of the inputs of cost-benefit assessments.
Graduation Programs provide one of the clearest examples of the benefits of accompanying the implementation of innovative designs with rigorous evaluations that document a program's effects as clearly as possible. BRAC, the Ford Foundation and CGAP have all implemented Impact Evaluations by RCT for several of the interventions they have led.
These evaluations have provided substantial evidence about the benefits of the Graduation Program. Some of the main findings on key variables are presented in the following table:
¿+? and ¿-? refer to positive and negative effects, respectively, for which no statistically significant evidence was found.
Source: Banerjee et al. (2015); Bandiera et al. (2011); Briefs IPA
How is an Outcomes Evaluation done?
A Results Evaluation (RE) involves collecting quantitative data before and after implementation in order to compare the status of participant households before and after they take part in the program. The minimum requirements for the design of the RE methodology include:
1. Identification and prioritization of the results to be evaluated
Before the Results Evaluation is planned and designed, identifying and prioritizing the results to be measured makes it possible to clearly establish the objectives and scope of the evaluation. All Graduation Programs share common goals, so certain results are expected in every case.
2. Construction of indicators of interest
Each result to be evaluated, chosen in the previous step, must have at least one clear and quantifiable indicator that shows whether or not progress has been made on that result. Indicators should be chosen following the CREMA methodology, developed by the World Bank and presented in DNP (2012), under which indicators should be:
- Clear, i.e., precise and unambiguous,
- Relevant, i.e., appropriate for the particular outcome of interest,
- Economic, i.e., available at a reasonable cost,
- Measurable, i.e., open to external validation, and
- Adequate, i.e., offering enough information to estimate the performance.
For the Platform case, the defined indicators are:
3. Setting goals in terms of indicators
Following a process suggested by McNamara (2006), goals are set for the proposed indicators in light of the defined objectives and results. This exercise should be carried out together with the institutions implementing the programs, based on their expectations of the program.
4. Design of data collection tools
The main objective of this step is to provide the information needed to construct, estimate and interpret the indicators defined in the previous stages. The Results Evaluation combines qualitative and quantitative research techniques in a complementary way. The findings of each technique feed back into the other, giving a clearer understanding of the results of the evaluation and the effects on participants, as well as the channels through which those effects operate.
Thus, adapting this methodology to each particular case requires clearly specifying the activities carried out within the quantitative analysis and the qualitative analysis, and the way in which the two should interact. The quantitative analysis includes a survey conducted before and after implementation. The sample design depends on the availability of resources, the change to be statistically detected before and after the intervention, and the size of the intervention. The qualitative analysis, for its part, may include, among others, focus groups, individual or group interviews, and life stories. The choice of tools depends on how the two types of analysis are meant to complement each other.
5. Information analysis strategy and results
The proposed RE strategy allows two types of analysis of the collected information. The first is the analysis of the indicators, based on a comparison between the baseline and the end line. The second looks for structural differences in the results associated with different socio-demographic variables.
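Both types of analysis can be sketched in a few lines of Python. The records, the `head_gender` variable and the indicator values below are hypothetical; a real RE would use the survey variables and indicators agreed in the earlier steps.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical household records with one indicator measured at baseline and end line
households = [
    {"id": 1, "head_gender": "female", "baseline": 40.0, "endline": 55.0},
    {"id": 2, "head_gender": "female", "baseline": 50.0, "endline": 60.0},
    {"id": 3, "head_gender": "male", "baseline": 45.0, "endline": 50.0},
    {"id": 4, "head_gender": "male", "baseline": 60.0, "endline": 65.0},
]

def mean_change(records):
    """First analysis: average end line minus baseline value of the indicator."""
    return mean(r["endline"] - r["baseline"] for r in records)

def mean_change_by_group(records, key):
    """Second analysis: the same change, split by a socio-demographic variable."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return {group: mean_change(rows) for group, rows in groups.items()}

overall_change = mean_change(households)
change_by_gender = mean_change_by_group(households, "head_gender")

print("Overall change:", overall_change)
print("Change by gender of household head:", change_by_gender)
```

A sketch like this only describes the changes. Deciding whether a difference between subgroups is statistically meaningful would require an appropriate test on a sample of adequate size.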
Impact Evaluation by RCT
Impact Evaluation by RCT uses randomized treatment assignment (chance determines who receives the program and who does not) to define the control group, so that on average the two groups, treatment and control, are identical.
Because it is the Impact Evaluation method that requires the fewest assumptions, it is the most robust.
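As a rough illustration of random assignment and of what "identical on average" means in practice, the following Python sketch splits a hypothetical list of households at random and compares a baseline indicator across the two groups. The household identifiers, the seed and the simulated income values are assumptions made for the example.

```python
import random
from statistics import mean

def assign_groups(household_ids, seed=2015):
    """Randomly split households into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    shuffled = list(household_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])

def baseline_gap(baseline_values, treatment, control):
    """Difference in baseline means between groups; expected to be close to zero."""
    return (mean(baseline_values[h] for h in treatment)
            - mean(baseline_values[h] for h in control))

# Hypothetical baseline indicator (e.g., monthly income) for 200 households
data_rng = random.Random(1)
baseline_income = {h: data_rng.gauss(100, 15) for h in range(200)}

treatment, control = assign_groups(baseline_income.keys())
gap = baseline_gap(baseline_income, treatment, control)
print(f"treatment: {len(treatment)}, control: {len(control)}, baseline gap: {gap:.2f}")
```

A small baseline gap relative to the spread of the indicator is what randomization is expected to deliver. Evaluations typically check balance like this on several baseline variables.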
The main elements to consider in the design of an RCT include:
1. Identification and prioritization of results to be evaluated
Before the Impact Evaluation is planned and designed, identifying and prioritizing the results to be measured makes it possible to clearly establish the objectives and scope of the proposed evaluation. For example, within the Platform, all Graduation Programs share common goals, so certain results are expected, such as reducing poverty, increasing households’ food security, creating sources of self-employment, increasing savings, and building social and life skills and personal growth (De Montesquiou et al., 2014; Banerjee et al., 2015). Still, adapting a Graduation Program to a given context may require other variables. Clarifying which results are relevant and useful for implementers and evaluators is necessary before moving forward with the evaluation design.
2. Randomizing strategy
The challenge of an RCT is to have a control group that is, on average, as close as possible to the treatment group. Randomization is the way to achieve this and is therefore an essential part of an RCT. The literature suggests different strategies for obtaining an efficient design, depending on the particular context and without sacrificing the robustness of this type of evaluation. At least three elements of a program can be randomized: access, timing and invitation to participate, each at the individual or the group level (Glennerster and Takavarasha, 2013). As suggested by Glennerster and Takavarasha (2013) and Bernal and Peña (2011), the choice of randomization strategy depends on the context and limitations of the proposed evaluation. Each strategy has practical, political, methodological and ethical implications.
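Randomizing access at the group level can be sketched as follows in Python. Here whole villages, rather than individual households, are drawn into treatment or control. The village names, the household identifiers and the seed are hypothetical, and an actual design would follow the protocol agreed with implementers.

```python
import random

def randomize_by_village(household_village, seed=2015):
    """Assign whole villages to treatment or control; households inherit the village status."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    villages = sorted(set(household_village.values()))
    rng.shuffle(villages)
    treated_villages = set(villages[: len(villages) // 2])
    assignment = {
        household: "treatment" if village in treated_villages else "control"
        for household, village in household_village.items()
    }
    return assignment, treated_villages

# Hypothetical example: 12 households spread evenly over 4 villages
household_village = {h: f"village_{h % 4}" for h in range(12)}
assignment, treated_villages = randomize_by_village(household_village)

for household, status in sorted(assignment.items()):
    print(household, household_village[household], status)
```

Group-level assignment can be easier to implement and reduces the risk that control households benefit indirectly from treated neighbors, but it usually requires a larger sample than individual-level assignment to reach the same power.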
3. Power calculations and sample design
The power of an RCT design is its ability to detect a true effect as statistically significant (at a given confidence level). That is, it is the ability to establish that an observed increase or reduction in the outcome variables, when comparing control and treated households, is not due to chance but reflects what generally happens where the program is implemented.
Power critically depends on (1) the randomization strategy, (2) the number of treated households, (3) the number of households surveyed, (4) the statistical confidence desired for the results and (5) the frequency of surveys. Therefore, the power analysis also determines the sample design for the collection of quantitative information.
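To give a sense of how these elements interact, the sketch below uses the standard normal-approximation formula for comparing two means, n = 2 (z_{1-α/2} + z_{power})² σ² / δ² households per group, optionally inflated by a design effect for group-level randomization. This is a textbook simplification written in Python, not the Platform's own procedure, and the parameter values are assumptions.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(effect, sd, alpha=0.05, power=0.80, design_effect=1.0):
    """Households needed in each group to detect `effect` with a two-sided test.

    effect: smallest difference in means worth detecting (same units as sd)
    sd: standard deviation of the outcome
    design_effect: inflation factor for clustered designs (1.0 = individual randomization)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the confidence level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return ceil(n * design_effect)

# Hypothetical target: detect an effect of 0.2 standard deviations
n_individual = sample_size_per_arm(effect=0.2, sd=1.0)
n_clustered = sample_size_per_arm(effect=0.2, sd=1.0, design_effect=2.0)
n_smaller_effect = sample_size_per_arm(effect=0.1, sd=1.0)

print("Per group, individual randomization:", n_individual)
print("Per group, design effect of 2:", n_clustered)
print("Per group, effect of 0.1 SD:", n_smaller_effect)
```

Halving the effect to be detected roughly quadruples the required sample, which is why the expected size of the change is a central input to the sample design.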
4. Design of data collection tools
The Platform suggests that RCTs complement their quantitative techniques (surveys) with qualitative research strategies. This approach allows the findings of each technique to feed back into the other, yielding results that are more complete and more relevant to public policy. It also gives a clearer understanding of the results of the evaluation and the effects on participants, as well as the channels through which those effects operate.
5. Monitoring strategy
Impact Evaluations by RCT require the design and development of monitoring and randomization protocols. Part of the validity of the results depends on whether the households or participants assigned to treatment actually receive the program, while those assigned to the control group do not. Likewise, other existing interventions, different from the one being evaluated, should be monitored continuously.
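One simple monitoring check is to compare the assigned status of each household with what it actually received. The Python sketch below computes the share of treatment households that received the program and the share of control households that received it anyway. The record layout and the values are hypothetical.

```python
def compliance_rates(records):
    """Return (take-up among assigned treatment, contamination among assigned control)."""
    treated = [r for r in records if r["assigned"] == "treatment"]
    control = [r for r in records if r["assigned"] == "control"]
    take_up = sum(r["received"] for r in treated) / len(treated)
    contamination = sum(r["received"] for r in control) / len(control)
    return take_up, contamination

# Hypothetical monitoring records: assignment from the randomization, receipt from field visits
monitoring_records = [
    {"household": 1, "assigned": "treatment", "received": True},
    {"household": 2, "assigned": "treatment", "received": True},
    {"household": 3, "assigned": "treatment", "received": True},
    {"household": 4, "assigned": "treatment", "received": False},
    {"household": 5, "assigned": "control", "received": False},
    {"household": 6, "assigned": "control", "received": False},
    {"household": 7, "assigned": "control", "received": True},
    {"household": 8, "assigned": "control", "received": False},
]

take_up, contamination = compliance_rates(monitoring_records)
print(f"Take-up in treatment group: {take_up:.0%}")
print(f"Contamination in control group: {contamination:.0%}")
```

Low take-up or high contamination narrows the difference in program exposure between the two groups, so tracking both during implementation helps in interpreting the final estimates.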
Centro de Estudios sobre Desarrollo Económico – CEDE (2015). Metodología para la Evaluación de Resultados. Documento Interno de Trabajo. Plataforma de Evaluación y Aprendizaje de los Programas de Graduación en América Latina. CEDE – Facultad de Economía, Universidad de los Andes.
Banerjee, A., E. Duflo, N. Goldberg, D. Karlan, R. Osei, W. Pariente, J. Shapiro, B. Thuysbaert, and C. Udry (2015). “A Multifaceted Program Causes Lasting Progress for the Very Poor: Evidence from Six Countries.” Science 348, no. 6236 (May 14, 2015).
Bandiera, O., Burgess, R., Das, N., Gulesci, S., Rasul, I., Shams, R., and Sulaiman, M. (2011). “Asset Transfer Programme for the Ultra Poor: A Randomized Control Trial Evaluation”. CFPR Working Paper 22. Available at: https://www.microfinancegateway.org/sites/default/files/publication_files/asset_transfer_programme_for_the_ultra_poor-_a_randomized_control_trial_evaluation.pdf
Departamento Nacional de Planeación – DNP (2012). Guías Metodológicas Sinergia – Guía para la Evaluación de Políticas Públicas. Bogotá D.C., Colombia.
McNamara, Carter (2006). Basic Guide to Program Evaluation. Available at: managementhelp.org/evaluation/program-evaluation-guide.htm
De Montesquiou, A., Sheldon, T., De Giovanni, F., and Hashemi, S. (2014). From Extreme Poverty to Sustainable Livelihoods: A Technical Guide to the Graduation Approach. Consultative Group to Assist the Poor (CGAP) and BRAC Development Institute.
Glennerster, R., & Takavarasha, K. (2013). Running randomized evaluations: A practical guide. Princeton University Press.
Bernal, R., and Peña, X. (2011). Guía práctica para la evaluación de impacto. Universidad de los Andes, Facultad de Economía, Centro de Estudios sobre Desarrollo Económico.