Discover the Latest Innovations and Lessons Learned in Rule of Law and Legal Empowerment Projects
In August 2014, UNDP published a new guide: “Why, What and How to Measure? A User’s Guide to Measuring Rule of Law, Justice and Security Programmes.” This is a great resource for beginners just starting out in the rule of law field. The toolkit includes the following information: how to operate under budget, time, political and data constraints; what type of skills to look for when hiring external expertise; how to use research findings to design and implement effective programs; how to translate measurement findings into practice; what to do about a lack of stakeholder support; and more. Check out some of the excerpts below or click here to read the entire report!
Overcoming Common Measurement Challenges
Challenges related to a lack of familiarity with measurement methods: Many people who work for local NGOs, national governments and international organizations may be unfamiliar with the steps necessary for measurement. In some cases, potential partners may try to block data collection because they feel excluded from the process of collecting and analysing data, or fear that their authority will be undermined if research data is used in the decision-making process. It is extremely important that assessments or data collection do not alienate governmental partners, who are often keenly aware of serious shortcomings on the ground. These partners will often require reassurances to overcome their perception that measurement is a donor-imposed requirement of limited value to them.
Possible Solutions: To preempt negative perceptions it is crucial to manage assessment results very carefully. Including national stakeholders in measurement steps is often the best way to avoid accusations that programming is donor-driven, Western or otherwise imposed. Therefore, engage government officials early on in an assessment process to ensure that they understand the process and can trust the motivations behind it. The following four steps can help to generate local support:
(1) Include partners in initial discussions as a way of incorporating their interests and concerns into the design of measurement activities. (2) Brief senior officials and development partners about data collection plans and how the findings will be used. (3) Brief national stakeholders privately with any preliminary findings to encourage their ownership of the results. (4) Distribute findings and recommendations among all stakeholders in advance of their general release and provide an opportunity for the airing of comments and concerns.
Challenges of entrenched interests: In some cases, project staff may encounter obstacles to measurement because of a concern that the results will upset entrenched interests and disrupt established practices.
Possible Solutions: Structure the measurement exercise in a way that introduces incentives for positive change. For instance, draw attention to improved indicators such as a reduction in the number of police complaints. In some cases, however, it may be necessary to confront entrenched interests and seek support from other sources such as the media or civil society organizations.
Challenges of maintaining good relations with local partners: Negative findings of a project’s measurement may damage relationships with stakeholders who were also involved in the project design and implementation.
Possible Solutions: Involve local partners in every step of a project’s implementation and measurement process so that they are apprised of preliminary findings and may offer their input. Try to avoid unnecessarily criticizing local partners and instead discuss findings in terms of what can proactively be done to achieve desired outcomes.
Challenges presented by the pressure to succeed: Pressure to demonstrate success could directly or indirectly influence the selection of a RoL project. This is dangerous because there are areas where it is difficult to demonstrate clear, measurable results, often because short timeframes do not allow changes to mature sufficiently to show tangible results. The fact that results will be difficult to measure should not be a disincentive to pursuing good development programming.
Possible Solutions: Work with national stakeholders to develop program areas before determining how to evaluate projects. Assessments can help to determine what projects should be priorities and how to implement them to achieve desirable changes. The ability to evaluate a project’s effectiveness is an important consideration, but it should not dictate priority areas.
Investigate Existing Data Sources: One of the first steps in the assessment process is to investigate existing data sources. DPKO and OHCHR recently developed a series of indicators as part of the UN Rule of Law Indicators Project that cover police, courts, prosecution, criminal defence and prisons. Other international projects, such as the regional Barometers (Afrobarometer, Arab Barometer, Asian Barometer, Eurobarometer and Latinobarómetro), conduct public surveys in multiple countries that cover a range of governance issues. Similarly, national governments, civil society groups and other international development agencies may conduct national censuses or public surveys, collect administrative records, or compile other types of data.
[See Page 18 for Questions to Ask Yourself When Designing Assessments; Page 19 for Questions to Inform a Mid-Term Evaluation; and Page 20 for Questions to Inform Final Evaluations]
Measuring Indirect Change: In some cases, when it is not possible to directly measure an impact or change, it will be necessary to develop proxy measures. Proxy measures act as a substitute when it is not possible to measure the desired outcome directly. For example, if it is difficult to measure increased confidence in the police, proxy measures could include changes in the number of calls for police assistance, the number of witnesses volunteering to provide testimony, or public surveys measuring the perception of police trustworthiness. To measure an issue as complex as confidence, it is best to combine evidence from different data sources and proxies to provide multiple measures of the underlying concept. This approach is known as ‘triangulation’.
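As a rough illustration of triangulation, the sketch below checks whether several proxy measures move in the same direction between a baseline and a follow-up. The `Proxy` class, the function names and all figures are hypothetical and invented for this example; they do not come from the UNDP guide, and a real exercise would also weigh data quality and context rather than simply count proxies.

```python
from dataclasses import dataclass


@dataclass
class Proxy:
    """One proxy measure observed at baseline and at follow-up."""
    name: str
    baseline: float
    follow_up: float
    higher_is_better: bool = True  # direction that suggests more confidence


def improved(p: Proxy) -> bool:
    """True if the proxy moved in the direction that suggests improvement."""
    change = p.follow_up - p.baseline
    return change > 0 if p.higher_is_better else change < 0


def triangulate(proxies):
    """Return (number of proxies that improved, total number of proxies)."""
    return sum(improved(p) for p in proxies), len(proxies)


# Hypothetical figures, invented purely for illustration.
proxies = [
    Proxy("calls for police assistance per month", 120, 150),
    Proxy("witnesses volunteering testimony per month", 8, 11),
    Proxy("survey: % rating police as trustworthy", 35.0, 33.0),
]

agree, total = triangulate(proxies)
print(f"{agree} of {total} proxies point to increased confidence")
```

When the proxies disagree, as they do in this invented example, that is a signal to look more closely at each data source rather than to report a single headline figure.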
[See Page 21 on How to Evaluate the Impact of Mobile Courts]
Outcomes are actual or intended changes in development conditions that interventions are seeking to support. Outcomes describe the intended changes in development conditions that result from the interventions of governments and other stakeholders, including international development agencies. They are medium-term development results created through delivery of outputs and the contributions of various partners and non-partners. Outcomes provide a clear vision of what has changed or will change globally or in a particular region, country or community within a period of time. They normally relate to changes in institutional performance or behavior among individual groups. Outcomes cannot normally be achieved by only one agency and are not under the direct control of a project manager.
Outputs are short-term development results produced by project and non-project activities. Since outputs are the most immediate results of program and project activities, they are usually within the greater control of the government or the project manager. Outputs generated by projects are always connected directly to an outcome. Each project level carries a critical responsibility for generating the planned output through a carefully planned set of relevant and effective activities and proper use of the resources allocated for those activities.
Defining Outputs and Indicators for Success: The very first step to establish a project’s effectiveness is to define, and agree with stakeholders on, outputs and indicators for success. This must be done before the project is implemented. The process of collecting assessment data will help refine project outputs and establish the baseline data. Once project outputs have been identified, the design of output indicators as well as targets to assess progress as part of mid-term and final evaluations can commence.
The process of developing indicators and targets may require a rethinking of whether outputs are realistic and appropriate. Although the ability to measure outputs should not influence programmatic areas, project outputs should be SMART, that is: specific, measurable, achievable, relevant and time-bound, to ensure that progress can be tracked.
How to Build Stakeholder Support for ROL Projects: To secure stakeholder support, conduct regular stakeholder briefings and provide project updates or other short documents describing the progress of the project along with any initial findings. A project advisory group is another way to engage stakeholders and incorporate their advice on project methods and on how to translate findings into policy recommendations.
Weighing Risks and Benefits of Data Collection: In some cases, the risks associated with collecting information may outweigh the benefits. For example, conducting research interviews with members of sexual minorities may expose some of these individuals to a broader public and put them at risk of being physically and emotionally abused. In many instances, conducting anonymous research on sexual orientation and gender identity can minimize the risks and maximize the benefits of data collection.
Once the scope of the project has been finalized (see Page 26 for more information), the next stage is to decide how to approach the three measurement steps: assessment, mid-term evaluation and final evaluation. This could, for instance, involve choosing between qualitative in-depth interviews and a public survey.
Skills Necessary to Collect, Analyze, and Interpret Data: The skills required can vary substantially depending on the measurement approach. Designing and implementing measurement plans often requires a particular skill set, such as developing a sampling design for a national survey, conducting statistical analysis, or collecting qualitative data with marginalized groups. If these skills are not available in-house, it may be necessary to hire a consultant with the appropriate expertise. In addition to methodological expertise, qualified researchers should be able to apply their knowledge to crisis-affected and fragile settings. [See Page 28 on Pros & Cons to Hiring External Consultants]
Time Constraints: [See Page 30 on How to Select Consultants to Measure under Time Constraints]
Budget Constraints: It is recommended that staff allocate 10 percent of the project budget for assessment, mid-term evaluation and final evaluation (possibly including the hire of a measurement expert for the project).
Data Constraints: In fragile settings, data, as well as the resources and capacity to collect data, may be entirely unavailable or the information may be collected but incomplete. In other instances, political and security factors can compromise the quality of the data that is made available, or information may be of poor quality due to improper data collection and management skills within government offices or international agencies. Data may also be limited because of missing or incomplete information. [See Page 33 on How to Overcome Data Constraints]
Political Constraints: Political constraints can greatly hamper measurement activities. For example, government partners may be reluctant to collect and provide data, they may refuse to implement recommendations or acknowledge findings, and, in some cases, they may ask for an evaluation that portrays them in a favorable light and supports their political aspirations.
Cultural Constraints: Speaking local languages, dressing properly, using appropriate gestures, and acknowledging respondents’ efforts to provide data or help contextualize findings are all necessary steps for a successful measurement initiative. [See Page 35 for Steps on Overcoming Cultural Constraints].
Measurement Approaches: There are two overarching categories of measurement data: quantitative and qualitative. The former category refers to numerical descriptions such as percentages and averages, and the latter to information presented in narrative form (e.g., summaries of observations, first-hand accounts and descriptions of a process). Neither is harder or easier, or more or less valid than the other. Many measurement initiatives apply a mix of quantitative and qualitative methods, capitalizing on the relative strengths of each approach.
Qualitative measurement methods (QualMM) are often used to provide a nuanced description of issues that are complex, or not easily quantifiable. They are typically used to study a limited number of cases in detail.
Quantitative measurement methods (QuantMM) usually rely on numerical or statistical data, whether collected through administrative data systems, direct observations or quantitative surveys. For example, if targeting the provision of legal aid services in a remote area, the percentage of defendants who were represented by a lawyer before the project was in place (i.e., baseline data) could be compared with the percentage with representation once the project has been operating for a year (i.e., follow-up data). [See Page 38 for a Review of the Strengths and Limitations of QualMM and QuantMM]
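The baseline and follow-up comparison in the legal aid example can be sketched as a simple calculation. The function names and the counts below are hypothetical and invented for illustration. A before-and-after difference like this describes a change, not its cause.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` made up by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def change_in_points(baseline_pct: float, follow_up_pct: float) -> float:
    """Difference between follow-up and baseline, in percentage points."""
    return follow_up_pct - baseline_pct


# Hypothetical counts, invented purely for illustration.
baseline = percent(30, 200)    # defendants with a lawyer before the project
follow_up = percent(90, 225)   # defendants with a lawyer one year in
change = change_in_points(baseline, follow_up)

print(f"Baseline: {baseline:.1f}%  Follow-up: {follow_up:.1f}%  "
      f"Change: {change:+.1f} percentage points")
```

Reporting the change in percentage points, alongside the underlying counts, avoids confusion with a relative percentage increase.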
Given that most RoL issues — such as transparency, fairness, access and responsiveness — are multi-faceted, it is advisable to use groups of complementary indicators, often called ‘baskets of indicators.’ [Examples Include Public Surveys, Administrative Data, and Expert Surveys – Page 40]
Isolating the Impact of a Project: Because it is usually impossible to account for all factors influencing the outcome that is being measured, results should be described in terms of ‘association’, i.e., how two or more developments are correlated, rather than ‘causality’, i.e., how project activities led to an outcome.
Evaluation Fatigue: Local government partner agencies may become frustrated and experience evaluation fatigue if several development agencies are working on similar projects and requesting similar datasets without coordinating efforts.
Two Ways to Measure the Effect of a Project: To isolate the specific impact of a project, measurement approaches that determine causality are crucial. These typically require some form of comparison group to establish a counter-factual (e.g., the proportion of women who would report SGBV if the UNDP project did not exist). There are two main groupings of designs that can be used to measure the effect of a project. These are:
(1) Experimental designs, where participants are randomly assigned to a group that is affected by a project, or a comparison group that does not receive services. In order to compare outcomes, it is important to track both groups using the same methods. Experimental designs are the only measurement tool that can establish a direct causal link, with a high level of certainty, between project activities and desired changes.
(2) Quasi-experimental designs, which include a range of methods that approximate random allocation and are used in settings where it is either impractical or undesirable to randomly deny people services. For example, it would be unethical to deny defendants access to legal representation in capital cases.
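To make the idea of random assignment in an experimental design more concrete, the following minimal sketch splits a list of participants into a project group and a comparison group, and defines one outcome measure that is applied to both groups in the same way. The function names, the participant IDs and the seed are assumptions made for this illustration. It covers only design (1) and sets aside the ethical and practical limits noted under design (2).

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into (project_group, comparison_group)."""
    rng = random.Random(seed)        # a fixed seed makes the split reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def outcome_rate(outcomes, group):
    """Share of group members with a positive outcome (1 = yes, 0 = no)."""
    return sum(outcomes[pid] for pid in group) / len(group)


# Hypothetical participant IDs, invented purely for illustration.
participants = list(range(1, 21))
project_group, comparison_group = assign_groups(participants, seed=42)

print(len(project_group), "participants in the project group")
print(len(comparison_group), "participants in the comparison group")
```

Once outcomes have been recorded for every participant, calling `outcome_rate` on each group with the same outcome data keeps the measurement method identical across the two groups, as the text requires.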
Six Ways to Collect Data:
1. Administrative Data includes a range of information collected by agencies or individuals, typically for purposes other than conducting research. Administrative data can be collected directly by reviewing files, report books or other types of written records, an approach known as a case file review. For example, data on the number of firearm-related deaths can be gleaned from local hospital records, or the incidence of gender-based violence can be found in the occurrence books of local police stations. The benefit of a case file review is that one can format the data to suit the needs of a project and collect the most recent data available, which may not yet be accessible from other sources. Case file reviews, however, can be costly and the scope of data collection may be limited.
Using Secondary Data: Many governmental agencies and NGOs routinely collect administrative data, even in places with limited resources. For example, ministries of justice, civil society organizations or other development partner agencies may centrally record the number of people seeking legal and paralegal assistance, the number of inmates in local prisons, the number of police officers trained to respond to SGBV, or the salaries of judges and magistrates. Additionally, national bureaus of statistics (or their equivalent) may collect demographic information that is useful for measuring the effectiveness of a RoL project.
If an agency is not required to collect administrative data as part of its daily operations, it may be possible to help improve its capacity to collect and use data — by providing training and technical support. Such partnerships can help improve the quality of administrative data while also enhancing local ownership over a project and its measurement.
Other Actors Collecting Administrative Data: A wide range of national, bilateral and multilateral actors collect administrative data. For example, the International Committee of the Red Cross and Médecins Sans Frontières record information on mortality and morbidity, often including data collected from correctional facilities. Similarly, UN agencies, such as the Department of Peacekeeping Operations (DPKO), collect a wide range of administrative data on criminal justice systems.
2. Public Surveys: can provide valuable insights on a range of RoL issues that may be impossible to measure using other data collection techniques. These can include so-called ‘household surveys’, which are a major source of social and demographic statistics in some countries, alongside population and housing censuses and the administrative record systems.
[See Page 49 on How to Select Survey Respondents and Design Public Surveys]
3. Expert Surveys: Expert surveys offer two main advantages. First, they generate in-depth information on technical or specialized issues that may be unfamiliar to members of the general public. Second, because expert surveys are not designed to be representative of a wider population, they often rely on a relatively small pool of respondents (20-30 people is not unusual) and are typically quicker to implement and less costly than large public surveys.
Defining Experts: The criteria for defining ‘experts’ may be ambiguous and can depend on the perspective of different stakeholders. For some, experts are individuals who hold high positions in the government or have a detailed knowledge of an issue based on academic study. For others, experts can include anyone who has specialized knowledge of the issue at hand, irrespective of their professional position or affiliation. Unless a sufficient number of experts respond to survey questions, it may be impossible to report quantitative results.
Expert Affiliation: Expert surveys are particularly vulnerable to the perspective or affiliation of respondents. For example, if a project surveyed 20 experts from human rights NGOs about the prevalence of torture in national prisons, their responses may be entirely different from an identical survey of 20 government officials. To account for the variety of opinions that exists on most RoL topics it is important to include experts that represent the full range of opinions. Including multiple perspectives will also help to ensure that survey results are viewed as credible by a wide range of audiences.
[See Page 53 on Steps for Conducting Expert Surveys]
Confidentiality and Informed Consent: Whether conducting surveys or focus groups, project staff must ensure that data collection is voluntary, is anonymous or confidential where necessary, and does not pose undue risks to respondents. Study participants need to be well informed about the possible risks and benefits of participating in a survey before they agree to answer questions.
4. Focus Groups: are one of the most commonly used qualitative data collection methods. They are usually arranged as a facilitated group discussion and typically adopt a semi-structured or unstructured questioning format, allowing members of the group to express their opinions in an unconstrained manner. Groups typically include five to eight participants, a moderator and an assistant moderator. The moderator is responsible for guiding the conversation and asking supplementary questions to follow up on topics of particular interest. The assistant moderator is responsible for keeping written notes (if the conversation is not being recorded) and asking additional questions as necessary. Participants should share a common background (e.g., all women, researchers, criminal justice professionals) so they feel comfortable expressing their opinions.
Assessing Focus Group Statements: Focus groups typically produce qualitative, narrative accounts that can be analysed by looking for common themes in participants’ statements. They are particularly useful for assessing the diversity of experiences and are generally more cost-effective than one-on-one interviews as they allow multiple participants to be included in one session. Focus groups are usually not appropriate for sensitive or taboo issues, or on topics that could leave participants uncomfortable sharing experiences in a group setting. For example, one-on-one interviews or surveys would be preferable when assessing rates of domestic violence or other similarly sensitive topics. Because participants are asked to express their views and experiences in an open setting, it is particularly important to provide clear information on topics to be discussed and to mitigate potential risks in advance.
Focus Group Benefits: They can serve as a potential data source and a source of guidance for project design and review. They also provide a forum for pre-testing survey questionnaires, contextualizing and interpreting findings, and eliciting practical suggestions about how to use measurement findings to improve policy and practice.
5. Document Reviews: can include a wide range of materials, including court records, police crime registries, vetting documents, budgets, fiscal reports, written accounts of spending, newspaper articles, monographs and autobiographies, and pictures of accidents, people or corpses. Document reviews can help determine whether governments have provided sufficient information to the general public to ensure transparency and accountability (by publishing budgets, or information on the outcome of official investigations of corrupt practices, for example). Document reviews can also assess whether laws and regulations are consistent with international human rights principles or other practice standards. It is usually advisable to combine document review measures with other data sources, such as expert surveys or public surveys, to understand both the adoption and implementation of laws.
[See Page 56 on Steps to Undertake When Doing a Document Review]
6. Observations: can be conducted of criminal trials, prison conditions, police detention cells, informal justice proceedings and interactions between the police and members of the general public. Observations can be recorded through meticulous note-taking or, if note-taking is too intrusive, by filling out observation worksheets once the observation has been completed. In some settings, and with the necessary permissions, it may be helpful to take photographs. While most observations result in narrative summaries (QualMM), project staff can also conduct an observation of a large number of institutions and produce quantitative results. An example of this would be visiting all prisons in a large country to assess the availability and quality of toilets and other basic sanitation systems.
[See Page 57 on Steps to Undertake When Conducting Observations]