Orchestrating a MEL system for portfolios and programs: what we’re testing now

Authors: Alix Wadeson, Florencia Guerzovich and Thomas Aston

Arkhip Kuindzhi · Dnepr in the morning, 1881

As we support different organizations, think about their MEL systems, and reflect on our own work, we’re finding that we can measure progress and aggregate results to conduct a symphony that is larger than the sum of its ‘instruments’ (continuing with our ‘school of music’ metaphor). In our last post of this blog series, we discussed the importance of engaging with programs and portfolios in place, while working towards a new framework by incorporating “layers” of learning from experimentation, both at project and portfolio levels. We also learned about similarities with the thinking of other colleagues, such as the UNDP’s Accelerator Labs and Innovation teams. Our understanding is slowly evolving, but we are already learning various lessons; we hope these can inform others also working on MEL or managing ‘schools of music’, and continue this dialogue with peers and funders.

We have organized this learning and current thinking into five key areas: (i) construct validity and comparisons; (ii) collective and meaningful theory building; (iii) realistic timelines for realizing and measuring impact; (iv) purposeful and appropriate data collection and aggregation; and (v) MEL utility for all.

1) Are we measuring what we think we are measuring? And how can we compare and learn across projects and contexts?

We recommend starting by defining standard concepts and indicators, with specific guidance on what is (and is not) important to document at the project and portfolio levels. This decision is often associated with a theory of change (ToC). There is some recent debate about the language, but a ToC is essentially a hypothesis of how and why change is expected to happen (or happened). We should also consider the main questions that stakeholders have about this theory (or theories), rather than generate concepts or indicators for their own sake.

Two of us are developing and testing a Monitoring, Evaluation, Reporting and Learning (MERL) guide for GPSA grant partners and evaluation consultants. This process has helped us reflect on common challenges in establishing other TPA portfolio MEL systems. For us, this process includes unpacking the key indicators that could prove useful to help us monitor, evaluate, iterate, and narrate a general trajectory of change across different contexts.

Also baked into this process is a set of core assumptions that the ‘school of music’ identified with its partners over the years. This involved compromises in prioritizing certain assumptions over others across the portfolio (e.g., zeroing in on the non-financial contribution of the funder as part of the effort, rather than only on the work of local groups; or focusing on the development of relationships for meaningful action rather than on the design of tools).

A set of core concepts is also embedded here; ‘sustainability’ is one example. The GPSA stakeholders defined both old and new concepts together, taking into account emergent practice as well as research in the sector. It was important to be clear about these definitions to ensure that stakeholders understand what exactly the portfolio intends to measure and learn about collectively, before getting into the ‘how’.

Another example is “capacity building”: quite a generic term, but also one that is central to the GPSA’s work (as it is for many other TPA projects). The GPSA therefore observed its practice and developed its own framework to define the types of capacities necessary for effective social accountability processes. It’s important that we are clear about what these key concepts mean for a given portfolio or program.

Such definitions are important to support the potential transfer of key ideas to different project contexts more easily, and to ensure we are monitoring similar dynamics (i.e., construct validity). These are also important to help avoid situations where actors are (perhaps unknowingly) talking past each other when reflecting about the work — e.g., equating capacity development to the top-down transfer of expert knowledge as opposed to the development of capacities to learn by doing with and from others. USAID also just launched a document explaining what they mean by locally-led development. So, we aren’t alone in this endeavor.

Definitions can be a purposive tool for a living process that enables us to find compromises to produce harmony among the different components of a ‘school of music’ (see below). Their codification is just a step to support MEL that can be revisited over time, in light of emergent practice. In the TPA sector, this isn’t meant to prescribe or proscribe, but rather, to ensure we’re talking about the same things, so we can also work to measure and learn about them consistently. We find that there are no perfect definitions and concepts can evolve with learning and practice. Therefore we are striving for ‘good enough’, while also being careful about conceptual stretching.

A ToC can be a useful instrument to prioritize concepts and indicators. But a portfolio-level ToC is often written in a way that does not speak to the specificities of concrete projects (see below). This is why it is useful to “localize” the ToC and associated Results Framework (RF) indicators into project-level ones early on. Coherent nested ToCs, defined core concepts, a priority set of indicators, and a general approach to coding and scoring them might help us learn about key features of a portfolio/program.

It is equally critical to be explicit about what is not necessary to measure. We believe all stakeholders should resist the urge to add too many indicators and should add ‘extras’ with caution. The aim here is twofold:

a) focus attention on a manageable number of priority areas (theory, questions, indicators, learning activities) for the different stakeholders of a given ‘school of music’; and

b) avoid projects with too many indicators that offer limited added value to your priority areas. The latter seems to be a pervasive problem in the TPA projects we know. The GPSA team has reviewed thousands of funding applications, and over the years one of us curated MEL jam sessions with grant partners. Developing fit-for-purpose TPA indicators that can provide the meaningful evidence we seek is rarely easy, and this difficulty can contribute to ‘overdoing it’ on the indicators. However, we suspect that if agencies really measured all the indicators identified in their proposals, sometimes over 70, they would not have time and money left to do the actual work! Spreading resources too thin in this way can also dilute the quality of indicator measurement across the board (particularly for small teams).

This may sound like cause for concern, a risk of putting diverse grants into a “MEL straitjacket,” promoting check-box isomorphic mimicry, or promoting undue standardization. This is not our intent. So, to allow for diversity, one option is to use “functional equivalents,” an approach used in comparative social science and law as well as in the TPA field (e.g., in the OECD Anti-Bribery Convention and Global Integrity’s legendary reports and indicators). In practical terms, this is about determining the function of key aspects of TPA projects, rather than focusing on the form (i.e., name or label).

For example, many TPA portfolios seek to bring together actors from civil society, the public sector, and citizens to engage collaboratively in joint problem solving to address specific service delivery or policy failures. These processes can take many forms with different labels — from school-based management bodies to community health committees to higher-level engagement structures between policy makers and CSO coalitions. They can be formal or informal, rigid or loose.

What matters is the function they play and whether the appropriate group of stakeholders is engaged in them to effectively address the identified problems. If these processes are critical to the joint purpose of a ‘school of music’, then each member should also ideally track an indicator focused on the activities and engagement of multi-stakeholder compacts, platforms or interfaces. However, it should be up to each member to determine what that looks like and how it works in practice (i.e., the functional equivalent). Our preference is a balance that promotes localization and appropriate contextual fit but does not simply propose that a thousand flowers (and indicators) bloom wild; this can lead to cacophony and may well be counterproductive to collective learning over the medium and long term.
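
To make functional equivalence concrete in MEL data terms, here is a minimal, hypothetical sketch (not the GPSA’s actual system): each project records its engagement structure under its own local label, and a small mapping assigns that label to a shared function, so that one portfolio-level indicator can be tracked across differently named forms. All names, categories, and fields below are illustrative assumptions.

```python
# Hypothetical sketch: tracking one portfolio indicator across
# "functionally equivalent" structures that carry different local labels.
# Labels, categories, and fields are illustrative, not GPSA definitions.

from dataclasses import dataclass

# Local form (label) -> shared function the portfolio cares about
FUNCTION_MAP = {
    "school-based management committee": "multi-stakeholder interface",
    "community health committee": "multi-stakeholder interface",
    "policy dialogue platform": "multi-stakeholder interface",
}

@dataclass
class EngagementRecord:
    project: str
    local_label: str     # the form, named in the project's own terms
    meetings_held: int   # simple activity measures with shared units
    actors_engaged: int

def portfolio_indicator(records, function="multi-stakeholder interface"):
    """Aggregate one functional-equivalent indicator across projects."""
    relevant = [r for r in records if FUNCTION_MAP.get(r.local_label) == function]
    return {
        "projects_reporting": len({r.project for r in relevant}),
        "total_meetings": sum(r.meetings_held for r in relevant),
        "total_actors_engaged": sum(r.actors_engaged for r in relevant),
    }

records = [
    EngagementRecord("Project A", "community health committee", 6, 25),
    EngagementRecord("Project B", "school-based management committee", 4, 18),
]
print(portfolio_indicator(records))
```

The point of the mapping is that each project keeps its own label (the form) while the portfolio tracks the shared function, which is what the comparison ultimately rests on.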

We suspect that the use of functional equivalent indicators may enable better comparison of processes with similar aims (they are part of the same overall program/portfolio) but that are not just copycats (i.e., they represent locally-led, contextually-tailored versions of the collective ‘music’). We hope this will help us to learn more about what works and does not, under which conditions. For example, how and why different contextual factors can sometimes lead to the same results or alternatively, how seemingly similar situations produce different results.

Over time, you can build a useful map of the portfolio, which can help to identify opportunities for structured comparisons. Ad-hoc case selection strategies, so common in TPA evaluations, can work well for communications, but less so for transparent, adaptable and accountable ‘schools of music’. They also get in the way of social learning, in the sense of grappling together with what we know and the uncertainty of the work.

Organizations should be prepared for course correction on the choice of core concepts and indicators until they reach the ‘Goldilocks’ balance.

We recognize that there’s a risk of too much standardization, as practitioners know well. However, in some TPA portfolios and programs we have seen the opposite problem: not enough relevant data in many cases, or datasets that cannot be meaningfully compared. It may take time to get the mix of concepts and indicators ‘right’, but it’s worth the effort in our view.

From a MEL practitioner perspective, the focus should be balance and compromise. Unrealistic zero-sum debates across extremes do not help us to move forward nor help us to ‘learn by doing’. As Kathy Bain picked up from our previous posts, “If we cannot support the scale up and learning from the many disparate but rather small scale success stories we all know about, we are falling short. Candid discussions and more purposeful experimentation on how best to do this, while learning from each other, is urgently needed”.

2) Build theory collectively, yet with boundaries

Nested mid-level theories of change (with boundaries) can help provide focus and build political compromises. We have shared some of our experiences on the benefits of mid-level theory for field and portfolio MEL. Mapping assumptions can help prioritize change pathways within a portfolio. Being explicit can help us interrogate the validity of these assumptions, as well as recognize other pathways that co-exist beyond our portfolio (i.e., other genres of music that exist in other schools of music and may be in harmony or discord with our work).

In this way, when we talk theory, we are not thinking about only lobbying for our preferred musical genre as universal, but the possible benefits of alternative paths and what their tradeoffs might be given organizational and contextual circumstances. When portfolio ToCs are only made with strong normative assumptions for advocacy, fundraising or other objectives, they may inadvertently undermine the quality and effectiveness of MEL, reinforcing our discursive “existential threat.”

We also advise explicitly asking about the funder or fund manager’s contribution and comparative advantage: even when ‘schools of music’ increasingly let ‘local musicians’ lead, there is much to learn from their common thread. We need to learn whether they add value to the symphony, or whether transaction costs, organizational dynamics and/or other factors turn lofty goals into a cacophony. If we inquire into, rather than assume, local partner coordination (i.e., an orchestra) and a funder or fund manager’s role in it, we can learn how to better support the work.

For example, the Fund for Transparent Slovakia’s (FpTS) evaluation found that incentivizing joint projects among NGO partners did not pay off in their context, but supplementing grants with informal dialogues, including but not limited to partners, added value to the system (see p.61 here for a glimpse). However, it is important to note that much of the FpTS administrators’ staff time, responsible for the fund’s value-add beyond direct grants, was not covered by its administrative costs.

Clarifying a funder’s role in a given sector or context, where there are other actors trying different approaches, often also requires asking: what is our unique contribution to funding change within a system? Also, which ‘musicians’ and ‘musical genres’ are we well suited to support, and which are a better fit for another’s niche or specialty? In the case of the FpTS, that means grappling with the opportunities and constraints of working with funders from the local private sector. For the GPSA, its institutional home at the World Bank cannot be overlooked. For many others, those opportunities and constraints will be shaped by the link with a government’s foreign policy, a founder, management and/or a board member, whose influence practitioners working at portfolio level know well.

3) Identify an appropriate time horizon for impact (and be realistic!)

Target conscientiously: Looping back to our first post reflecting on the feedback that the Hewlett Foundation received on its strategy, telling a narrative of progress is about showing results that stretch us collectively, without ‘throwing the baby out with the bathwater’. We recognize that TPA work is often messy and takes patient investment, but the process can lead to success if funders and other stakeholders keep at it together, as Louise Cord, the Global Director for Social Sustainability and Inclusion at the World Bank, put it.

For example, in the short- and medium term, you can set targets for the journey that are doable but not easy to achieve (e.g., other actors’ uptake and adaptation of interventions’ lessons — i.e., embeddedness) rather than expect unrealistic ones (e.g. wholesale scale-up through the adoption and implementation of an intervention exactly as one designed it). This way we may avoid feeding the discursive existential crisis of the next decade.

Connect MEL across strategy cycles: We often say that TPA work is the story of a marathon, rather than a sprint. So, we can see it as a type of relay race between the conductors in a school of music. Strategy cycles start and end, often informed by path dependence and, hopefully, learning from predecessors. The challenge is that we often forget to talk about the baton passed between the conductors in this relay race.

Evidence suggests that scaling up innovations takes a decade and that translating policy change into implementation tends to take at least five years. So, it’s not realistic to expect high-level impact for communities within only a couple of years, as is often expected. For this reason, we should seek to measure impact and progress over 5-, 10-, and even 20-year periods. For the medium term, we could use more in-depth reflection on iterations across those cycles — did the interpretations of the lessons from the last funding cycle that inform our current actions hold up over time, or not? For example, the Partnership to Engage, Reform and Learn (PERL) program is giving a series of webinars in October 2021 to share lessons from 20 years of different UK governance programs in Nigeria.

Similarly, a look at the World Bank’s approach to social accountability in the Dominican Republic since the 1990s shows that local and global contextual shifts, along with TPA and sectoral lessons, all informed changes in the Bank’s approach in country and elsewhere. All too often, those long-term histories are reserved for a select few who do the reflecting, have the benefit of inter-generational reflection or learning from their predecessors, and/or shift strategies from one approach to the next. In other words, we could use more comparisons to tell the story of these relays and the learning generated across strategy cycles over time.

4) Gather data purposefully and aggregate appropriately

Build filters before gathering data: The challenges of monitoring a portfolio’s work relate to its scope – the sheer volume of information and transaction costs associated with working with so many people and actors. The challenge is not so much around generating information as organizing it, constructing filters, and developing the systems to apply them so that the right information and indicators are available and consistently applied across project grants. As Clay Shirky (in Juskalian, 2008) asserts, “Without clear guidance, long qualitative narratives may be so variable that analysis, particularly comparative analyses, becomes extremely difficult”.

Aggregate data appropriately: An additional challenge with the portfolio structure may be demands for inappropriate aggregation: combining dissimilar investments, projects or outcomes to present several diverse projects in a simplified narrative. This is a familiar challenge for TPA professionals, as many of us have faced requests to aggregate index scores (e.g., civic space) and results, or to find ways to add contributions to women’s reproductive health and sanitation — without considering whether we are adding apples and oranges and/or ignoring (negative) interaction effects. An over-simplified approach risks glossing over important differences and nuances that may be valuable for learning, while ignoring contextual factors and over-generalizing.

The use of a set of functional equivalent indicators with the same units of measurement, based on common conceptual definitions within a portfolio of grants, can ease aggregation processes downstream. That is, the approach may help us to transfer meaningful, targeted information from implementers to managers, and subsequently use this knowledge to inform higher-level decision making and governance structures (e.g., Boards; Steering Committees; funders; and public officials funding specific civil society work).
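
As a rough illustration of how shared definitions and units can guard against ‘adding apples and oranges’ before results travel upstream, here is a small hypothetical sketch: results are pooled only when they share the same indicator and an additive unit, while everything else (such as index scores) is flagged for separate, narrative treatment. Field names, units, and the rule itself are illustrative assumptions, not a prescribed method.

```python
# Hypothetical sketch: pool only results that share the same indicator
# and a unit we treat as additive; flag the rest for case-by-case,
# qualitative reporting instead of forcing them into one number.

from collections import defaultdict

# Units we consider meaningfully additive across projects (an assumption).
ADDITIVE_UNITS = {"count", "people_reached"}

def aggregate(results):
    """Sum comparable results; return the non-summable ones separately."""
    pooled = defaultdict(int)
    flagged = []  # dissimilar results to narrate rather than sum
    for r in results:
        if r["unit"] in ADDITIVE_UNITS:
            pooled[(r["indicator"], r["unit"])] += r["value"]
        else:
            flagged.append(r)
    return dict(pooled), flagged

results = [
    {"project": "A", "indicator": "budget_items_disclosed", "unit": "count", "value": 12},
    {"project": "B", "indicator": "budget_items_disclosed", "unit": "count", "value": 7},
    {"project": "C", "indicator": "civic_space_index", "unit": "score_0_100", "value": 54},
]

pooled, flagged = aggregate(results)
print(pooled)   # {('budget_items_disclosed', 'count'): 19}
print(flagged)  # the index score stays out of the sum and is reported on its own
```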

5) Strive for utility to all: implementers, funders and the field as a whole

Construct a MEL system that works for implementation: There is a craft to portfolio management, including its MEL. It’s hard work and requires financial and management support. Failing to purposefully invest in sufficiently resourced MEL systems can backfire: it feeds and potentially deepens the field’s supposed “existential crisis”. We know (and experience ourselves) that organizations with very limited human capacity, financial resources, short time horizons and/or political space may find some of these examples unhelpful. We believe that it’s important to design MEL systems that can actually be implemented, which also means making well-informed trade-offs and compromises.

In our MEL choices, we prioritize questions and concrete contextual features. Prioritization, given scarce resources, entails trade-offs and short-, medium- and long-term risks, which we can either manage or sweep under the rug. We have learned that we should prioritize collaboratively and do internal advocacy to open space to create, course-correct and sustain over time compromise solutions that travel across the portfolio’s decision-making levels. This is often difficult because we are working with organizational restrictions, limited resources, technical criteria and, often, shifting politics within organizations and the systems in which they work.

In the face of these challenges, we should be transparent and manage risk, rather than set ourselves up for failure with unrealistic expectations. With an eye towards the portfolio-level narrative, funders and intermediary staff can help frame, and/or co-produce with partners, strategic questions that are broadly relevant to most stakeholders. They can also help us to identify lessons that may be applicable and could be tracked across multiple interventions to tell a collective story.

Prioritize focus areas for monitoring and learning, with key evaluation questions to apply across the portfolio (i.e., create filters and frames): As Al Kags argues, “The question of fostering active citizenship and indeed responsive government, is a complex one with a kaleidoscope of nuances. It is more like a set of puzzles, each of which have layers of contexts.”

For example, how do we tackle common challenges such as increasing the likelihood of scaling up TPA work? These are often areas of interest for many actors across the portfolio — from project managers to funders. The documentation of these common areas isn’t detailed enough for implementers on its own, but it can help them identify counterparts within a ‘school of music’ from whom they can learn, and with whom they might collaborate and go deeper into common areas of interest. One can then assess whether setting a top-down common approach, method, or broader parameters to answer the question has more advantages or disadvantages than letting each team define the methodology on its own.

Coda

For now, we’re pleased with the thoughtful dialogue generated so far on rethinking the TPA sector’s narratives of success and failure and the role that funders and other portfolio managers might play as they design and implement their own MEL systems, policies and practices. Others are engaging in this discussion and offering their own perspectives. We recently conducted a webinar on building mid-level theory, and participants showed a surprising level of appetite for a candid conversation about how to build this in practice for the TPA field, across organizations of different sizes and types. In a recent webinar on the future of anticorruption work convened by USAID, Achraf Aouadi (I-Watch Tunisia) put the issues (from his perspective) on the table, as did Ambassador Power (from hers). So, this isn’t just the view of three consultants. Rather than dismiss TPA portfolios as too ‘hard to measure’, let’s rise to the challenge and learn from each other. We encourage others to join in the discussion and let us know how you are managing these issues (and others) in your TPA programs.

We pose a polite challenge to funders out there: in addition to investing in your improved portfolio MEL systems in this new strategy cycle, you can also help the field by supporting relevant TPA actors to “play their MEL music together”, facilitating collective strategic thinking and the exchange of tricks of the trade.