On Mon, May 19, 2014 at 7:06 PM, Edward Galvez <egalvez(a)wikimedia.org> wrote:
Hi Pine,
Thank you for bringing this page to our attention and for raising
these interesting questions. I would have to agree that the “Program
evaluation basics” page is not well-designed and should be revisited. We
are actually going to be redesigning the entire evaluation portal soon and
this page will likely be revised and included in the new design in some
way. We are also continuing to build tools and learning resources (like the
learning modules [1]) on evaluation to help explain some of these concepts.
I also agree that we need to think more about how we can define “impact”
within the context of Wikimedia. Before we can reach a final “impact”,
there are different layers of success in terms of outputs and short-,
intermediate-, and long-term outcomes that help to measure success along
the way.
We have been working on this approach to evaluation—we have developed
resources for mapping a program’s theory of change in order to identify
measurable outcomes, both near and far. Specifically, logic models are a
useful tool for drawing out the steps needed to reach long-term impact and
identifying more immediate indicators for evaluation; there is a resource
page within the Evaluation portal on logic models [2] and I am working on a
learning module that will guide anyone through what a logic model is and
how to create one. As for the term “impact”, it is very jargonistic and
can be used in many ways, which can be confusing. Since we began last year,
we have been working to generate a growing glossary of a shared language
around evaluation [3]. That glossary page is more current and inclusive
than the original “Program Evaluation basics” page you linked to. Please
feel free to discuss this and any of the other terms and definitions
there on the portal.
Coincidentally, we are asking the community to provide feedback on some of
the initial evaluation capacity building efforts our team has engaged in
thus far. We’d like to hear feedback on the metrics and methods used so we
can continue towards a shared understanding of Wikimedia programs and their
impacts. We invite you (or anyone!) to read about the Community Dialogue
[4] and join in the discussion on the Evaluation portal Parlor [5].
As always, I’m available for any questions!
Best,
Edward
[1]
https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Learning_modules
[2]
https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Library/Logic_mo…
[3]
https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Library/Glossary
[4]
https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Parlor/Dialogue
[5]
https://meta.wikimedia.org/wiki/Programs_talk:Evaluation_portal/Parlor/Dial…
Interesting exchange, thanks guys.
This particular topic needs a great deal of attention - not just because of
how crucial it is to measuring success, but also because it has
traditionally been both difficult and sensitive. Sue and others have raised
questions over the years about how we determine if the various programs run
by the WMF and chapters are useful or not, and if so to what degree. The
WMF and the Program Evaluation team are just beginning to take steps to
answer these questions, and in my opinion much more needs to be invested in
this effort. I would like to see compliance with program evaluation
standards integrated into every grant of funding drawing on donor funds. To
smooth the way for this increased level of scrutiny, each grant of any type
should include an earmark for just this purpose.
Why? Because ultimately we are where we've always been -- with clear
knowledge of what "impacts" matter but difficulty in working out whether
anything any movement partner does or has done helps the bottom line. Tens
of millions of dollars a year get spent, but most non-core spending would
be hard to justify using strict measures of impact. That doesn't mean it
doesn't *have* impact, just that because we don't forcefully ask the
questions, we don't and can't get the answers.
Every project, chapter, grant, initiative and expenditure should be
scrutinized with basically the same few questions:
1) Does it add to the quantity and / or quality of content?
2) Does it add readers, either by increasing interest or improving
accessibility?
3) Does it add editors?
Any major expense, grant request or new initiative should be measured by
the answers to these questions, and every answer should be quantifiable to
some degree. I would suggest that if the answer to all three is no for any
non-core expense, heavy scrutiny should be applied to ensure funds aren't
being wasted. The FDC does this to some extent now, although it asks the
same questions much more vaguely and in terms of strategic alignment.
The logic models are useful tools for thinking through and explaining to an
audience the structure and goals of a program, but they are vulnerable to
the same fuzziness that exists without the tools. They are also not well
oriented to measuring performance, which is really the crux of the problem
and of Pine's question. Let's look at the logic model you've used as an
example from the WikiWomen's edit-a-thon[1]. Their logic model is great at
explaining the goals of the program. This is a major improvement,
particularly if it is standardized across all WMF-funded projects. But does
it help us answer the question about impact? Using the Boulmetis and Dutwin
model of analysis, we can get clear information about program efficiency
and program effectiveness. But we don't get anywhere on impact, despite the
use of the logic model.
The risk here is basically that the movement spends millions of dollars
going down the wrong road. If we spend a decade funding editing workshops
and Wikipedia Education Programs, and only at the end discover that they
are completely ineffective, the opportunity cost (both in financing and in
volunteer energy) would be enormous. By the same token, if we spend a decade
funding chapters whose principal activities are ultimately judged to be
ineffective, the scale of volunteer disillusionment could be breathtaking
and the failure could threaten the entire movement. The WMF needs to focus
on common program activities, drill down deeply into each one, and actively
discourage any program (across all affiliate groups) that doesn't
demonstrate its impact value.
Judging by Meta, I think Edward and the PE team have made a great start. But
it's 2014 and the WMF is still at a starting point. Proposing that funding
requests include SMART goals is not good enough, and I'd love to see Lila
and the board empower Edward to do a lot more, and to insist on deep
cooperation from entities receiving funds. At some point in the future we
can move this discussion from "does anything anyone does have any impact?"
to "knowing that we *can have an impact*, how much impact is enough to
justify funding?"
[1]:
https://meta.wikimedia.org/wiki/File:Editathon_LM_Stierch-page-001.jpg