"Community Health" measures
"Community Health" measures
This discussion has been combined and edited at Community health/Metrics. Please add additional thoughts to that page.
Long post alert!
One of the things we want to track going forward is "community health"... which is multi-faceted, and doesn't lend itself to easy measures. There's been a lot of thinking and talking about community health -- this post is an attempt to summarize and reflect back some of that discussion, and to express some of my own thinking about how we can best measure it. Here are some of the measures I believe we can/should be thinking about...
1) Total number of active editors.
The Wikimedia Foundation has been tracking "number of active editors" (>=5 edits/month). There are currently about 100,000 active editors across all projects. I think many of us have talked about this extensively in various forums, particularly in the wake of the PARC work by Ed Chi, and Felipe Ortega's work, which spawned a number of inaccurate and alarmist media stories, including one in the Wall Street Journal with the headline Volunteers Log Off As Wikipedia Ages. Essentially: the total number of active editors started to decline in 2007, then stabilized, and has been flat since. We don't know what an appropriate number of editors might be for the Wikimedia projects: for example, it may be that mature projects require fewer editors than they did in heavy article-growth mode. Nevertheless, 1) it makes sense to track the number of active editors, because if the number plummets, we will want to know that, and 2) it makes sense to distinguish between mature projects and growing projects, and track active editors for both, because we would reasonably expect and want to see growth in active editors in projects that are not yet mature.
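The definition itself is straightforward to operationalize -- here's a minimal sketch in Python, using a tiny hypothetical per-edit table in place of the real data:

```python
# Count "active editors" per month under the >=5 edits/month definition.
# The 'editor' and 'month' columns are hypothetical stand-ins for real
# edit-log data.
import pandas as pd

edits = pd.DataFrame({
    "editor": ["A"] * 6 + ["B"] * 3,
    "month":  ["2010-01"] * 9,
})

# Edits per (month, editor), then count editors at or above the threshold.
edits_per_editor = edits.groupby(["month", "editor"]).size()
active_editors = (edits_per_editor >= 5).groupby(level="month").sum()
print(active_editors)  # 2010-01: 1 -- editor A made 6 edits, B only 3
```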
2) I think we will also want to track retention of active editors. Again, I don't think we know what the "right" number is -- it seems reasonable to expect that there is a certain amount of editor churn, which might be completely normal and healthy.
Bear with me while I make up a construct here: Let's say there are two subcategories of editor. 1) Let's say half of editors are permanently committed to the projects. They will take wikibreaks due to the ebb and flow of their outside-Wikimedia obligations, but as a group let's say their numbers are stable. And 2) let's say half of editors are what we might call "life-stage" editors: they join us while they are in post-secondary education, edit for let's say five years total, then stop editing as they shift their focus to careers and family. That would suggest that every year we would lose 10% of editors, and that 10% would be replaced by new "life-stage" editors coming in. If that construct were true, we would expect to see a "loss" of 10% of editors annually -- and as long as they were being replaced by 10% new people, we would likely consider that perfectly fine.
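To make that concrete, here's the construct as a couple of lines of Python (the 50/50 split and the five-year editing span are the made-up numbers above, not data):

```python
# The two-cohort churn construct as arithmetic. Both parameters are
# hypothetical: half of editors are "life-stage" editors, and they edit
# for five years before moving on; the other half are stable.

def expected_annual_churn(life_stage_share=0.5, editing_years=5):
    """Fraction of all active editors expected to leave (and be replaced)
    each year, assuming the permanent cohort is stable and the life-stage
    cohort turns over evenly across its editing span."""
    return life_stage_share / editing_years

print(expected_annual_churn())  # 0.5 / 5 = 0.1, i.e. 10% annual churn
```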
So the first thing we need to do is establish a baseline -- figure out what is actually happening today. If there's 10% turnover/churn annually, that by itself doesn't tell us much. But if the 10% goes to 20%, that would be cause for investigation. And if the 10% drops to 5%, that would probably be a good sign.
3) I think we need an editor "engagement" or "satisfaction" measure. I was talking about this in the office the other day with a few people. We probably don't want to measure "happiness" -- because the purpose of the Wikimedia projects isn't to make editors happy, and in theory at least, editors could be super-happy and yet not very productive. (If, for example, they decided to have wiki-parties all the time, and they formed great friendships and had lots of fun, but didn't write any articles.) But you do want to measure overall engagement/satisfaction. This is true in organizations too -- HR departments have learned over time not to measure employee happiness, but rather to measure employee engagement or satisfaction.
We talked in the office about surveying departing/departed editors -- as we did with the Former Contributors Survey -- but we realized that's not on-point. In part because it's difficult to know who's actually departed: one of the outcomes of the Former Contributors Survey was a bunch of respondents telling us they didn't feel like they'd departed; they had just been inactive for a while. (And some thanked us for prompting them to return, which was nice :-) But mostly it's off-point because the purpose of the measure is to help assess current community health, and "why people left" is only one small piece of that overall picture. It seems to me that the simplest way to measure community health is simply to ask people (via a regular survey) how engaged/satisfied they are feeling in their work on the projects.
Again, we need a baseline here. Let's imagine that at any given moment in time maybe 1% of active Wikimedia editors are unconstructive, don't share our mission and goals, don't really understand the work we're trying to do, and generally are unhappy because they're not aligned with us. They will leave us soon, but they haven't left us yet. And let's further imagine that at any given moment in time maybe 15% of active Wikimedia editors are feeling angry or unhappy about a dispute they're engaged in at that particular moment, although they are otherwise generally satisfied. And let's further imagine that 20% of the world is always going to report feeling dissatisfied, because that's just the kind of people they are. In that construct, we'd expect that a "normal" level of self-reported dissatisfaction would be 36%. So if we found ourselves with a 36% dissatisfied baseline, we would know that we're never going to get to 0% dissatisfied, but we might take steps aimed at trying to help some of the 15% be less situationally frustrated. In this construct, if we could get to something like 25% dissatisfied, that would be good progress.
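To make the arithmetic concrete, here's the construct as a few lines of Python (all three percentages are the made-up numbers above, not survey findings):

```python
# Hypothetical components of a "normal" dissatisfaction baseline, treated
# as additive, non-overlapping groups per the construct above.
bad_fit = 0.01         # misaligned editors who haven't left yet
situational = 0.15     # editors upset about a current dispute
constitutional = 0.20  # people who report dissatisfaction no matter what

baseline = bad_fit + situational + constitutional
print(f"expected baseline dissatisfaction: {baseline:.0%}")  # 36%

# If interventions helped most of the situationally frustrated group,
# the practical floor would be roughly:
print(f"plausible floor: {bad_fit + constitutional:.0%}")  # 21%
```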
So, I would say, we should launch a regular survey of satisfaction levels. It won't be easy to parse out the "bad fit" people from the situationally-dissatisfied from the constitutionally-dissatisfied, but that's what we should be aiming to do. And we should go in with the understanding that we'll never achieve 0% dissatisfaction, but that we should be aiming to trend towards less.
4) Editor demographics.
This is a really interesting and complicated piece, and there have been lots of good discussions about it in the strategy project generally, and on this particular page. Broadly, I think we want editor demographics to look more like the general population. I don't think we should aspire to map exactly against gen-pop, because I don't think that would be realistic or even desirable.
Basically:
- Some demographic skew is inevitable, and outside our control. For example, people in poor countries will always edit less --on the whole-- than people in rich countries, because people in rich countries have more leisure time, better connectivity and equipment, higher education and literacy levels and so forth. Similarly, women will likely always edit less than men, because they have less free time. We should still aspire to make it easier for those groups to edit, but they will likely never achieve representation-on-Wikimedia proportionate to their representation in the general population.
- Some demographic skew is --at least partly-- open to influence by us. For example, we have speculated that women would be likelier to edit if they were invited and thanked, and if there were increased opportunities for face-to-face interaction. By thanking, by inviting, by having meet-ups and conferences, and/or by specific targeted outreach, we would likely be able to attract more women.
- I hesitate to say this because it risks sounding elitist, but to a certain extent we don't _want_ gen-pop representation. I sometimes think the most important defining feature of Wikimedians is their unusually high intelligence. Jimmy has sometimes posed the rhetorical question: what kind of person edits an encyclopedia in their spare time, for fun? (Answer: smart geeks.) By that very fact, we know that Wikimedians are generally extremely intelligent. And we know that Wikimedia biases to encourage smart people -- a large part of reputation here is driven by doing work that visibly manifests intelligence, or is dependent on being intelligent. So, it makes sense to me that editors might skew better-educated-than-average, more professional-career-than-average, maybe even higher-income-earning-than-average. (I want to say here: I'm not saying that less-educated, less-professional-career, lower-income people are by definition less intelligent -- for many individuals for many reasons, that is of course not even remotely true. But I am saying that if Wikimedians are somewhat better-educated, more likely to be in professional careers, and higher income-earning, that shouldn't surprise us, and --as a fact by itself-- it shouldn't necessarily trouble us.)
So upshot on demographics: I think that we do not want to map identically to gen-pop. But I do think we should aspire to map somewhat more closely to gen-pop, particularly in the areas where we see a huge gap -- e.g., gender. I think the projects will be better and richer and more comprehensive if we have input from people who are currently underrepresented. So I think we need to use the UNU-Merit data as our baseline, and track change-over-time, with the goal of coming somewhat closer to gen-pop than we currently are.
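One way to make "coming somewhat closer to gen-pop" trackable is a simple representation-gap number per group, compared across survey waves. A sketch, with placeholder percentages rather than actual UNU-Merit figures:

```python
# Representation gap per demographic group: gen-pop share minus editor
# share. A shrinking gap over survey waves means we're moving closer to
# gen-pop. All numbers below are placeholders for illustration.
genpop_share = {"women": 0.50}
editor_share_by_wave = {"2010": {"women": 0.12}, "2011": {"women": 0.15}}

for wave, shares in sorted(editor_share_by_wave.items()):
    gap = genpop_share["women"] - shares["women"]
    print(wave, f"representation gap (women): {gap:.0%}")
```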
This is a big long post! I'd be curious to know what you all think. Basically, for community health: do these proposed measures feel roughly correct? And are there really significant measures that are missing?
I think what I've written here is largely consistent with the goals-in-development on the "movement priorities" page -- essentially, I am wanting to talk a little about it here, before doing some editing of the page itself.
Zack (Exley, the new CCO) and I have done a little bit of thinking about what "micro-measurements" we could begin to track that would result in some trending that could roll up to community health measures. These are obviously very detail-oriented, and would have to be fairly carefully interpreted, but we've come up with, as a starting point (a rough sketch of how a couple of these could be pulled automatically follows the list):
- Number of edits by admins
- Number of blocks
- Admins with most blocks
- Number of speedy deletes
- Number of posts to ANI
- Number of new admins
- Highly active admins
- Admins inactive for 30, 60, 90, 120 days
- Number of article deletions
- Number of full process deletes
- Inbound OTRS tickets
- Outbound OTRS tickets
- Articles with most reverts
- Articles by edits (trending)
- Number of reverts
- Number of edits
- Number of new articles
- Number of new files
- Number of new users
- Highly active users
- Users inactive for 30, 60, 90, 120 days
- Number of reports to AIV
- Global blocks
- Number of steward activities
- Number of permissions changes
- Help requests in #wikipedia-en-help
- Previously active users who are newly inactive
- Help requests using {{helpme}}
- Ratio of human to bot edits
- Active users to admin ratio
- Number of orphaned articles
- Ratio of orphaned articles to total articles
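Purely as a feasibility sketch, here's how a couple of these counts might be pulled from the standard MediaWiki action API. The endpoint and parameter names are the real API ones; the aggregation choices (a 30-day window, a 500-entry cap with no pagination) are illustrative assumptions, not agreed definitions:

```python
# Pull two example micro-measurements from the MediaWiki action API:
# site-wide statistics, and recent entries in the block log.
import requests
from datetime import datetime, timedelta

API = "https://en.wikipedia.org/w/api.php"

def site_statistics():
    """Site-wide counters (total edits, articles, active users, admins...)."""
    r = requests.get(API, params={
        "action": "query", "meta": "siteinfo",
        "siprop": "statistics", "format": "json",
    })
    return r.json()["query"]["statistics"]

def recent_block_count(days=30):
    """Block-log entries over the last `days` days (capped at 500 here;
    real tracking would paginate using the API's continue parameters)."""
    since = (datetime.utcnow() - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    r = requests.get(API, params={
        "action": "query", "list": "logevents", "letype": "block",
        "leend": since,  # logevents runs newest-to-oldest, so 'end' is the older bound
        "lelimit": "500", "format": "json",
    })
    return len(r.json()["query"]["logevents"])

stats = site_statistics()
print(stats["activeusers"], "active users;", stats["admins"], "admins")
print(recent_block_count(), "blocks in the last 30 days")
```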
That is a kind of hilarious list! But yeah, I can see the value in those measures. As long as it all could be viewed rolled-up into a single green/orange/red status, I would be happy :-)
Yours sincerely,
Sue "Simple Is Good" Gardner
Dear Dr. Simple is Good,
Absolutely. But in the meantime.... MOAR DATA PLZ!
Philippe actually nailed it, in my books. I don't think we've quite yet gotten to the bottom of what *really* affects community health, so we really need to track a lot of different numbers.
Satisfaction is probably more accurate than happiness. But the survey actually showed that some of our most dissatisfied editors are among our most engaged. Not sure why that is. Maybe it's pathological OCD. Maybe it's that more engaged editors are willing to accept higher levels of BS. But the real point: we're not going to be able to measure health just by checking the population for satisfaction and growth. Dissatisfaction and stagnation are symptoms, not the diagnosis.
The survey focused on newer editors... and it actually seems there are three killer stats to look at (a toy sketch of the third follows the list).
- One would be how many editors abandon edits without pushing save. (Obviously some amount is normal. But, as Sue noted, it would be a huge sign of improvement to get from 50% abandonment down to 40%, or whatnot.) That would measure how easy and convenient people are finding it to edit.
- Two would be how many edits are reverted. Again, some amount is normal. But a revert shows a problem on two ends. On one hand, it shows a community that is hostile to change. On the other hand, it shows an influx of editors who may be making inappropriate changes that upset community norms. Whose fault is it -- the reverter or the revertee? It almost doesn't matter. As much as reverting is natural, we know that too much is a bad sign.
- Three would be activity at dispute resolution pages. Drama usually goes there. Drama will probably grow with the population (more people means more disputes), but if it's growing faster than the population then we have a problem with community health.
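To pin down "growing faster than the population", the third stat could be as simple as this (the numbers are invented for illustration):

```python
# Flag a community-health problem when dispute-resolution activity grows
# faster than the active-editor population over the same period.

def growth(prev, curr):
    """Fractional period-over-period growth."""
    return (curr - prev) / prev

def drama_outpacing_population(editors, drama_posts):
    """True if dispute-page activity grew faster than the editor base.
    Each argument is a (previous_period, current_period) count pair."""
    return growth(*drama_posts) > growth(*editors)

# e.g. editors up 5%, dispute-page posts up 20%: worth investigating.
print(drama_outpacing_population(editors=(100_000, 105_000),
                                 drama_posts=(1_000, 1_200)))  # True
```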
It would also be REALLY useful if we could slice up the dataset. Imagine that we could find out that we're getting more editors, but only for articles about music! Then we could look at other stats around music articles, and figure out what's helping that music sub-community grow while other parts of Wikipedia are stagnating. Numbers are just data. But when you can compare them to something, you can understand what the heck is really going on.
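And a minimal sketch of what that slicing could look like, assuming we already had per-edit data labeled with an article topic (the topic labeling is the hard part, and these column names are hypothetical):

```python
# Count distinct active editors per topic per month; comparing the columns
# shows which sub-communities are growing and which are stagnating.
import pandas as pd

edits = pd.DataFrame({
    "editor": ["A", "B", "A", "C", "D", "B"],
    "topic":  ["music", "music", "history", "music", "history", "science"],
    "month":  ["2010-01"] * 3 + ["2010-02"] * 3,
})

per_topic = (edits.groupby(["topic", "month"])["editor"]
                  .nunique()
                  .unstack("month", fill_value=0))
print(per_topic)
```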
I love the idea of segmentation by article type, Randomran. I'm going to continue to think on that a little. Really great idea.
This was a really great thread. I took a pass at combining and editing all of the suggestions at Community health/Metrics. I'd encourage you to post further ideas and discussion directly to that page.