Proposal talk:Survey Former users

From Strategic Planning


I think we can tighten up the language and make it a bit cleaner (dare I say, scientific).

Otherwise, I think this is a good start. But we need to cut to the chase and try to ask why they left. It would probably be a "check off all that apply". If necessary, maybe break that up into categories. ... maybe put an open-ended question before the list of possible reasons, so that we don't lead them. Randomran 16:03, 4 November 2009 (UTC)

I've tried to cover this in question 2 - but feel free to rephrase - it's a wiki, after all. I do think we need to differentiate between those who had to suspend editing because of real-life events and those who chose to stop because of wiki-based events. My idea of starting with open-ended questions was that, though they are a pain to analyse, they don't miss out key answers. I would suggest that we start with open-ended questions until we have enough responses to create a drop-down menu. WereSpielChequers 17:15, 4 November 2009 (UTC)
As for making it more scientific, I think we need to beware of false overcertainty. You can of course ask people if they intend to do something in the next 6 months, 5 years, or 5-15 years, and calculate very scientific-looking data from the result. But it would probably give less meaningful results than asking realistic questions. One sign of a good consumer survey is that the questions are clear to the person filling out the form, even if that makes analysis less easy or "scientific looking". WereSpielChequers 17:32, 4 November 2009 (UTC)
Oh, we definitely can't go overboard with precision. I was just talking about making the language a little less informal. Let me take a shot at it. Feel free to revert or change things around or what have you. Randomran 18:17, 4 November 2009 (UTC)
Gave it a shot. It's good to have a mix of open-ended and questionnaire-type questions... the questionnaires let you be systematic and look for overall trends. But once you've seen the trends, you can use the open-ended feedback to look for more detail. Randomran 19:09, 4 November 2009 (UTC)
I'd be inclined to go the other way - start with the open-ended questions to find the frequent answers, then build the questionnaire to find out which answers are commonest. If you go the other way round you risk missing a major answer, like the recent big survey which forgot to include "because it's fun" as an answer to the question "Why do you edit Wikipedia?". WereSpielChequers 22:27, 6 November 2009 (UTC)
It sounds to me like you're suggesting two rounds of surveying. The first round, we'd just get some ideas of what might be causing editors to leave, and the dynamics of those reasons. The second round, we'd get a little more systematic, and try to separate the main reasons from the fringe reasons. Does that sound right? Randomran 22:32, 6 November 2009 (UTC)
Well, yes, that's a minimum. In practice, if the surveying is done electronically on the anniversary of someone's last edit, you can change or add questions on an ongoing basis, though you probably won't change them often, as you want to build large enough samples of people who've been asked the same questions that you can spot trends. WereSpielChequers 23:06, 6 November 2009 (UTC)
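The anniversary trigger described above could be sketched as follows. Everything here is a hypothetical stand-in: `last_edits` and the usernames are illustrative, and real data would come from the site's database rather than a dictionary.

```python
# Sketch: pick users to survey on the one-year anniversary of their
# last edit. `last_edits` is a hypothetical stand-in for real data.
from datetime import date

last_edits = {
    "Alice": date(2008, 11, 6),
    "Bob": date(2008, 3, 15),
}

def due_for_survey(today, last_edits):
    """Return users whose last edit was exactly one year ago today.

    Note: a Feb 29 last edit would need special handling, since
    replace(year=...) raises ValueError for invalid dates.
    """
    return [
        user for user, last in last_edits.items()
        if last.replace(year=last.year + 1) == today
    ]

print(due_for_survey(date(2009, 11, 6), last_edits))  # ['Alice']
```

Running this once a day would naturally spread the survey invitations out and, as noted above, let questions be changed on an ongoing basis while still building comparable samples.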
I have no knowledge of sound survey methodology but the survey looks good to me. Thanks to you both for your work on this. --Bodnotbod 14:12, 9 November 2009 (UTC)

Sample Bias

One of the key questions is how to deal with sample bias. In a normal survey you have broadly three approaches: weight the data, structure the sample, or add a caveat. The first two require you to know something about the population you are sampling from and the people you are trying to interview.

So an opinion poll for an election would tell its pollsters to get so many people of a particular age, gender and maybe some other variables that they could construct a representative sample of the population - using census or similar data to calculate what percentage of the sample each "cell" needs to be. So they might need as many in the "male 30-35 graduates" cell as in the "female 30-35 graduates" cell, but far fewer "male 75-85" than "female 75-85". Weighting needs the same sort of cell matrix, but instead of randomly discarding responses till you have the right proportion per cell, you keep all the data and weight it to make it proportional - so if you need twice as many "female 75-85" as "male 75-85" but you actually get four times as many "female 75-85", you would give each of those responses a weight of 0.5.

The difficulty with such approaches is that your results rely heavily on getting that cell structure right, so sometimes it is safest to simply give a caveat that there may be a sampling bias. Off the top of my head, the only thing I could think of to construct such a sampling matrix would be length of time on wiki × number of edits × time since leaving wiki. WereSpielChequers 23:01, 6 November 2009 (UTC)
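The weighting arithmetic described above can be sketched in a few lines. The cell names, target proportions, and response counts below are purely illustrative (they mirror the "female 75-85 gets weight 0.5" example), not real census or survey data:

```python
# Sketch of cell-based weighting: responses in over-represented cells
# are down-weighted so the weighted sample matches target proportions.
# All figures are illustrative, not real data.

# Desired relative size of each cell (e.g. per census data, the
# "female 75-85" cell should be twice the "male 75-85" cell)
target = {"male 75-85": 1.0, "female 75-85": 2.0}

# Responses actually collected: the female cell came in at four
# times the male cell instead of the intended two times
collected = {"male 75-85": 50, "female 75-85": 200}

# Raw weight = target / collected, then normalise so the reference
# cell ("male 75-85") has weight 1.0
raw = {cell: target[cell] / collected[cell] for cell in collected}
ref = raw["male 75-85"]
weights = {cell: w / ref for cell, w in raw.items()}

print(weights)  # {'male 75-85': 1.0, 'female 75-85': 0.5}
```

The same dictionary-of-cells approach would extend to a full matrix such as the one suggested above (time on wiki × number of edits × time since leaving) by using tuples as cell keys.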

I'm glad you know what you're doing :) Those are really good ideas. In fact, when you talked about structuring the sample, I really thought focusing on their life cycle would be a great way to structure the sample. It depends on what we're looking for, though. As useful as it is to survey the population at large, it may also be more useful to focus on former "core" users. 10% of our users do 90% of the edits, and figuring out why they burn out or leave is very important. For the 90% who don't get as involved -- most of whom leave in less than 15 days -- the question isn't why they left, but why they didn't embrace Wikipedia in the first place. We already know that depends on making Wikipedia more welcoming and usable. Randomran 16:36, 7 November 2009 (UTC)
I think the loss of former core editors is certainly worth researching, but I'm also keen that we learn why people tested us, made a few edits and then left. Questions for these groups do need to differ, as one group chose to leave the community and the other tested but did not join it. To get back to the bias issue, the most obvious biases we have with this method are that we lose people who did not enable email, or whose email has changed. Some of this could be tested by putting notes on users talkpages in the hope that they still look there, though I doubt if that would work, and the main group it would find might well be those who've undergone cleanstart but still watchlist their old talkpage. Alternatively we could do some analysis on these two groups to see if they differ in their pre-departure behaviour, the rationale being that the more similar the two groups are in that respect, the less one need be worried by the bias of omitting them. WereSpielChequers 13:47, 9 November 2009 (UTC)

Time to roll it out?

There's been no activity on this for a number of days. Shall I point Philippe and Eekim to it and ask them if they can either action this themselves or tell us how to go about it? We don't have a great deal of time. --Bodnotbod 11:20, 17 November 2009 (UTC)

  • I think it's a good idea. Maybe we just need to add a few more personal questions (age / education level / etc.) so we can weight the sample, and reduce the effects of bias. (Thanks to WereSpielChequers for pointing that out! And for getting this going!) Randomran 16:54, 17 November 2009 (UTC)
    • Happy to have this raised with them - I am a bit busy elsewhere at present. Just a quick point about weighting: the questions you ask need to be linked to the things you are weighting against. If we knew the typical ages or education levels of our editors we could ask our respondents and use that for a weighting matrix. As we don't, I would suggest you can only weight by things like number of edits. WereSpielChequers 18:44, 17 November 2009 (UTC)
      • Technical resources are somewhat limited right now by the fundraiser and some other items, but let me get with eekim this week and see what we can and cannot do to get forward motion on this. (Philippe, not signed in, from Brazil)
    • WereSpielChequers, I've designed some surveys for work (and earlier, in school), but I trust your expertise on this a lot more. I'll go with whatever you think is appropriate -- whether that's a lot, or something minimal. Either way, let me know if you need any help or input. I have this page watchlisted, but feel free to hit me on my user page. Randomran 19:48, 18 November 2009 (UTC)

Proposed additions

The "Answer: I would never come back unless (________________) happens." is very important, and I wonder if we shouldn't make it into its own dedicated question.

To "Why did you leave Wikipedia?": I think some questions are too similar and thus, not needed. For example:

  • "Because there was not enough response on my questions or concerns" and "Because there was not enough help to improve the articles that I felt needed improvement."; I'd suggest keeping the first one and loosing the second
  • "Because I felt other editors were being: rude or hostile / unreasonable or stubborn / deceptive or sneaky" seem the same; I'd suggest "Because I felt other editors were being uncivil towards me"

Suggested additions:

  • "Because I was treated unfairly and felt unwelcome"
  • "Because I have seen other editors treated unfairly"
  • "Because editing Wikipedia became to stressful"

--Piotrus 20:24, 25 November 2009 (UTC)

    • I support the additions. But I wouldn't necessarily drop distinctions between rudeness, stubbornness, and deception. Some people are actively antagonized off. But others face a much more passive "sorry, but I disagree, and will never agree" kind of obstruction. And yet others feel like they were conned, because of Wikilawyering and politics, and just got sick of that. I think studying the distinction is useful... although maybe there's a better way to get at it with a rephrase. Randomran 21:25, 25 November 2009 (UTC)
      • Well, I agree that finer differentiation is good, but I think we will end up with a very high correlation between all three answers. In my experience, they go hand in hand anyway, and if I were taking the survey, based on my experiences, I'd check all three. If we want to go that route, what may be better is to have a 1-5 ranking question with adjectives, something along the lines of: please rate, on a scale of 1 (very insignificant) to 5 (very significant), whether your reasons for leaving included interactions with editors whom you would describe as: adjectives go here. --Piotrus 00:40, 26 November 2009 (UTC)
        • That's a neat idea and might give us some insights into the real problem. I think it would say a lot if more people said that sneakiness drove them off the project, rather than open hostility. You don't think we kind of accomplish the same thing by asking them to click the X most true answers? If someone is clicking all three, that means they can't click on other choices. If I had to prioritize, I would have to pick stubbornness. I think other people would choose differently, which is exactly what we want. Randomran 03:51, 26 November 2009 (UTC)
          • Actually, come to think of it, I think the entire question 7 may be better if we allow editors to choose any number of answers, and rate how each applies on a scale of 1-5 (with the option not to rate an answer at all if it doesn't apply to you). --Piotrus 19:49, 26 November 2009 (UTC)

Reaching out

I think the survey should be displayed in the Wikipedia banner spot (like the 2008 General Survey was), and an invitation to it should also be emailed / posted to the talk page of every user who has been inactive for a month. We should also add an option to question 1) "Do you plan on ever returning to Wikipedia?" that would allow editors to answer "I've never left, I am just sometimes not editing for a month or longer." --Piotrus 20:26, 25 November 2009 (UTC)

The tricky thing about putting it in the banner spot is that it might be skewed by active contributors trying to make some kind of point. But I agree we need to think about how to publicize it to the right audience. Randomran 21:26, 25 November 2009 (UTC)
Well, as it stands, the survey is designed for editors who left, so current ones shouldn't participate in it. Although it may not be a bad idea to run two concurrent surveys, the other one for editors who are still here but are considering leaving ("Have you left Wikipedia? Please take survey A and tell us why. Are you thinking about leaving Wikipedia? Please take survey B and tell us why"). --Piotrus 00:37, 26 November 2009 (UTC)
I guess maybe we'd have to link the survey to their login in order to screen for that. Randomran 03:52, 26 November 2009 (UTC)

Access to data

We also need to address who will have access to the data. I think that the data should be publicly accessible to any interested researcher (the respondents of course should be made aware of that). --Piotrus 20:28, 25 November 2009 (UTC)

Also a good point. Randomran 21:26, 25 November 2009 (UTC)
The murkiness about the data sets from the 2008 survey is still high on my grievance list. At first, it was supposed to be a community project ([1]), and the community did help out a lot - but in the end it appears to have ended up as just another study of Wikipedia, one which got a great dataset that hasn't been made available to other scholars or the Foundation (AFAIK) beyond what the authors chose to publish :( Now, don't get me wrong: the survey and the resulting study are better than nothing (for all my fondness for GUS, it was all talk). But it could have been done much better... and we should stress this in our recommendations: we need an easy way to carry out regular surveys, and to make the data available to all interested researchers. --Piotrus 00:42, 26 November 2009 (UTC)