|Thread title||Replies||Last modified|
|Some thoughts from Wikia's Danny Horn||1||10:02, 10 August 2010|
|Meeting w/ Omniture||0||18:35, 9 June 2010|
|April 7, 2010 update||10||18:55, 26 April 2010|
|Two articles on analytics||0||00:21, 6 April 2010|
|March 22, 2010 update||0||21:59, 22 March 2010|
|A number of areas where collected data would be useful||2||09:27, 11 March 2010|
|Analytics for editors||3||00:19, 11 March 2010|
I had lunch with Danny Horn today. Danny's a product manager at Wikia, and he's doing great work on incorporating measurement into product development and experimenting with new ideas in a systematic way.
Turns out they're using Google Analytics for click-tracking. Apparently, it's crufty, but it works. They've found some interesting things. For example, they found that anonymous users tend to click on the section edit links to edit, whereas experienced editors tend to click on the Edit tab. Danny experimented with making the Edit tab bigger and making the links green so they stand out more. That did result in more clicks, but he hasn't checked yet whether the Edit-to-Save ratio has gone down as well. (Note that this idea cropped up during IRC office hours/2010-03-31.)
Danny also did a lot of qualitative work to try to understand new user editing behavior. He's concluded that new editors are more likely to edit when there's already a lot of content there. For example, new editors are more likely to edit long articles than short articles. This makes sense (in retrospect); after all, one of the hardest things to do on a wiki is figure out where to put something. If the content is already there, that tells you where things go, and it's easier to edit something that's already there than it is to create it from scratch.
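As a rough illustration of the check mentioned above, here is a minimal sketch of computing an edit-to-save ratio per user segment from click-tracking events. The event format, the action names (`edit_click`, `save`), the segment labels, and the definition of the ratio as saves divided by edit clicks are all assumptions made for this example; none of them come from Wikia's actual setup.

```python
from collections import defaultdict

def edit_to_save_ratio(events):
    """Return saves / edit clicks for each user segment.

    `events` is an iterable of (segment, action) pairs, where action is
    "edit_click" or "save" (hypothetical names). Segments with no edit
    clicks are skipped to avoid dividing by zero.
    """
    clicks = defaultdict(int)
    saves = defaultdict(int)
    for segment, action in events:
        if action == "edit_click":
            clicks[segment] += 1
        elif action == "save":
            saves[segment] += 1
    return {seg: saves[seg] / n for seg, n in clicks.items() if n}

# Made-up sample data, for illustration only.
events = (
    [("anonymous", "edit_click")] * 4 + [("anonymous", "save")]
    + [("registered", "edit_click")] * 2 + [("registered", "save")]
)
ratios = edit_to_save_ratio(events)
```

Comparing these ratios before and after an interface change would show whether the extra clicks turn into saved edits or mostly into abandoned edit sessions.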
One of the things Danny is working on now is experimenting with badges to encourage participation. Originally, he wasn't excited about the idea, but when he tested a prototype with experienced users, he noticed everyone reacted with a visceral, "Oh!" when they saw they had acquired a new badge. Behavioral economics backs up the idea that rewards like this can incentivize behavior, and there are many great examples of this on the web. (There's also a proposal on this wiki: Proposal:Add game-like features.)
Two quick thoughts based on Danny's work:
- For the purposes of this task force, I lean more and more toward building our own open source tools (or on top of an existing open source package) to do this sort of thing. I'm amazed at how much Danny has learned just through click-tracking data, and we could build on top of Nimish's user experience click-tracking work to implement the same functionality fairly easily.
- Danny's work aligns very well with Howie's emphasis on user segmentation and the notion that we may want to treat different user segments differently. For example, we could optimize the interface to encourage new users to contribute and to help new users become experienced users.
Howie, Nimish, Tomasz, and I spent a few minutes chatting today about where things stand. Next steps:
- Experiment with a Piwik installation on Wikimedia Foundation's internal wiki. It gives us real data, and we don't have to worry about taking down a live public site. Once we get some experience with the tool, User:Erik Zachte will be our liaison with the Piwik community.
- Howie has invited Omniture to come talk to us about their tool. They'll hopefully swing by in two or three weeks. We'll be sure to document our notes here.
- Tomasz is talking to Google Analytics to get more information.
- I'll take a pass at fleshing out a rough framework for how to think through the build vs buy / open source vs proprietary questions.
Thanks! Have you played with it on an installation of MediaWiki by any chance?
I've now installed Piwik on my local MediaWiki. Piwik is functioning, but I don't see the Piwik special page in the wiki. I will try to learn how to write a plug-in, but I'm not an experienced programmer.
I also saw that Extension:UsabilityInitiative, which is active on Wikipedia, uses Click Tracking as part of the Beta rollout.
What I like is the programming language R, which can do cluster analysis (W:Cluster analysis). We have a Wikibook on it: Wikibooks:Data Mining Algorithms In R. But clustering works with numbers, not with click paths. I wonder if it is possible to transform click paths into numbers.
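One possible way to turn click paths into numbers is to count how often each page-to-page transition occurs in a path, which gives every path a vector of the same length. The sketch below is in Python rather than R, and the page names and sample paths are made up for illustration; it shows only one of several possible encodings.

```python
from collections import Counter

def transition_counts(path):
    """Count each (from_page, to_page) step in one click path."""
    return Counter(zip(path, path[1:]))

def vectorize(paths):
    """Turn click paths into equal-length numeric vectors.

    Each column corresponds to one distinct page-to-page transition seen
    anywhere in the data; each value is how often that path used it.
    """
    counts = [transition_counts(p) for p in paths]
    columns = sorted(set().union(*counts))
    vectors = [[c[col] for col in columns] for c in counts]
    return columns, vectors

# Hypothetical click paths, for illustration only.
paths = [
    ["Main", "Article", "Edit", "Save"],
    ["Main", "Article", "Article", "Edit"],
    ["Search", "Article"],
]
columns, vectors = vectorize(paths)
```

The resulting vectors could then be handed to a clustering routine (in R or elsewhere). Note that this encoding keeps only adjacent steps, so it loses the overall order of a long path.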
Checking it out. I don't know much about this area. Does this only measure traffic? There are a lot of other analytics that would help us.
Piwik is oriented toward traffic analysis, although it could potentially be extended for other uses. We see it as a potential first step, not as a catch all. Check out Task force/Analytics/Requirements and make sure the other analytics you'd like to see captured are recorded/linked there. I know many were discussed in the community health task force.
Wanted to share two links on analytics that I think are interesting. First, Google is writing an opt-out plugin so that people can opt out of being measured by Google Analytics.
Second, here's a strongly worded piece on "web analytics truths." These ideas are worth considering as we evaluate options for an analytics framework. I think it argues in some ways for building out our own open source analytics infrastructure... provided we are disciplined in asking the right questions and measuring the right things.
Would love to hear people's thoughts.
Last week, Howie, Nimish, Erik Z., and I chatted about the requirements that many of you helped capture. Howie took some excellent notes, which I integrated into the following pages:
Please take a look and feel free to post comments. The next steps are to prioritize the requirements and start evaluating possible tools. Help with both of these would be very welcome; respond to this post if you can help, or simply edit the pages listed above.
During the research for the Local language projects task force, there were a number of different kinds of data that would have been useful.
- Data that allows analysis of the correlation between localization of the MediaWiki software and the growth of the same local projects. Siebrand told me that he and Zachte had talked about something similar before. This would help in judging how important the localization work is.
- Data about average article size, broken down by content type (text, pictures, scripts, etc.), and how much of that content has to be loaded on average each time an article is loaded. Some of the material, such as scripts, could for example be cached by most users.
Data about the Wikimedia projects themselves is not the only data of interest.
- Data about connection speeds of the visiting user, as well as potentially visiting users, is one external parameter that is of importance to the projects.
- Another is data from Google, Alexa, etc. about what people are interested in reading about. Collecting such data together with similar information from the Wikimedia projects and presenting it to potential editors could help increase the number of editors.
Collecting data is one important thing; another is presenting the important data in an accessible way to anyone who could do something useful with it, for example visitors who are willing to edit but do not know where they can contribute.
I did some analytics by myself (Benutzer:Goldzahn/Logbuch). For example, I looked into the user creation log for a whole day, and I looked at what happens to a featured article when it is presented on the main page. I don't know if there is software that could do something similar; I counted the numbers by hand. My principle was to name the data I used, so that someone else could check the numbers. I think this is important if the analytics are meant to help decide whether something should be done one way or another. The listed requirements (Fundraising, Strategy) are for analytics needed for Foundation topics. What I did were analytics for editors, and since users wouldn't have access rights to the statistics server (I remember that the statistics server is provided by the Swiss chapter, right?), we would need something like statistics tools. We have something like Wikipedia:Graphic Lab; we could start an Analytics Lab too. Its members would write statistics scripts that would run on the statistics server. I don't know if such software or a scripting language for analytics on log data exists. --Goldzahn 11:35, 10 March 2010 (UTC)
I think it's both fair and desirable to collect requirements for other members of the movement -- editors, etc. I would love to see an open analytics lab that anyone would have access to and that could serve as an open infrastructure for doing research, analytics, visualizations, and experimentation.
Could you add some notes to Task force/Analytics/Requirements? Ideally, ideas for what to measure should be associated with why we want to see that data.