Over the past few weeks, there have been some great discussions about the task force recommendations. There's some great energy here on this wiki, and I want to start moving toward completion. That includes:
- Integrating the feedback into the existing recommendations
- Filling in gaps (areas such as movement roles, expanding content, and reader conversion)
- Evaluating and prioritizing the recommendations
- Writing a draft plan
To get this work done, I'm proposing the creation of a Strategy Task Force. I hope that you all will read and help refine the proposal, and I especially hope that many of you sign up for the Task Force. Let's also move the discussions there so that we can have a central place to discuss next steps for strategy. Thanks!
What are we recommending to the Foundation? If I look at Recommendation 2, it has assertions, but nothing to 'bite' on. Are we just saying this is a good idea, there should be a reader, and it will use openzim as the content format?
I think we want less rather than more for our recommendations. Short, to the point, no waffle.
Wizzy, could you write something "with bite" for this? I don't even own a cellphone, and I feel like I'm the last person who can write that recommendation. (though I did add a couple of links & refs) Could you have a go? I will email you the log of last Thursday's discussion, which covered this. I'll also try to write a short summary, though of course I'm running out of time!
Do you think I'm waffling too much in nos. 3 & 4? I'm trying to keep the points down to 1-2 sentences, but some are complex issues. If we oversimplify, they may think we haven't thought about the issues in depth; if we are too long, we run the risk of them losing interest and/or missing the main points. Not sure where the best balance lies! Please edit as you think appropriate, and we can always argue afterwards about anything we think really matters. Cheers, Walkerma 18:06, 11 January 2010 (UTC)
Do these three somewhat reworded points, which I provided on the cellphone talk page, help in writing down a recommendation?
- Ensure that there is an application suitable for mobile phones that can hold a large number of articles
- Compile a list of important and well written material for each country/region/language
- Convince mobile phone developers to preload their phones with material from the list that corresponds to the country/region/language of distribution.
I looked through some of the other 'Recommendation' pages - it seems Local Languages has the best format. The 'Strategy' section is what contains our recommendations - the Assertions, etc. are just to back it up.
So we write a Question, and answer it in a Strategy statement.
I will try to refactor our pages along that way.
Dafer45, thanks for pointing out that discussion - I would have missed it. I like the SMS suggestion, I added it to the main recommendation - the others I think we have covered.
I really dislike the first strategy statement. It is specific to companies/products, and not to processes.
Support third party developers/providers of open offline storage standards (such as OpenZim), readers which use them (such as Linterweb), and proprietary offline solutions (such as WikiPock.)
Encourage development of non-internet distribution systems, eg. SMS article requests.
I just linked to a new whitepaper in Mobile reach:
Thought folks here might find it interesting.
As I mentioned at the last IRC meeting, there is one major topic that we have yet to cover. This relates to no. 6 in our mandate:
- "Offline usage of the Wikimedia material inherently makes participation difficult. Are there ways to overcome this challenge, to enable offline readers of the Wikimedia projects to also make contributions?"
Proposals related to this include:
- Proposal:Third World Wikipedia Mirroring (very bare-bones description!)
- Proposal:Editing in Internet cafés (little more than a statement!)
I asked SJ (via email) a series of questions about this issue, which (I believe) he mentioned in his Wikimania talk in 2009. SJ currently works for One Laptop per Child (OLPC), and I believe he has talked with offline users in remote Peruvian schools about this sort of thing. He had a lot of fantastic ideas, which I'd like to list as the foundation for a recommendation on this issue; please note that he is focused on offline uses in schools, mainly for Wikipedia:
- Offline editing, periodic synchronization.
- Encourage large-group newbie editing. Find ways to identify blocks of new users, make sure they are welcomed by mentors and watchers rather than vandal-fighters and spelling zealots.
- Support school projects to edit WP, especially about local topics. Encourage mentorship of young editors; build guidelines that don't reflexively delete anything the reader hasn't heard of as NN. Improve notability guidelines by making it easier to write about people/places/events that have no web presence for lack of connectivity.
- Find mentors for each new community interested in solving that problem. Don't dictate from on high, support the idea and encourage the growth of these networks.
- Encourage local editing through contests and other high-profile events making editing a cool thing to do within a community at universities and schools; where you can find mentors for kids and others from that community in the future. [see the Kiswahili Wikipedia Challenge for an example I've been working on recently]
- We shouldn't hold articles about new and hard-to-source topics in a holding area, where we want to expand, to the standards we use for commonly known topics with thousands of references. Perhaps a different article-page template indicating this is an article about a new and developing topic, is of extra interest but lessened verifiability; and could particularly use corroborating cites, sources, and edits.
I don't think any of these proposals could be implemented tomorrow: they all require either significant organization on the ground or major changes in practice. Still, I personally like them, and they indicate (for me) a viable path towards engaging whole new communities of contributors. In order to facilitate the implementation of SJ's proposals, I think we need to do the following:
- Work with NGOs - particularly those interested in educational initiatives such as OLPC, to develop a system whereby mentors and organizers can be in place locally, and people working in the local languages can contribute.
- When the time is right, start a discussion on how the community can work with groups contributing from offline, and how we should treat the idea of "reliable sources" and "original research" differently when we don't have any available! A town in Africa will often have very little information available in census data, peer-reviewed journals, etc., so it may be hard to verify even current data. As SJ suggests, we may need templates to flag this type of issue, and we may need experts serving as advisers and "welcomers" in the online communities.
- Hardware and software will need to be developed that make it easy to edit offline as a group, then upload the information from there. The proposals at the top mention this type of idea.
Please make your comments here; I also hope that we can discuss this topic at the IRC meeting on Thursday, at least briefly. Walkerma 10:03, 7 January 2010 (UTC)
Generically speaking, the concept is good but has substantial technological challenges depending on the scenario:
- Non-internet-accessible school site, long period synchronization
  - Scenario: synchronizes local dump at start of school year.
  - Multiple classes of students with editing integrated in the curriculum over the course of the school year.
  - End-of-school-year synchronization of thousands of student edits to revisions now 9 months out of date
  - High number of edit conflicts predictable
  - Single synchronizer (school teacher?) becomes responsible for resolving each edit conflict for the entire school even though xe may not have relevant knowledge.
- Single-user short period synchronization
  - Scenario: cellphone synchronizes automatically every week.
  - Edit conflicts, after being resolved by hand, are then delayed until the next week's update, at which point they may have developed another edit conflict.
- Any remote scenario
  - en.WP admins/CVN see a blast of edits from a given user/IP, especially if poorly formatted, and their likely response is to block/rollback all contributions.
  - Most such synchronizations would be effectively indistinguishable from bot edits, and may be in violation of some projects' bot/automated edits policies
Some of these may be resolved via technological measures, such as a more contextual edit resolver, but it is unlikely they could be 100% resolved. I would question whether - at this stage - this should be a primary recommendation.
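To make the edit-conflict concern concrete, here is a minimal sketch of how a synchronizer might separate offline edits that can be applied cleanly from those needing hand resolution. It assumes each offline edit records the revision ID it was based on; `OfflineEdit`, `classify_edits`, and the revision numbers are hypothetical names for illustration, not part of MediaWiki.

```python
from dataclasses import dataclass


@dataclass
class OfflineEdit:
    title: str
    base_rev: int   # revision the offline copy was synced from (assumed to be recorded)
    new_text: str


def classify_edits(edits, current_revs):
    """Split offline edits into cleanly applicable ones and conflicts.

    current_revs maps article title -> latest online revision ID.
    An edit conflicts when the online article has moved past its base revision.
    """
    clean, conflicts = [], []
    for edit in edits:
        if current_revs.get(edit.title) == edit.base_rev:
            clean.append(edit)
        else:
            conflicts.append(edit)
    return clean, conflicts
```

With a nine-month gap between synchronizations, most heavily edited articles would land in the `conflicts` list, which is the burden on the single synchronizer described above.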
Amgine says :- I would question whether - at this stage - this should be a primary recommendation
- I agree. Take it out.
I have summarised the Jan 05 IRC meeting on the IRC page.
I have altered Recommendation 1 accordingly - mainly to add a Parser section.
I asked SJ some questions via email (he's traveling right now), and he sent me very detailed answers, which I think are very helpful for our discussion. SJ is on the WMF board of trustees, and he has experience of bringing WP to schools in remote areas of Peru (with no internet anywhere near), via One Laptop per Child. You can read his comments here.
Just some comments on Task force/Offline/SJ Q&A.
- Do schools represent a good distribution point for Wikimedia content in general?
I would add that one of the best things about wikipedia in third-world schools is self-paced learning - too much education is still done with rote-based learning, and this provides a refreshing difference.
A downside about computers in general in the classroom (touched on by SJ) is the teachers feeling like they are being disempowered - they like firm authority and control, and they feel threatened by computers, particularly information sources like wikipedia. It is the teachers' problem - but since they can so easily derail our efforts it must be taken into account.
Not sure if that fits anywhere in our recommendations - but just don't try to bypass the teacher.
- What format would OLPC like the content to be in? XML? OpenZIM?
SJ says :- It would be helpful for "everything" to be available in "a suite of formats". I am not sure I agree - unless these are automated, subsidiary formats from a Master - which I imagine will be XML.
- What content would you like to be available for OLPC, besides Wikipedia?
SJ mentions a featured image collection - I like that.
Cross-linking between different wiki modules is sometimes requested. Not sure how this would be done, but that would be great. I would love to have a bunch of different 'modules' - like (for instance) History, and Africa, and have them interlink. I guess you could just leave everything redlinked - but that is for the XML Parser people to think about.
He mentions wikibrowse - that (corrected) link describes the process that generates their selection - interesting, if you compare it to our IRC discussions.
- Offline editing
Definitely on the wishlist, but I can't see it happening yet. A local collaborative effort on some chosen areas, with a post-project merge, is the only way I see it working.
He mentions that still privileges the rich. I really think this will go away. America is so behind when it comes to cellphones.
I've greatly expanded recommendation nos. 3 & 4 as mentioned last week on IRC. These still need work/finishing off, but the basics are there. They also still need supporting facts, though most things are pretty obvious (things like that schools have teachers). Please review and leave comments - also, solicit comments from others.
Hi! I have summarized the reach and regional analyses in this document and realized that it might be of interest to you.
He wikilinked the word "this". :) Task force/Local language projects/Commons and differences in reach and regional analyses
I just got through reading the content from this Task Force, including all of the chat logs. First, congratulations. You are all making tremendous progress, and I'm extremely psyched about where you are.
There are a couple of next steps I'd encourage you all to think about moving forward:
- Summarize your discussions. It would be useful if some group of you went through the chat log and incorporated them in the appropriate wiki pages. There's tons of good data there, and it would be good to bring it to the forefront. Moreover, I think this will help propel the discussion from this point forward.
- Take a second pass at the Key Questions included in your mandate. How would you answer those questions now that you've taken a deep dive into the issues?
- Be careful to separate the high-level recommendations from the detailed tactical work. For example, your first recommendation is to make the data dumps easier to use, and there are some low-level suggestions on how to do that. I would move the low-level suggestions to a proposal and link to that. The recommendation itself should explain why making data dumps easier to use is a priority, and what sort of impact it would have on fulfilling the Wikimedia vision over the next five years.
In your meeting last week, I promised to check with Sarah to see if she'd made any progress on finding a mobile phone industry contact. She's been trying diligently, but the one that was initially suggested isn't responsive to emails. So she's still looking. :)
- I had an excellent phone conversation with Patrice from WikiPock, and he is able to join us on IRC on Tuesday 22nd, if people are OK with that. It looks like they would work well with the community. I would still like to contact people from Wapedia and Collison as well. I'm mainly interested to know what would make their lives easier (in terms of formatting), and what would encourage & enable them to develop products suitable for developing countries.
There's a template for weekly reports on the task force's page. They're important for a couple of reasons: first, they let the group distill its thinking, and second, they help others get a quick overview of what's going on for the task force.
I wonder if you all would take a stab at one, so that I can link it from the Task force page?
- I'll try to start working on this in a week or so; right now I'm in my final week of classes, with final exams coming up soon. Till then, my summaries of the IRCs show the same information, just in a different format. Cheers, Walkerma 03:35, 10 December 2009 (UTC)
I've used offline Wikipedia implementations like Moulin. A major weakness of solutions such as these is that it is hard to get small updates. I have to download the whole DVD image to get an update. Over slow internet connections like those here in Cameroon, this is very time-consuming.
There needs to be a mechanism to update changes from hour to hour, day to day, month to month depending on the preference of the user. Also a way to share updates with others once they are downloaded would be quite useful. Perhaps being able to extract updates from the offline database newer than a given date and save them to a file. Saving those updates to a USB drive and then using it to update multiple computers seems like a good solution.
Different users could set up an update schedule that works for them. For instance, the university I work for would benefit from having an offline mirror of sorts that could update every few hours, while the internet cafe in town could benefit from updating their content every few days. Home users without internet could get updates from the internet cafe, if they wanted.
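The "extract updates newer than a given date, carry them on a USB drive, apply them elsewhere" idea could look roughly like the sketch below. It assumes the offline store is a simple mapping from title to a record with an ISO 8601 `modified` timestamp; the function names and record layout are hypothetical, and a real reader's storage format would differ.

```python
import json
from datetime import datetime, timezone


def export_updates(articles, since):
    """Serialize every article modified after `since` (a timezone-aware datetime)."""
    changed = {
        title: rec
        for title, rec in articles.items()
        if datetime.fromisoformat(rec["modified"]) > since
    }
    return json.dumps(changed, sort_keys=True)


def apply_updates(local, payload):
    """Merge an exported payload into a local copy; return how many articles changed."""
    applied = 0
    for title, rec in json.loads(payload).items():
        old = local.get(title)
        # Only overwrite when the incoming revision is strictly newer.
        if old is None or (
            datetime.fromisoformat(old["modified"])
            < datetime.fromisoformat(rec["modified"])
        ):
            local[title] = rec
            applied += 1
    return applied
```

The string returned by `export_updates` is what would be written to the USB drive; applying the same file twice is harmless, so one file can update many computers.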
(1) Making an offline version of MediaWiki content is a not-so-easy and time-consuming task. (2) Doing such work periodically and providing incremental updates is worse. (3) Giving these incremental updates a precision of an hour is even worse.
As a technical and well-experienced expert in the domain, my opinion is: (1) Should be a topic: how to do that correctly for all our projects. (2) This is a pretty technical challenge. The solution is storage-format dependent; as far as I know, nobody has published anything about that yet. (3) Pretty unrealistic currently; I am also not sure this is necessary for end users, nor that it is theoretically possible without introducing heavy side effects.
I'm interested in any idea/work/proposition about how to resolve (2) with the format ZIM.
So could there be an automated process for getting online data, offline? Something similar to the data dumps, but be able to request data added after a specific time stamp? If we intend to let people redistribute it, we should have some kind of verification.
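On the verification point, one simple option is a published checksum that a recipient compares against the file they were handed. This is only a sketch under that assumption: the function names are hypothetical, and a checksum only helps if the expected digest comes from a trusted source; it is not a cryptographic signature.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """SHA-256 digest of an update file's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_update(data: bytes, published_digest: str) -> bool:
    """True if the redistributed file matches the digest published by the source."""
    return hmac.compare_digest(digest(data), published_digest)
```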
As for the offline format, Zim may work, I don't have much knowledge of that project. However I will be looking into it as time allows.
"could there be an automated process for getting online data, offline?" Yes, we can. The Foundation will certainly do it and use the ZIM file format. Thomasz is in charge of such stuff on the WMF dev. side. It would be great to involve him in our discussions. This will be intensively discussed on 22 November in Basel during the next OpenZIM dev meeting.
I'd like to see us putting out a specific collection - let's consider (say) Cameroon French release Version 1.700. This would have a broad selection of general topics, plus very thorough coverage of Cameroon and neighbouring countries. Let's say that's released on January 1st, 2014. We might then put out monthly updates - so on February 1st you could get Version 1.701, on March 1st 1.702, etc. These would be the same articles, but updated versions. Then, at the end of the year, we might review the actual content, add a few new articles and remove some that have become less important. On January 1st, 2015, you would be able to get Version 1.800, and the cycle would begin again.
Much of the work for this in en:WP and fr:WP could be done using the WikiProject assessments and SelectionBot - hopefully this or other solutions will become available elsewhere soon. To do this type of versioning would require three main things that are new, in order to work well:
- A reliable method for selecting vandalism-free article versions. I believe that WikiTrust will provide this.
- A good system of organization for releases which requires little manual maintenance. This would also include a way to see if your article collection has a new version available, then to download if desired.
- A system whereby you need only download the changes to the articles, rather than the entire content (which will mostly be unchanged).
Walkerma 04:23, 24 November 2009 (UTC)
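The release numbering and changes-only download described above can be sketched as follows. This assumes version strings of the form "1.700" with a three-digit minor part, monthly updates incrementing it by one and the annual review jumping to the next hundred; all function names are hypothetical, and each release is modelled as a mapping from article title to revision ID.

```python
def next_monthly(version: str) -> str:
    """1.700 -> 1.701: same article selection, refreshed revisions."""
    major, minor = version.split(".")
    return f"{major}.{int(minor) + 1:03d}"


def next_annual(version: str) -> str:
    """1.711 -> 1.800: reviewed selection, start of a new cycle."""
    major, minor = version.split(".")
    return f"{major}.{(int(minor) // 100 + 1) * 100:03d}"


def delta(old: dict, new: dict):
    """Return (articles to download, titles to delete) between two releases."""
    changed = {title: rev for title, rev in new.items() if old.get(title) != rev}
    removed = sorted(old.keys() - new.keys())
    return changed, removed
```

A reader holding 1.700 would fetch only the `changed` entries to reach 1.701, rather than the whole collection.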
Metadata, machine-readable, data reuse: Are strategies to improve offline take-up within the scope of the task force?
MediaWiki stores articles in flat text files. The data dumps can include some implied metadata, primarily via category memberships. Data reusers, including offline Wikipedias, prefer resources which are easily parsed and processed, one of the reasons MediaWiki moved to XML-based output. However, the content itself includes very little support for reuse. Articles are not uniform in layout, elements, or syntax use within a single language or project, let alone across languages or projects.
The lack of uniformity or metadata is a significant bar to the use of Mediawiki content for offline applications, as well as online applications.
Should the task force make findings statements such as this, assuming no proposal is presented which specifically addresses this question?
I think we should do an inventory of projects that
- compile/prepare selections of WP content for offline usage
- create software that supports offline access/reuse (e.g. offline readers)
We then could interview the maintainers of those projects and ask them about their development roadmap (and the reasoning behind).