Task force/Offline/IRC/2009-12-01

    19:00 < Amgine_> <waves @ hejko>
    19:00 < walkerma> hejko, Jan_eissefeldt: Hi!  Amgine - about now
    19:01 < walkerma> Do you think we should discuss cellphone releases today?  That would be my preferred agenda topic for now
    19:01 < walkerma> Or is there something else more important?
    19:01 < hejko> I have no new information on this topic.
    19:01 < LauraHale> No pedophile discussion.
    19:02 < Amgine_> heh.
    19:02 < Amgine_> Okay, I'm here multiple times.
    19:03 < Amgine_> Cellphone releases.
    19:03 < walkerma> You can talk to yourself, Amgine
    19:03 < walkerma> !
    19:03 < Amgine_> <laughs>
    19:03 < walkerma> OK, Wizzy, are you there?
    19:03 -!- GerardM- [n=chatzill@dhcp-077-250-053-164.chello.nl] has quit [Read error: 104 (Connection reset by peer)]
    19:03 < walkerma> Or do I need a lower case wizzy ?
    19:04 < Amgine_> Shouldn't, but I think he's been idle a while.
    19:05 < walkerma> That's a shame, he had quite a bit to say on cellphones last week.  I confess, I've never owned a cellphone, so they're mostly a mystery 
                      to me.
    19:05 < Amgine_> Mmm, new software always hides how to do things.
    19:06 < walkerma> My understanding is that many people would like to store a copy of WP, Wiktionary, etc on their cellphones, and access these without 
                      having to go on the internet - right?
    19:06 < Amgine_> Cellphones are, for all intents and purposes, a handheld computer with particularly limited storage at the moment.
    19:06 < wizzy> hi
    19:06 < walkerma> Amgine_- yes that's how I'm regarding them for this discussion.  Please correct my complete ignorance if I get something wrong!
    19:06 < Amgine_> Yes, plus they have the added benefit of having some internet connectivity, so a system might be built which can update the repository.
    19:07 < Amgine_> Most cellphones today have limited browser capabilities.
    19:07 < walkerma> Hi wizzy - we're talking about what we need for cellphone releases.  Would you care to give your view on the situation in RSA?
    19:07 < hejko> There are more than 4 billion cellphones. So *if* we can bring WP content to cell phones this would be a huge success in terms of reach of WMF 
                   projects.
    19:08 < Amgine_> There are two primary elements to overcome: storage capacity, and a presentation/storage script.
    19:08 < walkerma> http://www.metamanda.com/blog/archives/2009/03/cell-phone-use-at-60-worldwide.html
    19:08 < wizzy> South Africa has close to 100% penetration for cellphones
    19:08 < wizzy> ditto (I believe) for India
    19:08 < Amgine_> <nods>
    19:08 < Amgine_> Not quite so for India, but far higher than internet access.
    19:09 < Amgine_> Nearly an order of magnitude.
    19:09 < Amgine_> Cellphone networks exist now in every country of the world.
    19:09 < wizzy> I have an E71 - high end - that has a Micro-SD slot. That can take 32Gigs
    19:09 < hejko> I think our focus should be on ultra low cost phones.
    19:09 < hejko> Or better, the next generation of them.
    19:10 < wizzy> the next generation will look like my current generation
    19:10 < Amgine_> How can we research those specifications, hejko?
    19:10 < walkerma> So would 32 GB be a reasonable prediction for a budget cellphone in 2012?
    19:10 < wizzy> What I raised last week was that the data format needs to be vanilla HTML for a cellphone browser to use it
    19:10 < hejko> I doubt that. But it should feature something like 512MB  which is plenty of RAM when storing text only.
    19:11 < wizzy> I think that 32Gig is perfectly reasonable for 2012
    19:12 < wizzy> I have no Micro-SD card atm - but onboard (builtin) I have about 64Meg free
    19:12 < Amgine_> I'd rather not guess. Are there information resources about cellphone development?
    19:12 < Amgine_> Industry standards and so on?
    19:12 < hejko> I think this is a question that we need to answer before we go deeper into this. That is why I wanted to have a contact who can tell us 
                   something about the roadmaps of cellphone manufacturers.
    19:12 < wizzy> development ? Are we building a reader ?
    19:13 < walkerma> Do you think people would be buying a specific memory card/disc/chip/whatever that contains a collection of Wikimedia resources - in 
                      which case a smaller size is fine - or would they just want to put a collection onto one standard card they keep in their cellphone all 
                      the time?
    19:13 < wizzy> development is very often Java
    19:13 < Amgine_> wizzy: we're able to suggest priorities and goals for the WMF.
    19:13 < hejko> walkerma: i think the contents must be already bundled when purchasing the phone.
    19:14 < wizzy> I think more likely they will get a standard release from somewhere
    19:14 < wizzy> and (I think) use their builtin browser to access it
    19:14 < wizzy> no reader
    19:14 < Amgine_> Walkerma: I view this as a memory card which can be handed out + an OEM distribution bundle.
    19:14 < hejko> yes and the release will probably differ based on the specs of the phone
    19:15 < walkerma> OEM = ?
    19:15 < hejko> Amgine_:  pre installed :)
    19:15 < wizzy> why would it differ ? 
    19:15 < Amgine_> Original Equipment Manufacturer.
    19:15 < Amgine_> Not all phone manufacturers will choose to include it. There is a small cost in loading software/data.
    19:16 < hejko> phones have different memory specifications. some may be able to take 1Mio articles while others might only have enough RAM to put 10K 
                   articles on them.
    19:16 < Amgine_> <nods>
    19:16 < walkerma> There's no reason we can't do both, is there - a download, and a distribution bundled in with new cellphones ?
    19:16 < wizzy> Micro-SD is the standard. If they are putting it onboard - yes, I agree
    19:17 < walkerma> hejko: Yes, that's why we need to create standard collections of specific sizes
    19:17 < wizzy> yes
    19:17 < hejko> btw: did you read the related thread: http://strategy.wikimedia.org/wiki/Talk:Task_force/Offline#lqt_thread_489
    19:17 < Amgine_> I was very interested to learn that Okawix does not use dumps for their system, but a live-updated model. They are able to make 
                     on-the-fly versions of limited entry content viewers.
    19:17 < hejko> walkerma: yes.
    19:18 < walkerma> Linterweb is working on cellphone releases too
    19:18 < wizzy> and there are choices - pictures or not, just-the-lede or whole articles
    19:18 -!- Kelson [n=Kelson@142-30.105-92.cust.bluewin.ch] has joined #wikimedia-strategy
    19:18 < Amgine_> Hello Kelson!
    19:18 < wizzy> Kelson: hi
    19:19 < Kelson> Hi everybody
    19:19 < wizzy> Kelson: would a cellphone wp release have a reader ? Or just use the browser ?
    19:19 < hejko> I think working directly with the cell phone manufacturers will have the largest impact.
    19:19 < walkerma> Hi Kelson! wizzy: On content, I'd like to revive a question from last week.  How do I find the melting point of benzoic acid, if that 
                      piece of data is hidden in an infobox? We need to have a readable version of the data box content
    19:20 < walkerma> wizzy: Sorry, let's answer your question first!
    19:20 < Kelson> wizzy: not easy to answer... it depends
    19:20 -!- Guill [n=guillaum@LSt-Amand-152-31-19-153.w193-253.abo.wanadoo.fr] has joined #wikimedia-strategy
    19:20 < hejko> walkerma: but this is something that wp doesn't even answer today. why should this be a requirement for mobile phone releases?
    19:20 -!- JC [n=JC@wikimedia/Juliancolton] has quit [Read error: 104 (Connection reset by peer)]
    19:21 < wizzy> Kelson: http://pastie.org/722018 for discussion so far
    19:21 < Amgine_> wizzy: I don't know if there's a specific answer. Some of the offline systems have a dedicated reader. Others can use a browser.
    19:21 -!- pm27 [i=c1fdde99@gateway/web/freenode/x-ubacylbkactkemwb] has joined #wikimedia-strategy
    19:21 < pm27> hello all
    19:21 < wizzy> pm27: hi
    19:21 < pm27> walkerma: nice to see you
    19:22 < wizzy> pm27: http://pastie.org/722018
    19:22 < Amgine_> Speaking of which: I crashed my system last night and lost 2 months worth of work. If anyone has logs, it'd be great if they were 
                     uploaded and linked.
    19:22 < walkerma> pm27, guill, hello!
    19:22 -!- DarkoNeko [n=udontcar@wikipedia/darkoneko] has joined #wikimedia-strategy
    19:22 < hejko> i think this is something that the manufacturers would need to figure out what works best for them. I doubt that there will be a one fits 
                   all software solution as there are various different platforms used.
    19:22 < Guill> hi
    19:22 < walkerma> Amgine_ : http://strategy.wikimedia.org/wiki/Task_force/Offline/IRC
    19:23 < wizzy> html is the common denominator. Even walkerma's infobox fits in there
    19:23 < walkerma> I also wrote up summaries of the discussions on that page
    19:23 -!- JC [n=JC@wikimedia/Juliancolton] has joined #wikimedia-strategy
    19:23 < Amgine_> <nods> Thanks walkerma
    19:23 < Kelson> wizzy: the ZIM format (and the zimlib) is adapted to a cellphone... at least a not too bad one... people use the zimlib already on small 
                    devices (with 32MB of RAM)
    19:24 < walkerma> wizzy: Yes, I agree.  I think something that just needs standard browser software would be much better than standalone reader software, 
                      if it's practicable
    19:24 < wizzy> the only thing we lose is search
    19:24 < pm27> hello Kelson
    19:24 < Amgine_> <grin> I was going to ask Kelson, and pm27 as well?
    19:24 < Kelson> Kelson: more problematic is to have an adapted render engine : current HTML render engines like fennec for example need at least 128 MB of 
                    RAM
    19:25 < Kelson> pm27: hi
    19:25 < walkerma> See this: http://ai.cs.utsa.edu/wikipedia0.7/
    19:25 < walkerma> It is designed to work in a browser, with search, but it can be downloaded in that format
    19:26 < wizzy> but all phones we will target will already have a builtin browser
    19:26 < Kelson> wizzy: IMO spreading Wikipedia on cellphone in poor country will not be possible before a few years
    19:26 < hejko> prefix search can be implemented very efficiently using front-coding data structures.
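
A minimal sketch of the front coding hejko mentions, assuming a sorted list of article titles. The function names and the sample titles are illustrative and do not come from any Wikimedia tool. Each title is stored as the number of leading characters it shares with the previous title plus the remaining suffix, which is why sorted title lists compress well. For simplicity the search runs on the decoded list; a real reader would more likely decode small blocks on demand.

```python
import bisect
import os


def front_encode(sorted_titles):
    """Encode each title as (shared_prefix_length, remaining_suffix)."""
    encoded = []
    previous = ""
    for title in sorted_titles:
        shared = len(os.path.commonprefix([previous, title]))
        encoded.append((shared, title[shared:]))
        previous = title
    return encoded


def front_decode(encoded):
    """Rebuild the full titles from the (shared, suffix) pairs."""
    titles = []
    previous = ""
    for shared, suffix in encoded:
        title = previous[:shared] + suffix
        titles.append(title)
        previous = title
    return titles


def prefix_search(sorted_titles, prefix):
    """Return every title that starts with prefix, using binary search."""
    start = bisect.bisect_left(sorted_titles, prefix)
    matches = []
    for title in sorted_titles[start:]:
        if not title.startswith(prefix):
            break
        matches.append(title)
    return matches


# Hypothetical sample titles, sorted before encoding.
TITLES = sorted(["benzaldehyde", "benzene", "benzoic acid", "benzyl alcohol", "ethanol"])
ENCODED = front_encode(TITLES)
```

Here `ENCODED` stores "benzene" as `(4, "ene")` because it shares "benz" with the title before it, and `prefix_search(front_decode(ENCODED), "benzo")` finds the single matching title.
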
    19:26 < walkerma> Kelson: What's the problem?
    19:26 < hejko> Kelson: why?
    19:26 < walkerma> (with reaching poor countries)
    19:26 < Kelson> wizzy: another issue is that cellphones are fully controlled by the manufacturer and/or the telco : not the user
    19:27 < Kelson> consequently, on a lot of devices you cannot install what you want
    19:27 < hejko> yes, this is why the WMF must convince the manufacturers.
    19:27 < hejko> they should have an interest to add more value to their phones.
    19:27 < walkerma> Kelson: But if the manufacturers WANT to include WP releases...:)
    19:27 < wizzy> my cellphone is unlocked, and I can load anything I like on it. In India they all have cheap chinese knockoffs - no network control
    19:27 < hejko> same for providers who could urge manufacturers to do so
    19:28 < Kelson> wizzy: that's great.
    19:28 < wizzy> http://news.bbc.co.uk/2/hi/south_asia/8387727.stm 
    19:28 < Amgine_> I don't think we can talk about strategies like that. What we can do on this task force is present possibilities. But first we have to 
                     know they are possibilities.
    19:29 < Kelson> walkerma & hejko : Like I have written... devices are pretty small
    19:29 < hejko> Ideally we could conclude with a proposal like: "If the WMF convinces mobile phone manufacturers/providers to bundle WP with mobile phones 
                   starting 2011 then - in 2016 - we estimate that a third of the population of developing countries will have offline access to a static 
                   subset of WP using their mobile phone"
    19:29 < Amgine_> So, we can say "We could provide manufacturers and telcos a nice bundle item" but we can't say "This will work on phones in South Africa"
    19:30 < wizzy> My recommendation is that whatever format the WP 1.0 release is in, it should be convertible to an HTML dump
    19:30 < Amgine_> That's pretty much a given, wizzy. It will be possible.
    19:31 < wizzy> An HTML dump will work on any phone with a browser 
    19:31 < walkerma> wizzy: Yes, I'm inclined to agree.  We can have various formats, but one of them should be plain vanilla HTML if that is possible
    19:31 < Amgine_> Yes.
    19:31 < Kelson> yes
    19:32 < walkerma> So Kelson, we should also offer a ZIM format for cellphones - is that your view?
    19:32 < Amgine_> I'd like to give pm27 a moment to catch up, and answer this question: Can Okawix provide a reader which can work on the forthcoming XUL 
                     from Mozilla?
    19:32 < wizzy> we talked last week about what sections are optionally included - full text, lede, pictures, references, categories
    19:33 < Kelson> walkerma: yes, this is also the point of view of the openZIM project. To achieve to do that we want to try to find a solution to be able 
                    to read a ZIM file on any device
    19:33 < Kelson> walkerma: this is not easy but we want to try...
    19:33 < pm27> Guill: Amgine_
    19:34 < wizzy> Amgine_: tell us a bit about XUL ?
    19:34 < Kelson> page will be in HTML and rendering will certainly be a little bit different depending to the device/reader....
    19:34 < Guill> Amgine, which XUL are you referring to?
    19:34 < Guill> Okawix currently works with 1.9.1
    19:34 < Amgine_> Tomaszf is en route to work; eta 45 minutes. He is the best person to ask.
    19:34 -!- William_Pietri [n=william@dsl017-034-114.sfo4.dsl.speakeasy.net] has joined #Wikimedia-strategy
    19:35 < Amgine_> here is my terrible simplification of XUL: it's a script language to use parts of other software to make a "new" software.
    19:35 < Kelson> walkerma: currently people are trying to make an HTML parser which skips many HTML tags and so be able to have a pretty simple HTML 
                    rendering output.
    19:35 < Guill> I can't wait for 45 minutes, but we are using the last xulrunner releases to develop
    19:36 < Kelson> walkerma: so we want to avoid device specific ZIM file.... Personally, not sure this is possible, but we try.
    19:36 < Guill> so the answer is "yes" :)
    19:36 < Guill> I have to go now
    19:37 < Amgine_> Thanks Guill: What you're doing is creating a version of Okawix that will work on cellphones that have the FireFox browser.
    19:37 < Amgine_> Okay, thanks Guill!
    19:37 < walkerma> Thanks Guill!
    19:37 < Amgine_> wizzy: FireFox is working on a browser for cellphones. It will allow many other open source projects to work on cell phones.
    19:37 < Guill> np, see ya
    19:37 -!- Guill [n=guillaum@LSt-Amand-152-31-19-153.w193-253.abo.wanadoo.fr] has quit ["Quitte"]
    19:38 < wizzy> If we restrict the HTML dump to a subset of easily-parsed HTML, will that make a reader simpler ?
    19:39 < Kelson> wizzy: yes, but you will have device specific content.
    19:39 < hejko> I imagine the reader will need to be something similar to lynx with a small footprint.
    19:39 < Amgine_> It's not actually the reader which is the problem, as far as I can tell. It's preparing the content to be viewed.
    19:39 < wizzy> is it possible to write a search function that accesses a multi-part 'words' database and uses a low memory footprint ?
    19:39 < wizzy> in javascript ?
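
One way to picture the multi-part 'words' database wizzy asks about is an index split into small shard files, so that a lookup only ever loads the one shard holding the queried word. The sketch below is in Python rather than javascript, purely for illustration; the shard-by-first-character scheme, the JSON layout, and all names are assumptions and not an existing tool.

```python
import json
import os
import tempfile


def shard_name(word):
    """Pick the shard file for a word from its first character."""
    first = word[0].lower()
    safe = first if (first.isascii() and first.isalnum()) else "_"
    return safe + ".json"


def build_index(pages, index_dir):
    """Write one small JSON file per shard, mapping word -> list of pages."""
    shards = {}
    for page_name, text in pages.items():
        for word in set(text.lower().split()):
            shards.setdefault(shard_name(word), {}).setdefault(word, []).append(page_name)
    for name, words in shards.items():
        with open(os.path.join(index_dir, name), "w", encoding="utf-8") as handle:
            json.dump(words, handle)


def search(word, index_dir):
    """Look a word up by reading only the shard that can contain it."""
    word = word.lower()
    if not word:
        return []
    path = os.path.join(index_dir, shard_name(word))
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        return sorted(json.load(handle).get(word, []))


def demo():
    """Build an index over two hypothetical pages and run two lookups."""
    pages = {
        "benzoic_acid.html": "Benzoic acid is a white solid",
        "ethanol.html": "Ethanol is a clear liquid",
    }
    with tempfile.TemporaryDirectory() as index_dir:
        build_index(pages, index_dir)
        return search("ethanol", index_dir), search("is", index_dir)
```

Memory use at lookup time is bounded by the largest shard rather than by the whole index, which is the trade-off Amgine points to: more files and some wasted storage in exchange for a small footprint.
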
    19:40 < Amgine_> It's probably possible but not necessarily efficient in storage or in operation.
    19:41 < Amgine_> But again, those are not the kinds of things we can do: we suggest priorities based on asking the experts.
    19:41 < wizzy> I am trying to figure a way we can dispense with a reader and yet have a searchable collection
    19:41 < Amgine_> Why?
    19:42 < wizzy> within certain boundaries, I don't think efficiency is a huge priority
    19:42 < wizzy> Amgine_: no disrespect, but I don't like the idea of a custom reader. It makes things a lot more device-specific
    19:43 < Amgine_> One of the guiding criteria we've been discussing is small storage - it requires efficiency to allow the largest possible content 
                     repository.
    19:44 < Amgine_> Yes, I agree. But in order to have an offline browsable repository you would then need an offline server - or a very inefficient bundle 
                     of html pages.
    19:44 < wizzy> the bundle of html pages can all be gzipped ?
    19:44 < Amgine_> That's possible - it may even be the best choice, but it's still a custom server solution.
    19:44 < hejko> wizzy: even if the whole project would cost $1.000.000 it would still be very efficient in terms of reach per dollar spent.
    19:44 < Amgine_> Then you need a software that will unzip just the page you want, and point your browser to it.
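
The helper Amgine describes can be sketched as follows, assuming each page of the dump is stored as its own gzip file so that opening one article only decompresses that article. The file naming scheme and function names are hypothetical; this is not how any existing offline reader is known to work.

```python
import gzip
import os
import tempfile


def store_page(dump_dir, name, html):
    """Compress one HTML page into its own .html.gz file."""
    path = os.path.join(dump_dir, name + ".html.gz")
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write(html)
    return path


def load_page(dump_dir, name):
    """Decompress only the requested page and return its HTML."""
    path = os.path.join(dump_dir, name + ".html.gz")
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


def demo_round_trip():
    """Store a hypothetical page and read it back."""
    html = "<html><body><h1>Benzoic acid</h1></body></html>"
    with tempfile.TemporaryDirectory() as dump_dir:
        store_page(dump_dir, "Benzoic_acid", html)
        return load_page(dump_dir, "Benzoic_acid") == html
```

Something still has to sit between the browser and the files to call `load_page`, which is the "custom server" concern raised in the discussion.
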
    19:45 < wizzy> a custom server is worse than a custom reader
    19:45 < wizzy> mozilla can read an HTML dump - can it read an html dump of gzipped pages ?
    19:46 < hejko> I think we really should get in touch with an industry insider before we continue to discuss technical solutions.
    19:46 < Amgine_> Well, there already *are* servers for cellphones; that may be where this project should go. I don't know if mozilla can read gzip html. I 
                     agree hejko.
    19:46 < wizzy> surely it can. But we want portability - Nokia's browser (which does javascript) needs to read the same dump
    19:47 -!- DarkoNeko [n=udontcar@wikipedia/darkoneko] has quit ["Despite the name, misfortune never misses."]
    19:47 -!- DarkoNeko [n=udontcar@LSt-Amand-152-31-19-153.w193-253.abo.wanadoo.fr] has joined #wikimedia-strategy
    19:47 < Amgine_> So long as the number of custom dump setups is relatively small, there is really no problem creating a custom dump process.
    19:47 -!- DarkoNeko [n=udontcar@LSt-Amand-152-31-19-153.w193-253.abo.wanadoo.fr] has left #wikimedia-strategy []
    19:48 < Amgine_> So, one for nokia, samsung, motorola, etc. would be easily doable.
    19:49 < Amgine_> Anyone know how to turn off the trackpad in ubuntu?
    19:49 < walkerma> What about the WikiPock release?
    19:50 < Amgine_> I haven't managed to research that one yet.
    19:50 < Amgine_> Anyone else get to it?
    19:50 < walkerma> I was given a copy of this at Wikimania by Kul - the WMF people seem to like it
    19:50 < Kelson> Amgine & wizzy: zip files is the worst solution... there is a reason why we invented the ZIM format.
    19:50 < walkerma> http://www.wikipock.com/
    19:51 < walkerma> I suspect that if we propose to work on cellphone releases, Erik and others may well want us to talk to WikiPock
    19:51 < wizzy> walkerma: looks nice - have you tried it ?
    19:52 < walkerma> wizzy: Yes, I opened it up again just a few minutes ago
    19:52 < Amgine_> Kelson: I agree it's *a* solution. I would like to see many innovative solutions being tested and built.
    19:52 < wizzy> one fat file (like zim) ? search ?
    19:53 < Kelson> Amgine: it's not a solution, it won't work.... did you try to make a zip file with 500.000 documents?
    19:53 < Amgine_> Kelson: yes, actually, I have. And it works.
    19:53 < walkerma> It has full text but no pictures or infoboxes.  So it won't tell me the melting point of benzoic acid (even though there is a long 
                      article!)- even though the MP is something my students routinely look for in WP chemicals articles
    19:53 < Amgine_> But it's very slow to retrieve any one of them.
    19:54 < Kelson> Amgine: ok, we agree.
    19:54 < walkerma> The Wikipock search is dynamic, so you see possible answers while you're typing
    19:54 < Kelson> Amgine: I do not ask how much memory it needs and the time to get one doc.
    19:54 < walkerma> Probably useful if you're typing on  micro buttons!
    19:54 < Amgine_> Heh...
    19:55 < wizzy> walkerma: definitely. Cellphone development is all about typing less
    19:55 < wizzy> (mozilla won't open a gzipped html file ):
    19:56 < Amgine_> Of the top 3 phones we looked at for purchase, one of them had a keyboard. It appears top smart phones are moving away from keyboards.
    19:56 < walkerma> I'll post a few screenshots from my computer, and link them from the strategy wiki page
    19:57 < wizzy> walkerma: are you using WikiPock on a phone ?
    19:57 < walkerma> wizzy: I don't own a cellphone.  I could see if a friend might be willing to try loading this, though
    19:58 < Amgine_> WikiPock appears to be for the largest 4 smartphone OS only.
    19:59 < wizzy> south africa is often used as a testbed for stuff they subsequently roll out in Europe - we have a mature cellphone market, and it is less 
                   expensive to screw up here :)
    19:59 < wizzy> (for the networks)
    19:59 < walkerma> Maybe we should do the same (test, I mean, not screw up!)
    20:00 < Amgine_> <grin> I think "we" (The Wikimedia Foundation + projects) aren't going to be releasing anything.
    20:00 < Amgine_> We'll be enabling others to do so.
    20:00 < Amgine_> So we want to focus on how best to do that.
    20:01 < walkerma> BTW: The Wikipock collection is 7 GB, which includes all the English WP, as well as ES and PT
    20:01 < walkerma> But no pix or infoboxes in there
    20:02 < walkerma> Amgine_ : No, but we want to develop collections that are relevant, and then find partners who WILL do test releases
    20:02 < wizzy> walkerma: I agree about infoboxes - lots of important info (on countries, chemicals) go there, and are even removed from the article
    20:03 < walkerma> wizzy: Exactly - often the infobox is used to make the most important data easy to find!  And they are removed from the article, usually
    20:03 < wizzy> I would be interested to see Wikipock on a phone, and see how they do it
    20:03 < walkerma> wizzy: I'll try to make a small video and link to it from the wiki
    20:03 < wizzy> from a phone ?
    20:03 < Amgine_> What interests me is this claim: "Always up-to-date"
    20:03 < Amgine_> wizzy: I'm working with a developer who is creating python tools for parsing metadata out of en.wp infoboxes.
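
As a rough idea of what parsing metadata out of an infobox can look like, here is a minimal sketch that reads flat `| key = value` lines from a template. It is not the developer's tool mentioned above; the sample wikitext, its field names, and its values are made up for illustration, and real infoboxes with nested templates would need a proper wikitext parser.

```python
import re

# Hypothetical, simplified infobox text; not copied from any article.
SAMPLE_WIKITEXT = """{{Chembox
| Name = Benzoic acid
| MeltingPtC = 122
| BoilingPtC = 250
}}"""

# One parameter per line: a pipe, a key, an equals sign, then the value.
PARAM_PATTERN = re.compile(
    r"^\|[ \t]*([^=|\n]+?)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def parse_infobox(wikitext):
    """Return the flat key/value parameters found in the template text."""
    return {key: value for key, value in PARAM_PATTERN.findall(wikitext)}


FIELDS = parse_infobox(SAMPLE_WIKITEXT)
```

With the fields in a dictionary, an offline release could render them as a plain readable table, so a value such as `FIELDS["MeltingPtC"]` stays available even when the infobox layout itself is dropped.
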
    20:04 < walkerma> I'll see if I can persuade a friend to put it on their phone
    20:04 < wizzy> walkerma: cool
    20:04 < LauraHale> Finding articles is a pain because the category structure is not always consistent.
    20:05 < walkerma> Amgine_ : Talk with User:Beetstra on en:WP.  Our Chembox is designed deliberately to make the data easy to read by machine.  We're even 
                      writing an academic paper this month on some of this stuff we're doing in chemicals articles
    20:05 < walkerma> For the Journal of Cheminformatics
    20:05 < wizzy> perhaps a recommendation we can push upstairs is less entropy in the categories
    20:06 < Amgine_> I've been in touch with the UDC regarding use of the UDC categorization scheme on WMF projects.
    20:06 < Amgine_> Let me find the e-mail which I copied to someone on strat...
    20:06 < walkerma> wizzy: That's something very hard to enforce online - and impossible to do from the top down.  But for offline, I think the UDC system 
                      would be great
    20:06 < Kelson> wizzy: this would be great... but doing a well working cat. graph is also not an easy task.
    20:07 < walkerma> Amgine_ :Thanks!  You're always about ten steps ahead of me on these things...!
    20:07 < wizzy> University of the District of Columbia ??
    20:08 < Amgine_> http://strategy.wikimedia.org/wiki/User_talk:JakobVoss
    20:08 < Kelson> walkerma:  UDC system ?
    20:08 < walkerma> http://en.wikipedia.org/wiki/Universal_Decimal_Classification
    20:09 < walkerma> http://www.udcc.org/about.htm
    20:09 < Amgine_> Universal Decimal Classification - it's a very well-respected method for categorizing knowledge topics.
    20:09 < Kelson> walkerma: ok, but how do you know which article is in which cat?
    20:09 < walkerma> Brassratgirl proposed this last week
    20:09 < Amgine_> Kelson: That's always somewhat subjective, Kelson, but having a grid of topics to work with is an important first step.
    20:10 < Kelson> Amgine: ok
    20:10 < walkerma> Kelson: We'd need to work with them, probably, but it's a very comprehensive library category system.
    20:10 < Amgine_> This particular grid is flexible, allowing some boolean applications.
    20:10 < Kelson> I agree (1) this would be a good start (2) having a well respected standard is good
    20:11 < wizzy> so it would run in parallel to categories ? or a post-processing step on a collection ?
    20:11 < Kelson> but I see this is not really detailed.... with big collection, we will have at the end thousands of article per cat.
    20:11 < Amgine_> What I was most impressed with is how you can create unique categories: Alpine ski category+Backcountry+trekking is a new category.
    20:12 < Amgine_> Only 2000 of their 68k unique categories are on the website.
    20:12 < Kelson> 68k categories.... OK, sorry for the noise ;)
    20:12 < Amgine_> The full collection is copyrighted, but has a unique licence where we might not have to pay for it.
    20:13 < Amgine_> And with category concatenation... I think that give a couple million possible combinations.
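
The log does not say how the "couple million" figure was reached. One back-of-the-envelope reading, assuming any two distinct categories may be concatenated (which the UDC rules may not allow), is to count unordered pairs. The short calculation below does that for the two category counts quoted above.

```python
import math

PUBLISHED_CATEGORIES = 2000   # categories on the website, per the log
ALL_CATEGORIES = 68000        # "68k" unique categories, per the log

# Unordered pairs of distinct categories: n * (n - 1) / 2.
PAIRS_PUBLISHED = math.comb(PUBLISHED_CATEGORIES, 2)
PAIRS_ALL = math.comb(ALL_CATEGORIES, 2)

print(PAIRS_PUBLISHED)  # 1999000
print(PAIRS_ALL)        # 2311966000
```

Under this assumption, pairs drawn from the 2000 published categories alone come to roughly two million, while pairs over the full 68k set run into the billions.
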
    20:13 < Kelson> It would be really interesting to know if the community is interested in using such a cat. system.
    20:13 < walkerma> wizzy: I think we could use the WP categories as well - two parallel systems - if we set it up properly.  The problem with WP categories 
                      is they don't work very well as a hierarchy.
    20:14 < Amgine_> Well, again, we can only suggest this. The communities must accept it and implement it.
    20:14 < Amgine_> That's a lot of work.
    20:14 < walkerma> I asked Kelson in 2007 to give us a list of all articles that were in Chemicals categories.  He gave us 22,000 articles, of which only 
                      6,000 or so turned out to be actual chemicals.
    20:15 < Amgine_> It would, however, make the content much more easily parsed...
    20:15 < walkerma> One was a bar in England, for example.  It was listed under "places in alcohol serving alcohol" and alcohol was ultimately listed under 
                      the ethanol category
    20:15 < Kelson> walkerma: yes, typical example to show how the cats in wikipedia are weak
    20:16 < Amgine_> That's... terrible.
    20:16 < wizzy> eek
    20:16 < walkerma> Amgine_ :I think we could use UDC for offline releases without too much problem, that wouldn't be controversial at all
    20:16 < Amgine_> true.
    20:16 < walkerma> Sorry, I meant to say "Establishments in England serving alcohol"
    20:17 < walkerma> for that category
    20:17 < walkerma> Or something along those lines
    20:17 < Amgine_> Okay, we're 15 minutes past the hour. We've covered several possible implementations on cellphones, we have one live application doing so.
    20:17 < walkerma> My friend actually checked all 22,000 by hand - it took him months
    20:18 < walkerma> Yes, are there any other issues we need to discuss before we close?
    20:18 < Amgine_> I don't remember... I finished installing software just as the meeting started and didn't have time to look at the agenda.
    20:18 < LauraHale> I spent almost a week or two crawling around the sports section. :/  It was enlightening but wow, frustrating in terms of category 
                       navigation.
    20:18 < LauraHale> I don't think most people navigate that way.
    20:19 < walkerma> LauraHale: I think you're right!
    20:19 < walkerma> But for indexing, believe me, it's essential - and that's something we have to do for smaller offline releases
    20:19 < wizzy> yes
    20:19 < hejko> I have to leave. Let us create a page in the wiki for this topic and structure our ideas and questions there.
    20:20 < wizzy> it would be great to have a bot work on categories somehow
    20:20 < Amgine_> Okay hejko! Thanks!
    20:20 < Amgine_> walkerma: url for page on cellphones?
    20:21 < walkerma> We need a computer-whiz to work on indexing as "their work" to get it to work well
    20:21 < LauraHale> I'm not doubting the indexing part.  I was trying to develop a comprehensive list of sports teams.  wikipedia has the most teams 
                       listed.  Just a pain in the arse to find.
    20:21 < walkerma> Amgine_ : Is that a suggestion?
    20:21 < Amgine_> walkerma: I have one. He's only interested in wiktionary at the moment.
    20:21 < Amgine_> <grin> Who, me?
    20:22 -!- hejko [n=hejko@dslb-084-058-024-223.pools.arcor-ip.net] has quit [Read error: 54 (Connection reset by peer)]
    20:22 < Amgine_> I think a shift to use the parallel UDC categories may be something to discuss on a separate page.
    20:22 < walkerma> Sorry, I missed that
    20:23 < walkerma> hejko's suggestion
    20:23 < walkerma> Good idea, I think
    20:23 < Amgine_> He'd like us to create a page on wiki to discuss cellphone reader, and I think another page for UDC
    20:23 < wizzy> yes
    20:25 < Amgine_> I'm seeing cellphone reader developing as a second priority for the TF.
    20:26 < walkerma> OK, Amgine_ : Could you & I create a page on UDC?  Who's best to start the cellphone reader page?
    20:26 < Amgine_> I think wizzy can take the first run at it?
    20:26 < Amgine_> What url would you suggest for each of these?
    20:26 < Amgine_> (title)
    20:27 < wizzy> ok