
Proposal:Dictionary extensions

From Strategic Planning
Status

The status of this proposal is:
Request for Discussion / Sign-Ups

Every proposal should be tied to one of the strategic priorities below.

Edit this page to help identify the priorities related to this proposal!


  1. Achieve continued growth in readership
  2. Focus on quality content
  3. Increase Participation
  4. Stabilize and improve the infrastructure
  5. Encourage Innovation


Summary

Create some extensions to make writing a dictionary in MediaWiki both easier and more useful.

Proposals

  • A wikitext->DICT exporter/API. The DICT protocol is very simple: you give the server the "dictionary name" and the "word", and it returns a list of "definitions". Each "dictionary" should cover a specified subset, so we could have an English->Finnish translations "dictionary", or just a "Mandarin definitions" dictionary. Keys should be addable to each dictionary using a parser function, so that modifying a few templates is all that is required to start making the API useful. Ideally it would support live queries, but being able to dump the contents of each dictionary would also help. Although this would not do justice to the wealth of information that Wiktionary has, it would be enough for most consumers.
  • The ability to automatically create placeholders for pages linked to from entries. Most of the pages on Wiktionary are just "form-of" pages that say "Plural of <blah>". It should be possible to instruct the software to respond to a request for fudges with "plural of fudge" without having to explicitly create the fudges page.
  • A semantic editor. Forget WYSIWYG, which is not useful for the strict formatting we require; it should be possible to "add a synonym for definition (x)" without having to care about which template is used underneath.
  • A way of getting the current section headings into templates. Currently every template must be passed a language code; it should be possible for the template to look at the previous level-2 heading and use that instead.
  • A "nearby" feature that uses some form of case-and-accent normalisation, similar to what the search does now, to find words that are similar to the current word. Unlike the current (manual) versions, it should return the same number of results on every page, letting less similar words appear where close matches are scarce.
  • A shared database of interwikis. The policy on Wiktionary is, and has always been, to interwiki link between pages of identical title. We have one robot that maintains this on all Wiktionaries, but it would be nicer to have some kind of shared storage for these links (the same storage could be used to merge the DICT dictionaries across projects if we wanted to).
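
As a rough illustration of the first item, the lookup model described there (a dictionary name plus a word in, a list of definitions out) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: add_key stands in for the proposed parser function, the in-memory store stands in for whatever storage the extension would use, and dict_response only imitates the shape of a DICT reply (status codes 150, 151, 250 and 552 from RFC 2229) without implementing the protocol.

```python
from collections import defaultdict

# Hypothetical in-memory store: dictionary name -> headword -> definitions.
_dictionaries = defaultdict(lambda: defaultdict(list))


def add_key(dictionary, word, definition):
    """Stand-in for the proposed parser function a template would call."""
    _dictionaries[dictionary][word].append(definition)


def define(dictionary, word):
    """Return every definition of `word` in one named dictionary."""
    return list(_dictionaries.get(dictionary, {}).get(word, []))


def dict_response(dictionary, word):
    """Format a lookup as DICT-style reply lines (simplified, not a server)."""
    definitions = define(dictionary, word)
    if not definitions:
        return ["552 no match"]
    lines = [f"150 {len(definitions)} definitions retrieved"]
    for text in definitions:
        # 151 line: the word, the database name, and a database description
        # (the name is reused as the description in this sketch).
        lines.append(f'151 "{word}" {dictionary} "{dictionary}"')
        lines.append(text)
        lines.append(".")  # a lone dot ends each definition body
    lines.append("250 ok")
    return lines


def dump(dictionary):
    """Return all (word, definition) pairs, for the offline dump use case."""
    entries = _dictionaries.get(dictionary, {})
    return [(w, d) for w in sorted(entries) for d in entries[w]]
```

For example, after add_key("en-fi", "dog", "koira"), the call define("en-fi", "dog") returns ["koira"], and dump("en-fi") lists every pair in that dictionary. Keeping each dictionary as a separate namespace mirrors the idea that an English->Finnish dictionary and a "Mandarin definitions" dictionary are queried independently.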

Motivation

Wiktionary is an absolute pain to edit: everything is templatised, and formatting is considered policy. Yet even though we go to all these lengths, it is still not feasible to use the data we create except in very loose ways. Changing or converting the whole of all the Wiktionaries is infeasible: all the editors would have to retrain, much of the conversion would need to be done by hand, and there are millions of entries.

Key Questions

  • Could other projects benefit from these extensions too?
  • Would it be worth hiring someone specifically to work on these?

Potential Costs

Development. Server resources.


Community Discussion

Do you have a thought about this proposal? A suggestion? Discuss this proposal by going to Proposal talk:Dictionary extensions.

Want to work on this proposal?

  1. Sign your name here!