Proposal:Knowledge Representation One Step Further

Status

The status of this proposal is:
Request for Discussion / Sign-Ups

Every proposal should be tied to one of the strategic priorities below.


  1. Achieve continued growth in readership
  2. Focus on quality content
  3. Increase Participation
  4. Stabilize and improve the infrastructure
  5. Encourage Innovation


Wikipedia is a representation of human knowledge. Let's take this one step further by adding a knowledge representation capability.

A simple solution is to use an existing knowledge representation technology: the Resource Description Framework (RDF) or, more restrictively, the Web Ontology Language (OWL). Alternatively, a simpler language could be created, along the lines of the formatting language currently used in Wikimedia projects.
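
As a rough illustration of what RDF-style statements look like, the sketch below stores knowledge as subject-predicate-object triples in plain Python. The article names, predicate names and the `match` helper are all hypothetical, and real RDF would use URIs and a dedicated toolchain; this only sketches the idea.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical statements attached to wiki articles (illustrative only).
TRIPLES = [
    Triple("Article:Example_Person", "born_in", "Article:Example_City"),
    Triple("Article:Example_Person", "field", "Article:Example_Field"),
    Triple("Article:Example_City", "located_in", "Article:Example_Country"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # List everything stated about one (hypothetical) article.
    for t in match(TRIPLES, subject="Article:Example_Person"):
        print(t.subject, t.predicate, t.obj)
```

Software can query flat statements like these without parsing the article text, which is the property the proposal relies on.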


We propose to shift the focus of Wikipedia from merely collecting what users contribute to categorizing that content much better, i.e. to develop software that helps contributors do so and to encourage authors to classify content more carefully.

To keep the simple formatted text from becoming overly complex, this new functionality can be placed in a separate Knowledge tab.

By using Knowledge Representation, Wikipedia articles would be valuable not only to human beings: software could also perform all kinds of interesting operations on the data, such as:

  • Automatically referencing related articles, predicting how interesting an article might be to the user and ultimately offering a search engine not just for facts but for knowledge.
  • Automatically generating many interesting conceptual representations of objects, events and historical facts.


The final motivation for this is not tangible, but offering this kind of tool to the AI research community could encourage volunteers to develop high-level functionality for the Wikimedia system. Collaboratively creating a great knowledge base and making it as powerful as it can possibly get just seems the right way to go.

For the end user of the system, the ability to create rich conceptual graphics can be sufficient personal, short-term motivation. This basic requirement can be met by creating a number of simple languages for knowledge visualization. Starting from the end users' knowledge visualization requirements seems to be the only possible use-case-driven approach.

Kinds of Knowledge to Represent

Lexical semantics (common nouns, verbs and adjectives)

The basic linguistic relationships, such as: hypernymy/hyponymy, synonymy, meronymy/holonymy, troponymy, entailment, coordinate terms, related nouns, similar to, participle of verb, root adjectives.

  • The users interested in this kind of knowledge visualization are grammarians, linguists and anyone who likes language.
  • The generated graphics are semantic nets.
  • Wikimedia content can be linked to the WordNet ontology (the licenses are compatible).
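
The kind of semantic net described above can be sketched as a small graph of hypernym ("is a kind of") links. The word list below is illustrative and deliberately simplified, not taken from WordNet, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative hypernym ("is a kind of") edges; not taken from WordNet.
HYPERNYM_EDGES = [
    ("dog", "canine"),
    ("canine", "mammal"),
    ("cat", "feline"),
    ("feline", "mammal"),
    ("mammal", "animal"),
]


def build_net(edges):
    """Map each word to the set of its direct hypernyms."""
    net = defaultdict(set)
    for word, hypernym in edges:
        net[word].add(hypernym)
    return net


def all_hypernyms(net, word):
    """Collect every ancestor of `word` by following hypernym links."""
    seen = set()
    stack = [word]
    while stack:
        current = stack.pop()
        for parent in net.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def coordinate_terms(net, word):
    """Words that share at least one direct hypernym with `word`."""
    parents = net.get(word, set())
    return {w for w, ps in net.items() if w != word and ps & parents}


if __name__ == "__main__":
    net = build_net(HYPERNYM_EDGES)
    print(sorted(all_hypernyms(net, "dog")))
    print(sorted(coordinate_terms(net, "canine")))
```

Other relations from the list (meronymy, entailment and so on) could be stored the same way, with one edge set per relation type.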

Historical knowledge (proper nouns)

The relations between historical facts concerning people are important, and the non-standardized legends on people pages across Wikimedia testify to this. Common features include: Born/Died, Nationality, Fields, Institutions, Alma mater, Doctoral advisor, Known for, Notable awards, Bibliography, etc.

But the relations between individuals and places (displacement, discovery, conquest, ...), objects (discovery, invention, ...), and other people (kinship, influenced by, killed by, ...) are also very interesting for visualization.

  • The users interested in this kind of knowledge visualization are historians and anyone who likes historical information.
  • The generated graphics are, for example, sequence diagrams, like this:

        >>> Disciple of >>>
        >>> Disciple of >>>
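
A diagram in the style shown above could be generated from stored relations roughly as follows. This is a minimal sketch: the person names are placeholders, the `chain_from` and `render` helpers are hypothetical, and it assumes each person has at most one recorded master.

```python
# Hypothetical "disciple of" statements: (disciple, master).
DISCIPLE_OF = [
    ("Person A", "Person B"),
    ("Person B", "Person C"),
]


def chain_from(start, relations):
    """Follow the relation from `start` until no further link exists.

    Assumes at most one master per person; stops if a cycle is detected.
    """
    next_of = dict(relations)
    chain = [start]
    while chain[-1] in next_of and next_of[chain[-1]] not in chain:
        chain.append(next_of[chain[-1]])
    return chain


def render(chain, label="Disciple of"):
    """Join the chain in the '>>> label >>>' style used above."""
    return f" >>> {label} >>> ".join(chain)


if __name__ == "__main__":
    print(render(chain_from("Person A", DISCIPLE_OF)))
```

The same rendering function could be reused for other relations between people, such as "influenced by", by changing the label.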

Key Questions

  • Why hasn't it been done already? ;)
  • What would a roadmap for the transition to knowledge-enriched content look like?
  • Will authors understand how to categorize properly?
  • Can there be a solid and consistent system of categorization that spans different languages and cultures?
  • Can the programming effort be accomplished to develop the necessary software tools?

Potential Costs

Considerable work on ontology, knowledge representation and visualization will be necessary. Additionally, authors need to be educated to use a consistent categorization system that will be much more sophisticated and more difficult to use than the current one.


Ontology in Wikipedia

Ontology editor in Wikipedia

Knowledge representation in Wikipedia

Community Discussion

Do you have a thought about this proposal? A suggestion? Discuss this proposal by going to Proposal talk:Knowledge Representation One Step Further.

Want to work on this proposal?

  1. .. Sign your name here!