Proposal:Data-driven content

Status (see valid statuses)

The status of this proposal is:
Request for Discussion / Sign-Ups

Every proposal should be tied to one of the strategic priorities below.

Edit this page to help identify the priorities related to this proposal!

  1. Achieve continued growth in readership
  2. Focus on quality content
  3. Increase Participation
  4. Stabilize and improve the infrastructure
  5. Encourage Innovation

If not English, in what language is this proposal submitted?:


Originally posted on meta:Talk:Multimedia Usability Project Meeting October 2009, but I think it should be mentioned here as well.

Maps, charts, and graphs are currently all hand-made. When the statistics behind them (e.g. census numbers) change, the maps and graphics need to be updated, and that is a manual process in which someone recreates the map or chart from scratch.

The OpenStreetMap integration project is a step in the right direction, and will help with automatically generating location maps. However, Wikipedia also needs thematic (statistical) maps and other maps and charts. With the infrastructure in place for the OSM integration project, we will be poised to do more with maps and dynamic data-driven content.

There are technical hurdles along the way, and more resources will probably be needed to put this capability into place and to support it, given the volume of traffic that the English Wikipedia and other projects get.


The OpenStreetMap integration project is already underway, and once its database and rendering infrastructure are in place, it should be possible to build upon them. For example, we could offer multiple base layer options, such as w:Blue Marble satellite imagery or terrain maps. With the implementation approach of generating a tile set for each base layer (and for each language), however, every additional option requires additional data storage space.

Could we allow people to upload GIS data files (e.g. shapefiles, GML, KML) and create custom overlays on top of an OpenStreetMap base layer (or other base layer options such as satellite imagery, terrain map, or political boundaries)? Would Wikimedia Commons be the place for such uploads? How would we handle them?
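As one illustration of what handling an uploaded file might involve, the sketch below reads point placemarks from a KML document using only Python's standard library, so that they could later be drawn over a base layer. The function name `kml_points` and the sample data are hypothetical, and nothing here reflects an existing Wikimedia or OSM integration API.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def kml_points(kml_text):
    """Return (name, lon, lat) tuples for every Placemark that contains a Point."""
    root = ET.fromstring(kml_text)
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            "kml:Point/kml:coordinates", default=None, namespaces=KML_NS
        )
        if coords is None:
            continue  # skip lines, polygons, and other geometry types
        # KML stores coordinates as "lon,lat[,alt]".
        lon, lat = coords.strip().split(",")[:2]
        points.append((name.strip(), float(lon), float(lat)))
    return points


# Made-up sample upload, for illustration only.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example station</name>
      <Point><coordinates>-77.0365,38.8977,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""

if __name__ == "__main__":
    print(kml_points(SAMPLE_KML))
```

A real upload pipeline would also need validation, size limits, and support for the other formats mentioned above; this only shows that the point data itself is easy to extract.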

What about allowing people to use external live data and map sources, such as Web Map Service (WMS) servers or live data feeds? Some of these are in the public domain, from providers such as the National Weather Service in the U.S. (for hurricanes) and the USGS (for earthquakes).
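To make the WMS idea a little more concrete, here is a minimal sketch of how a WMS 1.1.1 `GetMap` request URL is put together. The query parameters are the standard ones from the WMS specification; the endpoint `https://wms.example.org/service` and the layer name `storm_tracks` are placeholders, not real services.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer, for illustration only.
    print(wms_getmap_url("https://wms.example.org/service",
                         ["storm_tracks"], (-100.0, 10.0, -60.0, 40.0),
                         800, 600))
```

The image returned by such a request could be used as an overlay or a static map, though caching and attribution for third-party services would still need to be worked out.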

I (User:Aude) am also toying with the idea of developing a mapmaking utility that could reside on the toolserver, for generating custom static maps (without pan/zoom capabilities). If the maps rely on a live data source or need periodic updating, perhaps a bot could be incorporated into the service to regenerate maps on a periodic basis and upload new versions.
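One way such a bot might decide whether a map needs regenerating is to fingerprint the source data and compare it with the fingerprint stored from the previous run. The sketch below assumes the data arrives as JSON-serializable records; the function names are hypothetical, and the rendering and upload steps are deliberately left out.

```python
import hashlib
import json


def data_fingerprint(records):
    """Return a stable SHA-256 hash of JSON-serializable records."""
    # sort_keys makes the hash independent of dictionary key order.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(records, last_fingerprint):
    """True if the data differs from what the last map was built from."""
    return data_fingerprint(records) != last_fingerprint


if __name__ == "__main__":
    # Made-up records, for illustration only.
    previous = [{"id": 1, "magnitude": 4.2}]
    stored = data_fingerprint(previous)
    current = previous + [{"id": 2, "magnitude": 5.1}]
    print(needs_regeneration(previous, stored))  # unchanged data
    print(needs_regeneration(current, stored))   # new record added
```

Regenerating only when the data has actually changed would keep the bot from uploading identical new file versions on every run.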

Another related idea is to allow dynamically generating charts, graphs, and other content from data sources (e.g. U.S. census data), similar to dynamically generating maps. It's a pain right now to maintain such graphics on Wikipedia, as they are hand-made and people have to redo them in order to update them.
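As a rough sketch of what "generating a chart from data" could look like, the function below turns a list of (label, value) pairs into a simple horizontal bar chart in SVG. The function name, layout constants, and sample values are all assumptions made for illustration; a real tool would need axes, scales, and consistent styling.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, width=400, bar_height=20, gap=6, label_width=120):
    """Render (label, value) pairs with non-negative values as an SVG bar chart."""
    # Fall back to 1 so an empty or all-zero data set does not divide by zero.
    max_value = max((value for _, value in data), default=0) or 1
    height = len(data) * (bar_height + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(data):
        y = gap + i * (bar_height + gap)
        # Scale each bar relative to the largest value.
        bar_width = (width - label_width) * value / max_value
        parts.append(
            f'<text x="0" y="{y + bar_height - 5}" font-size="12">'
            f'{escape(label)}</text>'
        )
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{bar_width:.1f}" '
            f'height="{bar_height}" fill="#36c"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(bar_chart_svg([("Town A", 10), ("Town B", 5)]))
```

Because the chart is produced from the data rather than drawn by hand, updating it would only require feeding in new numbers.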

A related issue is how to handle text content that is drawn from data, such as the 2000 U.S. census data that Rambot added to many articles on small American towns. It will be outdated once the 2010 U.S. census results are available. How do we update it? The Census Bureau also conducts other surveys more frequently than the census, such as the American Community Survey, which could be useful for updating such statistics. Aude 01:25, 15 August 2009 (UTC)
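One possible approach, sketched here under the assumption that the statistics are available as structured records, is to keep the sentence as a template and fill it in from the data, so that a newer record regenerates the text. The field names (`year`, `population`, `households`) and the sample values are made up for illustration and do not describe how Rambot actually worked.

```python
def population_sentence(place, record):
    """Fill a fixed sentence template from a record of census figures."""
    return (
        f"As of the {record['year']} census, {place} had a population of "
        f"{record['population']:,} in {record['households']:,} households."
    )


if __name__ == "__main__":
    # Placeholder figures, not real census data.
    old = {"year": 2000, "population": 1234, "households": 456}
    new = {"year": 2010, "population": 1301, "households": 480}
    print(population_sentence("Exampleville", old))
    print(population_sentence("Exampleville", new))
```

Keeping the numbers separate from the prose would let a bot or template refresh the text whenever a newer data set is loaded.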


Maps, graphs, and charts are hand-made now, which makes for a tedious process and inconsistent results in terms of style, quality, etc. They are difficult to keep updated as the data behind them changes. It would be good to have tools in place that allow Wikimedians to generate and maintain such content in a sustainable and more user-friendly way.

Key Questions

Potential Costs

  • Need developers
  • Need a system administrator for the servers, including the OSM servers that we already have (people aren't readily volunteering to do it, and it takes a significant time commitment that volunteers don't necessarily have)
  • May need more hardware


Community Discussion

Do you have a thought about this proposal? A suggestion? Discuss this proposal by going to Proposal Talk:Data-driven content.

Want to work on this proposal?

  1. .. Sign your name here!