On Sat, Dec 8, 2012 at 6:33 PM, Srikanth Ramakrishnan <rsrikanth05(a)gmail.com
*Wikidata: Summing the sum of human knowledge*
*By Rohini Lakshané*
Have you ever wished you could query Wikipedia with a question? Have you
ever wanted to run a search on Wikipedia to extract data rather than trawl
through reams of Wikipedia pages and interlinks? Have you ever imagined a
world where everyone could freely pool knowledge even without sharing a
common language? Wikidata is set to do all this, and more.
Wikidata, deployed in October this year, is the newest project by the
Wikimedia Foundation, the organisation that runs Wikipedia and its sister
projects. It is a semantic, multilingual, structured database that anyone
can read and edit. It is machine readable, which makes searching, sorting
and handling of data efficient and fast. As with its siblings, Wikidata’s
content and software are published under a free license.
*Central repository of links*
Wikidata aims to streamline Wikipedia’s rich but chaotic linking
structure. Every Wikipedia page has ‘interwiki links’ that link it to other
Wikipedia pages. ‘Interlanguage links’ connect pages to other pages on the
same topic published on Wikipedias in different languages, creating a mesh
of links. With around 23 million articles in 285 languages, the linking
structure leads to immense duplication. Wikidata will act as a central
repository for all language links on all Wikipedias, with the ultimate
objective of levelling off the raw data in the language versions. Under the
*first phase<http://meta.wikimedia.org/wiki/Wikidata/Development/Sprint_archive…*
of Wikidata deployment, which is now
live<http://meta.wikimedia.org/wiki/Wikidata/Development/Sprint_archive&…ve>,
editors can add interlanguage links (sitelinks) on Wikidata pages.
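To see why a central repository cuts duplication, a rough count helps. The following is a minimal sketch, assuming that without a central store every language edition of a topic links directly to every other edition. The function names are hypothetical and are not part of Wikidata's software.

```python
def pairwise_links(editions: int) -> int:
    """Links needed when each edition's page links to every other edition."""
    return editions * (editions - 1)


def central_links(editions: int) -> int:
    """Links needed when each edition is recorded once in a central item."""
    return editions


if __name__ == "__main__":
    n = 285  # number of language editions cited in the article
    print(pairwise_links(n), central_links(n))
```

Under that assumption, a topic covered in all 285 languages needs 80,940 links maintained page by page, against 285 entries kept in one place.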
*Ask Wikipedia a question*
The *future phases<http://meta.wikimedia.org/wiki/Wikidata/Technical_proposal>*
of Wikidata deployment involve consolidating the data found in
*Infoboxes<http://en.wikipedia.org/wiki/Help:Infobox>* (December 2012 or
January 2013) and enabling the creation of lists on Wikipedia (March or
April 2013).
Infoboxes -- tables with the summary of the most important information on
the pages -- across Wikipedias are not uniform. Internationalised export
and import of data present in Wikipedia templates such as the Infobox would
greatly enrich its encyclopaedic value. All information present in all
Infoboxes will then come to one central repository from where it could be
updated everywhere else at the same time.
Wikipedia houses lists on assorted topics created by users -- episodes of
the most popular TV shows, heads of state, and even inventors killed by
their own inventions. However, it is currently not possible to query these
lists and cull out specific information. For example, searching the string
“countries with highest GDP” leads to a search results page with the
entries ‘List of countries with the highest GDP per capita’ and ‘Historical
list of ten largest countries by GDP’. But if you were to search for ‘ten
countries with highest GDP in Asia’ you wouldn’t quickly reach the relevant
information even though it exists in these lists. Wikidata will
automatically create a user-queried list to answer semantic queries such
as, “What were the ten countries with highest GDP in Asia last year?”
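Such a query becomes simple once the facts are stored as structured records rather than as prose. Here is a minimal sketch of the idea, assuming each country is a record with a continent and a GDP value. The record type, the function and the sample data are all invented for illustration; they are not Wikidata's data model or real figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryRecord:
    name: str
    continent: str
    gdp: float  # placeholder units, not real figures


def top_by_gdp(records, continent, limit=10):
    """Return up to `limit` records on `continent`, highest GDP first."""
    matching = [r for r in records if r.continent == continent]
    return sorted(matching, key=lambda r: r.gdp, reverse=True)[:limit]


# Placeholder data for illustration only; names and values are invented.
SAMPLE = [
    CountryRecord("Country A", "Asia", 300.0),
    CountryRecord("Country B", "Europe", 900.0),
    CountryRecord("Country C", "Asia", 750.0),
    CountryRecord("Country D", "Asia", 120.0),
]

if __name__ == "__main__":
    for record in top_by_gdp(SAMPLE, "Asia", limit=2):
        print(record.name, record.gdp)
```

The filter and the sort are the whole query; nothing has to be scraped out of a hand-written list page.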
Konarak Ratnakar, who has registered over 8,000 edits on Wikidata, says, *“I
found that Wikidata had very few India-related entries. So I started adding
information on Indian cities and the capitals of small countries. I like
Infoboxes very much, and the fact that this project would make Infoboxes
grow motivated me to contribute.” *
*Supporting smaller Wikipedias*
Of the 285 Wikipedias, only four have more than a million articles. Forty
Wikipedias have over 100,000 articles. Big Wikipedias have legions of
editors; most Wikipedias with a smaller article base have fewer speakers of
the language or less volunteer manpower. Wikidata aspires to enable smaller
Wikipedias to extend their tendrils to bigger ones, and benefit from the
vast amounts of language-independent data present on them. Having
up-to-date data at their disposal will potentially rope in and retain
editors and contributors on smaller Wikipedias, where researching and
writing voluminous content from scratch may be a laborious task. The web
and Wikipedia would get more content, which in turn, would become useful to
those who only speak the languages in which smaller Wikipedias exist.
*Verifiability and difference of opinion*
Wikidata can handle differences of opinion. If different sources of data
portray discrepant or conflicting information, Wikidata has the provision
to include such differences, and the different nuances of information. This
data will be accompanied by its source in keeping with one of Wikipedia’s
cornerstones of verifiability. Lydia Pintscher of Wikimedia Deutschland,
who handles community communications for Wikidata, explains, *“Say you
have the number of inhabitants of Israel. People don't agree on an exact
number for that for various reasons. In Wikidata it will be possible to add
all of them (within reason) with sources. So it'll not be about what is
right but about what some source says.”*
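One way to picture this is a statement that holds several sourced values side by side instead of a single "correct" one. The following is a minimal sketch under that assumption. The class names, the values and the source names are invented for illustration and do not reflect Wikidata's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    value: int
    source: str  # where the value was reported


@dataclass
class Statement:
    item: str
    prop: str
    claims: list = field(default_factory=list)

    def add_claim(self, value, source):
        """Record a value with its source; reject unsourced values."""
        if not source:
            raise ValueError("every claim needs a source")
        self.claims.append(Claim(value, source))

    def values_by_source(self):
        """Map each source to the value it reports."""
        return {c.source: c.value for c in self.claims}


# Placeholder values and source names, invented for illustration.
population = Statement(item="Example country", prop="population")
population.add_claim(1_000_000, "Source A")
population.add_claim(1_050_000, "Source B")
```

Both figures are kept, each tied to its source, so a reader or a program can choose between them rather than having one silently overwrite the other.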
*Pushing the envelope at Wikipedias and the web*
Wikidata is slated to expand. Once it is fully deployed, it will
apparently allow for more new articles to be created with fewer
contributions. Editors will not need to write full articles on topics that
already exist on other Wikipedias. Instead, they will need to enter data at
the specified data points to get an article with the preliminary framework
ready. Wikidata will also enable micro-contributions to articles that
already exist by allowing easy data entry for editors to expand or update
articles using the available datasets.
Machine-readable data facilitates automatic updating of content that is
currently done manually on Wikipedia, for example, the updating of census
figures or election results. Wikipedia lists too will be automatically
created and updated.
Editing on Wikidata is done through a form-based editor, instead of using
wiki syntax, the mark-up present on Wikipedia, which new editors generally
find tedious.
Once Wikidata is fully operational, it will act as an interchange point
for more Wiki projects, such as the Wikimedia Commons, a media repository.
Wikidata’s data is released under the Creative Commons CC0 public domain
dedication, which frees it of copyright restrictions. As the Wikidata API is freely
available and the software is open licensed, you could run your own
instance. The availability of interwiki data and API opens up many
possibilities -- data extraction, data mash-ups, using interwiki metadata
for regional or localised Wikipedias, and expanding the scope of apps such
as the *Interwiki Redirect Service <http://interwikiredir.appspot.com/>*.
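As a rough illustration of what data extraction through the API can look like, the sketch below only builds a request URL for an item's sitelinks and makes no network call. The endpoint and the `wbgetentities` parameters are taken from the present-day public Wikidata API rather than from this article, so treat them as assumptions; `Q42` is just an example item identifier, and `sitelinks_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Wikidata API (not stated in the article).
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def sitelinks_url(item_id: str) -> str:
    """Build a wbgetentities request URL for one item's sitelinks."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "sitelinks",
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(sitelinks_url("Q42"))
```

Fetching that URL with any HTTP client should return JSON listing the pages linked to the item on each wiki, which is the raw material for the mash-ups described above.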
Pintscher says, *“In the near future we'll see the roll-out of phase 2 of
Wikidata and then the adaption of that on the Wikipedias. With time I hope
that more and more players outside Wikimedia will make use of Wikidata and
build great things on top of it. I'm sure they'll come up with many things
we haven't even thought of yet.”*
Wikidata has the potential to be a repository of repositories where the
sum total of human knowledge coalesces, a database that glues together
Wikipedias of all 285 (and counting) languages.
Important Note: Non-commercial reproduction for informative purposes only.
The publisher (Tech2) of the above news article owns the copyrights
of the article/content. All copyrights are duly acknowledged.
--
Regards,
Srikanth Ramakrishnan,
Sathyamangalam-Gobichettipalayam Star Gazers, Coimbatore.
Please sign this petition for Volvo buses in Coimbatore:
https://www.change.org/en-IN/petitions/the-transport-minister-tamil-nadu-in…
_______________________________________________
Wikimediaindia-l mailing list
Wikimediaindia-l(a)lists.wikimedia.org
To unsubscribe from the list / change mailing preferences visit
https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l