Ashar Voultoiz wrote:
<snip attempt to rescue the thing>
So when some site wants to use WikiData, it sends a query to the
WikiData server, keyed by its internal reference (e.g. the name of
the Wikipedia article and its language). WikiData then sends back the
requested data together with WikiData's internal reference.
When a WikiData entry is changed, the server sends a ping, carrying
the update, to every site referencing that set of data. Each site
using the data then answers WikiData with a code:
1/ data change acknowledged.
2/ no more need for this data, remove me.
3/ doesn't answer.
If a site doesn't answer, there could be a system that queues the ping
so it can be sent again later (and eventually drops it after x days).
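For concreteness, a rough sketch of that ping/ack loop (all names here
are made up; it assumes each subscriber answers a ping with one of the
three codes above):

    import time

    ACKNOWLEDGED = 1   # data change acknowledged
    UNSUBSCRIBE  = 2   # no more need for this data, remove me
    NO_ANSWER    = 3   # site did not answer

    MAX_AGE_DAYS = 7   # queued pings are dropped after x days

    class PingQueue:
        """Re-delivery queue for sites that did not answer a ping."""
        def __init__(self):
            self.pending = []  # (queued_at, site, update)

        def push(self, site, update):
            self.pending.append((time.time(), site, update))

        def retry(self, send):
            still_pending = []
            for queued_at, site, update in self.pending:
                if time.time() - queued_at > MAX_AGE_DAYS * 86400:
                    continue  # too old, drop it
                if send(site, update) == NO_ANSWER:
                    still_pending.append((queued_at, site, update))
            self.pending = still_pending

    def notify_subscribers(subscribers, update, send, queue):
        """Ping every site referencing the changed set of data."""
        for site in list(subscribers):
            code = send(site, update)
            if code == UNSUBSCRIBE:
                subscribers.remove(site)   # "remove me"
            elif code == NO_ANSWER:
                queue.push(site, update)   # try again later
            # ACKNOWLEDGED needs no further action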
That will work nicely if we restrict WikiData access to "show me that
specific row from that specific table in that specific database",
which is fine for "show me data on that species".
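In that restricted model, a pull is nothing more than a keyed
single-row lookup; roughly (table and column names are hypothetical):

    import sqlite3

    def fetch_record(db_path, table, article, lang):
        """Pull one specific row, keyed by the caller's internal
        reference (Wikipedia article name + language)."""
        con = sqlite3.connect(db_path)
        try:
            # a table name cannot be a placeholder; assume it is
            # validated against a whitelist upstream
            cur = con.execute(
                f"SELECT * FROM {table} WHERE article = ? AND lang = ?",
                (article, lang),
            )
            return cur.fetchone()
        finally:
            con.close()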
But as soon as we allow queries to return lists (e.g., "show me all
species of that family"), that no longer works. Suppose someone adds
a species to WikiData. How can we know that a Wikipedia page needs to
be updated?
Only one way to do that (sketched below):
* Store the original query, the Wikipedia page it came from, and its
results
* On every WikiData change, rerun *all* these queries, compare their
results to the stored ones, and notify the Wikipedias if necessary
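A rough sketch of that bookkeeping, where run_query and notify are
hypothetical hooks into the query engine and the notification channel:

    # query text -> (Wikipedia page that depends on it, last known result)
    stored_queries = {}

    def register(query, page, run_query):
        """Remember the query, its owning page, and its current result."""
        stored_queries[query] = (page, run_query(query))

    def on_any_wikidata_change(run_query, notify):
        """After *any* edit: rerun every stored query and notify the
        owning page whenever the result set has changed."""
        for query, (page, old_result) in stored_queries.items():
            new_result = run_query(query)
            if new_result != old_result:
                stored_queries[query] = (page, new_result)
                notify(page, new_result)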
Rerunning a million queries for each data change will dwarf any
traffic that pull could generate (pull isn't really better either;
that's the dilemma).
Also, pushing will require extensive infrastructure on the recipient's
site, which is not necessarily a Wikimedia project (the data should be
available to everyone).
Magnus