[Wikitech-l] Re: WikiData: A rant

22 Oct 2004

Magnus Manske wrote:
<snip stop depressing!>
...
 As good "wiki-fiddlers" (thanks so much, Register!) we would like to see
 every change in WikiData on the wikipedia pages real soon. Like, now.
 So the information that something changes, and what changed, has to pass 
 from the data site to the display site. There are two ways to do that: 
 push or pull.

 PUSH means the data site will notify the display site that something has 
 changed, and the display needs to be updated. For that, the data site 
 has to know which pages of the display site are affected by which 
 change. Then, it has to notify the display site of this. Bad things:
 * Needs basically a cache of *all* queries *ever* asked of the data 
 site, as well as their results
 * Has to recalculate *all* of these after *every* change to find which 
 queries produce different results
 * Won't work if the display site is offline
 * Won't work well with non-wikipedias

 That can't be it. <snip PULL>

Hello,

I would personally PUSH data from WikiData to the content 
publishers (like Wikipedia).

A lot of blog systems have a feature known as trackback. When someone 
publishes an article which contains references to other blogs, their blog 
system sends a ping (often an XML-RPC ping) to the referenced blogs, 
alerting them that their news got reused somewhere.

Simple example:
  The blog slashdot publishes a story about NASA discovering martians.

  MartianFan001, who is part of a "Life on Mars foundation", decides to 
publish a story about it and references slashdot.

  JohnDoe, who likes things about Mars, decides to publish a story on his 
personal blog, and his article is something like:

  <<The Mars foundation [http://marsfoundation/newsid/113] reports a story 
originally posted by [http://slashdot/?newsid=123912 slashdot] about 
life on Mars!>>

He submits that story to his blog engine, which parses the links and tries 
to send pings to marsfoundation and slashdot saying:
   johndoe.com/newsid=5 references your article!

When receiving this ping, the marsfoundation and slashdot blogs can update 
their trackback lists:

slashdot news #123912 referenced by:
     "GeekHideout", "Nerds.com", "Mars foundation", "JohnDoe"

Marsfoundation news #113 referenced by:
     "JohnDoe"
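
The trackback flow above can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions: the names LINK_RE, trackbacks, 
receive_ping and publish are hypothetical, the "ping" is a plain function 
call instead of a network request, and no real blog engine is being described.

```python
import re
from collections import defaultdict

# Matches bracketed links such as [http://slashdot/?newsid=123912 slashdot]
# and captures only the URL part (hypothetical link syntax, taken from the
# example article above).
LINK_RE = re.compile(r"\[(http://[^\s\]]+)")

# trackbacks[target_url] -> list of article URLs that reference it
trackbacks = defaultdict(list)


def extract_links(text):
    """Return every URL referenced in an article body."""
    return LINK_RE.findall(text)


def receive_ping(target_url, source_url):
    """What the referenced blog does when a ping arrives."""
    if source_url not in trackbacks[target_url]:
        trackbacks[target_url].append(source_url)


def publish(source_url, text):
    """What the publishing blog engine does: parse links, ping each target."""
    targets = extract_links(text)
    for target in targets:
        # In a real system this would be a network request to the target blog.
        receive_ping(target, source_url)
    return targets
```

Publishing JohnDoe's article with publish("johndoe.com/newsid=5", article) 
would add his URL to the trackback lists of both referenced stories.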

So when some site wants to use WikiData data, it sends a query to the 
WikiData server together with its own internal reference (e.g. the name of 
the Wikipedia article and its language). WikiData then sends back the 
requested data along with the WikiData internal reference.
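
A minimal sketch of that registration step, assuming an in-memory server. 
The class WikiDataServer, its fields and the identifiers used are all 
hypothetical; nothing here is an existing WikiData or MediaWiki API.

```python
from collections import defaultdict


class WikiDataServer:
    """Toy data site that remembers who asked for which data set."""

    def __init__(self):
        self.data = {}                       # data_id -> value
        self.subscribers = defaultdict(set)  # data_id -> {(site, local_ref)}

    def query(self, data_id, site, local_ref):
        """Answer a query and record the caller as a user of this data.

        `local_ref` is the caller's own internal reference, e.g. an
        (article name, language) pair on a Wikipedia.
        """
        self.subscribers[data_id].add((site, local_ref))
        # The reply carries both the data and the data site's own reference.
        return {"data_id": data_id, "value": self.data[data_id]}
```

Because subscribers are recorded at query time, the data site only has to 
track which data sets are in use, not every query ever asked.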

When a set of data changes in WikiData, the WikiData site sends a ping with 
the update to every site referencing that set of data. From there, the site 
using the data will either answer WikiData with a code or stay silent:
  1/ data change acknowledged.
  2/ no more need for this data, remove me.
  3/ doesn't answer.

If it doesn't answer, there could be a system that queues the ping so it 
can be sent again later (and eventually be dropped after x days).
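
The three outcomes and the retry queue could look roughly like this. It is a 
sketch under assumptions: Reply, deliver, retry and MAX_AGE_DAYS are made-up 
names, the 7-day limit is an arbitrary stand-in for "x days", days are plain 
integers, and `send` is whatever function actually contacts the remote site.

```python
from collections import deque
from enum import Enum


class Reply(Enum):
    ACK = 1        # 1/ data change acknowledged
    REMOVE = 2     # 2/ no more need for this data, remove me
    NO_ANSWER = 3  # 3/ the site did not answer


MAX_AGE_DAYS = 7  # the "x days" above; the value is arbitrary


def deliver(subscribers, pending, send, data_id, update, today):
    """Ping every subscriber; unsubscribe or queue depending on the reply."""
    for site in list(subscribers):
        reply = send(site, data_id, update)
        if reply is Reply.REMOVE:
            subscribers.discard(site)
        elif reply is Reply.NO_ANSWER:
            pending.append((site, data_id, update, today))


def retry(subscribers, pending, send, today):
    """Resend queued pings once; drop those older than MAX_AGE_DAYS."""
    for _ in range(len(pending)):
        site, data_id, update, queued_on = pending.popleft()
        if today - queued_on > MAX_AGE_DAYS:
            continue  # too old: give up on this ping
        reply = send(site, data_id, update)
        if reply is Reply.REMOVE:
            subscribers.discard(site)
        elif reply is Reply.NO_ANSWER:
            # Keep the original queue date so the ping still expires.
            pending.append((site, data_id, update, queued_on))
```

Here `subscribers` is a set of site names and `pending` is a deque, so an 
offline display site costs only one queued entry per missed update.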

I believe the PULL method will generate too much traffic for data which 
is probably not going to change between each view. Data about 
species is probably much more stable than NASDAQ stock quotes.

cheers,

-- 
Ashar Voultoiz - WP++++
http://en.wikipedia.org/wiki/User:Hashar
