Re: [Wikitech-l] WikiData: A rant

21 Oct 2004

Jens Ropers wrote:

...
  Good acumen Magnus. A very incisive "rant".

Thanks.

...

 Anyway, just musing and mulling:
 Could the PULL method be implemented w/ a checksum?
 Say, generate a fairly short checksum (we're talking versioning here, 
 not security) for every article revision.
 Then, with each request hitting an Intarweb-facing (caching) 
 webserver, have that cache box look if there's a version stored in its 
 cache for said article (if not, fetch the article from the actual DB, 
 etc.). IF however there is a cached version, ask the DB server for its 
 current checksum on its current version. If this matches the checksum 
 the cache has for its version, just don't bother the DB any further 
 and serve the page from the cache. If the checksums differ, then again 
 fetch the article from the DB and serve that (and cache the new 
 article and checksum for potential subsequent requests). This entire 
 checksum thing will NOT be required for any cached non-current 
 revisions, because they won't change. So, yes, for each request 
 hitting the cache server, there'd be a short checksum PULL with the 
 actual DB server, but other than that (and provided the article hasn't 
 changed) it can just be served from the cache. 
So, the DB server keeps a list with a checksum (or a version number; 
this is supposed to be wiki-like) for each data entry, and the cached 
article likewise keeps one, right?
What if there is more than a single data entry in that article? Like the 
list of species I mentioned. Say, a new species was added at wikidata; 
how to handle that one?
What if there are multiple queries in one article?
What if (in my example) the actual query is in a template?
What if that template includes other templates that contain queries?
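
For reference, the simple single-article case of the proposal quoted 
above could be sketched roughly like this. It is only an illustration: 
ArticleDB, CacheServer and the counters are made-up names, not anything 
in MediaWiki, and CRC32 merely stands in for "a fairly short checksum".

```python
import zlib


def short_checksum(text):
    """Short, non-cryptographic checksum (versioning, not security)."""
    return zlib.crc32(text.encode("utf-8"))


class ArticleDB:
    """Hypothetical stand-in for the real DB server."""

    def __init__(self):
        self._text = {}
        self.checksum_queries = 0  # cheap round trips
        self.full_fetches = 0      # expensive round trips

    def save(self, title, text):
        self._text[title] = text

    def checksum(self, title):
        # The small PULL that happens on every cached page view.
        self.checksum_queries += 1
        return short_checksum(self._text[title])

    def fetch(self, title):
        # Full fetch: returns the text together with its checksum.
        self.full_fetches += 1
        text = self._text[title]
        return short_checksum(text), text


class CacheServer:
    """Hypothetical web-facing cache box."""

    def __init__(self, db):
        self.db = db
        self._cache = {}  # title -> (checksum, text)

    def get(self, title):
        cached = self._cache.get(title)
        if cached is not None and cached[0] == self.db.checksum(title):
            # Checksums match: serve from cache, no full fetch.
            return cached[1]
        # Not cached, or stale: fetch, then cache text and checksum.
        checksum, text = self.db.fetch(title)
        self._cache[title] = (checksum, text)
        return text
```

Note that even a cache hit costs one checksum round trip to the DB, 
and that this covers exactly one checksum per article, nothing more.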

Yes, I think that it could be done. But, and I say that as someone who 
started programming with "spaghetti code", it looks like a mess to me. A 
dependency nightmare. We are already suffering from such effects (think 
categories in templates) without wikidata to look out for.
Also, you will have to query the DB server and wait for its answer on 
*every* page view, including cached/anons, to deliver the checksum(s).
And, this will work only with the most rudimentary database structure, 
like "SELECT * from specieslist where name='Foo'". If wikidata is to 
become more complex than that (and I don't say it should, just 
speculating), if wikidata tables can be interlinked, then there will be 
no "simple" dependency on a single data entry anymore.
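
To illustrate the dependency problem, here is a rough sketch of what 
the cache would have to work out before it could compare anything: 
every query reachable from the article through nested templates, 
folded into one combined checksum. The dictionaries (includes, 
queries, result_versions) and the function names are assumptions made 
up for this example, not an existing schema.

```python
import zlib


def collect_queries(page, includes, queries, seen=None):
    """Return every query reachable from `page` via template inclusion.

    includes: page -> list of templates it includes (hypothetical)
    queries:  page -> list of data queries it contains (hypothetical)
    """
    if seen is None:
        seen = set()
    if page in seen:
        return set()  # already visited; also guards against template loops
    seen.add(page)
    found = set(queries.get(page, ()))
    for template in includes.get(page, ()):
        found |= collect_queries(template, includes, queries, seen)
    return found


def combined_checksum(page, includes, queries, result_versions):
    """Fold the version of every dependent query result into one checksum.

    result_versions: query -> version number of its current result set
    (hypothetical; it would have to be bumped when a row is added, too).
    """
    deps = sorted(collect_queries(page, includes, queries))
    payload = "|".join(f"{q}={result_versions[q]}" for q in deps)
    return zlib.crc32(payload.encode("utf-8"))
```

The article itself never changed in such a case, yet its cached copy 
is stale as soon as any one of those result sets gets a new version.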

...
  Does that make sense to people?
 Or am I reinventing the wheel or something?
 I'm just brainstorming, I'm not even a real programmer. (Translation: 
 The above may be--or may not be--rubbish.) 
Definitely not rubbish. But a lot more complicated than it looks at 
first glance, IMHO.

Magnus
