On Tue, Apr 21, 2009 at 10:13 AM, Daniel Kinzler <daniel(a)brightbyte.de> wrote:
Magnus Manske wrote:
I agree about Semantic MediaWiki, which is a
different beast (and
might one day be used on Wikipedia).
That's really the question. Should we work *now* on making it usable for
wikipedia, or should we focus on something simpler?
IMHO we should try to harvest the data that is already in Wikipedia
first. Semantic Wikipedia, technical issues aside, relies heavily on
users learning a new syntax, which is a community (read: political;-)
decision. And it will be fought about much harder and longer than the
license question...
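To make "harvesting" concrete: pulling the key/value pairs out of templates already in articles amounts to something like the sketch below. The function name and the simplified template grammar are illustrative only; real wikitext has nesting, unnamed parameters, and parser functions, and would need a proper parser.

```python
import re

def template_params(wikitext):
    """Extract (template, key, value) triples from simple, non-nested
    {{Template|key=value|...}} calls.  A rough sketch only: real
    template syntax needs a full parser."""
    triples = []
    # Match {{Name|...}} with no nested braces inside.
    for m in re.finditer(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}", wikitext):
        name = m.group(1).strip()
        # split("|")[1:] skips the empty piece before the first pipe
        for part in m.group(2).split("|")[1:]:
            if "=" in part:
                key, _, value = part.partition("=")
                triples.append((name, key.strip(), value.strip()))
    return triples

text = "{{Infobox person|name=Ada Lovelace|born=1815}}"
print(template_params(text))  # two (template, key, value) triples
```

Run over a full dump, every such triple becomes one row, which is where the 200-million-row estimate comes from.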
The question seems to be scalability. Extrapolating from my sample data
set, just the key/value pairs of templates directly included in
articles would come to over 200 million rows for en.wikipedia at the
moment. A MediaWiki-internal solution would want to store templates
included in templates as well, which can be a lot for complicated
meta-templates. I think a billion rows for the current English
Wikipedia is not too far-fetched in that model. The table would be
both constantly updated (potentially hundreds of writes for a single
article update) and heavily searched (with LIKE "%stuff%", no less).
Would the RDF extension be up to that?
It would in a way: it just wouldn't store all parameters. It would store only
things explicitly defined to be RDF values. That would greatly reduce the number
of parameters to store, since all the templates used for maintenance, formatting,
styling and navigation can be omitted. It would be used nearly exclusively for
infobox-type templates, image meta-info, and cross-links like the PND template.
Or at least, that's the idea. It also does away with problems caused by the
various names a parameter with the same meaning may have in different templates
(and different wikis).
Nice! I was thinking along the lines of a template
whitelist/blacklist, but yours would be much more efficient. And it
would hide most of the technical "ugliness" in the templates.
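The whitelist idea I had in mind would look roughly like this (a sketch; the structures and names are hypothetical, not the RDF extension's actual interface):

```python
# Hypothetical whitelist: only (template, parameter) pairs listed here
# would ever reach the database, cutting maintenance, styling and
# navigation templates out entirely.
WHITELIST = {
    ("Infobox person", "name"),
    ("Infobox person", "born"),
    ("PND", "id"),
}

def filter_params(triples):
    """Keep only (template, key, value) triples whose template/key
    pair is on the whitelist."""
    return [t for t in triples if (t[0], t[1]) in WHITELIST]

params = [
    ("Infobox person", "name", "Ada Lovelace"),
    ("Navbox", "style", "wide"),  # navigation cruft, dropped
]
print(filter_params(params))  # only the Infobox row survives
```

Daniel's explicit-declaration approach inverts this: instead of a central list, each template opts its own parameters in, which also solves the naming problem across templates and wikis.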