Re: [Wikitech-l] Interface embarrassment rant

21 Feb 2008

On Thu, Feb 21, 2008 at 11:11 AM, Roan Kattouw &lt;roan.kattouw(a)home.nl&gt; wrote:
...
   Although we may want to add cl_is_category,
retrieving subcategories
  only is already possible with the current schema:

  SELECT whatever FROM categorylinks JOIN page ON page_id=cl_from AND
  page_namespace=14 WHERE cl_title='Foo' AND cl_namespace=0;

  No idea how efficient this query is, maybe it's evil and uses 57
  filesorts, who knows ;) 
No filesorts, since nothing has to be sorted (no ORDER BY or GROUP
BY).  It will just scan every single categorylinks row corresponding
to the given category.  It won't be able to use an index for the
entire query, either: it will have to hit the page table data blocks.
Probably it will take 30 seconds or more on Wikimedia servers for
Living people or other large categories, given that just scanning the
index for all those rows (no joins, no hitting data blocks) is five
seconds.  Could be a few minutes, I guess.  Either way it's not
remotely acceptable.

That such things are possible to do with the current database is
obvious.  After all, we store all the relevant information: there's
got to be some way to retrieve it, somehow, even if it requires
querying all tables in their entirety and manually processing every
row in PHP.  So the only real question in these things is efficiency.

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [Wikitech-l] Interface embarrassment rant