Jeremy Dunck wrote:
On 9/10/05, Timwi <timwi@gmx.net> wrote:
Do we still query the database for all these links to decide whether they exist and (depending on user settings) whether they are stubs? As I recall, someone once had an idea to write a simple, light-weight daemon for this that uses a dictionary data structure (I presume they were thinking of a DAWG (directed acyclic word graph)) for this.
I'm not familiar with DAWG, but why wouldn't a nested sparse hash work? link[from-id][to-title] = 0 if no link, 1 if link.
Well, firstly, we're not interested in links between pages here, but merely in whether a page exists and, if so, whether it is a "stub".
The answer to your question is that a hash table uses much more memory. The page you linked to says it has "2 bits per entry overhead", so the entire hash table would be at least the size of the list of all article titles plus two bits per article. A DAWG, by contrast, is usually much smaller than the plain list of all its entries, because common prefixes and suffixes are shared between words.
The downside, of course, is that in a hash you can store an arbitrary value with each key, while a DAWG is just a binary (yes/no) decider.
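To make the yes/no-decider point concrete, here is a minimal sketch (not the proposed daemon) of a trie-based membership test in Python. A trie shares common prefixes between titles; a real DAWG would additionally merge nodes with identical suffixes, which is what makes it smaller than the plain list of entries:

```python
# Minimal trie supporting only membership queries (yes/no), like a DAWG.
# A full DAWG would also merge subtrees with identical suffixes to save
# further memory; this sketch only shares prefixes.

class Trie:
    def __init__(self):
        self.root = {}

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-word marker

    def __contains__(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return "$" in node

# Hypothetical sample titles, just for illustration.
titles = Trie()
for t in ["Main Page", "MediaWiki", "MediaWiki talk"]:
    titles.add(t)

print("MediaWiki" in titles)  # True
print("Media" in titles)      # False -- a mere prefix is not a title
```

Note that, as discussed, the structure answers only "does this title exist?"; attaching extra data (such as a stub flag) to each entry would give up part of the memory advantage over a hash table.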
Timwi