Hello, everyone
Rendering a Wikipedia article requires, for example, looking
up all the links it contains and determining whether those
pages exist.
What else do we need to render wiki-marked-up text? Aside
from images, I don't see any other parts that require access
to the main index.
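To make the cost concrete, here is a minimal sketch of that
link-existence lookup, using an in-memory SQLite table as a
stand-in (the real MediaWiki schema and queries are different;
the table and column names here are made up). The point is that
one batched query per article beats one query per link:

```python
import sqlite3

# Hypothetical, simplified stand-in for the real page index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (title TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO page VALUES (?)",
                 [("Physics",), ("Chemistry",)])

def existing_pages(links):
    """One batched query for the whole article,
    instead of one query per link."""
    placeholders = ",".join("?" * len(links))
    rows = conn.execute(
        "SELECT title FROM page WHERE title IN (%s)" % placeholders,
        links)
    return {title for (title,) in rows}

# The renderer colors each link depending on existence:
links = ["Physics", "Alchemy", "Chemistry"]
exists = existing_pages(links)
rendered = {t: ("blue" if t in exists else "red") for t in links}
```

So a page with hundreds of internal links still costs only one
round trip to the index, which is presumably part of what query
optimization buys us.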
Actually, we had a *decrease* in traffic in the last
month
due to the
Google hiccups.
I know this.
We should be able to cope with much higher traffic if we
optimize our queries. Note that pure bandwidth is not a
problem; the
database tarball downloads are very fast. Ask Brion for
the
server specs
and be impressed.
OK, so the problem is CPU usage or database optimization.
I think you are right. If the database and the queries to it
are too complex, decentralizing will hardly improve
performance.
By the way, note that I am talking not only about
performance but also about scalability and extensibility.
What about such a case?
2) You cite Google as an example of a huge centralized
database
No, I cited it as an example of a 'decentralized' database.
Trust me, wiki is *very* hard to decentralize. It's a nice
idea, but it will take years until it happens. You need an
architecture like Freenet ( http://freenetproject.org ),
only scalable (which Freenet is not), plus SQL-like query
support.
While it may be only a nice idea, it seems to me the only
solution eventually.
I think I finally see the gap in understanding. I was
talking about a years-long project: the time when Wikipedia
reaches the next milestone of 1,000,000 articles
(million-pedia? haha). It seems that eventually we will have
to go down the same path that big sites like Google or
Amazon did. As you and I know, Google is heavily
decentralized, and that is one of its strengths. I bet you
know about load balancers (I know almost nothing about them,
though). As far as I know, most huge sites decentralize onto
many mini servers, as Google does. We, of course, don't have
the finances to sustain such a huge decentralized data
center. But we do have a decent, democratic community, and
that community is our strength.
As a wikipediaholic (ach!), I am quite worried about the
future of Wikipedia in terms of the servers. We definitely
need a better solution (not necessarily my
decentralized-server idea). Possible solutions include
Pieter's proposal of published scripts, or a better website
for Wikipedia developers. (It seems only a few people are
actually coding for Wikipedia, compared with the scale of
Wikipedians writing articles.)
wiki is *very* hard to decentralize
I knew this. But can we figure out how, here? I am not
saying you must do what I propose. I can cooperate, of
course, and I guess tons of skilled programmers can too.
If you think this debate is a total waste of time (I mean,
if I, who know little, am annoying this mailing list), let
me know and I will quit; if not, please give me a comment.
Okay, how about this? It seems to me that one of the core
problems is that rendering requires queries about whether
each linked page exists. I remember a post saying that
rendering a page containing many internal links is one of
the bottlenecks (I suppose it is still true).
First, each mini-wikipedia has a database of which pages
exist, and subscribes to the list of newly created pages.
Also, when it is launched, it downloads the complete list of
existing pages. Since each mini server then knows which
pages exist from its own mini-database, it can render a page
without querying the main database.
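If I understand my own proposal right, it could be sketched
roughly like this (everything here is hypothetical: the class,
the bootstrap list, and the new-page feed are illustrations,
not a real API):

```python
class PageExistenceCache:
    """Hypothetical local cache kept by each mini-wikipedia,
    so link-existence checks never hit the main database."""

    def __init__(self, initial_titles):
        # Downloaded once at launch: the complete list of pages.
        self.titles = set(initial_titles)

    def apply_new_page_event(self, title):
        # Received from the subscription feed whenever
        # a new page is created on the main site.
        self.titles.add(title)

    def exists(self, title):
        # Answered entirely locally.
        return title in self.titles

# Usage sketch:
cache = PageExistenceCache(["Main Page", "Physics"])
cache.apply_new_page_event("Chemistry")
```

One gap I can see in my own sketch: page deletions and renames
would need their own events in the feed, or the local answers
slowly go stale.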
As a disclaimer, I am probably wrong. But if you can, could
you tell me why and how?
Anyway, I now understand that Wikipedia can still be
optimized a lot (which I didn't know when I first posted my
proposal). Sure, if we can optimize the database and
hopefully gain a huge increase in performance, we should of
course head there first. The priority should be optimizing
the database. I agree.
Anyhow, I appreciate your detailed explanation, to someone
who knows really little, of the problems we face now.