Re: [Wikitech-l] Postgres

7 Jan 2003

      On mar, 2003-01-07 at 14:40, Tomasz Wegrzanowski wrote:
...
I played a bit with database.
Here is something that looks like Wikipedia database schema but
compiles in Posgres (and no, Wikipedia script won't be able to use
that yet):
:)))) Yay!!!!!!  I *like* it when suggestions come with actual attempts
at preliminary code.
...
Of course, this isn't "real" code, and many changes will be necessary
to make it any useful. Anyway:

user and old are reserved keywords in Postgres

Grrr! Well, 'user' can become 'users' easily enough. ;)
I'm also considering whether 'cur' and 'old' should be split up
differently, into 'pageinfo' (title, restrictions, counter, redirect)
and 'revisions' (text, comment, date, user, minor). This would perhaps
be less radical than Jonathan's suggestion of a raw title<->numeric id
table, but should simplify operations that work on both current and old
revisions. (May need a shortcut for grabbing the current revision,
perhaps a key from the pageinfo.)
...

why are we using tinyblob(1) instead of bool for boolean data ?

Good question.
...

why are we creating so many indexes ? are all of them necessary ?

On cur and old, we need indexes on title (to look up individual pages),
timestamp (for sorting), and user (for contribs). Fulltext indexes for
searching on the special index fields. Namespace presumably helps along
with title?
On link tables, we're keying to indexed fields, and have a *lot* of rows
which we may search from either end (whatlinkshere, orphan checks,
wantedpages, etc etc).
Indexes on ipblocks table are probably useless, as it's very small.
User table has index on user name; helps with sorting the big user list.
(What about lookup of options on every page view by logged in user? Is
this number or name?)
Size, name, and timestamp fields in the image table may be used for
sorting the huge and growing list o' files.
...

shouldn't we use database-provided timestamps instead of varchar(14) ?

As long as we know how to manipulate them programmatically and format
them for the selected language and time zone...
...

rights and restrictions don't seem to be represented very efficiently.
  using integer flag for that seems like a better idea to me

Maybe. It's not really a bottleneck, though, and legibility isn't
something to be hated.
...

it might be good idea to consider making our own searching system
  instead of relying on something provided by database

Hell, why rely on a database? Why not write our own storage system? And
filesystem... and virtual memory... and process manager... :)
We shouldn't have to be a wheel factory; if a good, fast, flexible
search mechanism is available in our database we should use it.
-- brion vibber (brion @ pobox.com)

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [Wikitech-l] Postgres