Hi!
Please read my comment over again: "I can't imagine this is a query you want to run over and over again. If it is, you'd probably want to use partitioning."
Which would make sense if no other queries are being run :) With PG, though, you can define an index on a smaller subset of rows, which may be better than partitioning.
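A rough sketch of the "index on a smaller subset" idea, using Python's bundled sqlite3 module only as a stand-in so that it runs without a server. SQLite happens to accept the same CREATE INDEX ... WHERE form as PG for this simple case. The table, column and index names are made up for illustration, and the planner output will differ on a real PG instance.

```python
import sqlite3

def plan_for(conn, sql):
    """Return the plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)

def build_demo():
    """Create a hypothetical table with a partial index on non-redirect rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT, is_redirect INTEGER)"
    )
    conn.executemany(
        "INSERT INTO page (title, is_redirect) VALUES (?, ?)",
        [("Foo", 0), ("Bar", 1), ("Baz", 0)],
    )
    # Partial index: only rows with is_redirect = 0 are indexed.
    conn.execute(
        "CREATE INDEX idx_page_title_live ON page (title) WHERE is_redirect = 0"
    )
    return conn

conn = build_demo()
# This query repeats the index predicate, so the partial index is usable.
covered = plan_for(conn, "SELECT id FROM page WHERE is_redirect = 0 AND title = 'Foo'")
# This query does not imply the predicate, so the partial index cannot be used.
uncovered = plan_for(conn, "SELECT id FROM page WHERE title = 'Foo'")
print(covered)
print(uncovered)
```

The index only helps queries whose WHERE clause implies the index predicate, which is the trade-off against partitioning the whole table.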
The word "DELETE" does not appear anywhere on that page I referred to. The examples on the page are all SELECTs. Try again.
Argh, damn terminology, I was thinking about partition drops. Anyway, those SELECTs are 'faster' if they hit the partition key, but then, people usually use PKs as their partition keys, so it doesn't really matter :-)
Partitions will make queries faster for people who don't have indexes (that is actually the major use case for people doing DW)
I suspect you either know the answers to these questions or can easily look them up.
Oh well, PG added collation support in 'CREATE DATABASE' in 8.4, and those collations still rely on the system ones, which are far from perfect (how many applications actually use system collations?)
Is there a particular problem you're having with them which is unsuitable for Wikipedia?
*shrug*, wrong native collations? Not using locale-specific character locality in unique matching (haha, I could use this argument the other way round when talking about MySQL support :), etc.
Does Wikipedia not use a separate database for each language?
In PG terminology, that would be a 'separate schema', which doesn't really support separate charsets/collations. Though of course, using separate DBs/instances is what we do now.
Sorry, I can't reproduce your error:
Because you didn't read what I wrote. I wrote that I was using a language-specific collation :) A generic collation will also fail on other characters (e.g. š will be mapped to s, when it should be treated as a separate letter).
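To make the point about 'š' (s with caron) being folded into 's' concrete, here is a small sketch in plain Python. It is not what PG or any system locale does internally: generic_key simply strips combining marks, and lithuanian_key is a simplified alphabet-order key I made up for illustration (real collations have more levels than this).

```python
import unicodedata

# Lithuanian alphabet order, in which "š" is a letter of its own after "s".
# Using it as a flat sort key is a simplification of real collation rules.
LT_ALPHABET = "aąbcčdeęėfghiįyjklmnoprsštuųūvzž"
LT_RANK = {ch: i for i, ch in enumerate(LT_ALPHABET)}

def generic_key(word):
    """Accent-insensitive key: drops combining marks, so 'š' becomes 's'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def lithuanian_key(word):
    """Alphabet-order key: 'š' ranks between 's' and 't'."""
    # Characters outside the alphabet sort after all known letters.
    return [LT_RANK.get(ch, len(LT_ALPHABET)) for ch in word.lower()]

words = ["šuo", "tiltas", "sodas"]
by_codepoint = sorted(words)
by_alphabet = sorted(words, key=lithuanian_key)

# Under the generic key these two distinct words collide, which is what
# would break a unique index built on such a collation.
collide = generic_key("šaka") == generic_key("saka")
distinct = lithuanian_key("šaka") != lithuanian_key("saka")
print(by_codepoint, by_alphabet, collide, distinct)
```

Binary (code point) order puts 'š' after 't', the generic key merges it with 's', and only the language-specific key keeps it as a separate letter in the right place.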
I suspect operator error, but if you want to submit your bug to http://www.postgresql.org/support/submitbug I'm sure someone will go over it with you.
It was a collation error, not an operator error. I just showed it to illustrate my point that there's quite some work involved in getting working collations (which usually means building locales yourself). Do note that once you have indexes in place, any locale change is really painful and requires a full database rebuild. One of the reasons we're still 'binary' is that nobody really wants to own the pain of maintaining charsets server-side. At our scale, it is a much bigger project than most people realize. Of course, one may just choose to believe that there's a silver bullet for everything.
Cheers, Domas
P.S. Where is PG's replication? How does it deal with DDL? :)
P.P.S. Anyone running PG in production on a big website?