On Wed, Nov 25, 2009 at 2:22 AM, Domas Mituzas <midom.lists@gmail.com> wrote:
Hi!
Please read my comment over again: "I can't imagine this is a query you want to run over and over again. If it is, you'd probably want to use partitioning."
Which would make sense if no other queries are being run :)
It'd make sense if most of your queries used one partition or the other, and not both. Kind of like Wikipedia's history/current tables, which are effectively partitioned, except that the partitioning is done in the PHP (and has to be reimplemented in any other language whose software touches the database) instead of in the database itself using rules and triggers (where it would be accessible to software written in any language).
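For what it's worth, here's a minimal sketch of what the in-database version could look like on the PostgreSQL of that era, using table inheritance plus a routing trigger. The table and column names are invented for illustration; this isn't MediaWiki's actual schema:

    -- Parent table plus two children; the CHECK constraints let the
    -- planner skip the irrelevant child when constraint_exclusion is on.
    CREATE TABLE revision (
        rev_id     integer NOT NULL,
        rev_page   integer NOT NULL,
        is_current boolean NOT NULL
    );
    CREATE TABLE revision_current (CHECK (is_current))     INHERITS (revision);
    CREATE TABLE revision_history (CHECK (NOT is_current)) INHERITS (revision);

    -- Route inserts on the parent into the appropriate child.
    CREATE FUNCTION revision_insert() RETURNS trigger AS $$
    BEGIN
        IF NEW.is_current THEN
            INSERT INTO revision_current VALUES (NEW.*);
        ELSE
            INSERT INTO revision_history VALUES (NEW.*);
        END IF;
        RETURN NULL;  -- the row has already been routed to a child
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER revision_insert_trg
        BEFORE INSERT ON revision
        FOR EACH ROW EXECUTE PROCEDURE revision_insert();

Any client, in any language, then just inserts into and selects from revision, and a query with WHERE is_current only scans revision_current.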
With PG, though, you can define an index on a smaller subset, which may be better than partitioning.
Not in this case. You want to physically move the data so you can access fewer pages, not just index it; PG doesn't move data just because you create an index on it.
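To make the distinction concrete, using the hypothetical table from the sketch above: a partial index such as

    CREATE INDEX revision_current_idx
        ON revision (rev_page) WHERE is_current;

makes index lookups on the current rows cheaper, but the heap pages underneath still interleave current and historical rows, so a query reading many matching rows pulls in just as many table pages as before. Only partitioning (or CLUSTER) actually segregates the rows on disk.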
The word "DELETE" does not appear anywhere on that page I referred to. The examples on the page are all SELECTs. Try again.
Argh, damn terminology, I was thinking about partition drops. Anyway, those SELECTs are 'faster' if they hit the partition key, but then people usually use PKs as their partition keys, so it doesn't really matter :-)
Reread my messages now that you realize you were confused the first time: "In MySQL, you could achieve the same thing through clustering, however." MySQL clusters on the primary key. This is great, or at least it would be great if it didn't mean MySQL locks the whole table every time you use any DDL. Still no "create index concurrently", right?
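(PostgreSQL has had CREATE INDEX CONCURRENTLY since 8.2. A sketch, again on the hypothetical table above:

    -- Build an index without blocking concurrent writes to the table:
    CREATE INDEX CONCURRENTLY rev_page_idx ON revision (rev_page);

    -- Physically reordering the heap along an index is a separate,
    -- explicit step, and this one *does* take an exclusive lock:
    CLUSTER revision USING rev_page_idx;

Unlike InnoDB's implicit primary-key clustering, CLUSTER is a one-shot reorganization; it isn't maintained as later rows are written.)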
Is there a particular problem you're having with them which is unsuitable for Wikipedia?
*shrug*, wrong native collations?
It's not PostgreSQL's fault if you've got buggy locales installed on your system.
Not using locale-specific character equality in unique matching (haha, I could use this argument in the opposite direction when talking about MySQL support :), etc.
Exactly my point: by default, PostgreSQL does precisely what Wikipedia wants with respect to uniqueness matching. You can still do it the other way by making an index on an expression, but in the case of Wikipedia, which is what I thought we were talking about, there's no need for that.
MySQL, on the other hand, doesn't give you the option. You either choose binary and lose the collation, or you drop binary and have to drop the unique key constraint, because collation-equal titles would then collide.
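To spell out the PostgreSQL side with a hypothetical table (not Wikipedia's real schema): sorting and uniqueness matching are decoupled, so you can have locale-aware ORDER BY and still choose how strict the unique constraint is:

    CREATE TABLE page (
        page_title text NOT NULL
    );

    -- Default behaviour: uniqueness on the exact string, while
    -- ORDER BY page_title still follows the database's locale.
    CREATE UNIQUE INDEX page_title_key ON page (page_title);

    -- The "other way": fold case in the uniqueness check (a rough
    -- approximation of collation-insensitive matching) via a unique
    -- index on an expression.
    CREATE UNIQUE INDEX page_title_ci_key ON page (lower(page_title));

(You'd create one or the other, not both; the point is that the choice exists.)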
Sorry, I can't reproduce your error:
Because you didn't read what I wrote. I wrote that I was using a language-specific collation :)
Maybe you were using a *broken* language-specific collation? Do you get the same error when you use a locale-aware sorting program, like GNU sort?
It was a collation error, not operator error. I just showed it to illustrate my point: there's quite some work involved in getting working collations (which usually involves building locales yourself). Do note that once you have indexes in place, any locale change is really painful and requires a full database rebuild. One of the reasons we're still 'binary' is that nobody really wants to own the pain of maintaining charsets server-side. It is a much bigger project than most people see, at our scale. Of course, one may just choose to believe that there's a silver bullet for everything.
The thing is, I never claimed there was a silver bullet. You asked for *any reason* to use PostgreSQL. I gave one. Just one, because that's all you asked for.
I'm sure you could find someone to "own the pain of maintaining charsets server-side". Has anyone asked Gerard M if he knows somebody? If not, the WMF should hire someone, because the comment in the bug submission is right: it's just embarrassing for an encyclopedia not to be able to sort things correctly. Just using a generic locale would be orders of magnitude better than using a binary collation, and you could keep the current hacks and kludges in place in each language until a proper locale for that language is written.