Let's give Jan access to whatever he needs, a developer account on the
machine, whatever. He's good, and his advice will be priceless.
Ian Gilfillan, author of Mastering MySQL 4, has also volunteered to
help, and if he wants access, we can give it to him too, even though I
don't know him. He wrote a book, so he's legit.
--Jimbo
It's the sorts on some of our queries that are so damn slow. A few
hundred or a couple thousand rows can take *a couple minutes* for it to
sort on the timestamp field.
I've just put in a hacky manual sort (in CVS) for the page history; all
rows are returned without sorting (so in ID order, usually but not
always chronological) and then output in reverse order. It's *immensely*
faster. Try loading up the history on the Village Pump or Votes for
Deletion, and watch it actually come up! Kinda nice. :)
I also took the opportunity to drop in the "see next X" links, so you
don't have to load all 2500+ items at once on your (well, my) 56k modem.
(These are used pretty generally, and the code really ought to be moved
out of SearchEngine.php and into globalfunctions...)
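
For illustration, the manual sort and the paging described above could look roughly like the sketch below. It is in Python rather than the PHP the site uses, and the row shape and the names history_page and next_offset are made up for the example; it is not the code that went into CVS.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical, trimmed-down row: (old_id, old_timestamp)
Row = Tuple[int, str]


def history_page(rows: Sequence[Row], offset: int = 0, limit: int = 50) -> List[Row]:
    """Return one page of history, newest first, without a database sort.

    `rows` is assumed to arrive in old_id order, which is usually
    (but not always) chronological.
    """
    newest_first = list(reversed(rows))
    return newest_first[offset:offset + limit]


def next_offset(total: int, offset: int, limit: int) -> Optional[int]:
    """Offset for the "see next X" link, or None if this is the last page."""
    nxt = offset + limit
    return nxt if nxt < total else None
```

Note that this still pulls every row from the database; only the output is paged.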
Of course, it would be *nicer* to do the ORDER BY and LIMIT in mysql.
The old query:
EXPLAIN SELECT old_id,old_namespace,old_title,old_user,
-> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old
-> WHERE old_namespace=4 AND
-> old_title='Village_pump'
-> ORDER BY old_timestamp DESC;
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
| old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used; Using filesort |
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
The new query:
EXPLAIN SELECT old_id,old_namespace,old_title,old_user,
-> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old
-> WHERE old_namespace=4 AND
-> old_title='Village_pump';
+-------+------+---------------+---------------+---------+-------------+------+------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+------+---------------+---------------+---------+-------------+------+------------+
| old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used |
+-------+------+---------------+---------------+---------+-------------+------+------------+
Now, in an ideal world, it could do the sorting based on that handy
index it has on old_timestamp. In the topsy-turvy world of MySQL,
however, only the index used for the WHERE clause means anything. Other
ways to speed it up?
http://www.mysql.com/doc/en/ORDER_BY_optimisation.html
Oh, yes, and I changed the index from old_title to a combo on
old_namespace and old_title. This should cut down the number of rows a
little bit where a page and its talk page are both long; we never WHERE
the two columns separately on the old table, only together.
SHOW INDEX FROM old;
+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Comment |
+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+
| old | 0 | old_id | 1 | old_id | A | 744949 | NULL | NULL | |
| old | 1 | old_timestamp | 1 | old_timestamp | A | 744949 | NULL | NULL | |
| old | 1 | old_user | 1 | old_user | A | 14 | NULL | NULL | |
| old | 1 | old_user_text | 1 | old_user_text | A | 74494 | NULL | NULL | |
| old | 1 | old_namespace | 1 | old_namespace | A | 14 | NULL | NULL | |
| old | 1 | old_namespace | 2 | old_title | A | 148989 | NULL | NULL | |
+-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+
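
One possible way to get rid of the filesort, offered as a guess rather than something tested on the live server: extend the combined index to (old_namespace, old_title, old_timestamp), so that rows matching the WHERE clause are already stored in timestamp order. The sketch below checks the idea in Python against an in-memory SQLite table, not MySQL, so it only shows the principle; the table is cut down to four columns and the names idx_demo and sort_step_needed are invented for the example. On the real server, EXPLAIN would have to confirm that "Using filesort" goes away.

```python
import sqlite3


def sort_step_needed(index_columns: str) -> bool:
    """Return True if SQLite's plan for the history query needs a separate sort."""
    db = sqlite3.connect(":memory:")
    # Cut-down stand-in for the real `old` table (assumption for the demo).
    db.execute(
        "CREATE TABLE old (old_id INTEGER PRIMARY KEY, old_namespace INTEGER,"
        " old_title TEXT, old_timestamp TEXT)"
    )
    db.execute(f"CREATE INDEX idx_demo ON old ({index_columns})")
    plan = db.execute(
        "EXPLAIN QUERY PLAN SELECT old_id FROM old"
        " WHERE old_namespace=4 AND old_title='Village_pump'"
        " ORDER BY old_timestamp DESC"
    ).fetchall()
    db.close()
    # The last column of each plan row is the human-readable detail text;
    # SQLite reports an explicit sort as "USE TEMP B-TREE FOR ORDER BY".
    return any("TEMP B-TREE" in row[-1] for row in plan)


if __name__ == "__main__":
    print(sort_step_needed("old_namespace, old_title"))
    print(sort_step_needed("old_namespace, old_title, old_timestamp"))
```

With the three-column index, the LIMIT could also move into the query, since the first rows read from the index would already be the newest ones.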
And, since I was torturing things with slowness anyway, old is now
InnoDB. Whoop.
-- brion vibber (brion @ pobox.com)
I would support any and all temporary "ad hoc" measures that make
sense to the other developers to kick start the server automatically
from time to time.
For example, envision a cron job that semi-intelligently (or
semi-stupidly!) hunts for those runaway killer processes and just kill
-9's them. Whatever someone is doing to kill the machine, they should
stop.
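
A very rough sketch of what such a cron job might do, in Python. Everything specific here is an assumption for illustration: the watched process names, the CPU-time threshold, and the Proc record. Collecting the process list (for example by parsing ps output) is left out.

```python
import os
import signal
from typing import Iterable, List, NamedTuple, Sequence


class Proc(NamedTuple):
    pid: int
    name: str
    cpu_seconds: float


def find_runaways(
    procs: Iterable[Proc],
    watched: Sequence[str] = ("httpd",),   # hypothetical list of names to police
    max_cpu_seconds: float = 600.0,        # hypothetical threshold
) -> List[Proc]:
    """Pick watched processes that have burned more CPU time than allowed."""
    return [p for p in procs if p.name in watched and p.cpu_seconds > max_cpu_seconds]


def kill_runaways(procs: Iterable[Proc], dry_run: bool = True) -> List[int]:
    """Send SIGKILL (kill -9) to each runaway; return the pids selected.

    With dry_run=True nothing is killed, which is the safer default
    while the thresholds are being tuned.
    """
    selected = []
    for p in find_runaways(procs):
        if not dry_run:
            os.kill(p.pid, signal.SIGKILL)
        selected.append(p.pid)
    return selected
```

Running it with dry_run=True for a while and logging the selected pids would show whether the threshold catches the right processes before anything is actually killed.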
Of course, we should *also* hunt down and resolve the problems that
lead to this, but really, it's just as important to keep this puppy
humming along.
> Seriously: Something that *might* help not only with that problem, but
> would likely reduce server load (and thus, crashes in the first place)
> would be to run apache and mysql on different servers. IIRC, this is
> suggested by both apache and mysql online manuals. Question is whether
> to put mysql on the slower or the faster machine (assuming they're not
> identical).
Apache should never crash... if it does, something is wrong! I'd try
installing FreeBSD with Apache on the slower computer, and Red Hat Linux
with MySQL 4 on the faster one.
Even an old PII 400 is fast enough to serve several thousand users at the
same time; it just needs a lot of memory!
And, of course: is your internet connection bandwidth-limited? If so,
that produces a lot of load once traffic gets near the maximum possible
bandwidth, so keep that in mind.
Phil
On Friday 31 January 2003 05:41 pm, Anthere wrote:
> > it up, and I don't think it should be used for
> > personal essays, sorry,
> > Anthere). Using subpages on Meta might also help for
> > organization.
>
> Ah ? Well, I disagree. Of course permanent deletions
> of personal essays can only occur after a consensus is
> reached about that, no ?
I agree with Anthere and strongly disagree with eliminating personal essays
from meta. If POV material isn't allowed on meta then where should it go?
This will only make it more difficult to keep this stuff out of the
encyclopedias.
Meta can and should be many things. Simply create an alternate Main Page for
whatever you want to focus on (software for example) and organize everything
on that page and the pages linked from it in any way you wish. Heck even
create another namespace if you really want to organize things, but I see no
reason whatsoever why meta shouldn't be more like a regular wiki with a
fairly undefined scope. What really is needed is more integration between
topics discussed on the mailing lists and meta.
--mav
Are there any objections to changing the history view to work as it does
on MeatBall? Example:
http://www.usemod.com/cgi-bin/mb.pl?action=history&id=CarnetWiki
(Only the revision selection via radiobuttons, not the change display,
our DiffEngine is far superior to UseMod's.)
Regards,
Erik
--
FOKUS - Fraunhofer Institute for Open Communication Systems
Project BerliOS - http://www.berlios.de
Hi,
I really do like the new interlanguage links interface, and hope we can
put it in use ASAP. Some comments:
1) Can you check the code into CVS under a separate tag?
2) The interface is well done. It's a neat idea to have a separate edit
screen for this purpose, since editing interlanguage links and editing
text are reasonably separate. However, we probably want the
ilinks+add/edit links to show up on the edit screen as well.
3) We need to log interlanguage link edits. We can't use RC because of
the different logs involved. So we need an ilinks log similar to RC, and
an individual ilinks history as well as a recent_ilinks type page (which
could be integrated into RC). We need to do this to prevent unnoticed
link vandalism. Being able to roll back changes would be a plus. If we
implement that, we can open ilinks to anons, otherwise I suggest we do
not.
4) I'll probably do some layout edits on the interface. We do need the
ability to add several ilinks at once. I don't know if the nice language
dropdown really gains us speed as it's just too damn full. How about:
Source: [en:Main Page ]
Destination(s), one per line:
[de:Hauptseite ]
[ ]
[ ]
[ ]
[ ]
[Do it]
We already have the two language codes and everyone working on one of
those wikis knows them. (Add a link to the list just in case.) Without
the square brackets, there is fairly little typing involved.
5) Once I have the code, I'll probably add "Select all / Select none"
JavaScript buttons to the checkbox lists.
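
The "one per line" input suggested in point 4 would be simple to handle on the server side. A minimal sketch in Python, assuming each line has the form code:Title as in the mock-up above; the function name parse_ilinks is made up, and checking the code against the list of known languages is left out.

```python
from typing import List, Tuple


def parse_ilinks(text: str) -> List[Tuple[str, str]]:
    """Parse lines like "de:Hauptseite" into (language code, title) pairs.

    Blank lines (unused form fields) are skipped. A line without a
    language prefix or without a title raises ValueError.
    """
    links = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        lang, sep, title = line.partition(":")
        lang, title = lang.strip().lower(), title.strip()
        if not sep or not lang or not title:
            raise ValueError(f"expected 'code:Title', got {line!r}")
        links.append((lang, title))
    return links
```

Splitting on the first colon only keeps titles that themselves contain a colon intact.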
Regards,
Erik
--
FOKUS - Fraunhofer Institute for Open Communication Systems
Project BerliOS - http://www.berlios.de
On Wednesday 05 February 2003 05:34 am, wikitech-l-request(a)wikipedia.org
wrote:
> Are there any objections to changing the history view to work as it does
> on MeatBall? Example:
>
> http://www.usemod.com/cgi-bin/mb.pl?action=history&id=CarnetWiki
>
> (Only the revision selection via radiobuttons, not the change display,
> our DiffEngine is far superior to UseMod's.)
>
> Regards,
>
> Erik
I was going to suggest this a while ago but forgot. Please do make the
changes!
--mav
On Wednesday 05 February 2003 05:34 am, wikitech-l-request(a)wikipedia.org
wrote:
> We could save a lot of space by summarizing directly subsequent edits
> made by one user into a single edit. User foo edits article bar 10
> times, and for each edit, the previous one is deleted. This would also
> reduce clutter in RC. Just check if OLD contains a top revision by the
> same user and delete it before inserting the new row (preferably as one
> transaction).
>
> To avoid involuntary overwriting of one's own words, we could do this
> only if the previous edit had no edit comment, and occurred less than 10
> minutes ago.
>
> Thoughts?
>
> Regards,
>
> Erik
Although I love the idea I do fear that information may be lost. For example I
will often completely rewrite an article on a temp page, check it, then paste
that text into the main article stating that the article had been completely
rewritten. Then when I reread it a second time I often find an embarrassing
typo within 5 minutes. I then edit the article and say in the edit summary
"typo". It would be very bad if "typo" showed up as the only comment on the
combined edit.
IMO it would also be good to set the combined-edit timeout to an hour, or
maybe even to combine all subsequent edits made in the same UTC day (my
personal favorite). So how about this:
Combine all the comments in the order in which they were made and separate
them with a semicolon. For example:
(cur) (last) . . M 21:07 Jun 17, 2002 . . Maveric149 (typo)
(cur) (last) . . 20:46 Jun 17, 2002 . . Maveric149 (opps - forgot some stuff)
(cur) (last) . . 20:28 Jun 17, 2002 . . Maveric149 (new format ready for
general wikipedia consumption -- enjoy)
would become:
(cur) (last) . . 20:28 - 21:07 Jun 17, 2002 . . Maveric149 (new format
ready for general wikipedia consumption -- enjoy ; opps - forgot some stuff ;
M typo)
Notice the range of times given. Combining comments would also mean that we
should allow more text in the combined edit summary than we would allow in a
single edit summary since the combined edit comment will often be larger than
a regular edit comment.
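
To make the rule concrete, here is a minimal sketch in Python of the same-UTC-day variant. The Edit record and the function name combine_edits are invented for the example, and the output omits the "(cur) (last)" links. It only formats the combined history lines and does not touch how revisions are stored.

```python
from typing import List, NamedTuple


class Edit(NamedTuple):
    user: str
    day: str            # UTC date as displayed, e.g. "Jun 17, 2002"
    time: str           # "HH:MM"
    comment: str
    minor: bool = False


def combine_edits(edits: List[Edit]) -> List[str]:
    """Collapse runs of consecutive edits by one user on one UTC day.

    `edits` must be in chronological order (oldest first). Comments are
    joined with " ; " in the order they were made; minor edits keep an
    "M " marker in front of their own comment.
    """
    lines: List[str] = []
    run: List[Edit] = []

    def flush() -> None:
        if not run:
            return
        comments = " ; ".join(("M " if e.minor else "") + e.comment for e in run)
        first, last = run[0], run[-1]
        when = first.time if len(run) == 1 else f"{first.time} - {last.time}"
        lines.append(f"{when} {first.day} . . {first.user} ({comments})")
        run.clear()

    for e in edits:
        # A different user or a different UTC day ends the current run.
        if run and (e.user != run[-1].user or e.day != run[-1].day):
            flush()
        run.append(e)
    flush()
    return lines
```

Feeding in the three Maveric149 edits from the example, oldest first, gives the single combined line shown above, with the 20:28 - 21:07 range.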
--Daniel Mayer (aka mav)
>Hmm, one could put a short TTL on the DNS entry, and swap it to point to
>another machine in case of problems.
>
>However that could be problematic in terms of hugely increasing traffic
>for the DNS server.
I tried this several times... I don't think it is a good idea.
Isn't it possible to do normal port forwarding while doing some
work on the server?
Phil