Hello mediawiki-l,
On http://de.wikipedia.org/wiki/Wikipedia:Download and other Wikipedia download sites I found a tutorial on updating my local database with the Wikipedia MySQL dump. It says that afterwards you have to run "php rebuildlinks.php". But why? Well, I assume it rebuilds some links, but which ones? I thought everything in the database is dynamic, and the only links are in the cur table, which I updated with a mysql command.
I didn't understand the php source, either.
Or isn't it necessary? Thanks for your answers in advance.
On Dec 7, 2003, at 12:56, Freerk wrote:
> On http://de.wikipedia.org/wiki/Wikipedia:Download and other Wikipedia download sites I found a tutorial on updating my local database with the Wikipedia MySQL dump. It says that afterwards you have to run "php rebuildlinks.php". But why? Well, I assume it rebuilds some links, but which ones? I thought everything in the database is dynamic, and the only links are in the cur table, which I updated with a mysql command.
There are presently three link tables: 'links' tracks all "live" links from wikipages to other wikipages that do exist; 'brokenlinks' tracks "broken" links, those that go to pages that don't yet exist; 'imagelinks' tracks usage of images in wikipages.
There are a couple of uses for these tables:
- they enable "What links here" and "Related changes" to work, by looking at incoming or outgoing links
- they provide the list of pages that use an image, shown on the image's description page
- they allow making reports of pages that aren't linked (Orphans, Unused images) or are linked to but don't exist (Most wanted pages)
- they slightly speed up page rendering by avoiding individual checks for the existence of each linked page to determine how to render its link
(The latest software also adds a 'linkscc' table which caches data from the other three tables. This is used only for speeding rendering.)
If you import data to the cur table and don't rebuild the links, you won't be able to use "What links here", "Related changes", "Orphans", etc. If you don't want to, well I suppose that's okay...
-- brion vibber (brion @ pobox.com)
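To make the three tables more concrete, here is a minimal Python sketch of the idea described above. It is not the actual rebuildlinks.php logic; the function names `classify_links` and `what_links_here`, the simplified link pattern, and the handling of the "Image:" prefix are all assumptions made for illustration. Real wikitext has more cases (namespaces, interwiki prefixes, section anchors) that this ignores.

```python
import re

# Simplified pattern (an assumption): matches [[Target]] and [[Target|label]]
# and captures only the target part.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def classify_links(pages):
    """Sort every link found in `pages` into three sets of (source, target) pairs.

    `pages` maps a page title to its wikitext, standing in for the cur table.
    The returned sets mirror the roles of 'links', 'brokenlinks' and 'imagelinks'.
    """
    links, brokenlinks, imagelinks = set(), set(), set()
    for source, text in pages.items():
        for match in LINK_RE.finditer(text):
            target = match.group(1).strip()
            if target.startswith("Image:"):
                # Image usage is tracked separately from page-to-page links.
                imagelinks.add((source, target[len("Image:"):]))
            elif target in pages:
                # The target page exists: a "live" link.
                links.add((source, target))
            else:
                # The target page does not exist yet: a "broken" link.
                brokenlinks.add((source, target))
    return links, brokenlinks, imagelinks


def what_links_here(links, title):
    """Return the sorted titles of pages with a live link to `title`."""
    return sorted(source for source, target in links if target == title)
```

Because the sets are derived from the page text, they go stale whenever the page text is replaced wholesale, which is the situation after importing a new cur dump.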
Hello Mediawiki-l,
thanks for the answer. I updated the database today (after the 2003-12-08 update) like this:
cd ...htdocs/wikipedia
wget http://download.wikipedia.org/archives/en/cur_table.sql.bz2
bzip2 -dc cur_table.sql.bz2 | mysql -u wikipedia -pmypassword wikipedia
Was that right, that easy? Or do I have to delete the old cur table first? It seems it wasn't really updated. The import took about 8 minutes (Athlon 2400+, 512 MB DDR) and there was no error message.
When I now load pages from my local Wikipedia mirror, the most recent edits are still from November. I looked up a page that was edited on December 2 on Wikipedia, but my local mirror only showed a version from November. (I installed everything for the first time in November.) Also, new entries are not accessible.
I only want to serve the printable page (via mod_rewrite), so nobody will ever use "What links here", "Related changes" or anything like that. I understood your answer, Brion, to mean that under these circumstances I don't have to run "php rebuildlinks.php". Or should I? Does it really take several hours?
But well, maybe it's just something completely different...
thanks for your help!
On Dec 8, 2003, at 10:49, Freerk wrote:
> Hello Mediawiki-l,
> thanks for the answer. I updated the database today (after the 2003-12-08 update) like this:
> cd ...htdocs/wikipedia
> wget http://download.wikipedia.org/archives/en/cur_table.sql.bz2
That'd be the December 3 dump off the old server. Try the new server, which should be at download.wikimedia.org (sigh...)
> bzip2 -dc cur_table.sql.bz2 | mysql -u wikipedia -pmypassword wikipedia
> Was that right, that easy? Or do I have to delete the old cur table first? It seems it wasn't really updated. The import took about 8 minutes (Athlon 2400+, 512 MB DDR) and there was no error message.
The first command in the dump deletes and recreates the cur table. Nothing is printed on success; if there were a problem then it would complain.
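One way to confirm this for a particular download is to look at the first few statements of the compressed dump before importing it. The following is a minimal Python sketch under stated assumptions: the helper name `leading_table_statements` is hypothetical, and it assumes the dump keeps one statement start per line, as mysqldump output normally does.

```python
import bz2


def leading_table_statements(dump_path, max_lines=200):
    """Return the DROP TABLE / CREATE TABLE lines that open a bz2-compressed dump.

    Scanning stops at the first INSERT or after `max_lines` lines, so only the
    start of the file is decompressed.
    """
    found = []
    # errors="replace" keeps the scan going if the dump is not valid UTF-8.
    with bz2.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        for line_number, line in enumerate(dump, start=1):
            if line_number > max_lines:
                break
            statement = line.strip()
            upper = statement.upper()
            if upper.startswith("INSERT"):
                # Row data begins here; the table definition is already past.
                break
            if upper.startswith(("DROP TABLE", "CREATE TABLE")):
                found.append(statement)
    return found
```

Calling `leading_table_statements("cur_table.sql.bz2")` and seeing a DROP TABLE line for cur would indicate that the import replaces the old table rather than appending to it.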
> When I now load pages from my local Wikipedia mirror, the most recent edits are still from November. I looked up a page that was edited on December 2 on Wikipedia, but my local mirror only showed a version from November. (I installed everything for the first time in November.) Also, new entries are not accessible.
November when?
> I only want to serve the printable page (via mod_rewrite), so nobody will ever use "What links here", "Related changes" or anything like that. I understood your answer, Brion, to mean that under these circumstances I don't have to run "php rebuildlinks.php". Or should I? Does it really take several hours?
1 hour 45 minutes on a 2GHz Opteron with lots and lots of memory.
-- brion vibber (brion @ pobox.com)
mediawiki-l@lists.wikimedia.org