On Sat, 2002-11-16 at 01:10, Magnus Manske wrote:
Jonathan Walther wrote:
- Similar to the question about the orphanedlinks table, why is l_from
a string, while l_to is an integer?
I think l_from could be an integer as well. Would probably increase lookup speed, and save memory. Some things might get more complicated, though, like "what links here".
On closer inspection, I think using an integer would make whatlinkshere *simpler*. Currently, every linking page's title has to be parsed into namespace/title and checked for redirect status: SELECT DISTINCT l_from FROM links WHERE l_to={$lid} (now loop over the results with:) SELECT cur_id,cur_is_redirect FROM cur WHERE cur_namespace={$ns} AND cur_title='{$t}'
If we used an integer index, I think we could do this in one step:
SELECT cur_id,cur_namespace,cur_title,cur_is_redirect FROM cur,links WHERE l_from=cur_id AND l_to={$lid}
Of course, I'm the guy flipping through "Learn SQL in a weekend" as I type, so take all my SQL advice with a suitably large grain of salt. ;)
- It might be the SQL way to make a "watchlist" table, where each row
contains a user id, and an article id. Or maybe this would be useless in a database that doesn't have subqueries.
You're right on this one. Guess noone really paid attention to that before...
I think I'm converted to this idea; subqueries might help for listing talk pages of watched pages, but I think we can work around it.
The most common case for the watchlist is checking whether the current article is in the list; this can be done a lot faster by checking two integers than by loading up a huge string and parsing it on every page load.
Additionally, when a page is renamed, it should stay in your watchlist under the new name. (I hate clearing out "X moved to Y" redirects from my watchlist...)
-- brion vibber (brion @ pobox.com)