I've implemented a bot for Ruwiki that solves the lonelypages problem, and even finds isolated articles. An analogous query, to be honest, is the key point there as well, and it can be made faster than the one you ran on the Toolserver. The problem is that we are dealing with views rather than the database itself, which somewhat limits further optimization.

If you are interested in the implementation, see https://fisheye.toolserver.org/browse/golem/isolated/namespacer.sql, especially the places where the pl table and the p0 table are created (the cache_namespace_pages function).
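Roughly, the idea is to materialize the expensive parts once into scratch tables and then query orphans against those, instead of joining the live views repeatedly. A minimal sketch of that approach (table and column names here are my shorthand, not necessarily what namespacer.sql actually uses):

```sql
-- Sketch only: the real namespacer.sql may differ in names and details.
-- Cache main-namespace pages once, so the orphan check does not have to
-- re-scan the page view.
CREATE TEMPORARY TABLE p0
SELECT page_id, page_title
FROM page
WHERE page_namespace = 0 AND page_is_redirect = 0;

-- Cache the distinct targets of main-namespace links, with an index,
-- so the anti-join below is a cheap lookup.
CREATE TEMPORARY TABLE pl (KEY (pl_title))
SELECT DISTINCT pl_title
FROM pagelinks
WHERE pl_namespace = 0;

-- Orphans are cached pages with no row in the cached link table.
SELECT p0.page_title
FROM p0
LEFT JOIN pl ON pl.pl_title = p0.page_title
WHERE pl.pl_title IS NULL;
```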

Mashiah

2008/11/26 Ilmari Karonen <nospam@vyznev.net>
Ja Ga wrote:
> I was investigating a better version of Special:Lonelypages (ignore
> disambig pages, exclude the already tagged, identify 1- and 2- link
> orphans as well as no-link orphans, etc.) but when I run the original
> Lonelypages SQL query on nightshade against enwiki_p it times out after
> 10 minutes.
>
> SELECT page_namespace, page_title FROM page LEFT JOIN pagelinks ON
> page_namespace=pl_namespace AND page_title=pl_title WHERE pl_namespace
> IS NULL AND page_namespace=".NS_MAIN." AND page_is_redirect=0;

I think the problem is simply that, no matter how you optimize it, this
is a slow query: it has to go through every page in the main namespace
and check if it has incoming links or not.  Adding a limit might help,
but not much: unlinked pages are fairly rare, so it still has to search
through a _lot_ of pages just to find a few dozen unlinked ones.

Anyway, I just tried it with "LIMIT 50", and it finished in two minutes.
So it certainly works, it just takes time.  I'd suggest just running
it with /* SLOW_OK */ and the session transaction isolation level set to
"READ UNCOMMITTED".
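Put together, that would look something like the following (same query as above, with the suggested session setting, the /* SLOW_OK */ marker, and NS_MAIN written out as 0):

```sql
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT /* SLOW_OK */ page_namespace, page_title
FROM page
LEFT JOIN pagelinks
    ON page_namespace = pl_namespace AND page_title = pl_title
WHERE pl_namespace IS NULL
  AND page_namespace = 0      -- NS_MAIN
  AND page_is_redirect = 0
LIMIT 50;
```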

--
Ilmari Karonen

_______________________________________________
Toolserver-l mailing list
Toolserver-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/toolserver-l