[Mediawiki-l] parsing Allpages?

Juhana Sadeharju kouhia at nic.funet.fi
Sat Aug 26 18:00:07 UTC 2006


Hello. For my wikiget program I cannot figure out how to parse the Allpages listing.
The exact URL (given by Special:Allpages with the Main namespace + the Go button) is:
  http://www.nwn2wiki.org/index.php?title=Special%3AAllpages&from=&namespace=0
(Other downloaded namespaces are 4, 6 and 14.)

Is there any rule which could be used to parse the page URLs out of that HTML?
The page URLs would then be fed to the exporter:
  http://www.nwn2wiki.org/Special:Export/FULLPAGENAME

Is there an Allpages page which lists the pages in a parseable format?
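
For now I am thinking of scraping the HTML with something like this rough
PHP sketch (untested; the regexp is only a guess at the markup the listing
produces, and it will also catch navigation links, which is exactly why a
labeled or raw list would help):

  <?php
  # Sketch: fetch one Allpages chunk and print a Special:Export URL for
  # every linked title found in it.
  $base = 'http://www.nwn2wiki.org';
  $list = $base . '/index.php?title=Special%3AAllpages&from=&namespace=0';
  $html = file_get_contents( $list );
  # Guess: page links in the listing look like <a href="..." title="Pagename">.
  # This also matches sidebar and navigation links, so the output still
  # needs filtering by hand.
  if ( preg_match_all( '/<a href="[^"]+" title="([^"]+)">/', $html, $m ) ) {
      foreach ( array_unique( $m[1] ) as $title ) {
          $title = html_entity_decode( $title );
          echo $base . '/Special:Export/' . str_replace( ' ', '_', $title ) . "\n";
      }
  }
  ?>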

An additional trouble is that the Allpages listing is split into multiple
parts (presumably because of the hard-coded limit mentioned in suggestion (5)
below), so each chunk has to be fetched separately.
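
To cope with the splitting, wikiget would have to follow the continuation
link chunk by chunk. A sketch of the loop I have in mind (again untested;
the "Next page" text and the from= parameter are simply what the rendered
page appears to use):

  <?php
  # Sketch: walk the split Allpages listing by following the from= value
  # of the "Next page" link until no such link is left.
  $base = 'http://www.nwn2wiki.org';
  $from = '';
  do {
      $url  = $base . '/index.php?title=Special%3AAllpages&namespace=0&from='
            . urlencode( $from );
      $html = file_get_contents( $url );
      # ... extract the titles from $html as in the sketch above ...
      $from = '';
      if ( preg_match( '/href="[^"]*from=([^&"]+)[^"]*"[^>]*>Next page/i',
                       $html, $m ) ) {
          $from = urldecode( $m[1] );
      }
  } while ( $from !== '' );
  ?>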

Suggestions for future versions:
(1) A raw list of "<namespace> <fullpagename>" lines for all pages would be nice:
  http://www.nwn2wiki.org/Special:AllpagesRaw
All namespaces would be listed.
(2) The list of pages in
  http://www.nwn2wiki.org/index.php?title=Special%3AAllpages&from=&namespace=0
could be enclosed in HTML comments such as
  <!-- list of all pages starts here -->
  <!-- list of all pages ends here -->
for easier parsing.
(3) Or the table containing the list of all pages could be labeled
(with an id or class attribute, for example).
(4) The "Next page" link could be labeled as well.
(5) Remove, or make optional, limits such as the one in SpecialAllpages.php:
  class SpecialAllpages {
     var $maxPerPage=960;
One could read an override from the request, next to the other GET values:
  # GET values
  $maxPerPage = $wgRequest->getInt( 'maxperpage' );
A value of 0, or a missing maxperpage parameter, would fall back to the
current default of 960 (a rough sketch follows after this list).
Then I could get the list of all pages with
  http://www.nwn2wiki.org/index.php?title=Special%3AAllpages&from=&namespace=0&maxperpage=1000000
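
To illustrate (5), here is a rough and untested sketch of the change against
SpecialAllpages.php; I am assuming the special page object is created as
$indexPage near the other GET values:

  # GET values
  $maxPerPage = $wgRequest->getInt( 'maxperpage', 0 );

  $indexPage = new SpecialAllpages();
  if ( $maxPerPage > 0 ) {
      # Override the hard-coded 960 only when the caller asks for it;
      # 0 or a missing maxperpage keeps the current behaviour.
      $indexPage->maxPerPage = $maxPerPage;
  }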

I would prefer suggestion (1).

Juhana
-- 
  http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
  for developers of open source graphics software


