Afonso Arantes wrote:
I'm currently writing a custom script that will translate the English-language
Wikipedia dump into HTML. Everything seems OK except that the "Notes" sections
with external links seem to be missing.
As an example, the article on Autism (id 25) goes straight from "See also" to
"References". On the http://en.wikipedia.org/wiki/Autism page there is an
extensive "Notes" section between these two, with detailed commentary and
numbered links to external sources. I cannot seem to find this material anywhere
on the page, nor are there template directives that might include it. Links to
the notes are made throughout the article but are not present in the XML.
Am I missing something obvious, or should I download another file with this
extra information? I thought that maybe this was just an older dump, but other
articles seem to be affected as well.
Thank you for your help.
I haven't seen the dump myself, but the Autism article uses <ref>
and <references /> tags [1], which means that Wikipedia is generating
the Notes section from the many <ref> tags in the article body. This is
thanks to the Cite.php extension [2]. My guess is that the dumps leave
the <ref> tags untouched in the article text, so you should be able to
piece together a Notes section with your script.
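
As a rough sketch of that idea, something like the following Python might
work. It assumes the page text has already been XML-decoded (so the tags
appear as <ref> rather than &lt;ref&gt;) and that only the basic forms
<ref>...</ref>, <ref name="x">...</ref> and <ref name="x" /> occur. The
function name build_notes and the note-N anchor ids are made up for
illustration; this is not what Cite.php itself does.

```python
import re

# <ref>...</ref>, <ref name="x">...</ref>, or self-closing <ref name="x" />.
# The \b keeps this pattern from matching <references />.
REF_RE = re.compile(
    r'<ref\b(?P<attrs>[^>]*?)(?:/>|>(?P<body>.*?)</ref>)',
    re.DOTALL | re.IGNORECASE,
)
# name="x", name='x', or unquoted name=x
NAME_RE = re.compile(r'name\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s/>]+))',
                     re.IGNORECASE)
REFERENCES_RE = re.compile(r'<references\s*/>', re.IGNORECASE)


def build_notes(wikitext):
    """Replace <ref> tags with numbered markers and expand <references />.

    Returns (converted_text, notes), where notes holds the footnote bodies
    in order of first appearance. Bodies are still raw wikitext.
    """
    notes = []          # footnote bodies, index 0 is note 1
    number_by_name = {}  # ref name -> note number

    def replace_ref(match):
        name = None
        name_match = NAME_RE.search(match.group("attrs"))
        if name_match:
            name = next(g for g in name_match.groups() if g is not None)
        body = (match.group("body") or "").strip()

        if name is not None and name in number_by_name:
            # Reuse of a named ref: same number as its first appearance.
            number = number_by_name[name]
            if body and not notes[number - 1]:
                # The name was used empty earlier and is defined here.
                notes[number - 1] = body
        else:
            notes.append(body)
            number = len(notes)
            if name is not None:
                number_by_name[name] = number
        return '<sup><a href="#note-%d">[%d]</a></sup>' % (number, number)

    text = REF_RE.sub(replace_ref, wikitext)
    items = "".join('<li id="note-%d">%s</li>' % (i, body)
                    for i, body in enumerate(notes, 1))
    notes_html = "<ol>%s</ol>" % items
    # Use a function as the replacement so backslashes in notes stay literal.
    text = REFERENCES_RE.sub(lambda m: notes_html, text)
    return text, notes
```

The note bodies usually contain wikitext themselves (external links,
templates), so they would still need to go through the same
wikitext-to-HTML conversion as the rest of the article.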
[1]
http://en.wikipedia.org/w/index.php?title=Autism&action=edit&sectio…
[2]
http://meta.wikimedia.org/wiki/Cite/Cite.php
--
Minh Nguyen <mxn(a)zoomtown.com>
[[en:User:Mxn]] [[vi:User:Mxn]] [[m:User:Mxn]]
AIM: trycom2000; Jabber: mxn(a)myjabber.net; Blog:
http://mxn.f2o.org/