Dear Wikimedia Community,
My name is Fidel Gil. I am a master's student at the
Technical University of Kaiserslautern in Germany, and I
am currently running some experiments with the enwiki XML
data dumps, in which I recreate the linking structure
between articles. While doing so for the article
https://en.wikipedia.org/wiki/Animation
I found that it has a link labeled 'Walt Disney Studios'
in the subsection 'Animated Features CGI' that resolves to
'Walt Disney Animation Studios'.
When going through the XML file, however, the entry for
Animation references 'Walt Disney Studios', a
disambiguation page, rather than 'Walt Disney Animation
Studios'.
A small excerpt from the line in question: 'In 1937,
[[Walt Disney Studios]] premiered their first-ever animated
feature'.
Does the XML dump use the link names as written in the
article text, rather than some other form of URL
resolution, to create these [[<name of article>]] tags?
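
For illustration, here is a minimal sketch of how the link targets
can be read from a line of article text exactly as they are written.
It assumes the usual [[Target]], [[Target|label]] and
[[Target#Section|label]] link forms; the function name
extract_link_targets is hypothetical and not part of any dump
tooling. It does not resolve a target to a different page title, so
any such resolution would have to happen in a separate step.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]].
# Group 1 is the target as written, without section or label.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_link_targets(wikitext):
    """Return link targets as written in the text (hypothetical helper).

    Underscores are treated as spaces and runs of whitespace are
    collapsed; no redirect or disambiguation handling is attempted.
    """
    targets = []
    for match in WIKILINK.finditer(wikitext):
        target = " ".join(match.group(1).replace("_", " ").split())
        if target:
            targets.append(target)
    return targets


line = ("In 1937, [[Walt Disney Studios]] premiered their "
        "first-ever animated feature")
print(extract_link_targets(line))  # ['Walt Disney Studios']
```

On the excerpt above this yields 'Walt Disney Studios', the name
written in the text, not 'Walt Disney Animation Studios'.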
Looking forward to your reply,
Fidel Gil
Greetings XML Dump users and contributors!
This is your automatic monthly Dumps FAQ update email. This update
contains figures for the 20200501 full revision history content run.
We are currently dumping 913 projects in total.
---------------------
Stats for udmwiki on date 20200501
Total size of page content dump files for articles, current content only:
22584497
Total size of page content dump files for all pages, current content only:
26960582
Total size of page content dump files for all pages, all revisions:
493829286
---------------------
Stats for enwiki on date 20200501
Total size of page content dump files for articles, current content only:
77391323458
Total size of page content dump files for all pages, current content only:
171945757548
Total size of page content dump files for all pages, all revisions:
20707694819796
---------------------
Sincerely,
Your friendly Wikimedia Dump Info Collector