On Wed, Aug 23, 2017 at 10:32 AM, Max Vlasov <max.vlasov(a)gmail.com> wrote:
I'm using a two-step approach to retrieve the "See also" section links from
articles: first retrieving the sections, then using the section index in the
links request.
Today I noticed that for the article "Synchronous programming language"
this gives unexpected results.
The page
https://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=Synchronous%20programming%20language&format=xml
returns
... <s toclevel="1" level="2" line="See also"
number="4" index="4" .....
and the following query
https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Synchronous%20programming%20language&section=4&format=xml
gives a long list of links very different to the actual correct list shown
in the wikipedia article. Other usage of the same algorithm with other
articles works correctly.
Is this a bug?
No, it's not a bug. The "See also" section on that article also happens to
contain the navbox, so you're getting all the links from the navbox as well
as the links you expect.
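
If only the hand-written list is wanted, one possible workaround is to request the rendered HTML of the section (prop=text instead of prop=links) and skip anything nested inside the navbox. The sketch below is not part of the API; it assumes the navbox is wrapped in an element carrying the CSS class "navbox" (as on English Wikipedia), that article links use the /wiki/ prefix, and that the HTML is well-formed. The names SectionLinkParser and links_outside_navbox are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SectionLinkParser(HTMLParser):
    """Collect /wiki/ link targets that are not inside a navbox element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.skip_depth = 0  # > 0 while we are inside a navbox element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.skip_depth:
            # Already inside a navbox: only track nesting, collect nothing.
            self.skip_depth += 1
            return
        # Assumption: navboxes are marked with class="navbox".
        if "navbox" in (attrs.get("class") or "").split():
            self.skip_depth = 1
            return
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("/wiki/"):
            self.links.append(href[len("/wiki/"):])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1


def links_outside_navbox(html):
    """Return link targets found in the section HTML, ignoring navbox content."""
    parser = SectionLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

The input would be the HTML returned for action=parse with prop=text and the same section index. Links to other namespaces (File:, Category:, and so on) also start with /wiki/, so further filtering may be needed.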
Most articles don't have this problem because the "References" and
"External links" sections usually come after "See also", as described at
https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Layout#ORDER. So
the navboxes would wind up being part of those sections instead.
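
A client can spot the problematic case from the prop=sections response alone. The following is a rough heuristic, assuming the XML shape shown in the question (<s> elements with "line", "level" and "index" attributes): if no later section has the same or a higher heading level than "See also", then anything at the end of the page, navboxes included, is parsed as part of it. The function name see_also_info is hypothetical.

```python
import xml.etree.ElementTree as ET


def see_also_info(sections_xml):
    """Return (index, is_trailing) for the "See also" section, or None if absent.

    is_trailing is True when no later section closes "See also", i.e. trailing
    page content such as navboxes would be counted as part of that section.
    """
    root = ET.fromstring(sections_xml)
    sections = root.findall(".//s")
    for pos, sec in enumerate(sections):
        if sec.get("line", "").strip().lower() == "see also":
            level = int(sec.get("level"))
            later = sections[pos + 1:]
            # Only deeper headings (subsections) may follow a trailing section.
            is_trailing = all(int(s.get("level")) > level for s in later)
            return sec.get("index"), is_trailing
    return None
```

When is_trailing is True, the link list for that section index should be treated with suspicion and filtered before use.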
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation