Presumably this is because the navbox is at the end of the article,
so it is counted as part of the final section, which is the "See
also" section. So you are also getting everything in
https://en.wikipedia.org/wiki/Template:Types_of_programming_languages.
You could maybe adjust this algorithm to look for any page links in
namespace 10 (Template), look at everything those templates link to,
and subtract that from your results, although I doubt that will work
perfectly.
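A rough sketch of that subtraction, in Python. It assumes the links come back as a list of dicts with "ns" and "title" keys (the shape I would expect with formatversion=2; adjust the keys if your output differs), and that you have already fetched the links of each template separately. The function names are made up for illustration.

```python
TEMPLATE_NS = 10  # namespace number of "Template:" pages


def template_titles(links):
    """Return titles of links that point into the Template namespace."""
    return [link["title"] for link in links if link["ns"] == TEMPLATE_NS]


def subtract_template_links(links, links_from_templates):
    """Drop template links and anything the templates themselves link to.

    `links` is the list returned for the section; `links_from_templates`
    is an iterable of titles collected from the templates found above.
    Order of the remaining titles is preserved.
    """
    excluded = set(links_from_templates)
    return [
        link["title"]
        for link in links
        if link["ns"] != TEMPLATE_NS and link["title"] not in excluded
    ]
```

One reason this is imperfect: a genuine "See also" entry that also happens to appear in the navbox gets removed along with the noise.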
You could also try fetching the wikitext of the see also section, and
attempting to parse it just for a list of links, but that's probably
hard to get right.
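A minimal sketch of the wikitext route, in Python, assuming you already have the section's wikitext as a string. It only looks at bulleted lines and plain [[...]] links, so links produced by templates inside the section are ignored. That is deliberate here (it keeps the navbox out), but it also means entries written with list or column templates would be missed. The function name is made up for illustration.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#anchor|label]];
# group 1 is the target page title without anchor or label.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def see_also_links(wikitext):
    """Return link targets found on bulleted lines, in order, without duplicates."""
    titles = []
    for line in wikitext.splitlines():
        if not line.lstrip().startswith("*"):
            continue  # skip headings, templates, categories, blank lines
        for match in WIKILINK.finditer(line):
            title = match.group(1).strip()
            if title and title not in titles:
                titles.append(title)
    return titles
```

Called on a section containing a few "* [[...]]" lines followed by a "{{...}}" navbox transclusion, this returns only the bulleted link targets.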
--
Brian
On Wed, Aug 23, 2017 at 2:32 PM, Max Vlasov <max.vlasov(a)gmail.com> wrote:
Hi,
I'm using a two-step approach to retrieve the "See also" section links
from articles: first retrieving the sections, then using the section
index with the links action.
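Roughly, the two steps can be sketched in Python as below. This is an illustration under assumptions, not the exact requests used: the `section` and `format=json` parameters and the helper names are filled in for the example, and the "line" and "index" keys mirror the attributes in the sections output quoted further down.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def sections_url(page):
    """Step 1: URL that lists the sections of a page."""
    params = {"action": "parse", "prop": "sections", "page": page, "format": "json"}
    return API + "?" + urlencode(params)


def find_section_index(sections, heading="See also"):
    """Return the index of the section with the given heading, or None."""
    for section in sections:
        if section["line"].strip().lower() == heading.lower():
            return section["index"]
    return None


def links_url(page, index):
    """Step 2: URL that lists the links found in one section of a page."""
    params = {
        "action": "parse",
        "prop": "links",
        "page": page,
        "section": index,
        "format": "json",
    }
    return API + "?" + urlencode(params)
```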
Today I noticed that for the article "Synchronous programming language" this
gives unexpected results.
The page
https://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=…
returns
... <s toclevel="1" level="2" line="See also"
number="4" index="4" .....
and the following query
https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Syn…
gives a long list of links very different from the correct list shown
in the Wikipedia article. The same algorithm works correctly for other
articles.
Is this a bug?
Thanks
Max
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api