[Xmldatadumps-l] Re: Part of pages missing in N0 enterprise dumps

18 Mar 2022


On 18/03/22 14:04, Erik del Toro wrote:
...
Just wanted to tell you that http://aarddict.org users and dictionary
creators also stumbled about these missing namespaces and are now
suggesting to continue scraping these. So is scraping the expected
approach?
Thanks for mentioning this. Not sure what you mean by scraping here 
exactly: if you mean parsing the wikitext, definitely not; if you mean 
getting the already-parsed HTML from the REST API, it's acceptable.
https://www.mediawiki.org/wiki/API:REST_API/Reference#Get_HTML
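For illustration, here is a minimal sketch of what fetching the already-parsed HTML could look like. It assumes the standard MediaWiki REST path (/w/rest.php/v1/page/{title}/html) on fr.wiktionary.org; the function names and the User-Agent string are made up for the example, so adapt them to your own tool and check the reference above before relying on it.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def rest_html_url(title, host="fr.wiktionary.org"):
    """Build the REST API URL for the parsed HTML of one page.

    The host default is an assumption for this example.
    """
    # Percent-encode the whole title, including ":" and "/", so that
    # subpage-style titles stay a single path segment.
    encoded = quote(title.replace(" ", "_"), safe="")
    return f"https://{host}/w/rest.php/v1/page/{encoded}/html"


def fetch_html(title, host="fr.wiktionary.org"):
    """Download the parsed HTML of a page as a string (needs network access)."""
    request = Request(
        rest_html_url(title, host),
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "example-dict-builder/0.1 (you@example.org)"},
    )
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Example call (performs a real HTTP request):
# html = fetch_html("Conjugaison:espagnol/aumentar")
```

This fetches one page per request, so for bulk imports it is worth throttling the calls; no particular rate is implied here.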
As for HTML dumps, the ZIM files by Kiwix for the French Wiktionary 
include pages like "Conjugaison:espagnol/aumentar", so that's another 
possible avenue for bulk imports. I've checked the latest version:
https://download.kiwix.org/zim/wiktionary/wiktionary_fr_all_nopic_2022-01.zi...
Federico
