[pywikibot] about parsing the dump

18 Jan 2016


Hello,

A question about the use of pywikibot:
is it possible to use it to parse the XML dump?
I am interested in extracting links from pages (internal and external,
kept distinct from the links that assign a page to a category).
I would also like to handle transitive redirects.
I would like to process the dump without accessing the wiki, or otherwise
access the wiki in batches with proper limits.
Is there maybe something in the package that already takes care of this?
I have seen that https://www.mediawiki.org/wiki/Manual:Pywikibot/Scripts
lists a "ghost" "extracting_links.py" script.
I wanted to ask before reinventing the wheel, and to check whether pywikibot
is a suitable tool for the purpose.
Thank you,
L.
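
To make the task described above concrete, here is a minimal sketch that uses only the Python standard library rather than pywikibot, so it makes no claim about what the package itself provides. It assumes the usual MediaWiki export layout (`<page>` elements containing `<title>` and `<text>`). The names `iter_pages`, `extract_links`, and `resolve_redirect` are hypothetical helpers, and the regular expressions are deliberate simplifications: they ignore templates, nowiki sections, and localized namespace or redirect keywords.

```python
import re
import xml.etree.ElementTree as ET

# Captures the target of [[Target]], [[Target|label]] or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)")
# Rough match for http(s) URLs in wikitext (simplified on purpose).
EXTERNAL = re.compile(r"https?://[^\s\]<>\"|]+")
# English redirect keyword only; localized keywords are not handled.
REDIRECT = re.compile(r"^\s*#REDIRECT\s*\[\[([^\[\]|#]+)", re.IGNORECASE)


def iter_pages(source):
    """Yield (title, wikitext) pairs from a dump file name or file object."""
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            elem.clear()  # release the finished page to keep memory flat


def extract_links(wikitext):
    """Return (internal, categories, external) link lists for one page."""
    internal, categories = [], []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        if not target:
            continue
        # [[Category:X]] assigns the page to a category; [[:Category:X]]
        # starts with a colon and is therefore kept as an ordinary link.
        if target.lower().startswith("category:"):
            categories.append(target)
        else:
            internal.append(target)
    return internal, categories, EXTERNAL.findall(wikitext)


def build_redirect_map(pages):
    """Map each redirect page title to its immediate target."""
    redirects = {}
    for title, text in pages:
        match = REDIRECT.match(text)
        if match:
            redirects[title] = match.group(1).strip()
    return redirects


def resolve_redirect(title, redirects):
    """Follow a redirect chain to its end; return None if it loops."""
    seen = {title}
    while title in redirects:
        title = redirects[title]
        if title in seen:
            return None
        seen.add(title)
    return title
```

A typical use would be one pass over the dump with `build_redirect_map(iter_pages(path))`, then a second pass calling `extract_links` on each page and `resolve_redirect` on each internal target. Everything runs offline, so no request limits apply.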
