I'm trying to extract the full text of section 0 as one string (I'm assuming section 0 is the "summary" of the page), and the rest of the article as another string.

The Cloud9 library I'm using can give me HTML from the XML dump, so I'm working on replicating the regex patterns for the section markers. Or do you think there's a better way to get the section 0 text from the XML?
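
To make the question concrete, here is a rough sketch of the split I have in mind. It works on the raw wikitext from the dump rather than on Cloud9's output, and it assumes a section marker is a line like "== Title ==" with 2 to 6 matching equals signs. The names split_lead and HEADING_RE are just mine for illustration, and this is certainly not MediaWiki's own parser (it ignores level-1 headings and headings hidden in comments or nowiki blocks).

```python
import re

# Assumption: a heading is a whole line "== Title ==" with 2 to 6 equals
# signs on each side, the same number on both sides.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def split_lead(wikitext):
    """Return (lead, rest): the text before the first heading, and the remainder."""
    match = HEADING_RE.search(wikitext)
    if match is None:
        # No headings at all: the whole page is section 0.
        return wikitext.strip(), ""
    lead = wikitext[:match.start()].strip()
    rest = wikitext[match.start():].strip()
    return lead, rest


if __name__ == "__main__":
    sample = "Intro text.\n\n== History ==\nStuff.\n=== Sub ===\nMore."
    lead, rest = split_lead(sample)
    print(repr(lead))
    print(repr(rest))
```

If a page has no headings, everything ends up in the first string and the second is empty, which seems like the right behaviour for stubs.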

Thanks,
Dan

On Fri, Oct 12, 2018 at 4:56 PM Platonides <platonides@gmail.com> wrote:
Hello Daniel

I'm afraid I'm not sure what you are trying to do. What exactly do you want to extract? The section names, the introduction section (section 0), or something different?

Kind regards
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api