Hello Hannes
Just wondering: can your text parser correctly find all headings (e.g. '== bla ==' as well as '<h2>bla</h2>') and distinguish them from similar-looking text that sits inside a paragraph? And can it return the byte offset of each heading?
I am using a piece of code for this, written with the help of difflib, which may be useful here as well. (I did not have much time to write a unit test with full coverage, but a simple one is there ;)
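To make the question concrete, here is a minimal sketch of the behaviour I mean. It is not my difflib-based code; it swaps in plain regular expressions for brevity, and the names find_headings, WIKI_HEADING and HTML_HEADING are made up for this example. It assumes a wikitext heading has to occupy a whole line, which is what separates it from '== bla ==' appearing in the middle of a paragraph, and it reports offsets in bytes of the encoded text rather than in characters.

```python
import re

# Assumption: a wikitext heading fills a whole line, e.g. '== bla =='.
# The back-reference \1 requires the same number of '=' on both sides.
WIKI_HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# HTML headings <h1>..<h6>; the closing tag must match the opening level.
HTML_HEADING = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>",
                          re.IGNORECASE | re.DOTALL)


def find_headings(text, encoding="utf-8"):
    """Return a list of (byte_offset, level, title), sorted by position."""
    found = []
    for m in WIKI_HEADING.finditer(text):
        found.append((m.start(), len(m.group(1)), m.group(2)))
    for m in HTML_HEADING.finditer(text):
        found.append((m.start(), int(m.group(1)), m.group(2).strip()))
    found.sort()
    # Convert character positions to byte offsets in the encoded text.
    return [(len(text[:pos].encode(encoding)), level, title)
            for pos, level, title in found]


if __name__ == "__main__":
    sample = "Intro == not a heading == here\n== bla ==\n<h2>bla</h2>\n"
    for offset, level, title in find_headings(sample):
        print(offset, level, title)
```

The character-to-byte conversion matters as soon as the page contains non-ASCII text, because character and byte positions then drift apart.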
Greetings DrTrigon
On 23.01.2012 23:34, Hannes Röst wrote:
Hello all
From one of my assignments as a bot operator I have some code which does template parsing and general text parsing (e.g. Image/File tags). It does not use regex and is therefore able to correctly parse nested templates and other such nasty things. I have written it as library classes, with tests that cover almost all of the code. I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot framework? If so, can I send the code to someone for review, or how do you usually handle contributions?
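To illustrate why a scanner copes with nesting where a single regex does not, here is a minimal sketch. It is not the library code itself, and the name find_templates is made up for this example. It assumes templates are delimited only by '{{' and '}}', and it ignores triple-brace parameters, nowiki sections and comments, which a real parser has to handle.

```python
def find_templates(text):
    """Return (start, end, depth) for every {{...}} template in text.

    Nested templates are reported too; depth 0 means top level.
    """
    spans = []
    stack = []  # start positions of templates that are still open
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            stack.append(i)
            i += 2
        elif pair == "}}" and stack:
            start = stack.pop()
            # After the pop, len(stack) is the nesting depth of this template.
            spans.append((start, i + 2, len(stack)))
            i += 2
        else:
            i += 1
    spans.sort()
    return spans


if __name__ == "__main__":
    sample = "a {{outer|x={{inner|1}}|y=2}} b"
    for start, end, depth in find_templates(sample):
        print(depth, sample[start:end])
```

The stack of open positions is what lets the inner template close without ending the outer one.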
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l