Mathieu Stumpf psychoslave@culture-libre.org wrote:
Hello,
I want to add Esperanto words to fr.wiktionary, using as input a file where each line has the format "word:the fine definition". So I copied basic.py and started hacking it to achieve my goal.
Now, it seems that the -file argument expects a file where each line is formatted as "[[Article name]]". Of course I can just create a second input file and read both in parallel, feeding the genFactory with the first and using the second to build the wiktionary entry. But maybe you could give me a hint on how I can write a generator that can feed a pagegenerators.GeneratorFactory() without creating a "mirror file" and without loading the whole file into main memory.
All "pagegenerators" return only a series of Page objects and nothing else; they are useful to create just a list of pages to work on.
I wrote a very simple mini-bot using a different kind of generator that feeds the bot with both pagename and the content.
You can download the code from Gerrit:
https://gerrit.wikimedia.org/r/98457
You should run it like this:
python onelinecontent.py -simulate -contentfile:somecontent
where "somecontent" contains:
A:Test one line
B:Second line
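To illustrate the idea, here is a minimal sketch of a generator that reads such a file lazily, one "name:content" entry per line, and yields both parts together. It is not the code from the Gerrit change above; the function names title_content_pairs and read_content_file are hypothetical, and it deliberately leaves out pywikibot so it can be run on its own.

```python
import io


def title_content_pairs(lines):
    """Yield (title, content) tuples from an iterable of 'title:content' lines.

    Hypothetical helper for illustration only. Lines are consumed one at a
    time, so the whole input never has to sit in memory.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so the content may contain colons.
        title, sep, content = line.partition(":")
        if not sep:
            continue  # no colon on this line: nothing usable, skip it
        yield title.strip(), content.strip()


def read_content_file(path, encoding="utf-8"):
    """Open `path` and lazily yield (title, content) tuples from it."""
    with open(path, encoding=encoding) as handle:
        # A file object iterates line by line, which keeps memory use flat.
        yield from title_content_pairs(handle)


if __name__ == "__main__":
    sample = io.StringIO("A:Test one line\nB:Second line\n")
    for title, content in title_content_pairs(sample):
        print(title, "->", content)
```

In a bot, each yielded title could then be turned into a page object (for example with pywikibot.Page(site, title)) and the content used to build the entry text, so no "mirror file" is needed.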
Hope that provides some starting point for you,
//Saper