2018-09-16 22:03 GMT+02:00 Bináris <wikiposta@gmail.com>:
The bot scanned the latest huwiki dump for 14 hours(!). (Not the whole dump; I used -xmlstart.) It went through 820,000 pages and found 240+ matches (I displayed every 10th match).
Then the bot worked for a further 30-40 minutes to check the actual pages on the live wiki, this time with namespace filtering on. (I don't replace anything in this phase, just save the list, so no human interaction is involved during this time.)
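For illustration, here is a minimal sketch of this two-phase workflow (scan the dump offline, then re-check the surviving titles on the live wiki with a namespace filter), assuming pywikibot's xmlreader module; the regex, the dump filename, and the output file are placeholders, not the actual values used:

import re

import pywikibot
from pywikibot import xmlreader

PATTERN = re.compile(r'some typo pattern')     # placeholder regex
DUMP = 'huwiki-latest-pages-articles.xml.bz2'  # placeholder dump file

# Phase 1: scan the offline dump for candidate pages. There is no
# namespace filtering here, so talk/user pages slip through as well.
candidates = []
for entry in xmlreader.XmlDump(DUMP).parse():
    if PATTERN.search(entry.text):
        candidates.append(entry.title)

# Phase 2: re-check the surviving titles on the live wiki, this time
# restricted to the article namespace, and just save the list.
site = pywikibot.Site('hu', 'wikipedia')
confirmed = []
for title in candidates:
    page = pywikibot.Page(site, title)
    if page.namespace() != 0:
        continue
    if page.exists() and PATTERN.search(page.text):
        confirmed.append(title)

with open('confirmed_titles.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(confirmed))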
Guess the result! 62 out of the 240 remained. This means that the greater part of those 14 hours went into /dev/null.
Now I realize how much time I wasted in the past 10 years. :-(

I was not quite right. With the modified code it took 12 hours instead of 14; 630,000 pages were scanned instead of 820,000, and 83 matches were found instead of 240+ (of which 62 are real). But this is still not the same.