On Mon, Jul 6, 2009 at 14:43, Andrew Garrett <agarrett@wikimedia.org> wrote:
On 06/07/2009, at 12:05 PM, Amir E. Aharoni wrote:
- The info won't be up-to-date. Would it be too much to ask to search
the database directly using regexes?
Yes.
We wouldn't allow direct searching from the web interface with regexes for two related reasons:
1/ A single search with the most basic of regexes would take several minutes, if not hours. [...]
2/ Even if we could find a way to make the former performant, a malicious regex could significantly expand this time taken, leading to a denial of service.
I figured that would be the answer. Maybe it would be possible to set up a service that searches a backup database that is almost up to date, or to have someone approve the regexes to prevent DoS.
Just throwing a few ideas out there.
Mostly I was interested in finding out whether there is an existing script that searches dumps, so that I don't have to write one myself.
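
To make the request concrete, here is a minimal sketch of what such a script might look like, assuming a pages XML dump compressed with bzip2. The function names (local_name, search_dump, search_bz2_dump) are made up for illustration and are not an existing tool. It streams the dump with Python's standard library and runs the regex locally, so a slow or malicious pattern only costs time on the machine running it, not on the servers.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace: '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def search_dump(source, pattern):
    """Yield (title, matched_text) for each revision whose text matches.

    source is a filename or a binary file object containing dump XML.
    These names are hypothetical, chosen only for this sketch.
    """
    regex = re.compile(pattern)
    title = None
    # iterparse streams the file, so the whole dump is never held in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            match = regex.search(elem.text or "")
            if match:
                yield title, match.group(0)
        elif name == "page":
            # Drop the finished page's content to keep memory use low.
            elem.clear()
            title = None


def search_bz2_dump(path, pattern):
    """Same as search_dump, but reads a .bz2-compressed dump from path."""
    with bz2.open(path, "rb") as dump:
        yield from search_dump(dump, pattern)
```

It could be used along the lines of `for title, hit in search_bz2_dump("dump.xml.bz2", r"some regex"): print(title, hit)`, where the file name is a placeholder for whichever dump was downloaded. The results are only as fresh as the dump itself, so the up-to-date problem remains.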
-- אמיר אלישע אהרוני Amir Elisha Aharoni
"We're living in pieces, I want to live in peace." - T. Moore