CCing the Search and Discovery list.
On Sun, May 8, 2016 at 12:24 PM, Stan Zonov <stanzon(a)gmail.com> wrote:
Hi!
I have been trying to gauge the speed and efficiency of a database I have
set up. To test it, I have filled it with a large number of Wikipedia
articles from a specific category (for example, history). The database
accepts multi-word queries and returns the articles that best match the
query. For example, if I search for "history in Italy in the past 100 years",
the best-matching articles should come up.
I was wondering if anyone has advice on how to form sample test queries that
model realistic situations. I don't think it would be fair to use random
phrases (such as "banana the string"), so I would like to model queries on
my data to test both performance and correctness of the output. Is this done
at Wikipedia, and if so, how?
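To make the idea of "queries based on my data" concrete, here is a minimal sketch of one possible approach, often called known-item sampling: pick a random article, take a few content words from one of its sentences, and treat that article as the expected result. This is only an illustration under stated assumptions, not a description of how Wikipedia does it. The function name `sample_queries`, the `articles` dict (title mapped to plain text), and the tiny stopword list are all hypothetical.

```python
import random
import re

# Tiny illustrative stopword list (an assumption; a real list would be much longer).
STOPWORDS = {"the", "a", "an", "in", "of", "and", "to", "is", "was",
             "for", "on", "by", "with"}


def sample_queries(articles, n_queries, min_terms=2, max_terms=5, seed=0):
    """Return (query, expected_title) pairs sampled from the corpus itself.

    `articles` is a hypothetical dict mapping article title -> plain text.
    Each query is built from content words of a single sentence, so the
    source article is a reasonable "should be ranked highly" answer.
    """
    rng = random.Random(seed)          # fixed seed keeps the test set reproducible
    titles = sorted(articles)          # sorted so sampling does not depend on dict order
    queries = []
    attempts = 0
    while len(queries) < n_queries and attempts < n_queries * 20:
        attempts += 1
        title = rng.choice(titles)
        sentences = [s for s in re.split(r"[.!?]+", articles[title]) if s.strip()]
        if not sentences:
            continue
        sentence = rng.choice(sentences)
        words = [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
                 if w not in STOPWORDS]
        if len(words) < min_terms:
            continue
        k = rng.randint(min_terms, min(max_terms, len(words)))
        # Sample word positions, then sort them to keep the original word order.
        positions = sorted(rng.sample(range(len(words)), k))
        queries.append((" ".join(words[i] for i in positions), title))
    return queries


if __name__ == "__main__":
    demo = {
        "History of Italy": "Italy was unified in 1861. "
                            "The Roman Empire shaped the history of Europe.",
        "Banana": "The banana is an edible fruit. Bananas grow in tropical regions.",
    }
    for query, expected in sample_queries(demo, 3):
        print(f"{query!r} should return {expected!r}")
```

Each pair can be used twice: timing the query measures performance, and checking whether the expected title appears near the top of the results gives a rough correctness signal. Queries sampled this way are still more regular than what real users type, so they would complement real query logs rather than replace them.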
I have looked here
(http://blog.wikimedia.org/2012/09/19/what-are-readers-looking-for-wikipedia…)
but the data has been down for a while.
Cheers,
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB