I know the best place to ask this would be Labs-l, but I just subscribed to it and am waiting for the list admin to be added. In the meanwhile some good soul could please forward this message there?
Is there a reason why the tables on the Labs replica are not indexed? Drawing a list of random titles with page_random takes more than a minute!
MariaDB [enwiki_p]> select page_id, page_title, page_random from page where page_random > rand() and page_namespace = 0 order by page_random limit 10; +----------+---------------------------------------+----------------+ | page_id | page_title | page_random | +----------+---------------------------------------+----------------+ | 18247710 | Jack_frost_comic | 0.000597330654 | | 2102857 | Prince_Carl_Christian_of_Hohenzollern | 0.000885503126 | | 21550893 | Phoberocyon | 0.000927813129 | | 2836759 | Rinascimento | 0.000943768415 | | 20972414 | Gothic_black_metal | 0.000989090479 | | 1654899 | 1500v_DC | 0.001175443821 | | 43905082 | Juan_García_Costilla | 0.00153940668 | | 1316905 | T_subsurface_stock | 0.001564608112 | | 1447578 | RPGMaker_2000 | 0.00158874586 | | 2034084 | State_Bar_of_California | 0.001697520497 | +----------+---------------------------------------+----------------+ 10 rows in set (1 min 21.14 sec)
MariaDB [enwiki_p]> show index from page; Empty set (0.00 sec)
MariaDB [enwiki_p]>
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
The tables you are access on labs are actually "views". They won't show their indexes because they are the result of a query.
Also keep in mind that some views are *not* indexed.
For example, "revision" is not indexed, but "revision_userindex" is. Similarly, "logging" is not indexed, but "logging_userindex" is. This is confusing and cause for a lot for frustration, but there's good reason. See https://bugzilla.wikimedia.org/show_bug.cgi?id=66786
On Tue, Nov 11, 2014 at 12:23 AM, Giovanni Luca Ciampaglia < gciampag@indiana.edu> wrote:
I know the best place to ask this would be Labs-l, but I just subscribed to it and am waiting for the list admin to be added. In the meanwhile some good soul could please forward this message there?
Is there a reason why the tables on the Labs replica are not indexed? Drawing a list of random titles with page_random takes more than a minute!
MariaDB [enwiki_p]> select page_id, page_title, page_random from page where page_random > rand() and page_namespace = 0 order by page_random limit 10; +----------+---------------------------------------+----------------+ | page_id | page_title | page_random | +----------+---------------------------------------+----------------+ | 18247710 | Jack_frost_comic | 0.000597330654 | | 2102857 | Prince_Carl_Christian_of_Hohenzollern | 0.000885503126 | | 21550893 | Phoberocyon | 0.000927813129 | | 2836759 | Rinascimento | 0.000943768415 | | 20972414 | Gothic_black_metal | 0.000989090479 | | 1654899 | 1500v_DC | 0.001175443821 | | 43905082 | Juan_García_Costilla | 0.00153940668 | | 1316905 | T_subsurface_stock | 0.001564608112 | | 1447578 | RPGMaker_2000 | 0.00158874586 | | 2034084 | State_Bar_of_California | 0.001697520497 | +----------+---------------------------------------+----------------+ 10 rows in set (1 min 21.14 sec)
MariaDB [enwiki_p]> show index from page; Empty set (0.00 sec)
MariaDB [enwiki_p]>
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Ah, I didn't know the tables on LabsDB where actually views! That makes sense now. I guess the documentation should be more explicit on this. Right now this detail is only mentioned in passing here (emphasis added):
The revision and logging tables do not have indexes on user columns. In an
email http://lists.wikimedia.org/pipermail/labs-l/2013-September/001607.html, one of the system administrators pointed out that this is because "those values are conditionally nulled when supressed". One has to instead use the correspondingrevision_userindex or logging_userindex for these types of queries. *On those views*, rows where the column would have otherwise been nulled are elided; this allows the indexes to be usable.
https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Tables_for_revis...
Do you guys welcome contribution on wikitech?
G
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
2014-11-11 8:46 GMT-05:00 Aaron Halfaker aaron.halfaker@gmail.com:
The tables you are access on labs are actually "views". They won't show their indexes because they are the result of a query.
Also keep in mind that some views are *not* indexed.
For example, "revision" is not indexed, but "revision_userindex" is. Similarly, "logging" is not indexed, but "logging_userindex" is. This is confusing and cause for a lot for frustration, but there's good reason. See https://bugzilla.wikimedia.org/show_bug.cgi?id=66786
On Tue, Nov 11, 2014 at 12:23 AM, Giovanni Luca Ciampaglia < gciampag@indiana.edu> wrote:
I know the best place to ask this would be Labs-l, but I just subscribed to it and am waiting for the list admin to be added. In the meanwhile some good soul could please forward this message there?
Is there a reason why the tables on the Labs replica are not indexed? Drawing a list of random titles with page_random takes more than a minute!
MariaDB [enwiki_p]> select page_id, page_title, page_random from page where page_random > rand() and page_namespace = 0 order by page_random limit 10; +----------+---------------------------------------+----------------+ | page_id | page_title | page_random | +----------+---------------------------------------+----------------+ | 18247710 | Jack_frost_comic | 0.000597330654 | | 2102857 | Prince_Carl_Christian_of_Hohenzollern | 0.000885503126 | | 21550893 | Phoberocyon | 0.000927813129 | | 2836759 | Rinascimento | 0.000943768415 | | 20972414 | Gothic_black_metal | 0.000989090479 | | 1654899 | 1500v_DC | 0.001175443821 | | 43905082 | Juan_García_Costilla | 0.00153940668 | | 1316905 | T_subsurface_stock | 0.001564608112 | | 1447578 | RPGMaker_2000 | 0.00158874586 | | 2034084 | State_Bar_of_California | 0.001697520497 | +----------+---------------------------------------+----------------+ 10 rows in set (1 min 21.14 sec)
MariaDB [enwiki_p]> show index from page; Empty set (0.00 sec)
MariaDB [enwiki_p]>
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Do you guys welcome contribution on wikitech?
yes please! :)
On Tue, Nov 11, 2014 at 11:28 AM, Giovanni Luca Ciampaglia < gciampag@indiana.edu> wrote:
Ah, I didn't know the tables on LabsDB where actually views! That makes sense now. I guess the documentation should be more explicit on this. Right now this detail is only mentioned in passing here (emphasis added):
The revision and logging tables do not have indexes on user columns. In
an email http://lists.wikimedia.org/pipermail/labs-l/2013-September/001607.html, one of the system administrators pointed out that this is because "those values are conditionally nulled when supressed". One has to instead use the correspondingrevision_userindex or logging_userindex for these types of queries. *On those views*, rows where the column would have otherwise been nulled are elided; this allows the indexes to be usable.
https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Tables_for_revis...
Do you guys welcome contribution on wikitech?
G
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
2014-11-11 8:46 GMT-05:00 Aaron Halfaker aaron.halfaker@gmail.com:
The tables you are access on labs are actually "views". They won't show
their indexes because they are the result of a query.
Also keep in mind that some views are *not* indexed.
For example, "revision" is not indexed, but "revision_userindex" is. Similarly, "logging" is not indexed, but "logging_userindex" is. This is confusing and cause for a lot for frustration, but there's good reason. See https://bugzilla.wikimedia.org/show_bug.cgi?id=66786
On Tue, Nov 11, 2014 at 12:23 AM, Giovanni Luca Ciampaglia < gciampag@indiana.edu> wrote:
I know the best place to ask this would be Labs-l, but I just subscribed to it and am waiting for the list admin to be added. In the meanwhile some good soul could please forward this message there?
Is there a reason why the tables on the Labs replica are not indexed? Drawing a list of random titles with page_random takes more than a minute!
MariaDB [enwiki_p]> select page_id, page_title, page_random from page where page_random > rand() and page_namespace = 0 order by page_random limit 10; +----------+---------------------------------------+----------------+ | page_id | page_title | page_random | +----------+---------------------------------------+----------------+ | 18247710 | Jack_frost_comic | 0.000597330654 | | 2102857 | Prince_Carl_Christian_of_Hohenzollern | 0.000885503126 | | 21550893 | Phoberocyon | 0.000927813129 | | 2836759 | Rinascimento | 0.000943768415 | | 20972414 | Gothic_black_metal | 0.000989090479 | | 1654899 | 1500v_DC | 0.001175443821 | | 43905082 | Juan_García_Costilla | 0.00153940668 | | 1316905 | T_subsurface_stock | 0.001564608112 | | 1447578 | RPGMaker_2000 | 0.00158874586 | | 2034084 | State_Bar_of_California | 0.001697520497 | +----------+---------------------------------------+----------------+ 10 rows in set (1 min 21.14 sec)
MariaDB [enwiki_p]> show index from page; Empty set (0.00 sec)
MariaDB [enwiki_p]>
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA ☞ http://www.glciampaglia.com/ ✆ +1 812 855-7261 ✉ gciampag@indiana.edu
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
wiki-research-l@lists.wikimedia.org