Thanks for your feedback on my first try at loading Wikidata into BigQuery:
https://lists.wikimedia.org/pipermail/wikidata/2016-March/008414.html

I think I figured out the 'convolution tree' for sub-classes; I left it here:
https://bigquery.cloud.google.com/table/fh-bigquery:wikidata.subclasses

It seems we have:
SELECT level, COUNT(*) c
FROM [fh-bigquery:wikidata.subclasses] b
GROUP BY 1
ORDER BY 1

- 453072 classes (level 0).
- 629663 "x subclass_of y" relations (level 1).
- 635074 "x subclass_of y and y subclass_of z" relations (level 2).
- 773622 level 3.
- ...
- 61920 level 11.
- ...
- 196 level 20.
- and that's it... the tree doesn't go deeper than 20.

Now I can ask for the Wikipedia pageviews of everyone who has an occupation that's a sub-class of 'musician' (or 'politician', or any other class):
SELECT en_wiki, SUM(requests) requests, FIRST(occupation) occupation,
  VARIANCE(LOG(requests)) logvar
FROM [fh-bigquery:wikipedia.pagecounts_201602_en_top365k] a
JOIN (
  # map each person's en_wiki title to their matching occupation labels
  SELECT en_wiki, GROUP_CONCAT(b.en_label) occupation
  FROM FLATTEN([wikidata.latest_en_v1], occupation) a
  JOIN (
    # every class in the subclass tree under 'musician' (Q639669)
    SELECT numeric_id, GROUP_CONCAT(en_label) en_label
    FROM [fh-bigquery:wikidata.subclasses] b
    WHERE subclass_of_numeric_id=639669
    GROUP BY 1
  ) b
  ON a.occupation.numeric_id=b.numeric_id
  GROUP BY 1
) b
ON a.title=b.en_wiki
#WHERE language='en'
GROUP BY 1
HAVING logvar<2
ORDER BY 2 DESC
LIMIT 8000
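As an aside on the subclasses table: each level above counts subclass_of chains of that length, so you can reproduce the counts from the raw (child, parent) pairs by repeatedly joining the current set of chains against the direct relations. A minimal Python sketch of that path counting (a hypothetical helper of my own, not the code that built the table; it assumes the subclass graph has no cycles):

```python
from collections import defaultdict

def chain_counts(edges):
    """Count subclass_of chains of each length.

    edges: iterable of (child, parent) pairs, i.e. "x subclass_of y" rows.
    Level 1 counts the direct relations; level k counts chains
    x1 -> x2 -> ... -> x(k+1). Assumes the graph is acyclic.
    """
    by_child = defaultdict(list)  # parents reachable in one step from a child
    for child, parent in edges:
        by_child[child].append(parent)

    # chains at the current level, keyed by (start, end) with multiplicity,
    # so two distinct equal-length paths between the same pair both count
    current = defaultdict(int)
    for child, parent in edges:
        current[(child, parent)] += 1

    counts = {1: sum(current.values())}
    level = 1
    while current:
        nxt = defaultdict(int)
        # extend every chain by one more subclass_of step
        for (start, end), n in current.items():
            for parent in by_child.get(end, ()):
                nxt[(start, parent)] += n
        level += 1
        if not nxt:
            break
        counts[level] = sum(nxt.values())
        current = nxt
    return counts
```

With the four edges 1→2, 2→3, 3→4, 1→3 this returns {1: 4, 2: 3, 3: 1}: four direct relations, three two-step chains, and one three-step chain.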
https://github.com/fhoffa/code_snippets/blob/master/wikidata/musicians_all_pageviews.sql

And the results:
(this query took 5.2s, for 6.77 GB processed)

Hopefully you'll find this useful! I know that SQL is way less expressive than SPARQL, but it might save the day whenever the speed of BigQuery is needed. Try it out if you have a minute.

Please keep the feedback and advice coming,