Hi!
FYI to those it may concern - we plan to institute regular WDQS
deployments on Mondays, for both code and GUI. Not much is going to
change, except that deployments will happen at a predictable time instead
of "whenever I feel like it" :) This does not preclude emergency
deployments in case something is broken, of course, but hopefully it will
introduce more predictability as the service matures. Guillaume will
usually be doing the deployments, with me stepping in if he can not.
-------- Forwarded Message --------
Subject: [Ops] Wikidata Query Service (WDQS) regular deployment window
Date: Tue, 22 Mar 2016 22:04:21 +0100
From: Guillaume Lederrey <glederrey(a)wikimedia.org>
To: A public mailing list about Wikimedia Search and Discovery projects
<discovery(a)lists.wikimedia.org>, Operations Engineers
<ops(a)lists.wikimedia.org>
Hello!
After discussion with Stas, we want to have a regular deployment
window for Wikidata Query Service. This should give better
visibility on when new versions arrive and help track issues with
those new versions. I will take care of the deployments (with Stas'
support, of course).
The deployment window is: every Monday, from 7pm CET (10am PST - 5pm
UTC) starting from Monday April 11th.
Let me know if you have any questions, or if you know of another place
where I should publicize this deployment window.
Take care,
Guillaume
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
_______________________________________________
Ops mailing list
Ops(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/ops
Hey folks :)
I am looking for someone to support me with my product management work
at WMDE as an intern. If you'd love to work with me and the rest of the
team, love Wikidata and want to learn, this might be the thing for you.
More details are here:
https://wikimedia.de/wiki/Internship_Product_Management
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Society for the Promotion of Free Knowledge e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
I've recently established a synchronized stand-alone copy of Wikidata on
a local machine. It's working quite well, but one of the benefits I had
hoped to gain from doing this was the ability to integrate federated
queries to other services into queries against Wikidata. I can
certainly see from the thread associated with Mr. Neubert's message [1]
(ref. below) the motivation for disabling remote SERVICE calls on WD's
public endpoints, but it would be of great value for me if I could do so
in my own private environment.
Could someone provide me with guidance as to how I might enable
federated queries to remote endpoints from queries posed within my local
WD installation?
Thanks,
Eric Scott
[1] https://www.mail-archive.com/wikidata@lists.wikimedia.org/msg02063.html
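Not an authoritative answer, but for illustration: stock Blazegraph implements SPARQL 1.1 federation, so once a local endpoint accepts SERVICE clauses, a federated query is submitted like any other query. In the sketch below, the local endpoint URL, the remote endpoint in the SERVICE clause, and the query itself are placeholder assumptions, not working values from this thread.

```python
import json
import urllib.parse
import urllib.request

# Placeholder URL -- adjust to your local Blazegraph/WDQS instance.
LOCAL_ENDPOINT = "http://localhost:9999/bigdata/namespace/wdq/sparql"

# A federated query: the outer pattern runs locally, the SERVICE block
# is forwarded to the remote endpoint. Prefixes like wd:/wdt: are
# preconfigured in the WDQS distribution; a vanilla Blazegraph install
# would need explicit PREFIX declarations.
FEDERATED_QUERY = """
SELECT ?item ?remoteLabel WHERE {
  ?item wdt:P31 wd:Q5 .
  SERVICE <https://dbpedia.org/sparql> {
    ?item rdfs:label ?remoteLabel .
  }
} LIMIT 10
"""

def run_query(endpoint, query):
    """POST a SPARQL query and return the parsed JSON result bindings."""
    data = urllib.parse.urlencode({"query": query}).encode()
    req = urllib.request.Request(
        endpoint, data=data,
        headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]["bindings"]

# Usage (requires a running local endpoint):
#   rows = run_query(LOCAL_ENDPOINT, FEDERATED_QUERY)
```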
We have several contributions of relevance to the Wikidata community at our
wiki research workshop hosted at WWW'16 (more papers will be announced soon
from our second workshop at ICWSM '16).
Dario
---------- Forwarded message ----------
From: Dario Taraborelli <dtaraborelli(a)wikimedia.org>
Date: Sat, Mar 19, 2016 at 9:00 AM
Subject: WWW 2016 Wiki Workshop: accepted papers
To: Research into Wikimedia content and communities <
wiki-research-l(a)lists.wikimedia.org>
We're thrilled to announce the list of papers accepted at the WWW 2016 Wiki
Workshop <http://snap.stanford.edu/wikiworkshop2016/>. You can follow
@wikiworkshop16 <https://twitter.com/wikiworkshop16> for updates.
Dario
(on behalf of the organizers)
Johanna Geiß and Michael Gertz
With a Little Help from my Neighbors: Person Name Linking Using the
Wikipedia Social Network
Ramine Tinati, Markus Luczak-Roesch and Wendy Hall
Finding Structure in Wikipedia Edit Activity: An Information Cascade
Approach
Paolo Boldi and Corrado Monti
Cleansing Wikipedia Categories using Centrality
Thomas Steiner
Wikipedia Tools for Google Spreadsheets
Yu Suzuki and Satoshi Nakamura
Assessing the Quality of Wikipedia Editors through Crowdsourcing
Vikrant Yadav and Sandeep Kumar
Learning Web Queries For Retrieval of Relevant Information About an Entity
in a Wikipedia Category
Haggai Roitman, Shay Hummel, Ella Rabinovich, Benjamine Sznajder, Noam
Slonim and Ehud Aharoni
On the Retrieval of Wikipedia Articles Containing Claims on Controversial
Topics
Tanushyam Chattopadhyay, Santa Maiti and Arindam Pal
Automatic Discovery of Emerging Trends using Cluster Name Synthesis on User
Consumption Data
Freddy Brasileiro, João Paulo A. Almeida, Victorio A. Carvalho and
Giancarlo Guizzardi
Applying a Multi-Level Modeling Theory to Assess Taxonomic Hierarchies in
Wikidata
Dario Taraborelli, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
Just a note that I gave a Wikidata-centric presentation entitled "Navigate All the Knowledge", which is available on video online. It demonstrates the ConceptMap application and supporting technologies such as Wikidata and the Wikimedia APIs: https://t.co/F1msGB9cTL
Regards,
James Weaver
Sent from my iPhone
Hi all,
I have a question about the SPARQL endpoint running on https://query.wikidata.org for which I would like to ask for your help.
I am currently running an experiment to figure out how many Wikidata entries refer to identifiers in our dataset (i.e. using property P727), but the results include entries that have apparently been deleted or deprecated (e.g. http://www.wikidata.org/entity/Q18573617). Is there a way to detect them using SPARQL, perhaps via some meta-property or some information in a statement, or is it simply that the endpoint is not in sync with the main repo?
Thank you in advance!
Kind regards,
Hugo
Hugo Manguinhas
Technical R&D Coordinator
T: +31 (0)70 314 0967
E: Hugo.Manguinhas(a)europeana.eu
Skype: hugo.manguinhas
Be part of Europe's online cultural movement - join the Europeana Network Association: http://bit.ly/NetworkAssociation
#AllezCulture!
Thanks for your feedback on my first try on loading Wikidata on BigQuery
https://lists.wikimedia.org/pipermail/wikidata/2016-March/008414.html
I think I figured out the 'convolution tree' for sub-classes; I left it
here:
https://bigquery.cloud.google.com/table/fh-bigquery:wikidata.subclasses
It seems we have:
SELECT level, COUNT(*) c
FROM [fh-bigquery:wikidata.subclasses] b
GROUP BY 1
ORDER BY 1
- 453072 classes (level 0).
- 629663 x subclass_of y relations (level 1).
- 635074 x is a subclass_of y and y is a subclass_of z relations (level 2).
- 773622 level 3.
- ...
- 61920 level 11.
- ...
- 196 level 20.
- and that's it... the tree doesn't go deeper than 20.
https://i.imgur.com/BUv8Hdp.png
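As a sanity check on the shape of those numbers, per-level chain counts can be computed from a plain edge list in a few lines of Python. This is a toy sketch, not the BigQuery logic itself: here "level k" counts distinct (start, ancestor) pairs reachable in exactly k subclass_of hops, and the entity names in the example are made up.

```python
from collections import defaultdict

def chain_counts(edges, max_level=20):
    """Count distinct k-hop subclass_of chains for each k >= 1.

    edges: iterable of (child, parent) pairs from P279 statements.
    Returns {level: count}. Assumes the subclass graph is acyclic.
    """
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    counts = {}
    frontier = {(c, p) for c, p in edges}  # exactly 1 hop
    level = 1
    while frontier and level <= max_level:
        counts[level] = len(frontier)
        # extend every chain by one more subclass_of hop
        frontier = {(start, grand)
                    for start, end in frontier
                    for grand in parents.get(end, ())}
        level += 1
    return counts

# Toy example: guitarist -> musician -> artist, plus singer -> musician.
edges = [("guitarist", "musician"), ("singer", "musician"),
         ("musician", "artist")]
# chain_counts(edges) -> {1: 3, 2: 2}
```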
Now I can ask for the Wikipedia pageviews of everyone that has an
occupation that's a sub-class of 'musician' (or 'politician', or any other
class):
SELECT en_wiki, SUM(requests) requests, FIRST(occupation) occupation,
       VARIANCE(LOG(requests)) logvar
FROM [fh-bigquery:wikipedia.pagecounts_201602_en_top365k] a
JOIN (
  # people whose occupation is any subclass of musician (Q639669)
  SELECT en_wiki, GROUP_CONCAT(b.en_label) occupation
  FROM FLATTEN([wikidata.latest_en_v1], occupation) a
  JOIN (
    SELECT numeric_id, GROUP_CONCAT(en_label) en_label
    FROM [fh-bigquery:wikidata.subclasses] b
    WHERE subclass_of_numeric_id=639669  # Q639669 = musician
    GROUP BY 1
  ) b
  ON a.occupation.numeric_id=b.numeric_id
  GROUP BY 1
) b
ON a.title=b.en_wiki
#WHERE language='en'
GROUP BY 1
HAVING logvar<2  # drop pages whose daily traffic varies too wildly
ORDER BY 2 DESC
LIMIT 8000
https://github.com/fhoffa/code_snippets/blob/master/wikidata/musicians_all_…
And the results:
en_wiki requests occupation logvar
Kanye_West 940181 singer,rapper 0.6882018066
Sia_Furler 562789 singer,songwriter,composer 0.561231088
Brie_Larson 555301 singer,musician,singer-songwriter 0.6390245475
Beyonc%C3%A9 503342 record producer,singer-songwriter,singer,composer,musician 0.477047463
David_Bowie 502810 singer-songwriter,guitarist,saxophonist,composer,record producer 0.1213822659
Adele 480541 singer-songwriter,singer,guitarist 0.1618690338
Keanu_Reeves 471244 singer,musician 0.53960377
Rihanna 419017 singer 0.1801038939
Taylor_Swift 409519 singer-songwriter,pianist,bajista,composer,guitarist 0.2761908317
Zayn_Malik 405848 singer 0.08530145229
Kesha 402165 singer,composer,singer-songwriter,yodeler 1.781225996
Lady_Gaga 390866 singer,songwriter,record producer,composer,pianist,musician 0.5030283604
Michael_Jackson 361344 singer,singer-songwriter,composer,musician,songwriter,record producer 0.1256237352
Bill_Clinton 347856 saxophonist 0.150877313
Kendrick_Lamar 338141 singer,songwriter,rapper 0.4858019514
Justin_Bieber 329738 singer-songwriter,singer,musician 0.1038992156
... ... ... ...
https://github.com/fhoffa/code_snippets/blob/master/wikidata/musicians_all_…
(this query took 5.2s, for 6.77 GB processed)
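For anyone who would rather script this than use the web UI, the level-counts query can also be issued from Python. A minimal sketch, assuming the public table name from this thread and that the `google-cloud-bigquery` package plus credentials are set up (the client call is left commented out for that reason; legacy SQL must be requested explicitly in the modern client):

```python
# Public table from this thread; legacy-SQL bracket syntax.
SUBCLASSES_TABLE = "[fh-bigquery:wikidata.subclasses]"

def level_count_sql(table=SUBCLASSES_TABLE):
    """Build the legacy-SQL query counting subclass chains per level."""
    return (
        "SELECT level, COUNT(*) c "
        "FROM {} GROUP BY 1 ORDER BY 1".format(table)
    )

# To actually execute it (requires GCP credentials):
#
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   job_config = bigquery.QueryJobConfig(use_legacy_sql=True)
#   for row in client.query(level_count_sql(), job_config=job_config):
#       print(row.level, row.c)
```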
Hopefully you'll find this useful! I know that SQL is way less expressive
than SPARQL, but it might save the day whenever you need the speed of
BigQuery. Try it out if you have a minute.
Please keep the feedback and advice coming,
Felipe Hoffa
https://twitter.com/felipehoffa