---------- Forwarded message ----------
From: "Marco Fossati" <fossati(a)fbk.eu>
Date: 11 Nov 2016 1:23 PM
Subject: Fwd: Re: [wikicite-discuss] Entity tagging and fact extraction
(from a scholarly publisher perspective)
To: "Marco Fossati" <fossati(a)spaziodati.eu>

---------- Forwarded message ----------
From: "Marco Fossati" <fossati(a)fbk.eu>
Date: 11 Nov 2016 1:18 PM
Subject: Re: [wikicite-discuss] Entity tagging and fact extraction (from a
scholarly publisher perspective)
To: "Andrew Smeall" <andrew.smeall(a)hindawi.com>
Cc: "Dario Taraborelli" <dtaraborelli(a)wikimedia.org>, "Benjamin Good" <
ben.mcgee.good(a)gmail.com>, "Discussion list for the Wikidata project." <
wikidata(a)lists.wikimedia.org>, "wikicite-discuss" <
wikicite-discuss(a)wikimedia.org>, "Daniel Mietchen" <
Just a couple of thoughts, which are in line with Dario's first message:
1. The primary sources tool lets third-party providers release *full
datasets* fairly quickly. It is conceived to (a) ease the ingestion
of *non-curated* data and (b) let the community directly decide which
statements should be included, instead of requiring potentially complex
a priori curation.
Important: the datasets should comply with the Wikidata vocabulary/ontology.
2. I see the mix'n'match tool as a way to *link* datasets with Wikidata via
ID mappings, thus only requiring statements that say "Wikidata entity X
links to the third party dataset entity Y".
This is pretty much what the linked data community has been doing so far.
No need to comply with the Wikidata vocabulary/ontology.
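To make the contrast concrete, here is a minimal sketch of the two kinds of
statements (not from the original thread; the item IDs, the external-ID
property "Pxxxx" and the value "ABC-42" are placeholders, while P921, main
subject, is a real Wikidata property):

    # Sketch: full-statement ingestion vs. ID mapping for a third-party dataset.
    # "Q..." / "Pxxxx" / "ABC-42" are placeholders, not real identifiers.

    # (1) Primary sources tool: full statements expressed with Wikidata's own
    #     vocabulary, e.g. "article Q123 has main subject (P921) Q456".
    full_statement = {
        "subject": "Q123",   # Wikidata item for the article
        "property": "P921",  # main subject
        "value": "Q456",     # Wikidata item for the topic
    }

    # (2) Mix'n'match: one mapping statement per entity, via an
    #     external-identifier property: "Q123 has <dataset> ID ABC-42".
    id_mapping = {
        "subject": "Q123",
        "property": "Pxxxx",  # external-ID property for the third-party dataset
        "value": "ABC-42",    # the entity's ID in that dataset
    }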
On 11 Nov 2016 10:27 AM, "Andrew Smeall" <andrew.smeall(a)hindawi.com> wrote:
> Regarding the topics/vocabularies issue:
> A challenge we're working on is finding a set of controlled vocabularies
> for all the subject areas we cover.
> We do use MeSH for those subjects, but this only applies to about 40% of
> our papers. In Engineering, for example, we've had more trouble finding an
> open taxonomy with the same level of depth as MeSH. For most internal
> applications, we need 100% coverage of all subjects.
> Machine learning for concept tagging is trendy now, partly because it
> doesn't require a preset vocabulary, but we are somewhat against this
> approach because we want to control the mapping of terms, and a taxonomic
> hierarchy can be useful. The current ML tools I've seen can match against a
> controlled vocabulary, but then they need the publisher to supply the terms.
> The temptation to build a new vocabulary is strong, because it's the
> fastest way to get to something that is non-proprietary and universal. We
> can merge existing open vocabularies like MeSH and PLOS to get most of the
> way there, but we then need to extend that with concepts from our corpus.
> Thanks Daniel and Benjamin for your responses. Any other feedback would be
> great, and I'm always happy to delve into issues from the publisher
> perspective if that can be helpful.
> On Fri, Nov 11, 2016 at 4:54 PM, Dario Taraborelli <
> dtaraborelli(a)wikimedia.org> wrote:
>> Benjamin – agreed, I too see Wikidata as mainly a place to hold all the
>> mappings. Once we support federated queries in WDQS, the benefit of ID
>> mapping (over extensive data ingestion) will become even more apparent.
>> Hope Andrew and other interested parties can pick up this thread.
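As a rough sketch of the kind of federated query this alludes to (assuming
the Python SPARQLWrapper library; the publisher endpoint, its vocabulary and
the external-ID property Pxxxx are hypothetical placeholders, and WDQS only
federates with endpoints it explicitly allows):

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Join ID mappings stored in Wikidata with data living in a (hypothetical)
    # third-party SPARQL endpoint, instead of ingesting that data into Wikidata.
    QUERY = """
    PREFIX ex: <https://example.org/ns#>
    SELECT ?item ?externalId ?remoteLabel WHERE {
      ?item wdt:Pxxxx ?externalId .            # mapping statement in Wikidata
      SERVICE <https://example.org/sparql> {   # placeholder publisher endpoint
        ?remote ex:identifier ?externalId ;
                rdfs:label ?remoteLabel .
      }
    }
    LIMIT 10
    """

    sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()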
>> On Wed, Nov 2, 2016 at 12:11 PM, Benjamin Good <ben.mcgee.good(a)gmail.com> wrote:
>>> One message you can send is that they can and should use existing
>>> controlled vocabularies and ontologies to construct the metadata they want
>>> to share. For example, MeSH descriptors would be a good way for them to
>>> organize the 'primary topic' assertions for their articles and would make
>>> it easy to find the corresponding items in Wikidata when uploading. Our
>>> group will be continuing to expand coverage of identifiers and concepts
>>> from vocabularies like that in Wikidata - and any help there from
>>> publishers would be appreciated!
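As one hedged illustration of the kind of lookup this enables (assuming the
SPARQLWrapper library; the descriptor ID "D012345" is a placeholder, while
P486, the MeSH descriptor property, and P921, main subject, are real), a
publisher holding MeSH terms could resolve each descriptor to its Wikidata
item before asserting main subject statements:

    from SPARQLWrapper import SPARQLWrapper, JSON

    def items_for_mesh_descriptor(mesh_id):
        """Return the Wikidata item URI(s) carrying a given MeSH ID (P486)."""
        sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
        sparql.setQuery(
            'SELECT ?item WHERE { ?item wdt:P486 "%s" . }' % mesh_id)
        sparql.setReturnFormat(JSON)
        bindings = sparql.query().convert()["results"]["bindings"]
        return [b["item"]["value"] for b in bindings]

    # Placeholder descriptor ID; real uploads would use the publisher's terms.
    print(items_for_mesh_descriptor("D012345"))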
>>> My view here is that Wikidata can be a bridge to the terminologies and
>>> datasets that live outside it - not really a replacement for them. So, if
>>> they have good practices about using shared vocabularies already, it should
>>> (eventually) be relatively easy to move relevant assertions into the
>>> Wikidata graph while maintaining interoperability and integration with
>>> external software systems.
>>> On Wed, Nov 2, 2016 at 8:31 AM, 'Daniel Mietchen' via wikicite-discuss <
>>> wikicite-discuss(a)wikimedia.org> wrote:
>>>> I'm traveling ( https://twitter.com/EvoMRI/status/793736211009536000
>>>> ), so just in brief:
>>>> In terms of markup, some general comments are in
>>>> https://www.ncbi.nlm.nih.gov/books/NBK159964/ , which is not specific
>>>> to Hindawi but partly applies to them too.
>>>> A problem specific to Hindawi (cf.
>>>> https://commons.wikimedia.org/wiki/Category:Media_from_Hindawi) is the
>>>> bundling of the descriptions of all supplementary files, which
>>>> translates into uploads like
>>>> (with descriptions for nine files)
>>>> and eight files with no description, e.g.
>>>> There are other problems in their JATS, and it would be good if they
>>>> would participate in
>>>> http://jats4r.org/ . Happy to dig deeper with Andrew or whoever is interested.
>>>> Where they are ahead of the curve is licensing information, so they
>>>> could help us set up workflows to get that info into Wikidata.
>>>> In terms of triple suggestions to Wikidata:
>>>> - as far as article metadata is concerned, I would prefer to
>>>> concentrate on integrating our workflows with the major repositories
>>>> of metadata, to which publishers are already posting. They could help
>>>> us by using more identifiers (e.g. for authors, affiliations, funders
>>>> etc.), potentially even from Wikidata (e.g. for keywords / P921, for
>>>> both journals and articles) and by contributing to the development of
>>>> tools (e.g. a bot that goes through the CrossRef database every day
>>>> and creates Wikidata items for newly published papers; see the sketch
>>>> after this list).
>>>> - if they have ways to extract statements from their publication
>>>> corpus, it would be good if they would let us/ ContentMine/ StrepHit
>>>> etc. know, so we could discuss how to move this forward.
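A hedged sketch of the daily CrossRef bot idea mentioned above (the CrossRef
REST API call is real; the Wikidata write step is deliberately left as a
stub, since item creation and duplicate checks against DOI (P356) would need
their own, agreed-upon workflow):

    import datetime
    import requests

    def new_crossref_works(rows=100):
        """Fetch works indexed by CrossRef since yesterday."""
        since = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"filter": "from-index-date:" + since, "rows": rows},
        )
        resp.raise_for_status()
        return resp.json()["message"]["items"]

    for work in new_crossref_works():
        doi = work.get("DOI")
        title = (work.get("title") or [""])[0]
        # Stub: check whether an item with this DOI already exists and
        # create one otherwise; the actual Wikidata write is out of scope here.
        print(doi, title)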
>>>> On Wed, Nov 2, 2016 at 1:42 PM, Dario Taraborelli
>>>> <dtaraborelli(a)wikimedia.org> wrote:
>>>> > I'm at the Crossref LIVE 16 event in London, where I just gave a talk
>>>> > on WikiCite and Wikidata targeted at scholarly publishers.
>>>> > Besides Crossref and DataCite people, I talked to a bunch of folks
>>>> > interested in collaborating on Wikidata integration, particularly from
>>>> > PLOS and Springer Nature. I started an interesting discussion with Andrew,
>>>> > who runs strategic projects at Hindawi, and I wanted to open it up to
>>>> > everyone on the lists.
>>>> > Andrew asked me whether, aside from efforts like ContentMine and
>>>> > StrepHit, there are any recommendations for publishers (especially OA
>>>> > publishers) to mark up their contents and facilitate information
>>>> > extraction and matching, or even push triples to Wikidata to be
>>>> > considered for inclusion.
>>>> > I don't think we have a recommended workflow for data providers to
>>>> > facilitate triple suggestions to Wikidata, other than leveraging the
>>>> > Primary Sources Tool. However, aligning keywords and terms with the
>>>> > corresponding Wikidata items via ID mapping sounds like a good first
>>>> > step. I pointed Andrew to Mix'n'Match as a handy way of mapping
>>>> > identifiers, but if you have other ideas on how to best support 2-way
>>>> > integration of Wikidata with scholarly contents, please chime in.
>>>> > Dario
>>>> > --
>>>> > Dario Taraborelli Head of Research, Wikimedia Foundation
>>>> > wikimediafoundation.org • nitens.org • @readermeter
>> *Dario Taraborelli* Head of Research, Wikimedia Foundation
>> wikimediafoundation.org • nitens.org • @readermeter
> Andrew Smeall
> Head of Strategic Projects
> Hindawi Publishing Corporation
> Kirkman House
> 12-14 Whitfield Street, 3rd Floor
> London, W1T 2RF
> United Kingdom
A conversation that happened on Twitter the other day suggests that we're
not doing a good enough job at structuring documentation on wiki.
While I think the subpages of [[m:WikiCite]]
<https://meta.wikimedia.org/wiki/WikiCite> are fairly well organized, [[WD:WikiProject
Source]] <https://www.wikidata.org/wiki/WD:WikiProject Source> badly needs
some reorganization:
- the WikiProject's *landing page* contains a lot of unstructured and
fairly outdated information: some of this information could be moved to
subpages, and we could use a navigation template like other WikiProjects do.
- *data modeling proposals* for different types of works are currently
only captured via a template (Template:Bibliographical_properties),
buried on the WP's talk page: we should have a dedicated page to host data
models, ideally a big table listing and annotating properties for different
types of works, as well as their mappings to existing bibliographic models.
- other important proposals (such as the use
of *stated in* (P248) to represent the *provenance of citation data* for
statements using *cites* (P2860)), as well as the documentation of specific *data
import strategies* (the Zika corpus, the OA review literature, PMCID
references in enwiki), are similarly buried in the talk page and hard to
find: this may raise a few eyebrows in the Wikidata community if we don't
make it clear how and why this data is being imported or represented.
If someone on this list is willing to spend some time and help with some
documentation/design effort, it would be tremendously useful, especially to
people who are not yet regularly following WikiCite and WP:Source Metadata:
we need to create an inclusive environment, and readability/navigability for
newbies is the first important step.
Perhaps of interest to other lists.
---------- Forwarded message ----------
From: Yuri Astrakhan <yastrakhan(a)wikimedia.org>
Date: Wed, Nov 9, 2016 at 9:46 AM
Subject: [Wikitech-l] Localizable data for Graphs and Templates on Commons
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Following the localizable maps example, here is a structured tabular data
example that also supports localization and shared data, and that can be used
directly from graphs or from Lua scripts on any wiki. Note that the
graph itself is on the English wiki (labs), but the data comes from Commons.
Feel free to add translations.
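As a rough illustration of how such a shared, localizable table can be read
(a sketch only: on-wiki consumers would use the Graph extension or Lua, and
"Data:Example.tab" is a placeholder page name, not the page Yuri refers to):

    import requests

    # Fetch the raw JSON of a (hypothetical) tabular Data: page on Commons.
    resp = requests.get(
        "https://commons.wikimedia.org/w/index.php",
        params={"title": "Data:Example.tab", "action": "raw"},
    )
    resp.raise_for_status()
    table = resp.json()

    # Field titles are per-language dictionaries, e.g. {"en": "...", "de": "..."},
    # which is what makes the table localizable.
    for field in table["schema"]["fields"]:
        titles = field.get("title", {})
        print(field["name"], titles.get("en") or next(iter(titles.values()), ""))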
On Mon, Nov 7, 2016 at 11:46 PM Yuri Astrakhan <yastrakhan(a)wikimedia.org>
> I would like to show one of the projects that Interactive team has been
> hacking on: localizable maps data (GeoJSON), stored on Commons, and usable
> from multiple wikis. I hope we can get it polished and enabled in
> production soon enough - so far, lab's beta cluster only:
My name is Nazanin and I am a Wikipedia editor. I studied statistics in my
undergraduate program and recently began my master's program, also in
statistics. Given my interest in Wikimedia projects, I would like to choose
a Wikidata-related project as my master's thesis. I have already worked on
descriptive statistics, and I look forward to working on Monte Carlo methods
for forecasting, simulation and modelling. How can I help with Wikidata,
and how could we have a productive cooperation?
All the best,
Wikimedia is among the 17 organizations in Google Code-in (GCI) 2016!
GCI starts on November 28th. It's a contest in which 13-17 year old students
work on small tasks, and a great opportunity to let new contributors make
progress and to get help with smaller tasks on your To-Do list!
There are currently 23 open Wikidata tasks marked as easy:
(and a good bunch of them are already marked for GCI, thanks!)
What we want you to do:
BECOME A MENTOR:
1. Go to https://www.mediawiki.org/wiki/Google_Code-in_2016 and add
yourself to the mentor's table.
2. Get an invitation email to register on the contest site.
PROVIDE SMALL TASKS:
We want your tasks in the following areas: code, outreach/research,
documentation/training, quality assurance, user interface/design.
1. Create a Phabricator task (which would take you 2-3h to complete) or
pick an existing Phabricator task you'd mentor.
2. Add the "Google-Code-In-2016" project tag.
3. Add a comment "I will mentor this in #GCI2016".
Looking for task ideas? Check the "easy" tasks in Phabricator:
https://www.mediawiki.org/wiki/Annoying_little_bugs offers links.
Make sure to cover expectations and deliverables in your task.
And once the contest starts on Nov 28, be ready to answer questions and
review the students' work.
Any questions? Just ask, we're happy to help.
Thank you for your help broadening our contributor base!
Andre Klapper | Wikimedia Bugwrangler
Over the past years a lot of people have rightfully complained about
how we are handling rounding and uncertainty in quantity values. Until
now, entering 124 m meant it was parsed, stored and displayed as 124
+/-1 m. People then often tried to change this to 124 +/-0 m in order
to prevent the uncertainty from being shown. This is often incorrect.
People also disagreed with the default of +/-1 instead of +/-0.5.
We have sat down and discussed at length how to improve this situation
and have now prepared two changes:
1) If no uncertainty is explicitly entered, we do not try to guess it
and no uncertainty will be displayed. But if an uncertainty is given
it will always be displayed, even if it is +/-0.
2) In order to apply correct rounding (in particular for unit
conversion) we still need to know the uncertainty interval. If no
uncertainty was explicitly given, we will still need to guess it. We
have now halved the default uncertainty interval to be consistent
with the rounding interval. So, for example, 124 would be treated as
124 +/-0.5 when applying rounding.
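For illustration only, here is a small Python restatement of those two
changes (this is not Wikibase code; it just encodes the display rule and the
new +/-0.5 rounding default described above):

    def display(amount, uncertainty=None):
        """Show an uncertainty only if one was explicitly entered."""
        if uncertainty is None:
            return str(amount)                     # no guessed uncertainty shown
        return "%s +/-%s" % (amount, uncertainty)  # shown even if it is 0

    def rounding_interval(uncertainty=None):
        """Interval used internally for rounding and unit conversion."""
        return 0.5 if uncertainty is None else uncertainty

    print(display(124))         # "124"
    print(display(124, 0))      # "124 +/-0"
    print(rounding_interval())  # 0.5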
We now have a lot of quantity values that are marked as exact (+/-0)
while their uncertainty is not explicitly known. With the above
changes all of these will be shown with +/-0. We also have a lot of
values that have the old default precision (+/-1), which prevents the
new default from being used.
Therefore we suggest the following two bot runs:
* Remove +/-0 from quantity values.
* Remove the precision if it is equal to the old default precision.
Both should be applied to properties that represent measured
quantities. We should avoid applying them to properties that represent
conversions or similar exactly-defined values. We can prepare the bot runs.
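A rough sketch of what such a cleanup could look for, operating on the JSON
serialization of a quantity value (amount/lowerBound/upperBound are part of
the real format; the decision logic and the choice of affected properties are
only illustrative and would need community agreement):

    from decimal import Decimal

    def strip_default_bounds(value):
        """Drop bounds that encode +/-0 or the old +/-1 default; keep others."""
        lower, upper = value.get("lowerBound"), value.get("upperBound")
        if lower is None or upper is None:
            return value                           # already has no bounds
        amount = Decimal(value["amount"])
        lower, upper = Decimal(lower), Decimal(upper)
        is_exact = lower == amount == upper                            # +/-0
        is_old_default = (amount - lower == 1) and (upper - amount == 1)
        if is_exact or is_old_default:
            return {k: v for k, v in value.items()
                    if k not in ("lowerBound", "upperBound")}
        return value

    print(strip_default_bounds(
        {"amount": "+124", "unit": "1",
         "lowerBound": "+123", "upperBound": "+125"}))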
These changes require a breaking change to the API and JSON binding. Daniel
will send a separate email about that in a minute.
We plan to make this change on the live system on November 15th. It
can be tested on the beta before that. I will let you know when.
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.