Hello all,
As you may know, ORES is a tool that analyzes edits to detect vandalism,
providing a score per edit. You can see the results on Recent Changes, and
you can also let us know when you find something wrong
<https://www.wikidata.org/wiki/Wikidata:ORES/Report_mistakes/2018>.
But did you know that you can also directly help ORES improve? We just
launched a new labeling campaign
<https://labels.wmflabs.org/ui/wikidatawiki/>: after authorizing your
account with OAuth, you will be shown some real edits and asked whether
you find them damaging or not, and made in good faith or bad faith.
Completing a set will take you around 10 minutes.
The last time we ran this campaign was in 2015. Since then, the way
Wikidata is edited has changed, and so have some vandalism patterns (for
example, there is more vandalism on companies). So, if you're familiar
with the Wikidata rules and willing to give a bit of your time to help
fight vandalism, please participate
<https://labels.wmflabs.org/ui/wikidatawiki/> :)
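For those curious about the scores themselves, they can also be fetched directly. A minimal sketch, assuming the ORES v3 REST endpoint; the revision ID below is just a placeholder, and fetching the resulting URL (e.g. with urllib.request) returns JSON with per-model probabilities:

```python
# Sketch: build the ORES v3 scores URL for one Wikidata revision.
# The revision ID is a placeholder; substitute a real one.
from urllib.parse import urlencode

def ores_score_url(revid, models=("damaging", "goodfaith"),
                   wiki="wikidatawiki"):
    """Return the ORES v3 scores URL for a single revision."""
    base = "https://ores.wikimedia.org/v3/scores/" + wiki + "/"
    return base + "?" + urlencode({"models": "|".join(models),
                                   "revids": revid})

url = ores_score_url(123456789)
print(url)
```

The response contains, for each model, a probability that the edit is damaging or made in good faith, which is what the labeling campaign helps to calibrate.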
If you encounter any problems or have questions about the tool, feel free
to contact Ladsgroup <https://www.wikidata.org/wiki/User:Ladsgroup>.
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hi,
I have a couple of questions regarding the wiki page ID. Does it always
stay unique to the page, where the page itself is just a placeholder for
content that might change over time?
Consider the following cases:
1. The first time someone creates page "Moon", it is assigned ID=1. If at
some point the page is renamed to "The_Moon", ID=1 remains intact. Is
this correct?
2. What if we have page "Moon" with ID=1, and someone creates a second
page "The_Moon" with ID=2. Is it possible for page "Moon" to be turned
into a redirect, so that "Moon" redirects to page "The_Moon"?
3. Is it possible for page "Moon" to become a category "Category:Moon" with
the same ID=1?
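For reference, these IDs can be inspected via the MediaWiki API (action=query with prop=info). A minimal sketch, assuming the standard Action API on www.wikidata.org and using the titles from the example above; in standard MediaWiki behavior a page move keeps the page ID, while the redirect left at the old title is created as a new page with its own ID:

```python
# Sketch: build a MediaWiki API URL that returns a page's ID (and whether
# it is a redirect) for a given title.
from urllib.parse import urlencode

def page_info_url(title, site="www.wikidata.org"):
    """API URL returning pageid and redirect status for one title."""
    params = {"action": "query", "prop": "info",
              "titles": title, "format": "json"}
    return "https://" + site + "/w/api.php?" + urlencode(params)

# Compare the IDs returned for both titles in the "Moon" example.
print(page_info_url("Moon"))
print(page_info_url("The_Moon"))
```

Comparing the `pageid` fields before and after a move (or after a page is edited into a redirect) makes the answers to questions 1 and 2 observable directly.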
Thanks,
Gintas
Hello everyone,
I'd like to ask if Wikidata could please offer an HDT [1] dump alongside the already available Turtle dump [2]. HDT is a binary format for storing RDF data. It is quite useful because it can be queried from the command line, used as a Jena/Fuseki source, and it takes orders of magnitude less space to store the same data. The problem is that generating an HDT file is very impractical: the current implementation needs a lot of RAM to convert a file, and for Wikidata it would probably require a machine with 100-200 GB of RAM. This is unfeasible for me because I don't have such a machine, but if you have one to share, I can help set up the rdf2hdt software required to convert the Wikidata Turtle dump to HDT.
Thank you.
[1] http://www.rdfhdt.org/
[2] https://dumps.wikimedia.org/wikidatawiki/entities/
Hello all,
The next IRC office hour with the Wikidata team will take place on
September 25th, from 18:00 to 19:00 (UTC+2, Berlin time
<https://www.timeanddate.com/worldclock/converter.html?iso=20180925T160000&p…>),
on the channel #wikimedia-office.
As usual, we will share some news from the development team, present
upcoming projects, and collect your feedback.
If you have any special topic you'd like to see as a focus, please share it
here!
Thanks,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
We're thrilled to announce that applications to attend WikiCite 2018 are
now open <https://goo.gl/forms/hV6rXRCdQ3fAK9v13>.
WikiCite 2018 <https://meta.wikimedia.org/wiki/WikiCite_2018> is a 3-day
conference, summit, and hack day dedicated to the vision of creating an
open repository of bibliographic data to support the citation and
fact-checking needs of Wikimedia projects, and possibly, to serve as an
open infrastructure for research, education, and information quality across
the web.
WikiCite 2018 expands efforts started with WikiCite 2016
<https://meta.wikimedia.org/wiki/WikiCite_2016> and WikiCite 2017
<https://meta.wikimedia.org/wiki/WikiCite_2017> to explore these
possibilities by leveraging Wikidata—Wikimedia’s structured knowledge
base—and to develop open source tools to improve citation management and
verifiability strategies for free knowledge. Since then, the amount of
bibliographic data in Wikidata has grown exponentially, allowing us to
glimpse the possibilities of a truly open, universal library and citation
graph, while also exposing significant social and technical challenges.
This year presents a pivotal moment for WikiCite, wherein its emergent
community — consisting of Wikimedians, librarians, LODLAM practitioners,
software engineers, data scientists, and open knowledge advocates — must
grapple with possible growth scenarios, address critical gaps, and set a
course for the project’s future
<https://www.wikidata.org/wiki/Wikidata:WikiCite/Roadmap>. If you are
passionate about tending Wikipedia’s root system (references!), or believe
in the broader possibilities of contributing to the bibliographic commons,
WikiCite 2018 presents an unprecedented opportunity to meet fellow dreamers
and hackers, and to help shape this vital effort.
This year’s event will be hosted at the David Brower Center
<https://en.wikipedia.org/wiki/David_Brower_Center> in Berkeley,
California, USA, November 27-29, 2018. Applications to attend the event
(including travel support requests
<https://meta.wikimedia.org/wiki/WikiCite_2018/Travel_funding>) are open
until *September 17, 2018*:
*WikiCite 2018 application form*
<https://docs.google.com/forms/d/e/1FAIpQLScfZdLgVZYUQfj99SSeUsj-uwe3BdHgjRQ…>
We hope you will join us!
–The WikiCite 2018 organizing committee
<https://meta.wikimedia.org/wiki/WikiCite_2018#Organizing_committee>
This application will be conducted via Google Forms—a third-party service,
which may subject it to additional terms. For more information on privacy
and data-handling, see the survey privacy statement
<https://foundation.wikimedia.org/wiki/WikiCite_2018_Application_Form_Privac…>.
I'm going to ask the opinion of the Wikidata list moderators here. This email appears to be a solicitation to pay for attendance at an event, which I would consider to be junk email and would treat accordingly, including by blacklisting the sender. Do the Wikidata list moderators agree?
Thanks,
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
Dear Wikidata community,
We're working on a project called Wikibabel to machine-translate parts of
Wikipedia into underserved languages, starting with Swahili.
In hopes that some of our ideas can be helpful to machine translation
projects, we wrote a blogpost about how we prioritized which pages to
translate, and what categories need a human in the loop:
https://medium.com/@oirzak/wikibabel-equalizing-information-access-on-a-bud…
Rumor has it that the Wikidata community has thought deeply about
information access. We'd love your feedback on our work. Please let us know
about past or ongoing machine translation projects so we can learn from
and collaborate with them.
Best regards,
Olya & the Wikibabel crew
Wikimedia Sverige is proud to be the recipient of $65,500 in support from
the Swedish National Library for our project Library Data.
We will work in collaboration with the Swedish National Library to
include a number of datasets in Wikidata, such as data about authors,
libraries, and various special bibliographic databases[1]. This is a
pilot project in which we aim to discuss with the community what to
include and what to exclude. Based on these discussions and the
community's requests, we will design a continuation of this project (if
this first part is deemed successful, continued funding is possible for
3-4 more years).
We started investigating a possible long-term partnership with the
National Library in 2017, when Wikimedia Sverige provided input to the
new National Strategy for the Library Sector on how Sweden's libraries
can work with Wikimedia for mutual benefit.[2] The National Library has
just made history as the world's first national library to fully
transition to Linked Open Data (BIBFRAME 2.0),[3] so the timing could not
have been better; we are now in a position to examine how this move can
benefit Wikidata and other Wikimedia projects.
Please contact the project manager André Costa
(andre.costa(a)wikimedia.se) or the developer Alicia Fagerving
(alicia.fagerving(a)wikimedia.se) if you have any questions.
As always, you can find the full application on our wiki (in Swedish):
* https://se.wikimedia.org/wiki/Projekt:Strategisk_inkludering_av_biblioteksd…
[1] <https://libris.kb.se/deldatabas.jsp>
[2] <https://commons.wikimedia.org/wiki/File:Wikimedia_Sverige_-_Wikipedia_och_b…> (in Swedish)
[3] <http://www.mynewsdesk.com/se/kungliga_biblioteket/pressreleases/kb-becomes-…>
Kind regards,
~*~
Alicia Fagerving
Developer
Wikimedia Sverige (WMSE)
e-mail: alicia.fagerving(a)wikimedia.se
phone: +46 73 950 09 56
I know how to run SPARQL queries to find out various properties of the
Empire State Building as shown on https://www.wikidata.org/wiki/Q9188,
including the URL of the image shown there.
When I look at the
https://commons.wikimedia.org/wiki/File:Empire_State_Building_from_the_Top_…
web page I see that the image has a CC-BY-SA-2.0 license, but I can't
work out how to get to that in a query.
Can anyone show me a SPARQL query about this image where the query
result would show me the license type?
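A sketch of one possible approach: the license is recorded on the Commons file page rather than in Wikidata's RDF, so a SPARQL query alone may not suffice. One can first fetch the image filename via P18 over SPARQL, then ask the Commons API for the file's extmetadata, which includes fields such as LicenseShortName. The filename below is a placeholder, not the actual file from the truncated link above:

```python
# Two-step sketch: (1) SPARQL fetches the P18 image for Q9188,
# (2) the Commons API's extmetadata is queried for that file's license.
from urllib.parse import urlencode

SPARQL = "SELECT ?image WHERE { wd:Q9188 wdt:P18 ?image . }"

def wdqs_url(query):
    """WDQS endpoint URL for a SPARQL query, JSON results."""
    return ("https://query.wikidata.org/sparql?" +
            urlencode({"query": query, "format": "json"}))

def commons_license_url(filename):
    """Commons API URL returning extmetadata (incl. license) for one file.
    The filename here is a placeholder for illustration."""
    params = {"action": "query", "prop": "imageinfo",
              "iiprop": "extmetadata", "titles": "File:" + filename,
              "format": "json"}
    return "https://commons.wikimedia.org/w/api.php?" + urlencode(params)

print(wdqs_url(SPARQL))
print(commons_license_url("Example.jpg"))
```
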
Thanks,
Bob
It is now possible to participate in the ScienceSource project by adding "main subject" statements.
The ScienceSource focus list has been up for a little while now at
https://www.wikidata.org/wiki/Wikidata:ScienceSource_focus_list
which has the shortcut WD:SSFL. Wikidata items about biomedical articles can be added to the list as explained on the page, using P5008. That page has other links to expository material. The original grant page at
https://meta.wikimedia.org/wiki/Grants:Project/ScienceSource
gives an overview of the project's aims.
The Listeria-generated page
https://www.wikidata.org/wiki/Wikidata:ScienceSource_focus_list/Main_subjec…
linked from the focus list page shows which of the items on the list (which is around 3K now, see the SPARQL query on the talk page WT:SSFL) lack a main subject (P921) statement.
At Wikimania I clarified that the focus list is supposed to be better "balanced" than the selection of articles represented on Wikidata as a whole. This is a big issue, but I don't think it is really disputed that the existing literature is more interested in the diseases of prosperous people and prosperous countries. On a straight utilitarian argument about the "greatest good of the greatest number", there is a problem.
Therefore, the composition of the focus list should not be a proportionate reflection of the 17.5M articles represented in Wikidata, by topic. We are looking first to include about 0.2% of articles on the list, bringing it up to about 40K. The Listeria page is a sortable table, and if you sort by "published in" you'll see plenty from PLOS Neglected Tropical Diseases - thanks to Daniel Mietchen for adding a collection of well-cited papers from there.
Later on there should be other lists by topic area, so we can get an idea of balance. Once main subject statements build up (type of disease being the most important area), SPARQL aggregates can reveal the distribution. This bubble chart query
https://tinyurl.com/y89s6nlc
gives a baseline, showing that currently the list's subjects are dominated by infectious diseases.
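The kind of SPARQL aggregate mentioned above can be sketched as follows. The focus-list item ID below is a placeholder (the real item ID for the ScienceSource focus list must be substituted); the query is assembled as a Python string for the WDQS endpoint:

```python
# Sketch: count focus-list articles per main subject (P921), grouped and
# sorted, to see how the list's subjects are distributed.
from urllib.parse import urlencode

FOCUS_LIST_ITEM = "Q0"  # placeholder, NOT the real focus-list item ID

SPARQL = """
SELECT ?subject ?subjectLabel (COUNT(?article) AS ?n) WHERE {
  ?article wdt:P5008 wd:%s ;      # article is on the focus list
           wdt:P921  ?subject .   # and has a main subject
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?subject ?subjectLabel
ORDER BY DESC(?n)
""" % FOCUS_LIST_ITEM

url = "https://query.wikidata.org/sparql?" + urlencode({"query": SPARQL})
print(url)
```

Sorting by the count makes imbalances, such as the current dominance of infectious diseases, visible at a glance.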
Where next? In the coming weeks, the ScienceSource wiki at http://sciencesource.wmflabs.org/ will be developed. Text-mining and annotation there will be the next phase. Downloading of papers to the wiki will depend on accumulating metadata on their Wikidata items.
Charles