Hi,
I have a couple of questions regarding the Wiki Page ID. Does it always
stay unique for the page, where the page itself is just a placeholder for
any kind of information that might change over time?
Consider the following cases:
1. The first time someone creates page "Moon" it is assigned ID=1. If at
some point the page is renamed to "The_Moon", the ID=1 remains intact. Is
this correct?
2. Suppose we have page "Moon" with ID=1, and someone creates a second page
"The_Moon" with ID=2. Is it possible for page "Moon" to be transformed into
a redirect, so that "Moon" redirects to page "The_Moon"?
3. Is it possible for page "Moon" to become a category "Category:Moon" with
the same ID=1?
Thanks,
Gintas
Hi!
I would like to initiate a discussion about coordinate precision in
Wikidata and the Query Service. The reason is that right now we do not
impose any limit on precision - coordinates are basically doubles - which
allows editors to specify over-precise coordinates and makes the
coordinates harder to compare, both with each other within Wikidata and
with outside services.
From the precision description in [1], we would rarely need more than the
third or fourth digit after the decimal point. However, the database
contains coordinates like Point(13.366666666 41.766666666), which purports
to specify a location with sub-millimeter accuracy - for an entity that
describes a municipality[2]!
We do have precision on values - e.g. the above has a specified precision
of "arcseconds" - so it may be just a formatting issue, but even an
arcsecond looks somewhat over-precise for a city. And it may be a bit
challenging to convert DMS precision to DD precision.
But the bigger question is whether we should store over-precise
coordinates in the database at all, or whether we should round them on
export or inside the data. The formulae used to calculate distances have,
for obvious reasons, limited precision, and direct comparisons can't take
precision into account, which can make such coordinates very hard to work
with. Should we maybe just put a limit on how precisely we store
coordinates in RDF and in the query service? Would four decimals after the
dot be enough? According to [4], that is what a commercial GPS device can
provide. If not, why, and which accuracy would be appropriate?
We do export the precision of the coordinate as wikibase:geoPrecision[3] -
and we currently have 258060 distinct values for it. This is very weird.
I am not sure precision is useful in this form. Can anybody tell me a
use case for this number now? If not, maybe we should change how we
represent it. I'm also not sure where these values come from, as we only
have 13 options in the UI. Bots?
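To illustrate the kind of rounding discussed above, here is a minimal Python sketch. The function names and the mapping from angular precision to decimal places are my own for illustration, not anything that exists in Wikibase:

```python
import math

# Hypothetical helper: how many decimal places are needed to represent
# a given angular precision (in degrees)?
def decimals_for_precision(precision_deg):
    if precision_deg >= 1:
        return 0
    return math.ceil(-math.log10(precision_deg))

# Round a coordinate pair to the number of decimals its stated precision
# actually supports.
def round_point(lon, lat, precision_deg):
    d = decimals_for_precision(precision_deg)
    return round(lon, d), round(lat, d)

ARCSECOND = 1 / 3600  # about 0.000278 degrees
print(decimals_for_precision(ARCSECOND))                        # 4
print(round_point(13.366666666, 41.766666666, ARCSECOND))       # (13.3667, 41.7667)
```

Under this scheme an arcsecond precision maps to four decimal places, which matches the "four decimals after the dot" threshold suggested above.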
[1] https://en.wikipedia.org/wiki/Decimal_degrees
[2] https://www.wikidata.org/wiki/Q116746
[3]
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Globe_coor…
[4]
https://gis.stackexchange.com/questions/8650/measuring-accuracy-of-latitude…
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hello,
I wanted to make you aware of our new paper "Doctoral Advisor or Medical
Condition: Towards Entity-specific Rankings of Knowledge Base Properties",
which deals with the problem of determining the interestingness of Wikidata
properties for individual entities.
In the paper we develop a dataset of 350 random (entity, property1,
property2) records, and use human judgments to determine the more
interesting property in each record.
We then show that state-of-the-art techniques (Wikidata Property Suggestor,
Google search) achieve 61% precision on predicting the winner in
high-agreement records, which can be lifted to 74% by using linguistic
similarity, but still remains significantly below human performance (87.5%
precision).
Paper: http://www.simonrazniewski.com/2017_ADMA.pdf (to appear at ADMA
2017).
Dataset: https://www.kaggle.com/srazniewski/wikidatapropertyranking
Best wishes,
Simon Razniewski
Hi,
We're more than halfway through mapping YSO places to Wikidata. Most of
the remaining ones are places that don't exist in Wikidata, and adding
them is quite labor-intensive, so we will have to consider our strategy.
Anyway, I did some checking of what remains unmapped and noticed a
potential problem: some mappings for places that we have mapped using
Mix'n'match have not actually been stored in Wikidata. For example Q36
Poland ("Puola" in YSO Places) is such a case. In Mix'n'match it is
shown as manually matched (see attached screenshot), but in Wikidata the
corresponding YSO ID property doesn't actually exist for the entity. I
checked the change history of the Q36 entity and couldn't find anything
relevant there, so it seems that the mapping was never stored in
Wikidata. Maybe there was a transient error of some kind?
Another such case was Q1754 Stockholm ("Tukholma" in YSO places). But
for that one we removed the existing mapping in Mix'n'match and set it
again, and now it is properly stored in Wikidata.
Mix'n'match currently reports 4228 mappings for YSO places, while a
SPARQL query for the Wikidata endpoint returns 4221 such mappings. So I
suspect that this only affects a small number of entities.
Is it possible to compare the Mix'n'match mappings with what actually
exists in Wikidata, and somehow re-sync them? Or just to get the
mappings out from Mix'n'match and compare them with what exists in
Wikidata, so that the few missing mappings may be added there manually?
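For the comparison step, here is a minimal sketch of the kind of diff I have in mind. All names and the example IDs are made up; a real run would load the Mix'n'match export and the SPARQL result set instead of the literal dicts:

```python
# Hypothetical sketch: find entries that Mix'n'match reports as matched
# but whose mapping is absent from the Wikidata SPARQL results.
# Both inputs map Wikidata QIDs to YSO IDs.
def missing_mappings(mixnmatch, wikidata):
    return {qid: yso for qid, yso in mixnmatch.items()
            if wikidata.get(qid) != yso}

mnm = {"Q36": "yso-1", "Q1754": "yso-2"}   # made-up Mix'n'match export rows
wd = {"Q1754": "yso-2"}                    # made-up SPARQL query result
print(missing_mappings(mnm, wd))           # {'Q36': 'yso-1'}
```

The handful of entries this returns would be the mappings to re-add manually.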
Thanks,
Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen(a)helsinki.fi
http://www.nationallibrary.fi
Hello all,
we’ll be deploying a minor change to the Wikidata interface soon. If your
interface is set to a language other than English, then when you view an
entity page you may see entity labels in languages other than the one you
selected, according to language fallbacks (e. g. Austrian German users
might see German labels if there is no Austrian German label, or even
English labels if there is no German label either for an entity). The
language of the label being displayed is shown in a small indicator,
e. g. Douglas
Adams <https://www.wikidata.org/wiki/Q42> *English*. From August 29th
forward, that indicator will be hidden by default if the user language and
the language of the label are variants of the same language, e. g. if the
user interface is in Austrian German and the label is in German.
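The behaviour can be sketched roughly as follows. The fallback chain shown and the helper names are assumptions for illustration; the real MediaWiki fallback chains are configured per language:

```python
# Assumed fallback chain for Austrian German, for illustration only.
FALLBACKS = {"de-at": ["de-at", "de", "en"]}

def resolve_label(labels, user_lang):
    # Return (label, language) for the first chain entry that has a label.
    for lang in FALLBACKS.get(user_lang, [user_lang, "en"]):
        if lang in labels:
            return labels[lang], lang
    return None, None

def show_indicator(user_lang, label_lang):
    # After the change: hide the indicator when both language codes are
    # variants of the same base language (e.g. de-at and de).
    return label_lang.split("-")[0] != user_lang.split("-")[0]

labels = {"de": "Douglas Adams", "en": "Douglas Adams"}
label, lang = resolve_label(labels, "de-at")
print(label, lang, show_indicator("de-at", lang))  # Douglas Adams de False
```

So an Austrian German user seeing a German label would no longer see the indicator, while an English fallback label would still be marked.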
If you don’t like this change, you can override it by adding a small piece
of code <https://phabricator.wikimedia.org/P5929> to your common.css
<https://www.wikidata.org/wiki/Special:MyPage/common.css>. If you encounter
any problem with this change, feel free to leave a comment under this
Phabricator ticket <https://phabricator.wikimedia.org/T174318>.
Best regards,
Lucas Werkmeister
--
Lucas Werkmeister
Software Developer (Intern)
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
Imagine a world, in which every single human being can freely share in the
sum of all knowledge. That‘s our commitment.
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hello all,
WikiArabia 2017 is taking place in Cairo, Egypt, on October 23-25.
https://meta.wikimedia.org/wiki/WikiArabia/2017
They are looking for someone to handle a Wikidata workshop there.
If a volunteer from the area is willing to go there, please contact the
organizers for more details! You can write an e-mail to
ah.hamdi(a)wikimedia-eg.org
Please share in your local networks if relevant.
Thanks a lot,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hey all,
I wanted to follow up on the Zooniverse comments/thread from a couple of
months back, part of Pharos's comments about crowd-sourcing depicts statements (
https://lists.wikimedia.org/pipermail/wikidata/2017-June/010795.html). I
met with Zooniverse staff during the week leading up to Wikimania, and was
able to get a rough outline of how they might be connected with our
ecosystem:
- They have a project builder system that allows anyone to create a
project that uses their crowdsourcing structures to describe and identify
components of media files. The documentation is at:
https://www.zooniverse.org/lab
- They have an undocumented method for drawing media from external
repositories via a simple API -- so we could feed it content from sets on
Commons or Wikidata.
- They would be interested in exploring whether there is a good process for
connecting Commons as a source for their project builder, and would be
willing to provide some developer advisory support for either a volunteer
or a partner institution to develop an appropriate link between the two APIs.
I am talking with Pharos about potentially doing something with the Met set
on Commons, but if you think you have a set of media files already on
Commons that might be of interest to a citizen science-type crowdsourcing
project - please let me know. (Tools developed by the Wikidata community
might be more appropriate in the short term for the Met collection [1])
- A cool, if less relevant, note: Zooniverse is experimenting with using
machine learning models both to prompt citizen science actions and to sort
various sets of media in projects.
I am interested in exploring this relationship because, as Structured Data
on Commons gains the ability to store structured media file information
over the next few years, we gain an increased ability to absorb simple
descriptive and other crowdsourcing on top of media files, and the
Zooniverse community provides access to a very wide group of communities
interested in crowd contribution. Moreover, if one of the value
propositions for uploading to Commons were simple access to Zooniverse
crowdsourcing, for either scientific purposes or enriching structured
descriptive metadata, we might see many more contributions of both
scientific and cultural heritage collections to Commons.[2]
If you have a set of media on Commons that you might be interested in
testing in the Zooniverse Project Builder, and have some developer
capacity, please let me know offlist. We don't need to move quickly on the
offer of consultation, but I would like to continue talking with Zooniverse
about how to create a better relationship.
Cheers,
Alex
--
Alex Stinson
GLAM-Wiki Strategist
Wikimedia Foundation
Twitter:@glamwiki/@sadads
[1] see the update from Gordibach:
https://meta.wikimedia.org/wiki/Grants_talk:IdeaLab/Wikidata_Paintbrush
[2] I am also keenly aware that any project piloting this kind of data
enrichment will require a fair amount of Commons and/or Wikidata community
discussion about data quality and its use.
Learn more about how the communities behind Wikipedia, Wikidata and other
Wikimedia projects partner with cultural heritage organizations:
http://glamwiki.org
One of this month's WMF research showcase presentations is by Andrew Su of
the Scripps Research Institute, the coordinator of Gene Wiki
<https://en.wikipedia.org/wiki/Portal:Gene_Wiki>.
---------- Forwarded message ----------
From: Sarah R <srodlund(a)wikimedia.org>
Date: Mon, Aug 21, 2017 at 3:22 PM
Subject: [Analytics] Research Showcase Wednesday, August 23, 2017 at 11:30
AM (PST) 18:30 UTC
To: wikimedia-l(a)lists.wikimedia.org, analytics(a)lists.wikimedia.org,
wiki-research-l(a)lists.wikimedia.org
Hi Everyone,
The next Research Showcase will be live-streamed this Wednesday, August 23,
2017 at 11:30 AM (PST) 18:30 UTC.
YouTube stream: https://www.youtube.com/watch?v=Fa0Ztv2iF4w
As usual, you can join the conversation on IRC at #wikimedia-research. And,
you can watch our past research showcases here
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#August_2017>.
This month's presentation:
Sneha Narayan (Northwestern University)
*The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for
New Users*
Integrating new users into a community with complex norms presents a
challenge for peer production projects like Wikipedia. We present The
Wikipedia Adventure (TWA): an interactive tutorial that offers a structured
and gamified introduction to Wikipedia. In addition to describing the
design of the system, we present two empirical evaluations. First, we
report on a survey of users, who responded very positively to the tutorial.
Second, we report results from a large-scale invitation-based field
experiment that tests whether using TWA increased newcomers' subsequent
contributions to Wikipedia. We find no effect of either using the tutorial
or of being invited to do so over a period of 180 days. We conclude that
TWA produces a positive socialization experience for those who choose to
use it, but that it does not alter patterns of newcomer activity. We
reflect on the implications of these mixed results for the evaluation of
similar social computing systems.
Andrew Su (Scripps Research Institute)
*The Gene Wiki: Using Wikipedia and Wikidata to organize biomedical
knowledge*
The Gene Wiki project began in 2007 with the goal of creating a
collaboratively-written, community-reviewed, and continuously-updated
review article for every human gene within Wikipedia. In 2013, shortly
after the creation of the Wikidata project, the project expanded to include
the organization and integration of structured biomedical data. This talk
will focus on our current and future work, including efforts to encourage
contributions from biomedical domain experts, to build custom applications
that use Wikidata as the back-end knowledge base, and to promote
CC0-licensing among biomedical knowledge resources. Comments, feedback and
contributions are welcome at https://github.com/SuLab/genewikicentral and
https://www.wikidata.org/wiki/WD:MB.
Kindly,
Sarah R. Rodlund
Senior Project Coordinator-Product & Technology, Wikimedia Foundation
srodlund(a)wikimedia.org
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>