Bad merges have been mentioned a couple of times recently, and I think one
of the contexts was Ben's gene/protein work.
I think there are two general issues here which could be improved:
1. Merging is too easy. Because splitting/unmerging is much harder than
merging, particularly after additional edits, the process should be biased
to make merging more difficult.
2. The impedance mismatch between Wikidata and Wikipedias tempts
Wikipedians who are new to Wikidata to do the wrong thing.
The second is a community education issue which will hopefully improve over
time, but the first could be improved, in my opinion, by requiring more
than one person to approve a merge. The Freebase scheme was that duplicate
topics could be flagged for merge by anyone, but instead of merging, they'd
be placed in a queue for voting. Unanimous votes would cause merges to be
automatically processed. Conflicting votes would get bumped to a second
level queue for manual handling. This wasn't foolproof, but caught a lot of
the naive "these two things have the same name, so they must be the same
thing" merge proposals by newbies. There are lots of variations that could
be implemented, but the general idea is to get more than one pair of eyes
involved.
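As a rough illustration of the queue-and-vote idea described above, here is a
minimal sketch. It is not Freebase's actual implementation: the class and
method names, the `min_votes` threshold and the treatment of a unanimous
"different things" vote as a rejection are all assumptions made for the
example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"    # too few votes so far; stays in the voting queue
    MERGE = "merge"        # unanimous "same thing": merge is processed automatically
    REJECT = "reject"      # unanimous "different things" (assumed behaviour)
    ESCALATE = "escalate"  # conflicting votes: second-level queue for manual handling


@dataclass
class MergeProposal:
    source: str
    target: str
    # voter name -> True ("same thing") / False ("different things")
    votes: dict = field(default_factory=dict)

    def vote(self, voter: str, same: bool) -> None:
        self.votes[voter] = same

    def outcome(self, min_votes: int = 3) -> Outcome:
        # min_votes is a hypothetical threshold, not a documented Freebase value
        ballots = set(self.votes.values())
        if len(ballots) > 1:
            return Outcome.ESCALATE
        if len(self.votes) < min_votes:
            return Outcome.PENDING
        return Outcome.MERGE if True in ballots else Outcome.REJECT


# The SPATA5 gene and protein items, proposed for merge by a newcomer
proposal = MergeProposal("Q18052679", "Q21207860")
proposal.vote("alice", True)
proposal.vote("bob", False)
print(proposal.outcome().value)  # escalate
```

The point of the sketch is only that no single editor's judgement is enough
to trigger the merge; any disagreement routes the proposal to a human.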
A specific instance of the structural impedance mismatch is enwiki's
handling of genes & proteins. Sometimes they have a page for each, but
often they have a single page that deals with both or, worse, a page whose
text says it's about the protein, but where the page includes a gene infobox.
This unanswered RFC from Oct 2015 asks whether protein & gene should be
merged:
https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Oxytocin_and_OX…
I recently ran across a similar situation where this Wikidata gene SPATA5
https://www.wikidata.org/wiki/Q18052679 is linked to an enwiki page about
the associated protein https://en.wikipedia.org/wiki/SPATA5, while the
Wikidata protein is not linked to any wikis
https://www.wikidata.org/wiki/Q21207860
These differences in handling make the reconciliation process very
difficult and the resulting errors encourage erroneous merges. The
gene/protein case probably needs multiple fixes, but making merges harder
would help.
Tom
********************************************
15th International Semantic Web Conference
October 17-21, 2016
Kobe, Japan
http://iswc2016.semanticweb.org
Call for Posters and Demos
********************************************
The ISWC 2016 Posters and Demonstrations complement the paper tracks of
the conference and offer an opportunity for presenting late-breaking
research results, on-going research projects, and speculative or
innovative work in progress. The informal setting of the Posters and
Demonstrations encourages presenters and participants to engage in
discussions about the work. Such discussions can be invaluable inputs
for the future work of the presenters, while offering participants an
effective way to broaden their knowledge of the emerging research trends
and to network with other researchers.
We invite submissions relevant to the area of the Semantic Web and which
address, but are not limited to, the topics of the Research Track; the
Application Track; and the Resource Track. Technical posters, reports on
Semantic Web software systems (free or commercial), descriptions of
completed work, and work in progress are all welcome. Demonstrations are
intended to showcase innovative Semantic Web related implementations and
technologies, both in academia and in industry.
We explicitly welcome entries from the industry. However, submissions
for posters and demos should go beyond pure advertisements of commercial
software packages and convey a minimal scientific contribution.
Authors of full papers accepted for the Research Track; the Application
Track; and the Resource Track are explicitly invited to submit a
demonstration. The submission should be formatted as the other posters
and demonstrations but must cite the accepted full paper and needs to
include an explanation of its added value with respect to the conference
paper. The added value could include: a) extended results and
experiments not presented in the conference paper for reasons of space,
or b) a demonstration of a supporting prototype implementation.
Submission Information
Authors must submit a four-page extended abstract for evaluation. All
submissions will undergo a common review process, including those
related to already accepted full papers. For demonstrations, authors are
strongly encouraged to include in their submission a link to where the
demo (live or recorded video) can be found. They should also make clear
what exactly will be demonstrated to the participants (e.g., what
datasets will be used, which functionalities will be shown).
All submissions must be made electronically via the EasyChair conference
submission system:
https://www.easychair.org/conferences/?conf=iswc2016pd
Submissions must use the PDF file format and must adopt the style of the
Springer Publications format for Lecture Notes in Computer Science
(LNCS). Details are provided on Springer's Author Instructions page.
Submissions that exceed the page limit will be rejected without review.
For details on the HTML format, see the HTML submission guide.
At least one of the authors must be a registered participant at the
conference, and attend the Poster/Demo Session to present the work. The
abstracts of accepted posters and demonstrations will be given to all
conference attendees and published on the conference web site, but will
not be published by Springer in the printed conference proceedings. They
will, however, be compiled into a CEUR-WS Proceedings for easy Web
retrieval and archival.
Metadata for all successful submissions will be included in the
conference metadata corpus. Detailed information about metadata creation
will be provided with the acceptance notification of the successful
submissions.
As in previous years, student-authors of accepted papers will be able to
apply for travel support to attend the conference.
Review Criteria
All papers submitted to the Posters & Demos Track will be reviewed by at
least three program committee members. Decisions about acceptance will
be based on relevance to the Semantic Web, originality, potential
significance, and clarity. The purpose of the track is to
allow the presentation of preliminary results to the community, provided
that originality and significance of the contribution are ensured.
Authors submitting a demo are strongly encouraged to include in the
paper a pointer to the online demo or video of the application to be
presented. The absence of a pointer will affect the overall evaluation
of the paper.
Important Dates
* Poster & Demo Submission: July 7th, 2016
* Notifications: August 7th, 2016
* Camera-Ready Versions: August 30th, 2016
All deadlines are midnight Hawaii time.
Posters and Demos Track Chairs
* Takahiro Kawamura, Japan Science and Technology Agency, Japan
* Heiko Paulheim, University of Mannheim, Germany
On 9 June 2016 at 05:42, Biyanto Rebin <biyanto.rebin(a)wikimedia.or.id> wrote:
> PS: I'm making WikiProjects Languages in Indonesia, anyone who want to join
> please let me know :)
This is great news, and deserving of its own thread. Subject changed!
--
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk
It is great that Wikidata provides a way for data to be curated in a
crowd-sourced way.
It would be even better if changes (especially corrections) could be
communicated back to the original source so that all could benefit.
Has this been discussed previously? Considered?
Julie
Hey folks :)
Wikimania is coming up and quite a few Wikidata things will happen there. I
tried to summarize them all here:
https://wikimania2016.wikimedia.org/wiki/Wikidata. If you know of more
please add them.
Hope to see many of you at the meetup!
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hey folks :)
Just a quick heads-up that for the second round 3 more Wikipedias (Gujarati,
Latvian, Nynorsk) requested that the ArticlePlaceholder be enabled. We did
this last night. This means it is now live on the following Wikipedias:
* Neapolitan
* Esperanto
* Odia
* Haitian Creole
* Gujarati (https://gu.wikipedia.org/wiki/%E0%AA%B5%E0%AA%BF%E0%AA%B6%E0%AB%87%E0%AA%B7…)
* Latvian (https://lv.wikipedia.org/wiki/Special:AboutTopic/Q308928)
* Nynorsk (https://nn.wikipedia.org/wiki/Spesial:AboutTopic/Q2013)
If your Wikipedia would like to be included in the next round please start
a discussion on-wiki and then let me know or file a ticket on Phabricator.
Cheers
Lydia
Hey everyone :)
Over the past 4 years - since the beginning of the development - I have
been doing the community communications around Wikidata. The main
objectives are that there is good communication between the editors, the
development team and everyone else around Wikidata and that the project is
a healthy and welcoming place. Since then Wikidata has grown massively and
I have also taken on the product management for Wikidata. Both tasks are
getting too big to be handled by one person while still giving editors and
the development team the attention they deserve. Because of this I am now
looking to hire someone to take over the community communications part. I
hope we can find an enthusiastic and knowledgeable person who can support
you all.
The job ad is here:
https://wikimedia.de/wiki/Project_Manager_(m/f)_Community_Communication_Wik…
If you have any questions feel free to reach out to me. If you know someone
who would be a good fit please do reach out to them and encourage them to
apply.
Cheers
Lydia
PS: Don't worry. I'll still be around and available to you all but focusing
more on the product management. Wikidata and you all are still very
precious to me.