Hoi,
Much of the content of DBpedia and Wikidata has the same origin:
harvesting data from a Wikipedia. There is a lot of discussion going on
about quality, and one point I keep making is that comparing sources and
concentrating on the places where their statements differ is where it is
easiest to make a difference in quality.
So, given that DBpedia harvests both Wikipedia and Wikidata, can it provide
us with a view of where a Wikipedia statement and a Wikidata statement differ?
To make this useful, it is important to subset the data. I will not start
with 500,000 differences, but I will begin when they are about a subset that
I care about.
When I care about the entries for the alumni of a university, I will consider
curating the information in question, particularly when I know the language
of the Wikipedia involved.
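To make it concrete, here is a rough sketch of the kind of comparison I have
in mind. It is my own illustration, not an existing DBpedia tool: the
university identifiers are placeholders to fill in, the property pair
(P69/P569 versus dbo:almaMater/dbo:birthDate) is just one example of
comparable statements, and article titles are matched naively.

# Sketch: list alumni of one university whose date of birth differs between
# Wikidata and DBpedia.
import requests

WDQS = "https://query.wikidata.org/sparql"
DBPEDIA = "https://dbpedia.org/sparql"

UNIVERSITY_QID = "Q00000"  # placeholder: the university's Wikidata item
DBPEDIA_UNIVERSITY = "http://dbpedia.org/resource/Example_University"  # placeholder

wd_query = """
SELECT ?article ?dob WHERE {
  ?person wdt:P69 wd:%s ;        # educated at the university
          wdt:P569 ?dob .        # date of birth
  ?article schema:about ?person ;
           schema:isPartOf <https://en.wikipedia.org/> .
}
""" % UNIVERSITY_QID

db_query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?person ?dob WHERE {
  ?person dbo:almaMater <%s> ;
          dbo:birthDate ?dob .
}
""" % DBPEDIA_UNIVERSITY

def rows(endpoint, query):
    r = requests.get(endpoint, params={"query": query},
                     headers={"Accept": "application/sparql-results+json",
                              "User-Agent": "statement-diff-sketch/0.1"})
    r.raise_for_status()
    return r.json()["results"]["bindings"]

# Key both result sets by the English Wikipedia article title, which both
# projects expose (as a sitelink on Wikidata, as the resource name on DBpedia).
wikidata = {b["article"]["value"].rsplit("/", 1)[-1]: b["dob"]["value"][:10]
            for b in rows(WDQS, wd_query)}
dbpedia = {b["person"]["value"].rsplit("/", 1)[-1]: b["dob"]["value"][:10]
           for b in rows(DBPEDIA, db_query)}

for title in sorted(wikidata.keys() & dbpedia.keys()):
    if wikidata[title] != dbpedia[title]:
        print(title, "| Wikidata:", wikidata[title], "| DBpedia:", dbpedia[title])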
Once we can do this, another thing that would promote the use of such a tool
is to store the numbers regularly (say once a month) and publish the trends.
How difficult would it be to come up with something like this? I know this
tool would be based on DBpedia, but there are several reasons why that is
good. First, it gives added relevance to DBpedia (without detracting from
Wikidata), and secondly, as DBpedia updates from the RSS change feeds of
several Wikipedias, the effect of those changes is quickly noticed when a new
set of data is requested.
Please let us know what the issues are and what it takes to move forward
with this. Does this make sense?
Thanks,
GerardM
http://ultimategerardm.blogspot.nl/2017/03/quality-dbpedia-and-kappa-alpha-…
Hello everyone,
I'm working on a research project on Wikidata queries and was wondering
whether the wdqs-app-footer-updated element uses a SPARQL query to determine
when the data was last updated. The reason I'm asking is that we see the
same query asking when Wikidata was last modified (all formatted in the
same way) from a bunch of different user agents (mostly browsers).
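For concreteness, the queries we see look roughly like this (paraphrased from
memory, so the exact triple pattern may be off; whether the footer element
itself issues it is exactly what I am unsure about):

# Fetch the "data last updated" timestamp from the public endpoint.
# The wikibase:Dump / schema:dateModified pattern is an assumption on my part.
import requests

query = "SELECT ?updated WHERE { wikibase:Dump schema:dateModified ?updated }"
r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": query},
                 headers={"Accept": "application/sparql-results+json"})
print(r.json()["results"]["bindings"][0]["updated"]["value"])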
Greetings,
Adrian
Hello Wikidata folks,
I hope this email finds you all well. I'm working as a freelance researcher
and trying to find an answer, or a tool that would lead me to an answer, for
the following questions:
1. How many citations exist on Wikipedia?
2. How many of those citations link to a full-text version of the referenced
document?
Thanks a bunch for the help. I hope someone has a good idea.
Cheerios,
J
Hello everyone,
I am looking for a text corpus that is annotated with Wikidata entities.
I need it to evaluate an entity linking tool based on Wikidata, which is
part of my bachelor thesis.
Does such a corpus exist?
Ideally, the corpus would be annotated in the NIF format [1], as I want to
use GERBIL [2] for the evaluation, but that is not strictly necessary.
Thanks for any hints!
Samuel
[1] https://site.nlp2rdf.org/
[2] http://aksw.org/Projects/GERBIL.html
Hello,
Our next Wikidata IRC office hour will take place on April 5th, 16:00 UTC
<https://www.timeanddate.com/worldclock/meetingdetails.html?year=2017&month=…>
(18:00 in Berlin), on the channel #wikimedia-office.
For one hour, you'll be able to chat with the development team about past,
current and future projects, and ask any questions you want.
See you there,
<http://webchat.freenode.net/?channels=#wikimedia-office>
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Dear all,
I was just pinged by a Wikidata hackathon in Suriname
(cf. https://www.spangmakandra.com/big-data-seminar-suriname )
that they can't edit Wikidata any more - see also
https://twitter.com/twitferry/status/851907389087502338 .
We are musing that this may be due to an IP ban, since more than six
new accounts were registered from the same IP (186.179.xxx.xx).
Can anyone help sort this out quickly, so that the event can move on?
Thanks,
Daniel
The current spec of the data model states that an L-Item has a lemma, a
language, and several forms, and the forms in turn have representations.
https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model
The language is a Q-Item; the lemma and the representations are
Multilingual Texts. Multilingual Texts are sets of pairs of strings and
UserLanguageCodes.
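As a rough sketch of how I read that (the class and field names below are
mine, not the extension's), language information appears twice: once as a
Q-Item on the lexeme, and once as a UserLanguageCode inside every
Multilingual Text value.

from dataclasses import dataclass, field
from typing import Dict, List

# UserLanguageCode -> string, per the "sets of pairs" definition above
MultilingualText = Dict[str, str]

@dataclass
class Form:
    representations: MultilingualText    # keyed by UserLanguageCode

@dataclass
class Lexeme:
    lemma: MultilingualText              # keyed by UserLanguageCode
    language: str                        # a Q-Item id, e.g. "Q1860" (English)
    forms: List[Form] = field(default_factory=list)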
My question is about the relation between representing a language as a
Q-Item and as a UserLanguageCode.
A previous proposal treated lemmas and representations as raw strings, with
the language Q-Item being the only language information. That is now gone,
and the lemma and representations carry their own language information.
How do they interact? The set of languages referenceable through Q-Items is
much larger than the set of languages with a UserLanguageCode, and indeed,
the intention was to allow every language to be representable in Wikidata,
not only those with a UserLanguageCode.
I sense quite a problem here.
I see two possible ways to resolve this:
- return to the original model and use strings instead of Multilingual
texts (with all the negative implications for variants)
- use Q-Items instead of UserLanguageCodes for Multilingual texts (which
would be quite a migration)
If I understand it correctly, implementing the data model as currently
suggested would restrict Wiktionary4Wikidata support to the list of
languages with a UserLanguageCode, and I don't think that is a viable
solution.
Cheers,
Denny
Hi!
I've added more endpoints to the SPARQL federation whitelist, including
attribution-licensed ones. Please see the full list in the User
Manual [1]. All endpoints are acknowledged on the WDQS licensing page [2].
You are welcome to nominate new endpoints on the Wikidata SPARQL federation
input page [3]. I've made some cleanups and edits there to make nominating
easier; hopefully that helps. Examples and links to any materials showing
how to integrate data between endpoints are very welcome too (a small
federation sketch follows the links below).
[1]
https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#Federation
[2] https://query.wikidata.org/copyright.html
[3] https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input
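As a starting point, here is a minimal federation sketch. It is my own
example rather than one from the manual, and it assumes that
https://dbpedia.org/sparql is among the whitelisted endpoints and that
DBpedia's owl:sameAs links point at the www.wikidata.org/entity/ URIs.

# Ask WDQS for cats (instance of Q146) and pull an English abstract for each
# from DBpedia inside the same query via SERVICE.
import requests

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?item ?abstract WHERE {
  ?item wdt:P31 wd:Q146 .                  # instance of: house cat
  SERVICE <https://dbpedia.org/sparql> {
    ?dbp owl:sameAs ?item ;
         dbo:abstract ?abstract .
    FILTER(LANG(?abstract) = "en")
  }
} LIMIT 5
"""

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": query},
                 headers={"Accept": "application/sparql-results+json",
                          "User-Agent": "federation-example/0.1"})
for b in r.json()["results"]["bindings"]:
    print(b["item"]["value"], "->", b["abstract"]["value"][:80])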
Thanks,
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hi all!
I'm curious about how the SPARQL endpoint stays so up to date with live
editing. So far I've only found the RDF dump generator in Wikidata Toolkit
[1], which seems too heavy a process for this.
Are there other ways to extract triples from Wikibase?
Are new edits on Wikidata loaded into its SPARQL endpoint via SPARQL Update
(SPARUL)?
Thanks,
Alessio
[1] https://github.com/Wikidata/Wikidata-Toolkit