> From: JFC Morfin <jefsey(a)jefsey.com>
> Thank you for this detailed explanation.
> How do you see the integration/impact of Wikidata on both projects?
My intuition is that the impact could be mutual:
* for YAGO and DBpedia, the impact would be immediate, because
Wikidata could essentially provide cleaner infobox data for these
projects. Yet, we have to see how Wikidata will position itself
relative to Freebase, which seems to pursue a similar goal:
http://en.wikipedia.org/wiki/Freebase
(If you have thoughts on distinguishing Wikidata from Freebase, we'd
be happy to know)
* for Wikidata, there could be some leverage in the ontologies as
well, possibly for bootstrapping.
- YAGO, e.g., has mappings of infobox data to relations with domains
and ranges, with a quality guarantee. One naive idea is that these
could contribute to filling up Wikidata initially, because it seems
easier for humans to correct or complete data than to insert it from
scratch.
- Another aspect is that YAGO connects the Wikipedia categories and
pages to WordNet, the major digital lexicon of English
(http://wordnet.princeton.edu/). This could contribute a strict
semantic typing / class hierarchy / taxonomy to Wikidata, which is so
far absent in Wikipedia.
- Last, YAGO has the connection to Geonames (providing data about
geographical entities), and also the connection to the Universal
Wordnet (providing translations of class names and entity names to 200
other languages -- basically a cleaned and expanded version of the
Wikipedia interlanguage links).
- DBpedia, too, could contribute, because its hub position in the
cloud of linked data connects it to many other resources.
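To make the idea of typed relations concrete, here is a toy sketch of how infobox facts with declared domains, ranges, and a class taxonomy could be type-checked before being imported. All class names and the taxonomy here are hypothetical illustrations, not actual YAGO identifiers; only the relation-naming style follows the examples in this thread.

```python
# Toy sketch (not YAGO code): facts are subject-relation-object triples.
# Each relation declares a domain and a range class; a small taxonomy
# lets us check that a candidate fact is well-typed before importing it.

# hypothetical taxonomy: child class -> parent class
TAXONOMY = {
    "physicist": "person",
    "person": "entity",
    "city": "location",
    "location": "entity",
}

def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or lies below it in the taxonomy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = TAXONOMY.get(cls)
    return False

# relation -> (domain, range); names are illustrative
RELATIONS = {"wasBornIn": ("person", "city")}

def well_typed(subject_class, relation, object_class):
    """Check a fact against the relation's declared domain and range."""
    domain, rng = RELATIONS[relation]
    return is_a(subject_class, domain) and is_a(object_class, rng)

print(well_typed("physicist", "wasBornIn", "city"))  # True
print(well_typed("city", "wasBornIn", "city"))       # False
```

A fact whose subject or object falls outside the declared types is rejected, which is one way such mappings can come with a quality guarantee.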
Cheers
Fabian
(putting it back on list, as I think it helps us to reach a common
understanding of what we are trying to do and what we should achieve)
2012/4/6 Gregor Hagedorn <g.m.hagedorn(a)gmail.com>
> off list, because I am afraid this is getting off-topic, but you can
> put it back to list if you like:
>
> > In Wikidata they would be represented as two pages, one for the Kubistar
> > (which would link to the Danish and German page for the Kubistar), and
> one
> > for the Kangoo (which would link to the 20 language versions of the
> Kangoo
> > article, including a Danish and a German one). This is a rather simple
> > example, which would be easily expressed with the exact matches that we
> > suggest.
>
>
> So far I fail to understand: where would the actual data for the model
> live? (The Kangoo page is a summary page for models with different
> specs in the infoboxes.)
>
> I assume on separate wikidata pages, that have no relation to the kangoo
> page?
>
>
Correct. Information about different models would be given on different
Kangoo pages. If there is no Wikipedia page for that model in a given
Wikipedia, no Wikipedia link should be given. (This does not mean that
there can be no information displayed about them in the given Wikipedia:
any Wikipedia article will be able to display any information about any
item in Wikidata in phase 3).
> I am concerned with reverse discovery of information by editors coming
> from wikipedia to wikidata. I realize this is a separate topic, but
> one in the back of my mind (or rather in my use case scenario) and why
> I am arguing to allow such relations.
>
>
There will be plenty of links connecting Wikidata items with each other. I
don't think that this kind of information discovery will be a major hurdle.
>
> ---
>
> The Kubistar example is interesting to me, because Wikidata would then
> suggest that the English Wikipedia has no information on the Kubistar,
> whereas it really is available on the Kangoo page.
>
> This is in fact one thing that the interlanguage links have been used
> for, but, because of situations like the above, with limited success.
>
>
The assumption that because there is no article about X in a given
Wikipedia, there is no information about X in that Wikipedia, is not
correct, and should not be made.
> > from one single Wikidata article. Two Wikidata pages cannot claim the
> same
> > Wikipedia article in a single language as their defining article.
>
> Here you speak of a defining article, whereas currently it is a set of
> more or less roughly related Wikipedia pages in different languages.
>
>
Our assumption is that, in general, if one Wikipedia page identifies a
topic, the pages connected to it through interwiki links identify the same
topic. The rules for interwiki links in the German Wikipedia mandate that
[1]; the ones in the English Wikipedia are only a little bit less strict
about it [2]. For the few exceptions where that is not the case, interwiki
links can be set and overwritten locally.
[1] https://de.wikipedia.org/wiki/Hilfe:Internationalisierung
[2] https://en.wikipedia.org/wiki/Help:Interlanguage_links
>
> late in the night, like for you as well...
>
> Gregor
>
--
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.
Dear all,
now is a good time to fix your favorite errors (in all languages) in
DBpedia for the 3.8 release. Please see the email below.
All the best,
Sebastian
PS: of course the live version should always be up to date:
http://wiki.dbpedia.org/DBpediaLive
-------- Original Message --------
Subject: [Dbpedia-discussion] Mapping Sprin(g|t) 2012 to DBpedia
release 3.8
Date: Wed, 18 Apr 2012 10:41:31 +0200
From: Jona Christopher Sahnwaldt <jc(a)sahnwaldt.de>
To: dbpedia-discussion(a)lists.sourceforge.net,
dbpedia-developers(a)lists.sourceforge.net
Hi everyone,
the new mapping statistics from the latest Wikipedia data are finally online at
http://mappings.dbpedia.org/sprint/ and
http://mappings.dbpedia.org/server/statistics/
I would like to invite everyone to join the DBpedia Mapping Sprin(g|t)
2012! Get your language to the top by April 28. The new and
improved mappings will be used for the upcoming DBpedia 3.8 release.
Also, please find, discuss and fix problems in the DBpedia ontology.
We will add a few templates that make it easier to mark problems on
ontology pages.
Because of some hiccups in the last week, the curves look broken, but
as new data is added, they will get prettier over the next few days. The
new coverage percentages may be a bit lower than before, because
there are now more templates in Wikipedia.
Happy mappings!
Best regards,
Christopher
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion(a)lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
Hi,
thanks for all of this information!
I wanted to clarify the "quality guarantee" in YAGO. It is true that
we did not evaluate every statement individually. Rather, we proceeded
as follows: for each relation in YAGO, we took a random sample and
evaluated it manually (with respect to Wikipedia). From the ratio of
correct statements in the sample, we used statistical techniques to
estimate the ratio of correct statements for the whole of YAGO. This
ratio will usually be lower than the ratio on the sample. Thus, more
precisely, the guarantee reads: with a probability of 95%, the ratio
of correct statements for "actedInMovie" is in the interval
97.36% +/- 2.64%.
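The sampling procedure described above can be sketched numerically. This toy version uses a plain normal-approximation (Wald) confidence interval, which is only one of several possible statistical techniques and not necessarily the one used for YAGO; the sample counts are made up for illustration:

```python
import math

def precision_interval(correct, total, z=1.96):
    """Normal-approximation 95% confidence interval for the fraction of
    correct statements, estimated from a manually evaluated sample."""
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, half_width

# hypothetical sample: 97 of 100 evaluated statements were correct
p, hw = precision_interval(97, 100)
print(f"with ~95% probability, precision is {p:.2%} +/- {hw:.2%}")
```

The interval narrows as the sample grows, which is why a per-relation random sample suffices to bound the precision of the whole relation.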
I also wanted to ask again about the relationship between Freebase and
Wikidata: Freebase was bootstrapped from the infoboxes of Wikipedia,
but I think its main selling point is that volunteers can add and
correct data. Thus, my understanding is that, both in Wikidata and in
Freebase, volunteers would fill in structured, factual information. Is
that right? My intuition is that Wikidata will have a more principled
approach, because it can build on the Wikipedia/Wikimedia culture.
Other comments are appreciated.
Thanks
Fabian
This is an interesting criticism, and there's an excellent retort by Denny in
the comments. Just fyi.
-- daniel
-------- Original Message --------
Subject: [Wiki-research-l] Wikidata opinion piece in The Atlantic
Date: Tue, 10 Apr 2012 16:50:49 -0700
From: En Pine <deyntestiss(a)hotmail.com>
Reply-To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
To: <wiki-research-l(a)lists.wikimedia.org>
Here's an opinion piece, "The Problem with Wikidata", by Mark Graham, who
"is a Research Fellow at the Oxford Internet Institute," which appears on
The Atlantic's website. I'm not personally supporting or opposing his views
but I found this to be an interesting read.
http://www.theatlantic.com/technology/archive/2012/04/the-problem-with-wiki…
Pine
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Heya :)
There is a new mailing list that will receive all the bugmail related to
Wikidata. You can subscribe at
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs if you want
to receive it.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de
https://bugzilla.wikimedia.org/show_bug.cgi?id=35960
Web browser: ---
Bug #: 35960
Summary: add config varible to hide incubator test wikipedia
langlinks on wikipedia projects by default
Product: MediaWiki extensions
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: major
Priority: Unprioritized
Component: WikidataClient
AssignedTo: wikibugs-l(a)lists.wikimedia.org
ReportedBy: bugreporter(a)to.mabomuja.de
CC: wikidata-l(a)lists.wikimedia.org
Classification: Unclassified
Mobile Platform: ---
Currently, langlinks on Commons, Wikispecies, and Incubator are
unidirectional: they only link to Wikipedia wikis, and no backlink is
possible/wanted. So several Incubator test wikis must be included in one
interwiki group.
Langlink data from Wikidata should be included in articles on Incubator,
but on the main Wikipedia projects no langlink to Incubator should be
shown.
So there should be a config variable to hide all links to
incubator/wikisource/commons without adding e.g.
{{NOEXTERNALINTERLANG:incubator}} to each article.
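The requested behavior amounts to a site-level filter over outgoing langlinks. A minimal sketch of that logic (hypothetical names throughout; MediaWiki itself is written in PHP, and this is only an illustration, not an actual extension API):

```python
# Sketch of the requested behavior (hypothetical names, not MediaWiki
# code): a site-level config set of projects whose langlinks are hidden.
HIDDEN_LANGLINK_PROJECTS = {"incubator"}  # could also hold "wikisource", "commons"

def visible_langlinks(langlinks, hidden=HIDDEN_LANGLINK_PROJECTS):
    """Keep only langlinks that do not point to a hidden project."""
    return [link for link in langlinks if link["project"] not in hidden]

links = [
    {"project": "wikipedia", "lang": "de", "title": "Beispiel"},
    {"project": "incubator", "lang": "nds", "title": "Bispill"},
]
print(visible_langlinks(links))  # only the Wikipedia link remains
```

With such a config set, no per-article template like {{NOEXTERNALINTERLANG:...}} would be needed.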
--
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
At 17:18 15/04/2012, Fabian M. Suchanek wrote:
>The WWW conference is a scientific conference in computer science on
>the newest developments of the Web. It is in Lyon this year:
>http://www2012.wwwconference.org
In Lyon: today and this week.
Thank you for this detailed explanation.
How do you see the integration/impact of Wikidata on both projects?
jfc
PS. I see that the "interlink" concept is still not documented on
en:wikipedia.
-- Nadja Kutz wrote
>Is it possible to briefly explain the major differences between
>DBpedia and the Yago Knowledge graph?
Both projects aim to extract a so-called ontology from Wikipedia. An
ontology in this sense is a graph (= a kind of net), in which the
nodes are entities (like Albert Einstein, or the city of Ulm) and the
links between the nodes are relationships (like "wasBornIn"). See here
for an example:
http://www.mpi-inf.mpg.de/departments/ontologies/areas/index.html
For this purpose, both projects use the structured information of
Wikipedia, i.e., its category system and its infoboxes. Both projects
have extracted a graph of several million nodes, and dozens of
millions of links between them. Seen this way, the projects go in the
same direction as what Wikidata aims to do, but in an automated
fashion.
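The graph structure described above can be illustrated with a handful of triples. Entity and relation names follow the examples in this thread ("Albert Einstein", "wasBornIn"); the data is illustrative, not actual YAGO or DBpedia content:

```python
# A tiny ontology as a labeled graph: nodes are entities, edges are
# named relationships, stored as subject-relation-object triples.
facts = [
    ("Albert_Einstein", "wasBornIn", "Ulm"),
    ("Albert_Einstein", "type", "physicist"),
    ("Ulm", "isLocatedIn", "Germany"),
]

def objects_of(subject, relation):
    """All objects reachable from `subject` via `relation`."""
    return [o for (s, r, o) in facts if s == subject and r == relation]

print(objects_of("Albert_Einstein", "wasBornIn"))  # ['Ulm']
```

Real extractions differ mainly in scale (millions of nodes, tens of millions of edges) and in how the triples are obtained from categories and infoboxes.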
Both projects share the same goal, but have different foci:
* YAGO has a set of around 100 relationships and maps Wikipedia
infobox attributes to them. DBpedia, in contrast, has two systems:
- one system, in which each Wikipedia infobox attribute becomes a
relationship. This set of data is rather noisy, but very exhaustive.
- another system, in which relationships are defined and mapped from
infobox attributes by a community of volunteers.
The differences between these two systems are summarized in Chapter 10.3 here:
http://www.mpi-inf.mpg.de/yago-naga/yago/publications/aij.pdf
* DBpedia is the hub of the linked data cloud. YAGO is also in this
cloud, but not as central as DBpedia.
* YAGO attaches time and space information to many of its entities,
i.e., it knows when and where certain facts happened, and integrates
this information with data from Geonames. This aspect is less
prominent in DBpedia.
* YAGO has traditionally put much emphasis on logical constraint
checking, type checking, and a strong type hierarchy -- all in order
to maintain a high precision of the data. DBpedia, in contrast,
imports one of its type hierarchies from YAGO, and builds its own,
flatter, type hierarchy through a community of volunteers.
* YAGO has been evaluated manually, thus attaching a probabilistic
precision value to each of its relations. That is, for the relation
"actedInMovie", e.g., we know that 97% of the statements are correct.
Details are here:
http://www.mpi-inf.mpg.de/yago-naga/yago/statistics.html
For DBpedia, there is no such analysis to our knowledge.
* Both ontologies have a surprisingly small overlap of data and
instances, if mapped naively, see Chapter 5.2.3 in
http://suchanek.name/work/publications/iswc2011.pdf
... but a larger overlap if mapped in a more sophisticated way, see
Section 6.4 in
http://suchanek.name/work/publications/vldb2012.pdf
* There are certainly a number of further differences, which I may not
know of or may have overlooked; please feel free to add.
>what is the www conference ?
The WWW conference is a scientific conference in computer science on
the newest developments of the Web. It is in Lyon this year:
http://www2012.wwwconference.org
Cheers
Fabian
--
Fabian online: http://suchanek.name