This is an interesting criticism, and there's an excellent retort by Denny in
the comments. Just fyi.
-- daniel
-------- Original Message --------
Subject: [Wiki-research-l] Wikidata opinion piece in The Atlantic
Date: Tue, 10 Apr 2012 16:50:49 -0700
From: En Pine <deyntestiss(a)hotmail.com>
Reply-To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
To: <wiki-research-l(a)lists.wikimedia.org>
Here's an opinion piece, "The Problem with Wikidata", by Mark Graham, who
"is a Research Fellow at the Oxford Internet Institute," which appears on
The Atlantic's website. I'm not personally supporting or opposing his views
but I found this to be an interesting read.
http://www.theatlantic.com/technology/archive/2012/04/the-problem-with-wiki…
Pine
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Heya :)
There is a new mailing list that will get all the bugmail related to
Wikidata. You can subscribe at
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs if you want
to get them.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
https://bugzilla.wikimedia.org/show_bug.cgi?id=35960
Web browser: ---
Bug #: 35960
Summary: add config variable to hide incubator test wikipedia
langlinks on wikipedia projects by default
Product: MediaWiki extensions
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: major
Priority: Unprioritized
Component: WikidataClient
AssignedTo: wikibugs-l(a)lists.wikimedia.org
ReportedBy: bugreporter(a)to.mabomuja.de
CC: wikidata-l(a)lists.wikimedia.org
Classification: Unclassified
Mobile Platform: ---
Currently langlinks on commons, wikispecies and incubator are unidirectional,
because they only link to wikipedia wikis, but no backlink is possible/wanted.
So several incubator test wikis must be included in one interwiki group.
Langlinks data from wikidata should be included in articles on incubator, but on
main wikipedia projects no langlink to incubator should be shown.
So there should be a config variable to hide all links to
incubator/wikisource/commons without adding e.g.
{{NOEXTERNALINTERLANG:incubator}} to each article.
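A minimal sketch of the requested behaviour, written in Python for illustration rather than in MediaWiki's PHP. The setting name HIDDEN_LANGLINK_GROUPS, the function visible_langlinks and the sample site identifiers and titles are hypothetical; none of them is an existing WikidataClient option.

```python
# Hypothetical site-wide setting: interwiki groups whose links are hidden
# by default on this wiki (name and values are assumptions for illustration).
HIDDEN_LANGLINK_GROUPS = {"incubator", "wikisource", "commons"}


def visible_langlinks(langlinks, hidden_groups=HIDDEN_LANGLINK_GROUPS):
    """Return only the links whose site group is not hidden.

    `langlinks` is a list of (site_group, site_id, page_title) tuples.
    """
    return [link for link in langlinks if link[0] not in hidden_groups]


# Links as they might arrive from the shared interwiki group.
links = [
    ("wikipedia", "dewiki", "Berlin"),
    ("wikipedia", "frwiki", "Berlin"),
    ("incubator", "incubator", "Wp/xyz/Berlin"),
]

# On a main wikipedia project the incubator link is dropped ...
on_wikipedia = visible_langlinks(links)
# ... while incubator itself hides nothing and shows every link.
on_incubator = visible_langlinks(links, hidden_groups=set())
```

With a setting like this, the filtering happens once per wiki and no article needs its own {{NOEXTERNALINTERLANG:...}} call.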
At 17:18 15/04/2012, Fabian M. Suchanek wrote:
>The WWW conference is a scientific conference in computer science on
>the newest developments of the Web. It is in Lyon this year:
>http://www2012.wwwconference.org
In Lyon: today and this week.
Thank you for this detailed explanation.
How do you see the integration/impact of Wikidata on both projects?
jfc
PS. I see that the "interlink" concept is still not documented on
en:wikipedia.
-- Nadja Kutz wrote
>Is it possible to briefly explain the major differences between
>DBpedia and the Yago Knowledge graph?
Both projects aim to extract a so-called ontology from Wikipedia. An
ontology in this sense is a graph (= a kind of net), in which the
nodes are entities (like Albert Einstein, or the city of Ulm) and the
links between the nodes are relationships (like "wasBornIn"). See here
for an example:
http://www.mpi-inf.mpg.de/departments/ontologies/areas/index.html
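As a rough illustration of the graph just described, here is a small Python sketch that stores facts as (subject, relation, object) triples and follows links by relation name. The identifiers and the helper objects_of are made up for this example and are not taken from either project's data format.

```python
from collections import defaultdict

# Each fact is a (subject, relation, object) triple; the entity and relation
# names below are illustrative only.
triples = [
    ("Albert_Einstein", "wasBornIn", "Ulm"),
    ("Ulm", "locatedIn", "Germany"),
]

# Adjacency view: node -> list of (relation, neighbouring node).
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))


def objects_of(subject, relation):
    """Follow all links labelled `relation` that leave `subject`."""
    return [obj for rel, obj in graph[subject] if rel == relation]


birthplaces = objects_of("Albert_Einstein", "wasBornIn")
```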
For this purpose, both projects use the structured information of
Wikipedia, i.e., its category system and its infoboxes. Both projects
have extracted a graph of several million nodes, and dozens of
millions of links between them. Seen this way, the projects go in the
same direction as what Wikidata aims to do, but in an automated
fashion.
Both projects share the same goal, but have different foci:
* YAGO has a set of around 100 relationships and maps Wikipedia
infobox attributes to them. DBpedia, in contrast, has two systems
- one system, in which each Wikipedia infobox attribute becomes a
relationship. This set of data is rather noisy, but very exhaustive.
- another system, in which relationships are defined and mapped from
infobox attributes by a community of volunteers.
The differences between these two systems are summarized here in Chapter 10.3
http://www.mpi-inf.mpg.de/yago-naga/yago/publications/aij.pdf
* DBpedia is the hub of the linked data cloud. YAGO is also in this
cloud, but not as central as DBpedia.
* YAGO attaches time and space information to many of its entities,
i.e., it knows when and where certain facts happened, and integrates
this information with data from Geonames. This aspect is less
prominent in DBpedia.
* YAGO has traditionally put much emphasis on logical constraint
checking, type checking, and a strong type hierarchy -- all in order
to maintain a high precision of the data. DBpedia, in contrast,
imports one of its type hierarchies from YAGO, and builds its own,
flatter, type hierarchy through a community of volunteers.
* YAGO has been evaluated manually, thus attaching a probabilistic
precision value to each of its relations. That is, for the relation
"actedInMovie", e.g., we know that 97% of the statements are correct.
Details are here:
http://www.mpi-inf.mpg.de/yago-naga/yago/statistics.html
For DBpedia, there is no such analysis to our knowledge.
* Both ontologies have a surprisingly small overlap of data and
instances, if mapped naively, see Chapter 5.2.3 in
http://suchanek.name/work/publications/iswc2011.pdf
... but a larger overlap if mapped in a more sophisticated way, see
Section 6.4 in
http://suchanek.name/work/publications/vldb2012.pdf
* There are certainly a number of further differences that I may not
know of or may have overlooked; please feel free to add them.
>what is the www conference ?
The WWW conference is a scientific conference in computer science on
the newest developments of the Web. It is in Lyon this year:
http://www2012.wwwconference.org
Cheers
Fabian
--
Fabian online: http://suchanek.name
Meanwhile I found a publicly accessible link to your publication: http://www.mpi-inf.mpg.de/yago-naga/yago/publications/aij.pdf in which you write:
"However, in contrast to the original YAGO, the methodology for building YAGO2 (and also maintaining it) is systematically designed top-down with the goal of integrating entity-relationship-oriented facts with the spatial and temporal dimensions. To this end, we have developed an extensible approach to fact extraction from Wikipedia and other sources, and we have tapped on specific inputs that contribute to the goal of enhancing facts with spatio-temporal scope. Moreover, we have developed a new representation model, coined SPOTL tuples (SPO + Time + Location), which can co-exist with SPO triples, but provide a much more convenient way of browsing and querying the YAGO2 knowledge base." etc. (p. 3)
So it seems one special feature is the explicit treatment of space and time, which sounds interesting. I would therefore like to make some of my questions more precise:
"YAGO has about 100 manually defined relations, such as wasBornOnDate, locatedIn and hasPopulation. Categories and infoboxes can be exploited to deliver instances of these relations. (p.3) "
...
"The new YAGO2 architecture is based on declarative rules that are stored in text files."
- Is there a wiki or some other publicly accessible place for those rules, like the mapping wiki of DBpedia?
- What do you do if infoboxes change?
- How do you treat microformats?
"Instead of seeing only SPO triples and thus having to perform an explicit de-reification join for associated meta-facts, the user should see extended 5-tuples where each fact already includes its associated temporal and spatial information. We refer to this view of the data as the SPOTL view: SPO triples augmented by Time and Location. We also discuss a further optional extension into SPOTLX 6-tuples where the last component offers keywords or key phrases from the conteXt of sources where the original SPO fact occurs.(p. 20)"
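A minimal Python sketch of the de-reification join that the quote describes: base facts carry an id, meta-facts attach a time and a location to that id, and the join yields one 5-tuple per fact. The fact id, the meta-relation names occursSince and occursIn, and the function spotl_view are assumptions chosen for illustration, not the actual YAGO2 storage layout.

```python
from typing import NamedTuple, Optional


class SPOTL(NamedTuple):
    subject: str
    predicate: str
    obj: str
    time: Optional[str]
    location: Optional[str]


# Reified storage: base facts keyed by an id, plus meta-facts about those ids.
# Ids, relation names and values are made up for illustration.
facts = {"f1": ("Albert_Einstein", "wasBornIn", "Ulm")}
meta_facts = [
    ("f1", "occursSince", "1879-03-14"),
    ("f1", "occursIn", "Ulm"),
]


def spotl_view(facts, meta_facts):
    """Join each SPO fact with its time and location meta-facts."""
    views = []
    for fact_id, (s, p, o) in facts.items():
        time = location = None
        for target, relation, value in meta_facts:
            if target != fact_id:
                continue
            if relation == "occursSince":
                time = value
            elif relation == "occursIn":
                location = value
        views.append(SPOTL(s, p, o, time, location))
    return views


rows = spotl_view(facts, meta_facts)
```

A fact without meta-facts simply keeps None in the time and location slots, so plain SPO triples and SPOTL tuples can sit side by side.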
It is not yet fully clear to me how your concept fits together with other ways of including context, such as named graphs or the inclusion of context ontologies within formats like JSON-LD. That may need a longer discussion. Are you planning to set up a wiki page on datawiki, like the one that exists for JSON-LD: http://meta.wikimedia.org/wiki/Talk:Wikidata/Data_model/JSON ?
Is it possible to extract the Yago queries in some RDF serialization format?
2012/4/14 JFC Morfin <jefsey(a)jefsey.com>
> Lydia,
> Maybe you could create and maintain
> http://meta.wikimedia.org/wiki/Wikidata/Status_updates/ as a menu page
> for all the monthly reports? This way we could quote and use
> it as a single permanent URL for the Status Reports.
> Thank you and best
JFC, http://meta.wikimedia.org/wiki/Wikidata/Status_updates already exists.
Regards,
Sylvain.
--
Sylvain Boissel
Community and technology officer, Wikimédia France
tél 07.62.93.42.02 - email sylvain.boissel(a)wikimedia.fr - twitter
@sboissel<https://twitter.com/#!/sboissel>
*Imagine a world in which every person on the planet has free access
to the sum of all human knowledge. That is our commitment. Help Wikimedia
France make it a reality <https://dons.wikimedia.fr>.*
www.wikimedia.fr
(apologies for multiple posts; please forward; please email
i-challenge2012_a_t_easychair.org for questions)
********************
NEWS:
- Deadline extension until April 25th, 2012
- The total amount of 2,000 EUR (sponsored by Wolters Kluwer) will be
awarded in prizes and split among the most promising applications.
- Linked Data Cup Board updated:
http://i-challenge.blogs.aksw.org/chairs-committee
********************
Linked Data Cup 2012
http://i-challenge.blogs.aksw.org/
co-located with the I-Semantics 2012
Graz, Austria, 5 - 7 September 2012
http://www.i-semantics.at
********************
The yearly organised Linked Data Cup (formerly Triplification Challenge)
awards prizes to the most promising innovation involving linked data.
Four different technological topics are addressed: triplification,
interlinking, cleansing, and application mash-ups. The Linked Data Cup
invites scientists and practitioners to submit novel and innovative (5
star) linked data sets and applications built on linked data technology.
Although more and more data is triplified and published as RDF and
linked data, the question arises how to evaluate the usefulness of such
approaches. The Linked Data Cup therefore requires all submissions to
include a concrete use case and problem statement alongside a solution
(triplified data set, interlinking/cleansing approach, linked data
application) that showcases the usefulness of linked data. Submissions
that can provide measurable benefits of employing linked data over
traditional methods are preferred.
Note that the call is not limited to any domain or target group. We
accept submissions ranging from value-added business intelligence use
cases to scientific networks to the longest tail [1] of information
domains. The only strict requirement is that the employment of linked
data is very well motivated and also justified (i.e. we rank higher those
approaches that provide solutions which could not have been realised
without linked data, even if they lack technical or scientific
brilliance). The total amount of 2,000 EUR (sponsored by Wolters Kluwer)
will be awarded in prizes and split among the most promising applications.
Evaluation Criteria
===================
The submissions will be initially evaluated with a well-known five star
ranking system [2]. Furthermore, entries will be assessed according to
the extent to which they
1. motivate the relevancy of their use case for their respective domain;
2. justify the adequacy of linked data technologies for their solution;
3. demonstrate that all alternatives to linked data would have resulted
in an inferior solution;
4. provide an evaluation that can measure the benefits of linked data.
Topics
======
Ideas for topics include (but are not limited to):
* Improving traditional approaches with help of linked data
* Linked data use in science and education
* Linked data supported multimedia applications
* Linked data in the open source context
* Web annotation
* Generic applications
* Internationalization of linked data
* Visualization of linked data
* Linked government data
* Business models based on linked data
* Recommender systems supported by linked data
* Integrating microposts with linked data
* Distributed social web based on linked data
* Linked data sensor networks
Submission and Reviewing
========================
Submissions to the Linked Data Cup will be reviewed by members of the
Linked Data Cup Board and invited experts from the Linked Data community.
Submissions should consist of 4 pages and must be original and must not
have been submitted for publication elsewhere. Papers should follow the
ACM ICPS guidelines for formatting as accepted submissions will be
published in the I-SEMANTICS 2012 proceedings in the digital library of
the ACM ICP series. Please read the submission page[a] for detailed
information on how to submit.
Important Dates (Linked Data Cup)
=================================
1. Paper Submission Deadline: April 25, 2012
2. Notification of Acceptance: May 21, 2012
3. Camera-Ready Paper: June 11, 2012
Links
=====
[1]
http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=uleiauersemmedmainz20…
[2] http://www.w3.org/DesignIssues/LinkedData.html
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org