Cool, thanks! I read this a while ago, rereading again.
On Tue, Jan 15, 2019 at 3:28 AM Sebastian Hellmann <hellmann@informatik.uni-leipzig.de> wrote:
Hi all,
let me send you a paper from 2013, which might either help directly or at least give you some ideas...
A lemon lexicon for DBpedia. Christina Unger, John McCrae, Sebastian Walter, Sara Winter, Philipp Cimiano. 2013. Proceedings of the 1st International Workshop on NLP and DBpedia, co-located with the 12th International Semantic Web Conference (ISWC 2013), October 21-25, Sydney, Australia.
https://github.com/ag-sc/lemon.dbpedia
https://pdfs.semanticscholar.org/638e/b4959db792c94411339439013eef536fb052.p...
Since the mappings from DBpedia properties to Wikidata properties are here: http://mappings.dbpedia.org/index.php?title=Special:AllPages&namespace=2... (e.g. http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate), you could directly use the DBpedia-lemon lexicalisation for Wikidata.
The mappings can be downloaded with
git clone https://github.com/dbpedia/extraction-framework ; cd extraction-framework/core ; ../run download-mappings
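As a minimal sketch of how the two pieces could fit together (everything below is invented for illustration; the mapping table, patterns, and function are not taken from the lemon.dbpedia files), one could look up the DBpedia ontology property behind a Wikidata property and reuse its lemon-style verbalisation pattern:

# Illustration only: mapping table and patterns are made up for this example.

# DBpedia ontology property -> Wikidata property (as on mappings.dbpedia.org)
DBPEDIA_TO_WIKIDATA = {
    "dbo:birthDate": "P569",
    "dbo:birthPlace": "P19",
}

# Simplified lemon-style verbalisation patterns keyed by DBpedia property
LEXICALISATIONS = {
    "dbo:birthDate": "{subject} was born on {object}",
    "dbo:birthPlace": "{subject} was born in {object}",
}

def verbalise(subject_label, wikidata_property, object_label):
    """Render a Wikidata statement via a DBpedia-lemon pattern, if one exists."""
    for dbo_prop, wd_prop in DBPEDIA_TO_WIKIDATA.items():
        if wd_prop == wikidata_property:
            pattern = LEXICALISATIONS.get(dbo_prop)
            if pattern:
                return pattern.format(subject=subject_label, object=object_label)
    return None

print(verbalise("Marie Curie", "P569", "7 November 1867"))
# Marie Curie was born on 7 November 1867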
All the best,
Sebastian
On 14.01.19 18:34, Denny Vrandečić wrote:
Felipe,
thanks for the kind words.
There are a few research projects that use Wikidata to generate parts of Wikipedia articles - see for example https://arxiv.org/abs/1702.06235, which gets results almost as good as human-written text and beats templates by far, but only for the first sentence of biographies.
Lucie Kaffee also has quite a body of research on that topic, and has worked very successfully and closely with some Wikipedia communities on these questions. Here's her bibliography: https://scholar.google.com/citations?user=xiuGTq0AAAAJ&hl=de
Another project of hers is currently under review for a grant: https://meta.wikimedia.org/wiki/Grants:Project/Scribe:_Supporting_Under-reso... - I would suggest taking a look and, if you are so inclined, expressing support. It is totally worth it!
My opinion is that these projects are great for starters, and should be done (low-hanging fruits and all that), but won't get much further at least for a while, mostly because Wikidata rarely offers more than a skeleton of content. A decent Wikipedia article will include much, much more content than what is represented in Wikidata. And if you only use that for input, you're limiting yourself too much.
Here's a different approach based on summarization over input sources: https://www.wired.com/story/using-artificial-intelligence-to-fix-wikipedias-... - this one is more promising for the short to mid term.
I still maintain that the Abstract Wikipedia approach has certain advantages over both learned approaches, and it is most aligned with Lucie's work. The machine-learned approaches always fall short on editability, due to the black-box nature of their solutions.
Also, I agree with Jeblad.
That leaves the question: why is there not more discussion? Maybe because there is nothing substantial to discuss yet :) The two white papers are rather high level and the idea is not concrete enough yet, so I wouldn't expect much on-wiki discussion at this point. It was similar with Wikidata - the number of people who discussed Wikidata at this level of maturity was tiny; it increased considerably once an actual design plan was suggested, but still remained small - and then exploded once the system was deployed. I would be surprised and delighted if we managed to avoid this pattern this time, but I can't do more than publicly present the idea, announce plans once they are there, and hope for a timely discussion :)
Cheers, Denny
On Mon, Jan 14, 2019 at 2:54 AM John Erling Blad jeblad@gmail.com wrote:
An additional note: what Wikipedia urgently needs is a way to create and reuse canned text (aka "templates"), and a way to adapt that text to data from Wikidata. That is mostly just inflection rules, but in some cases it involves grammar rules. Creating larger pieces of text is much harder, especially if the text is supposed to be readable. Jumbling sentences together, as is commonly done by various bot scripts, does not work very well - or rather, it does not work at all.
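A toy illustration of what such canned text with inflection could look like (P21 and P106 are real Wikidata properties, but the item data, the inflection table and the template below are made up for the example; none of this is an existing Wikipedia mechanism):

# Illustration only: invented data, toy inflection table.

ITEM = {
    "label": "Ada Lovelace",
    "P21": "female",          # sex or gender
    "P106": "mathematician",  # occupation
}

# Toy inflection table: (subject pronoun, possessive pronoun)
PRONOUNS = {"female": ("she", "her"), "male": ("he", "his")}

TEMPLATE = "{label} was a {occupation}; {subj} is remembered for {poss} work."

def render(item):
    subj, poss = PRONOUNS.get(item.get("P21"), ("they", "their"))
    return TEMPLATE.format(label=item["label"],
                           occupation=item["P106"],
                           subj=subj, poss=poss)

print(render(ITEM))
# Ada Lovelace was a mathematician; she is remembered for her work.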
On Mon, Jan 14, 2019 at 11:44 AM John Erling Blad jeblad@gmail.com wrote:
Using an abstract language as a basis for translation has been tried before, and it is almost as hard as translating between two common languages.
There are two really hard problems: the implied references and the cultural context. An artificial language can get rid of the implied references, but it tends to create very weird and unnatural expressions. If the cultural context is removed, it can be extremely hard to put back in, and without any cultural context it can be hard to explain anything.
But yes, you can make an abstract language; it just won't give you any high-quality prose.
On Mon, Jan 14, 2019 at 8:09 AM Felipe Schenone schenonef@gmail.com wrote:
This is quite an awesome idea. But thinking about it, wouldn't it be possible to use the structured data in Wikidata to generate articles? Can't we skip the need to learn an abstract language by using Wikidata?
Also, is there discussion about this idea anywhere in the Wikimedia wikis? I haven't found any...
On Sat, Sep 29, 2018 at 3:44 PM Pine W wiki.pine@gmail.com wrote:
Forwarding because this (ambitious!) proposal may be of interest to people on other lists. I'm not endorsing the proposal at this time, but I'm curious about it.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Denny Vrandečić vrandecic@gmail.com
Date: Sat, Sep 29, 2018 at 6:32 PM
Subject: [Wikimedia-l] Wikipedia in an abstract language
To: Wikimedia Mailing List wikimedia-l@lists.wikimedia.org
Semantic Web languages allow us to express ontologies and knowledge bases in a way meant to be particularly amenable to the Web. Ontologies formalize the shared understanding of a domain. But the most expressive and widespread languages that we know of are human natural languages, and the largest knowledge base we have is the wealth of text written in human languages.

We look for a path to bridge the gap between knowledge representation languages such as OWL and human natural languages such as English. We propose a project that simultaneously exposes that gap, allows collaboration on closing it, makes progress widely visible, and is highly attractive and valuable in its own right: a Wikipedia written in an abstract language to be rendered into any natural language on request. This would make current Wikipedia editors about 100x more productive, and increase the content of Wikipedia by 10x. For billions of users this will unlock knowledge they currently do not have access to.
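To make the core idea concrete (this sketch is not part of the proposal; the abstract "constructor" and the per-language renderers below are invented purely for illustration), the abstract content would be stored once, language-independently, and turned into text by per-language renderers:

# Illustration only: invented constructor and renderers.

abstract_statement = {
    "constructor": "birth",
    "args": {"person": "Marie Curie", "year": 1867, "place": "Warsaw"},
}

RENDERERS = {
    "en": lambda a: f"{a['person']} was born in {a['place']} in {a['year']}.",
    "de": lambda a: f"{a['person']} wurde {a['year']} in {a['place']} geboren.",
}

def render(statement, lang):
    # The abstract content is stored once; only the renderer differs by language.
    # (In a real system, labels such as place names would also need per-language lookup.)
    return RENDERERS[lang](statement["args"])

print(render(abstract_statement, "en"))
print(render(abstract_statement, "de"))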
My first talk on this topic will be on October 10, 2018, 16:45-17:00, at the Asilomar in Monterey, CA during the Blue Sky track of ISWC. My second, longer talk on the topic will be at the DL workshop in Tempe, AZ, October 27-29. Comments are very welcome as I prepare the slides and the talk.
Link to the paper: http://simia.net/download/abstractwikipedia.pdf
Cheers, Denny
--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org