Hello all,
I'm planning to write a proposal for a Wikidata-to-DBpedia project in GSoC 2013.
I found the following on the change propagation page
<http://meta.wikimedia.org/wiki/Wikidata/Notes/Change_propagation>:
> Support for 3rd party clients, that is, client wikis and other consumers
> outside of Wikimedia, is currently not essential and will not be
> implemented for now. It shall however be kept in mind for all design
> decisions.
I wanted to know two things:
1. What would be the time frame for change propagation to be ready, even as a
rough estimate? Could it be ready within 2 or 3 months?
2. Is there any design pattern or a brief outline of the change propagation
design? That would let me make a rough plan and estimate of how it could be
consumed on the DBpedia side.
Thanks and regards,
-------------------------------------------------
Hady El-Sahar
Research Assistant
Center of Informatics Sciences | Nile University<http://nileuniversity.edu.eg/>
email : hadyelsahar(a)gmail.com
Phone : +2-01220887311
http://hadyelsahar.me/
<http://www.linkedin.com/in/hadyelsahar>
Apologies for cross-posting!
=======================
NLP & DBpedia Workshop 2013
=======================
Free, open, interoperable and multilingual NLP for DBpedia and DBpedia
for NLP:
http://nlp-dbpedia2013.blogs.aksw.org/
Collocated with the International Semantic Web Conference 2013 (ISWC 2013)
21-22 October 2013, in Sydney, Australia (*Submission deadline July 8th*)
**********************************
Recently, the DBpedia community has experienced an immense increase in
activity, and we believe that the time has come to explore the
connection between DBpedia and Natural Language Processing (NLP) at an
unprecedented depth. The goal of this workshop can be summarized by this
(pseudo-) formula:
NLP & DBpedia == DBpedia4NLP && NLP4DBpedia
http://db0.aksw.org/downloads/CodeCogsEqn_bold2.gif
DBpedia has a long-standing tradition of providing useful data as well as
a commitment to reliable Semantic Web technologies and living best
practices. With the rise of Wikidata, DBpedia is step by step relieved
of the tedious extraction of data from Wikipedia's infoboxes and can
shift its focus to new challenges such as extracting information from
the unstructured article text, as well as becoming a testing ground for
multilingual NLP methods.
Contribution
=========
Within the timeframe of this workshop, we hope to mobilize a community
of stakeholders from the Semantic Web area. We envision that the workshop
will produce the following items:
* an open call to the DBpedia data consumer community will generate a
wish list of data which is to be generated from Wikipedia by NLP
methods. This wish list will be broken down into tasks and benchmarks,
and a gold standard will be created.
* the benchmarks and test data created will be collected and published
under an open license for future evaluation (inspired by OAEI and
UCI-ML). An overview of the benchmarks can be found here:
http://nlp-dbpedia2013.blogs.aksw.org/benchmarks
Please sign up for our mailing list if you are interested in discussing
guidelines and NLP benchmarking:
http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp-dbpedia-public
Important dates
===========
8 July 2013, Paper Submission Deadline
9 August 2013, Notification of accepted papers sent to authors
Motivation
=======
The central role of Wikipedia (and therefore DBpedia) for the creation
of a Translingual Web has recently been recognized by the Strategic
Research Agenda (cf. section 3.4, page 23) and most of the contributions
of the recently held Dagstuhl seminar on the Multilingual Semantic Web
also stress the role of Wikipedia for Multilingualism. As more and more
language-specific chapters of DBpedia appear (currently 14 language
editions), DBpedia is becoming a driving factor for a Linguistic Linked
Open Data cloud as well as localized LOD clouds with specialized domains
(e.g. the Dutch windmill domain ontology created from
http://nl.dbpedia.org ).
The data contained in Wikipedia and DBpedia have ideal properties for
making them a controlled testbed for NLP. Wikipedia and DBpedia are
multilingual and multi-domain, the communities maintaining these
resources are very open, and it is easy to join and contribute. The open
license allows data consumers to benefit from the content, and many parts
are collaboratively editable. In particular, the data in DBpedia is widely
used and disseminated throughout the Semantic Web.
NLP4DBpedia
==========
DBpedia has been around for quite a while, infusing the Web of Data with
multi-domain data of decent quality. These triples are, however, mostly
extracted from Wikipedia infoboxes. To unlock the full potential of
Wikipedia articles for DBpedia, the information contained in the
remaining parts of the articles needs to be analysed and triplified.
This is where NLP techniques can help.
DBpedia4NLP
==========
On the other hand, NLP, and information extraction techniques in
particular, rely on various resources while processing texts from
various domains. These resources may be used as an element of a
solution (e.g. a gazetteer forming an important part of an expert-written
rule, or a disambiguation resource) or while delivering a solution (e.g.
within machine learning approaches). DBpedia easily fits both of these
roles.
We invite papers from both of these areas, including:
1. Knowledge extraction from text and HTML documents (especially
unstructured and semi-structured documents) on the Web, using
information in the Linked Open Data (LOD) cloud, and especially in DBpedia.
2. Representation of NLP tool output and NLP resources as RDF/OWL, and
linking the extracted output to the LOD cloud.
3. Novel applications using the extracted knowledge, the Web of Data or
DBpedia-based NLP methods.
The specific topics are listed below.
Topics
=====
- Improving DBpedia with NLP methods
- Finding errors in DBpedia with NLP methods
- Annotation methods for Wikipedia articles
- Cross-lingual data and text mining on Wikipedia
- Pattern and semantic analysis of natural language, reading the Web,
learning by reading
- Large-scale information extraction
- Entity resolution and automatic discovery of Named Entities
- Multilingual recognition of real-world entities
- Frequent pattern analysis of entities
- Relationship extraction, slot filling
- Entity linking, Named Entity disambiguation, cross-document
co-reference resolution
- Disambiguation through knowledge bases
- Ontology representation of natural language text
- Analysis of ontology models for natural language text
- Learning and refinement of ontologies
- Natural language taxonomies modeled as Semantic Web ontologies
- Use cases for potential data extracted from Wikipedia articles
- Use cases of entity recognition for Linked Data applications
- Impact of entity linking on information retrieval, semantic search
Furthermore, an informal list of NLP tasks can be found on this
Wikipedia page:
http://en.wikipedia.org/wiki/Natural_language_processing#Major_tasks_in_NLP
These are relevant for the workshop as long as they fit into the
DBpedia4NLP and NLP4DBpedia frame (i.e. the data used revolves around
Wikipedia and DBpedia).
Submission formats
==============
Paper submission
-----------------------
All papers must represent original and unpublished work that is not
currently under review. Papers will be evaluated according to their
significance, originality, technical content, style, clarity, and
relevance to the workshop. At least one author of each accepted paper is
expected to attend the workshop.
* Full research paper (up to 12 pages)
* Position papers (up to 6 pages)
* Use case descriptions (up to 6 pages)
* Data/benchmark paper (2-6 pages, depending on the size and complexity)
Note: data and benchmark papers are meant to provide a citable
reference for your data and benchmarks. We kindly require that you
upload any data you use to our benchmark repository in parallel to the
submission. We recommend using an open license (e.g. CC-BY), but the
minimum requirement is free use. Please write to the mailing list if
you have any problems.
Full instructions are available at:
http://nlp-dbpedia2013.blogs.aksw.org/submission/
Submission of data and use cases
--------------------------------------------
This workshop also targets non-academic users and developers. If you
have any (open) data (e.g. texts or annotations) that can be used for
benchmarking NLP tools, but do not want or need to write an academic
paper about it, please feel free to just add it to this table:
http://tinyurl.com/nlp-benchmarks or upload it to our repository:
http://github.com/dbpedia/nlp-dbpedia
Full instructions are available at:
http://nlp-dbpedia2013.blogs.aksw.org/benchmarks/
Also if you have any ideas, use cases or data requests please feel free
to just post them on our mailing list: nlp-dbpedia-public [at]
lists.informatik.uni-leipzig.de or send them directly to the chairs:
nlp-dbpedia2013 [at] easychair.org
Program committee
==============
* Guadalupe Aguado, Universidad Politécnica de Madrid, Spain
* Chris Bizer, Universität Mannheim, Germany
* Volha Bryl, Universität Mannheim, Germany
* Paul Buitelaar, DERI, National University of Ireland, Galway
* Charalampos Bratsas, OKFN Greece, Αριστοτέλειο Πανεπιστήμιο
Θεσσαλονίκης (Aristotle University of Thessaloniki), Greece
* Philipp Cimiano, CITEC, Universität Bielefeld, Germany
* Samhaa R. El-Beltagy, جامعة النيل (Nile University), Egypt
* Daniel Gerber, AKSW, Universität Leipzig, Germany
* Jorge Gracia, Universidad Politécnica de Madrid, Spain
* Max Jakob, Neofonie GmbH, Germany
* Anja Jentzsch, Hasso-Plattner-Institut, Potsdam, Germany
* Ali Khalili, AKSW, Universität Leipzig, Germany
* Daniel Kinzler, Wikidata, Germany
* David Lewis, Trinity College Dublin, Ireland
* John McCrae, Universität Bielefeld, Germany
* Uroš Milošević, Institut Mihajlo Pupin, Serbia
* Roberto Navigli, Sapienza, Università di Roma, Italy
* Axel Ngonga, AKSW, Universität Leipzig, Germany
* Asunción Gómez Pérez, Universidad Politécnica de Madrid, Spain
* Lydia Pintscher, Wikidata, Germany
* Elena Montiel Ponsoda, Universidad Politécnica de Madrid, Spain
* Giuseppe Rizzo, Eurecom, France
* Harald Sack, Hasso-Plattner-Institut, Potsdam, Germany
* Felix Sasaki, Deutsches Forschungszentrum für künstliche Intelligenz,
Germany
* Mladen Stanojević, Institut Mihajlo Pupin, Serbia
* Hans Uszkoreit, Deutsches Forschungszentrum für künstliche
Intelligenz, Germany
* Rupert Westenthaler, Salzburg Research, Austria
* Feiyu Xu, Deutsches Forschungszentrum für künstliche Intelligenz, Germany
Contact
=====
Of course we would prefer that you post any questions and comments
regarding NLP and DBpedia to our public mailing list at:
nlp-dbpedia-public [at] lists.informatik.uni-leipzig.de
If you want to contact the chairs of the workshop directly, please write
to:
nlp-dbpedia2013 [at] easychair.org
Kind regards,
Sebastian Hellmann, Agata Filipowska, Caroline Barrière,
Pablo N. Mendes, Dimitris Kontokostas
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org,
Deadline: *July 8th*)
Come to Germany as a PhD student: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
Forwarding this to the discussion list for Wikidata.
Sven
On Sat, Apr 27, 2013 at 10:53 AM, Maarten Dammers <maarten(a)mdammers.nl> wrote:
> Hi everyone,
>
> Now is a good time to start adding information about our cultural heritage to
> Wikidata. Not everything is possible yet, but we can at least start with
> the simple things that need to be done anyway:
> * Have items for every monument article and list
> * Add claims to every monument article (P31) and list (P360) linking them
> to the article about the local cultural heritage (example: Rijksmonument)
> * Add country (P17) to all lists and articles
>
> Some things that are also possible (but not in all countries yet):
> * Add the identifier
> * Add the type of building
> * Add the administrative unit (state/province/municipality, etc)
> * Image
> * Commons category
>
> To keep track of this, we created
> https://www.wikidata.org/wiki/Wikidata:Cultural_heritage_task_force .
> Who wants to help out? You don't need to be a bot or a Wikidata wizard,
> just start by looking up the article about cultural heritage in your
> region. You can see several examples in the list.
>
> For bot owners: I'm using claimit.py. It's in the rewrite branch of
> Pywikipedia. You can use it to mass-add claims to Wikidata, for example for
> the lists of Rijksmonumenten:
> claimit.py -catr:Lijsten_van_rijksmonumenten -namespace:0 P360 Q916333
>
> Maarten
>
>
> _______________________________________________
> Wiki Loves Monuments mailing list
> WikiLovesMonuments(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikilovesmonuments
> http://www.wikilovesmonuments.org
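For those who don't have claimit.py handy, Maarten's command above boils down to
roughly the following pywikibot (rewrite branch) sketch. The category title and
the P360 -> Q916333 claim are taken from his example; everything else is only an
illustration, and exact function names may differ between pywikibot versions.

import pywikibot
from pywikibot import pagegenerators

# Rough sketch of what the claimit.py call above does: for every list page in
# the category tree, add "is a list of" (P360) -> Rijksmonument (Q916333) to
# the Wikidata item connected to that page.
site = pywikibot.Site("nl", "wikipedia")
repo = site.data_repository()

category = pywikibot.Category(site, "Categorie:Lijsten van rijksmonumenten")
for page in pagegenerators.CategorizedPageGenerator(category, recurse=True):
    if page.namespace() != 0:                   # -namespace:0 in the command above
        continue
    item = pywikibot.ItemPage.fromPage(page)    # the Wikidata item for this page
    item.get()
    if "P360" in item.claims:                   # don't add the claim twice
        continue
    claim = pywikibot.Claim(repo, "P360")
    claim.setTarget(pywikibot.ItemPage(repo, "Q916333"))
    item.addClaim(claim)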
On Sun, Apr 28, 2013 at 5:05 AM, Paul Selitskas <p.selitskas(a)gmail.com> wrote:
> How about if I don't want such fallback to work for me? What if I'd
> like to see what is labeled and what is not? Have you considered this
> a user option with a flexible fallback schema or a site-wide
> preference with a fixed one?
>
> In general, this is a very good Wikidata feature that is not yet implemented.
> And thanks for raising the category redirects once again! :)
>
I guess being able to see what's translated and what's not could be
handled by appending the language name to a label when it falls back
to another language.
On Sun, Apr 28, 2013 at 5:55 AM, Lukas Benedix
<benedix(a)zedat.fu-berlin.de> wrote:
> Hi,
>
> If I understand your proposal on User:Liangent/wb-lang right, you want to
> write an extension or implement the language fallback directly in Wikibase.
>
> I thought about tackling the missing-language issue myself and think
> writing a gadget would have a better chance of getting deployed on
> wikidata.org. Users who don't like it could easily disable the gadget in
> their preferences.
I guess I prefer to patch Wikibase directly.
>
> You should keep in mind that the user interface of such a feature is not easy
> to design.
>
That's why my proposal allocates two weeks for collecting feedback on designs.
On Sun, Apr 28, 2013 at 10:03 AM, Daniel Friesen
<daniel(a)nadir-seen-fire.com> wrote:
> We already have a way to handle that kind of thing with normal language
> fallbacks. &uselang=qqx.
> Wikidata should be able to do something similar trivially.
>
I don't understand how &uselang=qqx would work for this. Any explanation?
-Liangent
Hello,
I've drafted my proposal about language fallback and conversion issues
for Wikidata at [1].
Currently Wikidata stores multilingual content. Labels (names,
descriptions, etc.) are expected to be written in every language, so
every user can read them in their own language. But there are currently
some problems:
* If some content doesn't exist in a specific language, users with
that exact language set in their preferences see something meaningless
(the item's ID instead). This makes languages with fewer users (and thus
fewer labels filled in) practically unusable.
* Some similar languages often share the same value. Populating strings
for every such language one by one wastes resources and may let them
fall out of sync later.
* Even for languages which are not "that similar", MediaWiki already
has facilities to transliterate (a.k.a. convert) content from a sister
language (a.k.a. variant), which can be used to provide better results
for users.
This proposal aims to resolve these issues by displaying content
from another language based on user preferences (some users may know
more than one language), language similarity (the language fallback
chain), or the possibility of transliteration, and by allowing proper
editing of that content.
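To make the intended behaviour concrete, here is a minimal sketch of the label
resolution I have in mind. The real implementation would live in Wikibase itself;
the fallback chains, the transliteration stub, and the idea of marking fallbacks
with the source language (mentioned earlier in this thread) are illustrative
assumptions, not final design.

# Minimal sketch of per-language label fallback; the chains and the converter
# stub are illustrative only, not Wikibase's actual configuration.
FALLBACK_CHAINS = {
    "zh-hant": ["zh-hant", "zh", "zh-hans", "en"],
    "zh-hans": ["zh-hans", "zh", "zh-hant", "en"],
    "de-ch":   ["de-ch", "de", "en"],
}

def transliterate(text, source_lang, target_lang):
    """Placeholder for MediaWiki's language converter (e.g. zh-hans <-> zh-hant)."""
    return text  # real conversion omitted in this sketch

def resolve_label(labels, user_lang, mark_fallback=True):
    """Pick the best label for user_lang from a {language: label} dict."""
    for lang in FALLBACK_CHAINS.get(user_lang, [user_lang, "en"]):
        if lang in labels:
            label = labels[lang]
            if lang != user_lang:
                label = transliterate(label, lang, user_lang)
                if mark_fallback:
                    # Keep the fallback visible, as suggested earlier in this thread.
                    label = "%s [%s]" % (label, lang)
            return label
    return None  # nothing usable; the caller may show the bare item ID as today

# A zh-hant user sees the zh-hans label, converted and marked as a fallback:
print(resolve_label({"zh-hans": "地球", "en": "Earth"}, "zh-hant"))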
Although Wikidata is still in a phase of fast development, lots of data
have already been added to it. The later we resolve these issues, the
more duplication may be created, which will require more clean-up work
in the future, similar to what we had to face when the language
converter (that transliteration system) was introduced for the Chinese
Wikipedia. So I'm planning to do this project this summer.
There's also a backup proposal about category redirects at [2]. I
wrote it because I really want to see it implemented too, either by me
or by someone else. Some of its content may also be useful for other
participants willing to take on this project.
Comments are welcome and appreciated.
[1] https://www.mediawiki.org/wiki/User:Liangent/wb-lang
[2] https://www.mediawiki.org/wiki/User:Liangent/cat-redir
-Liangent
Hello,
I've been having discussions about my GSoC 2013 project with the Wikidata
group on IRC (#mediawiki-wikidata) for a few days and have completed the
first draft of my proposal. I'd really appreciate some feedback on it. I
welcome any queries that you may have and would love to get tips on how to
improve it.
http://www.mediawiki.org/wiki/User:Pragunbhutani/GSoC_2013_Proposal
On Sumanah's suggestion, I've limited the scope of the project to 6 weeks
to allow time for code review and bug fixes. I thought it best to run it by
wikidata-l before sharing it with wikitech-l for their opinion.
Many thanks!
--
Pragun Bhutani
http://pragunbhutani.in
Skype : pragun.bhutani
Heya folks :)
Lots of good stuff happened around Wikidata this week. Your summary
is here: http://meta.wikimedia.org/wiki/Wikidata/Status_updates/2013_04_26
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Technical Projects
Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations at the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.
CCing wikidata.
I don't think this is a good approach. We shouldn't be breaking the API just
because there is a new under-the-hood feature (Wikibase). From the API
client's perspective, it should work as before, plus there should be an
extra flag indicating whether the sitelink is stored in Wikidata or locally.
Sitelinks might be the first such change, but not the last (e.g. categories,
etc.).
As for the implementation, it seems the hook approach might not satisfy all
the usage scenarios:
* Given a set of pages (pageset), return all the sitelinks (possibly filtered
by a set of wanted languages). Rendering a page for the UI would use this
approach with just one page.
* langbacklinks - get a list of pages linking to a site.
* Filtering based on having/not having a specific langlink, for other modules.
E.g. list all pages that have/don't have a link to site X.
* alllanglinks (not yet implemented, but might be, to match the corresponding
allcategories, ...) - list all existing langlinks on the site.
We could debate the need for some of these scenarios, but I feel that we
shouldn't be breaking the existing API.
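To illustrate the kind of existing client call that would change behaviour, here
is a minimal sketch of fetching langlinks through the API. The opt-in parameter
for Wikidata-provided links is not named in the changes quoted below, so the
commented-out flag is purely a placeholder, not a real parameter.

import requests

# Fetch the language links for one page via prop=langlinks. Under the proposed
# change, links that come from Wikidata would no longer appear here unless an
# extra parameter is passed.
params = {
    "action": "query",
    "titles": "Berlin",
    "prop": "langlinks",
    "lllimit": "max",
    "format": "json",
    # Hypothetical opt-in flag for Wikidata-stored sitelinks; the real
    # parameter name is still in review and may differ:
    # "llincludewikidata": 1,
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params).json()
page = next(iter(resp["query"]["pages"].values()))
for link in page.get("langlinks", []):
    print(link["lang"], link["*"])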
On Thu, Apr 25, 2013 at 2:24 PM, Brad Jorsch <bjorsch(a)wikimedia.org> wrote:
> Language links added by Wikidata are currently stored in the parser
> cache and in the langlinks table in the database, which means they
> work the same as in-page langlinks but also that the page must be
> reparsed if these wikidata langlinks change. The Wikidata team has
> proposed to remove the necessity for the page reparse, at the cost of
> changing the behavior of the API with regard to langlinks.
>
> Gerrit change 59997[1] (still in review) will make the following
> behavioral changes:
> * action=parse will return only the in-page langlinks by default.
> Inclusion of Wikidata langlinks may be requested using a new
> parameter.
> * list=allpages with apfilterlanglinks will only consider in-page
> langlinks.
> * list=langbacklinks will only consider in-page langlinks.
> * prop=langlinks will only list in-page langlinks.
>
> Gerrit change 60034[2] (still in review) will make the following
> behavioral changes:
> * prop=langlinks will have a new parameter to request inclusion of the
> Wikidata langlinks in the result.
>
> A future change, not coded yet, will allow for Wikidata to flag its
> langlinks in various ways. For example, it could indicate which of the
> other-language articles are Featured Articles.
>
> At this time, it seems likely that the first change will make it into
> 1.22wmf3.[3] The timing of the second and third changes is less
> certain.
>
>
> [1]: https://gerrit.wikimedia.org/r/#/c/59997
> [2]: https://gerrit.wikimedia.org/r/#/c/60034
> [3]: https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap
>
> --
> Brad Jorsch
> Software Engineer
> Wikimedia Foundation
>
> _______________________________________________
> Mediawiki-api-announce mailing list
> Mediawiki-api-announce(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
>
Heya folks :)
The start of phase 2 has just been deployed on all 274 remaining Wikipedias \o/
http://blog.wikimedia.de/2013/04/24/wikidata-all-around-the-world/
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Technical Projects
Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations at the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.
I am completely amazed by a particularly brilliant way that Wikipedia uses
Wikidata. Instead of simply displaying the data from Wikidata and removing
the local data, a template and workflow are proposed which:
* grabs the relevant data from Wikidata
* compares it with the data given locally in the Wikipedia
* displays the Wikipedia data
* adds a maintenance category in case the data is different
This allows both communities to check the maintenance category, provides a
safety net against vandal changes, still makes it noticeable if some data has
changed, etc., and lets them phase out the local data over time when they get
comfortable and if they want to. It is a balance of maintenance effort and
data quality.
I am not saying that this is the right solution in every use case, for every
topic, for every language. But it is a perfect example of how the community
will surprise us by coming up with ingenious solutions if they get enough
flexibility, powerful tools, and enough trust.
Yay, Wikipedia!
The workflow is described here:
<http://en.wikipedia.org/wiki/Template_talk:Commons_category#Edit_request_on…>
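To make the comparison step explicit, here is a conceptual sketch of that
workflow. The real thing is a wikitext template on the English Wikipedia, not
Python, and the function and maintenance-category names below are made up for
illustration only.

# Conceptual sketch of the template workflow described above; the function and
# maintenance-category names are illustrative, not the actual enwiki ones.
def render_commons_category(local_value, wikidata_value):
    """Return (value to display, maintenance categories) for one article."""
    categories = []
    if local_value:
        display = local_value                  # the local value wins for display
        if wikidata_value and wikidata_value != local_value:
            # Both sources exist but disagree: flag for human review instead of
            # silently preferring either one.
            categories.append("Category:Commons category differs from Wikidata")
        elif not wikidata_value:
            categories.append("Category:Commons category missing on Wikidata")
    else:
        display = wikidata_value               # no local data: fall back to Wikidata
    return display, categories

# Example: local and Wikidata values disagree, so the article gets flagged.
print(render_commons_category("Berlin", "Berlin, Germany"))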
There is an RFC currently going on about whether and how to use Wikidata
data in the English Wikipedia, coming out of the discussion that was here a
few days ago. If you are an English Wikipedian, you might be interested:
<http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_2>
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations at the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.