Hello,
I would like everyone interested in the interaction of Wikidata and Wiktionary to take a look at the following proposal. It tries to serve all use cases mentioned so far while still remaining fairly simple to implement.
http://www.wikidata.org/wiki/Wikidata:Wiktionary
To the best of our knowledge, we have checked all discussions on this topic, and also related work like OmegaWiki, Wordnet, etc., and are building on top of that.
I would greatly appreciate it if some liaison editors could reach out to the Wiktionaries in order to get a wider discussion base. We are currently reading more on related work and trying to improve the proposal.
It would be great if we could keep the discussion on the discussion page on the wiki, so as to bundle it a bit, or at least have pointers there.
http://www.wikidata.org/wiki/Wikidata_talk:Wiktionary
Note that we are sharing this proposal early. Implementation has not started yet (obviously, otherwise the discussion would be a bit moot), and this is more of a mid-term commitment (i.e. if the discussion goes smoothly, it might be implemented and deployed by the end of the year or so, although this of course depends on the results of the discussion).
Cheers, Denny
If you haven't already, it might be worth looking at the Freebase schema for Wordnet, especially how it connects synsets to Freebase topics:
https://www.freebase.com/base/wordnet/synset?schema=
Tom
-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
On 19/06/13 15:03, Tom Morris wrote:
If you haven't already, it might be worth looking at the Freebase schema for Wordnet, especially how it connects synsets to Freebase topics:
https://www.freebase.com/base/wordnet/synset?schema=
Tom
WordNet does not seem to be under a free license -- see
http://wordnet.princeton.edu/wordnet/license/
Since Wikidata's CC0 licensing allows commercial use, surely integrating any kind of data from WordNet risks conflict with WordNet's license?
Neil
It's not about the actual content, but rather about the data model.
On Wed, Jun 19, 2013 at 03:59:13PM +0100, Neil Harris wrote:
WordNet does not seem to be under a free license -- see
http://wordnet.princeton.edu/wordnet/license/
Since Wikidata's CC0 licensing allows commercial use, surely integrating any kind of data from WordNet risks conflict with WordNet's license?
I think that despite the confusing verbiage at the top, this is actually a BSD license. It's also what https://en.wikipedia.org/wiki/WordNet indicates. It's certainly more restrictive than CC0, but it's a free license.
Thanks, I did, and have now looked at it again. As far as I can tell, it seems compatible (and would actually even be compatible with the simpler current Wikidata model).
Cheers, Denny
2013/6/19 Tom Morris tfmorris@gmail.com
If you haven't already, it might be worth looking at the Freebase schema for Wordnet, especially how it connects synsets to Freebase topics:
https://www.freebase.com/base/wordnet/synset?schema=
Tom
On 19.06.2013 15:57, Denny Vrandečić wrote:
http://www.wikidata.org/wiki/Wikidata:Wiktionary
To the best of our knowledge, we have checked all discussions on this topic, and also related work like OmegaWiki, Wordnet, etc., and are building on top of that.
I would like to point out that the "meaning" mentioned in the draft is really the "word sense": it is not an abstract Platonic concept, but something very concrete, bound to a specific word. Perhaps we should change our internal terminology there.
-- daniel
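To make the distinction above a bit more tangible, here is a minimal sketch in Python. It is not taken from the proposal: the class names (Item, Sense, Word), the helper senses_for_item and the identifier "X1" are all made up for illustration. The only point is that a sense belongs to exactly one word, while several senses in different languages may point to the same language-independent item.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Item:
    """A language-independent concept, like the ones data items describe."""
    item_id: str  # hypothetical identifier, not a real Wikidata ID
    label: str


@dataclass
class Sense:
    """A word sense: bound to one specific word, optionally linked to an item."""
    gloss: str
    item: Optional[Item] = None


@dataclass
class Word:
    """A word in one language together with its own list of senses."""
    language: str
    text: str
    senses: List[Sense] = field(default_factory=list)


def senses_for_item(words: List[Word], item: Item) -> List[Tuple[str, str]]:
    """Return (word text, gloss) pairs for every sense linked to `item`."""
    return [(w.text, s.gloss) for w in words for s in w.senses if s.item == item]


# Illustrative data only; the glosses are invented for this sketch.
dog = Item("X1", "dog")
words = [
    Word("en", "dog", [
        Sense("a domesticated canine", dog),
        Sense("to follow persistently"),  # a sense with no matching item here
    ]),
    Word("de", "Hund", [Sense("Haustier aus der Familie der Hunde", dog)]),
]
pairs = senses_for_item(words, dog)
```

Under these assumptions, the English verb sense of "dog" stays attached to the word without needing an item of its own, which is roughly the difference between a word sense and an abstract concept.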
Hi Denny,
Thank you very much for this fantastic update about the intentions of supporting a semantic dictionary in Wikidata :) Just a minor correction: I think instead of "word", it should be "expression" because some languages don't follow the same logic.
On the other hand, do you think it would be possible to accommodate grammar rules too? I have added some people from Apertium that might have some insights about it.
Cheers, Micru
The current proposal does not cover grammar rules explicitly. If at all, I would regard that as a later extension once the lexical information is in place. Also, my limited understanding of the topic does not allow me to come up with a data model for grammar rules, or even to know whether there are sufficiently widely accepted models for representing grammar, or whether there are still discussions about whether Chomsky or Systemic Functional Grammars or whatever else would make the cut...
Regarding "word" vs. "expression": I do not care much about the actual term, and both seem valid. With the suggested change from "meaning" to "word sense", though, it might make more sense to keep "word" here. But as I said, I have no strong opinion. I definitely see that saying that http://en.wiktionary.org/wiki/carry_coals_to_Newcastle is a "word" is kind of weird. "Expression" would fix that.
Any further opinions?
Cheers, Denny
Thinking about it again, and discussing it internally, maybe we should replace "word" with "expression" and "meaning" with "sense"?
Any +1's or differing opinions?
I have made this change.
Hoi,
Denny, when you look at the data currently in Wikidata, you find what is in essence more than a basis for a translation dictionary.
The notion that we need something separate is one you should reassess. What we need is some clean-up of the labels currently in use. What we also need are more definitions. We do not need another Wikidata for Wiktionary.
Thanks, GerarM
The intention was never to create a Wiktionary database separate from Wikidata, but to have it be a part of Wikidata.
We do need another Wikidata, only a separate namespace for "items" (words) and some separate properties.
JAnD
2013/6/21 Gerard Meijssen gerard.meijssen@gmail.com: We do not need another Wikidata for Wiktionary
Thanks, GerarM
Did you mean to say "We do *not* need another Wikidata"? Otherwise I am confused by your comment.
On 21.06.2013 14:44, Gerard Meijssen wrote:
Hoi,
Denny, when you look at the data currently in Wikidata, you find what is in essence more than a basis for a translation dictionary.
I would say it's excellent as a thesaurus which can be used for cross-lingual tagging, named entity recognition, topic detection, and similar NLP tasks.
For literature-grade translations, that is, for a true dictionary, I believe that you need the full range of nuances attached to each word and each word sense, which is distinct from the Platonic concepts described by data items.
-- daniel
On Fri, Jun 21, 2013 at 2:54 PM, Daniel Kinzler <daniel.kinzler@wikimedia.de> wrote:
For literature-grade translations, that is, for a true dictionary, I believe that you need the full range of nuances attached to each word and each word sense, which is distinct from the Platonic concepts described by data items.
Literature-grade translations require knowledge of both cultural domains and context; sometimes there is no correspondence between concepts. This is also an amazing quote:
"Why does a translator need a whole workday to translate five pages, and not an hour or two? ..... About 90% of an average text corresponds to these simple conditions. But unfortunately, there's the other 10%. It's that part that requires six [more] hours of work. There are ambiguities one has to resolve. For instance, the author of the source text, an Australian physician, cited the example of an epidemic which was declared during World War II in a "Japanese prisoner of war camp". Was he talking about an American camp with Japanese prisoners or a Japanese camp with American prisoners? The English has two senses. It's necessary therefore to do research, maybe to the extent of a phone call to Australia." -- Claude Piron, Le défi des langues (The Language Challenge)
However, we have a wonderful situation, because:
1) Wikipedia is not a literary work, so the translation requirements are not that high.
2) It has a lot of users who can manually disambiguate the source text with semantic annotations, and users in the target language who can fill the gaps.
3) There is prior work done in the RBMT open source world, so there is no need to start from scratch.
4) A translation portal for the wiki world already exists and is going to be expanded.
Basically, almost all the blocks needed to create a powerful MT system for WP are already there or waiting to be integrated. What I believe is missing is a model for storing the morphological information from Wiktionary templates in a structured form, so that the data becomes machine-readable and usable. It will require some prior work to create a coherent model based on the expression/sense entity types. Doable with some intense full-time dedication :)
Cheers, Micru
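As a rough illustration of what "machine-readable morphological information" could mean, here is a small Python sketch. Everything in it is an assumption made for the example: the names Expression, Form and find_form, the feature labels, and the idea of attaching a set of grammatical features to each inflected form. It is not the model from the proposal, only one possible shape such data might take.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Form:
    """One inflected form plus the grammatical features it carries."""
    text: str
    features: FrozenSet[str]


@dataclass
class Expression:
    """An expression in one language with its (hypothetical) form table."""
    language: str
    lemma: str
    part_of_speech: str
    forms: List[Form] = field(default_factory=list)

    def find_form(self, *features: str) -> Optional[str]:
        """Return the first form carrying all requested features, else None."""
        wanted = frozenset(features)
        for form in self.forms:
            if wanted <= form.features:
                return form.text
        return None


# Illustrative, deliberately incomplete form table for German "Haus".
haus = Expression("de", "Haus", "noun", [
    Form("Haus", frozenset({"nominative", "singular"})),
    Form("Hauses", frozenset({"genitive", "singular"})),
    Form("Häuser", frozenset({"nominative", "plural"})),
])

genitive_singular = haus.find_form("genitive", "singular")
nominative_plural = haus.find_form("plural", "nominative")
```

A rule-based MT system could then ask for a form by its features instead of parsing template wikitext, which is the kind of use the paragraph above has in mind.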