Indeed, Wiktionary-l is the list where you are more likely to find help. Look at the archives; they are mostly discussions of similar problems. There was also an attempt to merge another, similar mailing list and some work on DBpedia-like projects, but I don't remember the outcome.
Nemo
Judit, Ács, 23/11/2012 11:18:
Hi,
I am trying to extract translations from Wiktionaries in different languages. Currently I use the "All pages, current versions only" dump. Is there a way to find out the language template tags (is that the correct term?) for each Wiktionary and each language?
For example: This is the Hungarian page 'karcsu' (slim, slender) http://hu.wiktionary.org/wiki/karcs%C3%BA (the edit page: http://hu.wiktionary.org/w/index.php?title=karcs%C3%BA&action=edit). The translation table always (?) starts like this:
{{-ford-}}
{{trans-top}}
*{{en}}: {{t|en|slim}}, {{t|en|slender}}
Where {{-ford-}} comes from the word forditas (translation in Hungarian, I skipped the accents). The translations look like the 3rd row and (hopefully) contain the other languages wiki codes (en, fr, de).
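To make the pattern above concrete, here is a minimal Python sketch that pulls (language code, word) pairs out of {{t|..|..}} templates. It assumes the templates are flat (no nested templates or links containing pipes) and that any parameters after the word can be ignored; the names T_TEMPLATE and extract_translations are made up for illustration and do not come from any existing tool.

```python
import re

# Matches {{t|<lang>|<word>}} and captures the language code and the word.
# Assumption: extra parameters after the word (if any) are ignored, and
# templates are not nested.
T_TEMPLATE = re.compile(r"\{\{t\|([^|{}]+)\|([^|{}]+)(?:\|[^{}]*)?\}\}")


def extract_translations(wikitext):
    """Return (language_code, word) pairs found in {{t|..|..}} templates."""
    return [(lang.strip(), word.strip())
            for lang, word in T_TEMPLATE.findall(wikitext)]


sample = "{{-ford-}} {{trans-top}} *{{en}}: {{t|en|slim}}, {{t|en|slender}}"
print(extract_translations(sample))
# [('en', 'slim'), ('en', 'slender')]
```

Note that {{trans-top}} is not matched, because the pattern requires a pipe right after the template name "t".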
Also on the page 'slim' in the Hungarian Wiktionary there are some tags which nobody would understand unless they are Hungarian and they have learned some Hungarian grammar. http://hu.wiktionary.org/wiki/slim and http://hu.wiktionary.org/w/index.php?title=slim&action=edit The first line is: {{engmell|comp=slimmer|sup=slimmest|pron=/slɪm/|audio=us}}
Where 'engmell' is derived from 'english melleknev', melleknev meaning adjective in Hungarian. The rest is similarly confusing.
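Even when the template name is opaque, the parameters can still be read mechanically. The sketch below splits a flat template such as the {{engmell|...}} line into its name, positional parameters and named parameters. It is only an illustration under the assumption that the template contains no nested templates or piped links; parse_template is a hypothetical helper, not part of any Wiktionary tooling.

```python
def parse_template(text):
    """Split a flat '{{name|a|k=v}}' template into (name, positional, named)."""
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template: %r" % text)
    # Assumption: no nested templates or [[link|label]] pipes inside.
    parts = text[2:-2].split("|")
    name = parts[0].strip()
    positional, named = [], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:
            named[key.strip()] = value.strip()
        else:
            positional.append(part.strip())
    return name, positional, named


name, positional, named = parse_template(
    "{{engmell|comp=slimmer|sup=slimmest|pron=/slɪm/|audio=us}}"
)
print(name, positional, named)
```

What 'comp', 'sup' and the template name itself mean still has to be looked up per Wiktionary; the parser only recovers the structure.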
It gets even more confusing if I look at other Wiktionaries. It seems that there are no standards that all Wiktionaries follow.
Is this meta-information available somewhere?
I hope I managed to explain it clearly and I am asking on the right list.
Thank you in advance, Judit Acs
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
Unfortunately, each language edition of Wiktionary has its own, more or less different, rules. So if you want to extract data (e.g. translations) from several Wiktionaries, you have to take into account the formatting rules of each one.
So there is no common meta-information. At least the language codes, I hope, are the same in each Wiktionary project. I collected these codes from the English and Russian Wiktionaries here (Java project wikokit): http://code.google.com/searchframe#nnQwlFITwiU/trunk/common_wiki/src/wikiped...
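One way to live with per-edition rules is to keep them in a small table keyed by the Wiktionary's language code. The following Python sketch is only an illustration of that idea, not how wikokit does it. The "hu" entry uses the {{-ford-}} marker from the original question; the "en" entry (a "====Translations====" heading and {{t|..}} / {{t+|..}} templates) is an assumption that should be checked against the real pages. EditionRules, RULES and extract are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EditionRules:
    section_marker: str        # text that opens the translation section
    entry_pattern: re.Pattern  # regex capturing (language code, word)


# Flat {{t|lang|word}} or {{t+|lang|word}}; later parameters are ignored.
T_PATTERN = re.compile(r"\{\{t\+?\|([^|{}]+)\|([^|{}]+)(?:\|[^{}]*)?\}\}")

RULES = {
    # From the example in this thread.
    "hu": EditionRules("{{-ford-}}", T_PATTERN),
    # Assumption, not verified here: check against en.wiktionary pages.
    "en": EditionRules("====Translations====", T_PATTERN),
}


def extract(edition, wikitext):
    """Return (language_code, word) pairs using the rules of one edition."""
    rules = RULES.get(edition)
    if rules is None:
        raise KeyError("no rules defined for edition %r" % edition)
    start = wikitext.find(rules.section_marker)
    if start == -1:
        return []
    # Simplification: scan from the marker to the end of the page text.
    section = wikitext[start + len(rules.section_marker):]
    return [(lang.strip(), word.strip())
            for lang, word in rules.entry_pattern.findall(section)]


page = "{{-ford-}} {{trans-top}} *{{en}}: {{t|en|slim}}, {{t|en|slender}}"
print(extract("hu", page))
```

Adding another Wiktionary then means adding one entry to RULES after inspecting a few of its pages by hand.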
Best regards, Andrew Krizhanovsky.
Hi,
It looks like the DBpedia Wiktionary is the best fit here. You can have a look at the homepage [1] and an example case [2], and I could provide more info if you'd like.
Cheers, Dimitris
[1] http://wiki.dbpedia.org/Wiktionary
[2] http://wiktionary.dbpedia.org/page/dog