Unfortunately, each language edition of Wiktionary follows more or less
its own rules.
So if you want to extract data (e.g. translations) from several Wiktionaries,
you have to take into account the formatting rules of each one.
There is no common meta-information.
At least the language codes, I hope, are the same in every Wiktionary project.
I collected these codes from the English and Russian Wiktionaries here (Java
project wikokit):
Best regards,
Andrew Krizhanovsky.
On Fri, Nov 23, 2012 at 9:49 PM, Federico Leva (Nemo) <nemowiki(a)gmail.com> wrote:
Indeed, Wiktionary-l is the list where you are likely to find more help.
Look at the archives; they're mostly discussions of similar problems.
There was also some attempt to merge another similar mailing list and some
effort on DBpedia-like projects, but I don't remember the conclusion.
Nemo
Judit, Ács, 23/11/2012 11:18:
Hi,
I am trying to extract translations from Wiktionaries in different languages.
Currently I use the "All pages, current versions only" dump. Is there a
way to find out the language template tags (is that the correct term?)
for each Wiktionary and each language?
For example:
This is the Hungarian page 'karcsu' (slim, slender)
http://hu.wiktionary.org/wiki/karcs%C3%BA
(edit page:
http://hu.wiktionary.org/w/index.php?title=karcs%C3%BA&action=edit&…
)
The translation table always (?) starts like this:
{{-ford-}}
{{trans-top}}
*{{en}}: {{t|en|slim}}, {{t|en|slender}}
Where {{-ford-}} comes from the word forditas ("translation" in Hungarian;
I skipped the accents). The translations look like the 3rd row and
(hopefully) contain the other languages' wiki codes (en, fr, de).
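
A minimal sketch of how the translation table described above could be read
from the page wikitext. It assumes only what is shown in the example: the
section starts at {{-ford-}} and each translation is written as
{{t|<language code>|<word>}}. The function name extract_translations and the
rule that the section ends at the next {{-...-}} marker are assumptions made
for illustration, not documented Wiktionary conventions.

```python
import re

# {{t|<lang code>|<word>}}; any further parameters are ignored.
T_TEMPLATE = re.compile(r"\{\{t\|([^|{}]+)\|([^|{}]+)(?:\|[^{}]*)?\}\}")
# Assumption: other sections are introduced by markers shaped like {{-xxx-}}.
SECTION = re.compile(r"^\{\{-[^{}]+-\}\}$")


def extract_translations(wikitext, marker="{{-ford-}}"):
    """Return {language code: [translations]} found after `marker`."""
    translations = {}
    in_section = False
    for raw in wikitext.splitlines():
        line = raw.strip()
        if line == marker:
            in_section = True
            continue
        if in_section and SECTION.match(line):
            break  # a different section begins (assumed convention)
        if in_section:
            for lang, word in T_TEMPLATE.findall(line):
                translations.setdefault(lang.strip(), []).append(word.strip())
    return translations


sample = "{{-ford-}}\n{{trans-top}}\n*{{en}}: {{t|en|slim}}, {{t|en|slender}}\n"
print(extract_translations(sample))
```

For another Wiktionary edition the marker and the template name would have to
be replaced by that edition's own conventions.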
Also on the page 'slim' in the Hungarian Wiktionary there are some tags
which nobody would understand unless they are Hungarian and they have
learned some Hungarian grammar.
http://hu.wiktionary.org/wiki/slim and
http://hu.wiktionary.org/w/index.php?title=slim&action=edit
The first line is:
{{engmell|comp=slimmer|sup=slimmest|pron=/slɪm/|audio=us}}
Where 'engmell' is derived from 'english melleknev', melleknev meaning
adjective in Hungarian. The rest is similarly confusing.
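
Even when the template name is opaque, its parameters can be split
mechanically. The following is a rough sketch, assuming a flat template call
with no nested templates or links containing "|"; parse_template is a
hypothetical helper name, not part of any Wiktionary tool.

```python
def parse_template(template):
    """Split a flat {{name|a|key=value}} call into (name, positional, named)."""
    text = template.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call: %r" % template)
    parts = text[2:-2].split("|")
    name = parts[0].strip()
    positional, named = [], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:
            named[key.strip()] = value.strip()
        else:
            positional.append(part.strip())
    return name, positional, named


name, positional, named = parse_template(
    "{{engmell|comp=slimmer|sup=slimmest|pron=/slɪm/|audio=us}}"
)
print(name, named["comp"], named["sup"])
```

This only separates the pieces; what 'comp' or 'sup' mean still has to be
looked up in the template documentation of the Wiktionary in question.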
It gets even more confusing if I look at other Wiktionaries. It seems
that there are no standards that all Wiktionaries follow.
Is this meta-information available somewhere?
I hope I managed to explain it clearly and I am asking on the right list.
Thank you in advance,
Judit Acs
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l