Rodolfo Raya wrote:
On 7/14/05, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hi,
I have started anew on an ERD for the UW.
Have you considered designing an XCS template to use with TBX exports?
You may want to take a look at the default template included in the TBX specs and remove the fields that you consider irrelevant. Once you define the template, it would be easier to identify the requirements of the application and draw a UML diagram or an ERD.
Regards, Rodolfo
Hoi, The first thing I need to do is to make sure that we can host the data in the database. To do that there are several requirements that I have to allow for:
- The user interface must be in the language indicated by the user.
- Ultimate Wiktionary is to use its own dog food or, if some terminology required in the UI is not there, it must be possible to add it within Ultimate Wiktionary.
- It must be possible to host all words of all languages in this database.
- We need the cooperation of many people to make it all possible, so I need to acknowledge glossaries and thesauri that we are given to host within UW.
- Users must be able to select the languages that they want to see.
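The requirements above can be sketched as a small in-memory model. This is only an illustration and not the ERD itself: the names `Source`, `Expression`, `Dictionary`, and their methods are hypothetical, and falling back to English for a missing UI label is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Source:
    """A donated glossary or thesaurus that must be acknowledged (hypothetical)."""
    name: str


@dataclass
class Expression:
    """One word or phrase in one language."""
    spelling: str
    language: str                    # language code such as "nl" or "it"
    source: Optional[Source] = None  # None: entered directly by a user


@dataclass
class Dictionary:
    expressions: list = field(default_factory=list)
    # UI terminology lives in the same store: key -> {language: text}
    ui_terms: dict = field(default_factory=dict)

    def add(self, expression):
        self.expressions.append(expression)

    def visible_to(self, languages):
        """Return only the expressions in the languages a user selected."""
        return [e for e in self.expressions if e.language in languages]

    def add_ui_term(self, key, language, text):
        """Add missing UI terminology from within the dictionary itself."""
        self.ui_terms.setdefault(key, {})[language] = text

    def ui_label(self, key, language, fallback="en"):
        """Look up a UI label; the English fallback is an assumption."""
        translations = self.ui_terms.get(key, {})
        return translations.get(language) or translations.get(fallback) or key

    def acknowledgements(self):
        """List the names of all sources that contributed content."""
        return sorted({e.source.name for e in self.expressions if e.source})
```

A label that has no translation yet comes back as its bare key, which marks it as terminology that still has to be added with `add_ui_term`.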
The budget that we have to create UW is minimal. We will be happy if we can get it to work and host the data within a limited amount of time.
We have considered TBX. There are, however, TWO crucial things to settle before we can consider exporting to TBX: the first is IMPORT, the second is an analysis of what the export should be for. If the Dutch content is 222,000+ words to start off with and everyone starts hitting our servers simply because they can, that calls for some optimisation. If the export only needs to cover the changed content, it is different again. When the need is associated with the projected reference implementation, it makes sense to consider how we will ask translators to help with the content of the UW.
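To make the "only the changed content" case concrete, here is a minimal sketch of an incremental export. It assumes each entry carries a last-modified timestamp, which is not something the design states. The entry layout and the function name `export_changed` are hypothetical. The element names (`martif`, `text`, `body`, `termEntry`, `langSet`, `tig`, `term`) follow the basic TBX structure, but the output omits the required `martifHeader` and has not been validated against the TBX DTD or any XCS template.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# ElementTree serialises this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def export_changed(entries, since):
    """Serialise entries modified after `since` as simplified TBX-like XML.

    `entries` is a list of dicts with the hypothetical keys
    "id" (str), "modified" (datetime) and "terms" ({language: term}).
    """
    martif = ET.Element("martif", {"type": "TBX"})
    body = ET.SubElement(ET.SubElement(martif, "text"), "body")
    for entry in entries:
        if entry["modified"] <= since:
            continue  # unchanged since the last export, so skip it
        term_entry = ET.SubElement(body, "termEntry", {"id": entry["id"]})
        for language, term in sorted(entry["terms"].items()):
            lang_set = ET.SubElement(term_entry, "langSet", {XML_LANG: language})
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = term
    return ET.tostring(martif, encoding="unicode")


entries = [
    {"id": "e1", "modified": datetime(2005, 7, 1), "terms": {"nl": "huis"}},
    {"id": "e2", "modified": datetime(2005, 7, 14),
     "terms": {"it": "casa", "en": "house"}},
]
print(export_changed(entries, since=datetime(2005, 7, 10)))
```

A full export is the same call with a `since` value earlier than every entry, so the load question comes down to how often clients ask for which variant.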
Sabine, for instance, already adds content to the Italian Wiktionary based on her translations. When we ask for this as part of the reference implementation for a translation glossary, we want to encourage translators to work on the content the way Sabine does; finding the right way is crucial. When we start with UW, it may be important to have some Quality Assurance measures built in. This is also very much intrinsic to the database design, and that is what is currently being worked on. Currently some 17 tables make up the ERD, and more are needed.
Over the last year I have had many conversations about what should be in an Ultimate Wiktionary. Now I am trying to integrate it all. I have posted the design to uw-creations@googlegroups.com and at http://commons.wikimedia.org/wiki/Image:ERD.jpg
As you will understand, it is a working document, and if you have questions or suggestions, do not hesitate to discuss them with me and with others.
Thanks, GerardM
Hi Gerard,
They can say what they want about Dutch people, but they can't say you aren't persistent! I'm looking at the ERD. The first remark I have is that English people won't understand what wordtype means. The term they use is 'part of speech' or POS.
Keep up the good work. I'll send more remarks later, when I find something worth mentioning.
Polyglot
wiktionary-l@lists.wikimedia.org