I see that some people are interested in the SC/S/C/B (Serbo-Croatian/Serbian/Croatian/Bosnian) issue, and I would like to ask all of them for some attention.
0. For those who don't know the background: Serbian is written in two alphabets (Cyrillic and Latin) and has two standard variants (Ekavian and Iyekavian). The alphabets are not geographically specific (you can find both in Belgrade, Podgorica or Banja Luka), but the variants are. There are around 8 million Ekavian speakers and around 2 million Iyekavian speakers. Ekavian is used in Serbia (although officially both standard variants are equal there); Iyekavian is used in Republika Srpska (a part of Bosnia and Herzegovina, where both standard variants are officially equal) and in Montenegro (where only the Iyekavian standard is official). All of this could be implemented for sh:, while bs: would need only Latin<->Cyrillic conversion.
Zhengzhu has done a lot of work so far, and we are waiting for the first implementation of his software on sr:. The software is based on his earlier work on the same kind of problem for Chinese.
1. Zhengzhu will implement the basic part of the software for sr: (it would also be used on sh:, and maybe on bs:). However, that is just the beginning of the work, and I think the whole issue will need help from people (both contributors and developers) who are interested in linguistics.
2. The first implementation of the software (on sr:) should be ready in a month or two, as far as I know. It assumes:
a) Keeping the sr: policy that articles are written in Cyrillic, with Cyrillic-based syntax (that is, Cyrillic is the starting alphabet).
b) Writing in Ekavian and/or using a specific syntax for marking Ekavian-Iyekavian variants. An Ekavian-Iyekavian dictionary would also be used for automatic conversion, and admins would be able to update the dictionary.
c) General conversion would work in both directions, but we don't want to mix Latin, Cyrillic, Ekavian and Iyekavian (that is chaotic, confusing for the average user, and not standard).
d) All changes are at the read level. Nothing would change at the write level in MediaWiki.
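The read-level idea in point 2 can be sketched roughly as follows. This is only an illustration and not Zhengzhu's actual code: the function names, the "cyrl"/"latn" and "ek"/"ijek" codes and the three dictionary entries are all made up for the example, and the special markup for hand-marked variants is not modelled. The stored text stays in Ekavian Cyrillic and is converted only when it is displayed.

```python
import re

# Serbian Cyrillic -> Latin, one letter at a time (three letters become digraphs).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}

# Hypothetical admin-maintained dictionary; only three sample entries.
EKAVIAN_TO_IYEKAVIAN = {"млеко": "млијеко", "река": "ријека", "песма": "пјесма"}

WORD = re.compile(r"\w+")


def to_latin(text):
    """Transliterate Cyrillic to Latin; other characters pass through."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in CYR_TO_LAT:
            lat = CYR_TO_LAT[lower]
            # An uppercase source letter gives "Lj", "Nj", "Dž" for digraphs.
            out.append(lat.capitalize() if ch != lower else lat)
        else:
            out.append(ch)
    return "".join(out)


def to_iyekavian(text, dictionary=EKAVIAN_TO_IYEKAVIAN):
    """Replace whole words found in the dictionary, keeping initial capitals."""
    def swap(match):
        word = match.group(0)
        repl = dictionary.get(word.lower())
        if repl is None:
            return word
        return repl.capitalize() if word[0].isupper() else repl
    return WORD.sub(swap, text)


def render(stored, script="cyrl", variant="ek"):
    """Convert stored Ekavian Cyrillic text for display only."""
    text = to_iyekavian(stored) if variant == "ijek" else stored
    return to_latin(text) if script == "latn" else text


print(render("Млеко и река", script="latn", variant="ijek"))
```

A whole-word dictionary like this covers only the forms listed in it; inflected forms would need their own entries or real morphological rules.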
3. The "classic" implementation of Zhengzhu's software would be the next step, and I think it would be finished within the following couple of months. It assumes:
a) The possibility of writing in different alphabets and variants.
b) Conversion would be implemented at both the write and the read level. The database would be stored in Ekavian Cyrillic with markup; when a contributor writes something in Iyekavian or in Ekavian Latin, it would be converted into Ekavian Cyrillic.
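The write-level step in point 3 could look something like the sketch below. Again, this is an assumption-laden illustration rather than the planned implementation: `normalize_for_storage` and the two dictionary entries are invented names and data, and wiki markup, URLs and foreign words are not protected from conversion here.

```python
import re

# Latin -> Serbian Cyrillic. Digraphs are tried before single letters.
LAT_TO_CYR = {
    "lj": "љ", "nj": "њ", "dž": "џ",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "đ": "ђ", "e": "е",
    "ž": "ж", "z": "з", "i": "и", "j": "ј", "k": "к", "l": "л", "m": "м",
    "n": "н", "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "ć": "ћ",
    "u": "у", "f": "ф", "h": "х", "c": "ц", "č": "ч", "š": "ш",
}

# Hypothetical dictionary, reverse direction of the read-level one.
IYEKAVIAN_TO_EKAVIAN = {"млијеко": "млеко", "ријека": "река"}

WORD = re.compile(r"\w+")


def to_cyrillic(text):
    """Greedy transliteration: try a two-letter chunk, then a single letter."""
    out, i = [], 0
    while i < len(text):
        for size in (2, 1):
            chunk = text[i:i + size]
            cyr = LAT_TO_CYR.get(chunk.lower())
            if cyr is not None and len(chunk) == size:
                out.append(cyr.upper() if chunk[0].isupper() else cyr)
                i += size
                break
        else:
            # Not a Serbian Latin letter (space, digit, Cyrillic, ...): keep it.
            out.append(text[i])
            i += 1
    return "".join(out)


def to_ekavian(text):
    """Replace whole Iyekavian words found in the dictionary."""
    def swap(match):
        word = match.group(0)
        repl = IYEKAVIAN_TO_EKAVIAN.get(word.lower())
        if repl is None:
            return word
        return repl.capitalize() if word[0].isupper() else repl
    return WORD.sub(swap, text)


def normalize_for_storage(submitted):
    """Bring any submitted text to the stored form: Ekavian Cyrillic."""
    return to_ekavian(to_cyrillic(submitted))


print(normalize_for_storage("Mlijeko i rijeka"))
```

The greedy digraph rule is wrong for the few words where the two letters are separate sounds (for example "injekcija" or "nadživeti"), so a real implementation would need an exception list or explicit markup for those cases.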
4. The next step is the Serbo-Croatian Wikipedia, where more complex (but linguistically more interesting) rules would have to be added.
I think almost everyone on the lists knows that the Serbian, Croatian, Bosnian and Serbo-Croatian standards have minimal linguistic differences. Most of the differences are cultural and political. So we should be very careful with any decision related to this problem. In particular, sr:, hr: and bs: should never be forced to merge into one Wikipedia.
But we can work on sh:, with a lot of care.
First of all, an extended version of Zhengzhu's software should be implemented on sh:; it would handle the different standard variants (four Serbian, two Bosnian and one Croatian).
The less complex part is implementing S<->C<->B dictionaries. The more complex part is starting to work on syntactic (and maybe stylistic) differences. For that step we would need help from people educated in linguistics.
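To show why the dictionary part is the simpler one, here is a minimal sketch of an S<->C<->B lookup. The table layout, the function names and the three sample rows are only assumptions for illustration ("sr" here means Ekavian Serbian, and everything is in Latin to keep the script question out of it). It handles base forms only and ignores inflection, capitalization and syntax.

```python
# One row per concept, one form per standard. Sample data only.
LEXICON = [
    {"sr": "hleb", "hr": "kruh", "bs": "hljeb"},
    {"sr": "voz", "hr": "vlak", "bs": "voz"},
    {"sr": "hiljada", "hr": "tisuća", "bs": "hiljada"},
]


def build_index(lexicon):
    """Map standard -> word form -> the row that contains it."""
    index = {}
    for row in lexicon:
        for standard, form in row.items():
            index.setdefault(standard, {})[form] = row
    return index


INDEX = build_index(LEXICON)


def convert_word(word, source, target):
    """Return the target-standard form, or the word unchanged if unknown."""
    row = INDEX.get(source, {}).get(word.lower())
    return row[target] if row else word


def convert(text, source, target):
    """Convert a space-separated text word by word."""
    return " ".join(convert_word(w, source, target) for w in text.split())


print(convert("kruh i vlak", "hr", "sr"))
```

Keeping one row per concept means a new standard is a new column rather than a new pair of dictionaries for every existing standard.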
Also, the database should not stay in Ekavian Cyrillic (an exclusively Serbian standard). We should create some kind of meta-alphabet and meta-orthography for storing data in the database.
The last problem I have noticed is naming conventions. Would names be in Latin? In the Serbian variant? In Iyekavian? In...? This set of problems requires good political solutions.
It is not good to impose anything by majority rule ("majorization"). We can say that most Serbs, Croats, Bosniaks and Montenegrins write in the Latin alphabet (around 50% of Serbs and Montenegrins, 90% of Bosniaks and 100% of Croats), but it would be very bad to implement sh: interwiki links etc. in the Latin alphabet, because around 1/3 of the speakers would see that as majorization. Likewise, maybe 60% of all speakers are Ekavian, but all Croats, Bosniaks and Montenegrins are Iyekavian. Language policy in the former Yugoslavia failed on the principle "Ekavian and Latin Serbo-Croatian for all people in Yugoslavia" (note that Slovenians and Macedonians have different languages!). Only the military partially implemented that principle.
5. When it comes to linguistics and technical implementation, I have clear solutions. Any cultural or political problem that can be solved by those means can be solved easily.
For example, we can name Serbo-Croatian after its linguistic base: Shtokavian; even the two-letter ISO code (sh) fits :) We have a lot of naming problems if we want to name the language correctly: the correct name in English translation is "Serbo-Croatian, Croato-Serbian, Croatian or Serbian, Serbian or Croatian" (because the Serbian construction was "Serbo-Croatian" and the Croatian construction was "Croatian or Serbian"). But where are Bosniaks and Montenegrins in that name?
What I want to say is that we can find clever little tricks for a number of problems, but there is a large field of other cultural and political problems. And if people here think that we are strong enough to work on those problems, I will need a lot of help.
I think the first step toward a solution is to form a workgroup of Wikipedians who are interested in working on those problems. The focus of that group should not be any (N)POV question, nor the question of whether sr:, hr: and bs: should exist, but only finding a solution that allows people from sr:, hr: and bs: to work together.
Although it is not official in any capacity, I think it's unfair to completely exclude Ikavian.
For the uninitiated, Ikavian (confusingly similar in name to both Ekavian and Iyekavian!) is the third variety of Serbo-Croatian, the oft-forgotten outcast stepsister of the Ekavian and Iyekavian variants.
It has no official status anywhere in the world, but then again neither does Sicilian, and we have a Sicilian Wikipedia.
Mark
On 27/06/05, Milos Rancic millosh@gmail.com wrote:
I saw that some people are interested in SC/S/C/B problematic and I would ask all of them for some attention.
Ikavian can be included in the sh:, hr: and bs: Wikipedias (via simple dictionaries and markup). But in the future we will need three new Wikipedias (Kaykavian, Chakavian and Torlakian), because those languages are very different from Shtokavian (which is the base of Serbian, Croatian and Bosnian, and which has the dialects Ekavian, Iyekavian and Ikavian).
However, Ikavian is not a Serbian dialect. Only Croats and Bosniaks use Ikavian. So we can implement it on sh:, hr: and bs:, but not on sr:. (Likewise, Ekavian should not be implemented on hr: and bs:, because Ekavian is not spoken there.)
Mark, are you interested in forming a workgroup for that?
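The per-wiki split described above could be written down as a small table, roughly like this. It is just a sketch: the names `ALLOWED_VARIANTS`, `REFLEXES` and `forms_for` and the two sample rows are assumptions for the example (in Latin for brevity), not part of any existing software.

```python
# Which variants each wiki would offer, following the split described above.
ALLOWED_VARIANTS = {
    "sr": {"ekavian", "iyekavian"},
    "hr": {"iyekavian", "ikavian"},
    "bs": {"iyekavian", "ikavian"},
    "sh": {"ekavian", "iyekavian", "ikavian"},
}

# Sample rows of a three-way dictionary: one word, three reflexes.
REFLEXES = [
    {"ekavian": "mleko", "iyekavian": "mlijeko", "ikavian": "mliko"},
    {"ekavian": "dete", "iyekavian": "dijete", "ikavian": "dite"},
]


def variants_for(wiki):
    """Variants a given wiki offers, in a stable order."""
    return sorted(ALLOWED_VARIANTS[wiki])


def forms_for(wiki, row):
    """Keep only the forms of a dictionary row that the wiki may display."""
    return {variant: row[variant] for variant in variants_for(wiki)}


for wiki in ("sr", "hr", "bs", "sh"):
    print(wiki, forms_for(wiki, REFLEXES[0]))
```

With a table like this the same dictionary can be shared by all four wikis, and each wiki only filters out the variants it does not use.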
On 6/28/05, Mark Williamson node.ue@gmail.com wrote:
Although it is not official in any capacity, I think it's unfair to completely exclude Ikavian.
-- SI HOC LEGERE SCIS NIMIVM ERVDITIONIS HABES QVANTVM MATERIAE MATERIETVR MARMOTA MONAX SI MARMOTA MONAX MATERIAM POSSIT MATERIARI ESTNE VOLVMEN IN TOGA AN SOLVM TIBI LIBET ME VIDERE
wikipedia-l@lists.wikimedia.org