I saw that some people are interested in the SC/S/C/B problem, and I
would ask all of them for some attention.
0. For people who don't know the problem: the Serbian language is
written in two alphabets (Cyrillic and Latin) and it has two standard
variants (Ekavian and Iyekavian). Cyrillic and Latin are not
geographically specific (in Belgrade, Podgorica or Banja Luka you can
find both), but Ekavian and Iyekavian are. There are around 8 million
Ekavian speakers and around 2 million Iyekavian speakers. Ekavian is
used in Serbia (though officially, both standard variants are equal
there), while Iyekavian is used in Republika Srpska (the part of
Bosnia and Herzegovina; officially, both standard variants are equal
there) and in Montenegro (where only the Iyekavian standard is
official). It can be said that all of this could be implemented for
sh:, and that bs: could implement only the Latin<->Cyrillic
conversion.
Zhengzhu has done a lot of work so far, and we are waiting for the
first deployment of his software on sr:. The software is based on his
previous work on the Chinese conversion problem.
1. Zhengzhu would implement the basic part of the software for sr:
(which would be used on sh: too, and maybe on bs:). However, that is
just the beginning of the work, and I think the whole issue will need
help from people (both contributors and developers) who are
interested in linguistics.
2. The first deployment of the software (on sr:) should happen in a
month or two (as far as I know). It assumes:
a) Keeping the sr: policy that articles should be written in Cyrillic,
using Cyrillic-based syntax (in the sense of the source alphabet).
b) Writing in Ekavian and/or a specific syntax for marking
Ekavian-Iyekavian variants. An Ekavian-Iyekavian dictionary would also
be used for automatic conversion, and admins would be able to update
the dictionary.
c) General conversion would work in both directions, but we don't want
to mix Latin, Cyrillic, Ekavian and Iyekavian (it is chaotic,
confusing for the average user, and not standard).
d) All changes are at the read level. Nothing would change at the
write level in MediaWiki.
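Since Serbian Cyrillic maps onto Latin one letter (or digraph) at a
time, the read-level Cyrillic->Latin direction can be sketched roughly
as below (a minimal illustration, not Zhengzhu's actual code; the
reverse direction is harder, because Latin digraphs such as "lj" and
"nj" are ambiguous and need a dictionary of exceptions):

```python
# Minimal one-way Serbian Cyrillic -> Latin transliteration sketch.
# The letter mapping itself is standard; everything else is illustrative.
CYR2LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}
# Capitals: digraphs are title-cased ("Љ" -> "Lj", "Џ" -> "Dž").
CYR2LAT.update({k.upper(): v.capitalize() for k, v in CYR2LAT.items()})

def cyr_to_lat(text):
    # Unknown characters (punctuation, digits) pass through unchanged.
    return "".join(CYR2LAT.get(ch, ch) for ch in text)
```

For example, cyr_to_lat("Београд") gives "Beograd".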
3. It can be said that the "classic" deployment of Zhengzhu's software
would be the next step, and (as I see it) it would be finished within
the next couple of months. It assumes:
a) The possibility of writing in different alphabets and variants.
b) Conversion would be implemented at both the write and the read
level. The database would be stored in Ekavian Cyrillic with markup;
when a contributor writes something in Iyekavian or in Ekavian Latin,
it would be converted into Ekavian Cyrillic.
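The Ekavian<->Iyekavian side cannot be done letter by letter; it needs
the word dictionary mentioned in 2b. A toy sketch of that lookup (the
three word pairs are real Ekavian/Iyekavian pairs, but the function
and its shape are my own illustration, not the actual software, which
would also have to handle inflection):

```python
# Hypothetical admin-maintained Ekavian -> Iyekavian word dictionary.
EK2IJE = {
    "mleko": "mlijeko",
    "reka": "rijeka",
    "lepo": "lijepo",
}

def ekavian_to_iyekavian(text):
    # Words missing from the dictionary pass through unchanged,
    # which is why admins need the ability to extend it.
    return " ".join(EK2IJE.get(w, w) for w in text.split())
```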
4. The next step is the Serbo-Croatian Wikipedia, where more complex
(but more linguistically interesting) rules should be added.
I think that almost everyone on the lists knows that the Serbian,
Croatian, Bosnian and Serbo-Croatian standards have minimal linguistic
differences; most of the differences are cultural and political. So we
should be very careful with any decision related to this problem.
To be clear, sr:, hr: and bs: should never be forced to become one
Wikipedia.
But we can work on sh: with a lot of care.
First of all, an extended version of Zhengzhu's software should be
deployed on sh:, one which takes care of the different standard
variants (four Serbian, two Bosnian and one Croatian).
The less complex part is implementing S<->C<->B dictionaries. The more
complex part is starting to work on syntactic (and maybe stylistic)
differences. That step assumes we would need help from people educated
in linguistics.
Also, the database should not stay in Ekavian Cyrillic (as an
exclusively Serbian standard). We should devise some kind of
meta-alphabet and meta-orthography for writing data into the database.
And the last problem I noticed is naming conventions. Would they be in
Latin? In the Serbian variant? In Iyekavian? In...? This set of
problems means we need to make good political decisions.
It is not good to let any one group dominate. We can say that most
Serbs, Croats, Bosniaks and Montenegrins write in the Latin alphabet
(around 50% of Serbs and Montenegrins, 90% of Bosniaks and 100% of
Croats), but it would be very bad to implement sh: interwiki links
etc. in the Latin alphabet, because around 1/3 of speakers would see
it as domination by the majority. It can be said that maybe 60% of all
speakers are Ekavian, but all Croats, Bosniaks and Montenegrins are
Iyekavian. Language policy in the former Yugoslavia failed on the
principle of "Ekavian, Latin-alphabet Serbo-Croatian for all people in
Yugoslavia" (note that Slovenians and Macedonians have different
languages!). Only the military partially implemented that principle.
5. When it comes to linguistics and technical implementation, I have a
clear solution: any cultural/political problem which can be solved in
those terms can be handled easily.
For example, we can name Serbo-Croatian after its linguistic base,
Shtokavian; even the two-letter ISO code (sh) remains correct :) We
have a lot of naming problems if we want to name the language
correctly: the correct name in English translation is "Serbo-Croatian,
Croato-Serbian, Croatian or Serbian, Serbian or Croatian" (because the
Serbian construction was "Serbo-Croatian" and the Croatian
construction was "Croatian or Serbian"). But where are the Bosniaks
and Montenegrins in that name?
I wanted to say that we can use small clever tricks for a number of
these problems, but there is a big field of other cultural and
political problems, and if people here think we are strong enough to
work on those problems, I will need a lot of help.
I think the first step toward a solution is to form a workgroup of
Wikipedians who are interested in working on these problems. The focus
of that group should not be any (N)POV question, nor the question of
whether sr:, hr: and bs: should exist at all, but only building a
solution that allows people from sr:, hr: and bs: to work together.
Pardon the question, but whom can I email at Yahoo to thank them for the
server donations to Wikipedia? I can't find contact info on Yahoo's site.
Thank you.
> An "out of curiosity" question: Those look like db servers to me. Does
> that mean wikimedia is moving to distributed servers? I was vaguely under the
> impression the dbs were going to be kept in Florida, and just Squids
> distributed.
Well, we try to do things better. It might turn out to be a distributed content cluster.
Domas
Hi there...
Does anybody know whether I can use the images listed in the
en_image_table.sql dump from http://download.wikimedia.org for my own
wiki site, or whether that may be illegal outside Wikipedia?
Thanks
Sergey
I noticed the discussion about "New syntax".
My idea on this subject, which I hope is useful, is:
We could have a default "special characters" set, plus an alternative
"special characters" set defined for any table row where the default
set interferes with the row's content.
It would be necessary to define a list of character-set functionality:
code and definition.
When necessary, the alternative "special characters" for a row could
be stored in one column as a string of characters (not used in the
text), where the position in the string is the functionality code.
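If I understand the idea, each row could carry its own delimiter
string, and the parser would read each position of that string as one
function. A rough sketch under that reading (all names and the
position assignments are invented for illustration: position 0 is the
cell separator, position 1 the escape character):

```python
# Default "special characters": position 0 = separator, 1 = escape.
DEFAULT_DELIMS = "|\\"

def split_row(text, delims=DEFAULT_DELIMS):
    # A row whose content clashes with the defaults supplies its own
    # delims string; positions keep the same meaning.
    sep, esc = delims[0], delims[1]
    cells, cur, i = [], "", 0
    while i < len(text):
        ch = text[i]
        if ch == esc and i + 1 < len(text):
            cur += text[i + 1]   # escaped character taken literally
            i += 2
        elif ch == sep:
            cells.append(cur)
            cur = ""
            i += 1
        else:
            cur += ch
            i += 1
    cells.append(cur)
    return cells
```

For example, split_row("a;b|c", ";\\") treats ";" as the separator, so
the "|" in the content passes through literally.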
mm
From our contact at Yahoo. These are HP boxes. If for some reason we
prefer FC over RHEL, I should tell him quickly.
These have been ordered. I will ask about the eta.
---------
3x DL385, dual CPU, 8GB RAM and 6 x 146GB 15k RAID 10. These have capacity
for external shelves, can be added if needed.
20x DL140, dual CPU, 4GB RAM and 1 x 80GB HDD
My plan is to set these up with some linux image, probably RHEL4, and
then your folks can re-jump them into whatever you like. I'll be
available to help with that process, if need be.
These machines will have boot boxes and serial consoles.
I have been on this list a while; when I originally joined I was
interested in the possibility of exporting the Wiktionary data in
.dict format. Now that the newest version of OS X (10.4) has a
built-in dictionary that uses dict:// to look up words, I was
interested to see if anyone on the technical side would like to
explore the possibility of either exporting the Wiktionary database in
.dict format, or running a dictionary daemon that would access the
Wiktionary database server and return dict entries. It would be
read-only, but it would be another interesting way to access
Wiktionary besides the web interface.
Does anyone on the tech list know if this is even possible? I'm not
asking you to do it (I can write the export); I was wondering whether
there is some sort of database schema available for extracting the
data into dict format, or whether the entries are too fragmented to
even attempt an export.
-brian
--
brian suda
http://suda.co.uk
To whom it may concern (lovers of Enotif):
I have successfully updated my version, which is now fully based on
beta1; see [1] for the download.
Remarks for Tim only:
With this you now have a fully working reference version based on
today's MediaWiki, including an improved updaters.inc with a new
function ChangeFields (for database fields which need to be changed or
dropped; this streamlines the code therein).
[1] has everything in it, including enotif patchlet #1, which
obsoletes UserTalkUpdate.php and reestablishes newtalk (which you also
did, fine), as mentioned yesterday.
Both together make the recently introduced $wgUseEnotif redundant, so
you will not find your recent changes in my version. I point you to
this version as a reference only, not to ask for anything else. I also
like your clever "late" watchlist notification-timestamp updating in
one go; however, I have commented it out for a while in the 3.29
version, until I have adapted my other routines to use "oldid" instead
of timestamp comparisons.
[1]
http://sourceforge.net/project/showfiles.php?group_id=138202&package_id=155…
[2] http://meta.wikipedia.org/wiki/Enotif (documentation + screenshots)
Tom