I just committed some changes so each namespace can get its own background
color. The colors are stored in wikiTextEn.php as an array.
For the sake of our eyesight, please help me change the colors to something
nice but still distinguishable!
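The posting doesn't show the array itself, so here is a hypothetical sketch of what the per-namespace color map in wikiTextEn.php might look like (keys and hex values are illustrative, not the committed ones):

```php
<?php
// Hypothetical per-namespace background colors, keyed by namespace
// prefix; the empty key stands for the main article namespace.
$wikiNamespaceColors = array(
    ""          => "#FFFFFF",
    "Talk"      => "#EEEEFF",
    "User"      => "#FFF0E0",
    "Wikipedia" => "#E8FFE8",
);
```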
Magnus
Dear sirs :)
I just committed a change enabling sysops to block IPs from editing. This was
a request from Larry.
Sysops (and only those!) will have a "Block this IP" link next to each IP on
the history page of an article. That seemed like the logical place to put
it, since IP blocking will result from some form of editing.
NOTE: Currently, there is no way to block a logged-in user this way; not a
pressing issue at the moment, though.
Each blocked IP will be added to "Wikipedia:Blocked IPs".
NOTE: That page should be protected, otherwise a troll could just go there
and remove his IP from the list!
NOTE: Currently, the only way to set an IP free again is to manually delete
the line on "Wikipedia:Blocked IPs". There's a timestamp so we can implement
time-based removal later.
Even blocked IPs can go to the edit page, but on pressing "Save", blocked
IPs will get a message ("Your IP has been blocked..."). This will even work
if the troll signs up with a user name after getting the block as an IP.
Sneaky, eh? ;)
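A minimal sketch of the save-time check described above, assuming the block list is fetched as the raw text of "Wikipedia:Blocked IPs" with one "IP timestamp" pair per line (the function and variable names here are made up for illustration; this is not the committed code):

```php
<?php
// Returns true if $ip appears as the first field of any line in the
// block list. Called on "Save" with the requesting IP, even for
// logged-in users, so signing up doesn't evade an IP block.
function isBlockedIP( $ip, $blockListText ) {
    foreach ( explode( "\n", $blockListText ) as $line ) {
        $fields = preg_split( '/\s+/', trim( $line ) );
        if ( $fields[0] === $ip ) {
            return true;
        }
    }
    return false;
}
```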
Let's get Larry his sysop rights back ASAP!
Magnus
(I set the reply-to to wikitech-l(a)nupedia.com, since that's the appropriate
place for technical discussions.)
Tomasz Wegrzanowski wrote:
> Easiest distributed editing architecture:
> There is a main server and other servers.
>
> Every server handles read requests itself.
> On all servers, 'edit this page' links point to the main server.
> The main server sends all changes to all subscribed servers,
> so they are always up to date.
>
> This won't require many design changes, while still
> allowing a reasonable distribution of load.
But, there's no problem with load right now, and I stand ready to
supply whatever hardware we need here. In this way, we don't have to
deal with complex distribution schemes.
--Jimbo
On Wed, 2002-02-20 at 11:33, lcrocker(a)nupedia.com wrote:
> No! No! The text stored in the database is _always_ single-byte
> ISO-8859-1, no exceptions, even for the foreign wikis. Some of
> those ISO-8859-1 characters may spell out HTML entity references
> to Unicode characters outside the set, but the database should not
> know or care about that.
I'm sorry you feel that way, but that is in fact NOT TRUE. Please take a
look at the non-English non-ISO-8859-1 wikipedias sometime.
Hundreds of pages, with correct charset headers:
ISO-8859-2:
http://pl.wikipedia.com/
UTF-8 with a custom conversion function for certain character
sequences:
http://eo.wikipedia.com/
Stubs:
CP-1251
http://ru.wikipedia.com/
Shift-JIS
http://ja.wikipedia.com/
GB-2312 with a few character references thrown in:
http://zh.wikipedia.com/
Not sure which encodings, but certainly not ISO-8859-1:
http://ar.wikipedia.com/
http://he.wikipedia.com/
Now, if you honestly think that people are going to edit text that
consists *entirely* of HTML character entity references, you're
obviously not concerned about anything like "ease of use".
On top of which, the consensus seems to be to not allow &s (and thus
character entities) into page titles, which would effectively require
all page titles to be in ASCIIized roman characters. Can you imagine
this being acceptable on, say, the Chinese wiki if anyone actually used
it?
Gee, maybe someone *would* use it if they could use an appropriate
character set for their language!
> This policy might have to be changed for the Asian wikis if something
> like shift-JIS is universal enough and dealing with HTML entities
> problematic enough to make working with it difficult,
The mind boggles that you might imagine the situation to be otherwise.
> but in that
> case we'll still standardize on one and only one internal character
> representation for that particular wiki. For all others, that
> internal representation (and also the encoding which is served via
> HTTP) is ISO-8859-1.
Bullshit. Ask the Poles if they'd like to convert their wikipedia to
ISO-8859-1 with HTML character entities.
> If you need to "uppercase" words in titles (as our consensus on
> canonization of titles specifies), go ahead and hard-code the
> function to deal with ISO-8859-1.
Gee, that would be great if such a function would do anything at all for
anything other than ISO-8859-1 characters. But, somehow I can't quite
see a function hardcoded to deal with ISO-8859-1 being the slightest bit
useful for anything else.
-- brion vibber (brion @ pobox.com)
I've noticed that the traditional locale-based case conversion functions
(ucfirst(), strtolower(), etc) aren't too reliable for anything but
English. Even when they do work, it's very dependent on the system
configuration, and thus isn't really transparently portable.
So, I've added new case conversion functions ucfirstIntl(),
strtoupperIntl(), and strtolowerIntl() which can more or less properly
convert cases in a system-independent manner. For single-byte character
encodings this is very simple, based on the PHP strtr() function; just
define strings $wikiUpperChars containing all the uppercase characters
and $wikiLowerChars containing all the lowercase chars. (See example for
iso-8859-1 in wikiTextEn.php)
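For reference, a minimal sketch of the single-byte scheme (ASCII only here; the real $wikiUpperChars/$wikiLowerChars strings in wikiTextEn.php would additionally list the ISO-8859-1 accented letters):

```php
<?php
// Parallel strings: the i-th character of one is the case partner of
// the i-th character of the other. strtr() then maps byte for byte.
$wikiUpperChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$wikiLowerChars = "abcdefghijklmnopqrstuvwxyz";

function strtolowerIntl( $str ) {
    global $wikiUpperChars, $wikiLowerChars;
    return strtr( $str, $wikiUpperChars, $wikiLowerChars );
}

function strtoupperIntl( $str ) {
    global $wikiUpperChars, $wikiLowerChars;
    return strtr( $str, $wikiLowerChars, $wikiUpperChars );
}
```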
For multibyte character sets it's a little more complex, using the same
function in an array mode that associates byte sequences. Most multibyte
character sets are for Asian languages which don't have a case
distinction, so it's not likely to come up often except for those using
UTF-8. I've included conversion arrays for UTF-8 in utf8Case.php which
should cover just about everything, so any future 'pedias that may use
UTF-8 need just include that (as does wikiTextEo.php).
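A sketch of that array mode for multibyte text; the two entries below are only a sample of the kind of mapping utf8Case.php provides in full:

```php
<?php
// In array mode strtr() replaces whole byte sequences, so the same
// call handles UTF-8. Sample entries (utf8Case.php covers the full
// range of cased characters):
$utf8ToUpper = array(
    "\xc3\xa9" => "\xc3\x89", // é -> É
    "\xc4\x89" => "\xc4\x88", // ĉ -> Ĉ, as used on the Esperanto wiki
);
echo strtr( "\xc4\x89apelo", $utf8ToUpper ), "\n"; // prints "Ĉapelo"
```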
Also, it should be possible to extend ucfirstIntl() a bit to allow for
multiple-character first letter sequences (for instance treating ij->IJ
as one letter, which I believe is the officially correct behavior for
Dutch).
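That extension might be sketched like this (the ij handling and the ASCII-only tables are illustrative; only ucfirstIntl itself is in CVS):

```php
<?php
// Single-byte ucfirst with a hypothetical multi-character first-letter
// rule: Dutch "ij" capitalizes as a unit to "IJ".
$wikiUpperChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$wikiLowerChars = "abcdefghijklmnopqrstuvwxyz";

function ucfirstIntl( $str ) {
    global $wikiUpperChars, $wikiLowerChars;
    if ( $str === "" ) {
        return $str;
    }
    if ( substr( $str, 0, 2 ) === "ij" ) { // illustrative Dutch rule
        return "IJ" . substr( $str, 2 );
    }
    return strtr( $str[0], $wikiLowerChars, $wikiUpperChars )
        . substr( $str, 1 );
}
```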
-- brion vibber (brion @ pobox.com)
Gentlemen,
I've just committed to CVS the new implementation of the MostWanted page
based on the new tables 'linked' and 'unlinked'. I've left caching out, but
it still takes quite a while to compute; if that turns out to be a problem I
will put the caching back in.
-- Jan Hidders
On Tue, 2002-02-19 at 16:26, Jimmy Wales wrote:
> Brion Vibber wrote:
> > Jimbo, is there any chance we can move the Esperanto wikipedia over to
> > the PHP script soon? I've been promising people we'd be upgrading to the
> > new software (which will fix a number of annoying bugs in the old) for a
> > while, and the natives are getting restless. :)
>
> Yes!
>
> > I'm going to check in a couple more character set and case-conversion
> > fixes tonight, after which we should be ready anytime. At this point any
> > additional problems are only going to be discovered by having real users
> > bang at the real site with real non-English non-ISO-8859-1 text...
>
> How about this -- tomorrow morning, I will install this. Will you be around
> (in email) tomorrow for questions?
Sounds great! If I remember correctly, we are both on Pacific time
(UTC-8), yes?
> Best thing to do -- send me simple step by step instructions,
> including instructions about the conversion script. I'll back
> everything up, run the conversion, install the new software, and
> it'll work perfectly the first try! (ha ha!)
It's relatively simple (famous last words). For everybody's reference:
1. Edit convertWiki2SQL.php to set some options there. Right now it's a
little rough; eventually it may or may not get smoother. Basically,
uncomment the special settings for the target language, and set the
$rootDir variable to point to the "page" subdirectory of the usemod db
that the data is being sucked from.
2. Run "php convertWiki2SQL.php" as a user with write permission to the
wiki directory. This should spit out a bunch of article titles and
create a big file called "newiki.sql" which contains the SQL commands to
insert everything into the database. Note that it's normal to see a few
errors about not being able to open a directory -- that just means
there's a letter of the alphabet that no article titles start with.
3. Create the database. I've been calling my test database "wikieo", but
whatever sounds appropriate should work just as well. Something like:
mysql -e "create database wikieo;"
should do it if I recall correctly. You might also have to specify the
proper username and password; I don't know how you guys have mysql set
up.
4. Initialize the tables and enter the data. Something like:
mysql wikieo < wikipedia.sql
mysql wikieo < newiki.sql
(After this step "newiki.sql" shouldn't be necessary anymore.) I'm also
going to put together a file with SQL commands to fix some articles with
extra uppercase letters in the titles which you can run here:
mysql wikieo < titlefix.sql
5. Edit wikiLocalSettings.php to set the language and database name.
Roughly:
$wikiLanguage = "eo" ;
$wikiSQLServer = "wikieo" ;
and if the hostname isn't being automatically picked up:
$wikiCurrentServer = "http://eo.wikipedia.com" ;
In theory everything should automagically work after that... (Assuming
of course that the apache rewrite rules are set up properly, etc.)
> If it isn't working perfectly right out of the box, then I'll back out
> the change, revert to the Usemod script, and we'll do a dry run in a
> "safer" way with test.wikipedia.com or whatever.
Great, I'll warn the others. :)
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
> Jimbo, is there any chance we can move the Esperanto wikipedia over to
> the PHP script soon? I've been promising people we'd be upgrading to the
> new software (which will fix a number of annoying bugs in the old) for a
> while, and the natives are getting restless. :)
Yes!
> I'm going to check in a couple more character set and case-conversion
> fixes tonight, after which we should be ready anytime. At this point any
> additional problems are only going to be discovered by having real users
> bang at the real site with real non-English non-ISO-8859-1 text...
How about this -- tomorrow morning, I will install this. Will you be around
(in email) tomorrow for questions?
Best thing to do -- send me simple step by step instructions,
including instructions about the conversion script. I'll back
everything up, run the conversion, install the new software, and
it'll work perfectly the first try! (ha ha!)
If it isn't working perfectly right out of the box, then I'll back out
the change, revert to the Usemod script, and we'll do a dry run in a
"safer" way with test.wikipedia.com or whatever.
> There are a few switches at the top of convertWiki2SQL.php for selecting
> language-specific processing options, and wikiLocalSettings.php needs to
> select the proper $wikiLanguage and $wikiSQLServer, but that's about it.
> Other than those two, the PHP source files can be shared 100% between
> various language wikipedias. (I've already included the Esperanto
> message-localization file in the CVS repository, and I assume others
> will be added as they are converted & translated.)
Cool! I don't know anything about the conversion script, though. Jason did it
for the main site, and he's out of town. Oh, wait, he'll be back tomorrow. But
still, if you have information, let me know. :-)
Jimbo, is there any chance we can move the Esperanto wikipedia over to
the PHP script soon? I've been promising people we'd be upgrading to the
new software (which will fix a number of annoying bugs in the old) for a
while, and the natives are getting restless. :)
I'm going to check in a couple more character set and case-conversion
fixes tonight, after which we should be ready anytime. At this point any
additional problems are only going to be discovered by having real users
bang at the real site with real non-English non-ISO-8859-1 text...
There are a few switches at the top of convertWiki2SQL.php for selecting
language-specific processing options, and wikiLocalSettings.php needs to
select the proper $wikiLanguage and $wikiSQLServer, but that's about it.
Other than those two, the PHP source files can be shared 100% between
various language wikipedias. (I've already included the Esperanto
message-localization file in the CVS repository, and I assume others
will be added as they are converted & translated.)
-- brion vibber (brion @ pobox.com)
Dear fellow programmers,
I have extended the database schema with two new tables: 'linked' and
'unlinked'. As usual the SQL for this addition can be found in
updSchema.sql. However, in this case the contents of the tables cannot be
generated by SQL alone, so there is an extra upLinks.php script in PHP that
contains the PHP code to do so. Read this file for further instructions.
The intention of these tables is to replace the 'cur_linked_links' and
'cur_unlinked_links' columns in the cur table. This will make it possible to
give the special pages that query linking information reasonable response
times, so they won't have to be cached. Right now I've only added the code
to keep these tables up-to-date and they are not used yet. From this moment
on, the usage of the 'cur_linked_links' and 'cur_unlinked_links' columns is
deprecated, and I will remove them as soon as all the code that uses them
has been replaced by code that uses the new tables.
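The posting doesn't give the schema, but from the description the two tables presumably pair a linking page with the titles it points at. A hypothetical sketch (column names and types are guesses, not the contents of updSchema.sql):

```sql
-- Hypothetical shape of the new link tables (illustrative only;
-- see updSchema.sql for the real definitions).
CREATE TABLE linked (
  lnk_from VARCHAR(255) NOT NULL,  -- linking article's title
  lnk_to   VARCHAR(255) NOT NULL   -- existing article it links to
);
CREATE TABLE unlinked (
  unl_from VARCHAR(255) NOT NULL,  -- linking article's title
  unl_to   VARCHAR(255) NOT NULL   -- missing article it links to
);

-- A page like MostWanted then becomes a simple aggregate:
SELECT unl_to, COUNT(*) AS refs
  FROM unlinked
 GROUP BY unl_to
 ORDER BY refs DESC
 LIMIT 50;
```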
-- Jan Hidders