Sorry for the listserv spam, but if anyone can take care of this JIRA ticket
this weekend, it would be greatly appreciated:
https://jira.toolserver.org/browse/TS-1038
My laptop died right before I left for the Wiki-GLAMcamp in New York. I'm
hoping to work with Multichill on developing some GLAM tools this weekend,
but unfortunately, I'm locked out of the toolserver at the moment.
Ryan Kaldari
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
The front-end web server on amaranth (which serves JIRA, MediaWiki,
FishEye and WordPress) has been changed from Sun Java System Web Server
to Apache. This should not cause any noticeable changes, but please
report any problems to JIRA or ts-admins(a)toolserver.org.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (SunOS)
iEYEARECAAYFAk3T3OEACgkQIXd7fCuc5vJr8wCfYRN6DaltlbGWX+qfDv/R4oLe
mSEAniY5Yfoj2VvWV1wsE+1Db8oWQb1/
=D6af
-----END PGP SIGNATURE-----
Hello all,
some users complained about the new way toolserver.namespacename handles cases in
which the primary name of a namespace is identical to the canonical name.
To fix this without breaking tools that rely on the new behavior, I added a new
column "ns_is_favorite" to the table today (this solution was found with the
help of valhallasw and Krinkle in IRC yesterday). The column is a boolean
field that describes whether this ns_name is the best choice for a ns_id or not.
See the following examples:
mysql> SELECT ns_name,ns_type FROM namespacename WHERE
domain="en.wikipedia.org" AND ns_id=4;
+-----------+-----------+
| ns_name | ns_type |
+-----------+-----------+
| Wikipedia | primary |
| Project | canonical |
| WP | alias |
+-----------+-----------+
mysql> SELECT ns_name,ns_type FROM namespacename WHERE
domain="en.wikipedia.org" AND ns_id=4 AND ns_is_favorite=1;
+-----------+---------+
| ns_name | ns_type |
+-----------+---------+
| Wikipedia | primary |
+-----------+---------+
mysql> SELECT ns_name,ns_type FROM namespacename WHERE
domain="en.wikipedia.org" AND ns_id=2;
+---------+-----------+
| ns_name | ns_type |
+---------+-----------+
| User | canonical |
+---------+-----------+
mysql> SELECT ns_name,ns_type FROM namespacename WHERE
domain="en.wikipedia.org" AND ns_id=2 AND ns_is_favorite=1;
+---------+-----------+
| ns_name | ns_type |
+---------+-----------+
| User | canonical |
+---------+-----------+
As you can see, if you use "ns_is_favorite=1" you will always get exactly 1
result, whether there are both a primary and a canonical name or just a
canonical one.
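For anyone who wants to try the behaviour without touching the real table, here is a minimal sketch in Python using an in-memory SQLite database. The column names are the ones from the queries above; the column types, the helper names and the use of SQLite instead of MySQL are assumptions made only for illustration.

```python
import sqlite3


def build_demo_table() -> sqlite3.Connection:
    """Create an in-memory stand-in for toolserver.namespacename (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE namespacename ("
        " domain TEXT, ns_id INTEGER, ns_name TEXT,"
        " ns_type TEXT, ns_is_favorite INTEGER)"
    )
    # Rows copied from the example output above; only one row per ns_id is a favorite.
    rows = [
        ("en.wikipedia.org", 4, "Wikipedia", "primary", 1),
        ("en.wikipedia.org", 4, "Project", "canonical", 0),
        ("en.wikipedia.org", 4, "WP", "alias", 0),
        ("en.wikipedia.org", 2, "User", "canonical", 1),
    ]
    conn.executemany("INSERT INTO namespacename VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def favorite_name(conn: sqlite3.Connection, domain: str, ns_id: int) -> tuple:
    """Return (ns_name, ns_type) of the single favorite row for a namespace."""
    rows = conn.execute(
        "SELECT ns_name, ns_type FROM namespacename"
        " WHERE domain = ? AND ns_id = ? AND ns_is_favorite = 1",
        (domain, ns_id),
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(f"expected exactly one favorite, got {len(rows)}")
    return rows[0]


if __name__ == "__main__":
    demo = build_demo_table()
    print(favorite_name(demo, "en.wikipedia.org", 4))
    print(favorite_name(demo, "en.wikipedia.org", 2))
```

A tool that only needs one display name per namespace can call something like favorite_name() and does not have to care whether the row is of type primary or canonical.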
If you have questions or suggestions, please reply.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
I've seen the primary SQL servers for 1,2/5 lagging about 12 hours for the
last day, while the fast servers are current. Any idea what the source of the
issue is?
John
Hi Toolserver users and admins,
We've seen a problem regarding non-Latin (Unicode) texts in PHP [1],
and this is already a long-standing issue. I'd like to summarize the
situation and discuss how to improve it.
Here is the summary of the problem: Several widely-used string
functions of PHP, including strtoupper() and ucfirst(), are known to
"corrupt" strings when used under a UTF-8 locale [2], which is the
current setting at the Toolserver. The problem is that those
functions can incorrectly treat part of a multi-byte character
sequence as a single-byte character. When those parts are converted
to upper or lower case, the resulting string is corrupted.
We've seen this problem break the functionality of a number of major
tools on the Toolserver, including Vvv's sulutils and SoxRed93's edit
counter. For example, a Chinese/Japanese string "利用者" (meaning "user")
doesn't have a capitalized form. However, when it's passed to a tool
which (I assume) uses ucfirst, the first character is converted into a
non-existent character [3], and the result doesn't make sense. An
incomplete list of the affected tools is available at [4]. See also
TS-923 [1] for more details.
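To illustrate the mechanism, here is a small Python sketch (not PHP, and not the code of any affected tool) that imitates a byte-wise ucfirst. It assumes the case conversion treats each byte as a Latin-1 character; that is only one plausible way the corruption can arise, and I don't know whether it is what actually happens on the Toolserver. Both function names are made up for this example.

```python
def bytewise_ucfirst(text: str) -> bytes:
    """Upper-case only the first *byte* of the UTF-8 encoding of text.

    Assumption: the byte is interpreted with a Latin-1 case table, which is
    one way a single-byte string function can misread multi-byte input.
    """
    raw = text.encode("utf-8")
    if not raw:
        return raw
    upper = raw[:1].decode("latin-1").upper()
    try:
        upper_byte = upper.encode("latin-1")
    except UnicodeEncodeError:
        # The upper-case form does not exist in Latin-1: leave the byte alone.
        upper_byte = raw[:1]
    if len(upper_byte) != 1:
        # Expansions such as "ß" -> "SS" are skipped to keep this byte-for-byte.
        upper_byte = raw[:1]
    return upper_byte + raw[1:]


def codepoint_ucfirst(text: str) -> str:
    """Upper-case the first code point, which is what a multi-byte aware function does."""
    return text[:1].upper() + text[1:]


if __name__ == "__main__":
    word = "利用者"
    damaged = bytewise_ucfirst(word)
    print(word.encode("utf-8").hex(" "))  # original bytes
    print(damaged.hex(" "))               # first byte changed
    try:
        damaged.decode("utf-8")
    except UnicodeDecodeError as err:
        print("no longer valid UTF-8:", err.reason)
    print(codepoint_ucfirst(word) == word)
```

Under that assumption the lead byte 0xE5 of "利" is read as "å" and becomes 0xC5, so the byte sequence stops being valid UTF-8, while the code-point version leaves the string untouched.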
River suggested [1] solving it by migrating to multi-byte aware
functions such as mb_strtoupper [5], but I don't think it's an ideal
solution. I'd totally encourage the migration too, but it would take
time for all developers to fix their tools appropriately. I hope we
can have a more fundamental, immediate solution.
The synchronization of reports of similar problems [4] suggests that
there is an underlying common cause. The behavior of string
processing seems to have changed in different programs almost
simultaneously, somewhere around October 2010. The underlying cause
might be a side effect of some change in the PHP platform on the
Toolserver, but I don't have any clue what it really was. If someone
could point out the original cause, it would be a great help in moving
toward a better solution.
Wikimedians and Toolserver users using multi-byte characters
(including Arabic, Chinese, Korean and Japanese characters) have
apparently been unhappy about this problem for more than half a year. I
hope all the tools can (again) work more multilingually.
Any comments or suggestions?
[1] https://jira.toolserver.org/browse/TS-923
[2] http://www.phpwact.org/php/i18n/utf-8
[3] http://toolserver.org/~vvv/sulutil.php?user=%E5%88%A9%E7%94%A8%E8%80%85
[4] https://jira.toolserver.org/secure/ManageLinks.jspa?id=24486
[5] http://php.net/manual/en/function.mb-strtoupper.php
Cheers,
Whym
--
http://toolserver.org/~whym/
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
Due to high replication lag on thyme (s1) I've switched 'sql-s1' to
rosemary. If you have scripts which access user databases on s1 and
haven't been updated to use sql-s1-user instead, you will find your
databases are missing. The fix is to connect to sql-s1-user instead of
sql-s1.
Scripts which do not use user databases will not be affected and should
not be changed.
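As a rough sketch of the rule above, a script could pick its s1 host like this. The function name is hypothetical, and only the two host names mentioned in this mail are used; nothing here is meant to apply to other clusters.

```python
def s1_host(uses_user_databases: bool) -> str:
    """Pick the s1 host name according to the rule described above.

    Scripts that touch user databases on s1 connect to sql-s1-user;
    everything else keeps using sql-s1 unchanged.
    """
    return "sql-s1-user" if uses_user_databases else "sql-s1"


if __name__ == "__main__":
    print(s1_host(True))
    print(s1_host(False))
```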
This change will be reverted once thyme has caught up.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (SunOS)
iEYEARECAAYFAk3OC2QACgkQIXd7fCuc5vIz7ACdENxlPUZHXengxHE/ZcYp4hkt
g5oAoLTAXH4VJWJVZCla5sAhQ25sniRh
=9Jc7
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
It is now possible to mark a user database to be backed up without its
data, i.e. only the schema is dumped (mysqldump -d). To do this,
include the string "_transient" in the database name:
u_jsmith_transient
u_jsmith_my_transient_data_p
If you have databases whose data doesn't need to be backed up, it would
be very helpful if you could rename them to include _transient in the
name, to reduce the load / disk space requirements of the nightly backup
job.
Unfortunately MySQL doesn't provide a way to rename a database, but you
can copy the database to a new name like this:
$ mysqladmin create u_jsmith_transient
$ mysqldump --opt u_jsmith | mysql u_jsmith_transient
The old database should then be dropped.
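If you have several databases to move, the steps above can be sketched in Python as follows. This only builds the command strings shown above and checks the naming rule; it does not run anything, and the helper names are made up for this example.

```python
import shlex

MARKER = "_transient"  # substring that marks a database as schema-only for backups


def is_schema_only(db_name: str) -> bool:
    """True if the nightly backup would dump only the schema of this database."""
    return MARKER in db_name


def copy_commands(old_name: str, new_name: str) -> list:
    """Build the shell commands that copy old_name to new_name.

    Dropping the old database is deliberately left out, so that the copy
    can be checked by hand first.
    """
    if not is_schema_only(new_name):
        raise ValueError(f"{new_name!r} does not contain {MARKER!r}")
    old_q, new_q = shlex.quote(old_name), shlex.quote(new_name)
    return [
        f"mysqladmin create {new_q}",
        f"mysqldump --opt {old_q} | mysql {new_q}",
    ]


if __name__ == "__main__":
    for command in copy_commands("u_jsmith", "u_jsmith_transient"):
        print(command)
```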
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (SunOS)
iEYEARECAAYFAk3OGlQACgkQIXd7fCuc5vKdsQCaAwWY3LJ/H4+JcvB0x7VHTyzS
VSMAnjVCQBvlfOVG/2DmLtUz0c9EUe5Y
=sB8i
-----END PGP SIGNATURE-----
Briefly: an English teacher is insisting you / WMF (via emails) have said it's OK for his class to share an account. We've tried very nicely helping him make drafts and not share, but he's insistent; indeed now claiming it is a 'crime against humanity'. It's ongoing; they've created a couple of test live articles, and been warned by several.
I've tried my damnedest to help 'em, but...they're about to be blocked, I think.
I just want to check - did you indicate anything to him, giving him the OK to share accounts?
See http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incide…
Cheers.
(I tried to post this question before but was not properly registered for
the mailing list. If this is a repeat I apologize.)
I am in need of some guidance on how to get some data out of the query
service. I signed up for an account, but I'm not sure if I'm supposed to
create an "issue" or if there is some other process I should follow. I'm
also not sure if the query service is the right way to go about this.
In short, as part of a graduate research project, I need to select about 100
articles in Wikipedia and find the most frequent editors of each (there is a
"contributors" tool which ranks them), such as those with more than 10 edits
to the article. Then, for each of these editors, I need to generate a list of
all pages they have edited, with a frequency count for each.
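To make the request concrete, the two steps could be sketched like this in Python. The edit list here is made-up sample data in the form (user, page); where the real data would come from is exactly what I am asking about, and the function names are only for illustration.

```python
from collections import Counter


def frequent_editors(edits, article, min_edits=10):
    """Users with more than min_edits edits to the given article, most active first."""
    counts = Counter(user for user, page in edits if page == article)
    return [user for user, n in counts.most_common() if n > min_edits]


def pages_edited(edits, user):
    """Frequency count of every page the given user has edited."""
    return Counter(page for editor, page in edits if editor == user)


if __name__ == "__main__":
    # Hypothetical sample: one (user, page) pair per edit.
    sample = [("Alice", "Foo")] * 12 + [("Bob", "Foo")] * 3 + [("Alice", "Bar")] * 2
    for editor in frequent_editors(sample, "Foo"):
        print(editor, dict(pages_edited(sample, editor)))
```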
Does this sound like the query service is the right way to go about
collecting this data and if so can someone point me to the proper procedure
for making such a request?
Thanks.
--
Jim