Hi again.
For anyone who didn't see my previous post: I'm a Telecommunications
Engineer and Assistant Professor at UAX University in
Madrid (Spain), and I'm working on my Ph.D. (at URJC
University) on topics related to Internet multimedia
content distribution (currently focusing on
Wikipedia).
You can visit:
http://libresoft.urjc.es/ for more info about our
work, and
http://gsyc.escet.urjc.es/ for more info about our
technical group.
After going through all the info I could find at
meta.wikimedia.org, I still have several open
questions. Perhaps someone could help me:
1.- What is currently the main bottleneck of the whole
Wikimedia system? The Apaches? The MySQL database? Both?
2.- I've found no precise info in the Wikimedia MySQL
tables about the size of the modifications that a
given user made to an article. It would be very
interesting to build some statistics about the
shape of registered users' contributions (respecting
the anonymity of users, of course).
Our group has carried out several studies of these
factors on many kinds of libre software projects.
3.- Finally, in addition to the Squid configs (I am still
looking for someone who could help me with this), I wonder
whether it would be possible to get some of the Apache logs,
to process them and get info about:
a) The most requested pages (by number of hits).
b) The average served page size.
This way, I could start working out a method to
identify the most requested pages, as a starting point
for a content distribution simulation framework on the
grid computing architecture we are developing.
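To illustrate the two statistics in question, here is a minimal Python
sketch. It assumes the logs are in Apache Common Log Format, which is
only an assumption: the format actually used on the Wikimedia servers
is not stated here. The function name summarize and the sample lines
are hypothetical.

```python
import re
from collections import Counter

# Common Log Format (assumed):
# host ident user [time] "METHOD path PROTO" status bytes
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines, top_n=10):
    """Return (most requested paths with hit counts, average response size)."""
    hits = Counter()
    total_bytes = 0
    sized_responses = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like CLF entries
        hits[match.group("path")] += 1
        size = match.group("size")
        if size != "-":  # "-" means no body was sent
            total_bytes += int(size)
            sized_responses += 1
    average = total_bytes / sized_responses if sized_responses else 0.0
    return hits.most_common(top_n), average


# Made-up sample entries, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [20/Oct/2005:10:00:00 +0000] "GET /wiki/Main_Page HTTP/1.1" 200 3000',
    '10.0.0.2 - - [20/Oct/2005:10:00:01 +0000] "GET /wiki/Madrid HTTP/1.1" 200 1000',
    '10.0.0.3 - - [20/Oct/2005:10:00:02 +0000] "GET /wiki/Main_Page HTTP/1.1" 200 2000',
]

top_pages, average_size = summarize(SAMPLE)
print(top_pages, average_size)
```

Real logs would be read line by line from a file instead of a list;
since only the request path and the byte count are used, client
addresses never need to be stored.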
Regards,
Felipe Ortega.
Neil Harris wrote:
[snip spectacularly good document on machine-ranking of articles]
>Before any of this could be possible, we would in any case need the
>article ranking system to be up and running for some time, which we need
>anyway for the manual approach.
Sounds good. Please help fix the flagged problems! I'm not a coder,
but you are ;-)
- d.
I am trying to change the navigation sidebar in a MediaWiki 1.5
installation with Romanian as the default language. I tried editing
the MediaWiki:Sidebar page, which did come up, but the change does not
show up in the interface. Do I need to address a Romanian-specific
namespace (and if so, which one?) or do I have to mess with the
PHP files?
Thanks a lot.
Hi All,
How do I use the "--output=mysql" option of mwdumper?
Here's what I've tried:
* Started with a URL as per http://download.wikimedia.org/tools/README.txt
* Varying the URL (using 127.0.0.1 instead of localhost, with and
without a username and password)
* With and without a JDBC driver (both the old mm.mysql, and the newer
mysql-connector)
Output as follows:
=========================================================
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-server -jar mwdumper.jar --format=sql:1.4
20051020_pages_articles.xml.bz2
"--output=mysql://localhost/enwiki?user=wiki&password=wiki"
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-server -jar mwdumper.jar --format=sql:1.4
20051020_pages_articles.xml.bz2
"--output=mysql:mysql://localhost/enwiki?user=wiki&password=wiki"
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-classpath /usr/java/mm.mysql-2.0.2-bin.jar:. -server -jar
mwdumper.jar --format=sql:1.4 20051020_pages_articles.xml.bz2
"--output=mysql://localhost/enwiki?user=wiki&password=wiki"
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-classpath /usr/java/mm.mysql-2.0.2-bin.jar:. -server -jar
mwdumper.jar --format=sql:1.4 20051020_pages_articles.xml.bz2
"--output=mysql://localhost/enwiki?user=wiki&password=wiki"
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-classpath mysql-connector-java-3.1.11/mysql-connector-java-3.1.11-bin.jar:.
-server -jar mwdumper.jar --format=sql:1.4
20051020_pages_articles.xml.bz2
--output=mysql://localhost/enwiki?user=wiki\&password=wiki
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-classpath mysql-connector-java-3.1.11/mysql-connector-java-3.1.11-bin.jar:.
-server -jar mwdumper.jar --format=sql:1.4
20051020_pages_articles.xml.bz2
--output=mysql://127.0.0.1/enwiki?user=wiki\&password=wiki
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -Xmx200M
-classpath mysql-connector-java-3.1.11/mysql-connector-java-3.1.11-bin.jar:.
-server -jar mwdumper.jar --format=sql:1.4
20051020_pages_articles.xml.bz2 --output=mysql://127.0.0.1/enwiki
Exception in thread "main" java.io.IOException: com.mysql.jdbc.Driver
at org.mediawiki.dumper.Dumper.connectMySql(Unknown Source)
at org.mediawiki.dumper.Dumper.openOutputFile(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
ludo:/home/nickj/wikipedia#
=========================================================
Before doing the above, I did the following in MySQL to grant rights:
=========================================================
grant all on enwiki to wiki@localhost.localdomain identified by "wiki";
grant all on enwiki.* to wiki@localhost.localdomain identified by "wiki";
grant all on enwiki to wiki@localhost identified by "wiki";
grant all on enwiki.* to wiki@localhost identified by "wiki";
grant all on enwiki to wiki identified by "wiki";
grant all on enwiki.* to wiki identified by "wiki";
=========================================================
Versions of software being used:
=========================================================
ludo:/home/nickj/wikipedia# /usr/java/jre1.5.0_05/bin/java -version
java version "1.5.0_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_05-b05)
Java HotSpot(TM) Client VM (build 1.5.0_05-b05, mixed mode, sharing)
ludo:/home/nickj/wikipedia# mysql --version
mysql Ver 11.16 Distrib 3.23.49, for pc-linux-gnu (i686)
ludo:/home/nickj/wikipedia#
=========================================================
And mwdumper.jar is the latest version.
I'm bound to be doing something wrong, but I just don't know what it
is. Can someone please tell me what I'm doing wrong?
All the best,
Nick.
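One thing that may be worth checking, offered as a guess rather than a
confirmed diagnosis: when the Java launcher is started with -jar, the
JAR file is the only source of user classes and any -classpath setting
is ignored, so the MySQL driver JAR named in the commands above may
never be loaded. The Python sketch below builds an equivalent command
that puts both JARs on the class path and names the main class
explicitly. The helper name build_mwdumper_command is hypothetical, and
treating org.mediawiki.dumper.Dumper (taken from the stack trace) as
the entry point is an assumption.

```python
import os


def build_mwdumper_command(java, jars, dump_file, output_url):
    """Build a java command line that avoids -jar so -classpath is honoured."""
    classpath = os.pathsep.join(jars)
    return [
        java, "-Xmx200M", "-server",
        "-classpath", classpath,
        # Main class named explicitly instead of using -jar (assumed entry point).
        "org.mediawiki.dumper.Dumper",
        "--format=sql:1.4",
        dump_file,
        "--output=" + output_url,
    ]


command = build_mwdumper_command(
    "/usr/java/jre1.5.0_05/bin/java",
    ["mwdumper.jar",
     "mysql-connector-java-3.1.11/mysql-connector-java-3.1.11-bin.jar"],
    "20051020_pages_articles.xml.bz2",
    "mysql://localhost/enwiki?user=wiki&password=wiki",
)
# Passing the list to subprocess.run(command) avoids shell quoting of "&".
print(" ".join(command))
```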
Can anybody point me in the right direction as to what software is
required to get MediaWiki acting like Wikibooks? I.e. how do namespaces
automatically create links between the modules of a book? I've done quite a
lot of searching but I haven't found the answer. Apologies if this is
the wrong place to ask.
Cheers,
- Lionel
Wikitech,
It's mentioned in the book ''Tatabahasa Bugis'' (Buginese Grammar),
published by the ''Departemen Pendidikan dan Kebudayaan, Indonesia'' (the
Indonesian Department of Education and Culture). If you can read
Indonesian, I can scan it for you.
Likewise, as mentioned on the Bugis font page by Andi Mallerangeng, the first
person to create this software font, he used an underline to replace this diacritic.
Here's the page:
<http://www.seasite.niu.edu/bugis2.gif>
Thank you,
Muhammad Zaid Zainuddin
Dear wikitech,
It's me again (the troublesome guy). This problem just struck
me: a diacritic is missing from the Buginese Lontara font. This diacritic
is used to make the characters vowelless; as you may have noticed, all Lontara
characters end with the vowel /a/. The diacritic is attached. Is
there any possibility of adding it?
Secondly, I still cannot type the characters via the keyboard. Can someone
explain to me how to do it, step by step?
Thank you
Muhammad Zaid Zainuddin
Neil Harris wrote:
>Finally, in any case, since ratings would only be an experiment during
>this phase, the whole ratings system could be turned off at any time if
>it presented a serious problem.
A previous version of the code (not the current version, I think?)
loaded *all* revisions of the article being rated, and switching it on
would have caused an extremely unhappy DB server and a top-50 error
message pretty much instantly ;-)
- d.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
MediaWiki 1.5.1 is a bugfix and security maintenance release, and is a
recommended upgrade for all installations.
This release includes further corrections to the inline CSS style
sanitation which works around a JavaScript "feature" on Microsoft
Internet Explorer. Users of Microsoft Internet Explorer for Windows may
be vulnerable to XSS injections on prior versions; users of
standards-compliant browsers are not vulnerable.
Major fixes include:
* Image pages work again with resizing disabled
* Works in MySQL 5.0 strict mode
There is experimental support in this release for explicitly declaring
the UTF-8 charset in the database; this has been tested with MySQL
5.0.15 but should work on 4.1 as well.
IMPORTANT: Changing this setting on an existing wiki may produce
interesting data corruption, depending on server configuration. Page
contents should, usually, be unaffected, but page titles and other items
may be. Limitations in MySQL's Unicode support mean that characters
outside the BMP cannot be used in page titles or various other fields
when using this mode.
Table definitions are in maintenance/mysql5/tables.sql, and the runtime
option to send 'SET NAMES utf8' is set by $wgDBmysql5 = true.
(MySQL 3.23.x and 4.0.x do not support character set declarations; on
these versions MediaWiki simply works with UTF-8 data and MySQL is
blissfully unaware of it.)
Release notes:
http://sourceforge.net/project/shownotes.php?release_id=366110
Download:
http://prdownloads.sourceforge.net/wikipedia/mediawiki-1.5.1.tar.gz?download
MD5 checksum: f2030d3302a899f4e4f2e6d9f3842067
SHA-1 checksum: ff4c843f132ca54ef85b2a15fb42498a54f0ae33
Before asking for help, try the FAQ:
http://meta.wikimedia.org/wiki/MediaWiki_FAQ
Low-traffic release announcements mailing list:
http://mail.wikipedia.org/mailman/listinfo/mediawiki-announce
Wiki admin help mailing list:
http://mail.wikipedia.org/mailman/listinfo/mediawiki-l
Bug report system:
http://bugzilla.wikimedia.org/
Play "stump the developers" live on IRC:
#mediawiki on irc.freenode.net
- -- brion vibber (brion @ pobox.com)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFDX2C5wRnhpk1wk44RAjcFAKCreSnX6H9AlrTJPYAqW9cOcejTQgCgoqk3
k+lb4bVzuxytsf1Xvgv8z3E=
=flUz
-----END PGP SIGNATURE-----
MySQL 5.0 does not reject the surrogate characters between U+D800 and
U+DFFF. This means we can store characters above the BMP either by setting
the character set to UTF-8 and inserting CESU-8, or by setting the character
set to UCS-2 and inserting UTF-16.
It might be a hack, but at least it's a standard hack. After all, this is
exactly what UTF-16 and CESU-8 were designed for. According to a certain
online encyclopedia that we all know and love, the exact same problem is
observed in Oracle, and CESU-8 is used to solve it.
Implementing it in MediaWiki would be a similar task to implementing
PostgreSQL bytea support, which I was talking about in #mediawiki the other
day. We could convert from UTF-8 to CESU-8 in Database::strencode(), and
decode by altering the result rows returned by mysql_fetch_object() before
they are returned by Database::fetchObject().
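As a rough illustration of the conversion itself, here is a minimal
Python sketch (not the PHP that Database::strencode() would need). The
function names utf8_to_cesu8 and cesu8_to_utf8 are hypothetical. Each
character above the BMP is split into a UTF-16 surrogate pair, and each
surrogate is written as its own three-byte sequence.

```python
def utf8_to_cesu8(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as CESU-8 (supplementary chars become surrogate pairs)."""
    out = bytearray()
    for ch in data.decode("utf-8"):
        cp = ord(ch)
        if cp < 0x10000:
            # BMP characters are encoded identically in UTF-8 and CESU-8.
            out += ch.encode("utf-8")
        else:
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            for unit in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form of a surrogate.
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


def cesu8_to_utf8(data: bytes) -> bytes:
    """Convert CESU-8 bytes back to standard UTF-8."""
    halves = data.decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 joins each surrogate pair into one character.
    text = halves.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
    return text.encode("utf-8")


original = "\U00010400".encode("utf-8")  # U+10400, a character outside the BMP
stored = utf8_to_cesu8(original)
print(stored.hex(), cesu8_to_utf8(stored) == original)
```

Text that contains only BMP characters passes through both functions
unchanged, so existing titles without such characters would not need
rewriting.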
Binary data would have to be flagged both on write and read. On write, we
could use a function other than strencode() where there is a need to
construct raw SQL, and for the wrapper functions we could pass binary data
in a blob object. When reading from the database, we could probably use
mysql_field_type() or mysql_field_flags() to detect binary columns and skip
the conversion accordingly.
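The flagging idea can be sketched as follows, again in Python rather
than PHP and only under stated assumptions: Blob, encode_param and
decode_row are hypothetical names, the set of binary columns stands in
for whatever mysql_field_type() or mysql_field_flags() would report,
and the column names in the example are purely illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Union


@dataclass(frozen=True)
class Blob:
    """Wrapper marking a value as binary so text conversion is skipped."""
    data: bytes


def encode_param(value: Union[str, Blob],
                 encode_text: Callable[[str], bytes]) -> bytes:
    """On write: pass blobs through untouched, convert everything else."""
    if isinstance(value, Blob):
        return value.data
    return encode_text(value)


def decode_row(row: Dict[str, bytes],
               binary_columns: Set[str],
               decode_text: Callable[[bytes], str]) -> Dict[str, Union[str, bytes]]:
    """On read: convert only the columns not flagged as binary."""
    return {
        name: raw if name in binary_columns else decode_text(raw)
        for name, raw in row.items()
    }


def decode_cesu8(raw: bytes) -> str:
    """Decode CESU-8 by joining surrogate pairs via a UTF-16 round trip."""
    halves = raw.decode("utf-8", "surrogatepass")
    return halves.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


# Illustrative row: one text column, one column flagged as binary.
row = {"page_title": b"\xed\xa0\x81\xed\xb0\x80", "old_text": b"\xed\xa0\x81"}
decoded = decode_row(row, {"old_text"}, decode_cesu8)
print(decoded)
```

Keeping the converters as parameters means the same wrappers would work
whichever encoding is chosen for the metadata.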
It'd probably be easiest to do conversion offline, in the process of
switching to MySQL 5.0. We wouldn't be converting the bulk text, just the
metadata.
-- Tim Starling