The database at zedler is unusable at the moment. I have tried every
recovery level, but nothing works :(. So I guess we need a new
full non-en dump from the wikimedia-server-admins to rebuild the
database from scratch. There is a backup (4 Dec, 04:04 UTC) of your
user databases, so little or no data will be lost.
I am CCing this email to the wikimedia-server-admins. Perhaps somebody there
can make a nice dump for me :).
Is there any way we can tweak the Spam blacklist filter to block images that
match blacklist list entries while $wgAllowExternalImages is set?
Right now when $wgAllowExternalImages is true, the parser is converting the
links in internalParse to images, which is expected, but if those images
match something in the blacklist they won't be blocked. These images also
don't show up in out->getImages.
If we could check the spam blacklist against something a little less
processed than external links, it might work. Is there any reason why the
blacklist isn't just compared to the raw MediaWiki text?
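
To make the raw-text idea concrete, here is a minimal sketch in Python (not the extension's actual PHP). The function names, the example domains, and the exact regex shape are assumptions made for illustration. It only shows that matching against unparsed wikitext also catches a bare image URL that never becomes an external link.

```python
import re

def build_blacklist(patterns):
    """Compile blacklist fragments into one case-insensitive URL regex.

    The URL prefix used here is an assumption for this sketch, not the
    regex the real extension builds.
    """
    joined = "|".join(f"(?:{p})" for p in patterns)
    return re.compile(r"https?://[a-z0-9\-.]*(?:" + joined + ")", re.IGNORECASE)

def find_blacklisted(raw_wikitext, blacklist):
    """Return every blacklisted URL prefix found anywhere in the raw text."""
    return [m.group(0) for m in blacklist.finditer(raw_wikitext)]

# Hypothetical blacklist entries and page text.
blacklist = build_blacklist([r"spam-example\.com", r"bad-pills\.example"])
text = "See [http://good.example/ this] and http://img.spam-example.com/banner.png"
hits = find_blacklisted(text, blacklist)
```

One trade-off of checking raw text: URLs inside comments or nowiki sections would match too, so this is stricter than checking only what the parser renders.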
When I run mwdumper with enwiki-20061001-pages-articles.xml.bz2 and
pipe it to mysql (see my other mail), it eventually prints out
3,583,699 pages (490.212/sec), 3,583,699 revs (490.212/sec)
However, on the database afterwards a query like SELECT COUNT(*) FROM
page gives me a result count somewhere around 2.5 million (I don't
have the exact number in my notes, sorry). And this discrepancy is
further confirmed by many missing articles: lots of red links from
existing articles where I can confirm that the article doesn't exist
in the database but does exist in the dump.
Is mwdumper known to have problems in this area? How might I track this down?
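
One way to narrow this down might be to list exactly which titles are in the dump but not in the page table. This is a rough sketch, not something mwdumper provides. The helper names are made up, the database titles are assumed to have been exported separately (for example one page_title per line), and only main-namespace titles are handled.

```python
import bz2
import xml.etree.ElementTree as ET

def iter_dump_titles(stream):
    """Yield each <title> from a MediaWiki XML export, streaming."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if tag == "title":
            yield elem.text
        elif tag == "page":
            elem.clear()  # release the revision text of finished pages

def missing_titles(dump_titles, db_titles):
    """Return dump titles with no matching page_title in the database.

    page_title uses underscores where the dump uses spaces, so dump
    titles are normalized before comparing.
    """
    db = set(db_titles)
    return [t for t in dump_titles if t.replace(" ", "_") not in db]

def titles_from_bz2(path):
    """Stream titles straight from a .xml.bz2 dump file."""
    with bz2.open(path, "rb") as stream:
        yield from iter_dump_titles(stream)
```

If the missing titles cluster together (for instance, consecutive runs in dump order), that would suggest whole INSERT statements were dropped rather than individual rows.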
When trying to use mwdumper to import
enwiki-20061001-pages-articles.xml.bz2, I find that using
--output=mysql:... --format=sql:1.5 causes some component to mangle UTF-8. I
can verify this in the database: the text table will have data like
is:Stj�rnleysisstefna where the middle mojibake character is a single
If I use mwdumper with --output=stdout, I can verify the resulting SQL
has good UTF-8 in it. If I run a command like
mwdumper --output=stdout --format=sql:1.5 ... | mysql -uwiki -pwiki wikidb
things come out ok: the Greek in [[Anarchism]] (which is a useful test
article because it occurs early in the dump) displays fine.
Some other info that may be useful:
- I have $wgDBmysql5 = false; in my LocalSettings.php.
- My locale doesn't seem to affect it, but it's all en_US.UTF-8 in
case that matters.
- java version "1.5.0_07" / Java(TM) 2 Runtime Environment, Standard
Edition (build 1.5.0_07-b03)
- Command line I'm trying is:
java -server -classpath /usr/share/java/mysql-3.1.11.jar:mwdumper/bin
That mysql jarfile comes from libmysql-java on Ubuntu dapper, version
3.1.11-1, but I find the same behavior with
I could file this as a bug, but I wanted to first verify I wasn't
doing anything wrong.
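
For anyone wanting to reproduce the check, here is a small Python sketch of how the bytes read back from the text table could be classified. It assumes the damaged title is "Stjórnleysisstefna" (the letter ó is a guess on my part) and simulates the damage by encoding it as Latin-1, which is only one possible cause. The function name is made up.

```python
def classify_bytes(raw):
    """Say whether raw bytes are valid UTF-8 and show how they would display."""
    try:
        return ("utf-8", raw.decode("utf-8"))
    except UnicodeDecodeError:
        # Invalid sequences are shown as U+FFFD, the usual mojibake marker.
        return ("not utf-8", raw.decode("utf-8", errors="replace"))

# Assumed original title; the accented letter is a guess for this sketch.
title = "Stj\u00f3rnleysisstefna"
good = title.encode("utf-8")    # accented letter stored as two bytes
bad = title.encode("latin-1")   # accented letter stored as one byte

good_result = classify_bytes(good)
bad_result = classify_bytes(bad)
```

If the stored value is one byte shorter than the UTF-8 form and fails to decode, that would point at a charset conversion somewhere between mwdumper and the server rather than at the dump itself.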
I hope this is the right place to ask this question: is there a technical
reason for not hosting TIFF files on Wikimedia servers? Commons forbids
them, and apparently has since the beginning. However, it would be helpful
for Wikisource if TIFF files of scans of public domain works could be
uploaded and used for proofreading and/or archival.
Nathaniel / User:Spangineer
I've added a very basic extension that allows custom headers and footers to
be added to user-to-user e-mails:
It'd be a nice addition if extra headers could be passed into
UserMailer.php so that something like an HTML template could be used to
wrap the user's message, so it looks a little prettier.
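
To illustrate the idea (in Python, not as a patch to UserMailer.php), here is a rough sketch of a mail builder that accepts extra headers and wraps the message in an HTML template while keeping a plain-text part. The function name, its parameters, the template, and the example header are all invented for this sketch.

```python
from email.message import EmailMessage
from html import escape

# Hypothetical template; a real one would come from site configuration.
HTML_TEMPLATE = (
    "<html><body><p>{header}</p><pre>{body}</pre><p>{footer}</p></body></html>"
)

def build_user_mail(sender, recipient, subject, body,
                    extra_headers=None, header_text="", footer_text=""):
    """Build a user-to-user mail with custom headers and an HTML wrapper."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    for name, value in (extra_headers or {}).items():
        msg[name] = value  # caller-supplied headers, e.g. X-* markers

    # Plain-text part: header, message, footer, skipping empty pieces.
    plain = "\n\n".join(p for p in (header_text, body, footer_text) if p)
    msg.set_content(plain)

    # HTML part: escape user input so the message cannot inject markup.
    html = HTML_TEMPLATE.format(header=escape(header_text),
                                body=escape(body),
                                footer=escape(footer_text))
    msg.add_alternative(html, subtype="html")
    return msg

mail = build_user_mail("a@example.org", "b@example.org", "Hello",
                       "Hi <b>there</b>",
                       extra_headers={"X-Wiki-Mail": "user-to-user"},
                       header_text="Sent via the wiki",
                       footer_text="Do not reply")
```

Sending both parts as multipart/alternative means mail clients that cannot show HTML still get the plain version.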
I'm currently doing a test run of pulling migration data from the user
tables; the slow part currently seems to be getting the edit counts.
In a day or two I'll set up test.wikipedia.org for interactive migration
testing. This will allow people to try out the UI, and give a chance to
see how it looks so we can make sure the UI translations get done soon...
-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)
An automated run of parserTests.php showed the following failures:
Reading tests from "/home/brion/src/wiki/phase3/maintenance/parserTests.txt"...
Running test TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html)... FAILED!
Running test TODO: Link containing double-single-quotes '' (bug 4598)... FAILED!
Running test TODO: Template with thumb image (with link in description)... FAILED!
Running test TODO: message transform: <noinclude> in transcluded template (bug 4926)... FAILED!
Running test TODO: message transform: <onlyinclude> in transcluded template (bug 4926)... FAILED!
Running test BUG 1887, part 2: A <math> with a thumbnail- math enabled... FAILED!
Running test TODO: HTML bullet list, unclosed tags (bug 5497)... FAILED!
Running test TODO: HTML ordered list, unclosed tags (bug 5497)... FAILED!
Running test TODO: HTML nested bullet list, open tags (bug 5497)... FAILED!
Running test TODO: HTML nested ordered list, open tags (bug 5497)... FAILED!
Running test TODO: Parsing optional HTML elements (Bug 6171)... FAILED!
Running test TODO: Inline HTML vs wiki block nesting... FAILED!
Running test TODO: Mixing markup for italics and bold... FAILED!
Running test TODO: 5 quotes, code coverage +1 line... FAILED!
Running test TODO: dt/dd/dl test... FAILED!
Running test TODO: Images with the "|" character in the comment... FAILED!
Running test TODO: Parents of subpages, two levels up, without trailing slash or name.... FAILED!
Running test TODO: Parents of subpages, two levels up, with lots of extra trailing slashes.... FAILED!
Running test TODO: Don't fall for the self-closing div... FAILED!
Running test TODO: Always escape literal '>' in output, not just after '<'... FAILED!
Reading tests from "/home/brion/src/wiki/phase3/extensions/Cite/citeParserTests.txt"...
Passed 451 of 471 tests (95.75%)... FAILED!