I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/)
and I need to extract the content from it and convert it into LaTeX
syntax for printed documentation. I have googled for a suitable OSS
solution but nothing was apparent.
I would prefer a script written in Python, but any recommendations
would be very welcome.
Do you know of anything suitable?
Today we passed 10k HTTP requests per second (even with inter-squid
traffic eliminated). Special thanks to Mark and Tim, who have been
improving our caching, as well as doing lots of other work, and have
achieved incredible results (while I was slacking). Really, thanks!
I've put together an extension for rating articles if anyone is
interested. It's just a first version and hasn't been tested much, but
the details can be found here:
You can see an example here on our development server:
(username / password: wikihow / wikihow2006) - scroll down to the bottom
of the page for the checkmarks.
I'd appreciate feedback if anyone has any. If someone wants to add
this to extensions in svn, that'd be great.
I've set up a subversion user list:
If you have subversion commit access, please create a file describing
yourself, at /USERINFO/<username>. The syntax is like MIME headers, see my
Current fields are name, URL and email, but I'd accept suggestions for
more. The idea of the email field is to eventually set up aliases for
everyone of the form <username>@svn.wikimedia.org. If you're worried about
spam, I would suggest including your email address encoded with ROT13, in
a field called encrypted-email.
-- Tim Starling
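
As a rough illustration of the file format described above, here is a minimal
Python sketch that reads a MIME-header-style USERINFO file and decodes a
ROT13 address. The sample contents, the function name read_userinfo and the
choice to fall back from email to encrypted-email are assumptions made for
this example, not part of the actual setup.

```python
import codecs
from email.parser import Parser

# Hypothetical USERINFO contents; all field values are invented for illustration.
SAMPLE = """\
name: Example Committer
URL: http://example.org/
encrypted-email: pbzzvggre@rknzcyr.bet
"""


def read_userinfo(text):
    """Parse MIME-header-style text into a dict keyed by lowercased field name."""
    msg = Parser().parsestr(text, headersonly=True)
    info = {key.lower(): value for key, value in msg.items()}
    # Assumption: if no plain address is given, decode the ROT13 one.
    if "email" not in info and "encrypted-email" in info:
        info["email"] = codecs.decode(info["encrypted-email"], "rot_13")
    return info


info = read_userinfo(SAMPLE)
print(info["name"], info["email"])
```

ROT13 is its own inverse, so the same codec call also produces the value to
put in the encrypted-email field.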
Unicode's bidirectional algorithm often fails where there are RTL
characters, LTR characters and neutrals such as punctuation in the same
paragraph. Often this can be fixed by liberal sprinkling of either the RLM
character (in base RTL text) or the LRM character (in base LTR text).
Putting these characters directly into the article text makes such changes
difficult to review and edit, since they are invisible in the edit box in
major browsers. A better solution is to use HTML's &lrm; and &rlm; entities.
By happy coincidence, &lrm; has roughly the same effect in the edit box as
it does in display, because the Latin characters "lrm" are of strong
left-to-right type, just like the control character they represent. The
same is not so for &rlm;, meaning that in cases where &rlm; is used, the
text remains broken on edit while being fixed on display. Here's an example:
What I propose is that someone should come up with a translation of "rlm"
into Hebrew, Arabic or both, and that we should implement this artificial
character entity in the MediaWiki parser.
-- Tim Starling
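
The point about strong directional types can be checked with a small Python
sketch using the standard unicodedata module. The helper bidi_classes and
the Hebrew spelling (resh, lamed, mem) are assumptions made for this
illustration only; no such entity name has been agreed on.

```python
import html
import unicodedata

# The two HTML entities decode to the invisible directional marks.
lrm = html.unescape("&lrm;")  # U+200E LEFT-TO-RIGHT MARK
rlm = html.unescape("&rlm;")  # U+200F RIGHT-TO-LEFT MARK


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# The letters "lrm" share the strong left-to-right class of the mark itself.
print(bidi_classes(lrm), bidi_classes("lrm"))

# The letters "rlm" are also left-to-right, unlike the mark they stand for.
print(bidi_classes(rlm), bidi_classes("rlm"))

# Hypothetical Hebrew entity name (resh, lamed, mem): strong right-to-left.
hebrew_name = "\u05e8\u05dc\u05de"
print(bidi_classes(hebrew_name))
```

An entity name written in Hebrew letters would therefore behave in the edit
box much like the mark it expands to, which is the property the proposal
relies on.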
-----BEGIN PGP SIGNED MESSAGE-----
Mark Clements wrote:
> I don't know if anyone has any suggestions to deal with this kind of
> problem, or even if it has already been recognised as an issue. The
> non-technical answer is, of course, to copy the commons images to the local
> wiki, but that kind of defeats the point of commons, doesn't it?
We have the CheckUsage tool on the toolserver, and every admin should
check the usages _before_ deleting an image. If there are (more or less)
prominent usages, at least on pages other than user and user talk pages,
the admin should either not delete the image or first replace all usages
with the duplicate image.
I know, it's a lot of work.
Copyvios are the exception.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
-----END PGP SIGNATURE-----
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.10alpha (r21695).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* Link containing double-single-quotes '' (bug 4598) [Has never passed]
* message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* HTML nested bullet list, open tags (bug 5497) [Has never passed]
* HTML nested ordered list, open tags (bug 5497) [Has never passed]
* Fuzz testing: image with bogus manual thumbnail [Introduced between 08-Apr-2007 07:15:22, 1.10alpha (r21099) and 25-Apr-2007 07:15:46, 1.10alpha (r21547)]
* Inline HTML vs wiki block nesting [Has never passed]
* Mixing markup for italics and bold [Has never passed]
* dt/dd/dl test [Has never passed]
* Images with the "|" character in the comment [Has never passed]
* Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 493 of 511 tests (96.48%)... 18 tests failed!
On 30/04/07, Daniel Kinzler <daniel(a)brightbyte.de> wrote:
> > I don't know if anyone has any suggestions to deal with this kind of
> > problem, or even if it has already been recognised as an issue. The
> > non-technical answer is, of course, to copy the commons images to the local
> > wiki, but that kind of defeats the point of commons, doesn't it?
> It's commons policy not to delete images as long as the images are used
> anywhere, if the deletion is not urgent (copyvios and policy-vios can be deleted
> right away). Some people don't seem to get that into their head, though.
If it's not right there on the image page, they won't see it.
What needs to be done for it to show up right there on the image page?