Zwinger went down last night with some sort of disk problem. Although we've
removed most of the dependencies on Zwinger's NFS, a few had escaped notice and
it took us a while to get things balanced right again.
* /home/wikipedia/htdocs was used via a redirect from /usr/local/apache/htdocs
(not really used for anything; it just broke the Apache restart. This should be
moved to an empty local directory.)
* /home/wikipedia/conf/php*.ini was symlinked instead of copied around, so PHP
was misconfigured until we got the files copied back.
* math renderings are on amane, but mounted through /home/wikipedia/ someplace.
This made the directory inaccessible and complicated the unmount process.
There were some other annoyances such as having /home/wikipedia/bin in the PATH.
In general, having the NFS go down is a big pain in the ass to recover from; with
a lot of things hanging on it, it's virtually impossible to unmount and remount
another server short of rebooting.
After rebooting to clear up the broken mounts, we had to spend time fixing the
above glitches and some cache inconsistencies, so the databases were locked for
a couple of extra hours while we sorted that out.
The /home NFS server has been moved to suda; zwinger is now back online doing
mail and DNS until we get that moved off.
A few weeks ago we tried removing the /home mount from Amane to protect it
against Zwinger failures. This has been working pretty well so far, so what I'd
like to see us do is get rid of the /home mounts entirely from most servers. We
don't _really_ need them, though they're convenient when working.
This will remove the temptation to be sloppy and use files off of NFS instead of
syncing them with the other source and config directories.
The main point where this could be difficult is the debugging and monitoring
logs we write from the wiki; we might store those on another server, like the
upload server, find NFS mount options that fail gracefully and quickly, or
switch to a log server of some kind (eg syslog).
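The syslog option decouples log writes from any one mount entirely. Here is a
minimal sketch of the idea in Python for illustration (the host, port, and
facility are placeholder assumptions, not the actual setup):

```python
import logging
import logging.handlers

# Placeholder address: point this at whichever box ends up hosting logs.
LOG_HOST = ("127.0.0.1", 514)

logger = logging.getLogger("wiki-debug")
logger.setLevel(logging.DEBUG)

# Each record goes out as a single UDP datagram, so a down log server
# degrades gracefully: messages are dropped rather than blocking writes
# the way a hung NFS mount does.
handler = logging.handlers.SysLogHandler(
    address=LOG_HOST,
    facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
)
handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
logger.addHandler(handler)

logger.debug("profiling: example query took 120ms")
```

The key property is the transport, not the library: fire-and-forget datagrams
mean a log-server outage costs some log lines, not a hung apache.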
-- brion vibber (brion @ pobox.com)
An automated run of parserTests.php showed the following failures:
Running test BUG 361: URL within URL, not bracketed... FAILED!
Running test External links: invalid character... FAILED!
Running test Bug 2702: Mismatched <i> and <a> tags are invalid... FAILED!
Running test A table with no data.... FAILED!
Running test A table with nothing but a caption... FAILED!
Running test Link containing "#<" and "#>" % as a hex sequences... FAILED!
Running test Magic links: PMID incorrectly converts space to underscore... FAILED!
Running test Template with thumb image (wiht link in description)... FAILED!
Running test Link to image page... FAILED!
Running test BUG 1887: A ISBN with a thumbnail... FAILED!
Running test BUG 1887: A <math> with a thumbnail... FAILED!
Running test BUG 561: {{/Subpage}}... FAILED!
Running test Simple category... FAILED!
Running test Section headings with TOC... FAILED!
Running test Media link with nasty text... FAILED!
Running test Bug 2095: link with pipe and three closing brackets... FAILED!
Running test Sanitizer: Validating the contents of the id attribute (bug 4515)... FAILED!
Passed 264 of 281 tests (93.95%) FAILED!
Thanks for the suggestions, everyone. I figured out my own approach
that works pretty well, adapted from the javascript in the Wikipedia
widget for Mac OS X. It's a data scraping approach that pulls in a
page from $url and strips off the header and footer. The advantage to
this approach is that it doesn't require any knowledge of the
MediaWiki codebase. Here's the basic PHP code:
$article = file_get_contents( $url );
echo processRawHTML( $article );

function processRawHTML( $article ) {
    // Keep everything from the page heading to the end-of-content marker,
    // then re-close the content <div> that the cut left open.
    $start = strpos( $article, '<h1 class="firstHeading">' );
    $end = strpos( $article, '<!-- end content -->' );
    $article = substr( $article, $start, $end - $start ) . '</div>';
    return $article;
}
Of course, this needs some further elaboration, such as a search-and-
replace to convert local hyperlinks to their full URLs.
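That link-rewriting step can be sketched along the same lines (shown in Python
for illustration; the base URL is a placeholder for whatever wiki is being
scraped):

```python
import re

def absolutize_links(html, base="https://en.wikipedia.org"):
    """Rewrite site-relative hrefs (e.g. /wiki/Foo) into full URLs.

    The base URL is a placeholder; substitute the wiki being scraped.
    """
    return re.sub(
        r'href="(/[^"]*)"',
        lambda m: 'href="%s%s"' % (base, m.group(1)),
        html,
    )

print(absolutize_links('<a href="/wiki/Main_Page">Main Page</a>'))
# -> <a href="https://en.wikipedia.org/wiki/Main_Page">Main Page</a>
```

Only hrefs beginning with "/" are touched, so already-absolute links pass
through unchanged.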
--Sheldon Rampton
On 2/11/06, Brion Vibber <brion(a)pobox.com> wrote:
>
> If you'd like to try rewriting [Sanitizer::removeHTMLtags()] so it balances end tags
> properly, detects illegal nesting cases, and understands MathML, that would be
> super awesome.
I thought a bit about it and I came to the conclusion that I don't
quite understand what Sanitizer::removeHTMLtags() is supposed to do.
Firstly, I was wrong about the MathML: Sanitizer::removeHTMLtags() does not
need to understand MathML, because by the time it is called, Parser::strip()
has already replaced the <math> tags with placeholder strings.
But the important point is: How can Sanitizer::removeHTMLtags()
balance tags? Consider the input
----
First line. <s> Struck through
More text.
Another paragraph.
----
There is an unclosed <s> tag here, so removeHTMLtags() should close
it. If it does the same as HTML Tidy, it adds </s> after "More text.",
before the </p> implied by the empty line.
Okay, that's fine. But now consider
----
* First line. <s> Struck through
* More text.
Another paragraph.
----
In this case, the </s> has to be put after "Struck through".
I think this means that removeHTMLtags() can only work if it parses
the text according to (a subset of) the wiki-grammar. But that seems a
bit messy from the design point of view. The alternative is to call
removeHTMLtags() later, at the same time when HTML Tidy is called
(this is what I wrongly thought to happen).
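To make the balancing idea concrete, the core mechanism is a stack of open
inline tags that gets flushed at every block boundary. A minimal sketch in
Python, treating each line as one block, which is exactly the simplification
the examples above show is wrong for real wikitext:

```python
import re

def balance_tags(lines, closable=("s", "i", "b")):
    """Close still-open inline tags at each block boundary.

    For simplicity each input line is treated as one block; the real
    sanitizer would have to use the wiki grammar's own block boundaries
    (paragraphs, list items, table cells, ...), which is the messy part.
    """
    out = []
    for line in lines:
        stack = []
        for m in re.finditer(r"<(/?)([a-z]+)>", line):
            closing, name = m.group(1), m.group(2)
            if name not in closable:
                continue
            if not closing:
                stack.append(name)
            elif stack and stack[-1] == name:
                stack.pop()
        # Flush any unclosed tags before the block ends.
        out.append(line + "".join("</%s>" % t for t in reversed(stack)))
    return out

print(balance_tags(["* First line. <s> Struck through", "* More text."]))
# -> ['* First line. <s> Struck through</s>', '* More text.']
```

The stack handling is trivial; all of the difficulty is in deciding where the
block boundaries are, which is the grammar-awareness problem described above.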
By the way, I'm still interested to hear why you want to get rid of
HTML Tidy. Is it performance?
Cheers,
Jitse
I'm getting the following error while using the mwdumper from 2006-Feb-01.
tail load-progress-err
7 pages (1.396/sec), 1,000 revs (199.362/sec)
Exception in thread "main" java.io.IOException: XML document
structures must start and end within the same entity.
at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
I can't figure out what might be wrong with the XML being input, and
since the error doesn't give an offset in the stream, I'm not sure how
to troubleshoot it.
Here's the cmd line:
bzcat enwiki-20060125-pages-meta-history.xml.bz2 | ./load.sh .. more
stuff, but nothing that knows anything about XML. ;-)
where load.sh is:
#!/bin/bash
java -server -jar ../tools/mwdumper.jar \
--output=stdout \
--format=xml \
--filter=exactlist:1000_random_titles \
--filter=namespace:0 \
--progress=1000 \
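Since the exception carries no offset, one way to narrow it down is to stream
the same input through a parser that does report position. A sketch using
Python's stdlib SAX parser (the bz2 usage at the end is illustrative, not
tested against the full dump):

```python
import io
import xml.sax

class _NullHandler(xml.sax.ContentHandler):
    """Discard all content; we only care whether parsing succeeds."""

def check_well_formed(stream):
    """Return None if the XML parses cleanly, else (line, column, message)."""
    try:
        xml.sax.parse(stream, _NullHandler())
        return None
    except xml.sax.SAXParseException as e:
        return (e.getLineNumber(), e.getColumnNumber(), e.getMessage())

# A broken document reports where it failed:
print(check_well_formed(io.BytesIO(b"<page><title>x</title></pag>")))

# Against the real dump it would be something like:
#   import bz2
#   check_well_formed(bz2.open("enwiki-20060125-pages-meta-history.xml.bz2"))
```

A full-history dump takes a long time to stream this way, but at least the
failure comes back with a line and column instead of just "must start and end
within the same entity".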
If anyone has any suggestions, I'd be grateful.
Thanks,
Jeremy
Hello,
I've set up MediaWiki successfully on one other XP box, but my second box can't
get through the config:
Checking environment...
PHP 5.1.2: ok
PHP server API is apache2handler; ok, using pretty URLs (index.php/Page_Title)
Have XML / Latin1-UTF-8 conversion support.
PHP is configured with no memory_limit.
Have zlib support; enabling output compression.
Neither Turck MMCache nor eAccelerator are installed, can't use object caching
functions
GNU diff3 not found.
Couldn't find GD library or ImageMagick; image thumbnailing disabled.
Installation directory: C:\Program Files\Apache Group\Apache2\htdocs\wiki
Script URI path: /wiki
Warning: $wgSecretKey key is insecure, generated with mt_rand(). Consider
changing it manually.
Trying to connect to MySQL on localhost as root...
MySQL error 2003: Can't connect to MySQL server on 'localhost' (10061)
I can log into MySQL with the same parameters using MySQL Administrator
(localhost). What might MediaWiki be doing differently?
Thanks in advance for any tips.
Regards,
Chris
I've upgraded Mailman to 2.1.7, which came out at the end of last year when I
wasn't looking.
It's still on zwinger for now, but I've changed the installation paths from
/home/mailman to /usr/local/mailman and removed the old /home/mailman symlink to
make sure things don't interfere with the switch of the /home NFS server.
-- brion vibber (brion @ pobox.com)
I've disabled mod_perl on ticket.wikimedia.org, since OTRS seems violently
opposed to working correctly under it. As far as I could tell, it would start
passing the request information from some previous request to the script, so
you'd often get what someone else had loaded earlier instead of what you were
asking for.
(This is similar in effect to a bug we had long, long ago when using a separate
perl script as a RewriteMap on Wikipedia; it would sometimes get out of sync and
send users to the address a different client had asked for.)
I also applied this patch: http://isleo.kapsi.fi/pub/otrs/
to suppress some whiny lines in the session file decoder.
-- brion vibber (brion @ pobox.com)
Hi SJ.
We are three people. I manage the project, and there are also two other
people translating texts: a translator and a Basque philologist.
Please write if you want to join forces.
Gero arte,
Miguel
On Tue, 2006-02-14 at 21:24, SJ wrote:
> This is cool, Miguel! How many of you are there working on it?
>
> Cheers,
> SJ
>
> On 2/14/06, Miguel A. Cuesta <cuesta(a)alianzo.com> wrote:
> > Hi.
> >
> > We are working on the translation of MediaWiki into Basque (Euskara). We
> > are working with the latest 'Language.php' file, and have made a lot of
> > progress.
> >
> > If anybody wants to collaborate, don't hesitate to drop me a line.
> >
> > Regards. Gero arte.
> >
> > --
> > Miguel A. Cuesta
> > Alianzo Networks
> > "We make social networks"
> >
> > tel: (+34) 944 371 684 - skype me: cuesta-alianzo
> > mailto:cuesta@alianzo.com - IM/MSN: cuesta(a)alianzo.com
> >
> >
> > _______________________________________________
> > Wikitech-l mailing list
> > Wikitech-l(a)wikimedia.org
> > http://mail.wikipedia.org/mailman/listinfo/wikitech-l
> >
>
>
> --
> ++SJ
--
Miguel A. Cuesta
Alianzo Networks
"We make social networks"
tel: (+34) 944 371 684 - skype me: cuesta-alianzo
mailto:cuesta@alianzo.com - IM/MSN: cuesta(a)alianzo.com
> Where can I modify and rename the label
> [edit]
> which is on each section of a page ?
For the "edit" at the top of the page, assuming you're running 1.5, go
to your wiki, and open "MediaWiki:Edit". Change it from "Edit" to
"Modify" (or whatever you want to call it).
For the edit in section headers, change "MediaWiki:Qbedit".
No idea what "MediaWiki:editsection" changes, but you should probably
change it too.
I just found these by trial and error, but does anyone know if they
are documented anywhere?
All the best,
Nick.