I'm doing a test run of the new data dump script on our Korean cluster;
currently jawiki (ja.wikipedia.org) is in progress:
http://amaryllis.yaseo.wikimedia.org/backup/jawiki/20060118/
Any comments on the page layout and information included in the progress page?
A couple notes:
* The file naming has been changed so that each file includes the database name and date.
This should make it easier to figure out what the hell you just downloaded.
* The directory structure is different; the database names are used instead of
the weird mix of sites, languages, and database names that made it hard to get
the scripts running reliably. Each database has subdirectories for each day it was
dumped, plus a 'latest' subdirectory with symbolic links to the files from the
last completed dump (see the example layout after this list).
* I renamed 'pages_current' and 'pages_full' to 'pages-meta-current' and
'pages-meta-history'. In addition to the big explanatory labels, this should
emphasize that these dumps contain metapages such as discussion and user pages,
distinguishing them from the pages-articles dump.
* I've discontinued 7-Zip compression for the current-versions dumps, since it
doesn't do better than bzip2 for those. 7-Zip files are still generated for the
history dump, where the format compresses significantly better (about 3 GB vs. 11 GB for enwiki).
* Upload tarballs are still not included at the moment.
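Roughly, the jawiki tree then looks something like this (the file and link names
shown here are just to give the flavor of the scheme, not a definitive listing):

jawiki/
    20060118/
        jawiki-20060118-pages-articles.xml.bz2
        jawiki-20060118-pages-meta-current.xml.bz2
        jawiki-20060118-pages-meta-history.xml.7z
        ...
    latest/
        jawiki-latest-pages-articles.xml.bz2 -> ../20060118/jawiki-20060118-pages-articles.xml.bz2
        ...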
The backup runner script is written in Python, and is in our CVS in the 'backup'
module, should anyone feel like laughing at my code.
A few more things need to be fixed up before I start running it on the main
cluster, but it's pretty close! (A list of databases in progress, some locking,
emailing me on error, and finding the prior XML dump to speed dump generation.)
-- brion vibber (brion @ pobox.com)
A component milestone of Wikidata/WiktionaryZ development is true
multi-language support for page titles and content on the MediaWiki
level. We have funding to pay a developer to implement it (I'll be
working on versioning and object/relational mapping in the meantime),
and I have written the first set of specifications for this:
http://meta.wikimedia.org/wiki/Multilingual_MediaWiki
Note that this is in no way a proposal to merge existing Wikimedia
projects; in fact, MediaWiki should be fully backwards compatible and
continue to act exactly as it does now in monolingual installations
(there'll be a couple of new features, such as persistent UI language
selection for anonymous users). However, multilanguage support should be
enabled for wikis like Commons and Meta, where multiple languages are
handled in a single database.
I think I've come up with a fairly clever way of dealing with the
community issues of language filtering, but see for yourself. In any
event, I'd much appreciate feedback on this proposal before we go ahead
with the implementation, especially from Brion as the release manager. :-)
Best,
Erik
Inside an extension module:
what is the correct way to _disable caching for that specific page_
(for example, dynamic integration of texts from another source, which might
change)?
Hi,
I tried the following changes to LocalSettings.php, renamed index.php to index.cgi, and added the interpreter line to index.cgi.
$wgScript = "$wgScriptPath/index.cgi";
$wgRedirectScript = "$wgScriptPath/redirect.php";
## If using PHP as a CGI module, use the ugly URLs
#$wgArticlePath = "$wgScript/$1";
$wgArticlePath = "$wgScript?title=$1";
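The interpreter line I mean is the line at the top of index.cgi, of this form (the
path to the PHP CGI binary here is just an example and may differ on the server):

#!/usr/bin/php
<?php
# ... the rest of the original index.php follows ...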
I get the following message when I open the site with a browser:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, rzwebmaster(a)rz.uni-karlsruhe.de and inform them of the time the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
Any idea?
Thanks, MB
We've been having a problem with some method calls failing mysteriously on the
servers, which seems worse under PHP 5 with the APC opcode cache.
Tim's filed a bug report here, which has got some feedback likening it to a
known problem with mixed early binding and late binding:
http://pecl.php.net/bugs/bug.php?id=6503
Since the PHP documentation doesn't cover this early / late binding issue at
all, I want to make sure we actually know what it is. :)
Poking about I found this commit notice which has some relatively clear
explanations: http://news.php.net/php.pecl.cvs/4288
As I understand it, the problem is situations like this...
A.php:
class A { ... }
B.php:
require_once 'A.php';
class B extends A { ... }
Early-binding scenario:
require_once 'A.php';
require_once 'B.php'; // A is already defined when *compiling* B
Late-binding scenario:
require_once 'B.php'; // A is loaded only when B is *run*
The Zend bytecode compiler emits different code depending on whether the base
class was known at compile time, so APC's caching can get confused when you pop
back and forth between these cases.
Our problem is that whether A or B (or C or D) gets loaded is a runtime
decision, and the code that loads one or the other doesn't know about B's
dependencies.
The cases where this seems to frequently come up are:
* Localization handlers, where we have some classes inherit code from a similar
language or variant ("LanguageZh_tw extends LanguageZh")
* Skins, where some are slight variants on a base skin
One possible quickie workaround is to toss in another include which loads the
dependencies that aren't guaranteed to be preloaded, and hit it first:
LanguageZh_tw.php:
require_once 'LanguageZh.php';
class LanguageZh_tw extends LanguageZh { ... }
LanguageZh_tw.deps.php:
require_once 'LanguageUtf8.php';
require_once 'LanguageZh.php';
Setup.php:
...
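# suppress warnings: not every language class will have a .deps.php file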
wfSuppressWarnings();
include_once("$IP/languages/$wgLangClass.deps.php");
include_once("$IP/languages/$wgLangClass.php");
wfRestoreWarnings();
...
Still wrapping my head around this issue, but I think this makes some kind of
sense...
-- brion vibber (brion @ pobox.com)
Dear all,
After installation of the MediaWiki software, I changed the default
"English" language to Dutch by changing $wgLanguageCode in
LocalSettings.php AND by setting $wgUseDatabaseMessages=false in
DefaultSettings.php (the latter had to be done in order to see the changes,
as indicated in the MediaWiki FAQ).
My problem is that because $wgUseDatabaseMessages=false, I cannot display
the page Special:Allmessages anymore, nor can I change the content of
the navigation bar. As a solution for this problem, the FAQ answer proposes
running the script rebuildMessages.php in the maintenance folder. I tried this
by opening an FTP program, selecting the remote file, and pressing "execute",
but without any results.
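If I understand the FAQ correctly, the script probably has to be run from a
command line with the PHP interpreter rather than through FTP, roughly like this
from the wiki's maintenance directory (I have not been able to test this):

php rebuildMessages.php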
Does anyone know how I can fix this problem so that I can see the page
Special:Allmessages and change the content of the navigation bar while I
still have my wiki site in the Dutch language?
thanks in advance!
I've been fiddling a bit with the (Atom) recent changes feed at
http://en.wikipedia.org/w/wiki.phtml?title=Special:Recentchanges&feed=atom.
I'm interested in whether a tool like
http://en.wikipedia.org/wiki/User:CryptoDerk/CDVF could use the feed,
so that the diff for each edit could be quickly viewed inside the tool, rather
than linked to externally.
It seems though that the combination of the default window size (50)
and the apparent refresh rate of the feed in the cache (about 20
seconds) means that there are changes that will fall through the gap
once you have more than about 2.5 changes a second (50 entries / 20
seconds), which seems to happen fairly often now. This means right now
it's not really usable in a tool like CDVF (and perhaps not usable in general).
What are the future plans for this feed? Can it (will it) be feasibly
maintained in a useful state as the edit rate grows further? No doubt
there are several approaches that could be taken (steadily increase
window size, refresh cache more often, split feed out into namespaces,
...), but all involve placing greater load on the server.
Adrian
I was recently asked by the Esperanto press for visitor statistics for
the year. I realize that gathering the statistics is a very traffic-consuming
process, but could we at least allow it to happen once a year so we can
see how much traffic we're really getting? Or are we purging the logs
regularly?
Thanks,
Chuck
Hello
I've modified the TouchGraph applet for MediaWiki visualizations. Two menus
have been added to select the visualization parameters: 1) the types of links to
show and 2) the size of nodes (in relation to the counter, size, or modifications
of each article).
The semantic wiki content is updated daily by a cron job.
You can download the whole package and install it on your own MediaWiki.
The running visualization is here:
http://tecfax.unige.ch/portails/mediawiki/index.php/Special:WikiViz
You will find the download here:
http://tecfax.unige.ch/portails/mediawiki/index.php/Special:SemanticWiki
Comments and further development welcome!
Urs Richle
Hello
In order to visualize the semantic content of MediaWiki, I wrote a
WebService as a MediaWiki extension. With this WebService it is
possible to create independent visualizations outside of MediaWiki with the
technology of your choice.
The WebService exposes the semantic net content of MediaWiki through three
methods:
1) getTopicNames()
This method returns the different topic names of the MediaWiki platform.
Topics can be of different types: existing or wanted category, existing or
wanted article, author, image. The method returns an array: (topicTypeName
=> array(array(name, url))).
2) getTopicLinkage($term)
This method returns an array with the topic name, its type, the number of
links to this topic, the URL, and an array with all links from this topic to
other topics (each link is an array(name, type, url)); the link types can be
existing or wanted category, existing or wanted article, author, image. If no
topic matches the term sent, NULL is returned.
3) getTopicMap()
This method returns the semantic content of the MediaWiki platform in the XTM
(Topic Maps) standardised format. You can read the returned topic map with
applications like TMNav and TMBrows from TM4J.org or Omnigator from
ontopia.net. The returned topic map file is an XML file respecting the XTM
DTD, with the file extension ".xtm".
MediaWiki-WebService uses the NuSOAP library:
http://sourceforge.net/projects/nusoap
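As a rough sketch of how a client could talk to these methods with NuSOAP (the
endpoint URL, the parameter name for $term, and the example topic below are
assumptions on my part, not taken from the package):

<?php
require_once 'nusoap.php';   // NuSOAP library (sourceforge.net/projects/nusoap)

// Endpoint URL is hypothetical -- use the one documented in the WebService package.
$endpoint = 'http://example.org/mediawiki/extensions/WebService/service.php';
$client   = new soapclient( $endpoint );

// 1) all topic names, grouped by topic type
$topics = $client->call( 'getTopicNames' );

// 2) linkage info (type, link count, URL, outgoing links) for one topic
$linkage = $client->call( 'getTopicLinkage', array( 'term' => 'MediaWiki' ) );

// 3) the whole semantic net as an XTM topic map (XML string)
$xtm = $client->call( 'getTopicMap' );

if ( $client->getError() ) {
    echo 'SOAP error: ' . $client->getError();
} else {
    print_r( $linkage );
}
?>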
You can get the WebService Package here:
http://tecfax.unige.ch/portails/mediawiki/index.php/Special:SemanticWiki
You will find an example of a WebService client here:
http://tecfax.unige.ch/portails/mediawiki/extensions/WebService_CLIENT/
Comments and further development welcome!
Urs Richle