Wikitech-l September 2005

wikitech-l@lists.wikimedia.org

126 participants
160 discussions

Edit conflict
by Philipp 19 Sep '05

19 Sep '05

Hello, in the past few days the following problem occurred in a local mediawiki installation: When opening an article for edit, everything seems in order. Showing the preview also works and the change applied to the article is shown. But when trying to submit, one gets the other-user-edits-conflict-warning and in the preview field there is the old unchanged version. This happens with every article (as far as tested) independently whether that respective article being edited by someone else or not. Writing a new article works, trying to edit it just afterwards leads to the described problem. Already did a rebuildall which took all night (2500 articles), but the problem remains the same. Is that something known and what am I missing here? Greetings Philipp

5 7

Wiki dump organization
by Nick Jenkins 19 Sep '05

19 Sep '05

> Nick Jenkins wrote: > >>(Also the SQL dump output needs to actually be tested before we dedicate > >>a few gigs to it.) > > > > I'm happy to be a guinea pig. Just give tell me where to get it from, > > and I'll leave it importing overnight and report back any > > errors/problems and whether it worked or not. > > I haven't had a chance to upload some test conversions yet but I'd > appreciate it if you'd give a quick test of the program: > http://leuksman.com/misc/mwdumper-preview.zip > > This .zip contains the compiled converter program in its present state, > with a bundled executable for Linux/x86 and as CIL assemblies for any > other OS. > > The Linux binary does *not* require a Mono installation; I've tested it > on Ubuntu Hoary and Fedora Core 1. (It does require glib2, which is > probably installed by default on any modern Linux.) // O/S is Debian Woody 3.0r6 GNU/Linux, with a backport of the 2.6.7 kernel, so it's semi-modern (i.e. not truly ancient, but definitely not cutting edge either - more kind of middle-aged, with a few grey hairs, and driving a station wagon or mini-van). // Tried running initially - got error loading libgmodule-2.0.so.0 : ludo:/home/nickj/wikipedia/new-dumper# wget http://leuksman.com/misc/mwdumper-preview.zip ludo:/home/nickj/wikipedia/new-dumper# unzip mwdumper-preview.zip ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview# ls README cil doc linux-i386 ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview# cd linux-i386/ ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# ls mwdumper ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# ./mwdumper ./mwdumper: error while loading shared libraries: libgmodule-2.0.so.0: cannot open shared object file: No such file or directory ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# strace ./mwdumper execve("./mwdumper", ["./mwdumper"], [/* 18 vars */]) = 0 uname({sys="Linux", node="ludo", ...}) = 0 brk(0) = 0x857c000 open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=10970, ...}) = 0 old_mmap(NULL, 10970, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40014000 close(3) = 0 open("/lib/libpthread.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`C\0\000"..., 1024) = 1024 fstat64(3, {st_mode=S_IFREG|0644, st_size=102172, ...}) = 0 old_mmap(NULL, 81316, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40017000 mprotect(0x40024000, 28068, PROT_NONE) = 0 old_mmap(0x40024000, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xd000) = 0x40024000 close(3) = 0 open("/lib/libm.so.6", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\2007\0"..., 1024) = 1024 fstat64(3, {st_mode=S_IFREG|0644, st_size=130088, ...}) = 0 old_mmap(NULL, 132708, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4002b000 mprotect(0x4004b000, 1636, PROT_NONE) = 0 old_mmap(0x4004b000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1f000) = 0x4004b000 close(3) = 0 open("/lib/i686/mmx/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/i686/mmx", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/lib/i686/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/i686", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/lib/mmx/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/mmx", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/lib/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 open("/usr/lib/i686/mmx/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/usr/lib/i686/mmx", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/usr/lib/i686/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/usr/lib/i686", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/usr/lib/mmx/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/usr/lib/mmx", 0xbfffee94) = -1 ENOENT (No such file or directory) open("/usr/lib/libgmodule-2.0.so.0", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/usr/lib", {st_mode=S_IFDIR|0755, st_size=16384, ...}) = 0 writev(2, [{"./mwdumper", 10}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"libgmodule-2.0.so.0", 19}, {": ", 2}, {"cannot open shared object file", 30}, {": ", 2}, {"No such file or directory", 25}, {"\n", 1}], 10./mwdumper: error while loading shared libraries: libgmodule-2.0.so.0: cannot open shared object file: No such file or directory ) = 129 _exit(127) = ? ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# // Based on your message about needing glib2, looked for and installed this: ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/cil# apt-get install libglib2.0-0 Reading Package Lists... Done Building Dependency Tree... Done The following NEW packages will be installed: libglib2.0-0 0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded. Need to get 288kB of archives. After unpacking 723kB will be used. Get:1 http://debian.ihug.com.au woody/main libglib2.0-0 2.0.1-2 [288kB] Fetched 288kB in 5s (51.0kB/s) Selecting previously deselected package libglib2.0-0. (Reading database ... 24745 files and directories currently installed.) Unpacking libglib2.0-0 (from .../libglib2.0-0_2.0.1-2_i386.deb) ... Setting up libglib2.0-0 (2.0.1-2) ... ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# // tried running again: ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# ./mwdumper ./mwdumper: /lib/ld-linux.so.2: version `GLIBC_2.3' not found (required by ./mwdumper) ./mwdumper: /lib/libpthread.so.0: version `GLIBC_2.3.2' not found (required by ./mwdumper) ./mwdumper: /lib/libc.so.6: version `GLIBC_2.3.2' not found (required by ./mwdumper) ./mwdumper: /lib/libc.so.6: version `GLIBC_2.3' not found (required by ./mwdumper) ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# dpkg -S /lib/ld-linux.so.2 libc6: /lib/ld-linux.so.2 ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# dpkg -s libc6 Package: libc6 Status: install ok installed Priority: required Section: base Installed-Size: 12760 Maintainer: Ben Collins <bcollins(a)debian.org> Source: glibc Version: 2.2.5-11.8 Replaces: ldso (<= 1.9.11-9), timezone, timezones, gconv-modules, libtricks, libc6-bin, netkit-rpc, netbase (<< 4.0) Provides: glibc-2.2.5-11.8 Suggests: locales, glibc-doc Conflicts: strace (<< 4.0-0), libnss-db (<< 2.2-3), timezone, timezones, gconv-modules, libtricks, libc6-doc, libc5 (<< 5.4.33-7), libpthread0 (<< 0.7-10), libc6-bin, libwcsmbs, apt (<< 0.3.0), libglib1.2 (<< 1.2.1-2), libc6-i586, libc6-i686, libc6-v9, netkit-rpc Conffiles: /etc/default/devpts fc857c5ac5fb84d80720ed4d1c624f6e Description: GNU C Library: Shared libraries and Timezone data Contains the standard libraries that are used by nearly all programs on the system. This package includes shared versions of the standard C library and the standard math library, as well as many others. Timezone data is also included. ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# // I.e. only have glibc 2.2.5, need >= 2.3.2. :-( // Upgrading currently isn't really viable as the machine's reason for existence is to be a test box configured in the same way as a production server (i.e. should have the same versions of software), and currently the production machine works fine, so the current feeling is "don't fix what ain't broken". > If you're on > Linux and the binary does or doesn't work, please let me know. (If it > fails, please include output of 'ldd mwdumper'.) ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# ldd mwdumper ./mwdumper: /lib/ld-linux.so.2: version `GLIBC_2.3' not found (required by ./mwdumper) ./mwdumper: /lib/libpthread.so.0: version `GLIBC_2.3.2' not found (required by ./mwdumper) ./mwdumper: /lib/libc.so.6: version `GLIBC_2.3.2' not found (required by ./mwdumper) ./mwdumper: /lib/libc.so.6: version `GLIBC_2.3' not found (required by ./mwdumper) libpthread.so.0 => /lib/libpthread.so.0 (0x40017000) libm.so.6 => /lib/libm.so.6 (0x4002b000) libgmodule-2.0.so.0 => /usr/lib/libgmodule-2.0.so.0 (0x4004d000) libdl.so.2 => /lib/libdl.so.2 (0x40051000) libgthread-2.0.so.0 => /usr/lib/libgthread-2.0.so.0 (0x40054000) libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x40059000) libc.so.6 => /lib/libc.so.6 (0x400bc000) /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000) ludo:/home/nickj/wikipedia/new-dumper/mwdumper-preview/linux-i386# All the best, Nick.

2 1

Using MediaWiki to do Wiki source -> HTML as a straight string transformation?
by Nick Jenkins 19 Sep '05

19 Sep '05

> > Can I please ask a unusual question: Is there some way to get > > MediaWiki to render a page from just the wiki source, and no database? > > If you do a bunch of hacking into the internals, probably... > > > and without having to delve deeply into the internals of how MediaWiki > > works. > > d'oh! ;) I actually did something like this with MediaWiki 1.4 (gross hacks on the internals to get a Wiki-string to HTML string conversion without requiring database access), but I did not enjoy the experience (in particular the use of globals, plus the dependency tree in which a file would include another file or two, which would include another, etc., plus getting the initialization order right was non-trivial), and honestly it never worked reliably (from memory it worked sometimes but not always, almost certainly due to something I stuffed up with a hack). I was hoping it might have changed in 1.5 :-( In case anyone ever feels tempted to repeat the experiment, I started from an RC of MediaWiki 1.4, and needed these files: ./DefaultSettings.php ./languages ./languages/LanguageUtf8.php ./languages/Language.php ./languages/Names.php ./languages/LanguageEn.php ./Parser.php ./includes ./includes/User.php ./includes/Utf8Case.php ./includes/WatchedItem.php ./includes/Skin.php ./includes/SkinStandard.php ./includes/Image.php ./includes/Feed.php ./includes/RecentChange.php ./includes/SkinPHPTal.php ./includes/LogPage.php ./includes/GlobalFunctions.php ./includes/DatabaseFunctions.php ./includes/UpdateClasses.php ./includes/Database.php ./includes/CacheManager.php ./includes/Title.php ./includes/UserUpdate.php ./includes/ViewCountUpdate.php ./includes/SiteStatsUpdate.php ./includes/LinksUpdate.php ./includes/SearchUpdate.php ./includes/UserTalkUpdate.php ./includes/SquidUpdate.php ./includes/Namespace.php ./includes/MagicWord.php ./includes/LinkCache.php ./includes/Article.php I also modified some of the above (sorry, I can't easily provide a diff - I just remember every time I ran into an error with an undefined variable either bypassing it or hard coding it or making it use a conditional isset or adding another include - and I kept repeating this until eventually the errors stopped). (From the last-modified dates on the files, the modified/hacked files were probably: Parser.php, Title.php, Skin.php, Language.php, SkinPHPTal.php, Namespace.php, GlobalFunctions.php, and DatabaseFunctions.php) To tie it all together I needed a file like this which would initialize things in the right order, supply dummy functions to cut out things I didn't need, include required files, and so forth: ludo:~nickj/wiki/HTML-validation# cat master.php <?php // report any errors at all error_reporting (E_ERROR | E_WARNING | E_PARSE | E_CORE_ERROR); /* ** @desc: FakeMemCachedClient imitates the API of memcached-client v. 0.1.2. ** It acts as a memcached server with no RAM, that is, all objects are ** cleared the moment they are set. All set operations succeed and all ** get operations return null. */ class FakeMemCachedClient { function add ($key, $val, $exp = 0) { return true; } function decr ($key, $amt=1) { return null; } function delete ($key, $time = 0) { return false; } function disconnect_all () { } function enable_compress ($enable) { } function forget_dead_hosts () { } function get ($key) { return null; } function get_multi ($keys) { return array_pad(array(), count($keys), null); } function incr ($key, $amt=1) { return null; } function replace ($key, $value, $exp=0) { return false; } function run_command ($sock, $cmd) { return null; } function set ($key, $value, $exp=0){ return true; } function set_compress_threshold ($thresh){ } function set_debug ($dbg) { } function set_servers ($list) { } } // we don't want any kind of profiling function wfProfileIn( $fn = '' ) {} function wfProfileOut( $fn = '' ) {} function wfGetProfilingOutput( $s, $e ) {} function wfProfileClose() {} // because Debian woody doesn't have a high enough version of LIBXML to enable XML, which means we have no 'utf8_encode'... function utf8_encode($x) { return $x; } // define required for include files to work OK. define("MEDIAWIKI",true); // initialize the IP global. $IP = ""; define( "DB_READ", -1 ); # Read from the slave (or only server) define( "DB_LAST", -3 ); # Whatever database was used last // include default settings require_once ("DefaultSettings.php"); // initialize $wgMemc global (needed by languages). $wgMemc = new FakeMemCachedClient(); // include MagicWord, needed for the Parser.php to work OK. Should come before language.php to avoid errors. $wgMagicWords = array(); require_once("includes/MagicWord.php"); // include Namespace, needed for the Parser.php to work OK. Should come before language.php to avoid errors. require_once("includes/Namespace.php"); // Setup languages, needed to get us the $wgLang global, which we need. require_once("languages/Language.php"); require_once("languages/LanguageUtf8.php"); $wgLangClass = 'LanguageUtf8'; $wgLang = new LanguageUtf8(); require_once("includes/GlobalFunctions.php"); // include Skin, needed for the User.php to work OK. require_once("includes/Skin.php"); // include User, needed for the Parser.php to work OK. require_once("includes/User.php"); // include LinkCache, needed for the Parser.php to work OK. require_once("includes/LinkCache.php"); $wgLinkCache = new LinkCache(); // include Article, needed for the Parser.php to work OK. require_once("includes/Article.php"); // set up the parse the output require_once("Parser.php"); $parserOptions = new ParserOptions(); $mParserOptions = $parserOptions->newFromUser( $temp = NULL ); // create a Parser object $parser = new Parser(); // supply a blank title $title = NULL; // make up some text for test purposes $text = "A [[test]] ''blah''"; // Generate some output, but as an object. $parserOutput = $parser->parse( $text, $title, $mParserOptions ); // convert the output of the parser to a string. $output = $parserOutput->mText; print $output; ?> > You can't guarantee that without doing template inclusions, though as > for instance template inclusions can be embedded in HTML attribute > values (yyyuuuccckkkkk!) and mistakes there are a likely source of > borken HTML output. That's true, but I just wanted something simple to catch most HTML cock-ups, and was willing to accept a small percentage of false-positives. > For now we run parsed output through > the HTML Tidy library for an additional cleanup pass on Wikipedia; this > is optional in MediaWiki and requires either the tidy executable or the > PHP extension form. Ah, OK, interesting. I too was experimenting with using the PHP tidy extension to do the above checking, but I wanted the errors and their solutions in order to make a list of them (like the lists used in Wiki Syntax) so that interested people could fix the data, whereas I presume you folks just throw the errors away. On one hand running the output through tidy as you currently do is good because tidy can be updated as required to detect and fix new errors, and because it means web browsers will get nice clean output (but only if you're using MediaWiki to transform the wiki string into HTML), and because tidy seems fairly quick, but on the other hand maybe it's slightly bad because it's a run-time solution with added overhead for something can be fixed once in the data. However, that doesn't really matter from my perspective. Basically you folks are already fixing this problem automatically, which means I don't have to concern myself with this problem any more. To quote Keith Packard: "this problem is now being fixed by my favourite person - someone else!" :-) All the best, Nick.

1 0

loading data
by Jim Mahoney 18 Sep '05

18 Sep '05

Nick, I've been trying to get on our web site and it keeps telling me that I cannot find a server or the web site is down. Either way, it doesn't work. Could you please e-mail me with an answer so that I can work with it today. Thanks, Jim James E. Mahoney, Jr. Media Matchmaker, Inc. 13101 W. Washington Blvd Suite 211 Culver City, CA 90066 Office: (310) 432-6388 Fax: (310) 432-6383 jmahoney(a)mediamatchmaker.com

1 0

Re: RSS feed not working for Google Personal
by Jamie Bliss 18 Sep '05

18 Sep '05

Phil Boswell wrote: > [cross-posted to wikien-l and wikitech-l, adjust follow-ups as necessary] [also cross-posted] > I just tried to add the RecentChanges RSS feed to my Google Personal page at > http://www.google.com/ig and was met by a rude response. > > I copied the link from [[Special:RecentChanges]] ( > http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=rss ) > and pasted it into the appropriate box. > > The response was as follows: > Your search > http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=rss > was blocked by that feed's robots.txt. > > What's going on? try http://en.wikipedia.org/wiki/Special:Recentchanges?feed=rss -- -- Jamie ------------------------------------------------------------------- http://endeavour.zapto.org/astro73/ Thank you to JosephM for inviting me to Gmail! Have lots of invites. Gmail now has 2GB.

2 1

Authentication (..again)
by Stanislav Traykov 17 Sep '05

17 Sep '05

Hi, We're building a site which includes MediaWiki and some other tools (Drupal, BSCW, etc.) and needs unified log-in. I've seen some posts and code contributions on other types of authentication (not using AuthPlugin but outright replacing Special:Userlogin) but they were a bit too specific for some purpose and changed the main code a lot. So I thought up yet another way to do things: why not enclose User::loadFromSession() in Setup.php with an if: if(!something($wgUser)) { // declare as something(&$user) to get ref $wgUser = User::loadFromSession(); } This will permit an extension/plugin to load $wgUser from somewhere else: say HTTP auth or a different set of cookies, perhaps creating the local MediaWiki user on-the-fly. The extension can also choose to let User::loadFromSession() do the work by returning false or dump some error message and exit. After some tests, I think this little change is all that's needed to get an alternative log-in to work. The question is what should something() really be? 1) wfRunHooks(..) might seem like a natural choice, but extention functions are yet to be executed at this stage, so an extention wouldn't have had a chance to register its hooks. 2) a new global, $wgUserLoadHook, is perhaps a good alternative 3) a new method, $wgAuth->userLoad($wgUser), is what I used though, because overriding some AuthPlugin methods is all that was necessary to get it working. --- Working example: http://stan.gmuf.com/wik/authpatch/ contains a patch and two plugins: one is a generic HttpAuthPlugin that can be extended, the other is a skeleton for a plugin extending HttpAuthPlugin. 1) cd MediaWiki's dir; patch -p1 </path/to/patch 2) download the two plugins to include/ 3) put require_once('BscwAuthPlugin.php') in LocalSettings.php 4) put MediaWiki's dir in a http auth protected realm 5) test: when you log in via http auth, MediaWiki recognizes you (perhaps creating your local user on-the-fly). Note: As expected, Special:Userlog{in,out} will not work. The skin could hide the links. --- What do you think, does this patch seem like a good solution which could go into MediaWiki? Greetings, Stanislav

2 1

Wiki dump organization
by Nick Jenkins 17 Sep '05

17 Sep '05

> If you can narrow down the request a bit that makes it more likely we'll > slip something into the backup script. :) > [...snip...] > Is that actually what people want? Well, given the disk space limitation, straight off the bat we know that people can't have everything they want. In particular, the all-revisions version in SQL would be way too large I suspect (extra >= 40 gig). That only leaves SQL versions of pages_current.xml.gz, and pages_public.xml.gz. If there is space for both then that would be ideal, but I understand that may be asking too much. Personally, although I don't want the talk pages, others might, so if there's only space for one of these two, then I think a SQL version of pages_current.xml.gz is the way to go (i.e. current revisions of all pages), because it would be applicable for the widest possible audience. > Why do that when MediaWiki comes with an import tool built-in? ;) Because I'm not importing it into MediaWiki. I'm importing it into a database, to then run non-MediaWiki software analysing the data - in particular: looking for bad wiki syntax ( http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wiki_Syntax ), suggesting new useful redirects and disambigs ( http://en.wikipedia.org/wiki/User:Nickj/Redirects ), and searching for good potential wiki-links ( http://en.wikipedia.org/wiki/User:LinkBot - [although for this one I haven't worked out how to get those suggestions out to page authors in a really satisfactory way]). I want to improve the Wikipedia, not mirror it. > In which version? Ideally something that works with MySQL 3.23.49, but if that's too old then something that works with MySQL 4.0.24 instead. > What about everyone > who wants something slightly different? But the database dumps have never tried to be all things to all people. Rather they've been snapshots of the various Wikipedias at semi-regular intervals of time, which you can load into a database (specifically MySQL, but if you can get it to work in another RDBMS, then more power to you). I'm not asking for something entirely new, rather I'm asking for an equivalent replacement for what we already had. > (Also the SQL dump output needs to actually be tested before we dedicate > a few gigs to it.) I'm happy to be a guinea pig. Just give tell me where to get it from, and I'll leave it importing overnight and report back any errors/problems and whether it worked or not. All the best, Nick.

3 2

Using MediaWiki to do Wiki source -> HTML as a straight string transformation?
by Nick Jenkins 17 Sep '05

17 Sep '05

Can I please ask a unusual question: Is there some way to get MediaWiki to render a page from just the wiki source, and no database? I.e.: Input: wiki string (e.g. This ''is'' a '''test'''. [[Axiom]]atic. [[Wiki Web|test]].") Output: HTML string (e.g. "This is</is> a test. <a href="?Axiom">Axiomatic</a>. <a href="?Wiki_Web">test</a>."). Is there some way to do this using the MediaWiki software? Ideally I want just the wiki rendering engine, without any of the other stuff, and without having to delve deeply into the internals of how MediaWiki works. I'm happy for all links to either be edit links or not (i.e. no reason to check whether a page exists or not). Also happy for any templates or other inclusions in wiki strings to be ignored, so it's a straight string transformation, and so that there should be no reason to connect to or use a database. The reason I'm asking is I'm trying to scope out if it's possible to pass in the wiki source of an article, convert it to HTML, and then HTML validate that output, so as to find invalid HTML and mis-uses of HTML in wiki articles. ( Erik Zachte did something along these lines previously with invalid <table> attributes - please see http://en.wikipedia.org/w/index.php?title=Wikipedia:WikiProject_Wiki_Syntax… for some examples of output that he found previously that have been fixed ). All the best, Nick.

2 1

Wiki dump organization
by Nick Jenkins 17 Sep '05

17 Sep '05

> This would triple the disk space requirements for the data dumps > (quadruple after the next major upgrade, quintuple the time after > that...) Surely it should only double the disk space requirements? XML format dumps I would say are the same size, or possibly even slightly larger, than SQL dumps. After all, the main content is the article text, and that's the same in both (apart from some extra slashes in SQL), but you lose the overhead of the XML gumpf. I'd be surprised if wasn't a wash, or close enough to as makes no real-world difference. Besides, most people I think probably don't want every revision ever. Nor do they probably want talk pages. In other words, one extra file, namely the SQL version of pages_public.xml.gz, whose size is going to almost the same. For EN, the largest of all, that's only ~ 900 megs. For 900 meg it stops people whining. > you can transform to whatever local format you > need. (And we provide software for you to do that if you like.) What most people need is get it into a database for further work, and the fact there's software for this at all shows there's demand for it. And what's the point of every user who wants an SQL dump downloading the XML version, downloading mwdumper, downloading mono, setting up mono, running mwdumper, and creating the dump? Wouldn't it make more sense to run the conversion software as part of a general fortnightly database dump cron job that did all the XML stuff, then took the XML file, converted it to SQL, and compressed it? That way the problem is solved once, in one place, forever, for all users who want SQL format. > and maybe a couple people might use some of them every once in Au contraire - most people who want dumps will use them all the time! Tell you what: If you don't believe me, try making one, uploading it, and then _next_ dump add a README that says "SQL dumps have been discontinued due to a lack of interest and demand from users. If you disagree, please address your comments to Brion on the wikitech-l mailing list (email: wikitech-l(a)wikimedia.org)". And then see what happens. :-) > > Also, can we please have back the "is_redirect" field in the XML (and > Hmm, can probably do that yeah. Sounds great, thank you! All the best, Nick.

2 2

Pagerights in MediaWiki
by Sascha Brockmeyer 16 Sep '05

16 Sep '05

Hi! I have a problem with my wikimedia page. I want to safe all existing and all in future time existing articel. How can i safe articels in a way, that only SYSOPS can change anything or create new texts. Sry for the worse english. ----- Hi, wie kann ich bei meinem MediaWiki schützen, das nur SysOps Articel erstellen und bestehende auch nur von SysOps bearbeitet werden können? Thanks Sascha Brockmeyer Brockmeyer.S(a)atlas.de

2 1

← Newer
1
...
4
5
6
7
8
9
10
...
16
Older →

Jump to page:

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Wikitech-l September 2005