There's been some recent discussion in the mediawiki IRC channel about the
Parser's handling of whitespace in tag extension output. Specifically, if
the output of a tag contains leading whitespace or consecutive newlines,
they are replaced with <pre> blocks and </p><p> combos respectively.
I'm planning to fix this, and I was curious which version(s) would receive
the fix when finished. Does this qualify as something which would be
backported, or only in the latest release? Thanks in advance.
-- Jim R. Wilson (jimbojw)
Dear Wiki Devs,
I have a proposal for adding new hooks into the MW codebase, and I'm looking
for feedback (positive and negative). The goal of the proposed changes is
to give Extension Devs the ability to cleanly hijack just about any class in
the system. Here are the details:
Step 1: For each class in MW, create a static factory method for class
instantiation. The factory method would take the same number of arguments
as the class constructor, and under normal circumstances would simply call
the constructor. Here's an example for the UploadForm class
(SpecialUpload.php):
--------------------------------------------------------------
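class UploadForm {
    // ...
    // Static factory: takes the same arguments as the constructor.
    static function instantiate( &$request ) {
        $uploadForm = new UploadForm( $request );
        // Let extensions inspect or replace the new instance.
        wfRunHooks( 'UploadFormInstantiate', array( &$uploadForm, &$request ) );
        return $uploadForm;
    }
}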
--------------------------------------------------------------
Step 2: Replace all calls to the traditional "new Whatever($args)" with
"Whatever::instantiate($args)".
By doing this, extension developers gain the power to modify any class
before it's ever used, possibly even replacing it with a custom class
they've developed.
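To make the idea concrete, here is a rough sketch of how an extension might use such a hook to swap in its own class. Everything here is an assumption for illustration: MyUploadForm, myUploadFormHook and describe() are hypothetical names, and the wfRunHooks() and $wgHooks shown are simplified stand-ins written so the snippet runs on its own, not the real MediaWiki implementations.

```php
<?php
// Simplified stand-in for MediaWiki's hook registry (assumption, not the real code).
$wgHooks = array();

function wfRunHooks( $event, $args = array() ) {
    global $wgHooks;
    if ( !isset( $wgHooks[$event] ) ) {
        return true;
    }
    foreach ( $wgHooks[$event] as $handler ) {
        // References stored in $args are passed through to the handler.
        if ( call_user_func_array( $handler, $args ) === false ) {
            return false;
        }
    }
    return true;
}

class UploadForm {
    public $request;

    function __construct( &$request ) {
        $this->request = $request;
    }

    // Proposed static factory: construct, then let hooks modify or replace.
    static function instantiate( &$request ) {
        $uploadForm = new UploadForm( $request );
        wfRunHooks( 'UploadFormInstantiate', array( &$uploadForm, &$request ) );
        return $uploadForm;
    }

    // Hypothetical method, only here to show which class is in use.
    function describe() {
        return 'stock upload form';
    }
}

// Hypothetical replacement class supplied by an extension.
class MyUploadForm extends UploadForm {
    function describe() {
        return 'custom upload form';
    }
}

// Hypothetical hook handler: replaces the instance through the reference.
function myUploadFormHook( &$uploadForm, &$request ) {
    $uploadForm = new MyUploadForm( $request );
    return true;
}
$wgHooks['UploadFormInstantiate'][] = 'myUploadFormHook';

$request = array( 'wpUpload' => 1 );
$form = UploadForm::instantiate( $request );
echo $form->describe(), "\n"; // prints "custom upload form"
```

One thing the sketch makes visible: because the hook runs after "new UploadForm", the stock constructor still executes before the replacement is created, so any side effects in it would happen regardless.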
What do you think? I'm genuinely interested in hearing what everyone has to
say. Thanks!
(Please note that I haven't tested any code for this yet - so if my syntax
in the example is out of order, I apologize in advance).
-- Jim R. Wilson
> Actually, // and ** are at least as clear, and are most definitely
> parsable by a fixed-lookahead context-free grammar - even an unaugmented
> LL(k) grammar could probably handle it. <i> and <b> are unambiguous, but
> ugly and language-dependent. MediaWiki's current behavior "fixes" many
> of the issues with its ambiguous bold/italics representation with little
> ad-hoc DWIM-type behavior. It works, but cannot be represented by a CFG
> and is difficult to extend.
Another benefit of changing the double single quotes and triple single
quotes to // and ** respectively is that it would be a small step to
making MediaWiki markup more Creole-compatible (www.wikicreole.org).
Also, it seems like a conversion script for just these two elements
would not be that difficult to write. What could be potential
complications?
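
As a starting point for discussion, here is a minimal sketch of what a naive conversion might look like. The function name convertQuoteMarkup is hypothetical, and the approach (two regular expressions that only touch runs of exactly three or exactly two quotes) is an assumption of mine, not anything MediaWiki does today.

```php
<?php
// Naive sketch: convert '''bold''' to **bold** and ''italic'' to //italic//.
// Only runs of exactly three or exactly two apostrophes are converted;
// anything else (for example five quotes) is left untouched.
function convertQuoteMarkup( $text ) {
    // Bold first, so the two-quote pattern never sees a three-quote run.
    $text = preg_replace(
        "/(?<!')'''(?!')(.+?)(?<!')'''(?!')/",
        '**$1**',
        $text
    );
    // Then italics. "." does not match newlines, so a match stays on one line.
    $text = preg_replace(
        "/(?<!')''(?!')(.+?)(?<!')''(?!')/",
        '//$1//',
        $text
    );
    return $text;
}

echo convertQuoteMarkup( "'''bold''' and ''italic''" ), "\n";
// prints "**bold** and //italic//"
```

Even this small sketch hints at some complications: five-quote runs and stray apostrophes next to markup are simply skipped, text inside <nowiki> or <pre> would be converted when it probably should not be, and // already appears in every URL.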
Chuck
On 14/02/07, leon(a)svn.wikimedia.org <leon(a)svn.wikimedia.org> wrote:
> Revision: 19930
> Author: leon
> Date: 2007-02-14 10:26:41 -0800 (Wed, 14 Feb 2007)
>
> Log Message:
> -----------
> * (bug 8988) Added missing $
>
> Modified Paths:
> --------------
> tags/REL1_8_3/phase3/includes/DatabasePostgres.php
> tags/REL1_9_2/phase3/includes/DatabasePostgres.php
WTF?
*No one* except the release manager is supposed to alter *tags*, ever.
If code *has* to be backported, then it gets backported to a branch.
Rob Church
I'm trying to debug an extension on my laptop, and http://localhost/wiki
keeps redirecting to http://<my-computers-hostname>.tamu.edu/wiki, which
doesn't work. How do I stay at localhost? This doesn't seem to happen with
other webpages on my laptop.
Jim
=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.10alpha (r19935).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* TODO: Link containing double-single-quotes '' (bug 4598) [Has never passed]
* TODO: message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* TODO: message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* TODO: HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* TODO: HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* TODO: HTML nested bullet list, open tags (bug 5497) [Has never passed]
* TODO: HTML nested ordered list, open tags (bug 5497) [Has never passed]
* TODO: Inline HTML vs wiki block nesting [Has never passed]
* TODO: Mixing markup for italics and bold [Has never passed]
* TODO: 5 quotes, code coverage +1 line [Has never passed]
* TODO: dt/dd/dl test [Has never passed]
* TODO: Images with the "|" character in the comment [Has never passed]
* TODO: Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* TODO: Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 493 of 511 tests (96.48%)... 18 tests failed!
I tried using mwdumper to load the content of Wikipedia into a MySQL 5
database. I used tables.sql to generate the tables. I then tried writing
the data to MySQL using mwdumper and got the following results.
C:\Downloads>set class=mwdumper.jar;mysql-connector-java-3.0.11-stable-bin.jar
C:\Downloads>set data="C:\Downloads\enwiki-20070206-pages-articles.xml.bz2"
C:\Downloads>java -client -classpath mwdumper.jar;mysql-connector-java-3.0.11-stable-bin.jar org.mediawiki.dumper.Dumper "--output=mysql://127.0.0.1/enwiki?user=xxxx&password=xxxxxxx" "--format=sql:1.5" "C:\Downloads\enwiki-20070206-pages-articles.xml.bz2"
1.000 pages (148,148/sec), 1.000 revs (148,148/sec)
2.000 pages (156,104/sec), 2.000 revs (156,104/sec)
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(Unknown Source)
    at com.mysql.jdbc.EscapeProcessor.escapeSQL(EscapeProcessor.java:151)
    at com.mysql.jdbc.Statement.execute(Statement.java:845)
    at org.mediawiki.importer.SqlServerStream.writeStatement(Unknown Source)
    at org.mediawiki.importer.SqlWriter.flushInsertBuffer(Unknown Source)
    at org.mediawiki.importer.SqlWriter.bufferInsertRow(Unknown Source)
    at org.mediawiki.importer.SqlWriter15.writeRevision(Unknown Source)
    at org.mediawiki.importer.MultiWriter.writeRevision(Unknown Source)
    at org.mediawiki.importer.PageFilter.writeRevision(Unknown Source)
    at org.mediawiki.dumper.ProgressFilter.writeRevision(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.closeRevision(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.endElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
    at org.mediawiki.dumper.Dumper.main(Unknown Source)
--
________________________________________________________________________
Axel Ngonga University of Leipzig, Dpt. Computer Sciences
M.Sc. Business Information Systems Group
http://bis.informatik.uni-leipzig.de
Johannisgasse 26, Room 5-22
D-04103 Leipzig
fon: +49-341-9732341 * fax: +49-341-9732239 * mobile: +49-176-23517631
________________________________________________________________________
Hi all,
the use I had in mind for this feature was in fact translation, rather than printing.
Using a generative-grammar approach allows large economies of scale when translating many articles displaying similar structures.
However, as I realised last night, extracting subsections can easily be dealt with by the use of Find/Replace and wildcards, plus some appropriate macros applied to the exported XML code.
Another idea, which came to me while thinking about how to solve this problem, is that it would be very useful for the templates included within a page to be exported automatically together with that page. Of course, nesting should be suitably limited.
Many thanks for your kind and interesting comments,
Claudi
An automated run of parserTests.php showed the following failures:
This is MediaWiki version 1.10alpha (r19926).
Reading tests from "maintenance/parserTests.txt"...
Reading tests from "extensions/Cite/citeParserTests.txt"...
Reading tests from "extensions/Poem/poemParserTests.txt"...
18 still FAILING test(s) :(
* URL-encoding in URL functions (single parameter) [Has never passed]
* URL-encoding in URL functions (multiple parameters) [Has never passed]
* TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed]
* TODO: Link containing double-single-quotes '' (bug 4598) [Has never passed]
* TODO: message transform: <noinclude> in transcluded template (bug 4926) [Has never passed]
* TODO: message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed]
* BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed]
* TODO: HTML bullet list, unclosed tags (bug 5497) [Has never passed]
* TODO: HTML ordered list, unclosed tags (bug 5497) [Has never passed]
* TODO: HTML nested bullet list, open tags (bug 5497) [Has never passed]
* TODO: HTML nested ordered list, open tags (bug 5497) [Has never passed]
* TODO: Inline HTML vs wiki block nesting [Has never passed]
* TODO: Mixing markup for italics and bold [Has never passed]
* TODO: 5 quotes, code coverage +1 line [Has never passed]
* TODO: dt/dd/dl test [Has never passed]
* TODO: Images with the "|" character in the comment [Has never passed]
* TODO: Parents of subpages, two levels up, without trailing slash or name. [Has never passed]
* TODO: Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed]
Passed 493 of 511 tests (96.48%)... 18 tests failed!