Hi,
I'm getting complaints from users of the parser functions in my Widgets and HeaderTabs extensions that their output is mangled with <p> tags.
Looking into the issues, I identified two separate problems, and I'd be very happy if you could confirm that I'm correct and help me resolve them.
First issue: The output of every parser function is prefixed with "\n\n" in Parser.php line 2975 (on the current stable 1.12 branch). This purposefully forces a closing </p><p> combination, which contradicts users' expectation that a parser function placed inline will actually render inline. Here's the code:
    # Replace raw HTML by a placeholder
    # Add a blank line preceding, to prevent it from mucking up
    # immediately preceding headings
    if ( $isHTML ) {
        $text = "\n\n" . $this->insertStripItem( $text );
    }
This is quite distracting since there is no way to work around this in extensions or page text.
Second issue: If the output of the function has line breaks in it, it gets populated with lots of <p> tags, which is not desirable when the extension is supposed to preserve HTML structure (e.g. the Widgets extension). I found instructions on how to avoid this by using unique markers and the 'ParserAfterTidy' hook: http://www.mediawiki.org/wiki/Manual:Tag_extensions#How_can_I_avoid_modifica... Please let me know whether this is still going to work correctly with the new parser implementation.
I'll greatly appreciate your help with the matter.
Thank you,
Sergey
On Sat, Jun 21, 2008 at 12:36 PM, Sergey Chernyshev wikitech-l@antispam.sergeychernyshev.com wrote:
I'm getting complaints from users using my Widgets and HeaderTabs extensions' parser functions that their output is mangled with <p> tags.
In Widgets.php, it sounds like maybe you want to replace:
return array($output, 'noparse' => true, 'isHTML' => true, 'noargs' => true);
with
return $parser->insertStripItem($output, $parser->mStripState);
There was a similar issue here:
http://www.mediawiki.org/w/index.php?title=Extension_talk:Icon&diff=prev...
Joshua
Sergey said:
First issue: Output of all parser functions is preceded with "\n\n" in Parser.php line 2975 (on current stable 1.12 branch) which purposefully forces closing </p><p> combination which contradicts with users expectations that it will actually be inline if they put parser function inline. Here's the code: --SNIP-- This is quite distracting since there is no way to work around this in extensions or page text.
You're correct that it's quite distracting, but fortunately there is a workaround - more on that in a second.
Joshua said:
In Widgets.php, it sounds like maybe you want to replace:
return array($output, 'noparse' => true, 'isHTML' => true, 'noargs' => true);
It would be nice if that would work, but unfortunately, it's not the silver bullet you might expect. MediaWiki inserts the aforementioned newlines regardless of whether the parser function utilizes the alternative array return type.
I have studied this problem in depth[1], and my standing recommendation is to use the parser's insertStripItem() function thusly:
return $parser->insertStripItem( $output, $parser->mStripState );
This will bypass the newline insertion altogether. Good luck!
[1] http://jimbojw.com/wiki/index.php?title=Raw_HTML_Output_from_a_MediaWiki_Par...
-- Jim
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Mon, Jun 23, 2008 at 1:02 PM, Jim R. Wilson wilson.jim.r@gmail.com wrote:
I have studied this problem in depth[1], and my standing recommendation is to use the parser's insertStripItem() function thusly:
return $parser->insertStripItem( $output, $parser->mStripState );
That's what I suggested! Based on your original example, of course. ;-)
Joshua
Oh ha! Sorry about that - I read down to the part about noparse/isHTML and started formulating my response.
My apologies buddy - teaches me to shoot first and read later :)
-- Jim
I have studied this problem in depth[1], and my standing recommendation is to use the parser's insertStripItem() function thusly:
return $parser->insertStripItem( $output, $parser->mStripState );
That's what I suggested! Based on your original example, of course. ;-)
Actually I didn't provide the example because I was worried about it in general ;)
BTW, I realized that this problem came up before and I even posted a patch based on somebody else's explanation of the solution to Bugzilla back in August: https://bugzilla.wikimedia.org/show_bug.cgi?id=8997#a0
I'll also think about using insertStripItem - never used it before so I have to read up on it.
Sergey
Actually, it looks like you provided a solution only to the first problem I mentioned (the preceding <p> tag), and yes, it works, thank you!
The second problem still exists, though. The patch I mentioned ( https://bugzilla.wikimedia.org/show_bug.cgi?id=8997#a0) resolves that second issue, where the output is still parsed as wiki-text, so <p>s are inserted in the HTML wherever two consecutive newlines were used. I wonder if there is any way to avoid that without patching MW code.
Sergey
-- Sergey Chernyshev http://www.sergeychernyshev.com/
Hello again,
It may surprise you to find out that I have also investigated that problem in depth. :)
There are two things playing against you, even if you use the insertStripItem() method. One is the whitespace list processing (which you've discovered), and the second is Tidy.
If you must have pure, unadulterated HTML from a parser function, you'll effectively want to "hide" data from the parser, then re-introduce it later.
You can do this by:
1) Have your parser function output an encoded HTML comment (protected as usual by the insertStripItem() trick)
2) Create a function hooking 'ParserAfterTidy' which looks for any encoded HTML comments and replaces them with the desired content.
Basically you want to combine the techniques you already know with the one explained in this article: http://jimbojw.com/wiki/index.php?title=Raw_HTML_Output_from_a_Parser_Extens...
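A minimal sketch of the two steps above, under stated assumptions rather than as the code from the linked article: the function names and the "RAWHTML:" prefix are hypothetical, and base64 is just one possible encoding, chosen because its alphabet contains no newlines, markup, or "--" sequences for the parser or Tidy to touch. The MediaWiki wiring is shown only in comments, so the helpers themselves run as plain PHP.

```php
<?php
// Hypothetical helpers for illustration; not part of MediaWiki or Widgets.

// Step 1: wrap raw HTML in an HTML comment with a base64 payload.
// In the parser function you would then return (as suggested in this thread):
//   return $parser->insertStripItem(
//       wfExampleEncodeRawHtml( $output ), $parser->mStripState );
function wfExampleEncodeRawHtml( $html ) {
    return '<!-- RAWHTML:' . base64_encode( $html ) . ' -->';
}

// Callback used below: decode the payload captured from one comment.
function wfExampleDecodeMatch( $matches ) {
    return base64_decode( $matches[1] );
}

// Step 2: handler meant for the 'ParserAfterTidy' hook, registered with
//   $wgHooks['ParserAfterTidy'][] = 'wfExampleRestoreRawHtml';
// It replaces every encoded comment in $text with the HTML it carries.
function wfExampleRestoreRawHtml( &$parser, &$text ) {
    $text = preg_replace_callback(
        '/<!-- RAWHTML:([A-Za-z0-9+\/=]*) -->/',
        'wfExampleDecodeMatch',
        $text
    );
    return true; // allow any other hook handlers to run
}
```

Because the restore step runs after Tidy, whatever the comment carried reaches the page unmodified; the flip side is that the extension alone is responsible for that HTML being well-formed and safe.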
Good luck!
-- Jim
Jim, sorry for being so slow on this thread - it looks like I just missed your post.
Thanks a lot for the advice! I did read this article at some point, but I forgot about it when the time came to fix the issues.
Looks like I'll have to do more hardcore MediaWiki coding to make it happen without users patching their installations.
Sergey