New Project, Link Hooks... Needing some research - Wikitech-l

15 Aug 2008


I've noticed a growing number of extensions extending link syntax (namely 
SMW's annotations, and other extensions using Embed:, Video:, or 
theoretically even Audio: namespaces for embedding things).
However, all of these implementations have serious issues. We have internal 
link parsing, but when an extension does something like this it's 
customary to use a regex rather than duplicate a small part of the 
parser. This normally leads to either a limited syntax that falls short of 
what the parser does, or a regex so complex it causes server errors when 
the syntax is a bit broken (e.g. a missing trailing ]] ).
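
To make the "limited syntax" failure concrete, here is a minimal sketch in 
Python (not MediaWiki's PHP). The pattern SIMPLE_LINK is a made-up example of 
the kind of regex an extension might use, not one taken from any real 
extension.

```python
import re

# Hypothetical, deliberately simple link pattern: [[target]] or [[target|text]],
# with no brackets allowed inside the display text.
SIMPLE_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def find_links(text):
    """Return (target, display_text) pairs for every match of SIMPLE_LINK."""
    return [(m.group(1), m.group(2)) for m in SIMPLE_LINK.finditer(text)]


# A flat embed is recognised as expected.
flat = find_links("[[Embed:Clip|A caption]]")

# With a link nested in the caption, the outer embed is never matched;
# only the inner link is seen, so the extension silently misses its own syntax.
nested = find_links("[[Embed:Clip|Caption with [[link|text]] inside]]")

print(flat)
print(nested)
```

Loosening the pattern to allow brackets in the display text is what tends to 
produce the heavily backtracking regexes mentioned above.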
For that reason I'm looking into adding a new parser feature: 
Link Hooks. Basically, this would allow an extension to hook into link 
processing for a namespace or a pattern.
I plan to support a number of flags (Link/Media callbacks [link 
modification, vs. embedding], namespace/pattern [ns number, or a special 
pattern (like SMW's ::)], Multi-params [Pipe separated params rather 
than one display text], Recursive parameters [Things like Image: where 
links can be inside parameters], Recursive link text [For patterns which 
break things up and may contain links]) so it should handle most cases.
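
As a rough illustration of how those flags might combine, here is a sketch in 
Python rather than PHP. Everything in it is hypothetical: the flag names, the 
LinkHookRegistry class, and the set_link_hook signature are stand-ins for 
whatever the real setLinkHook ends up looking like.

```python
from enum import Flag, auto


class LinkHookFlag(Flag):
    """Hypothetical flag names mirroring the options listed above."""
    MEDIA = auto()             # embedding callback rather than link modification
    PATTERN = auto()           # key is a special pattern, not a namespace
    MULTI_PARAMS = auto()      # pipe-separated params instead of one display text
    RECURSIVE_PARAMS = auto()  # links may appear inside parameters
    RECURSIVE_TEXT = auto()    # link text may itself contain links


class LinkHookRegistry:
    """Hypothetical registry mapping a namespace or pattern to a callback."""

    def __init__(self):
        self._hooks = {}

    def set_link_hook(self, key, callback, flags=LinkHookFlag(0)):
        self._hooks[key] = (callback, flags)

    def lookup(self, key):
        return self._hooks.get(key)


def embed_callback(target, params):
    # Placeholder output; a real hook would return rendered markup.
    return f"<embed {target} {params}>"


registry = LinkHookRegistry()
registry.set_link_hook(
    "Embed",
    embed_callback,
    LinkHookFlag.MEDIA | LinkHookFlag.MULTI_PARAMS | LinkHookFlag.RECURSIVE_PARAMS,
)

callback, flags = registry.lookup("Embed")
print(LinkHookFlag.MEDIA in flags, LinkHookFlag.PATTERN in flags)
```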
Unfortunately I hit a snag in the code when dealing with 
[[Embedablens:Page|Content with [[link|displaytext]] inside]]. I can't 
provide data to extensions in a sane way. Either plain text is sent to 
them and they work with that (albeit breaking things as usual), or I 
try to split on the |'s, which doesn't work with nested links, or I 
parse the nested links first, but then extensions get a hard-to-work-with 
mess passed to them as their data.
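
The pipe-splitting problem can be shown with a small Python sketch. The helper 
split_top_level is hypothetical and only tracks [[ ]] nesting; it ignores 
templates, nowiki, and everything else the real parser has to care about.

```python
def split_top_level(inner):
    """Split on | only where we are not inside a nested [[...]] pair."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(inner):
        two = inner[i:i + 2]
        if two == "[[":
            depth += 1
            current.append(two)
            i += 2
        elif two == "]]" and depth > 0:
            depth -= 1
            current.append(two)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))
    return parts


# Text between the outermost [[ and ]] of the example above.
inner = "Embedablens:Page|Content with [[link|displaytext]] inside"

naive = inner.split("|")        # cuts the nested link in half
aware = split_top_level(inner)  # keeps the nested link in one parameter

print(naive)
print(aware)
```

Even the nesting-aware split still hands the extension raw wikitext for each 
parameter, which is the first of the three options described above.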
The nice way the preprocessor works with objects has shown me that 
the best approach would probably be to recursively parse the 
text into link objects and then do our expansion, also giving extensions 
access to the tree in special ways (extract as wikitext, HTML, or plain text).
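
A minimal sketch of that idea, in Python rather than PHP. The Link class, the 
parse function, and the as_wikitext / as_plain_text accessors are all 
hypothetical names, HTML extraction is left out, and the sketch does not try 
to reproduce the real parser's rules for which nested links are honoured.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Link:
    target: str
    # Each parameter is a list of nodes: plain strings or nested Link objects.
    params: List[List[Union[str, "Link"]]] = field(default_factory=list)

    def as_wikitext(self):
        parts = [self.target] + [
            "".join(_wikitext(n) for n in p) for p in self.params
        ]
        return "[[" + "|".join(parts) + "]]"

    def as_plain_text(self):
        if not self.params:
            return self.target
        return "".join(_plain(n) for n in self.params[-1])


def _wikitext(node):
    return node if isinstance(node, str) else node.as_wikitext()


def _plain(node):
    return node if isinstance(node, str) else node.as_plain_text()


def _parse_link(text, pos):
    """Parse a link starting at text[pos:pos+2] == '[['.

    Returns (Link, new_pos), or None if the link is never closed.
    """
    pos += 2
    segments, buf = [[]], []

    def flush():
        if buf:
            segments[-1].append("".join(buf))
            buf.clear()

    while pos < len(text):
        if text.startswith("]]", pos):
            flush()
            target = "".join(_wikitext(n) for n in segments[0])
            return Link(target, segments[1:]), pos + 2
        if text.startswith("[[", pos):
            result = _parse_link(text, pos)
            if result is None:
                buf.append("[[")  # unclosed: keep the brackets as literal text
                pos += 2
            else:
                flush()
                link, pos = result
                segments[-1].append(link)
        elif text[pos] == "|":
            flush()
            segments.append([])
            pos += 1
        else:
            buf.append(text[pos])
            pos += 1
    return None


def parse(text):
    """Parse text into a list of strings and Link objects."""
    nodes, buf, pos = [], [], 0
    while pos < len(text):
        result = _parse_link(text, pos) if text.startswith("[[", pos) else None
        if result is None:
            step = 2 if text.startswith("[[", pos) else 1
            buf.append(text[pos:pos + step])
            pos += step
        else:
            if buf:
                nodes.append("".join(buf))
                buf = []
            link, pos = result
            nodes.append(link)
    if buf:
        nodes.append("".join(buf))
    return nodes


source = "[[Embedablens:Page|Content with [[link|displaytext]] inside]]"
tree = parse(source)
print(tree[0].as_wikitext())
print(tree[0].as_plain_text())
```

With a tree like this, an extension could ask for a parameter as wikitext or 
plain text instead of receiving half-parsed output.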
Some research into the way the parser handles links at first 
gave me good results ([[link [[inside of]] link]] nicely gives 
you a link to "inside of" with the outside stuff verbatim, just as the 
processor I have in mind would do). However, I ran into an ugly, sticky mess 
with image embedding.
http://dev.wiki-tools.com/wiki/LinkHook#Old_Tests
(Ignore the fact my examples here don't have the frame option)
[[Image:File.ext|Caption]] Renders as an image with "Caption"
[[Image:File.ext|[[Image:File.ext|Caption]]]] Renders an image inside of 
another image that has a caption of "Caption".
[[Image:File.ext|[[Image:File.ext|[[link]]]]]] Renders [[link]] as a 
link, the rest is completely verbatim.
Honestly, the syntax is inconsistent with itself. If we were trying to 
stop embeds inside of embeds, then the last one should render as an 
image, with a link to [[link]] and the other Image: verbatim as a caption.
I believe there is a bug about the 2nd case; if anyone has it handy I'd 
love a link. I hunted through Bugzilla but couldn't find it.
Some use cases, and what's expected of them, would be nice.
My issue is that Image links are functionally supposed to be the same as 
a setLinkHook using the Media, Multi-params, and Recursive parameters 
options (embed unless there is a : at the start, pipe-separated parameters, 
and parameters can have links inside of them).
However, any extension or other code using setLinkHook with the recursive 
parameters option would expect something different.
[[Embed:Title|[[Otherembed:Title]] and [[link]]]]
Would actually render as an embed, with two links (since it's inside of 
another embed the 'Otherembed' reverts to a link).
And: [[Embed:Title|[[Otherembed:Title|[[link]]]]]]
Would actually render as an embed, with a link to [[link]] and the rest 
of the caption verbatim.
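
One reading of those two expectations, sketched in Python over a hand-built 
tree. The Link class, the EMBED_NAMESPACES set, and the <embed>/<link> 
placeholder output are all assumptions made for illustration; they are not 
real parser output.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Link:
    target: str
    text: List[Union[str, "Link"]] = field(default_factory=list)


# Hypothetical namespaces registered as embeds.
EMBED_NAMESPACES = {"Embed", "Otherembed"}


def is_embed(link):
    return link.target.split(":", 1)[0] in EMBED_NAMESPACES


def contains_link(nodes):
    return any(isinstance(n, Link) for n in nodes)


def render(nodes, inside_embed=False):
    out = []
    for n in nodes:
        if isinstance(n, str):
            out.append(n)
        elif is_embed(n) and not inside_embed:
            # Top-level embed: render it, processing its caption as nested.
            out.append(f"<embed {n.target}>{render(n.text, True)}</embed>")
        elif contains_link(n.text):
            # A link (or demoted embed) wrapping another link stays verbatim;
            # only the inner link is rendered.
            out.append(f"[[{n.target}|{render(n.text, inside_embed)}]]")
        else:
            # Plain link, or an embed demoted to a link inside another embed.
            label = render(n.text, inside_embed) or n.target
            out.append(f"<link {n.target}>{label}</link>")
    return "".join(out)


# [[Embed:Title|[[Otherembed:Title]] and [[link]]]]
first = render([Link("Embed:Title", [Link("Otherembed:Title"), " and ", Link("link")])])

# [[Embed:Title|[[Otherembed:Title|[[link]]]]]]
second = render([Link("Embed:Title", [Link("Otherembed:Title", [Link("link")])])])

print(first)
print(second)
```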
-- 
~Daniel Friesen(Dantman, Nadir-Seen-Fire) of:
-The Nadir-Point Group (http://nadir-point.com)
--It's Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
--Games-G.P.S. (http://ggps.org)
-And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)