Hi,
today I started looking into generating something closer to WikiDom from the
parser in the ParserPlayground extension. For further testing and parser
development, changes to the structure will need to be mirrored in the
current serializers and renderers, which likely won't be used for very long
once the editor integration gets underway.
The serializers developed in wikidom/lib/es seem to be just what would be
needed, so I am wondering if it would make sense to put some effort into
plugging those into the parser at this early stage while converting the
parser output to WikiDom. The existing round-trip and parser testing
infrastructure can then already be run against these serializers.
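A round-trip check against those serializers could look roughly like the sketch below. All names here (roundTrips, the toy parse/serialize pair) are hypothetical stand-ins for illustration, not the actual ParserPlayground or wikidom/lib/es APIs:

```javascript
// Hypothetical sketch of a round-trip check: parse wikitext into a
// WikiDom-like tree, serialize it back, and compare with the input.
function roundTrips(wikitext, parseToWikiDom, serializeWikiDom) {
    var dom = parseToWikiDom(wikitext);
    return serializeWikiDom(dom) === wikitext;
}

// Toy stand-ins for the real parser and serializer: split lines into
// paragraph nodes, then join them back together.
function toyParse(text) {
    return text.split('\n').map(function (line) {
        return { type: 'paragraph', text: line };
    });
}
function toySerialize(dom) {
    return dom.map(function (node) { return node.text; }).join('\n');
}
```

With the toy pair the round trip trivially holds; the interesting cases are of course the wikitext constructs where parsing or serialization is lossy.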
The split codebases make this a bit harder than necessary, so maybe this
would also be a good time to draw up a rough plan for how the integration
should look. Swapping the serializers will soon break the existing
ParserPlayground extension, so a move to another extension or the wikidom
repository might make sense.
Looking forward to your thoughts,
Gabriel
A chapter is being worked on for the next edition of the Open Source
Architecture ORA book, here:
https://www.mediawiki.org/wiki/MediaWiki_architecture_document/text
I suspect there's some more information about current parser project work
that should be in there, but isn't. I'm going to be doing an edit pass
on that page tonight or tomorrow, so if there's info you think should be
in there, you or I should put it in.
Cheers,
-- jra
--
Jay R. Ashworth Baylink jra(a)baylink.com
Designer The Things I Think RFC 2100
Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274
Forwarding -- the rest of the thread is at
http://lists.wikimedia.org/pipermail/wikitech-l/2011-October/thread.html#56…
-------- Original Message --------
Subject: [Wikitech-l] #parser parser function - does this make any sense?
Date: Sat, 29 Oct 2011 02:01:15 +0200
From: Daniel Werner <DanWeEtz(a)web.de>
Reply-To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
To: wikitech-l(a)lists.wikimedia.org
I am thinking about creating a very simple parser function, #parse, that
does nothing but return parameter 1 with a "'noparse' => false" option.
Is there already anything like this (or anything that could be abused for
it), or is there any reason why this might be a bad idea?
The reason I want something like this: I want to create a template (for
template and parser function black-box tests) that accepts something like
{{((}}#somefunction:a{{!}}b{{!}}c{{))}} as a parameter value, shows
{{#somefunction|a|b|c}} as output, and at the same time calls
{{#parse: {{((}}#somefunction:a{{!}}b{{!}}c{{))}} }} so that the template
output shows the result alongside the definition.
regards,
Daniel
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Etherpad'ed notes from today's editor/parser status meeting:
2011-10-26
Attending:
* Visual Editor team
** Trevor
** Inez
* Parser team
** Brion
** Gabriel Wicke (skype)
* Integration
** Neil
* Management
** Erik
** Alolita
== Status check ==
How are we going to
- Write a parser
- Integrate it with the visual editor
- Deploy the visual editor
Parser Status
- Basic parser in place
- Produces intermediate JSON object tree
- Doesn't yet produce Wikidom as it's been defined
- Considering if Wikidom should be changed
- Still working on markup support (mixed HTML and Wikitext)
- Written in JavaScript and is currently in the "ParserPlayground"
- Using PEG for part of the parsing process (primarily tokenizing)
VisualEditor Goals
- Get something working in the wiki environment so people can play with
it
- Ideas for places to integrate first
- New page creation - Potential Use Case
- http://www.mediawiki.org/wiki/Visual_editor/Task_management
== Decemberish test deployment? ==
Distinct stages that could be deployed on sites for testing:
* Visual editor in the wiki -> lets you edit but nothing else
** ''too early to advertise for anything but interaction testing''
** just need to actually plug it in
* Visual editor in the wiki and lets you save
** ''_could_ start to use as opt-in test for new pages''
** add API module to save, should be easy-ish
* Visual editor in the wiki and lets you load and save.
** ''this would be ideal place to start advertising for testers''
** combining the parser and reconciling the formats
== Visual Editor side ==
VE is in the midst of a core redo to make undo etc. work (internal
representation changes) -- still stabilizing!
VE frontend todos for page saving:
* reintegrate serializers (in progress)
* undo/redo/copy/paste working (mostly there)
** '''Trevor and Inez are already working on this, but it needs to be
finished before we can integrate saving.'''
Parser/VE integration todos (for page loading):
* parser & wikidom specs need to be resynchronized
** possibly needs a translator step
* support for more structures (lists etc)
** '''This is big work and needs a lot of coordination between VE & Parser
groups!'''
** brion & gabriel: can start working with their existing spec -- let's at
least get that working
*** then we'll need to start figuring out if we need to do major changes
based on issues
*** parser & VE may need to evolve together for now!
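The "translator step" mentioned above could be sketched as a recursive walk over the parser's intermediate JSON tree, mapping each node into a WikiDom-shaped node. The node shapes used here are invented for illustration; the real specs on both sides differ:

```javascript
// Hypothetical translator between the parser's intermediate JSON tree
// and a WikiDom-like tree. Node shapes are assumptions, not the specs.
function translateNode(node) {
    // Bare strings become explicit text nodes.
    if (typeof node === 'string') {
        return { type: 'text', text: node };
    }
    // Other nodes keep their type and attributes, with children
    // translated recursively.
    return {
        type: node.type,
        attributes: node.attributes || {},
        children: (node.children || []).map(translateNode)
    };
}
```

A step like this would let the two specs evolve somewhat independently while keeping the VE fed with a consistent shape.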
Lots more front-end work needed in VE ecosystem; we expect to bring on
another hire in that area.
"WikiDom will be the fence between these two neighbors [VE & Parser groups]"
== Parser side ==
*Immediate parser work:*
* get it producing output compatible with VE's current wikidom input
** we do expect to have to evolve this over time! but let's get it together
with the VE first
* make sure there's a clean JS API for the parser so the VE can call it
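The "clean JS API" could be as small as the sketch below; the names are assumptions for illustration, not the actual ParserPlayground surface:

```javascript
// Hypothetical minimal parser API for the VE to call: wrap whatever
// backend exists behind a parse/serialize pair.
function createParser(backend) {
    return {
        // wikitext -> WikiDom-like tree (for loading a page into the VE)
        parse: function (wikitext) {
            return backend.parse(wikitext);
        },
        // WikiDom-like tree -> wikitext (for saving)
        serialize: function (dom) {
            return backend.serialize(dom);
        }
    };
}

// Toy backend so the sketch is runnable.
var toyBackend = {
    parse: function (text) { return { type: 'document', text: text }; },
    serialize: function (dom) { return dom.text; }
};
var parser = createParser(toyBackend);
```

Keeping the surface this small would also make it easy to run the same API under node for batch tests.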
*Next stages parser work:*
* start making sure tables, templates are all working as expected (fixup
stages?)
** at least properly nested templates...
* iterate as necessary
* make sure auto tests are running
** round-trip tests on wikipedia corpus already available
** need to test HTML rendering against MW's current internal parser
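Testing HTML rendering against the current internal parser probably needs some normalization so the comparison doesn't fail on trivial whitespace differences. A sketch, where the normalization rules are assumptions (real comparisons would need more, e.g. attribute order and entity encoding):

```javascript
// Hypothetical helper for comparing the new parser's HTML output with
// the current renderer's: collapse insignificant whitespace first.
function normalizeHtml(html) {
    return html
        .replace(/\s+/g, ' ')   // collapse whitespace runs to one space
        .replace(/> </g, '><')  // drop whitespace between adjacent tags
        .trim();
}

function sameRendering(newHtml, currentHtml) {
    return normalizeHtml(newHtml) === normalizeHtml(currentHtml);
}
```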
*Later stages parser work: (after December)*
* depending on what we've seen from auto test...
* ... redoing the PEG?
* ... redoing the wikidom?
* ... redoing how/when templates are expanded?
* make sure all structures really supported!
* add proper fix-ups similar to what tidy does right now
== Upcoming work ==
* MediaWiki extension for Visual Editor: architecture to be driven by
Trevor; integration with the parser
* RL2 support
* Internationalization support
* Using underscore.js for JS-based parser functionality in the same
extension as the VE
** replace jQuery bits like $.each; underscore should be cleaner and is
easier to use in node for batch tests
* Unknown markup indicator
** at least black-box markers for extension hooks and such
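One concrete difference behind the $.each -> underscore swap mentioned above: jQuery's $.each calls its callback as (index, value), while underscore's _.each uses (value, index, list), which reads more naturally in batch code. Minimal stand-ins (not the real libraries) showing the two conventions:

```javascript
// Stand-ins illustrating the two callback conventions; the real
// libraries do much more (object iteration, `this` binding, etc.).
function jqueryStyleEach(list, cb) {
    for (var i = 0; i < list.length; i++) {
        cb(i, list[i]);           // index first, like $.each
    }
}
function underscoreStyleEach(list, cb) {
    for (var i = 0; i < list.length; i++) {
        cb(list[i], i, list);     // value first, like _.each
    }
}
```

Underscore also runs in plain node without a DOM, which is part of what makes it handy for batch tests.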
Milestones!!!!!!!
* Trevor will check up with Howie
** various VE stuff on its way
* getting VE and parser into a common extension (separate code modules
loaded by it)
* parser integration may not make it by December, but we'll see what's ready
by then
Welcome, Gabriel. I'm looking forward to working with you, and to
seeing you posting on the parser/visual editor list about our progress
on that very important endeavor! cc'ing the parser list (wikitext-l).
Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation
On Tue, Oct 25, 2011 at 1:22 AM, Alolita Sharma <asharma(a)wikimedia.org> wrote:
> Hi All,
>
> Please join me in welcoming Gabriel Wicke as a Software Developer in
> WMF’s Features Engineering team. Gabriel will be working on the
> Visual Editor - Parser project, one of Wikimedia’s high priority
> projects this year. He will be working closely with our guru Brion
> Vibber on extending the parser to support WikiDom interactions with
> the Visual Editor client being developed by lead engineer Trevor
> Parscal, Wikia developer Inez Korczyński and front-end developer Neil
> Kandalgaonkar.
>
> As many of you may already know, Gabriel has been a member of the
> Wikipedia community for many years now. He discovered Wikipedia in
> 2003, when it was still running on two servers. Using his previous
> experience with Squid caching, he got involved in technical
> discussions and hacking. In 2004, Gabriel designed and implemented the
> initial Squid caching layer, and later wrote the MonoBook skin.
>
> After completing his Computer Science degree and doing research in
> transactional distributed systems and Haskell, Gabriel is looking
> forward to more practical challenges at Wikimedia.
>
> Gabriel is an avid sportsman and professional sailor. When he’s not in
> front of a computer coding away, he is often sailing on his own or
> with friends. He was a member of the German national team in the
> Olympic 49er class from 2001-2008, and is now racing an A-Class
> catamaran. Pretty awesome!
>
> Say hello to Gabriel online. He can be found on #mediawiki as gwicke.
>
> Welcome back Gabriel! Great to have you on the Wikimedia Features team :-)
>
> Alolita
> --
> Alolita Sharma
> Director, Features Engineering
> Wikimedia Foundation
>
Dear Wikitext experts,
please, check out Sztakipedia, a new Wiki RTE at:
http://pedia.sztaki.hu/ (please check the video first, and then the tool itself)
which aims at implementing some of the Visions you described here:
http://www.mediawiki.org/wiki/Future/Parser_plan (the RTE part)
Some background:
Sztakipedia did not start out as an editor for Wikipedia. It was meant
to be a web-based editor for UIMA-annotated rich content, supported
by natural-language background processing.
The tool was functional by the end of 2010, and we wanted a popular
application to demonstrate its features, so we went on to apply it to
wiki editing.
To do that, we have built some wiki-specific pieces:
-After checking out many parsers, we created a new one in JavaCC
-Created lots of content helpers based on dbpedia, like the link
recommendation, infobox recommendation, and infobox editor help
-Integrated external resources to help editing, like the Book
Recommendation or Yahoo-based category recommendation
Sztakipedia is right now in its alpha phase, with many show-stoppers
remaining, like handling cite references properly or editing templates
embedded in templates.
I am aware that you are working on a new syntax, parser and RTE, and
that they will eventually become the official ones for Wiki editing
(Sztakipedia is in Java anyway).
However, I still think there is much to learn from our project. We will
write a paper on the subject next month, and I would be honored if some
of you read and commented on it. The main contents will be:
-problematic constructs in the current wikitext syntax that we struggled with
-usability tricks, like extracting the infobox pages to provide help
for the fields, or showing the abstracts of the articles to be linked
-recommendations and machine learning to support the user, plus background theory
Our plan right now is to create an API for our recommendation services
and helpers, and a MediaWiki JS plugin to bring its results into the
current wiki editor. This way I hope the results of this research -
which started out as rather theoretical - will be used in a real-world
scenario by at least a few people. I hope we will be able to extend
your planned new RTE the same way in the future.
Please share your thoughts/comments/doubts about Sztakipedia with me.
I also wanted to ask a few things:
-Which is the most wanted helper feature in your view:
infobox/category/link recommendation? External data import from
Linked Open Data (like our Book Recommender, which currently has
millions of book records in it)? Field _value_ recommendation for
infoboxes from the text? Something else?
-How do you measure the performance of a parser? I saw hints of some
300 parser test cases somewhere...
-What is the best way to mash up external services to support the wiki
editor interface (because if you call an external REST service from JS
in MediaWiki, it will be blocked as cross-site scripting, I'm afraid)?
Thank you very much,
Best Regards
Mihály Héder
MTA Sztaki,
Budapest, Hungary