I have some general Parsoid questions I hoped someone here might help me with.
The background is that we are doing some preliminary work looking at how
Text-to-Speech might work on Wikipedia (there will be some info online
in the coming weeks).
One detail is that we might occasionally have to mark up specific
words or sentences that should be read differently from how they are
written (e.g. World War III -> World War 3). It is still unclear how
frequent such cases would be, but if they are very frequent there would
likely be push-back from the community against storing this markup in
the normal wikitext.
In that case we would have to store the markup outside of the wikitext,
and any viewing or editing of it would have to happen in some
user-enabled extension of the normal environment.
And here we come to the questions:
1. If we have to store this markup outside of the wikitext, could this
   be done by storing the individual parsoid-data-units?
2. Would it be possible to add these units to the existing parsoid-data
   (which gets loaded from the wikitext) when loading a page?
3. Would it be possible to detect which of these units would be
   affected by edits to the wiki page?
This is still in the early stages, so mainly we are looking at what possibilities exist should we need them. Using Parsoid data was something we thought of as a lightweight alternative to storing a synced copy of the wikitext plus the additional markup.
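
To make question 3 a bit more concrete, here is a minimal sketch of one way externally stored markup could be checked against an edited page. It is purely illustrative and does not use any Parsoid API: it works on plain text, and `SpeechAnnotation`, `locate`, and `split_by_edit` are hypothetical names. The assumption is that each annotation is stored together with a little surrounding context, so that it can be found again after an unrelated edit and flagged when its own text or context has changed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SpeechAnnotation:
    """Hypothetical out-of-band annotation; not a Parsoid data structure."""
    exact: str   # text as written on the page, e.g. "World War III"
    prefix: str  # a few characters of context before the text
    suffix: str  # a few characters of context after the text
    spoken: str  # how the text should be read, e.g. "World War 3"


def locate(text: str, ann: SpeechAnnotation) -> Optional[int]:
    """Return the offset of ann.exact in text, or None if it cannot be found
    together with its stored context."""
    pos = text.find(ann.prefix + ann.exact + ann.suffix)
    if pos == -1:
        return None
    return pos + len(ann.prefix)


def split_by_edit(
    new_text: str, annotations: List[SpeechAnnotation]
) -> Tuple[List[Tuple[int, SpeechAnnotation]], List[SpeechAnnotation]]:
    """Split annotations into those that still match the edited text
    (with their new offsets) and those affected by the edit."""
    kept: List[Tuple[int, SpeechAnnotation]] = []
    affected: List[SpeechAnnotation] = []
    for ann in annotations:
        pos = locate(new_text, ann)
        if pos is None:
            affected.append(ann)
        else:
            kept.append((pos, ann))
    return kept, affected


if __name__ == "__main__":
    ann = SpeechAnnotation(
        exact="World War III", prefix="after ", suffix=" ended",
        spoken="World War 3",
    )
    # An edit elsewhere in the sentence: the annotation is found again.
    print(split_by_edit("The 1990 novel is set after World War III ended.", [ann]))
    # An edit to the annotated text itself: the annotation is flagged.
    print(split_by_edit("The novel is set after World War Three ended.", [ann]))
```

This simple version only returns the first match, so repeated phrases would need longer context or an additional position hint. Whether Parsoid's own data could play the role of these anchors is exactly what we are asking.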