On 01/17/2011 03:49 PM, Anthony wrote:
> How would you define a particular sentence, paragraph or section
> of an article? The difficulty of the solution lies in answering
> that question.
I think the definition could vary, and the functionality could
still be useful. The API parameters could be the offset and
length in the given article version, just like substr().
A user interface (depending on skin) could input the offset
and length by point-and-click (region select) or by pointing
at a word and finding the preceding and following blank line.
Some user interfaces might also care about sentence separators.
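
As a rough sketch of the offset/length idea, here it is in Python. The names
paragraph_span and region are made up for illustration and are not part of
any existing API. It assumes the article text is a plain string, that
paragraphs are separated by a blank line, and that the pointed-at position
lies inside a paragraph.

```python
def paragraph_span(text, pos):
    """Return (offset, length) of the blank-line-delimited paragraph around pos.

    Hypothetical helper: assumes paragraphs are separated by exactly "\n\n"
    and that pos points inside a paragraph, not at a separator.
    """
    # Look backwards for the preceding blank line.
    start = text.rfind("\n\n", 0, pos)
    start = 0 if start == -1 else start + 2
    # Look forwards for the following blank line.
    end = text.find("\n\n", pos)
    end = len(text) if end == -1 else end
    return start, end - start


def region(text, offset, length):
    """The substr()-like lookup: the region is just offset plus length."""
    return text[offset:offset + length]


article = "First paragraph.\n\nSecond paragraph,\nstill second.\n\nThird."
# The user points at the word "still"; the UI turns that into offset/length.
offset, length = paragraph_span(article, article.index("still"))
selected = region(article, offset, length)
```

A different skin could compute the offset and length from sentence
separators instead; the lookup by offset and length would stay the same.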
The search could be simplified if each edit preserved some
parameters of the diff, an "edit index", e.g. "inserted 7
characters at offset 4711". Then we know that this edit is
irrelevant if the sought offset is nowhere near 4711 and
as we go back in history, our offset needs to be reduced
by 7 if it is larger than 4711. Doing such offset arithmetic
for a thousand article edits should be a lot faster than
calling diff over and over again. Then again, the diffs are
still needed to build such an edit index in the first place.
This could be done in a one-time conversion or on demand,
using the edit index as a cache of such parameters.
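
To make the offset arithmetic concrete, here is a minimal Python sketch.
Edit, step_back and trace_origin are hypothetical names, and the sketch
assumes that each edit is a single contiguous insertion or deletion, which
real edits often are not. A real edit index would need several such records
per edit.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Edit(NamedTuple):
    """One hypothetical edit-index record: a single contiguous change."""
    offset: int  # where the change starts
    delta: int   # +n: n characters inserted, -n: n characters deleted


def step_back(pos: int, edit: Edit) -> Optional[int]:
    """Map an offset in the newer revision to the older one.

    Returns None if the character at pos was inserted by this edit.
    """
    if pos < edit.offset:
        return pos                  # change lies after us: edit is irrelevant
    if edit.delta > 0 and pos < edit.offset + edit.delta:
        return None                 # pos is inside the inserted text
    return pos - edit.delta         # shift by the size of the change


def trace_origin(pos: int, history: Sequence[Edit]) -> Tuple[Optional[int], int]:
    """Walk back through history (newest edit first).

    Returns (index of the edit that inserted the text, offset at that point),
    or (None, offset in the oldest revision) if no indexed edit inserted it.
    """
    for i, edit in enumerate(history):
        prev = step_back(pos, edit)
        if prev is None:
            return i, pos
        pos = prev
    return None, pos


# Newest first: 3 chars inserted at 100, 7 inserted at 4711, 5 deleted at 20.
history = [Edit(100, 3), Edit(4711, 7), Edit(20, -5)]
found_at, where = trace_origin(4715, history)
```

Each step is a couple of integer comparisons, so walking back through a
thousand edits costs almost nothing compared to running diff a thousand
times.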
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se