On Jun 14, 2021, at 10:13 AM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
I'm not talking just about the technical parts, i.e. how the parser sees it, but also about the functional parts—how the template maintainers and users perceive it.
I know this is a bit of a tangent, but since you mentioned parsing, I'd like to go off in that direction. The way Parsoid represents a page is a mix of HTML and JSON (RDFa), with the template details being in the JSON parts. There are good tools for processing HTML documents and searching for specific nodes based on the tree structure. While there are tools for working with RDFa, it's a much sparser ecosystem (see https://rdfa.info/tools). As far as I know, there are no tools that let you do queries like:
Find all the xyz templates with a foo=bar attribute that exist inside a <div class="blah">
because that crosses between the HTML and RDFa domains. Writing such a query is easy with many existing tools that use either XPath or CSS selector syntax.
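
To make the split concrete, here is a rough Python sketch of what you end up writing by hand today. It is only an illustration under some assumptions: the sample markup is a simplified, made-up fragment in the style of Parsoid output (a `typeof="mw:Transclusion"` element carrying a JSON `data-mw` attribute), the fragment is well-formed enough for an XML parser, and `find_templates` is a hypothetical helper name, not part of any existing tool.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, made-up fragment in the style of Parsoid output (an assumption,
# not real page output). Template details live in the JSON data-mw attribute.
SAMPLE = """<body>
<div class="blah">
<span typeof="mw:Transclusion" about="#mwt1" data-mw='{"parts":[{"template":{"target":{"wt":"xyz"},"params":{"foo":{"wt":"bar"}}}}]}'>one</span>
<span typeof="mw:Transclusion" about="#mwt2" data-mw='{"parts":[{"template":{"target":{"wt":"xyz"},"params":{"foo":{"wt":"other"}}}}]}'>two</span>
</div>
<p><span typeof="mw:Transclusion" about="#mwt3" data-mw='{"parts":[{"template":{"target":{"wt":"xyz"},"params":{"foo":{"wt":"bar"}}}}]}'>three</span></p>
</body>"""


def find_templates(root, name, param, value):
    """Hypothetical helper: return the about ids of matching transclusions."""
    hits = []
    # Step 1, HTML domain: a structural query finds transclusion nodes
    # that sit inside <div class="blah">.
    path = ".//div[@class='blah']//*[@typeof='mw:Transclusion']"
    for node in root.findall(path):
        # Step 2, JSON domain: the selector cannot see inside data-mw,
        # so the attribute has to be decoded and walked by hand.
        data = json.loads(node.get("data-mw", "{}"))
        for part in data.get("parts", []):
            if not isinstance(part, dict):
                continue  # skip any non-object parts (e.g. plain strings)
            template = part.get("template")
            if not template:
                continue
            if template.get("target", {}).get("wt", "").strip() != name:
                continue
            if template.get("params", {}).get(param, {}).get("wt") == value:
                hits.append(node.get("about"))
    return hits


matches = find_templates(ET.fromstring(SAMPLE), "xyz", "foo", "bar")
print(matches)  # ['#mwt1']
```

The selector handles the "inside a div" half in one line, but the "xyz template with foo=bar" half needs custom JSON-walking code. Only the first span matches: the second has a different value for foo, and the third is not inside the div.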