I know this is a bit of a tangent, but since you mentioned parsing, I'd like to go in that direction. Parsoid represents a page as a mix of HTML and JSON (RDFa), with the template details held in the JSON parts. There are good tools for processing HTML documents and searching for specific nodes based on the tree structure. There are tools for working with RDFa too, but that ecosystem is much sparser (see https://rdfa.info/tools). As far as I know, there are no tools that let you do queries like:
because that crosses between the HTML and RDFa domains. Writing such a query is easy with many existing tools that use either XPath or CSS selector syntax.
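
To make the two-domain problem concrete, here is a rough sketch in Python of what doing this by hand might look like: walk the HTML, then decode the JSON inside an attribute and filter on that. It assumes Parsoid marks transclusions with typeof="mw:Transclusion" and puts the template details in a data-mw attribute shaped like {"parts": [{"template": {"target": {"wt": ...}, "params": ...}}]}. The sample markup, the template name "Convert", and the names TransclusionFinder and find_transclusions are all made up for illustration; this is not an existing tool.

```python
import json
from html.parser import HTMLParser


class TransclusionFinder(HTMLParser):
    """Collect (tag, template name, params) for nodes marked as transclusions."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # HTML side: pick nodes by attribute, roughly what a CSS selector
        # like [typeof~="mw:Transclusion"] would do.
        types = (attr.get("typeof") or "").split()
        raw = attr.get("data-mw")
        if "mw:Transclusion" not in types or not raw:
            return
        # JSON side: the template details are a JSON string inside the
        # attribute value (layout assumed, see the note above).
        data = json.loads(raw)
        for part in data.get("parts", []):
            # Parts that are not dicts are skipped.
            if isinstance(part, dict) and "template" in part:
                tpl = part["template"]
                name = tpl["target"]["wt"].strip()
                params = {k: v.get("wt", "") for k, v in tpl.get("params", {}).items()}
                self.found.append((tag, name, params))


def find_transclusions(html, template_name):
    """Return the transclusions in `html` whose template is `template_name`."""
    finder = TransclusionFinder()
    finder.feed(html)
    return [hit for hit in finder.found if hit[1] == template_name]


# Hypothetical sample markup, hand-written for this example.
SAMPLE = (
    '<p>Intro text</p>'
    '<span typeof="mw:Transclusion" '
    'data-mw=\'{"parts":[{"template":{"target":{"wt":"Convert"},'
    '"params":{"1":{"wt":"5"},"2":{"wt":"km"}},"i":0}}]}\'>5 km</span>'
)

hits = find_transclusions(SAMPLE, "Convert")
print(hits)
```

The point is that the selection on the HTML side and the filter on the JSON side are two separate steps glued together in code, rather than one query expression.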