Hi Roy,
Sorry it took a while before we could respond.
On 9/4/20 12:40 PM, Roy Smith wrote:
I know there's been a ton of work done on Parsoid
lately. This is
great, and the amount of effort that's gone into this functionality is
really appreciated. It's clear that Parsoid is the way of the future,
but the documentation of how you get a Parsoid parse tree via an API
call is kind of confusing.
I found
https://www.mediawiki.org/wiki/Parsoid/API, which looks like
it's long out of date. The last edit was almost 2 years ago. As far
as I can tell, most of what it says is obsolete, and refers to a
series of /v3 routes which don't actually exist.
That page is not out of date. It documents Parsoid's own API, which is what
you would use if you were querying Parsoid directly. When we ported
Parsoid from JS to PHP, we ensured that the API routes didn't change.
What changed was the base url of the Parsoid service (so clients could
simply switch this URL in the configuration without having to change
their code).
For example,
http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/<title&…
works if you curl this url on a production server (if you have access).
But this Parsoid API is not accessible on the public internet. However,
wikitech.wikimedia.org's Parsoid API is currently accessible on the
public internet. So, for example:
https://wikitech.wikimedia.org/w/rest.php/wikitech.wikimedia.org/v3/page/ht…
works. So, you can verify that the API routes on the Parsoid wiki page
will work on wikitech.wikimedia.org.
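To make the URL shape concrete, here is a minimal Python sketch that assembles a Parsoid page/html URL from a base URL, a domain, and a title. The helper name and the example title are hypothetical, and the sketch uses only the part of the route that is visible in the URLs above (base URL, domain, then v3/page/html/ and the title); whether a given host actually answers depends on how it is configured.

```python
from urllib.parse import quote


def parsoid_html_url(base_url: str, domain: str, title: str) -> str:
    """Build a Parsoid v3 page/html URL (hypothetical helper).

    Assumes the layout <base_url>/<domain>/v3/page/html/<title>,
    matching the visible part of the URLs quoted in this thread.
    """
    # Percent-encode the title so that characters such as "/" or spaces
    # stay inside a single path segment.
    encoded_title = quote(title, safe="")
    return f"{base_url.rstrip('/')}/{domain}/v3/page/html/{encoded_title}"


# Only the base URL and domain change from one wiki to another.
print(parsoid_html_url(
    "https://wikitech.wikimedia.org/w/rest.php",
    "wikitech.wikimedia.org",
    "Main_Page",  # example title, chosen for illustration
))
```

The point of keeping the base URL as a separate argument is the one made above: a client that moved from the JS service to the PHP one should only need to change that value.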
But, anyway, this is not directly relevant to your use case unless you
are directly contacting a Parsoid service somewhere. In production
Wikimedia wikis, as I said, Parsoid's API isn't public (it wasn't public
with the JS version either). You can only access Parsoid content via
RESTBase's public API which you reference below.
I also found
https://en.wikipedia.org/api/rest_v1/#/Page%20content, which seems
more in line with the current reality.
Yes, this is the API that RESTBase provides. Behind the scenes, it
accesses Parsoid's API when it needs fresh content.
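As a rough illustration of going through the public RESTBase API instead of Parsoid directly, the Python sketch below requests Parsoid HTML for a page via the /page/html/{title} route of the REST v1 API. The function names and the User-Agent string are placeholders chosen for this example, not something the API prescribes.

```python
import urllib.request
from urllib.parse import quote

# Root of the public REST v1 API on English Wikipedia.
REST_V1 = "https://en.wikipedia.org/api/rest_v1"


def restbase_html_url(title: str, api_root: str = REST_V1) -> str:
    """Return the REST v1 URL for the Parsoid HTML of a page."""
    # The title is one path segment, so "/" must be percent-encoded too.
    return f"{api_root}/page/html/{quote(title, safe='')}"


def fetch_parsoid_html(title: str) -> str:
    """Fetch the HTML for a title (performs a network request)."""
    request = urllib.request.Request(
        restbase_html_url(title),
        # Placeholder User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "parsoid-example/0.1 (placeholder contact)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling fetch_parsoid_html("Main_Page") should return the HTML document as a string, assuming the host is reachable from where the script runs.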
But, the call I was most interested in,
/page/data-parsoid/{title}/{revision}/{tid}, doesn't actually
respond (at least not on en.wikipedia.org).
Joaquin already responded to this part (thanks Joaquin), so I'll skip
this here.
I will respond to your other Parsoid HTML questions / comments by
responding to your other post.
Subbu.