Is it possible to parse multiple pages at once, or to get your search results as HTML?
For example, I run something like: http://commons.wikimedia.org/w/api.php?format=jsonfm&action=query&ge...
And I get all the results' "revisions": ["*"] content as wikitext ... ideally I could get those results as HTML. Is there any way to do that?
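To make the shape of that concrete (a rough sketch; my actual URL above is cut off, and the search step that produces the titles is left out here, so the helper name and titles are made up):

    // Fetch the current wikitext of several pages in one action=query request;
    // each page comes back with revisions[0]['*'] as raw wikitext, not HTML.
    async function fetchWikitext(apiBase, titles) {
      const url = apiBase + '?action=query&format=json&prop=revisions&rvprop=content'
        + '&titles=' + encodeURIComponent(titles.join('|'));
      const data = await (await fetch(url)).json();
      const wikitext = {};
      for (const page of Object.values(data.query.pages)) {
        wikitext[page.title] = page.revisions[0]['*'];
      }
      return wikitext;
    }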
--michael
Michael Dale wrote:
> Is it possible to parse multiple pages at once, or to get your search results as HTML?
> For example, I run something like: http://commons.wikimedia.org/w/api.php?format=jsonfm&action=query&ge...
> And I get all the results' "revisions": ["*"] content as wikitext ... ideally I could get those results as HTML. Is there any way to do that?
You can't get them all at once, currently, no. You could run them through action=parse one by one, but you seem to imply you already knew that.
I'll look into the feasibility of parsing multiple pages at once (do you want to parse all pages in their entirety, or just parts of them?) when I have time, which probably won't be this week or next (busy times).
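For reference, the one-by-one approach would look roughly like this on the client side (just a sketch; the helper name is made up, and cross-domain requests may need &origin=* or a local proxy):

    // Parse each page individually via action=parse&page=...;
    // parse.text['*'] in each response is that page's rendered HTML.
    async function parsePagesOneByOne(apiBase, titles) {
      const html = {};
      for (const title of titles) {
        const url = apiBase + '?action=parse&format=json&page=' + encodeURIComponent(title);
        const data = await (await fetch(url)).json();
        html[title] = data.parse.text['*'];
      }
      return html;
    }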
Roan Kattouw (Catrope)
You could merge the resulting page data and then send a single action=parse request.
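Roughly like this, as a sketch (assuming you already have the wikitext strings from the query response; the helper name is made up):

    // Combine the wikitext of all the results and send one action=parse request;
    // the whole combined blob comes back as a single chunk of HTML.
    async function parseCombined(apiBase, wikitexts) {
      const body = new URLSearchParams({
        action: 'parse',
        format: 'json',
        text: wikitexts.join('\n----\n')  // horizontal rule between results
      });
      const response = await fetch(apiBase, { method: 'POST', body: body });
      const data = await response.json();
      return data.parse.text['*'];
    }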
best regards, dian
> Michael Dale wrote:
>> Is it possible to parse multiple pages at once, or to get your search results as HTML?
>> For example, I run something like: http://commons.wikimedia.org/w/api.php?format=jsonfm&action=query&ge...
>> And I get all the results' "revisions": ["*"] content as wikitext ... ideally I could get those results as HTML. Is there any way to do that?
> You can't get them all at once, currently, no. You could run them through action=parse one by one, but you seem to imply you already knew that.
> I'll look into the feasibility of parsing multiple pages at once (do you want to parse all pages in their entirety, or just parts of them?) when I have time, which probably won't be this week or next (busy times).
> Roan Kattouw (Catrope)
It's just for search results, so maybe just the intro paragraph? Or maybe you could request either?
Dian mentioned merging it all into a single request, but that would be pretty resource-intensive and have a pretty low cache hit rate. I foresee the use case eventually being fairly high-traffic.
You can see what I am using it for (the add_media_wizard) by adding: importScriptURI('http://mvbox2.cse.ucsc.edu/w/extensions/MetavidWiki/skins/external_media_wiz...'); to your monobook.js user page.
Then edit a page, click on the wizard button on the left, then click on the "list" layout option in the search results... and notice all the wikitext ... it would look nicer as just HTML (I will probably have to strip the HTML down to preserve only a few tags to have consistent formatting... but that can be done in JavaScript).
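Something along these lines is roughly what I have in mind for the stripping step (just a sketch; the tag whitelist is only an example):

    // Strip returned HTML down to a small whitelist of tags,
    // unwrapping everything else but keeping its text and children.
    function stripHtml(html, allowedTags) {
      const allowed = allowedTags || ['P', 'B', 'I', 'A', 'UL', 'OL', 'LI'];
      const container = document.createElement('div');
      container.innerHTML = html;
      (function walk(node) {
        for (const child of Array.from(node.children)) {
          walk(child);
          if (!allowed.includes(child.tagName)) {
            while (child.firstChild) {
              node.insertBefore(child.firstChild, child);
            }
            node.removeChild(child);
          }
        }
      })(container);
      return container.innerHTML;
    }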
peace, --michael
Roan Kattouw wrote:
> Michael Dale wrote:
>> Is it possible to parse multiple pages at once, or to get your search results as HTML?
>> For example, I run something like: http://commons.wikimedia.org/w/api.php?format=jsonfm&action=query&ge...
>> And I get all the results' "revisions": ["*"] content as wikitext ... ideally I could get those results as HTML. Is there any way to do that?
> You can't get them all at once, currently, no. You could run them through action=parse one by one, but you seem to imply you already knew that.
> I'll look into the feasibility of parsing multiple pages at once (do you want to parse all pages in their entirety, or just parts of them?) when I have time, which probably won't be this week or next (busy times).
> Roan Kattouw (Catrope)
Michael Dale wrote:
> It's just for search results, so maybe just the intro paragraph? Or maybe you could request either?
> Dian mentioned merging it all into a single request, but that would be pretty resource-intensive and have a pretty low cache hit rate. I foresee the use case eventually being fairly high-traffic.
Actually, it wouldn't have a low cache hit rate if you're parsing multiple entire pages at once. Each page would be parsed separately, which means cached pages are served from cache and the rest is parsed right then and there. Since we're talking about current revisions, cache hit rate will actually be pretty high.
Roan Kattouw (Catrope)
... I think what was being suggested is that, at present, I could literally take all the wikitext results, combine them, and send them as a single action=parse request. That would be a unique parse request per result set and not very cache-friendly.
But yes, ideally we can add the functionality to parse all the results individually via a single request that would be able to hit the current-revision parser cache. I believe Roan Kattouw is planning on looking into adding that to the API at some point soon :)
--michael
Roan Kattouw wrote:
> Michael Dale wrote:
>> It's just for search results, so maybe just the intro paragraph? Or maybe you could request either?
>> Dian mentioned merging it all into a single request, but that would be pretty resource-intensive and have a pretty low cache hit rate. I foresee the use case eventually being fairly high-traffic.
> Actually, it wouldn't have a low cache hit rate if you're parsing multiple entire pages at once. Each page would be parsed separately, which means cached pages are served from cache and the rest is parsed right then and there. Since we're talking about current revisions, cache hit rate will actually be pretty high.
> Roan Kattouw (Catrope)
Michael Dale wrote:
> ... I think what was being suggested is that, at present, I could literally take all the wikitext results, combine them, and send them as a single action=parse request. That would be a unique parse request per result set and not very cache-friendly.
Uniqueness doesn't really matter, since we don't try the cache at all when text= is used.
> But yes, ideally we can add the functionality to parse all the results individually via a single request that would be able to hit the current-revision parser cache. I believe Roan Kattouw is planning on looking into adding that to the API at some point soon :)
Yup. "Soon" will probably be in the next few weeks.
Roan Kattouw (Catrope)