Hi there!
Is there a way in the rewrite branch to get the wikitext of a diff, like the one at this link? http://es.wikipedia.org/w/index.php?diff=37372500&oldid=32780367&dif... I think it can be done via the API with "prop=revisions" and a combination of parameters, especially "rvdiffto". The "Site" class has a "loadrevisions" method, but it has no "rvdiffto" implementation. Is there any other way to get just the diff between two versions?
Thanks!
Matias.
On 10-05-24 10:38 PM, Matias wrote:
I think it can be done via the API with "prop=revisions" and a combination of parameters, especially "rvdiffto".
The API diff returns an HTML table (i.e. the same thing you see on-wiki, just without the interface or CSS). I think it is just a matter of implementing it in pywikibot (yay for the name change!) if it hasn't been done already.
-Mike
I think it returns XML by default. I want to make the query as small as possible. I'm only interested in new additions; as far as I can see, new additions are inside <td class="diff-addedline">, and within that in <span class="diffchange">. At least this is true for the non-API query. I managed to get the diff output with the API, and those tags are represented this way: http://pastebin.com/Myv1976Y (this is for XMLFM, the default output).
API DIFF: http://es.wikipedia.org/w/api.php?action=query&prop=revisions&rvstar...
NON-API DIFF: http://es.wikipedia.org/w/index.php?diff=37372500&oldid=32780367&dif...
I think this is not yet implemented in the framework.
Matias.
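As a rough illustration of the filtering described above, here is a minimal sketch that pulls the text of <span class="diffchange"> elements out of <td class="diff-addedline"> cells using only the Python standard library. The class and function names (AddedTextParser, added_text) and the sample markup are hypothetical; the tag and class names are the ones mentioned in the thread and may differ in other MediaWiki versions. It also assumes the diff is already plain HTML (in XMLFM output the markup is escaped and would need unescaping first).

```python
from html.parser import HTMLParser


class AddedTextParser(HTMLParser):
    """Collect text inside <span class="diffchange"> elements that sit
    within <td class="diff-addedline"> cells of a diff table."""

    def __init__(self):
        super().__init__()
        self.in_added_cell = False
        self.span_depth = 0  # nesting depth inside a diffchange span
        self.additions = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td":
            self.in_added_cell = "diff-addedline" in classes
        elif tag == "span" and self.in_added_cell:
            if self.span_depth or "diffchange" in classes:
                self.span_depth += 1
                if self.span_depth == 1:
                    # Start a new addition for each top-level span.
                    self.additions.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_added_cell = False
            self.span_depth = 0
        elif tag == "span" and self.span_depth:
            self.span_depth -= 1

    def handle_data(self, data):
        if self.span_depth:
            self.additions[-1] += data


def added_text(diff_html):
    """Return the list of added fragments found in a diff HTML string."""
    parser = AddedTextParser()
    parser.feed(diff_html)
    parser.close()
    return parser.additions


# Hypothetical sample row, shaped like the markup described in the thread.
sample = (
    '<tr><td class="diff-deletedline"><div>old '
    '<span class="diffchange">word</span></div></td>'
    '<td class="diff-addedline"><div>new '
    '<span class="diffchange">text</span> here</div></td></tr>'
)
print(added_text(sample))
```

Only the highlighted spans are collected, so a line that was added in full (with no inline highlighting) would be skipped by this sketch; collecting the whole cell text instead would be a small change in handle_data.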
Hello.
2010/5/25 Matias plinbox@gmail.com:
I think it returns XML by default. I want to make the query as small as possible.
I think what Mike meant was that the diff data is returned as HTML. The API has various output formats, and you are right that the default format is XMLFM. But the diff data, i.e. the data in the <diff> tag returned by ?action=query&prop=revisions&rvdiffto=prev, is HTML content.
I'm only interested in new additions; as far as I can see, new additions are inside <td class="diff-addedline">, and within that in <span class="diffchange">. At least this is true for the non-API query. I managed to get the diff output with the API, and those tags are represented this way: http://pastebin.com/Myv1976Y (this is for XMLFM, the default output).
API DIFF: http://es.wikipedia.org/w/api.php?action=query&prop=revisions&rvstar...
NON-API DIFF: http://es.wikipedia.org/w/index.php?diff=37372500&oldid=32780367&dif...
I think this is not yet implemented in the framework.
It's not implemented as a high-level API, but building a query yourself is easy. Look into site.py:loadrevisions for directions on how to do it.
And if you were to submit a patch, I'm sure that Russell would be happy to apply it =)
Regards,
Matias.
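For reference, a minimal sketch of what such a hand-built request could look like, using only the standard library rather than the framework. The function name build_diff_query is hypothetical, and selecting the revision with "revids" plus "rvprop=ids" is an assumption made for this example (the API URL quoted in the thread is truncated, so its exact parameters are unknown). The endpoint and the two revision IDs are taken from the links in the thread.

```python
from urllib.parse import urlencode

# API endpoint of the wiki used in the thread.
API_URL = "http://es.wikipedia.org/w/api.php"


def build_diff_query(revid, diffto="prev", fmt="json"):
    """Build a prop=revisions query URL that asks for a diff.

    revid  -- revision to start from (assumed to be passed via "revids")
    diffto -- "prev", "next", "cur" or a revision ID, passed as "rvdiffto"
    fmt    -- output format; the thread notes that XMLFM is the default
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": revid,
        "rvprop": "ids",  # keep the response small: only IDs plus the diff
        "rvdiffto": diffto,
        "format": fmt,
    }
    return API_URL + "?" + urlencode(params)


# Revision IDs from the index.php link in the thread (oldid and diff).
url = build_diff_query(32780367, diffto=37372500)
print(url)
```

The sketch only builds the URL and does not send the request; fetching it and reading the <diff> content is left to whatever HTTP layer is in use.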
Look into site.py:loadrevisions for directions on how to do it.
And if you were to submit a patch, I'm sure that Russell would be happy to apply it =)
When is the query actually executed? I'm kind of lost in the code. I want to see how loadrevisions executes the query and manages the results. I've tried to "magically" make it work with this UAH patch ( http://pastebin.com/rhg9n7NM) for site.py, but it doesn't seem to work as easily as I wished. Any hint would be great!
--Matias.
OK, I did it :). I also made a patch (http://pastebin.com/028wDbJR).
Changelog:
- Modified *api.update_page()* to save the new diff information.
- Added a *Page.Revision.Diff* class for storing the diff text and revto id.
- Modified the *site.loadrevisions()* method to support the rvdiffto parameter.
A method in Page.py to get diffs, the same way you get a revision now, is still missing. But you can get the diff text from page._revision[id].diff.text
Please review the ugly-as-hell patch that I made. Any comments are welcome!
--Matias
On Tue, May 25, 2010 at 05:14:18PM -0300, Matias wrote:
When is the query actually executed? I'm kind of lost in the code. I want to see how loadrevisions executes the query and manages the results. I've tried to "magically" make it work with this UAH patch ( http://pastebin.com/rhg9n7NM) for site.py, but it doesn't seem to work as easily as I wished. Any hint would be great!
The 'problem' with loadrevisions() is that it doesn't return anything; it just loads revisions (sic) and stores them in the _revisions attribute of the Page() object, which is a dictionary of Revision() objects (see update_page() in api.py). And these Revision() objects are not meant to store diffs.
My suggestion would be to use a PropertyGenerator directly and handle the result yourself instead of parsing it via update_page. That would be almost the same thing you've shown in your pastebin link, with the last part (api.update_page(...)) removed and replaced by something useful for you.
stan.
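To illustrate what "handle the result yourself" might involve, here is a minimal sketch that walks an already-decoded query result and yields the diff of each revision. It does not use pywikibot at all: the function name iter_diffs is hypothetical, and the response layout (query, pages, revisions, and a "diff" entry whose HTML sits under the "*" key) is an assumption about the JSON output of a prop=revisions query with rvdiffto, not something confirmed in the thread. The sample data is made up.

```python
def iter_diffs(response):
    """Yield (revid, diff_to, diff_html) for every revision in a decoded
    query response that carries diff data.

    Assumes the layout response["query"]["pages"][<pageid>]["revisions"],
    with the diff HTML stored under the "*" key of each "diff" entry.
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for rev in page.get("revisions", []):
            diff = rev.get("diff")
            if diff and "*" in diff:
                yield rev.get("revid"), diff.get("to"), diff["*"]


# Made-up response shaped like the assumed layout above.
sample_response = {
    "query": {
        "pages": {
            "1": {
                "title": "Example",
                "revisions": [
                    {
                        "revid": 100,
                        "diff": {
                            "from": 100,
                            "to": 101,
                            "*": '<tr><td class="diff-addedline">x</td></tr>',
                        },
                    },
                    {"revid": 99},  # revision without diff data is skipped
                ],
            }
        }
    }
}

for revid, diff_to, html in iter_diffs(sample_response):
    print(revid, diff_to, len(html))
```

Keeping the diffs out of the Revision() objects this way avoids touching update_page(), at the cost of bypassing the Page-level cache that loadrevisions() fills.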
Okay, it took me too long to write the previous message :) Your solution is better, indeed.
stan.
pywikipedia-l@lists.wikimedia.org