Thanks Oliver and Aaron. I want to look at the deleted revisions described on the project's meta page [1], which are not in the XML dumps. I already know which revisions I need the content for. What would you advise?
Happy to take this off the list if it gets too specific.
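
For concreteness, here is a rough sketch of how I understand fetching text by revision ID through the API would work for ordinary (non-deleted) revisions. The endpoint, the User-Agent string and the function names are my own placeholders, not anything from the project. As far as I can tell, deleted revisions would instead need prop=deletedrevisions together with the deletedhistory/deletedtext rights, which is the part I am missing.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's public API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query(rev_ids):
    """Build query parameters for fetching revision text by revision ID."""
    return {
        "action": "query",
        "prop": "revisions",
        # Multiple IDs are pipe-separated; ordinary accounts are limited
        # in how many IDs one request may carry, so batch long lists.
        "revids": "|".join(str(r) for r in rev_ids),
        "rvprop": "ids|content",
        "rvslots": "main",
        "format": "json",
        "formatversion": "2",
    }


def extract_text(response):
    """Map revision ID -> wikitext from a decoded API response."""
    texts = {}
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            content = rev.get("slots", {}).get("main", {}).get("content")
            if content is not None:
                texts[rev["revid"]] = content
    return texts


def fetch_revisions(rev_ids):
    """Fetch wikitext for the given revision IDs (performs a network call)."""
    url = API_URL + "?" + urllib.parse.urlencode(build_query(rev_ids))
    # Placeholder User-Agent; replace with something identifying your project.
    req = urllib.request.Request(
        url, headers={"User-Agent": "rev-fetch-sketch/0.1 (research script)"}
    )
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

Calling fetch_revisions([...]) with a list of known revision IDs should return a dict keyed by revision ID. IDs that are deleted or otherwise not visible to the caller simply do not appear in the result.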

Thanks!
Srijan

[1] https://meta.wikimedia.org/wiki/Research:Understanding_hoax_articles_on_English_Wikipedia


---------- Forwarded message ----------
From: Aaron Halfaker <ahalfaker@wikimedia.org>
Date: Wed, Jul 29, 2015 at 4:21 PM
Subject: Re: [Wiki-research-l] How to read blobs in text table?
To: Research into Wikimedia content and communities
<wiki-research-l@lists.wikimedia.org>


That's right.  I use the API and the XML dumps if I need text content.
If you let me know about the type of analysis you are performing, I
can advise about the best strategies.

On Wed, Jul 29, 2015 at 6:14 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
>
> If we're talking about Wikimedia's MediaWiki instances, yes, the API is your
> only way forward - for performance reasons the text content is stored
> on a totally different set of servers that (to my knowledge) even paid
> researchers don't get to mess around with. Alternatively, you could take
> a look at https://dumps.wikimedia.org if slightly outdated information
> is okay with you.
>
> On 29 July 2015 at 18:58, Srijan Kumar <srijankedia@gmail.com> wrote:
> > Hi!
> >
> > I want to read the text stored in the text table [1], but the old_text field
> > stores it as what seems to be the path to the blob. How can I get the
> > content of the blob?
> >
> > Alternately, is there any other way to access all text content (including
> > deleted content) without requiring global rights to the API?
> >
> > Thanks!
> > Srijan
> >
> > [1] https://www.mediawiki.org/wiki/Manual:Text_table
> >
> > _______________________________________________
> > Wiki-research-l mailing list
> > Wiki-research-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> >
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>


