If periodic update dumps are being considered, information that describes changes to old data (page deletes, user renames, etc.) would be very useful to have along with the new revisions.
On Mar 31, 2011 6:27 PM, "Luca de Alfaro" <luca@dealfaro.org> wrote:
> I think I would be very interested in 3, or even, in having every month a
> dump of that month's revisions. As I have built tools for the xml dumps, no
> change in format is good for me (and for WikiTrust).
>
> I would find incremental dumps (with occasional, yearly, full dumps) much
> easier to manage than full dumps.
>
> Luca
>
> On Thu, Mar 31, 2011 at 2:27 PM, Yuvi Panda <yuvipanda@gmail.com> wrote:
>
>> Hi, I'm a student planning on doing GSoC this year on MediaWiki.
>> Specifically, I'd like to work on data dumps.
>>
>> I'm writing this to gauge what would be useful to the research
>> community. Several ideas thrown about include:
>> 1. JSON Dumps
>> 2. SQLite Dumps
>> 3. Daily dumps of revisions in last 24 hours
>> 4. Dumps optimized for very fast import into various external storage
>> and smaller size (diffs)
>> 5. JSON/CSV for Special:Import and Special:Export
>>
>> Would any of these be useful? Or is there anything else that I'm
>> missing, that you would consider much more useful?
>>
>> Feedback would be invaluable :)
>>
>> Thanks :)
>> --
>> Yuvi Panda T
>> http://yuvi.in/blog
>> _______________________________________________
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>