---------- Forwarded message ----------
From: Shao Hong <shaohong86(a)gmail.com>
Date: Fri, May 3, 2013 at 4:15 PM
Subject: [Wikitech-l] [GSOC Proposal] Incremental data dumps - Shaohong
To: wikitech-l(a)lists.wikimedia.org
Hi all,
I have just updated my proposal for "Incremental data dumps"; you can
view it at https://www.mediawiki.org/wiki/User:Shaohong. Please feel
free to send me comments or suggestions. Thanks!
Shao Hong
shaohong86(a)gmail.com
"Respect needs to be earned, but honour is an attitude of the heart. Not
everyone will earn your respect, but everyone deserves to be shown honour."
- Anonymous
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hi,
I noticed that some pagecounts data files are missing, namely the files in the
open interval (20090921160000, 20091001000000), i.e. both endpoints excluded.
See http://dumps.wikimedia.org/other/pagecounts-raw/2009/2009-09/
Does anybody know the reason why these data are missing?
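To make the gap concrete, here is a minimal sketch that lists the hourly
timestamps strictly between the two endpoints quoted above. It assumes one
pagecounts file per hour, stamped exactly on the hour; the function name
missing_hours is hypothetical, and no file names or URLs are constructed.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M%S"  # timestamp layout used in the interval above


def missing_hours(start, end):
    """Return hourly timestamps strictly between start and end.

    Assumption: one file per hour, stamped exactly on the hour.
    """
    t = datetime.strptime(start, FMT) + timedelta(hours=1)  # skip the start endpoint
    stop = datetime.strptime(end, FMT)
    out = []
    while t < stop:  # strict comparison skips the end endpoint
        out.append(t.strftime(FMT))
        t += timedelta(hours=1)
    return out


hours = missing_hours("20090921160000", "20091001000000")
print(len(hours), hours[0], hours[-1])
```

Under that one-file-per-hour assumption, the gap covers 223 hourly files,
from 20090921170000 through 20090930230000.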
Best,
--
Giovanni Luca Ciampaglia
Postdoctoral fellow
Center for Complex Networks and Systems Research
Indiana University
✎ 910 E 10th St ∙ Bloomington ∙ IN 47408
☞ http://cnets.indiana.edu/
✉ gciampag(a)indiana.edu
---------- Forwarded message ----------
From: Jeremy Coffman <jcoffman93(a)yahoo.com>
Date: Fri, May 3, 2013 at 2:18 PM
Subject: [Wikitech-l] GSoC Proposal - Incremental Data Dumps
To: "wikitech-l(a)lists.wikimedia.org" <wikitech-l(a)lists.wikimedia.org>
Hello,
My name is Jeremy Coffman. I am a second year student studying
Computer Science at Brandeis University, with a possible focus on
natural language processing. I have decided to apply to work on the
Incremental Data Dumps project. My proposal can be found here:
http://www.mediawiki.org/wiki/User:J.a.coffman/GSoc_2013_Proposal
I know I'm rather close to the deadline, but I welcome any feedback
you may have.
Thank you,
Jeremy Coffman
I realized I hadn't posted my proposal to the list yet (I added it to
the official GSoC site a few days ago), so here it is:
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps
In short, the project aims to create a new format for dumps (which allow
users to download parts of the database of Wikimedia projects). The primary
advantage of this new format is that creating a dump should take less time,
because the previous dump can be reused.
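As a toy illustration of reusing a previous dump (not the format described
in the proposal), the sketch below models a dump as an in-memory mapping
from page ID to (revision ID, text). The function name incremental_dump and
the sample data are hypothetical.

```python
def incremental_dump(previous, changed, deleted=()):
    """Build a new dump from the previous one plus the changes since then.

    previous, changed: dict mapping page_id -> (rev_id, text)
    deleted: iterable of page_ids removed since the previous dump
    Assumption: a dump is modelled as a plain dict, purely for illustration.
    """
    new = dict(previous)      # start from the previous dump
    new.update(changed)       # overwrite or add pages that changed
    for page_id in deleted:   # drop pages that no longer exist
        new.pop(page_id, None)
    return new


previous = {1: (100, "A"), 2: (101, "B")}
changed = {2: (105, "B2"), 3: (106, "C")}
new_dump = incremental_dump(previous, changed, deleted=[1])
print(new_dump)
```

Only the changed and deleted pages have to be fetched from the database;
everything else is carried over from the previous dump.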
Any comments or co-mentors (as far as I know, Ariel Glenn is currently the
only potential mentor on this project) are welcome.
Petr Onderka
[[en:User:Svick]]