Hi all;
Can you imagine a day when Wikipedia is added to this list?[1]
WikiTeam has developed a script[2] to download all the Wikipedia dumps (and those of its sister projects) from dumps.wikimedia.org. It sorts the files into folders and checks their md5sums. It only works on Linux (it uses wget).
You will need about 100GB to download all the 7z files.
Save our memory.
Regards, emijrp
[1] http://en.wikipedia.org/wiki/Destruction_of_libraries [2] http://code.google.com/p/wikiteam/source/browse/trunk/wikipediadownloader.py
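[Editorial note: for illustration only, here is a minimal sketch of the basic idea behind such a downloader, not the actual wikipediadownloader.py. It assumes standard-library Python and a hypothetical dump URL and checksum, and it shells out to wget, which is why this approach is Linux-only.]

import hashlib
import os
import subprocess

def download_and_verify(url, expected_md5, dest_dir):
    # Fetch one dump file with wget (resumable via -c) into dest_dir,
    # then verify its md5sum against the expected value.
    os.makedirs(dest_dir, exist_ok=True)
    filename = os.path.join(dest_dir, url.rsplit('/', 1)[-1])
    subprocess.check_call(['wget', '-c', '-P', dest_dir, url])

    md5 = hashlib.md5()
    with open(filename, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            md5.update(chunk)
    if md5.hexdigest() != expected_md5:
        raise ValueError('md5 mismatch for ' + filename)
    return filename

# Hypothetical usage; real URLs and checksums come from dumps.wikimedia.org
# and the md5sums file published alongside each dump.
# download_and_verify(
#     'https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2',
#     '<md5 from the md5sums file>',
#     'enwiki')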
Thank you, Emijrp!
What about the dump of Commons images? [for those with 10TB to spare]
SJ
Hi SJ;
You know that's an old item on our TODO list ;)
I heard that Platonides developed a script for that task a long time ago.
Platonides, are you there?
Regards, emijrp
Hi,
Does anyone know if old dumps, from around 2004-2005, are still available somewhere?
Lior
On Mon, Jun 27, 2011 at 7:10 AM, emijrp emijrp@gmail.com wrote:
Hi SJ;
You know that that is an old item in our TODO list ; )
I know, I know...
I don't mean to pester Domas by asking after the Commons dump every time dumps of any kind come up.
It's just so inspiring to imagine that consolidated visual and auditory beauty being mirrored all around the world that it is difficult to resist.
SJ.
Can you share your script with us?
2011/6/27 Platonides platonides@gmail.com
Yes, I am. :)