WikiTeam has just finished archiving all Wikimedia Commons files up to 2012 (and some more) on the Internet Archive: https://archive.org/details/wikimediacommons. So far it's about 24 TB of archives, and there are also a hundred torrents you can help seed, ranging from a few hundred MB to over a TB, most around 400 GB. Everything is documented at https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs and if you want to help WikiTeam with coding, here are some ideas: https://code.google.com/p/wikiteam/issues/list.
Nemo
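For anyone who wants to grab the archives programmatically rather than via the web interface, a minimal sketch follows. It assumes Python 3 with the requests library and uses archive.org's public advancedsearch.php and /metadata/ JSON endpoints; the item identifiers are simply whatever the collection search returns.

import os
import requests

# Sketch: list the items in the wikimediacommons collection and fetch their
# .torrent files over plain HTTP (error handling and paging beyond 1000 items omitted).
SEARCH_URL = "https://archive.org/advancedsearch.php"
params = {
    "q": "collection:wikimediacommons",
    "fl[]": "identifier",
    "rows": 1000,
    "output": "json",
}
docs = requests.get(SEARCH_URL, params=params).json()["response"]["docs"]
identifiers = [doc["identifier"] for doc in docs]

os.makedirs("torrents", exist_ok=True)
for identifier in identifiers:
    # Each item's file list is published as JSON.
    meta = requests.get(f"https://archive.org/metadata/{identifier}").json()
    for f in meta.get("files", []):
        if f["name"].endswith(".torrent"):
            url = f"https://archive.org/download/{identifier}/{f['name']}"
            with open(os.path.join("torrents", f["name"]), "wb") as out:
                out.write(requests.get(url).content)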
Hi Nemo,
On 13-10-2013 11:09, Federico Leva (Nemo) wrote:
WikiTeam has just finished archiving all Wikimedia Commons files up to 2012 (and some more) on the Internet Archive: https://archive.org/details/wikimediacommons. So far it's about 24 TB of archives, and there are also a hundred torrents you can help seed, ranging from a few hundred MB to over a TB, most around 400 GB. Everything is documented at https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs and if you want to help WikiTeam with coding, here are some ideas: https://code.google.com/p/wikiteam/issues/list.
Nice, this was really needed for https://meta.wikimedia.org/wiki/Right_to_fork (although I hope that never happens). I wonder who is going to use it and for what. Are you keeping statistics so we can get an idea of what gets downloaded and how many times? Would it make sense for the WMF to seed some (or all) of these torrents?
Maarten
Maarten Dammers, 13/10/2013 13:50:
Nice, this was really needed for https://meta.wikimedia.org/wiki/Right_to_fork (although I hope that never happens). I wonder who is going to use it and for what. Are you keeping statistics so we can get an idea of what gets downloaded and how many times?
There are already counts; see e.g. the descriptions I linked. This archive is mostly a "just in case", but you never know.
Would it make sense for the WMF to seed some (or all) of these torrents?
Probably not: the WMF has very limited bandwidth even for the XML dumps, so it wouldn't be particularly useful. However, there are many mirrors out there, so convincing some of them to seed a few torrents would be nice. Archive.org has decent bandwidth and no throttling; unless you happen to reach them via the horrible he.net links (especially the transatlantic ones), it's mostly fine: https://monitor.archive.org/weathermap/weathermap.html
Nemo
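If a mirror does decide to seed, something along these lines would do it. This is only a sketch assuming the python-libtorrent (libtorrent-rasterbar) bindings; the session API differs slightly between libtorrent versions, and the torrent path is a hypothetical placeholder.

import time
import libtorrent as lt  # libtorrent-rasterbar Python bindings

ses = lt.session()
info = lt.torrent_info("torrents/example-commons-tarball.torrent")  # hypothetical path
handle = ses.add_torrent({"ti": info, "save_path": "./commons-archives"})

while True:
    s = handle.status()
    state = "seeding" if handle.is_seed() else "downloading"
    print(f"{state}: {s.progress * 100:.1f}% done, {s.num_peers} peers, "
          f"up {s.upload_rate / 1024:.0f} kB/s")
    time.sleep(30)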
Nice work Nemo!
2013/10/13 Federico Leva (Nemo) nemowiki@gmail.com
WikiTeam has just finished archiving all Wikimedia Commons files up to 2012 (and some more) on the Internet Archive: https://archive.org/details/wikimediacommons. So far it's about 24 TB of archives, and there are also a hundred torrents you can help seed, ranging from a few hundred MB to over a TB, most around 400 GB. Everything is documented at https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs and if you want to help WikiTeam with coding, here are some ideas: https://code.google.com/p/wikiteam/issues/list.
Nemo
+1 kudos to the whole wikiteam!
On Sun, Oct 13, 2013 at 5:31 PM, Emilio J. Rodríguez-Posada emijrp@gmail.com wrote:
Nice work Nemo!
Nicely done, you guys!
Fabrice
On Oct 13, 2013, at 3:03 PM, Samuel Klein wrote:
+1 kudos to the whole wikiteam!
_______________________________
Fabrice Florin, Product Manager, Wikimedia Foundation
Brilliant! Now we just need someone with a good idea, a fast connection and huge hard drives to do cool stuff with all those images :)
-- Hay
On Mon, Oct 14, 2013 at 1:32 AM, Fabrice Florin fflorin@wikimedia.org wrote:
Nicely done, you guys!
Fabrice
Hoi, The basic problem with all these images at archive.org is the same one Commons has: how do you find that useful image? Commons is where you can contribute, but what about actually using and finding all the great material that is hidden so well? Thanks, GerardM
On 14 October 2013 13:15, Hay (Husky) huskyr@gmail.com wrote:
Brilliant! Now we just need someone with a good idea, a fast connection and huge hard drives to do cool stuff with all those images :)
-- Hay
2013/10/14 Gerard Meijssen gerard.meijssen@gmail.com
Hoi, The basic problem with all these images at archive.org is the same one Commons has: how do you find that useful image? Commons is where you can contribute, but what about actually using and finding all the great material that is hidden so well? Thanks, GerardM
Basically, you can't.
The Internet Archive has this problem in several other areas, like its Wayback Machine: there is no search engine to search the billions of grabbed websites by keyword or whatever.
The Internet Archive is a pile of hard disks and a time capsule of backups; they try to do their best at showing the materials (media players, PDF viewers), but it is not always easy or possible.
Emilio J. Rodríguez-Posada, 14/10/2013 14:18:
The Internet Archive has this problem in several other areas, like its Wayback Machine: there is no search engine to search the billions of grabbed websites by keyword or whatever.
The Internet Archive is a pile of hard disks and a time capsule of backups; they try to do their best at showing the materials (media players, PDF viewers), but it is not always easy or possible.
...and that's why Hay said we need someone with a good idea. :) Now it's easy to download the dataset (though it's not perfect); of course this doesn't automatically make something cool happen with it, except replication of the data in multiple places, which is a good thing in itself.
Nemo
The first step is that we now have stuff in different places. There was a period a few years ago when there weren't *any* backups of Commons images. The next step is that somebody uses these dumps for a new creative project. Maybe someone working at a university with lots of bandwidth and lots of space...
Hoi, While I do agree that it is good to have the data in many places, the Internet Archive on its own moves it to several places as well. Many of us have seen the IA servers at the Library of Alexandria.
While it is OK to find a use for the data at the IA, I would like us to concentrate first and foremost on how we can make better use of the media that is in Commons itself: how we can open it up to more use and make Commons more accessible.
Do realise that when there is a good use for all the data that is at the IA, the same use and more could be made of the larger amount of data that is in Commons itself. Thanks, GerardM
It was not WikiTeam's original intention to create these media tarballs so that researchers would use them from there. We created these tarballs so that everyone in the Wikimedia movement can rest assured that there is a backup copy of their media on the Internet Archive. Trust me, the number of people who will actually use these tarballs is going to be smaller than the number of people editing the smaller wikis combined; nearly everyone is going to use the data on Commons itself. So we can fully focus on improving Commons to make it more data-accessible, without taking the risk of having people work on the tarballs on the Internet Archive for research instead.
That being said, we can't even guarantee that the images in these tarballs are up to date. They should be regarded as a snapshot of each image at the time of download, not a live backup of all the images on Commons. We are looking into creating subsequent tarballs that take new uploads and re-uploads into account, so that Commons is actually backed up.
I guess the way we presented the tarballs on the Internet Archive is enough to deter anyone from conducting research directly on them, unless they do in-depth mining of the data to get what they want, but that is certainly going to be much tougher than mining the information from Commons directly in its current state.
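As a rough illustration of what an incremental run could look like, the sketch below asks the public Commons API for files uploaded after a cutoff date, which could then be fetched and packed into a follow-up tarball. The cutoff date is a hypothetical example, and the real logic in WikiTeam's scripts may well differ.

import requests

API = "https://commons.wikimedia.org/w/api.php"
params = {
    "action": "query",
    "list": "allimages",
    "aisort": "timestamp",
    "aistart": "2013-01-01T00:00:00Z",  # hypothetical snapshot cutoff
    "aiprop": "url|timestamp|sha1",
    "ailimit": "500",
    "format": "json",
}

while True:
    data = requests.get(API, params=params).json()
    for img in data["query"]["allimages"]:
        print(img["timestamp"], img["name"], img["url"])
    if "continue" not in data:
        break
    params.update(data["continue"])  # carry the continuation token forward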
I agree with Gerard of course, but I still think there is no contradiction between the two things: the minute, careful curation and interface-improvement work on Commons, and the mass preservation, analysis and use of it as a dataset. They are also interests for different people with different resources, so there is no competition. For instance, I asked for some help with WikiTeam's software, and that's in Python, while MediaWiki is PHP and JavaScript: by doing both things we are more likely to cover all interests and *reduce* waste of resources. :)
Nemo
On 14 October 2013 13:59, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, While I do agree that it is good to have the data in many places, the Internet Archive on its own moves it to several places as well. Many of us have seen the IA servers at the Library of Alexandria.
While it is OK to find a use for the data at the IA, I would like us to concentrate first and foremost on how we can make better use of the media that is in Commons itself: how we can open it up to more use and make Commons more accessible.
And you need to stop right there. As in, don't express a further opinion until you realise how wrong you are. You can't do any analysis on data that is lost, and non-backed-up data is just data that doesn't know it is lost yet.
Hoi,
Geni, sorry, but there is a difference between there being a backup of Commons within the WMF and there being a dataset of Commons at the IA that is not current. People can do all the analysis they want on the old data and it will not make any difference; it will not make the data that is currently in Commons any more accessible.
We have been told repeatedly that the data at the WMF is secure. Beyond that, the data is like knowing the maximum an insurance policy will pay: you know it will not be enough. It is, however, very much a hypothetical question. How to make Commons usable is a here-and-now issue. Thanks, GerardM
On 14 October 2013 22:22, geni geniice@gmail.com wrote:
And you need to stop right there. As in, don't express a further opinion until you realise how wrong you are. You can't do any analysis on data that is lost, and non-backed-up data is just data that doesn't know it is lost yet.
-- geni
Hello,
I got the torrent file, but there aren't any peers. https://archive.org/details/wikimediacommons-torrents https://archive.org/download/wikimediacommons-torrents/wikimediacommons-torr...
Regards,
Yann
2013/10/13 Federico Leva (Nemo) nemowiki@gmail.com
WikiTeam has just finished archiving all Wikimedia Commons files up to 2012 (and some more) on the Internet Archive: https://archive.org/details/wikimediacommons. So far it's about 24 TB of archives, and there are also a hundred torrents you can help seed, ranging from a few hundred MB to over a TB, most around 400 GB. Everything is documented at https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs and if you want to help WikiTeam with coding, here are some ideas: https://code.google.com/p/wikiteam/issues/list.
Nemo
Yann Forget, 16/10/2013 21:29:
Hello,
I got the torrent file, but there aren't any peers. https://archive.org/details/wikimediacommons-torrents https://archive.org/download/wikimediacommons-torrents/wikimediacommons-torr...
What do you mean? archive.org web-seeds them from two separate servers; are you not able to download?
Nemo
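In other words, even with zero peers the torrents should still complete, because archive.org itself acts as a web seed, and the same files can always be fetched directly over HTTP. Below is a small resumable-download sketch in Python with requests; the item identifier and file name are hypothetical placeholders.

import os
import requests

identifier = "wikimediacommons-example-item"  # hypothetical placeholder
filename = "example-tarball.tar"              # hypothetical placeholder
url = f"https://archive.org/download/{identifier}/{filename}"

# Resume with a Range request if a partial file is already on disk.
have = os.path.getsize(filename) if os.path.exists(filename) else 0
headers = {"Range": f"bytes={have}-"} if have else {}

with requests.get(url, headers=headers, stream=True, timeout=60) as r:
    r.raise_for_status()
    mode = "ab" if r.status_code == 206 else "wb"  # 206 means the Range was honoured
    with open(filename, mode) as out:
        for chunk in r.iter_content(chunk_size=1 << 20):
            out.write(chunk)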