Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
Hoi, It is the explicit goal of the Wikimedia Foundation to make information available for as long as it exists. In addition to this, there are several copies at the Internet Archive.
If there is no statement that satisfies your need, it will not be hard for the WMF board to come up with one. Having such a statement by tomorrow is a bit much to ask for. Thanks, GerardM
On 2 June 2011 13:29, Fae faenwp@gmail.com wrote:
Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
Hello Fae,
There should be no explicit statement because the WMF holds it self-evident to preserve. The bigger problem might be the project scope. I don't know what kind of images your academic partners wish to upload.
Kind regards Ziko
2011/6/2 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi, It is the explicit goal of the Wikimedia Foundation to make information available for as long as it exists. In addition to this, there are several copies at the Internet Archive.
If there is no statement that satisfies your need, it will not be hard for the WMF board to come up with one. Having such a statement by tomorrow is a bit much to ask for. Thanks, GerardM
On 2 June 2011 13:29, Fae faenwp@gmail.com wrote:
Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
On 2 June 2011 13:24, Ziko van Dijk zvandijk@googlemail.com wrote:
There should be no explicit statement because the WMF holds it self-evident to preserve. The bigger problem might be the project scope. I don't know what kind of images your academic partners wish to upload.
There's also the matter that no particular image can be promised to be safe from local deletion processes.
(except, e.g., WMF logos. But even then there is a perennial policy suggestion to move those to a separate non-free repository rather than Commons.)
- d.
On 06/02/2011 02:27 PM, David Gerard wrote:
On 2 June 2011 13:24, Ziko van Dijk zvandijk@googlemail.com wrote:
There should be no explicit statement because the WMF holds it self-evident to preserve. The bigger problem might be the project scope. I don't know what kind of images your academic partners wish to upload.
There's also the matter that no particular image can be promised to be safe from local deletion processes.
(except, e.g., WMF logos. But even then there is a perennial policy suggestion to move those to a separate non-free repository rather than Commons.)
A full copy could always be kept at archive.org.
On Thu, Jun 02, 2011 at 02:24:37PM +0200, Ziko van Dijk wrote:
Hello Fae,
There should be no explicit statement because the WMF holds it self-evident to preserve.
That reminds me of something O:-)
Perhaps something like this?
We, the wikimedia movement, hold these truths to be self evident:
* That neutrality is the path to knowledge
* That all knowledge should be available to all people no matter when or wherever they want it, and be free to study, free to share, free to improve
* And that this state of affairs should hold in perpetuity, so that our children, and their children and etc. can benefit.
Might need some editing to make it perfect. Meta someplace?
sincerely, Kim Bruning
Kim Bruning wrote:
On Thu, Jun 02, 2011 at 02:24:37PM +0200, Ziko van Dijk wrote:
Hello Fae,
There should be no explicit statement because the WMF holds it self-evident to preserve.
That reminds me of something O:-)
Perhaps something like this?
We, the wikimedia movement, hold these truths to be self evident:
- That neutrality is the path to knowledge
- That all knowledge should be available to all people no matter when
or wherever they want it, and be free to study, free to share, free to improve
- And that this state of affairs should hold in perpetuity, so that
our children, and their children and etc. can benefit.
Might need some editing to make it perfect. Meta someplace?
You've missed out "That all editors are created equal", a palpable falsehood.
Fae, 02/06/2011 13:29:
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
The only thing I can remember is http://en.wikipedia.org/wiki/Wikipedia:Ten_things_you_may_not_know_about_Wikipedia#You_can.27t_actually_change_anything_in_Wikipedia.E2.80.A6
Nemo
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy, as otherwise the WMF would have no long-term operational archive plan. "Self evident" is insufficient as a basis for budgeting and planning in a credible way. If, as the planned outcome of a research project, I had a large image donation to make and such a commitment was absent, I would prefer to mass-donate images of public interest to an organization that had one, and assume that at some point e-volunteers at Wikimedia Commons would take the initiative and port in what they fancied.
The people I'm workshopping with tomorrow have research roles at a number of leading universities, along with research organizations under the umbrella of the Wellcome Trust (the largest charity in the UK) and a variety of semi-associated organizations such as Cancer Research UK, Open Research Computation, the Bioinformatics Training Network and FlyBase. All these folks have large image assets to discuss and are keen to move forward with an open solution to recommend across their personal networks for the long, long-term public good.
I appreciate the image deletion issue; what we are talking about here are planned batch uploads of high-quality donations. Part of that planning would be to discuss the relevance to the public of large numbers of research images and compliance with existing Commons guidelines. There may well be cases, for example many thousands of similar images of mutant Drosophila, where Wikimedia Commons is not the right place for a full donation and a more specialized database host is needed.
Cheers, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
Compared to many institutions, undoubtedly including some of those you will be communicating with, the Wikimedia Foundation has very limited assets and little or no endowment. And, of course, essentially no staff other than our volunteers.
I think what needs to happen is to explore ways to cooperate using each institution's relative assets. That might include, for example, endowing Commons with assets sufficient to support long-term archival services as well as a corporate commitment on the Foundation's part to fulfill such services on an institutional basis, read centuries...
I'm sure there are other ways the Foundation could cooperate for public benefit, and other partners who could participate in such consortiums. The threshold requirement is a commitment to free public access under a fully featured open-source license.
Fred
On 2 June 2011 15:19, Fred Bauder fredbaud@fairpoint.net wrote:
I think what needs to happen is to explore ways to cooperate using each institution's relative assets. That might include, for example, endowing Commons with assets sufficient to support long-term archival services as well as a corporate commitment on the Foundation's part to fulfill such services on an institutional basis, read centuries...
One important plus point for WMF is that unlike, e.g. Flickr, we are not subject to corporate whims. This means that our supply of a service is not contingent on it turning a profit.
- d.
On 2 June 2011 14:21, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan.
Why would we have an archive plan? Archives are for things that aren't expected to be needed on a regular basis any more but may need to be referred to in the future. We're not going to archive things on Commons; they'll just stay on Commons indefinitely.
Sure Tom, here's a SciFi user story:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles, and the loss is irrecoverable due to backups aged over 2 weeks being automatically deleted. With no archive strategy, it is estimated that the majority of digital assets have been permanently lost, and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
Cheers, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
On 2 June 2011 18:27, Thomas Dalton thomas.dalton@gmail.com wrote:
On 2 June 2011 14:21, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan.
Why would we have an archive plan? Archives are for things that aren't expected to be needed on a regular basis any more but may need to be referred to in the future. We're not going to archive things on Commons; they'll just stay on Commons indefinitely.
On 2 June 2011 18:48, Fae faenwp@gmail.com wrote:
Sure Tom, here's a SciFi user story:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles and the loss is irrecoverable due backups aged over 2 weeks being automatically deleted. Due to no archive strategy it is estimated that the majority of digital assets have been permanently lost and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
Ah, you don't mean "archive". You mean "backup". They are very different things and serve very different purposes. The backing up of images is an issue. The text exists in loads of places, but there is a risk of losing the images. I know it has been discussed numerous times, so hopefully the WMF is working on it (or may have recently put something in place that I'm not aware of).
On 2 June 2011 18:48, Fae faenwp@gmail.com wrote:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles and the loss is irrecoverable due backups aged over 2 weeks being automatically deleted. Due to no archive strategy it is estimated that the majority of digital assets have been permanently lost and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
This sort of scenario is why some of us have a thing about the backups :-)
(Is there a good image backup of Commons and of the larger wikis, and - and this one may be trickier - has anyone ever downloaded said backups?)
- d.
On Thu, Jun 2, 2011 at 10:55 AM, David Gerard dgerard@gmail.com wrote:
On 2 June 2011 18:48, Fae faenwp@gmail.com wrote:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles and the loss is irrecoverable due backups aged over 2 weeks being automatically deleted. Due to no archive strategy it is estimated that the majority of digital assets have been permanently lost and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
This sort of scenario is why some of us have a thing about the backups :-)
(Is there a good image backup of Commons and of the larger wikis, and
- and this one may be trickier - has anyone ever downloaded said
backups?)
- d.
I've floated this to Erik a couple of times, but if the Foundation would like an IT disaster response / business continuity audit, I can do those.
On Thu, Jun 2, 2011 at 11:52 AM, George Herbert george.herbert@gmail.com wrote:
On Thu, Jun 2, 2011 at 10:55 AM, David Gerard dgerard@gmail.com wrote:
On 2 June 2011 18:48, Fae faenwp@gmail.com wrote:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles and the loss is irrecoverable due backups aged over 2 weeks being automatically deleted. Due to no archive strategy it is estimated that the majority of digital assets have been permanently lost and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
This sort of scenario is why some of us have a thing about the backups :-)
(Is there a good image backup of Commons and of the larger wikis, and
- and this one may be trickier - has anyone ever downloaded said
backups?)
- d.
I've floated this to Erik a couple of times, but if the Foundation would like an IT disaster response / business continuity audit, I can do those.
Right, when Fae asked her question I was thinking of the more philosophical type of planning for storage that archives often do ("as a matter of course we retain documents for 10 years, or in perpetuity, or whatever"); but disaster and backup planning are also relevant. That's documented as a part of technical operations rather than as board-level policies; I think we're all on the same page about caring about this issue though. It is also relevant that the WMF is a financially stable non-profit, and thus unlikely to go out of business through the vagaries of the market.
-- phoebe
On 02/06/11 19:52, George Herbert wrote:
On Thu, Jun 2, 2011 at 10:55 AM, David Gerarddgerard@gmail.com wrote:
On 2 June 2011 18:48, Faefaenwp@gmail.com wrote:
In 2016 San Francisco has a major earthquake and the servers and operational facilities for the WMF are damaged beyond repair. The emergency hot switchover to Hong Kong is delayed due to an ongoing DoS attack from Eastern European countries. The switchover eventually appears successful and data is synchronized with Hong Kong for the next 3 weeks. At the end of 3 weeks, with a massive raft of escalating complaints about images disappearing, it is realized that this is a result of local data caches expiring. The DoS attack covered the tracks of a passive data worm that only activates during back-up cycles and the loss is irrecoverable due backups aged over 2 weeks being automatically deleted. Due to no archive strategy it is estimated that the majority of digital assets have been permanently lost and estimates for 60% partial reconstruction from remaining cache snapshots and independent global archive sites run to over 2 years of work.
This sort of scenario is why some of us have a thing about the backups :-)
(Is there a good image backup of Commons and of the larger wikis, and
- and this one may be trickier - has anyone ever downloaded said
backups?)
- d.
I've floated this to Erik a couple of times, but if the Foundation would like an IT disaster response / business continuity audit, I can do those.
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
Tape is the dumb backstop that will keep the data even when your supposedly infallible replicated and redundant systems fail. For example, it got Google out of a hole quite recently when they had to restore a significant number of Gmail accounts from tape. (see http://www.talkincloud.com/the-solution-to-the-gmail-glitch-tape-backup/ )
And, unlike other long-term storage media, there is a long history of tape storage, an understanding of its practical lifespan and risks, and well-understood procedures for making and verifying duplicate sub-master copies to new tape technologies over time to extend archive life, etc. etc.
If we say that Wikimedia Commons currently has ~10M images, and if we allow 1 Mbyte per image, that's only 10 TB: that will fit nicely on seven LTO5 tapes. If you use LTFS, you can also make data access and long-term data robustness easier. If you like, you can slip in a complete dump of the MediaWiki source and Commons database on each tape, as well.
Even if I'm wrong by an order of magnitude, and 140 tapes are needed instead of 14, that's still less than $10k of media -- and I wouldn't be surprised if tape storage companies were eager to vie to be the company that can claim it donates the media and drives which provide Wikipedia's long-term backup system.
With two tape drives running at once at an optimal 140 MB/s each, the whole backup would take less than a day. Even if I were wrong about both the writing speed and archive size by an order of magnitude each, this would still be less than three months.
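For anyone who wants to check that arithmetic, here is a minimal sketch of the same back-of-envelope sums (the ~1.5 TB native LTO-5 capacity is an assumed figure, not one stated above):

import math

# Back-of-envelope check of the figures in the preceding paragraphs.
images = 10_000_000            # ~10M Commons images
bytes_per_image = 1_000_000    # allow 1 MB per image
archive_bytes = images * bytes_per_image            # = 10 TB

lto5_native = 1.5e12                                # assumed ~1.5 TB per LTO-5 tape
tapes = math.ceil(archive_bytes / lto5_native)      # -> 7 tapes (14 for two off-site sets)

drives, rate = 2, 140e6                             # two drives at an optimal 140 MB/s each
hours = archive_bytes / (drives * rate) / 3600      # -> ~9.9 hours, i.e. less than a day

# Pessimistic case: 10x the data and 1/10th the write speed.
worst_days = (10 * archive_bytes) / (drives * rate / 10) / 86400   # -> ~41 days, under three months

print(tapes, round(hours, 1), round(worst_days))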
The same tape systems could also, trivially, be used to back up all the other WMF sites, on similar lines.
-- Neil
On Thu, Jun 2, 2011 at 16:11, Neil Harris neil@tonal.clara.co.uk wrote:
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
The mechanics of the backup are largely irrelevant. What matters are the *policies*: what data do you back up, when do you back it up, how often do you test your backups, and so on. Once you've got that sorted out, it doesn't really matter whether you're storing the backups on tape, remote servers, or magic pixie dust.
On 03/06/11 00:44, Mark Wagner wrote:
On Thu, Jun 2, 2011 at 16:11, Neil Harrisneil@tonal.clara.co.uk wrote:
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
The mechanics of the backup are largely irrelevant. What matters are the *policies*: what data do you back up, when do you back it up, how often do you test your backups, and so on. Once you've got that sorted out, it doesn't really matter whether you're storing the backups on tape, remote servers, or magic pixie dust.
Not quite.
You're right about procedures, but you can't begin defining procedures until you have something concrete to aim at.
Tape is the One True Way for large scale backup, even today (ask Google), and I thought it might be useful to give an illustration of just how cheap it would be to use. Tape is a great simplifier, and eliminates a lot of the fanciness and feature-bloat associated with more sophisticated systems -- more sophisticated is not necessarily better.
Here's a straw man proposal for procedures:
I'd suggest backing up _everything_ -- cluster servers, local office IT servers, staff PCs, the lot -- for WMF internal archive and disaster recovery purposes. Something like monthly incremental backups, filing away media to the remote sites after verification, and yearly or six-monthly total backups to a complete new set of fresh media. For only a month's worth of work, replicated disk copies are fine: the tape archive is a back-stop, for when the replicated disks fail.
Dumps for external archives could also be made using the same drives, but to different media, and with a much more restrictive policy about what is saved.
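As a rough illustration of the media budget that schedule implies, here is a minimal sketch; the 2% monthly churn figure is an assumption made purely for illustration, not a number from this thread:

import math

full_tb = 10.0          # full archive size, matching the earlier ~10 TB estimate
tape_tb = 1.5           # assumed LTO-5 native capacity
monthly_churn = 0.02    # assumed fraction of data added or changed each month
offsite_sets = 2        # two independent off-site copies, as suggested earlier

full_tapes = math.ceil(full_tb / tape_tb)                              # one yearly full backup
incremental_tapes = 12 * math.ceil(full_tb * monthly_churn / tape_tb)  # twelve monthly incrementals
tapes_per_year = offsite_sets * (full_tapes + incremental_tapes)

print(tapes_per_year)   # -> 38 tapes written per year under these assumptions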
-- Neil
On Thu, Jun 2, 2011 at 5:17 PM, Neil Harris neil@tonal.clara.co.uk wrote:
On 03/06/11 00:44, Mark Wagner wrote:
On Thu, Jun 2, 2011 at 16:11, Neil Harrisneil@tonal.clara.co.uk wrote:
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
The mechanics of the backup are largely irrelevant. What matters are the *policies*: what data do you back up, when do you back it up, how often do you test your backups, and so on. Once you've got that sorted out, it doesn't really matter whether you're storing the backups on tape, remote servers, or magic pixie dust.
Not quite.
You're right about procedures, but you can't begin defining procedures until you have something concrete to aim at.
Tape is the One True Way for large scale backup, even today (ask Google), and I thought it might be useful to give an illustration of just how cheap it would be to use. Tape is a great simplifier, and eliminates a lot of the fanciness and feature-bloat associated with more sophisticated systems -- more sophisticated is not necessarily better.
I have done large enterprise-scale backup (not Google-scale, but there really isn't anyone else at Google's scale...) entirely without tape, just using nearline disk. These days it's in fact not unreasonable to do it that way. Offsiting the backups via the network versus physical tape moves is pretty much equivalent here.
That is neither here nor there to the policy question, however.
This is an area where I, as a technical domain expert, wish I knew more about the WMF operations staff's detailed implementation and plans; but the staff are competent folks and I don't know of any actual gaps from reasonable industry practice.
If the community is sufficiently concerned that there may be a gap, then the board should perhaps either request staff to be more open, or get an independent consultant in to review if operational details are thought to be sensitive.
We already have a policy covering data preservation and recovery under any foreseeable disaster scenarios: http://en.wikipedia.org/wiki/WP:TERMINAL
;)
Ryan Kaldari
On 6/2/11 4:44 PM, Mark Wagner wrote:
On Thu, Jun 2, 2011 at 16:11, Neil Harrisneil@tonal.clara.co.uk wrote:
Tape is -- still -- your friend here. Flip the write-protect after writing, have two sets of off-site tapes, one copy of each in each of two secure and widely separated off-site locations run by two different organizations, and you're sorted.
The mechanics of the backup are largely irrelevant. What matters are the *policies*: what data do you back up, when do you back it up, how often do you test your backups, and so on. Once you've got that sorted out, it doesn't really matter whether you're storing the backups on tape, remote servers, or magic pixie dust.
On 2 June 2011 14:21, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan.
Why would we have an archive plan? Archives are for things that aren't expected to be needed on a regular basis any more but may need to be referred to in the future. We're not going to archive things on Commons; they'll just stay on Commons indefinitely.
If an image is hosted on Commons for 100 years and NEVER used by any other Wikimedia project, would we, or why should we, retain it?
Fred
Because Commons is to be used by the world, not just sister projects. If the New York Times Online links a picture in from Commons (and credits it properly), are we going to make their later historical story useless by deleting the picture?
-----Original Message----- From: Fred Bauder fredbaud@fairpoint.net To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Sent: Thu, Jun 2, 2011 11:01 am Subject: Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?
On 2 June 2011 14:21, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan.
Why would we have an archive plan? Archives are for things that aren't expected to be needed on a regular basis any more but may need to be referred to in the future. We're not going to archive things on Commons; they'll just stay on Commons indefinitely.
If an image is hosted on Commons for 100 years and NEVER used by any other Wikimedia project would we, or why should we, retain it?
Fred
A lot of questions here. If an image is hosted and not used for 100 years, it would be up to the people in 100 years to decide. Any guarantee we try to make for such periods is absolutely useless. Every rule we make can be re-discussed and changed in such a period.
If an organization such as the NYT uses an image from Commons by inline linking, then we could indeed invalidate their historical research by deleting that image if it contains a copyright violation. Copyright violations are the main reason for deletion; other reasons include bad quality and duplicate images.
Teun Spaans
Everybody knew it was impossible, until someone turned up who didn't know that.
On Thu, Jun 2, 2011 at 8:14 PM, Wjhonson wjhonson@aol.com wrote:
Because Commons is to be used by the world, not just sister projects. If the New York Times Online links a picture in from Commons (and credits it properly) are we going to make their later-historical story useless by deleting the picture ?
-----Original Message----- From: Fred Bauder fredbaud@fairpoint.net To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Sent: Thu, Jun 2, 2011 11:01 am Subject: Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?
On 2 June 2011 14:21, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan.
Why would we have an archive plan? Archives are for things that aren't expected to be needed on a regular basis any more but may need to be referred to in the future. We're not going to archive things on Commons; they'll just stay on Commons indefinitely.
If an image is hosted on Commons for 100 years and NEVER used by any other Wikimedia project would we, or why should we, retain it?
Fred
On Thu, Jun 2, 2011 at 6:21 AM, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan. "Self evident" is insufficient in order to budget and plan in a credible way. If as the planned outcome of a research project I had a large image donation to make and such a commitment was absent, I would prefer to mass donate images of public interest to an organization that had one, and assume that at some point e-volunteers at Wikimedia Commons would take the initiative and port in what they fancied.
Fae,
There is no explicit, official operational archive plan of the type you are referring to. I am familiar with the type of plan you mean -- archives and libraries in particular often have explicit retention plans that specify a date range. This kind of plan would likely be developed by the board as part of our long-range operational planning. There are difficulties, as others have pointed out, because unlike an archive we cannot guarantee retention of any particular item -- individual curation and editorial decisions are done by the community.
However, long-term preservation and dissemination of knowledge is an inherent and explicit part of our mission. You could point to:
* Our mission statement, which says we will retain useful information from our projects on the Internet, free of charge, in perpetuity (http://wikimediafoundation.org/wiki/Mission_statement)
* The fact that a free license enables redistribution and longer-term preservation than copyright does, because others have the ability to preserve our collections even if the WMF itself fails (dumps are noted as a value in our values statement: http://wikimediafoundation.org/wiki/Values).
As you note, the key part of this is free licensing under a compatible license. We are interested in supporting the ecosystem of free knowledge, so if an organization wanted their primary archive to be someplace else (but accessible to Commons technically and through licensing), that's fine; we can upload. However, as an organization, we are absolutely committed to preserving free knowledge for the long term.
For this presentation, your preparation turnaround time is pretty short, and I personally don't have time to pull together other community documents on this subject right now (maybe others do), but you can certainly tell the organizations about our long-term commitment. Whether Commons is appropriate for them, however, depends on what they are looking for. The biggest argument for uploading collections to Wikimedia is not our function as an archival service (since we don't fulfill all of the requirements of a traditional archive), but rather the immense distribution and visibility our projects can give such collections, far exceeding any other online service, because of our global reach.
best, Phoebe (speaking as a member of the Board)
Thanks Phoebe. For my presentation I'll highlight long-term preservation "in perpetuity" as the key point of interest, and reflect some of the other issues raised in this thread about suitability for certain types of donation.
I'm not expecting a WMF policy overnight; I just thought there might be something in existence. It sounds like an area of the mission that would be reasonable to translate into direct operational targets (say, a pragmatic 10- or 20-year plan).
Cheers, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags
On 2 June 2011 19:28, phoebe ayers phoebe.wiki@gmail.com wrote:
On Thu, Jun 2, 2011 at 6:21 AM, Fae faenwp@gmail.com wrote:
Briefly responding to a couple of points raised so far:
Yes, there is a need for a policy as otherwise the WMF would have no long term operational archive plan. "Self evident" is insufficient in order to budget and plan in a credible way. If as the planned outcome of a research project I had a large image donation to make and such a commitment was absent, I would prefer to mass donate images of public interest to an organization that had one, and assume that at some point e-volunteers at Wikimedia Commons would take the initiative and port in what they fancied.
Fae,
There is no explicit, official operational archive plan of the type you are referring to. I am familiar with the type of plan you mean -- archives and libraries in particular often have explicit retention plans that specify a date range. This kind of plan would likely be developed by the board as part of our long-range operational planning. There are difficulties, as others have pointed out, because unlike an archive we cannot guarantee retention of any particular item -- individual curation and editorial decisions are done by the community.
However, long-term preservation and dissemination of knowledge is an inherent and explicit part of our mission. You could point to:
- Our mission statement, which says we will retain useful information
from our projects on the Internet, free of charge, in perpetuity (http://wikimediafoundation.org/wiki/Mission_statement)
- the fact that a free license enables redistribution and longer-term
preservation support than copyright does, because others have the ability to preserve our collections even if the WMF itself fails (dumps are noted as a value, in our values statement: http://wikimediafoundation.org/wiki/Values).
As you note, the key part of this is free licensing under a compatible license. We are interested in supporting the ecosystem of free knowledge, so that if an organization wanted their primary archive to be someplace else (but accessible to Commons technically and through licensing) that's fine; we can upload. However, as an organization, we are absolutely committed to preserving free knowledge for the long term.
For this presentation, your preparation turnaround time is pretty short, and I personally don't have time to pull together other community documents on this subject right now (maybe others do), but you can certainly tell the organizations about our long-term commitment. Whether Commons is appropriate for them, however, depends on what they are looking for. The biggest argument for uploading collections to Wikimedia is not our function as an archival service (since we don't fulfill all of the requirements of a traditional archive), but rather the immense distribution and visibility our projects can give such collections, far exceeding any other online service, because of our global reach.
best, Phoebe (speaking as a member of the Board)
There are two caveats: nobody can tell the future of human cultural history or any individual legal organization, and while the repository and wikis as a whole, and virtually all legally hostable media of genuine value, are preserved indefinitely, obviously no guarantee can be given concerning any specific individual image or article.
Beyond that, the best guarantee is the license it's under. The Foundation licenses all its data and content (with the sole exception of non-free images used to illustrate articles on local wikis) under a license that allows anyone to use, copy, amend, or distribute them. The explicit purpose of doing so is so that anyone wishing to can not only redistribute it, but if they are unhappy with its prospects in WMF's custodianship, they can take all of it and archive it or fork from it - that is, start their own version based on all content, descriptions, data and articles they wish to take and use.
That right is enshrined on Wikipedia in policy and license - it's known as the "*right to fork*" [i.e., to create derivatives and copies]. Our forking FAQ (http://en.wikipedia.org/wiki/Wikipedia:FAQ/Forking) expands on this, giving details of where data can be downloaded, and Wikipedia holds a list of websites that mirror its content (http://en.wikipedia.org/wiki/Category:Websites_which_use_Wikipedia) for anyone's use.
As the financial market crash proved, promises made by one organization are only useful insofar as that organization can promise to endure and meet them. Our approach is to spread our content and make sure others know we actively support re-archiving and reuse of it, ensuring that copies and archives will always exist.
At worst, I cannot be sure whether all data is routinely provided - a staff member can comment on this - but the policy, rights, traditions, choice of license, and endorsement of other sites doing so in practice are our way of ensuring a practical commitment is made.
FT2
On Thu, Jun 2, 2011 at 12:29 PM, Fae faenwp@gmail.com wrote:
Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae
OH, I see: Don't put your eggs all in one basket.
Fred
There are two caveats: nobody can tell the future of human cultural history or any individual legal organization, and while the repository and wikis as a whole, and virtually all legally hostable media of genuine value, are preserved indefinitely, obviously no guarantee can be given concerning any specific individual image or article.
Beyond that, the best guarantee is the license it's under. The Foundation licenses all its data and content (with the sole exception of non-free images used to illustrate articles on local wikis) under a license that allows anyone to use, copy, amend, or distribute them. The explicit purpose of doing so is so that anyone wishing to can not only redistribute it, but if they are unhappy with its prospects in WMF's custodianship, they can take all of it and archive it or fork from it - that is, start their own version based on all content, descriptions, data and articles they wish to take and use.
That right is enshrined on Wikipedia in policy and license - it's known as the "*right to fork*" [i.e., to create derivatives and copies]. Our forking FAQ (http://en.wikipedia.org/wiki/Wikipedia:FAQ/Forking) expands on this, giving details of where data can be downloaded, and Wikipedia holds a list of websites that mirror its content (http://en.wikipedia.org/wiki/Category:Websites_which_use_Wikipedia) for anyone's use.
As the financial market crash proved, promises made by one organization are only useful insofar as that organization can promise to endure and meet them. Our approach is to spread our content and make sure others know we actively support re-archiving and reuse of it, ensuring that copies and archives will always exist.
At worst I cannot be sure if all data is routinely provided - a staff member can comment on this - but the policy, rights, traditions, choice of license, and endorsement of other sites doing so in practice, is our way of ensuring a practical commitment is made.
FT2
On Thu, Jun 2, 2011 at 12:29 PM, Fae faenwp@gmail.com wrote:
Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae
OH, I see: Don't put your eggs all in one basket.
Fred
Actually, that is the benefit of giving Commons access to archives of images. Access by Commons under its license conventions gives access to everyone and results in all interesting images going viral.
There are images of little or no value, as anyone who has viewed an old photo album of family photos that has become disassociated from its family knows.
Fred
Hi all;
Just like the scripts to preserve wikis,[1] I'm working on a new script to download all Wikimedia Commons images, packed by day. But I have limited spare time. It is sad that volunteers have to do this without any help from the Wikimedia Foundation.
I have also started an effort on Meta (with low activity) to mirror the XML dumps.[2] If you know of universities or research groups which work with Wiki[pm]edia XML dumps, they would be good candidates for mirroring them.
If you want to download the texts to your PC, you only need 100 GB free and to run this Python script.[3]
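For readers who just want to grab a single dump file rather than run the full script, a minimal sketch along these lines should work; it assumes the standard dumps.wikimedia.org layout and is not the wikiteam script referenced above:

import shutil
import urllib.request

# Stream the latest English Wikipedia pages-articles dump straight to disk.
# The compressed file is tens of gigabytes, so check your free space first.
URL = ("https://dumps.wikimedia.org/enwiki/latest/"
       "enwiki-latest-pages-articles.xml.bz2")

with urllib.request.urlopen(URL) as response, open("enwiki-latest-pages-articles.xml.bz2", "wb") as out:
    shutil.copyfileobj(response, out)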
I have heard that the Internet Archive saves the XML dumps quarterly or so, but there has been no official announcement. I also heard about the Library of Congress wanting to mirror the dumps, but there has been no news for a long time.
L'Encyclopédie has an "uptime"[4] of 260 years[5] and growing. Will Wiki[pm]edia projects reach that?
Regards, emijrp
[1] http://code.google.com/p/wikiteam/ [2] http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps [3] http://code.google.com/p/wikiteam/source/browse/trunk/wikipediadownloader.py [4] http://en.wikipedia.org/wiki/Uptime [5] http://en.wikipedia.org/wiki/Encyclop%C3%A9die
2011/6/2 Fae faenwp@gmail.com
Hi,
I'm taking part in an images discussion workshop with a number of academics tomorrow and could do with a statement about the WMF's long-term commitment to supporting Wikimedia Commons (and other projects) in terms of the public availability of media. Is there an official published policy I can point to that includes, say, a 10-year or 100-year commitment?
If it exists, this would be a key factor for researchers choosing where to share their images with the public.
Thanks, Fae -- http://enwp.org/user_talk:fae Guide to email tags: http://j.mp/faetags