Hi Dan,
It probably should still output a warning and list all identical files, so they can be tackled manually after the upload. Giving preference to the media file from the GLAM probably makes sense, but you still want to substitute any other identical files, right?
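To make the warning idea a bit more concrete, here is a minimal sketch in Python. It is not GWToolset code, only an illustration under assumptions: the function names are made up, and `existing_by_sha1` is a hypothetical lookup table (SHA-1 hex digest to existing file titles) standing in for however the wiki would actually be queried. It hashes each file in a batch and produces one warning line per file that already has digitally identical copies.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(batch_paths, existing_by_sha1):
    """Map each batch file to the existing titles that share its SHA-1.

    existing_by_sha1 is a hypothetical dict {sha1_hex: [file titles]};
    it is an assumption of this sketch, not a real GWToolset structure.
    """
    report = {}
    for path in batch_paths:
        matches = existing_by_sha1.get(sha1_of_file(path), [])
        if matches:
            report[str(path)] = list(matches)
    return report


def format_warnings(report):
    """Render one warning line per batch file that has identical copies."""
    return [
        f"WARNING: {path} is identical to: {', '.join(titles)}"
        for path, titles in sorted(report.items())
    ]
```

The upload itself would go ahead regardless; the list of warnings is only what gets handed to whoever cleans up manually afterwards.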
When manually tackling identical files, the following potential issues should be looked at:
- Existing files may already be included in Wikipedia articles; it would probably make sense to replace them with the newly uploaded version.
- Metadata of existing files may be more complete than, or complementary to, the metadata provided by the GLAM, especially if it has been enhanced by the community (translations, etc.); it would certainly make sense not to throw away the additional metadata contributed by the community.
- There might be derivatives based on existing files; it certainly makes sense to ensure that they can be properly tracked to the original file.
This list may not be complete; maybe someone who is actively involved in uploads and has encountered the problem of such duplicates wants to go through it, complement it and add it to the help/documentation pages...
Have a nice weekend!
Beat
-----Original Message-----
From: glamtools-bounces@lists.wikimedia.org [mailto:glamtools-bounces@lists.wikimedia.org] On Behalf Of dan entous
Sent: Samstag, 3. Mai 2014 10:03
To: Conversations revolving around the development of GLAM Digital Tools
Subject: Re: [Glamtools] Advice on uploading a batch from a GLAM when individuals have already uploaded some of that GLAMs images?
GWToolset ignores the SHA-1 duplication warning. as far as i remember, the intent is to make sure the source of the mediafile and metadata is from the GLAM.
with kind regards, dan
On May 3, 2014, at 09:36 , Fæ faewik@gmail.com wrote:
On 03/05/2014, Brian Wolff bawolff@gmail.com wrote:
On May 2, 2014 5:40 AM, "Fæ" faewik@gmail.com wrote:
I have had many issues around this in the past. If the images are the same in quality/resolution then avoid duplicating what is currently on Commons. However if your versions are, in your view, better quality then there is no problem uploading them as they are not true duplicates. Digitally identical duplicates should be rejected automatically at upload as the files have matching SHA-1 checks.
That's from normal upload. GWToolset may be different. Anyway, they can be dealt with after the fact too if they are exactly identical, as it's easy to detect later.
Just to clarify, does the GWT ignore the SHA-1 based duplicate warning and upload the digitally identical duplicate as a new file?
If it does, rather than skipping it or giving a warning, then this seems like a bug.
Fae
faewik@gmail.com https://commons.wikimedia.org/wiki/User:Fae
Glamtools mailing list Glamtools@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/glamtools