Hi Dan,
It probably should still output a warning and list all identical files, so they can be
tackled manually after the upload.
Giving preference to the media file from the GLAM probably makes sense, but you still want
to substitute any other identical files, right?
When manually tackling identical files, the following potential issues should be looked
at:
- Existing files may already be used in Wikipedia articles - it would probably make
sense to replace them with the newly uploaded version
- Metadata of existing files may be more complete than, or complementary to, the metadata
provided by the GLAM, especially if it has been enhanced by the community (translations,
etc.) - it certainly would make sense not to throw away this additional metadata that
has been contributed by the community.
- There might be derivatives based on existing files - it certainly makes sense to ensure
that they can be properly traced back to the original file
This list may not be complete; maybe someone actively involved in uploads who has encountered
the problem of such duplicates wants to go through the list, complement it and add it to
the help/documentation pages...
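To make the "list all identical files" step concrete, here is a minimal sketch in Python of how a batch could be checked for digitally identical files before or after upload. It is not how GWToolset works internally. The function names and the `existing_by_sha1` lookup are hypothetical; on Commons such a lookup could perhaps be filled from the MediaWiki API (`list=allimages` accepts an `aisha1` parameter), but that part is left out here and is an assumption about the workflow.

```python
import hashlib
from collections import defaultdict


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(batch_paths, existing_by_sha1):
    """Map each batch file to the existing file titles sharing its SHA-1.

    existing_by_sha1 is a hypothetical lookup {sha1_hex: [file titles]}
    describing files that are already on the wiki. Batch files without
    a match are left out of the report.
    """
    report = defaultdict(list)
    for path in batch_paths:
        matches = existing_by_sha1.get(sha1_of_file(path), [])
        if matches:
            report[str(path)].extend(matches)
    return dict(report)
```

The returned report only lists the matches; deciding what to do with each pair (replace usage in articles, merge metadata, re-link derivatives) stays a manual step, as described in the list above.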
Have a nice weekend!
Beat
-----Original Message-----
From: glamtools-bounces(a)lists.wikimedia.org [mailto:glamtools-bounces@lists.wikimedia.org]
On Behalf Of dan entous
Sent: Samstag, 3. Mai 2014 10:03
To: Conversations revolving around the development of GLAM Digital Tools
Subject: Re: [Glamtools] Advice on uploading a batch from a GLAM when individuals have
already uploaded some of that GLAMs images?
GWToolset ignores the SHA-1 duplication warning. as far as i remember, the intent is to
make sure the source of the mediafile and metadata is from the GLAM.
with kind regards,
dan
On May 3, 2014, at 09:36 , Fæ <faewik(a)gmail.com> wrote:
On 03/05/2014, Brian Wolff <bawolff(a)gmail.com> wrote:
On May 2, 2014 5:40 AM, "Fæ" <faewik(a)gmail.com> wrote:
I have had many issues around this in the past. If the images are the
same in quality/resolution then avoid duplicating what is currently
on Commons. However if your versions are, in your view, better
quality then there is no problem uploading them as they are not true
duplicates.
Digitally identical duplicates should be rejected automatically at
upload as the files have matching SHA-1 checks.
That's from normal upload. GWToolset may be different. Anyway, they
can be dealt with after the fact too if they are exactly identical, as
it's easy to detect later.
Just to clarify, does the GWT ignore the SHA-1 based duplicate warning
and upload the digitally identical duplicate as a new file?
If it does, rather than skipping it or giving a warning, then this
seems like a bug.
Fae
--
faewik(a)gmail.com
https://commons.wikimedia.org/wiki/User:Fae
_______________________________________________
Glamtools mailing list
Glamtools(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/glamtools