Hello,
It is a great pleasure for me to let you all know that Wikimedia Israel has developed a web scraper that crawled various archives in Israel and uploaded more than 28K free images to Commons.
The tool (https://github.com/wmil-1946/wikiscraper) was developed to crawl the websites (listed below) and clean up the images, for example by removing watermarks.
Only images taken before 1946 were uploaded, as per the law in Israel and the United States.
Many volunteers joined the effort by categorizing, linking and using the images once they were uploaded.
It even got coverage in the Israeli media (https://www.calcalist.co.il/internet/articles/0,7340,L-3749654,00.html).
Commons categories containing the files: [1 (https://commons.wikimedia.org/wiki/Category:Files_from_JNF_uploaded_by_Wikim...) 2 (https://commons.wikimedia.org/wiki/Category:Files_from_ISA_uploaded_by_Wikim...) 3 (https://commons.wikimedia.org/wiki/Category:Files_from_Moshe_Sharett_Archive...) 4 (https://commons.wikimedia.org/wiki/Category:Files_from_Palmah_Archive_upload...) 5 (https://commons.wikimedia.org/wiki/Category:Files_from_GPO_uploaded_by_Wikim...)]
Thanks to all the people involved in this gigantic effort, and enjoy using historical images from the Holy Land! :)
Matanya
That's wonderful news. For those of us who don't speak Hebrew, can you say a bit more about how this project came about? -Pete
On Tue, Nov 13, 2018, 10:16 PM <matanya@foss.co.il> wrote:
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
Hello,
WOW! Thank you for your work!
Best
Steinsplitter
________________________________
From: Wikimedia-l <wikimedia-l-bounces@lists.wikimedia.org> on behalf of matanya@foss.co.il <matanya@foss.co.il>
Sent: Wednesday, 14 November 2018 07:15
To: wikimedia-l@lists.wikimedia.org
Subject: [Wikimedia-l] +28K images freed from Israel archives and uploaded to commons
Hi,
I would like to add more information to what Matanya shared with you. We are working on a blog post for the international blog, but meanwhile here is the summary:
If you already know this, it means you have been around for (too) many years, but six years ago Wikimedia Israel managed (https://blog.wikimedia.org/2012/12/19/wikimedia-chapter-wins-victory-for-free-licenses-in-israel/ [1]) to get the government to decide to allow the public to freely use images taken by government officials. Since then we have been working to release more and more photos owned by the government and government organizations.
In recent years we have contacted organizations that hold pictures taken before the establishment of the State of Israel, in order to convince them to upload the photos they have to Commons. Many of them claim on their websites that they own the pictures and hold the copyright, although this isn't true and those pictures are PD according to Israeli law. Some of them also added a watermark to these pictures to prevent people from using them without paying the archive.
After many of them refused either to change the copyright warnings or to give us the pictures, we decided two years ago to do something bold. *We decided that if they don't want to "give" the photos, we will take them without asking.* We had discussions within our board about it and about what we wanted to do, and we commissioned a detailed legal opinion to understand what we could do under Israeli law, what to avoid and what we must be careful about.
With that, we moved to a project that we kept secret and that only a few people knew about. We mapped all the organizations that hold such pictures and reviewed which of the sites could technically be crawled. We built several tools to crawl the sites, remove watermarks (where we could) and upload the pictures to Commons. To avoid overloading the sites, which could have exposed us to legal problems under other laws, and to avoid unnecessary attention, we ran these tools slowly and carefully. We reviewed the content and then started to upload it to Commons.
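For readers curious what "slowly and carefully" can look like in practice, here is a minimal Python sketch of a throttled crawl loop. It is not taken from the wikiscraper repository; the function name `crawl_slowly`, the `min_delay_seconds` parameter and the default 5-second delay are illustrative assumptions, and the actual tools may work quite differently.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def crawl_slowly(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    min_delay_seconds: float = 5.0,  # hypothetical value, not from the project
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, body) pairs, leaving at least min_delay_seconds
    between the start of one request and the start of the next."""
    last_request: Optional[float] = None
    for url in urls:
        if last_request is not None:
            # Wait out whatever is left of the minimum delay.
            remaining = min_delay_seconds - (clock() - last_request)
            if remaining > 0:
                sleep(remaining)
        last_request = clock()
        yield url, fetch(url)
```

The `fetch` callable is injected so the loop can be exercised without network access; for a real run one could pass something like `lambda u: urllib.request.urlopen(u).read()` after importing `urllib.request`.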
As Matanya mentioned, so far we have managed to download and upload photos from the *Israel State Archives, Jewish National Fund Archive, Moshe Sharett Archive, Palmah Archive, and the Government Press Office*.
When it was revealed yesterday in one of the largest newspapers in Israel [2], some of these organizations responded that they welcomed the operation (even though they had refused to transfer the pictures when we approached them), and some objected or did not respond.
I hope that this (short and incomplete) summary gives more information about this unique project.
[1] https://blog.wikimedia.org/2012/12/19/wikimedia-chapter-wins-victory-for-fre...
[2] https://www.calcalist.co.il/internet/articles/0,7340,L-3749654,00.html
*Itzik Edri* Chairperson itzik@wikimedia.org.il +972-54-5878078