Hi all, I’m constantly encountering newspaper articles that have disappeared from Google and are no longer viewable or even discoverable via Google. They are often the sole reference url for statements, yet are behind a paywall - notably newspapers.com. How is the community handling the paywalling of historical newspaper resources? Thank you.
On Mon, 11 Nov 2019 at 16:44, PWN pariswritersnews@gmail.com wrote:
I’m constantly encountering newspaper articles that have disappeared from Google and are no longer viewable or even discoverable via Google. They are often the sole reference url for statements, yet are behind a paywall - notably newspapers.com. How is the community handling the paywalling of historical newspaper resources?
Whenever I cite something on Wikidata, or Wikipedia, I submit a copy to the Internet Archive's Wayback Machine using the add-on for Firefox:
https://github.com/jonathanmccann/archive-url-firefox-addon
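For anyone who prefers a script to a browser add-on, the same step can be sketched in Python. This is a minimal sketch and does not describe how the add-on itself works: it assumes the Wayback Machine's public "Save Page Now" address form `https://web.archive.org/save/<url>` and its availability endpoint `https://archive.org/wayback/available`. The function names are hypothetical, and the one function that touches the network is kept separate so the rest can be checked offline.

```python
import json
import urllib.parse
import urllib.request

# Assumed public Wayback Machine endpoints.
SAVE_PREFIX = "https://web.archive.org/save/"
AVAILABILITY_API = "https://archive.org/wayback/available"


def build_save_url(page_url: str) -> str:
    """Return the Save Page Now address for page_url (hypothetical helper)."""
    return SAVE_PREFIX + page_url


def build_availability_url(page_url: str) -> str:
    """Return the availability query for page_url, percent-encoding the URL."""
    return AVAILABILITY_API + "?" + urllib.parse.urlencode({"url": page_url})


def closest_snapshot(availability_json: str):
    """Extract the closest snapshot URL from an availability response, or None."""
    data = json.loads(availability_json)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to capture page_url and return the HTTP status.

    This performs a network request, so nothing calls it on import.
    """
    with urllib.request.urlopen(build_save_url(page_url), timeout=timeout) as resp:
        return resp.status
```

Whether a capture of a paywalled page contains the article text depends on what the site serves to the crawler, so it is worth opening the snapshot before relying on it as a reference.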
I like what Andy is doing. As a librarian, however, I would suggest an alternative: cite just the newspaper article, without the URL, as this analog citation would still be valid.
Regards,
Wynand van der Walt Head Librarian: Technical Services Rhodes University Library
On Mon, Nov 11, 2019 at 7:28 PM Andy Mabbett andy@pigsonthewing.org.uk wrote:
Whenever I cite something on Wikidata, or Wikipedia, I submit a copy to the Internet Archive's Wayback Machine using the add-on for Firefox:
https://github.com/jonathanmccann/archive-url-firefox-addon
-- Andy Mabbett @pigsonthewing http://pigsonthewing.org.uk
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
I don't know if I agree with "just citing" the newspaper article. Why not push for resolvable citations if we have the technology? There is not much value in a citation if you can't access the source to verify it, don't you think?
On Tue, Nov 12, 2019 at 6:38 AM Wynand van der Walt wynlib40@gmail.com wrote:
I like what Andy is doing. As a librarian, however, I would suggest an alternative: cite just the newspaper article, without the URL, as this analog citation would still be valid.
Regards,
Wynand van der Walt Head Librarian: Technical Services Rhodes University Library
Hoi, It depends. When you approach it from a scientific point of view, you write your paper and are personally responsible for what you write. The bias of your paper may include the effort you take to verify sources. In Wikipedia it is NOT your paper and it is NOT only your responsibility. At the opposite end are the papers in a language you do not understand, which are consequently not accepted as a source; there are the papers, books that have not been
So a Wikipedia is biased because of the limiting of sources, and as it is NOT a personal responsibility, there is also the question of whether we should accept the bias that this limiting brings. Thanks, GerardM
On Tue, 12 Nov 2019 at 08:03, Andra Waagmeester andra@micel.io wrote:
I don't know if I agree with "just citing" the newspaper article. Why not push for resolvable citations if we have the technology? There is not much value in a citation if you can't access the source to verify it, don't you think?
I see a gradual shift from open news articles to articles behind a paywall, which I guess is a natural adjustment of the income model as newspapers transition from paper to the Internet. Good articles, suitable as sources in Wikipedia and Wikidata, will more and more likely be behind paywalls. We need to get used to that, and it has also been the case for books, to some degree, for a long time.
The Internet Archive, tools like the one Andy Mabbett refers to, and the 'archive-url' parameter of (the English) Wikipedia's cite news template as well as Wikidata's P1065 may be a good solution, but we may also keep in mind that it might not be sustainable copyright-wise in the long term, cf. the discussions around controlled lending and the Society of Authors' concerns https://publishingperspectives.com/2019/01/copyright-battle-internet-archive... (not currently behind a paywall :)
/Finn
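The citation fields mentioned above can be sketched as follows. This is an illustrative Python helper that assembles a `{{cite news}}` call with `archive-url` and `archive-date`; the helper name and all sample values are hypothetical, and the `url-access=subscription` flag is an assumption about how one might want to mark a paywalled link, not something proposed in this thread.

```python
def cite_news(title, work, date, url, archive_url=None, archive_date=None,
              paywalled=False):
    """Build {{cite news}} wikitext (hypothetical helper, for illustration only).

    Values are inserted as-is, so a literal "|" inside a value would need
    escaping before use on a wiki.
    """
    if archive_url and not archive_date:
        # An archive link without its capture date is treated as incomplete here.
        raise ValueError("archive_url requires archive_date")
    fields = [("title", title), ("work", work), ("date", date), ("url", url)]
    if paywalled:
        fields.append(("url-access", "subscription"))
    if archive_url:
        fields.append(("archive-url", archive_url))
        fields.append(("archive-date", archive_date))
    return "{{cite news |" + " |".join(f"{k}={v}" for k, v in fields) + "}}"
```

Keeping the title, work and date in the citation means it remains a valid analog reference even if both the live URL and the archived copy become unavailable.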
On 12/11/2019 08:17, Gerard Meijssen wrote:
Hoi, It depends. When you approach it from a scientific point of view, you write your paper and are personally responsible for what you write. The bias of your paper may include the effort you take to verify sources. In Wikipedia it is NOT your paper and it is NOT only your responsibility. At the opposite end are the papers in a language you do not understand, which are consequently not accepted as a source; there are the papers, books that have not been
So a Wikipedia is biased because of the limiting of sources, and as it is NOT a personal responsibility, there is also the question of whether we should accept the bias that this limiting brings. Thanks, GerardM
Hoi, As more and more conditions are piled up, bias is not reduced but induced. Thanks, GerardM
On Tue, 12 Nov 2019 at 09:55, fn@imm.dtu.dk wrote:
I see a gradual shift from open news articles to articles behind a paywall, which I guess is a natural adjustment of the income model as newspapers transition from paper to the Internet. Good articles, suitable as sources in Wikipedia and Wikidata, will more and more likely be behind paywalls. We need to get used to that, and it has also been the case for books, to some degree, for a long time.
The Internet Archive, tools like the one Andy Mabbett refers to, and the 'archive-url' parameter of (the English) Wikipedia's cite news template as well as Wikidata's P1065 may be a good solution, but we may also keep in mind that it might not be sustainable copyright-wise in the long term, cf. the discussions around controlled lending and the Society of Authors' concerns
https://publishingperspectives.com/2019/01/copyright-battle-internet-archive... (not currently behind paywall :)
/Finn
To guard against link-rot of news websites when they're used as Reference for statements in Wikidata, is there any bot that is systematically going around and collecting new URLs that are added with the Reference URL property (P854), adding them to Internet Archive, copying the result, and then adding that as Archive URL property (P1065) + Archive Date (P2960) back into the original statement?
If not, would that be feasible, and a good idea?
-Liam
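The last step of the workflow asked about above can be sketched in Python. This is a minimal sketch under stated assumptions, not a description of any existing bot: it assumes snapshot URLs of the form `https://web.archive.org/web/<14-digit timestamp>/<original URL>` and the Wikibase JSON layout for snaks, and the helper names `archive_snaks` and `needs_archive` are hypothetical. It only builds the data for P1065 and P2960; it does not edit Wikidata.

```python
import re

# Assumed shape of a Wayback Machine snapshot URL.
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")
# Item for the proleptic Gregorian calendar, used as the calendar model.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def archive_snaks(snapshot_url: str) -> dict:
    """Build P1065 (archive URL) and P2960 (archive date) snaks from a snapshot URL."""
    match = WAYBACK_RE.match(snapshot_url)
    if not match:
        raise ValueError(f"not a Wayback Machine snapshot URL: {snapshot_url}")
    stamp = match.group(1)  # YYYYMMDDhhmmss
    day = f"+{stamp[0:4]}-{stamp[4:6]}-{stamp[6:8]}T00:00:00Z"
    return {
        "P1065": [{
            "snaktype": "value",
            "property": "P1065",
            "datavalue": {"value": snapshot_url, "type": "string"},
        }],
        "P2960": [{
            "snaktype": "value",
            "property": "P2960",
            "datavalue": {
                "type": "time",
                "value": {
                    "time": day,
                    "timezone": 0,
                    "before": 0,
                    "after": 0,
                    "precision": 11,  # day precision
                    "calendarmodel": GREGORIAN,
                },
            },
        }],
    }


def needs_archive(reference_snaks: dict) -> bool:
    """True if a reference has a P854 URL but no P1065 archive URL yet."""
    return "P854" in reference_snaks and "P1065" not in reference_snaks
```

A bot along these lines would call `needs_archive` on each reference, obtain a snapshot for the P854 URL, and merge the result of `archive_snaks` into that reference before saving it.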
I believe that’s what InternetArchiveBot https://www.wikidata.org/wiki/User:InternetArchiveBot is for, but apparently at the moment it only updates references when they break; at least, that’s how Ymblanter summarized the request for permissions https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/InternetArchiveBot. Also, its contributions https://www.wikidata.org/wiki/Special:Contributions/InternetArchiveBot show no edits since October 29; I’m not sure if that’s a problem.
Cheers, Lucas
On 13.11.19 01:03, Liam Wyatt wrote:
To guard against link-rot of news websites when they're used as Reference for statements in Wikidata, is there any bot that is systematically going around and collecting new URLs that are added with the Reference URL property (P854), adding them to Internet Archive, copying the result, and then adding that as Archive URL property (P1065) + Archive Date (P2960) back into the original statement?
If not, would that be feasible, and a good idea?
-Liam
Let's not spread publisher FUD on Wikimedia lists, please.
Liam Wyatt, 13/11/19 02:03:
is there any bot that is systematically going around and collecting new URLs that are added with the Reference URL property (P854), adding them to Internet Archive,
As long as MediaWiki is not broken, and the URLs are announced on the expected venues (I believe https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams#API), yes, the Internet Archive immediately crawls them. This is documented at https://www.mediawiki.org/wiki/Archived_Pages (sort of).
Federico
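To make the announcement mechanism above more concrete, here is a minimal Python sketch of how a consumer could read such a feed. It is an illustration only and does not describe the Internet Archive's crawler: it assumes the `recentchange` stream at `https://stream.wikimedia.org/v2/stream/recentchange`, assumes the Server-Sent Events format with one JSON payload per `data:` line, and uses a hypothetical function name.

```python
import json

# Assumed stream address; the thread only links the EventStreams documentation.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def wikidata_changes(lines):
    """Yield recent-change events for wikidatawiki from Server-Sent Events lines.

    `lines` is any iterable of str or bytes lines, for example an open
    HTTP response for STREAM_URL.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if not line.startswith("data:"):
            continue  # skip "event:", "id:" and blank separator lines
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if isinstance(event, dict) and event.get("wiki") == "wikidatawiki":
            yield event
```

Each yielded event identifies the edited page; finding the URLs added by that edit would need a further lookup of the revision, which this sketch leaves out.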
On Wed, Nov 13, 2019 at 4:07 AM Federico Leva (Nemo) nemowiki@gmail.com wrote:
Let's not spread publisher FUD on Wikimedia lists, please.
I was about to say the same thing... else the lists will be taken over by FUD/BCH wars: "we may also keep in mind that 'limiting the availability, sourcing, and reusability of a claim' may not be sustainable reliability-wise, as sources that impose those limits are increasingly being challenged as biased, unaccountable, and unreliable."
Liam Wyatt, 13/11/19 02:03:
is there any bot that is systematically going around and collecting new URLs that are added with the Reference URL property (P854), adding them to Internet Archive,
As long as MediaWiki is not broken, and the URLs are announced on the expected venues (I believe https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams#API), yes, the Internet Archive immediately crawls them. This is documented at https://www.mediawiki.org/wiki/Archived_Pages (sort of).
How complete is the coverage now? /S
Dear all, Thank you for your answers. I highly recommend that you read https://meta.wikimedia.org/wiki/Talk:Wikidata/Strategy/2019#Feedback_and_que... which deals in part with this important topic. The problem is that, when using a newspaper article, we cannot be sure whether the article is neutral. This has caused a lot of significant problems in Wikipedia, and we do not want to reproduce them in Wikidata. Yours sincerely, Houcemeddine Turki
________________________________
From: Wikidata wikidata-bounces@lists.wikimedia.org on behalf of Samuel Klein meta.sj@gmail.com
Sent: Wednesday, 13 November 2019 14:21
To: Discussion list for the Wikidata project wikidata@lists.wikimedia.org
Cc: Mark Graham mark@archive.org
Subject: Re: [Wikidata] References to newspaper articles behind paywalls like newspapers.com