Hey all,
while using shortened URLs from WDQS in the WikiCite report draft https://meta.wikimedia.org/wiki/WikiCite_2016/Report, it occurred to me that tinyurl.com is blacklisted on Meta.
This is a major problem as it prevents concise URLs for gigantic queries from being linked from other Wikimedia wikis. Has anyone thought of this issue (Stas, Jonas?), in particular: should we ask Meta to remove the domain from the blacklist or potentially consider another URL shortening solution?
Dario
Hi!
> This is a major problem as it prevents concise URLs for gigantic queries from being linked from other Wikimedia wikis. Has anyone thought of this issue (Stas, Jonas?), in particular: should we ask Meta to remove the domain from the blacklist or potentially consider another URL shortening solution?
Yes, we thought of it (T112715) but since Wikimedia's own URL shortener (T108557) is not up yet, we have to use what is there. If you have an idea of a shortener more suitable than tinyurl.com, we could replace/add it. So far we didn't find a better alternative.
On Wed, Jun 1, 2016 at 9:26 AM, Stas Malyshev smalyshev@wikimedia.org wrote:
> Hi!
>> This is a major problem as it prevents concise URLs for gigantic queries from being linked from other Wikimedia wikis. Has anyone thought of this issue (Stas, Jonas?), in particular: should we ask Meta to remove the domain from the blacklist or potentially consider another URL shortening solution?
> Yes, we thought of it (T112715) but since Wikimedia's own URL shortener (T108557) is not up yet, we have to use what is there.
Got it.
> If you have an idea of a shortener more suitable than tinyurl.com, we could replace/add it. So far we didn't find a better alternative.
I don't; it probably depends on which shorteners are most used for spam across Wikimedia projects. Maybe someone familiar with URL blacklisting on the major wikis can comment?
> -- Stas Malyshev smalyshev@wikimedia.org
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Hi!
In theory, Phabricator [1] has a URL shortener, but it is not enabled.
Cheers, Jons
[1] https://phabricator.wikimedia.org/phurl/
2016-06-01 10:33 GMT+02:00 Dario Taraborelli dtaraborelli@wikimedia.org:
> --
> Dario Taraborelli, Head of Research, Wikimedia Foundation wikimediafoundation.org • nitens.org • @readermeter http://twitter.com/readermeter
Any time I find a URL shortener I add it to the global blacklist, as do other stewards and Meta admins. That's because URL shorteners are the most common way to circumvent blacklisting.
Vito
Hi Dario (et al),
I'm not deeply familiar with the spam blacklist issues (haven't worked with it for some years) but the problem I can see is that if we pick a currently OK third-party shortener and allow that, it'll be incredibly tempting for people to start using it for spam...
If we're wanting to have a shortener, it'd probably need to have fairly restricted inputs to avoid this problem, which comes straight back round to "create our own and restrict what people can do with it". Not sure there's an easy solution to this.
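To make the "restricted inputs" idea concrete, here is a minimal sketch, assuming a hypothetical allowlist of hosts. The names `ALLOWED_HOSTS` and `is_shortenable` are invented for illustration and do not describe any existing Wikimedia service.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this imagined shortener would accept.
ALLOWED_HOSTS = {"query.wikidata.org"}


def is_shortenable(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    # hostname is lowercased by urlsplit and excludes any port or credentials,
    # so look-alike hosts such as query.wikidata.org.example.net are rejected.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


if __name__ == "__main__":
    for candidate in (
        "https://query.wikidata.org/#SELECT%20%2A",
        "https://example.com/spam",
    ):
        print(candidate, is_shortenable(candidate))
```

A shortener limited this way cannot be pointed at arbitrary sites, which is what makes third-party shorteners attractive for getting around the blacklist.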
Andrew.
On Wed, Jun 1, 2016 at 2:00 PM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
> Dario Taraborelli, 01/06/2016 10:33:
>> I don't, it probably depends on what shorteners are most used for spam purposes across Wikimedia projects. Maybe someone familiar with URL blacklisting from major wikis can comment?
> Nearly all URL shorteners get blacklisted, eventually.
That makes sense. It sounds like in the short term (and until we have a Wikimedia-operated shortener), using full URLs from WDQS is, alas, the only way to go. One option we haven't mentioned would be for WDQS itself to support URL shortening; I have no idea where that would sit in terms of priorities.
Dario
> Nemo
I suggest considering w3id
Julie
For many events at Wikimedia NYC, we use a shortcut (once you're on Wikipedia) like "WP:Harlem", also known as "Wikipedia:Harlem".
The redirect (before it redirects): https://en.wikipedia.org/w/index.php?title=Wikipedia:Harlem&redirect=no
It redirects to the page for the Schomburg edit-a-thon we held earlier this year: https://en.wikipedia.org/wiki/Wikipedia:Meetup/NYC/Schomburg
I understand why URL shorteners are non-ideal. I support that, I guess. These Wikipedia shortcuts have been very helpful, although they probably don't work across projects. Another reason to avoid Meta #sigh
Best,
- Erika
*Erika Herzog* Wikipedia *User:BrillLyle* https://en.wikipedia.org/wiki/User:BrillLyle Secretary, Wikimedia NYC https://en.wikipedia.org/wiki/Wikipedia:Meetup/NYC
Hi!
> that makes sense. It sounds like in the short term (and until we have a Wikimedia-operated shortener), using full URLs from WDQS is – alas – the only way to go. One option we haven't mentioned would be for WDQS itself to support URL shortening, I have no idea where that would sit in terms of priorities.
I don't think it'd be very high. It would require building infrastructure that the wiki extension already has, for the sole purpose of implementing functionality that the extension already provides. So I think the best approach is just to wait until it's deployed, and maybe gently prod the responsible people from time to time :)
Hi there, may I ask what link shorteners provide you that w3id does not? E.g. baked-in metrics or 10-character URLs? Just curious why you would want to reimplement.
Julie
Hi!
> Hi there, may I ask what link shorteners provide you that w3id does not? Eg baked in metrics or 10char urls? Just curious why you would want to reimplement.
From what I can see, the target audience of w3id is a relatively small set of very stable URL prefixes that are used a lot and never change. It also does not aim to make these URLs shorter; it aims to make them stable. Adding a URL namespace to it is a manual process, and individual URLs are not stored.
URL shorteners target a much bigger set of URLs, many of which are relatively low-use or transient but can be created automatically in great volumes. Shorteners make each URL shorter and store each one, at least for a while, as an individual piece of data.
So, for our purposes w3id would not be very useful.
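The distinction can be sketched in a few lines of Python. This is only an illustration of the two models described above, not how w3id or any real shortener is implemented; `PREFIX_RULES`, `resolve_prefix` and `Shortener` are hypothetical names, and the example.org URLs are placeholders.

```python
# Prefix model: a small, hand-maintained table of stable prefixes.
# Individual URLs are never stored; only the prefix is rewritten.
PREFIX_RULES = {"/example-ns/": "https://example.org/long/stable/path/"}


def resolve_prefix(path):
    """Rewrite a path by swapping a known prefix for its target, else None."""
    for prefix, target in PREFIX_RULES.items():
        if path.startswith(prefix):
            return target + path[len(prefix):]
    return None


class Shortener:
    """Per-URL model: every submitted URL is stored as its own record."""

    def __init__(self):
        self._by_key = {}
        self._by_url = {}

    def shorten(self, url):
        # Reuse the existing key if this exact URL was already stored.
        if url in self._by_url:
            return self._by_url[url]
        # Keys are a simple hexadecimal counter, created automatically.
        key = format(len(self._by_key) + 1, "x")
        self._by_key[key] = url
        self._by_url[url] = key
        return key

    def resolve(self, key):
        return self._by_key.get(key)


if __name__ == "__main__":
    print(resolve_prefix("/example-ns/item/42"))
    shortener = Shortener()
    key = shortener.shorten("https://example.org/?q=" + "x" * 500)
    print(key, len(shortener.resolve(key)))
```

In the prefix model the short form is only as short as whatever follows the prefix, so a very long query string stays very long; in the per-URL model the key length does not depend on the URL at all.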
While I agree the primary aim isn't shortening, the result is usually much shorter by virtue of cutting out everything non-essential to identification. I say this as a non-stakeholder in w3id, just a fan.
Nevertheless, I take your point that it isn't automated. I would like to understand the use case better; could you describe a specific scenario?
Thanks Julie
On Thu, Jun 2, 2016 at 12:22 AM, Julie McMurry mcmurry.julie@gmail.com wrote:
> While I agree the primary aim isn't shortening, the result is usually much shorter by virtue of cutting out everything non essential to identification.
Except in the case of a giant query string, such as a complex SPARQL query.
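As a rough illustration of how quickly such a link grows, the sketch below percent-encodes a small SPARQL query into the fragment of a query.wikidata.org URL, which is where WDQS links carry the query. The query itself and the helper name `wdqs_link` are just examples chosen for this sketch.

```python
from urllib.parse import quote

# A deliberately small example query; real ones are often many times longer.
QUERY = """SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}"""


def wdqs_link(query, base="https://query.wikidata.org/"):
    """Build a full link by percent-encoding the whole query into the fragment."""
    return base + "#" + quote(query, safe="")


if __name__ == "__main__":
    link = wdqs_link(QUERY)
    print("query characters:", len(QUERY))
    print("link characters:", len(link))
```

Every space, brace, quote mark and newline becomes a three-character escape, so the link is always noticeably longer than the query, and no prefix rewrite can shrink the part that matters.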
I don't want to suggest a technology, but the URL shortener on WDQS has been EXTREMELY USEFUL to me and it would be a major bummer to lose it. I recently taught a class and used a variety of example SPARQL queries over Wikidata. Having that shortener made it much faster for me to assemble the lecture and give people easy ways to get to those queries.
I could have used another one on my own, I guess, but the current implementation is much faster and less error-prone when dealing with monster sparqling URLs...
Please find a way to keep it. -Ben
On Wed, Jun 1, 2016 at 9:51 PM, Tom Morris tfmorris@gmail.com wrote:
> On Thu, Jun 2, 2016 at 12:22 AM, Julie McMurry mcmurry.julie@gmail.com wrote:
>> While I agree the primary aim isn't shortening, the result is usually much shorter by virtue of cutting out everything non essential to identification.
> Except in the case of a giant query string, such as a complex SPARQL query.
Hi!
> I could have used another one on my own I guess, but the current implementation is much faster and less error prone when dealing with monster sparqling urls...
> please find a way to keep it
There are no plans to remove URL shortening. There are plans to switch the URL shortener to Wikimedia's own, which is supposed to arrive eventually, but until then we plan to use existing ones. We might change providers if a better one turns up, but we do not plan to remove the functionality.
Awesome, thank you!
In case it's useful for anyone, I was using Wikidata to teach biologists and chemists about knowledge graphs (many tinyurls toward the end): http://www.slideshare.net/goodb/computing-on-the-shoulders-of-giants
In the second part of the course, students were given the following Jupyter Python notebook, which runs a SPARQL query against Wikidata and generates an output file suitable for loading into Cytoscape (a network visualization tool commonly used in biology). It could be useful for others who need to teach this material: https://github.com/SuLab/sparql_to_pandas/blob/master/SPARQL_pandas.ipynb
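For anyone who wants the gist without opening the notebook, here is a minimal standard-library sketch of the same kind of step: flattening results in the SPARQL 1.1 JSON results layout into a tab-separated edge list. It is not the notebook's code; the function names, column names and the hand-written `SAMPLE` data are assumptions made for illustration, and no network request is made.

```python
import csv
import io


def bindings_to_rows(results, columns):
    """Flatten SPARQL JSON results into plain rows, one row per binding."""
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable that is unbound in a given row becomes an empty string.
        rows.append([binding.get(col, {}).get("value", "") for col in columns])
    return rows


def rows_to_tsv(rows, header):
    """Serialise rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hand-written sample in the SPARQL JSON results layout (placeholder values).
SAMPLE = {
    "head": {"vars": ["source", "target"]},
    "results": {
        "bindings": [
            {
                "source": {"type": "uri", "value": "http://example.org/A"},
                "target": {"type": "uri", "value": "http://example.org/B"},
            },
            {"source": {"type": "uri", "value": "http://example.org/B"}},
        ]
    },
}

if __name__ == "__main__":
    columns = SAMPLE["head"]["vars"]
    print(rows_to_tsv(bindings_to_rows(SAMPLE, columns), columns), end="")
```

A two-column table like this, one edge per row, is the sort of delimited file that network tools can typically import as a source/target edge list.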
Makes perfect sense now, thanks for explaining.
Julie