FYI:
http://www.altmetric.com/blog/new-source-alert-wikipedia/
Pine
*This is an Encyclopedia* https://www.wikipedia.org/
*One gateway to the wide garden of knowledge, where lies The deep rock of our past, in which we must delve The well of our future, The clear water we must leave untainted for those who come after us, The fertile earth, in which truth may grow in bright places, tended by many hands, And the broad fall of sunshine, warming our first steps toward knowing how much we do not know.*
*—Catherine Munro*
Thanks, saw that! Really neat. We're working on it with Analytics :)
Dario and I just released our first static dump of identifiers. So far it includes only PubMed identifiers, but I'm running an extraction now to add DOIs. It turns out that DOIs are non-trivial to extract with regexes alone[1], so I wrote an island parser to extract them from wikimarkup[2], and it seems to perform very well.
Halfaker, Aaron; Taraborelli, Dario (2015): Scholarly article citations in Wikipedia. figshare. http://dx.doi.org/10.6084/m9.figshare.1299540 Retrieved 22:25, Feb 05, 2015 (GMT)
1. http://stackoverflow.com/questions/27910/finding-a-doi-in-a-document-or-page
2. https://github.com/halfak/Extract-scholarly-article-citations-from-Wikipedia...
-Aaron
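For anyone wondering why a regex alone struggles here, a minimal sketch follows. It is not the island parser from [2]; it is a simpler heuristic, written only to show the problem: a DOI suffix may contain almost any printable character, so a loose pattern runs on into template delimiters and sentence punctuation and has to be trimmed afterwards. The names `DOI_CANDIDATE`, `MARKUP_BREAKS`, `TRAILING_PUNCT` and `extract_dois` are made up for illustration, and the trimming rules are assumptions rather than a complete treatment of wikimarkup.

```python
import re

# Loose candidate pattern: "10.", a 4-9 digit registrant code, "/", then a
# run of non-whitespace characters. This deliberately over-matches, because
# DOI suffixes have very few forbidden characters.
DOI_CANDIDATE = re.compile(r'\b10\.\d{4,9}/\S+')

# Delimiters that usually belong to the surrounding markup, not to the DOI
# (an assumption; a rare DOI could legitimately contain some of these).
MARKUP_BREAKS = ('|', '}}', ']]', '<', '"')
TRAILING_PUNCT = '.,;:\'"'


def extract_dois(wikitext):
    """Return DOI-like strings found in wikitext (illustrative heuristic)."""
    dois = []
    for match in DOI_CANDIDATE.finditer(wikitext):
        candidate = match.group(0)
        # Cut the candidate at the first delimiter that looks like markup.
        for delimiter in MARKUP_BREAKS:
            index = candidate.find(delimiter)
            if index != -1:
                candidate = candidate[:index]
        candidate = candidate.rstrip(TRAILING_PUNCT)
        # Parentheses are legal inside a DOI, so drop a closing one only
        # when it has no matching opening parenthesis in the candidate.
        while candidate.endswith(')') and candidate.count(')') > candidate.count('('):
            candidate = candidate[:-1].rstrip(TRAILING_PUNCT)
        dois.append(candidate)
    return dois
```

Calling `extract_dois` on a string that mixes a `{{cite journal|doi=...}}` template with a DOI in running text shows both trimming steps at work. Anything this heuristic gets wrong (nested templates, HTML comments, URL-encoded DOIs) is the kind of case that motivates a real parser.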
On Thu, Feb 5, 2015 at 4:13 PM, Jake Orlowitz jorlowitz@gmail.com wrote:
Thanks, saw that! Really neat. We're working on it with Analytics :)
-- Jake Orlowitz
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Do I understand this correctly: citations of academic publications in Wikipedia articles will now be included in citation counts (at least for altmetrics)? While that's great recognition for Wikipedia as a corpus of scholarly work, does it mean Wikipedia will be overrun with academic authors adding citations to their own papers in any Wikipedia article they can get away with, in order to improve the citation counts on their CVs?
I note that we can generally spot self-citation because the two papers have an author name in common, but the ability to edit Wikipedia anonymously and pseudonymously means that we cannot spot it there.
While judging research purely on citation counts is a deeply flawed method of assessment, nonetheless it is a reality and the pressure on folks to "game" the system is tremendous given the role it can play in appointment, tenure, promotion and grant applications.
On the positive side, we might be able to get rid of a lot of citation-needed tags.
Kerry
_____
From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Pine W
Sent: Friday, 6 February 2015 8:13 AM
To: Wiki Research-l; Raymond Leonard; Wikimedia & GLAM collaboration [Public]; North American Cultural Partnerships
Subject: [Wiki-research-l] Altmetric.com now tracks Wikipedia citations
FYI:
http://www.altmetric.com/blog/new-source-alert-wikipedia/
Pine
And SEO spammers will add themselves, too! This is not a new problem.
I agree it's not a new worry, but it might change the nature of the problem a bit, and is worth at least being vigilant about. I did have a similar idea some years ago, to compute an "impact factor" for being-cited-on-Wikipedia, but after discussing it with some colleagues, didn't do so specifically because of the worry that it would encourage more gaming of Wikipedia citations. Of course it's inevitable that someone would eventually do it, but I still think it was probably right on balance to not push that date forward.
Regarding the SEO analogy, the external links on Wikipedia are on average not the best part of Wikipedia, so it's not a very heartening one. The citations for now are not nearly as spammy as the external links are, and I hope it stays that way!
It's of course not new that there is an incentive to spam citations. Even without explicit Wikipedia-citation-tracking, there are incentives to spam marginally relevant citations in order to increase perceived prominence. Maybe being in a Wikipedia article will get your paper in front of more grad students who will end up citing it "for real" after encountering it on Wikipedia, etc. A direct citation count feels like it's likely to exacerbate that, since now removing an irrelevant citation to someone's article is a direct attack on their metrics! Though it's possible the actual effect on editing patterns will be small.
From a research perspective, the new datasets of citations might be interesting to track over time, and correlate back to editors, to see if there are any interesting (or "interesting") patterns.
-Mark
-- mjn | http://www.anadrome.org
Oliver Keyes ironholds@gmail.com writes:
And SEO spammers will add themselves, too! This is not a new problem.
I agree that we could be doing something interesting with the social dynamics of Wikipedia editing by releasing this dataset -- and that some new problems may result. However, I think that it's much better to have too much academic interest than not enough. With a little AGF and diligence, we ought to be able to deal with this problem like we've dealt with quality control concerns in the past. Academics have to be very careful about their reputation, and it's hard to cite your own work unnecessarily without giving away who you are, since your name's going to be on the paper.
Either way, this is a useful dataset for library sciences work and it's public anyway. We're just making it easier to work with. Honestly, that's how I got started working in this space -- helping someone get data for their own research.
-Aaron
+1
(I for one have full confidence that attempts by an academic to force citations of their work will remain where they belong: in peer reviews of every paper I submit ;p)
I agree it's a good thing overall. I'm just alerting us to the potential problem it might create. I note it might not just be the academics themselves. In Australia at least, institutional research rankings are heavily based on citation counts. Our "Excellence in Research for Australia" (ERA) process creates massive institutional pressure to track down every possible citation, much of which is done by the library and admin teams. And with Wikipedia, it's so easy to create a new citation ... it's hard to believe some people won't be tempted ...
Are we able to extract the user names associated with adding links to academic papers? Some downstream analysis of that data could be interesting later on, especially if there do appear to be clusters of cited papers with common author names added by the same user name or IP address. There's almost certainly a publication in that! :-)
But as I say, so long as the papers are actually relevant where they are cited in Wikipedia, this is not a bad thing for Wikipedia if academics do decide to promote their work that way.
Kerry
We should be! We can sync templatelinks or externallinks entry timestamps with revision table timestamps. Sounds like a fun project!
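As a rough picture of what "who added which citation" could look like, here is a minimal sketch. It does not join link tables against the revision table as suggested above; instead it takes a simpler, assumed route: walk one page's revision texts oldest first and record the user whose revision first contains each DOI. The revision dict format, `first_adders`, `dois_per_user` and the loose `DOI_PATTERN` are all hypothetical names chosen for illustration.

```python
import re

# Loose DOI heuristic: stops at whitespace and common wikimarkup delimiters.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s|}\]<"]+')


def first_adders(revisions):
    """Map each DOI to the (user, timestamp) of the first revision containing it.

    `revisions` is an iterable of dicts with 'user', 'timestamp' and 'text'
    keys, ordered oldest first (a made-up in-memory format).
    """
    added_by = {}
    for revision in revisions:
        for doi in set(DOI_PATTERN.findall(revision['text'])):
            # setdefault keeps the earliest revision seen for this DOI.
            added_by.setdefault(doi, (revision['user'], revision['timestamp']))
    return added_by


def dois_per_user(added_by):
    """Group DOIs by the user who first added them."""
    grouped = {}
    for doi, (user, _timestamp) in added_by.items():
        grouped.setdefault(user, set()).add(doi)
    return grouped
```

Grouping the result by user gives the raw material for the cluster question: a user or IP address whose added DOIs share author names would stand out. Note the sketch credits only the first appearance, so a citation that is removed and later restored stays attributed to whoever introduced it originally.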
On 6 February 2015 at 16:15, Kerry Raymond kerry.raymond@gmail.com wrote:
I agree it’s a good thing overall. I’m just alerting us to the potential problem it might create. I note it might not just be the academics themselves. In Australia at least, institutional research rankings are heavily based on citation counts. Our “Excellence in Research Assessment” (ERA) process creates massive institutional pressure to track down every possible citation, much of which is done by the library and admin teams. And with Wikipedia, it’s so easy to create a new citation … it’s hard to believe some people won’t be tempted …
Are we able to extract the user names associated with adding links to academic papers? Some time downstream analysis of that data might be interesting, especially if there do appear to be clusters of cited papers with common author names added by the same user name or IP address. There’s almost certainly a publication in that! J
But as I say, so long as the papers are actually relevant where they are cited in Wikipedia, this is not a bad thing for Wikipedia if academics do decide to promote their work that way.
Kerry
From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Aaron Halfaker Sent: Saturday, 7 February 2015 1:25 AM To: Research into Wikimedia content and communities Subject: Re: [Wiki-research-l] Altmetric.com now tracks Wikipedia citations
I agree that we could be doing something interesting with the social dynamics of Wikipedia editing by releasing this dataset -- and that some new problems may result. However, I think that it's much better to have too much academic interest than not enough. With a little AGF and diligence, we ought to be able to deal with this problem like we've dealt with quality control concerns in the past. Academics have to be very careful about their reputation, and it's hard to cite your own unnecessarily without giving up who you are since your name's going to be on the paper.
Either way, this is a useful dataset for library sciences work and it's public anyway. We're just making it easier to work with. Honestly, that's how I got started working in this space -- helping someone get data for their own research.
-Aaron
On Fri, Feb 6, 2015 at 5:50 AM, mjn mjn@anadrome.org wrote:
I agree it's not a new worry, but it might change the nature of the problem a bit, and is worth at least being vigilant about. I did have a similar idea some years ago, to compute an "impact factor" for being-cited-on-Wikipedia, but after discussing it with some colleagues, didn't do so specifically because of the worry that it would encourage more gaming of Wikipedia citations. Of course it's inevitable that someone would eventually do it, but I still think it was probably right on balance to not push that date forward.
Regarding the SEO analogy, the external links on Wikipedia are on average not the best part of Wikipedia, so it's not a very heartening The citations for now are not nearly as spammy as the external links are, and I hope it stays that way!
It's of course not new that there is an incentive to spam citations. Even without explicit Wikipedia-citation-tracking, there are incentives to spam marginally relevant citations in order to increase perceived prominence. Maybe being in a Wikipedia article will get your paper in front of more grad students who will end up citing it "for real" after encountering it on Wikipedia, etc. A direct citation count feels like it's likely to exacerbate that, since now removing an irrelevant citation to someone's article is a direct attack on their metrics! Though it's possible the actual effect on editing patterns will be small.
From a research perspective, the new datasets of citations might be interesting to track over time, and correlate back to editors, to see if there are any interesting (or "interesting") patterns.
-Mark
-- mjn | http://www.anadrome.org
Oliver Keyes ironholds@gmail.com writes:
And SEO spammers will add themselves, too! This is not a new problem.
On Thursday, 5 February 2015, Kerry Raymond kerry.raymond@gmail.com wrote:
Do I understand this correctly? That Wikipedia articles that cite academic publications will be included in citation count now (at least for altmetrics). While that’s great recognition for Wikipedia as a corpus of scholarly work, does that mean Wikipedia will be overrun with academic authors adding citations to their academic papers in any Wikipedia article they can get away with in order to improve their citation counts for their CVs?
I note that generally we can spot self-citation because the two papers will have an author name in common, but with the ability to edit Wikipedia anonymously and pseudonymously means that we cannot spot self-citation.
While judging research purely on citation counts is a deeply flawed method of assessment, nonetheless it is a reality and the pressure on folks to “game” the system is tremendous given the role it can play in appointment, tenure, promotion and grant applications.
On the positive side, we might be able to get rid of a lot of citation-needed tags.
Kerry
*From:* wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] *On Behalf Of* Pine W
*Sent:* Friday, 6 February 2015 8:13 AM
*To:* Wiki Research-l; Raymond Leonard; Wikimedia & GLAM collaboration [Public]; North American Cultural Partnerships
*Subject:* [Wiki-research-l] Altmetric.com now tracks Wikipedia citations
FYI:
http://www.altmetric.com/blog/new-source-alert-wikipedia/
Pine
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
I thought that Wikipedia addressed the SEO problem by getting Google to not follow the off-wiki links when crawling, so that Wikipedia's page rank would not follow through to off-Wikipedia links. But I cannot (using Google) find the page where I read that.
While that doesn't prevent people from spamming Wikipedia with external links to catch people's eyeballs while reading Wikipedia, it should address the SEO problem somewhat.
Kerry
That's actually a Wikipedia thing, by putting
<a rel="nofollow" class="external text"
in the article source code. Internal articles in contrast say
<a href="" class="internal"
That's not a Google good will thing.
Sincerely, Laura Hale
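To make the distinction above concrete, here is a minimal sketch that reads rendered HTML and reports which links carry `rel="nofollow"`. It uses only Python's standard `html.parser`. The `SAMPLE` markup, the `LinkRelParser` class, and the `classify_links` helper are hypothetical illustrations, not actual Wikipedia output or MediaWiki code.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collect (href, is_nofollow) for every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; a valueless attribute parses as None
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((attr.get("href"), "nofollow" in rel_tokens))


def classify_links(html):
    """Return a list of (href, is_nofollow) pairs found in the HTML string."""
    parser = LinkRelParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup in the style described above, not real page output.
SAMPLE = (
    '<p><a rel="nofollow" class="external text" '
    'href="http://example.org/paper">paper</a> '
    '<a href="/wiki/Citation" title="Citation">Citation</a></p>'
)

if __name__ == "__main__":
    for href, nofollow in classify_links(SAMPLE):
        print(href, "nofollow" if nofollow else "followed")
```

A crawler that honors `nofollow` would skip the first link when passing on ranking credit and treat the second as an ordinary link, which is the behavior described above.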
On Fri, Feb 6, 2015 at 9:44 PM, Kerry Raymond kerry.raymond@gmail.com wrote:
I thought that Wikipedia addressed the SEO problem by getting Google to not follow the off-wiki links when crawling, so that Wikipedia's page rank would not follow through to off-Wikipedia links. But I cannot (using Google) find the page where I read that.
While that doesn't prevent people from spamming Wikipedia with external links to catch people's eyeballs while reading Wikipedia, it should address the SEO problem somewhat.
Kerry
It also requires SEO people to demonstrate a modicum of logical reasoning skills. Sadly, from my work on understanding our traffic trends, this appears to be beyond at least some of them.
On Friday, 6 February 2015, Laura Hale laura@fanhistory.com wrote:
That's actually a Wikipedia thing, by putting
<a rel="nofollow" class="external text"
in the article source code. Internal articles in contrast say
<a href="" class="internal"
That's not a Google good will thing.
Sincerely, Laura Hale
-- twitter: purplepopple