This is research I did for The Wikinewsie Group newsletter. We've been
looking at ways to increase community participation and content output.
Importing content from other sources has been suggested as a possible
roadmap to success in these areas, so we wanted to explore the implications
using a known case of this being done. Given the nature of news reporting,
the implications here are really only relevant to other Wikinews projects,
not other WMF projects.
Original copy available at
http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analy…
Wikinews Content Import Analysis
*An analysis of the impact of content import on community development and
traffic on Serbian Wikinews*
One of the current goals of members of The Wikinewsie Group is to increase
participation on all projects. Three members of the group’s provisional
board are active on English Wikinews. Behind the scenes, the English
Wikinews community also have this as a goal. One of the perennial solutions
suggested by outsiders to the project is to increase community
participation and content by importing content from other similarly
licensed news projects to English Wikinews. This essay looks at a case
study from another language Wikinews project to determine the impact of
external content import on overall content creation and community
participation, and whether it offers a path towards community content
creation and new contributor
recruitment.[1]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_C…
In 2009, Serbian Wikinews as a project, led by
Millosh<http://meta.wikimedia.org/wiki/User:Millosh>,
created a bot that imported content from similarly licensed Serbian
language news sources with the content being published on the
project.[2]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Conte…
The
import bot, and subsequent surpassing of English Wikinews in terms of
content production, was a point of pride for the project with the news
documented in *Balkan
Insight.*[3]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Cont…
At
the same time, English Wikinews was also debating doing something similar,
but this was ultimately rejected by the community because the articles
would not meet the project’s
guidelines.[4]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Co…
Since
then, the two projects have diverged greatly in terms of policies.
What has the impact been on both communities in terms of content creation
rates before and after October 2009, when Serbian Wikinews started
importing content from other news sources? For perspective, Serbian
Wikinews was created in July 2005 while English Wikinews was created in
November
2004.[5]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_…
The
importing bot on Serbian Wikinews was active until April 3, 2013, when it
was blocked by an administrator who cited its contributions as an
undesirable way to write
news.[6]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_…
All
comparisons between the two projects will be based on the start date for
Serbian Wikinews unless otherwise stated.
[7]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import…
The
purpose of this analysis is to understand what happened on Serbian Wikinews
in the context of content import, to see how this impacted contributor
participation, overall content creation and article traffic. Once
understood, this can serve as a potential guide for implementation or
non-implementation of similar solutions for increasing content and
participation.
One of the first ways to measure the raw impact of the content import on
Serbian Wikinews is to examine the slope for article creation before and
after the import. If the import was successful, the slope of the line for
article creation as a function of time would be steeper. Using the average
number of articles published per day in a month, the slope of the line for
the period between May 2005 and June 2009 was 1.05 for Serbian Wikinews and
-4.02 for English Wikinews. For the period between October 2009 and March
2013, the slope of the line is -0.29 for Serbian Wikinews and -4.57 for
English Wikinews. The growth on Serbian Wikinews increased steadily from
its founding until the period where they added the importing bot. Following
that period, the project saw a decline in growth. This contrasts with English
Wikinews, which saw a decline in growth over the whole period.
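Slopes like the ones quoted above can be computed with an ordinary least-squares fit of the monthly figures against a month index. The following is a minimal sketch, assuming that is the fitting method (the essay does not state it). The function name `ols_slope` and the sample figures are hypothetical and are not the Serbian or English Wikinews data.

```python
def ols_slope(xs, ys):
    """Return the least-squares slope of ys regressed on xs."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical average articles per day, one value per month.
months = [0, 1, 2, 3, 4, 5]
before_import = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # illustrative growth
after_import = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0]   # illustrative decline

slope_before = ols_slope(months, before_import)
slope_after = ols_slope(months, after_import)
print(slope_before, slope_after)
```

A positive slope means the monthly figure was rising over the period, and a negative slope means it was falling.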
[image: Graph en v sr
daily.png]<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_daily.png&g…
This is visualized in the graph above which shows the average daily article
production for both projects. English Wikinews has been incrementally
declining in total daily articles produced. Serbian Wikinews, in comparison,
was slowly increasing its daily production prior to the external content
import. Since that time, community production has decreased, and
rapidly. As one data point, this appears to suggest that the content import
is potentially a net negative for community content creation.
Another potential way to determine the impact of the content importing is
to look at the slope of the line for editors who have ever contributed to
the project before and after the content import. This gives a perspective
regarding the ability to attract new contributors. Using the total number
of registered users who have ever contributed by
month[8]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_…-8>,
the slope of the line for the period between May 2005 and June 2009 was
3.1451 for Serbian Wikinews and 0.0460 for English Wikinews. For the period
between October 2009 and March 2013, the slope of the line is -2.0084 for
Serbian Wikinews and 0.1814 for English Wikinews. When the bot import
activity was halved, the period between October 2010 and March 2013 still
does not show pre-content import levels of attracting new
contributors[9]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_C…
with
a slope of 2.211. The rate of historical contributor growth can be seen on
the graph below. Serbian Wikinews’s rate of attracting new contributors was
worse after the content importing. For the three-month period when the bot
was turned off, the slope is 1.5, which suggests that turning it off alone
did not assist in attracting new contributors.
[image: Graph en v sr history
editors.png]<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_history_e…
Another way of looking at contributors is to look at the total number of
active editors contributing to the main space, where articles are
published, on a monthly basis. The slope can be calculated to see the
relative increase or decrease based on that. The slope of the line for the
period between May 2005 and June 2009 was 1.2970 for Serbian Wikinews and
-0.1246 for English Wikinews. For the period between October 2009 and March
2013, the slope of the line is -1.4319 for Serbian Wikinews and -0.1108 for
English Wikinews. Prior to the introduction of the news importer, Serbian
Wikinews had more participants as it had more articles. In the period prior
to the introduction of the imported content, Serbian Wikinews saw a small
increase in the number of active contributors on a monthly basis. Following
the introduction of imported content, they saw a decline in contributors at
roughly the same rate.
Bot contributions could possibly be seen as facilitating the ease of human
participation on a project, with the content import on Serbian Wikinews
largely being done by a bot. The correlation between the total number of
human and bot edits to the main space was calculated to test the
possibility that this might be
happening.[10]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Co…
This
data is not available for all months. This lack of monthly data accounts
for using different time periods than other data. For the period between
July 2005 and April 2009, the correlation between total bot edits and human
edits to the main space was 0.249. The correlation for the period between
October 2009 and April 2013, the period when content import was active, was
0.467, which suggests a small correlation between increased bot edits and
increased human edits. The correlation between bot edits and human edits
between October 2010 and April 2013, when bot imports were halved, was
-0.149, which suggests a near-random relationship between human edits
to the main space and bot edits to the main space. Overall, this suggests
there is no conclusive link between bot contributions and human
contributions to a project.
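For readers who want to check figures of this kind, here is a minimal sketch of a correlation calculation. It assumes the Pearson product-moment coefficient, which the essay does not name explicitly. The function name `pearson_r` and the monthly edit counts are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between xs and ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly main-space edit counts (not the Wikinews data).
bot_edits = [100, 200, 300, 400, 500]
human_edits = [80, 60, 90, 70, 85]

r = pearson_r(bot_edits, human_edits)
print(round(r, 3))
```

Values near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little or none. A correlation on its own says nothing about whether bot edits cause human edits.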
One possible argument is that a community is not needed and human
production of original content is not required. The goal of projects is to
freely share knowledge: if this is knowledge that has been previously
published on another website and has a compatible license with the Wikinews
project and it can get traffic, a community is not necessarily a
requirement for the project to function. Traffic and imported content can
sustain the project.
With this thinking in mind, Serbian Wikinews’s model of content importing
could be argued as successful if it generated increased amounts of traffic
compared to periods when the import bot was not active and production rates
were lower. Using monthly traffic totals for Serbian
Wikinews[11]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Cont…
and
comparing that against daily production rates by month, the following graph
is generated.
[image: Graph en v sr
traffic.png]<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_traffic.p…
Serbian Wikinews had a surge in traffic in the initial period before the
implementation of the content importer following all time traffic lows.
Following this, the graph suggests that this content import led to a rise
in traffic, but this was not sustained. In fact, a large spike in traffic
occurred after the content import was
halved.[12]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Conte…
Traffic
in a number of periods actually appears higher than when the content import
was most
active.[13]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Conte…
Observational
data is, to a degree, also supported by correlation. The correlation
between the article traffic and total articles created by day from June
2008 to June 2009, the period before content import, was 0.72. This number
suggests a strong correlation between article production and traffic
totals. In the period between October 2009 and March 2013 when the import
bot was active, the correlation was 0.17. This suggests traffic relative to
article production was close to random. This is supported by the
correlation for the period between October 2009 and October 2010,
when the import bot was most active. In that period, the correlation was
-0.04, which suggests almost true randomness. In the period between October
2010 and March 2013, when the bot activity was halved, the correlation was
-0.47. This suggests that the greater the number of articles, the less
traffic Serbian Wikinews had. Serbian Wikinews did not benefit from
increased traffic during periods of increased article creation as a result
of content
import.[14]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Conte…
Serbian Wikinews has the largest archive of published news stories of any
news project. As of May 2013, their 75,000 articles account for 37% of all
content across all Wikinews projects. The next closest language project in
terms of content size is Polish Wikinews with about 25,000 articles and
English Wikinews with nearly 20,000 articles. The archived material could
be perceived as being useful by the wider community as a large archive of
historical news material. To determine this, the total monthly page views
were divided by the total number of articles on the project to determine the
relative access levels to these news stories as a historical archive. For
Serbian Wikinews, prior to the introduction of the content importer, the
project had an average of between 20 and 80 views per
article.[15]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Cont…
Following
the introduction of the content importing, the average monthly page views
per article drops to less than ten. As the graph below shows, this pattern
of per article drop off is consistent across English, Spanish and Polish
Wikinews, though for projects other than Serbian the percentage drop is
less.
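The per-article figure described above is a simple ratio of monthly page views to total articles. The sketch below uses hypothetical numbers (not the Wikinews statistics) and a hypothetical helper name, `views_per_article`, to show how the ratio behaves when the article count grows much faster than traffic.

```python
def views_per_article(monthly_page_views, total_articles):
    """Return average monthly page views per article on a project."""
    if total_articles <= 0:
        raise ValueError("total_articles must be positive")
    return monthly_page_views / total_articles


# Hypothetical month before a bulk import.
before = views_per_article(200_000, 2_500)
# Hypothetical month after: traffic up by half, article count up 24-fold.
after = views_per_article(300_000, 60_000)
print(before, after)
```

In this made-up example the ratio falls from 80 to 5 even though total traffic rose, because the denominator grew far faster than the numerator.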
[image: Graph en v sr traffic es
pl.png]<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_traffic_es_pl.…
The total average article views per month dropped for Serbian Wikinews and
it appears the project is not being used as a resource by Serbian speakers
to view historical news
stories.[16]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Cont…
On the whole, the data suggests that Serbian Wikinews did not benefit from
an increase in contributor-written news stories, a larger editing
community, or an increase in traffic as a result of news story content
import from other news reporting sites. The Serbian Wikinews community
appears to have recognised this as problematic when the importer was
blocked from the project in April 2013. If other
Wikinews
projects[17]<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Cont…
are
considering content import, they should consider the lessons from Serbian
Wikinews to see if the outcomes achieved by the import match with the
project’s own goals.
Notes
1.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-1>
While editor retention would be ideal to study, editor retention is
much harder to address without looking at the individual history of
contributors. The number of Serbian Wikinews contributors is small enough
to make this feasible. Some information can be gleaned by looking at
http://stats.wikimedia.org/wikinews/EN/TablesWikipediaSR.htm and this is
an area where further analysis may be useful.
2.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-2>
Details of this are documented on English Wikinews at
Wikinews:Water_cooler/miscellaneous/archives/2009/October<http://en.wiki…
.
3.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-3>
http://www.balkaninsight.com/en/article/serbian-wikinews-first-in-number-of…
4.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-4>
Supporters at the time included Juliancolton, ShakataGaNai, and
Tempodivalse. Contributors opposing included Blood Red Sandman, Pi zero,
BrianMcNeil, and Bawolff. A record of some of this conversation can be
found at Wikinews:Water_cooler/miscellaneous/archives/2009/October#Articles
copied from
VOA<http://en.wikinews.org/wiki/en:_Wikinews:Water_cooler/miscellaneous/…OA>.
The Serbian Wikinews bot was imported and beta tested on the project
starting in October 2009, with the announcement made on
Wikinews:Reports/October
2009 <http://en.wikinews.org/wiki/en:_Wikinews:Reports/October_2009>.
The request to test the bot is found
here<http://en.wikinews.org/w/index.php?title=Wikinews:Water_cooler/misc…
.
5.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-5>
Stats used for the dates and all the data for this analysis can be
found on or linked from
http://stats.wikimedia.org/wikinews/EN/TablesWikipediansEditsGt5.htm
6.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-6>
See
Посебно:Доприноси/Millbot-Beta<http://en.wikinews.org/wiki/sr:_%D0%9F%D0…
for
a history of the bot’s editing.
7.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-7>
Most of this analysis assumes there are no other major changes to
either project that would lead to “unnatural” changes in community output
and participation. This is not true for English Wikinews, which underwent
major changes in reviewing. This led to the creation of a fork in September
2011, with the date mentioned at
Wikizine/EN2011-128<http://meta.wikimedia.org/wiki/Wikizine/EN2011-128&g…28>.
The fork was closed and deleted in August 2012 according to English
Wikipedia’s Signpost at Wikipedia:Wikipedia Signpost/2012-08-20/News and
notes<http://en.wikipedia.org/wiki/en:_Wikipedia:Wikipedia_Signpost/2012…es>.
There are also other independent variables present on English Wikinews that
may possibly account for downward editing trends. Some of these mirror
patterns on English Wikipedia, including moves towards making information
more neutral, verifiable and greater enforcement of copyright policy.
8.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-8>
This number will always increase because it is not an average of who
has edited in a given month but historically how many people have ever
edited.
9.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-9>
As a point of contrast, following the Open Globe fork from English
Wikinews, the slope for new editor growth on English Wikinews was higher
than both of the periods mentioned. It was 0.232.
10.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-10>
This was done by adding the total number of editors to main space in
these groups.
11.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-11>
Data found at
http://stats.wikimedia.org/wikinews/EN/TablesPageViewsMonthly.htm ,
which provides stats from June 2008 to May 2013.
12.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-12>
The traffic averages in those periods suggest higher traffic, but this
is offset by the medians, which suggest the opposite. The following table
provides greater insight into traffic median and mean for these periods.
English, Spanish and Polish Wikinews traffic information is provided as a
basis for comparison.
Period                     | Math / Dates                | Median - Serbian | Mean - Serbian | Median - English | Mean - English | Median - Spanish | Mean - Spanish | Median - Polish | Mean - Polish
---------------------------|-----------------------------|------------------|----------------|------------------|----------------|------------------|----------------|-----------------|--------------
Prior to content import    | June 2008 - June 2009       | 186,000.00       | 183,384.62     | 5,700,000.00     | 5,838,461.54   | 703,000.00       | 724,846.15     | 712,000.00      | 714,615.38
Content import active      | October 2009 - March 2013   | 274,500.00       | 322,571.43     | 5,550,000.00     | 5,695,238.10   | 665,500.00       | 746,690.48     | 689,000.00      | 719,571.43
Post Open Globe fork       | September 2012 - June 2013  | 296,500.00       | 292,200.00     | 6,400,000.00     | 6,260,000.00   | 710,000.00       | 743,200.00     | 671,500.00      | 696,000.00
Content import halved      | October 2010 - March 2013   | 274,500.00       | 280,033.33     | 5,500,000.00     | 5,490,000.00   | 649,500.00       | 683,400.00     | 667,500.00      | 698,833.33
Content turned off         | March 2013 - June 2013      | 271,000.00       | 266,666.67     | 6,500,000.00     | 6,666,666.67   | 908,000.00       | 913,000.00     | 664,000.00      | 681,000.00
Content import most active | October 2009 - October 2010 | 256,000.00       | 405,461.54     | 6,100,000.00     | 6,153,846.15   | 876,000.00       | 878,384.62     | 794,000.00      | 746,461.54
13.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-13>
As a point of reference, English Wikinews has a completely different
pattern than Serbian Wikinews.
<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_traffic_en.png>
English Wikinews articles created per day versus article views.
English Wikinews correlations generally suggest that, to a small degree, the
greater the content production, the more views, though the correlation for
the period between March 2013 and May 2013 suggests the opposite is true:
the less content produced, the greater the page views. Similar patterns
also hold relatively true for Spanish Wikinews. The period between
September 2012 and May 2013, which is the period after the closure of the
English Wikinews fork, has a correlation of 0.853. For Spanish Wikinews,
when compared directly to Serbian Wikinews, the pre-content import period
of June 2008 to June 2009, has the most randomness for the relationship
between daily content production and page views with a correlation of 0.23.
<http://meta.wikimedia.org/wiki/File:Graph_en_v_sr_traffic_es.png>
The relationship between per day article production and views on Spanish
Wikinews.
14.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-14>
Given the nature of SEO and the amount of traffic derived from Google,
it is possible that Google’s algorithm gave less value to Serbian Wikinews
articles that were copies from other sites. Serbian Wikinews also lacks a
visible Twitter and Facebook account. For English and Spanish Wikinews,
where Google may prefer the content because it is original and is more
likely to put results higher in searches, Google related traffic may be
more consistent overall. It is also possible that English and Spanish
Wikinews traffic may also be dependent on other variables such as type of
content, social media efforts, incoming links from sister projects, etc.
15.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-15>
This number is based on dividing the total number of monthly page views by
the total number of articles. This number is likely not a true
reflection of actual views because page views includes all pages on a
project and, according to the stats page, contains bot generated traffic
totals which account for roughly 15% of all page views counted.
16.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-16>
Some of this can possibly be explained by total language speakers.
Serbian is spoken by approximately 9.2 million people compared to 40
million Polish speakers and 500 million Spanish speakers. This cannot
completely account for all the differences. The June 2008 starting point is
80 views for Serbian Wikinews compared to 119 for Polish Wikinews and 314
for Spanish Wikinews. If traffic was based on relative population of
speakers, Serbian should have started at a lower average or Polish should
have started at a higher point: The two are too close, despite one language
having about 5 times as many speakers.
17.
↑<http://meta.wikimedia.org/wiki/User:LauraHale/Wikinews_Content_Import_Analysis#cite_ref-17>
This research is not applicable to other Wikimedia projects, because
news is news. Once published, news stories are generally not refactored.
Instead, new news stories are published with updated information. This
implicitly differs from Wikipedia, Wiktionary and Wikivoyage, where
imported content could easily be refactored, changed and updated by the
community. Other research would need to be done to determine the success of
content import on community and traffic on other sister projects.
Analysis conducted by
LauraHale<http://meta.wikimedia.org/wiki/User:LauraHale>.
Raw data used for analysis available upon request.
--
twitter: purplepopple
blog:
ozziesport.com