The graphs are for Wikipedia, one language edition at a time. So the graphs for 'en' give the data for 'en' alone.

They don't seem to show any dip associated with the launch of Wikidata.

On Sun, Jul 19, 2015 at 9:37 PM, WereSpielChequers <> wrote:
Is the longevity measured within projects or across all projects? Anecdotally, the launch of Wikidata has cost various Wikipedias a number of editors as they migrated to Wikidata.

On Sunday, 19 July 2015, jeph <> wrote:
Hi Pine,

If you want to look at editor longevity/retention, you need to look at Editor Cohort Longevity. E.g.: a screenshot of the editor cohort longevity graph with the filter set at 5% - 100% for 'en'.

The link you mentioned shows the total edit activity of a cohort over its lifetime.

The editor longevity/retention measured by active edit sessions in 'en' fell from 2005 levels and has since stabilized at the lower 2007 levels. A graph with a similar result was pointed out to me by Atlasowa.
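
One way to read "longevity of a cohort" is as a retention curve: the share of a cohort's editors who still have at least one session k months after joining. The sketch below is only an illustration under that assumption; it is not necessarily how the Editor Cohort Longevity graph is computed, and the function names and sample data are hypothetical.

```python
from collections import defaultdict

def months_between(start, end):
    """Whole months from 'YYYY-MM' start to 'YYYY-MM' end."""
    sy, sm = map(int, start.split("-"))
    ey, em = map(int, end.split("-"))
    return (ey - sy) * 12 + (em - sm)

def retention_curve(sessions, cohort_month):
    """Percent of a cohort's editors with >= 1 session k months after joining.

    sessions: iterable of (editor, 'YYYY-MM') pairs, one per edit session.
    The cohort is assumed to be the month of an editor's earliest session.
    """
    first = {}
    for editor, month in sessions:
        if editor not in first or month < first[editor]:
            first[editor] = month
    members = {e for e, m in first.items() if m == cohort_month}
    active = defaultdict(set)
    for editor, month in sessions:
        if editor in members:
            active[months_between(cohort_month, month)].add(editor)
    return {k: 100.0 * len(eds) / len(members) for k, eds in sorted(active.items())}

# Hypothetical sample: four editors join in Jan 2005 and drop off over time.
sample = [
    ("a", "2005-01"), ("a", "2005-02"), ("a", "2005-03"),
    ("b", "2005-01"), ("b", "2005-02"),
    ("c", "2005-01"),
    ("d", "2005-01"),
]
curve = retention_curve(sample, "2005-01")
```

Comparing such curves for, say, the 2005 and 2007 cohorts would be one way to check whether retention "fell and then stabilized".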


On Sun, Jul 19, 2015 at 2:12 PM, Netha Hussain <> wrote:
Hello everyone,

I am Netha Hussain, a Wikimania attendee. I will be at the India booth in the Chapters Village from 9 to 10 am on July 19 to talk about Jeph's graphs on editor activity analysis. I am also expecting Jeph to come online on my computer during the same time, so you can also direct your questions to the creator himself!

See you later today at the India booth!


On Sun, Jul 19, 2015 at 1:26 PM, Pine W <> wrote:
Hi Jeph,

Interesting. Am I reading correctly from that editor longevity, measured by number of edit sessions, is continuing to shrink over time?



On Fri, Jul 17, 2015 at 6:52 AM, jeph <> wrote:
Hi All,

If any of you are at Wikimania currently, Netha Hussain would be happy to run you through the graphs in person and take any questions on them. She is free from 09:00 - 11:00 CDT (Mexican local time) on Sunday.


On Wed, Jul 15, 2015 at 1:19 PM, jeph <> wrote:
@ Aaron
  • I've added the definitions. I'm using the historical definition of an active editor.
  • The longevity graph shows some interesting results when we compare 'en' with other languages like 'es', 'zh' etc. I'll upload them and send the link as soon as I can.
  • Graphs 4 & 5 do show results that are different from [1]:
    • Graph 4 - Monthly Editor Activity Split By Cohort - Stacked Bars. Selecting 1-2 in the selector shows, for every month, the contribution of the cohort that joined in that month.
    • Graph 5 - Monthly Editor Activity % Split By Cohort - Stacked Bars. Selecting the same (1-2 in the selector) shows, for every month, the contribution of the cohort that joined in that month as a % of the total activity in that month.
      • In the month of Jan 07, the cohort Jan 07 contributed 'x'%.
      • The activity in month Jan 07 = cohort Jan 07 + cohort Dec 06 + ... + cohort Jan 01.
    • The editor activity peaks in Jan 07 - March 07, as shown in Graph 4 and many other graphs ([1] etc.).
    • Graph 5 shows that for the same period the percentage contribution of the cohorts joining in Jan 07 - March 07 (i.e. the newcomers each month) remained the same, at < 40%. So the older editors contributed 60+% in those months. This tells us that the fall in active editors comes from both the new editors in a month and the older editors. In fact, the older editors contributed more to the fall.
  • I have not looked specifically at the number of edits in the first session after registration.
  • It was [1] that got me working on the graphs :-)
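
The Graph 5 reading above (newcomer cohort share vs. older cohorts' share within one month) can be sketched as follows. This is a minimal illustration, not the code behind the graphs; the function name and all counts are made up for the example.

```python
def cohort_shares(month_counts):
    """Convert {cohort: activity count} for one month into
    {cohort: percent of that month's total activity}."""
    total = sum(month_counts.values())
    if total == 0:
        return {}
    return {cohort: 100.0 * n / total for cohort, n in month_counts.items()}

# Hypothetical activity for the month Jan 07, keyed by the cohort's joining month.
# Activity in Jan 07 = cohort Jan 07 + cohort Dec 06 + ... + earlier cohorts.
jan07 = {"2007-01": 35, "2006-12": 20, "2006-11": 15, "earlier": 30}

shares = cohort_shares(jan07)
newcomer_share = shares["2007-01"]      # share of the cohort that joined that month
older_share = 100.0 - newcomer_share    # everything contributed by older cohorts
```

With these invented numbers the newcomer cohort is below 40% of the month's activity, so most of any change in that month's total has to come from the older cohorts.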
There are five different graphs at The explanation for each of them can be found at the bottom of the graph. I've also generated the graphs for other wikis ('es', 'de', 'ru', etc.). I'll put them up as soon as I can.

On Wed, Jul 15, 2015 at 4:27 AM, Aaron Halfaker <> wrote:
There are a lot of undefined metrics in your methods. For example, what do you mean by "canonical definition of edit sessions"? Is it [0]? Also, is there something that we learn from this longevity analysis that we didn't learn from previous research, e.g. [1] and [2]? One point that I think would be worth looking into is the engagement measure used in [1] (# of edits in first session after registration). In my work on [1], it looked like this stat had remained consistent since 2004 and therefore didn't seem to explain the drop in newcomer retention.


On Tue, Jul 14, 2015 at 2:01 PM, jeph <> wrote:
Hi All,

I've been working on graphs to visualize the entire edit activity of a wiki for some time now. I'm documenting all of it at

The graphs can be viewed at Currently only the graphs for 'en' have been put up; I'll add the graphs for the other wikis soon.

  • The editors are split into groups based on the month in which they made their first edit.
  • The active edit sessions (value, percentage, etc.) for the groups are then plotted as stacked bars or as a matrix. I've used the canonical definition of an active edit session. The values are within ±0.1% of the values on
  • There is a selector on each graph that lets you filter the data in the graph. On moving the cursor to the left end of the selector you will get a resize cursor. The selection can then be moved or redrawn.
  • In graphs 1,2 the selector filters by percentage.
  • In graphs 3,4,5 the selector filters by the age of the cohort.
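
The grouping described in the list above can be sketched roughly as below. This is an illustration only, not the code used for the graphs: the function names and sample data are hypothetical, and an editor's earliest recorded session month is used as a stand-in for the month of their first edit.

```python
from collections import defaultdict

# Hypothetical sample: one (editor, 'YYYY-MM') pair per active edit session.
sessions = [
    ("alice", "2007-01"), ("alice", "2007-02"), ("alice", "2007-02"),
    ("bob", "2007-02"), ("bob", "2007-03"),
    ("carol", "2007-03"),
]

def cohort_of(sessions):
    """Map each editor to their cohort: the earliest month they appear in."""
    first = {}
    for editor, month in sessions:
        if editor not in first or month < first[editor]:
            first[editor] = month
    return first

def activity_by_cohort(sessions):
    """Return {activity_month: {cohort_month: session_count}}.

    Each inner dict holds the segments of one stacked bar, or one row
    of the matrix view.
    """
    first = cohort_of(sessions)
    table = defaultdict(lambda: defaultdict(int))
    for editor, month in sessions:
        table[month][first[editor]] += 1
    return {month: dict(counts) for month, counts in table.items()}

matrix = activity_by_cohort(sessions)
```

Filtering by cohort age, as the selector in graphs 3, 4 and 5 does, would then amount to keeping only the inner entries whose cohort month lies within the chosen distance of the activity month.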
Preliminary Finding
Would love to hear what you guys think of the graphs and any ideas you have for me.


Wiki-research-l mailing list


Netha Hussain
Student of Medicine and Surgery
Govt. Medical College, Kozhikode

Blogs :
