Jussi-Ville Heiskanen wrote:
Charles Matthews wrote:
Thomas Dalton wrote:
2009/2/16 Charles Matthews charles.r.matthews@ntlworld.com:
I believe we have another decade before Wikipedia lives up to its potential as a comprehensive reference. My main hope is that life around the wiki stays dull enough so that the job largely gets done.
Indeed. Current predictions show growth in terms of article numbers pretty much ending in around 4 or 5 years' time. We'll then need several more years to actually get all the articles up to scratch. A decade may even be optimistic.
Yeah, well, my reaction to the whole "fruit" discussion is that it is systemic-bias-lite. I'll settle for five years to start most of the articles of interest to those with a fairly parochial view of what constitutes an interesting topic, and 25 years more to catch up with the rest of the planet. You're not telling me that we'll have articles corresponding to all the other language versions - total interwiki coverage - by 2014?
Personally I think this is a very interesting point. You will forgive if I have asked this before, and not gotten a reply. (I honestly forget if I have broached this subject before, I know I have often thought I should ask the question.)
Does anyone know how many unique (that is not reproduced around other languages) articles there are in toto in the non-English language wikipedias, which do not have a corresponding English language wikipedia article? Can even a rough estimate be made?
Yours,
Jussi-Ville Heiskanen
On Wed, Feb 18, 2009 at 9:05 PM, Jussi-Ville Heiskanen cimonavaro@gmail.com wrote:
Personally I think this is a very interesting point. You will forgive if I have asked this before, and not gotten a reply. (I honestly forget if I have broached this subject before, I know I have often thought I should ask the question.)
Does anyone know how many unique (that is not reproduced around other languages) articles there are in toto in the non-English language wikipedias, which do not have a corresponding English language wikipedia article? Can even a rough estimate be made?
Yours,
Jussi-Ville Heiskanen
[[User:Piotrus/Wikipedia interwiki and specialized knowledge test]] https://secure.wikimedia.org/wikipedia/en/wiki/User:Piotrus/Wikipedia_interw...
Might be of interest.
It is.
If I read that correctly the upshot seemed to be that just by translating articles from the non-English language wikipedias (presuming they would not be deleted immediately because of a lack of English language web-sources :-( that is) there would be fertile ground for an addition of around two million new articles to the English language wikipedia.
Would there be any workable way to create a big (huge?) "Missing Articles" project by somehow mass generating a list of the various non-English language articles still not translated to the English language wikipedia?
<smirk> At least the creation of such lists from those various language projects wouldn't be problematic in terms of database copyright. </smirk>
Yours,
Jussi-Ville Heiskanen
2009/2/19 Jussi-Ville Heiskanen cimonavaro@gmail.com:
If I read that correctly the upshot seemed to be that just by translating articles from the non-English language wikipedias (presuming they would not be deleted immediately because of a lack of English language web-sources :-( that is) there would be fertile ground for an addition of around two million new articles to the English language wikipedia.
That was in summer 2006; the latest stats in the updates correspond to 4 million articles waiting for translation (although that's an overestimate since it doesn't take into account articles that exist in multiple languages, just not English). The test is very imprecise and should only be used to give a rough idea - I think we can say there are "millions" of articles waiting for translation; anything more than that would be false precision.
Presumably whether something is worth doing in the wikipedia could be defined to depend on whether anybody appreciates us doing it.
So I wonder if these location articles were translated whether they would see much traffic? Do we have any evidence from any that have been translated how much they were read, compared to how much they were read in the original wiki?
Jussi-Ville Heiskanen schreef:
Would there be any workable way to create a big (huge?) "Missing Articles" project by somehow mass generating a list of the various non-English language articles still not translated to the English language wikipedia?
I did something like this, four years ago. Made a list of all German articles with some interwiki links but no link to enwiki. I posted it at the WikiProject Missing Articles, and there were several people who were interested in working on it.
See [[Wikipedia:WikiProject Missing encyclopedic articles/de]] and [[Wikipedia:WikiProject Missing encyclopedic articles/fr]]. It may be worth doing this again. (But I lack the resources to do it at the moment, as the database dumps have grown too much for me to handle them.)
Eugene
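A minimal sketch of the kind of list Eugene describes (articles that have some interwiki links but none to enwiki). It assumes the interwiki links have already been extracted from a database dump into a mapping from article title to the set of language codes it links to. The function name, the data layout, and the sample titles are all hypothetical, not taken from any real dump or tool.

```python
def missing_from_enwiki(langlinks):
    """Return titles that have at least one interwiki link but none to 'en'.

    langlinks: dict mapping an article title to a set of language codes
    (hypothetical layout; a real dump would need to be parsed into this form).
    """
    return sorted(
        title
        for title, langs in langlinks.items()
        if langs and "en" not in langs
    )


# Hypothetical sample data for illustration only.
sample = {
    "Artikel A": {"fr", "nl"},   # has interwiki links, but none to enwiki
    "Artikel B": {"en", "fr"},   # already linked to enwiki
    "Artikel C": set(),          # no interwiki links at all
}

print(missing_from_enwiki(sample))
```

Articles with no interwiki links at all are left out here, matching the "some interwiki links but no link to enwiki" criterion; including them would be a one-line change but would produce a much noisier list.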
In general, about article creation, this has obviously slowed quite a lot. However, I think that is a good thing, as that means that article writers now have a chance to catch up to all the new articles. That is why the percentage of articles that are GAs, FAs, or FLs is rising, and will continue to rise.
Currently, these 3 categories combined make up about 1 in 277 articles (0.36%). It's better than before, but it obviously could go a lot further. That's why I think the focus during Wikipedia's "adolescence" should be article improvement, and not so much creation.
On a related topic, really the only way to increase in this way is by getting more users. I'm curious, as the growth in Wikipedia has slowed, has the number of ACTIVE users slowed as well? Because really, beyond just the policies and long-winded arguments on ANI, there's the fact that you can't watch over all the articles that Wikipedia has without getting more and more editors. Another reason why being nice to newcomers and leaving a good first impression is so crucial. Obviously, as you can read in the Slashdot comments (and many other places), this is not Wikipedia's strength, at all.
Mark Nilrad
Mark Nilrad wrote:
I'm curious, as the growth in Wikipedia has slowed, has the number of ACTIVE users slowed as well?
If you're talking about the demographics of editors - I think it is now more than three years since WP attracted a very large group of people, arriving over a few months only, who created a "boom" in article production (quantity not quality). Many of those will have left by now - others have become some of our most productive editors. This can only happen once: WP became a Net phenomenon at some point in 2005, and that was because all of a sudden many people heard of it who hadn't before, or who had ignored it. I would say the growth in editors was "over trend" at that point. We are seeing more like a sustainable rate now, and probably (who knows?) a higher proportion of "encyclopedist" types.
Obviously, as you can read in the Slashdot comments (and many other places), this is not Wikipedia's strength, at all.
One thing that is not at all obvious to me is that there is any really really credible reporting on this or other aspects of Wikipedia. It's anecdotal at best - one or two incidents taken to stand for the site as a whole, and its complexities. Plus people writing ignorant and inaccurate stuff, of course.
Charles
On Fri, Feb 20, 2009 at 4:07 PM, Charles Matthews charles.r.matthews@ntlworld.com wrote:
[outside views of Wikipedia among general public]
One thing that is not at all obvious to me is that there is any really really credible reporting on this or other aspects of Wikipedia. It's anecdotal at best - one or two incidents taken to stand for the site as a whole, and its complexities. Plus people writing ignorant and inaccurate stuff, of course.
I agree, but there is probably a lot more unreported anecdotal stuff where people hear from others what Wikipedia is like and gain a false (or true, YMMV) impression of what it is like and what it is about. There are certainly a lot of misunderstandings around, including the one that Wikipedia is some homogeneous whole, when in fact it is a lot larger, more mixed and varied than people realise. Though there are uniformities and constants as well. But the treatment a first-time editor gets depends on who they encounter, what they edit and how good they are at adapting to the standards they encounter. What everyone can do is try and take the time to encourage new editors and not treat people as if they should know what to do (or not do).
Carcharoth
On Fri, Feb 20, 2009 at 11:07 AM, Charles Matthews charles.r.matthews@ntlworld.com wrote:
Mark Nilrad wrote:
I'm curious, as the growth in Wikipedia has slowed, has the number of ACTIVE users slowed as well?
If you're talking about the demographics of editors - I think it is now more than three years since WP attracted a very large group of people, arriving over a few months only, who created a "boom" in article production (quantity not quality). Many of those will have left by now - others have become some of our most productive editors. This can only happen once: WP became a Net phenomenon at some point in 2005, and that was because all of a sudden many people heard of it who hadn't before, or who had ignored it. I would say the growth in editors was "over trend" at that point. We are seeing more like a sustainable rate now, and probably (who knows?) a higher proportion of "encyclopedist" types.
The short answer is yes, the number of active editors has declined. See [[Wikipedia:Editing frequency]] and [[Wikipedia:Wikipedia Signpost/2009-01-03/Editing stats]]. For the most part, I agree with Charles Matthews's hopeful interpretation that Wikipedia is entering a stable, sustainable phase, although others see more dire futures based on the demographic trends. I think a different sort of person may be attracted to the project now.
Note, however, that although new users peaked in early 2007 (the same time community size peaked, and about six months after creation rate peaked), the rate of new accounts has actually been moderately stable, varying only about 50% since about 2006. Also, some published and unpublished studies suggest that the editor turnover rate is considerably higher than MeatballWiki would have us believe, although I'm not yet clear how much that applies to the very, very active core of editors.
Obviously, as you can read in the Slashdot comments (and many other places), this is not Wikipedia's strength, at all.
One thing that is not at all obvious to me is that there is any really really credible reporting on this or other aspects of Wikipedia. It's anecdotal at best - one or two incidents taken to stand for the site as a whole, and its complexities. Plus people writing ignorant and inaccurate stuff, of course.
Indeed. So much stuff goes on, and no one has really dared to dive in deep enough in a systematic way to produce convincing stories about the social dynamics of the project.
I was thinking about this, and it might be nice to do some experiments to find out what kinds of experiences make new users become regular editors. Similar to what happened with the fundraiser banner messages, as a start maybe we could design several different new-user greetings to replace the standard template greetings, and randomly assign which message is given to any particular new user. Each message would emphasize something different about Wikipedia, e.g., community, needed new articles, citations and improving existing articles, having fun, doing something good for the world, etc. Then weeks, months, and a year later we can find out whether the initial framing of the project has a significant impact, and possibly tailor the way we treat newbies to better pull them into the community.
-Sage (User:Ragesoss)
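The greeting experiment Sage proposes could be sketched roughly as follows. This is only an illustration under stated assumptions: the greeting labels are taken from the themes listed in the message, while the function names, the record format, and the idea of measuring retention as a simple "still active" flag are hypothetical and not an existing tool or process.

```python
import random

# Greeting themes as listed in the proposal; the wording of each
# actual template message is not specified in the thread.
GREETINGS = [
    "community",
    "needed new articles",
    "citations and improving existing articles",
    "having fun",
    "doing something good for the world",
]


def assign_greeting(rng):
    """Pick one greeting uniformly at random for a new user."""
    return rng.choice(GREETINGS)


def retention_by_greeting(records):
    """Compute the fraction of users still active for each greeting.

    records: iterable of (greeting, still_active) pairs, where
    still_active is a bool (a hypothetical follow-up measurement).
    """
    totals = {}
    active = {}
    for greeting, still_active in records:
        totals[greeting] = totals.get(greeting, 0) + 1
        active[greeting] = active.get(greeting, 0) + (1 if still_active else 0)
    return {g: active[g] / totals[g] for g in totals}


# Usage: assign greetings with a seeded generator so the run is repeatable.
rng = random.Random(2009)
assignments = [assign_greeting(rng) for _ in range(10)]
print(assignments)
```

Comparing the per-greeting fractions only shows whether the groups differ in the sample; deciding whether a difference is significant would need a proper statistical test and enough new users in each group.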
On Fri, Feb 20, 2009 at 10:34 AM, Mark Nilrad marknilrad@yahoo.com wrote:
In general, about article creation, this has obviously slowed quite a lot. However, I think that is a good thing, as that means that article writers now have a chance to catch up to all the new articles. That is why the percentage of articles that are GAs, FAs, or FLs is rising, and will continue to rise.
Currently, these 3 categories combined make up about 1 in 277 articles (0.36%). It's better than before, but it obviously could go a lot further. That's why I think the focus during Wikipedia's "adolescence" should be article improvement, and not so much creation.
On a related topic, really the only way to increase in this way is by getting more users. I'm curious, as the growth in Wikipedia has slowed, has the number of ACTIVE users slowed as well? Because really, beyond just the policies and long-winded arguments on ANI, there's the fact that you can't watch over all the articles that Wikipedia has without getting more and more editors. Another reason why being nice to newcomers and leaving a good first impression is so crucial. Obviously, as you can read in the Slashdot comments (and many other places), this is not Wikipedia's strength, at all.
Mark Nilrad
Relevant to this discussion: https://secure.wikimedia.org/wikipedia/en/wiki/Wikipedia:Wikipedia_Signpost/...
"The new data shows that the community continued to grow for about six months after the peak in article growth rate, reaching a maximum of 18,126 registered user accounts (excluding bots) with at least 20 article namespace edits in the month of March 2007. By September 2008 (the last month covered by the new statistics) only 13,971 accounts made 20 or more article edits. Anonymous edits and total non-bot edits across all namespaces also peaked in March 2007."
"In 2005, users making at least one article edit in a given month were about twice as likely to make over 100 edits that month than in 2007 and 2008. However, the proportion of active users making at least 2500 article edits per month has been rising since early 2007."
"User:MBisanz has charted the number of new accounts registered per month, which tells a very similar story: March 2007 recorded the largest number of new accounts, and the rate of new account creation has fallen significantly since then. Declines in activity have also been noted, and fretted about, at Wikipedia:Requests for adminship."
My take on this is that all this tells a story of ever fewer people joining, and casual editors continuously winding down & leaving. This means that the small group of dead-ender hardcore editors will grow as a percentage, which may seem like a good thing ('yay, more editors are becoming obsessed!') until you consider the larger picture.
(My cynical sarcastic take on this is to thank all the reference nazis & deletionists & vandal-fighters; we couldn't've done't without ye!)
Gwern Branwen wrote:
"User:MBisanz has charted the number of new accounts registered per month, which tells a very similar story: March 2007 recorded the largest number of new accounts, and the rate of new account creation has fallen significantly since then. Declines in activity have also been noted, and fretted about, at Wikipedia:Requests for adminship."
My take on this is that all this tells a story of ever fewer people joining, and casual editors continuously winding down & leaving. This means that the small group of dead-ender hardcore editors will grow as a percentage, which may seem like a good thing ('yay, more editors are becoming obsessed!') until you consider the larger picture.
(My cynical sarcastic take on this is to thank all the reference nazis & deletionists & vandal-fighters; we couldn't've done't without ye!)
But note that most accounts don't edit _at all, ever_. Account creation may be the simplest statistic to get, but it is also the most simplistic.
The English Wikipedia is around 10,000 editors, basically.
Charles
2009/2/19 Jussi-Ville Heiskanen cimonavaro@gmail.com:
Does anyone know how many unique (that is not reproduced around other languages) articles there are in toto in the non-English language wikipedias, which do not have a corresponding English language wikipedia article? Can even a rough estimate be made?
It's difficult to say, since things are divided up differently in different language versions. That's why getting interwiki links right is so difficult in many cases.
Jussi-Ville Heiskanen wrote:
Personally I think this is a very interesting point. You will forgive if I have asked this before, and not gotten a reply. (I honestly forget if I have broached this subject before, I know I have often thought I should ask the question.)
Does anyone know how many unique (that is not reproduced around other languages) articles there are in toto in the non-English language wikipedias, which do not have a corresponding English language wikipedia article? Can even a rough estimate be made?
On the basis of clicking "Zufälliger Artikel" 50 times, it looks to me like around 50% of deWP articles do not have interwiki to enWP. Only a small proportion of those without such interwiki look like they should have a corresponding article in enWP. The proportion with interwiki but no English interwiki is not huge - say 25%? This is not a very sophisticated technique from a statistical point of view, but it could be refined to get a better view by sampling of the overlapping of the Wikipedias. It all suggests the answer to the question is "around one million" - not 500000 (too low), not two million (maybe too high?).
Charles
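To give a feel for how rough a 50-click sample is, here is a small sketch that puts a Wilson score interval around a sampled proportion. It assumes that "around 50%" means 25 of the 50 random deWP articles lacked an enWP interwiki link; that count, the function name, and the choice of the Wilson interval are illustrative assumptions, not something stated in the message.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    Returns (low, high) as fractions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed reading of "around 50%" from 50 random-article clicks.
low, high = wilson_interval(25, 50)
print(f"{low:.3f} to {high:.3f}")
```

With 25 out of 50 the interval comes out at roughly 37% to 63%, which fits the remark that the technique only supports a ballpark answer and that a larger sample would be needed to refine it.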
On Thu, Feb 19, 2009 at 2:18 PM, Charles Matthews charles.r.matthews@ntlworld.com wrote:
Jussi-Ville Heiskanen wrote:
Personally I think this is a very interesting point. You will forgive if I have asked this before, and not gotten a reply. (I honestly forget if I have broached this subject before, I know I have often thought I should ask the question.)
Does anyone know how many unique (that is not reproduced around other languages) articles there are in toto in the non-English language wikipedias, which do not have a corresponding English language wikipedia article? Can even a rough estimate be made?
On the basis of clicking "Zufälliger Artikel" 50 times, it looks to me like around 50% of deWP articles do not have interwiki to enWP. Only a small proportion of those without such interwiki look like they should have a corresponding article in enWP. The proportion with interwiki but no English interwiki is not huge - say 25%? This is not a very sophisticated technique from a statistical point of view, but it could be refined to get a better view by sampling of the overlapping of the Wikipedias. It all suggests the answer to the question is "around one million" - not 500000 (too low), not two million (maybe too high?).
Does anyone know the answer to the opposite question? How many articles on the English Wikipedia lack interwiki links? It is possible (but less likely) that the articles exist in both places, but haven't been linked with an interwiki yet. I find examples of that fairly regularly, but am not sure how common it is.
Carcharoth
I was just doing a sample of 50 pages on enWP anyway to estimate the mean number of interwiki links - it's around three, but clearly skewed by hitting very common topics. Anyway, it doesn't seem stupid to divide total articles in Wikipedias by 3 to get total unique topics: enWP probably has more unique topics than other languages. So, guess what, consistent with "one million to translate" as ballpark figure.
On the question posed by Carcharoth, it looks like around half of enWP's articles have no interwiki.
Charles