On Thu, Sep 17, 2009 at 9:55 PM, Robert Rohde <rarohde(a)gmail.com> wrote:
On Thu, Sep 17, 2009 at 8:25 PM, Steve Bennett <stevagewp(a)gmail.com> wrote:
On Fri, Sep 18, 2009 at 12:20 PM, Robert Rohde <rarohde(a)gmail.com> wrote:
That particular result is unpublished. I could make you a list of
infrequently viewed articles, but it would be quite long.
Could you make a list of the 100 least viewed? Or are there a large
number which are essentially equal?
My sample consisted of collating 30 non-consecutive hours of data on
enwiki traffic where each hour was randomly chosen from any point
during the last 8 months. This was filtered to only include page
titles that were valid mainspace pages.
In those 30 hours, there are 1.36 million valid article titles that
are viewed exactly once [1].
Examples include:
129342_Ependes
1421_in_literature
Antiprotonic_helium
Antonella_Mularoni
Madhusoodhanan_Nair
Blue_Murder_(play)
Ozonotherapy
Veronika_Krausas
Verret,_New_Brunswick
Bare_Truth_(Nat_album)
As you can see, these are obscure topics, but they are not necessarily
crazy topics. If I were to repeat it with a longer baseline (say 1000
hours rather than 30), I suspect you would get more interesting
information on the tail, but right now probably the best I can say is
that a cumulatively significant amount of traffic goes to relatively
obscure pages.
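
For anyone wanting to repeat the count, here is a minimal sketch of the
tally described above. It is not the script used for the numbers in this
message. It assumes each sampled hour is available as lines of the form
"project title count bytes"; the function names are hypothetical, and
the filter for valid mainspace titles is left out.

```python
import random
from collections import Counter


def sample_hours(all_hours, k, seed=None):
    """Pick k hours at random, without replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(list(all_hours), k)


def tally_views(hourly_lines, project="en"):
    """Sum view counts per title over all sampled hours.

    hourly_lines: iterable of iterables of text lines, one inner
    iterable per sampled hour. The line layout
    "project title count bytes" is an assumption.
    """
    totals = Counter()
    for lines in hourly_lines:
        for line in lines:
            parts = line.split()
            # Skip malformed lines and other projects.
            if len(parts) < 3 or parts[0] != project:
                continue
            title, count = parts[1], parts[2]
            if not count.isdigit():
                continue
            totals[title] += int(count)
    return totals


def viewed_exactly_once(totals):
    """Titles whose summed count over the whole sample is exactly 1."""
    return sorted(title for title, n in totals.items() if n == 1)
```

A title that shows up once in each of two different hours sums to 2 and
drops out of the list, so the size of the "exactly once" set depends on
how many hours are sampled.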
-Robert Rohde
[1] Note: Because the traffic data is based on URL request strings, and
some URL strings map to the same page, e.g. Blue_Ocean and
Blue%20Ocean, the number of valid article titles is not necessarily
the same as the number of distinct pages. For practical reasons my
analysis was based on the URL strings, and so it probably overcounts the
number of distinct articles involved, and to a degree overstates the
fraction of traffic to obscure pages.
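
The overcounting described in [1] could be reduced by folding request
strings onto one canonical title before counting. This is a rough
sketch of that idea, not something done in the analysis above. The
rules used here (percent-decoding, spaces to underscores, upper-casing
the first character) are an assumed approximation of how titles are
normalized; redirects are not handled, and the function names are
hypothetical.

```python
from collections import Counter
from urllib.parse import unquote


def canonical_title(raw):
    """Map a raw request string to an approximate canonical title."""
    # Percent-decode, so "Blue%20Ocean" becomes "Blue Ocean".
    title = unquote(raw)
    # Treat spaces and underscores as the same character.
    title = title.replace(" ", "_").strip("_")
    # Assumed rule: the first character is case-insensitive.
    return title[:1].upper() + title[1:]


def merge_by_title(counts):
    """Combine the counts of request strings that share a canonical title."""
    merged = Counter()
    for raw, n in counts.items():
        merged[canonical_title(raw)] += n
    return merged
```

With this, Blue_Ocean and Blue%20Ocean seen once each become a single
page seen twice, which removes both strings from the "viewed exactly
once" set.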