When a user looks for certain information on Wikipedia, they sometimes find it through a redirect. This is not very common, but the number of hits on a redirect can be hundreds or even thousands of times higher than the number of hits on the corresponding article.
So if you care about the subjects users were looking for, it does not make sense to separate hits on redirects from hits on the corresponding articles.
When calculating data for wikipediatrends.com we combined the URLs of articles and their redirects, exactly as Oliver described. It was not very hard because we did it before summarising and cleaning the raw data.
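
To illustrate the idea, here is a minimal sketch of folding redirect hits into their target articles. It is not our production code. It assumes the redirect map has already been pulled from the database as a plain dict, and the names (resolve, combine_hits), the titles and the hit counts are all made up for the example.

```python
from collections import Counter


def resolve(title, redirects):
    """Follow a redirect chain to its final target.

    `redirects` maps a redirect title to the title it points at.
    The `seen` set stops the walk if the map contains a loop.
    """
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def combine_hits(raw_hits, redirects):
    """Sum hits per article, counting redirect hits under the target."""
    combined = Counter()
    for title, hits in raw_hits:
        combined[resolve(title, redirects)] += hits
    return combined


# Hypothetical redirect map and raw per-URL counts, for illustration only.
redirects = {"Big_Apple": "New_York_City", "NYC": "New_York_City"}
raw_hits = [
    ("New_York_City", 120),
    ("NYC", 4500),
    ("Big_Apple", 30),
    ("Boston", 75),
]

combined = combine_hits(raw_hits, redirects)
for title, hits in combined.most_common():
    print(title, hits)
```

Doing the merge on the raw counts, before any summarising, means every later step only ever sees one row per article.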


On Sun, Apr 27, 2014 at 9:21 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
The problem with that is that, as Henrik said, it works on the basis of URLs, not page names. The only way to discover "X is a redirect to Y" would be to prod the database for that information, for every unique URL.


On 27 April 2014 05:36, Jane Darnell <jane023@gmail.com> wrote:
Henrik,

I don't know about the page history part of this question, but going
forward it would be nice to offer an extra option to include hits on
all incoming redirects though. There are two issues with this. The
first is when the name needs disambiguation and gets split out into
towns, provinces or countries (for places) or gets split out into
occupations (for people).  The second is when wikipedians go through
with bots and "correct" links. I assume they do this so that the page
hits will become more representative, but I often work on old names
and I try to preserve original spelling (especially by older authors)
to increase the "findability". To do this I create a lot of redirects
based on older spellings to use in pages where the older spelling is
used in a reference. I've noticed "redirect" corrections in cases
where there is no disambiguation needed, and so I think offering an
option for all redirects might help stop that behavior.

Jane

2014-04-27 12:34 GMT+02:00, Henrik Abelsson <henrik@abelsson.com>:
>
> On 2014-04-27 08:45, ENWP Pine wrote:
>
>> * I think it would be desirable to add an https option to
>> stats.grok.se so that viewers' interests in page readership statistics
>> are more private.
>>
> Hm, why not? I'll request a certificate and start serving https also. I
> hadn't really thought of page readership statistics as something all
> that sensitive, but I don't see any downside to also serving https.
>> * There is an issue in the statistics given at [1]. As you can see
>> from [2] editors created and edited the project page on days when
>> stats.grok.se said there were no pageviews. This may be the result of
>> a page move [3] [4] and the pre-move views were not integrated into
>> the results shown in [1]. Is this the expected and desired behavior?
>> From my point of view as a Signpost author, this is undesirable as we
>> try to track our readership statistics. I think it is the case that if
>> page A is moved to new page B then the statistics for page A should be
>> integrated into those for page B and a notice should be given to the
>> viewer that the statistics for page B includes those from page A which
>> was moved on date X. This problem may affect other pages that are the
>> subject of mergers. I think it would be the case that if page A is
>> merged into page B then we would want some notice to appear on
>> stats.grok.se alerting the viewer that there was a merger, the date of
>> the merger, and offering the viewer a way to select statistics with
>> and without the historical information from page A as they look at the
>> viewership statistics for page B.
>>
> That would indeed be better. However, the statistics data tracks URLs
> rather than pages and it's computationally expensive to look up the page
> history and what URLs it has been accessible through. The average
> throughput of view statistics is that some tens of thousands of entries
> are added per minute 24/7. One could perhaps do it as the data was
> requested, but that would mean making several round-trips to the WMF
> servers to look up the history of all moves and correlate URLs across
> time. It's certainly possible to build a tool that does that on top of
> the stats.grok.se and wikipedia APIs though.
>
> -henrik
>
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>


--
Oliver Keyes
Research Analyst
Wikimedia Foundation



--
Thank you.

Alex Druk
alex.druk@gmail.com
(775) 237-8550 Google voice