This looks possibly relevant: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Overview

On Thu, Jun 25, 2015 at 10:03 AM, James Douglas <jdouglas@wikimedia.org> wrote:
> The Varnish logs == request logs == also in HDFS.

Ah ha, thanks!

> To get access you'll need a Phabricator ticket asking for stat1002 and analytics cluster access, with Ottomata CC'd to make the patch and Dan CC'd to confirm you need it.

Cool, I'll get on that.  In the meantime, where can I learn about the infrastructure?


On Thu, Jun 25, 2015 at 10:01 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
The Varnish logs == request logs == also in HDFS. To get access you'll
need a Phabricator ticket asking for stat1002 and analytics cluster
access, with Ottomata CC'd to make the patch and Dan CC'd to confirm you
need it.

On 25 June 2015 at 12:53, James Douglas <jdouglas@wikimedia.org> wrote:
> From IRC, it sounds like this information ought to be available in the
> Varnish logs.  What's the story there?
>
> On Thu, Jun 25, 2015 at 9:52 AM, James Douglas <jdouglas@wikimedia.org> wrote:
>>
>> I misspoke: we're looking for HTTP requests coming from users who are
>> leaving the Portal, not retrieving the Portal.
>>
>> e.g. clicking on enwiki, using one of the search forms, etc.
>>
>> On Thu, Jun 25, 2015 at 9:50 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
>>>
>>> * Nope :(
>>> * It's in HDFS!
>>>
>>> On 25 June 2015 at 12:05, James Douglas <jdouglas@wikimedia.org> wrote:
>>> > Let's say, hypothetically, that I wanted to measure information about HTTP
>>> > requests coming into the Wikipedia Portal (www.wikipedia.org).
>>> >
>>> > * Do we record this information?
>>> >   * If so, is it accessible via analytical tools?
>>> >     * If so, how do I get my mitts on it?
>>> >   * If not, is it accessible from a database or similar?
>>> >
>>> > Context: https://phabricator.wikimedia.org/T100673
>>> >
>>> > _______________________________________________
>>> > Wikimedia-search mailing list
>>> > Wikimedia-search@lists.wikimedia.org
>>> > https://lists.wikimedia.org/mailman/listinfo/wikimedia-search
>>> >
>>>
>>>
>>>
>>> --
>>> Oliver Keyes
>>> Research Analyst
>>> Wikimedia Foundation
