I'm not Adrian but we work together on this project, and that's indeed
what we're doing, and the guess was correct as well.
Thanks so far!
On Mon 15.05 08:45, Addshore wrote:
I believe in this case the data is being crunched in Hadoop, which is where
the WDQS access logs are.
And I think the page in question that Adrian wanted to load was
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples;
at a guess, he is looking at how often these example queries are requested
via the service.
On Mon, 15 May 2017 at 00:22 Nuria Ruiz <nuria(a)wikimedia.org> wrote:
> >(i.e. implying that we need to collect the data somewhere else, and move
> to production for number crunching only)?
> I think we should probably set up a sync-up so you get an overview of how
> this works, because this is a brief response. Data is harvested on some
> production machines, it is processed (on different production machines) and
> moved to the stats machines (also production, but a sheltered environment). We
> do not use the stats machines to harvest data. They just provide access to it
> and are sized so you can process and crunch data. This talk explains a bit
> how this all works:
> https://www.youtube.com/watch?v=tx1pagZOsiM
>
> We might be talking past each other here; if so, a meeting might help.
>
>
> >Nuria, what exactly do you have in mind when you say "a development
> instance of Wikidata"?
> If you need to look at a wikidata query and see what it shows in the logs
> when you query x or y, that step should be done on a (wikidata) *test
> environment* that logs the HTTP requests for your queries as received by
> the server. That way you can "test" your queries against a server and see how
> those are received.
>
>
> Thanks,
>
> Nuria
>
>
>
>
>
> On Sun, May 14, 2017 at 1:10 PM, Adrian Bielefeldt
> <Adrian.Bielefeldt(a)mailbox.tu-dresden.de> wrote:
>
>> Hi Addshore,
>> thanks for the advice, I can now connect.
>>
>> Greetings,
>>
>> Adrian
>>
>>
>> On 05/13/2017 05:47 PM, Addshore wrote:
>>
>> You should be able to connect to query.wikidata.org via the webproxy.
>>
>>
>> https://wikitech.wikimedia.org/wiki/HTTP_proxy
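For a Java program like the one described in this thread, one common way to route requests through such a webproxy is via the standard JVM proxy system properties. This is only a sketch: the host and port below are placeholders, not confirmed values; the real ones should come from the wikitech page above.

```java
// Sketch: pointing a Java program's HTTP(S) traffic at a webproxy.
// The host and port passed to configureProxy are placeholders; take the
// actual values from https://wikitech.wikimedia.org/wiki/HTTP_proxy
class ProxySetup {
    static void configureProxy(String host, String port) {
        // Standard JVM proxy properties, honored by HttpURLConnection
        // and most HTTP client libraries that respect system settings.
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", port);
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", port);
    }
}
```

Equivalently, the same properties can be set on the command line with `-Dhttps.proxyHost=... -Dhttps.proxyPort=...`, which avoids touching the code at all.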
>>
>> On Sat, 13 May 2017 at 15:23 Adrian Bielefeldt
>> <Adrian.Bielefeldt(a)mailbox.tu-dresden.de> wrote:
>>
>>> Hello Nuria,
>>>
>>> I'm working on a project
>>> <https://meta.wikimedia.org/wiki/Research:Understanding_Wikidata_Queries>
>>> analyzing the Wikidata SPARQL queries. We extract specific fields (e.g.
>>> uri_query, hour) from wmf.wdqs_extract, parse the queries with a Java
>>> program using open_rdf as the parser, and then analyze them for different
>>> metrics like variable count, which entities are being used, and so on.
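As a toy illustration of one of those metrics: the project's real pipeline parses each query with open_rdf, but a much cruder variable count can be sketched by scanning for `?var`/`$var` tokens with a regex. This hypothetical helper ignores string literals, comments, and other SPARQL subtleties, so it is not equivalent to a real parse:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Rough illustration of the "variable count" metric. The actual project
// parses queries with open_rdf; this regex shortcut merely collects the
// distinct names following '?' or '$' and will miscount queries that
// contain such tokens inside literals or comments.
class VariableCount {
    private static final Pattern VAR = Pattern.compile("[?$](\\w+)");

    static int countDistinctVariables(String sparql) {
        Set<String> names = new HashSet<>();
        Matcher m = VAR.matcher(sparql);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names.size();
    }
}
```

For example, `countDistinctVariables("SELECT ?item ?label WHERE { ?item rdfs:label ?label }")` returns 2, since `?item` and `?label` each appear twice but count once.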
>>>
>>> At the moment I'm working on checking which entries equal one of the
>>> example queries at
>>> https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples
>>> using this
>>> <https://github.com/Wikidata/QueryAnalysis/blob/master/src/main/java/general/Main.java#L339-L376>
>>> code. Unfortunately the program cannot connect to the website, so I'm
>>> assuming I have to create an exception for this request or ask for one to
>>> be created.
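The matching step described here lives in the linked Main.java. Purely as a hypothetical sketch of the idea (the real code may differ), one could normalize whitespace and case before comparing a logged query against each example, so trivially reformatted copies still match:

```java
import java.util.Locale;

// Hypothetical sketch of the comparison step: decide whether a logged
// query equals an example query, ignoring whitespace and case
// differences. Lowercasing also folds variable names, which is a
// deliberate simplification of this sketch.
class ExampleMatcher {
    // Collapse runs of whitespace and lowercase the result.
    static String normalize(String query) {
        return query.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    static boolean matchesExample(String logged, String example) {
        return normalize(logged).equals(normalize(example));
    }
}
```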
>>>
>>> Greetings,
>>>
>>> Adrian
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics(a)lists.wikimedia.org
>>>
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>
>>
>>
>>
>>
>>
>>
>