1. Is there a unique key for the query log? The log I am referring to is
the *wdqs_extract* table from the hive database wmf. We would like to be
able to permanently link our own computed data with the log entry we
computed it from.
The answer is no, there is not, other than one you can calculate from the
data available.
3. Is there any other database system besides hive installed on the
server?
Ahem... hive is not a database, but I imagine you are asking whether you
need to write hive-friendly SQL to access the data. The answer is yes, you
have to. You are talking to hadoop with SQL that gets translated into java
code and returns the data you are interested in.
Beeline or hive should work.
On Tue, Jan 3, 2017 at 9:30 AM, Stas Malyshev <smalyshev(a)wikimedia.org>
wrote:
Hi!
1. Is there a unique key for the query log? The log I am referring to is
the *wdqs_extract* table from the hive database wmf. We would like to be
able to permanently link our own computed data with the log entry we
computed it from.
I think you can use hostname+sequence (from
https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest, assuming
those are preserved in wdqs_extract) as a key.
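A minimal sketch of such a calculated key, in Python. It assumes the
hostname and sequence columns are indeed present in wdqs_extract and that
the pair is unique for the period you look at; if sequence numbers can
restart on a host, the request timestamp would have to be added to the
key as well. The function name log_entry_key and the sample rows are made
up for illustration.

```python
import hashlib


def log_entry_key(hostname, sequence):
    """Derive a stable key from the (hostname, sequence) pair.

    Assumption: the pair is unique per log entry. Hashing is optional;
    it only gives a fixed-length string that is easy to store.
    """
    raw = f"{hostname}|{sequence}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


# Hypothetical rows, shaped like what a query on the table might return.
rows = [
    {"hostname": "cache-host-a.example", "sequence": 1001},
    {"hostname": "cache-host-a.example", "sequence": 1002},
    {"hostname": "cache-host-b.example", "sequence": 1001},
]

keys = [log_entry_key(r["hostname"], r["sequence"]) for r in rows]
```

Storing the key next to your computed data lets you join back to the log
entry later by recomputing the same function over the table.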
2. Is it possible to find out if a query in a given log entry was
accepted by the sparql endpoint as valid?
If it wasn't, the result code should be 400.
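As a rough sketch, that check could look like this in Python once the
rows are exported. It assumes the status is available in a column called
http_status (the name used in the webrequest table) and that it may come
back as a string; the helper name and the sample rows are hypothetical.

```python
def was_rejected_as_invalid(http_status):
    """Return True if the endpoint answered with 400 (query not accepted)."""
    return str(http_status).strip() == "400"


# Hypothetical rows; only the status column matters here.
rows = [
    {"http_status": "200"},
    {"http_status": "400"},
    {"http_status": 400},
    {"http_status": "500"},
]

rejected = [r for r in rows if was_rejected_as_invalid(r["http_status"])]
```

Other non-200 codes are deliberately not counted, since only 400 is
described above as the sign of an invalid query.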
3. Is there any other database system besides hive installed on the
server?
I think the currently recommended interface is beeline, not sure about
other DB systems.
And finally a question on conventions for this mailing list: Am I
correct in sending one mail for multiple questions or should I send
separate mails for each question?
I think it's ok. For the questions regarding data and other WDQS
specifics you may also CC me or discovery(a)lists.wikimedia.org.
--
Stas Malyshev
smalyshev(a)wikimedia.org