[Labs-l] [Wikitech-l] [tool labs] central logging service

Petr Bena benapetr at gmail.com
Sat Jun 8 10:36:44 UTC 2013


Replies below

On Sat, Jun 8, 2013 at 11:58 AM, Ori Livneh <ori at wikimedia.org> wrote:
> On Sat, Jun 8, 2013 at 1:44 AM, Petr Bena <benapetr at gmail.com> wrote:
>
>> Hi,
>>
>> per https://bugzilla.wikimedia.org/show_bug.cgi?id=48846 we are
>> setting up a logging service for tools and bots, so that every tool or
>> bot can easily log to a central logging base which in future should
>> support some nice web-based gui with log filtering (eventually some
>> 3rd notifications, like e-mails (sms) / irc pings) and so on.
>>
>> The website of this project is here: http://tools.wmflabs.org/logs/
>>
>> I haven't got much feedback so far, so I would appreciate some. At the
>> moment we are in the phase of discussing the best options for
>> designing this feature.
>>
>
> What does the 's' prefix mean?
>

It is a shortcut for "store".

You can type either

s ...
or
store ...

and both do the same thing. You can also use "n" (noreply), which does
the same as s or store but sends no reply on success.
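
To illustrate, here is a small Python sketch of how a client could build
such a line. The "<command> <tool> <number> <message>" layout is only
inferred from the example line quoted further down in this thread, and
build_line is a made-up helper name, so treat it as an assumption rather
than the final protocol.

```python
# Sketch only: the "<command> <tool> <number> <message>" layout is inferred
# from the example line quoted in this thread, not from a specification.
COMMANDS = {"s", "store", "n"}  # "n" stores without a reply on success


def build_line(command, tool, number, message):
    """Return one newline-terminated log line as text."""
    if command not in COMMANDS:
        raise ValueError("unknown command: %r" % command)
    return "%s %s %d %s\n" % (command, tool, number, message)


# "s" and "store" are interchangeable; "n" asks for no reply on success.
print(build_line("s", "test", 0, "hi"), end="")
print(build_line("store", "test", 0, "hi"), end="")
print(build_line("n", "test", 0, "hi"), end="")
```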

> What is the expected character encoding of messages?
>

You got me. I have no idea. Either Mono is smart enough to detect it, or
it uses some standard encoding by default, such as UTF-8 or Unicode. I
will find out; Google doesn't give a clear answer.

> How is malformed input handled? What happens if I log this message: "s test
> 0 hi\ns fake_tool 0 evil_message\n"?
>

Malformed (invalid) input gets the response ERROR: <explanation>
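
As a rough Python illustration (not the daemon's code), this is how the
example payload from the question splits into two separate records when
lines are parsed one by one. The field layout, the parse_line helper and
the wording of the ERROR texts are assumptions made for this sketch.

```python
# Sketch only: field layout and error wording are assumptions.
def parse_line(line):
    """Parse one line into (tool, number, message) or return an ERROR string."""
    parts = line.split(" ", 3)
    if len(parts) < 4:
        return "ERROR: expected <command> <tool> <number> <message>"
    command, tool, number, message = parts
    if command not in ("s", "store", "n"):
        return "ERROR: unknown command"
    try:
        value = int(number)
    except ValueError:
        return "ERROR: third field must be a number"
    return (tool, value, message)


# The payload from the question: one "message" carrying an embedded newline.
payload = "s test 0 hi\ns fake_tool 0 evil_message\n"
results = [parse_line(line) for line in payload.splitlines() if line]
for record in results:
    print(record)
# ('test', 0, 'hi')
# ('fake_tool', 0, 'evil_message')
```

The second record is attributed to fake_tool even though it arrived
inside a message sent by test.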

Malformed (malicious) input is something I am still thinking about. I
don't expect many users to intentionally break the logs of others, but I
am thinking of implementing optional authentication. If a tool asked
to use authentication (it would subscribe / register a folder somehow),
it could for example get a temporary token that would be stored
somewhere like /data/project/logs/tokens/tool and would be readable
only by the tool account.

This token would be updated periodically for security reasons, and the
tool would be required to provide it together with the log message in
order to write to its folder.
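
A minimal Python sketch of the token idea, assuming one token file per
tool. The helper names write_token and token_matches are hypothetical,
the demo uses a temporary directory instead of the real token folder, and
file ownership (who may read the file besides the tool account) is left
out entirely.

```python
import hmac
import os
import secrets
import tempfile


def write_token(path):
    """Write a fresh random token; a newly created file gets mode 0600."""
    token = secrets.token_hex(16)
    # No O_EXCL, so periodic rotation can overwrite the previous token.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return token


def token_matches(path, presented):
    """Compare the presented token with the stored one in constant time."""
    with open(path) as handle:
        stored = handle.read().strip()
    return hmac.compare_digest(stored.encode("utf-8"), presented.encode("utf-8"))


# Demo in a temporary directory standing in for the token folder.
with tempfile.TemporaryDirectory() as folder:
    token_path = os.path.join(folder, "tool")
    current = write_token(token_path)
    print(token_matches(token_path, current))   # True
    print(token_matches(token_path, "guess"))   # False
```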

But this is something I am still designing, and I want to keep it
opt-in because it complicates things and many users will likely not
need it. Also, we are now getting far beyond what rsyslog can do
(another reason why using our own daemon could be better).

> If I don't terminate my message with a newline, does my message swallow the
> next one to come along?
>

I forgot to mention that a newline isn't required over UDP. You can
send multiple log lines separated by newlines, or just one line, and it
will be parsed.

If you close a TCP connection before writing a newline (which tells the
daemon: the buffer is finished, store it), the data will likely be
dropped (and you won't get the STORED response either).
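
The difference can be sketched in Python like this. It is an
illustration of the behaviour described above, not the daemon's actual
code, and the names datagram_lines and LineBuffer are made up for the
example.

```python
def datagram_lines(data):
    """UDP: each datagram is parsed as is; a trailing newline is optional."""
    return [line for line in data.split(b"\n") if line]


class LineBuffer:
    """TCP: bytes are buffered until a newline completes a line."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add received bytes and return the lines completed by them."""
        self._pending += data
        *lines, self._pending = self._pending.split(b"\n")
        return [line for line in lines if line]

    def close(self):
        """Connection ended: return the unterminated data being dropped."""
        dropped, self._pending = self._pending, b""
        return dropped


print(datagram_lines(b"s test 0 hi"))             # [b's test 0 hi']

buf = LineBuffer()
print(buf.feed(b"s test 0 first\ns test 0 sec"))  # [b's test 0 first']
print(buf.feed(b"ond\ns test 0 unfinished"))      # [b's test 0 second']
print(buf.close())                                # b's test 0 unfinished'
```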

> How would you go about adding a field, if you discovered you needed one?
> Would clients using the old format break?
>

What kind of field do you mean?

> How reliable does the service need to be? What is the cost of an outage? Is
> it OK for the service to be down for a few hours?
>

It's not OK for it to be down for a few hours. However, clients
should be able to recognize that the server is down if they are using
TCP (because they won't be able to connect).

The wrapper script I have made (logeater) should be improved to
recognize this as well and keep retrying until it succeeds, or
something similar. An outage is always a problem for any kind of logger.
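
Retrying could look roughly like the Python sketch below. This is not
logeater itself: send_with_retry is a hypothetical helper, the attempt
count and delays are arbitrary, and the demo uses a fake sender instead
of a real TCP connection.

```python
import time


def send_with_retry(send, line, attempts=5, first_delay=0.5, sleep=time.sleep):
    """Call send(line) until it succeeds, doubling the delay between tries."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return send(line)
        except OSError:
            # A refused connection (server down) is an OSError subclass.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2


# Fake sender: refuses the first two attempts, then accepts the line.
calls = []


def flaky_send(line):
    calls.append(line)
    if len(calls) < 3:
        raise ConnectionRefusedError("server down")
    return "STORED"


# The sleep is replaced with a no-op so the demo finishes immediately.
print(send_with_retry(flaky_send, "s test 0 hi\n", sleep=lambda seconds: None))
# STORED
```

If every attempt fails, the last error is raised, so the caller can
decide whether to spool the line locally or give up.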

>
>
>>
>> Some people believe that it would be best to use an already existing
>> logging service such as Facebook's Scribe. I /think/ that although an
>> already made and working solution might be a good idea, on the other
>> hand these were usually designed for a different purpose than ours,
>> and thus it's quite complicated to set them up for our needs. Having
>> said that, I started working on a simple but powerful logging daemon
>> which should do precisely what we need (and which can be infinitely
>> extended for our purposes) - but of course we can set up multiple
>> solutions and let tool operators pick what they prefer most.
>>
>
> I think your intuitions are good. Everyone has their favorite message
> queue, but most are quite complex. "Simple" and "adequate" are the virtues
> you should optimize for.
>
> Get input from some early adopters of your service, then iterate. Let
> people know that the service is experimental for now, so that you have
> plenty of room to change the protocol if you need to. If you find yourself
> adding feature after feature, stop, and reconsider existing solutions.
>
>
>> If it's simpler for a tool / bot operator to integrate with syslog,
>> then why shouldn't we have it as well? But I must admit I found it
>> quite complicated to make rsyslog behave as we need (I believe it's
>> not even possible for it to match all our potential needs)
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


