[Engineering] Logstash goofiness from 2016-01-01 through 2016-01-20 (seeking input)

Bryan Davis bd808 at wikimedia.org
Tue Jan 26 18:51:03 UTC 2016


On Wed, Jan 20, 2016 at 6:53 PM, Bryan Davis <bd808 at wikimedia.org> wrote:
> Something in the logstash process on logstash1001 went bananas when
> midnight 2015-12-31 rolled over to 2016-01-01. Specifically it decided
> that 2015 was so much fun that it would back up and do it again. This
> wasn't noticed until late in the UTC day on 2016-01-20 when comparing
> fatalmonitor output on fluorine to the kibana dashboard of the same
> name. I restarted the logstash process and it started using the
> correct year.
>
> The open question I have now is whether I should try to salvage the
> 80 million log events that are in the wrong Elasticsearch indices or just delete
> them? Fixing them will require dumping the log record data out as json
> blobs, changing the date and loading them back into the correct
> Elasticsearch index. I'm leaning towards just deleting them unless
> someone has a very strong objection.

The rogue 2015-01-* indices have all been dropped. Restarting the
Logstash processes seems to have corrected all of their confusion. I'm
going to write this off as Java + JRuby weirdness for now.
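
For anyone who does want to salvage mis-dated events in a similar
situation, the dump / re-date / reload path described in the quoted
message could look roughly like the Python sketch below. It covers only
the re-dating step. The `@timestamp` field name and its ISO 8601
millisecond format, the `logstash-YYYY.MM.dd` daily index naming, and
the helper names `fix_event` and `rewrite_dump` are assumptions made for
illustration, not details taken from this cluster. Dumping from and
bulk-loading into Elasticsearch is deliberately left out.

```python
import json
from datetime import datetime

WRONG_YEAR = 2015   # year the events were stamped with
RIGHT_YEAR = 2016   # year they actually belong to

# Assumption: daily indices named like "logstash-2016.01.05".
INDEX_PATTERN = "logstash-%Y.%m.%d"

# Assumption: timestamps look like "2015-01-05T12:34:56.789Z".
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def fix_event(event):
    """Return (target_index, corrected_copy) for one mis-dated event."""
    ts = datetime.strptime(event["@timestamp"], TS_FORMAT)
    if ts.year != WRONG_YEAR:
        raise ValueError("unexpected year in %r" % event["@timestamp"])

    # 2015 has no Feb 29, so moving any 2015 date into 2016 is always valid.
    fixed_ts = ts.replace(year=RIGHT_YEAR)

    fixed = dict(event)  # shallow copy; the input event is left untouched
    millis = fixed_ts.microsecond // 1000
    fixed["@timestamp"] = (
        fixed_ts.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % millis
    )
    return fixed_ts.strftime(INDEX_PATTERN), fixed


def rewrite_dump(lines):
    """Yield (target_index, json_text) for each JSON-encoded event line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        index, fixed = fix_event(json.loads(line))
        yield index, json.dumps(fixed, sort_keys=True)


if __name__ == "__main__":
    dump = ['{"@timestamp": "2015-01-05T12:34:56.789Z", "message": "boom"}']
    for index, doc in rewrite_dump(dump):
        print(index, doc)
```

Raising on an unexpected year rather than silently passing the event
through is a deliberate choice here: a record that was never mis-dated
should not be shifted forward by a year.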

Bryan
-- 
Bryan Davis              Wikimedia Foundation    <bd808 at wikimedia.org>
[[m:User:BDavis_(WMF)]]  Sr Software Engineer            Boise, ID USA
irc: bd808                                        v:415.839.6885 x6855


