Everything is working great--thanks so much for the early-morning save, Mr. Andreescu! We really appreciate it!
On Fri, Jan 23, 2015 at 10:39 AM, Anna Koval akoval@wikimedia.org wrote:
+1 :)
Thanks, Dan, et al.
Worked great for me this morning. I'm a happy camper.
Anna :)
On Friday, January 23, 2015, Edward Galvez egalvez@wikimedia.org wrote:
Thank you so much!!! We really appreciate it!
-Edward
On Fri, Jan 23, 2015 at 9:31 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Wikimetrics has been having serious connectivity problems for a few days. It turned out to be solvable by using some new hostnames (labsdb1002.eqiad.wmnet). I fixed it just now; please retry your reports and let me know if anything is still wrong.
On Fri, Jan 23, 2015 at 10:46 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Hi everyone. I will work on this as soon as I get into the office, in about an hour from now. Yuvi suggested one thing that I wasn't aware of that might make this a simple fix.
On Friday, January 23, 2015, Dan Higgins dhiggins@wikimedia.org wrote:
Hi Kevin,
Sorry to be a pest, but do you have any update on sorting out the Wikimetrics issues? It seems to have gotten worse since we last spoke to you, with around 1 in 10 reports going through.
Thanks,
Dan
On Tue, Jan 20, 2015 at 7:17 PM, Kevin Leduc kevin@wikimedia.org wrote:
All the developers are in transit to SF today. Dan said he'd be in the office this afternoon. I'll notify the first dev I see about the problems in Wikimetrics.
On Tue, Jan 20, 2015 at 11:10 AM, Amanda Bittaker abittaker@wikimedia.org wrote:
Hello again gentlemen,
I think Dan might have already pinged you, but just in case, I wanted to let you know that we are getting these failures again. It's kind of crunch time for getting this data, so we're just banging our heads against the wall and retrying the reports until they work (1 out of 4 times for me.) Is there any way you all could work your magic again?
Many thanks once again,
Amanda
On Wed, Dec 10, 2014 at 4:30 PM, Kevin Leduc kevin@wikimedia.org wrote:
It's good to hear it's working again. Don't hesitate to reach out to us here or at wikimetrics@lists.wikimedia.org if you notice this kind of trouble again.
On Wed, Dec 10, 2014 at 3:37 PM, Amanda Bittaker abittaker@wikimedia.org wrote:
It's working perfectly now--a thousand thank yous, Dan and Marcel.
On Wed, Dec 10, 2014 at 3:24 PM, Edward Galvez egalvez@wikimedia.org wrote:
Thanks so much Dan and Marcel!
-E
On Wed, Dec 10, 2014 at 3:08 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
forgot Marcel - my fault. Jaime & folks, in general Marcel rules and he's probably going to help you out faster / better than I can.
On Wed, Dec 10, 2014 at 5:57 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Ok, Amanda and anyone else who had problems. Please try again. I think I've cleared up some gunk and that might have helped things. We'll be looking at performance more closely soon.
Steps taken, logging mostly for post-mortem purpose
* delete from report where recurrent_parent_id is null and recurrent = 0 and created < date('2014-12-01');
** This deleted records that are not visible in the system anymore. They are recoverable from the wikimetrics database backups but we don't need them in the database. These probably slowed some things down, in total the statement deleted 1623628 rows.
* alter table report add column old_recurrent tinyint(1); update report set recurrent = 0, old_recurrent = 1 where user_id = 461 and recurrent = 1;
** This disables WikimetricsBot recurrent reports, but preserves the data so we can deal with them later. When labs is done re-synchronizing, we will be re-running these reports. They feed data to Vital Signs, in case someone's curious about what they are.
* Stopped and rebooted the system. The backup system seems to be hanging or taking a really long time. I'd like to take a look at this in more depth, but my guess is the amount it's transferring has gone beyond what we expected.
On Wed, Dec 10, 2014 at 5:23 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
We're sorry - the problems we were facing last week have probably festered. I'm going to turn off some things and reset the system. I'll report back.
On Wed, Dec 10, 2014 at 4:47 PM, Amanda Bittaker abittaker@wikimedia.org wrote:
Oh yes, and Jaime did have me restart my browser and clear the cache, but it did not help.
Thanks again,
Amanda
On Wed, Dec 10, 2014 at 1:45 PM, Amanda Bittaker abittaker@wikimedia.org wrote:
Hello Kevin,
Jaime asked me to email you about some trouble I've been having with Wikimetrics. The whole team has been experiencing a pretty high rate of failures in both report creation and cohort uploads. Almost nothing has gotten through for me today: of the last 13 reports I've run, 3 were successful. Of the failures, I would say maybe only two or three "pended" at all before becoming failures. I've been experiencing the same problem with cohort uploads.
The reports have been: Newly Registered, Edits, and Rolling Active Editor using expanded cohorts. Please find attached an example of one of the reports. I tried uploading cohorts using text files of user names and pasting user names from Notepad into the "Paste Usernames" field. I do expand the cohorts every time.
Do you know why the failure rate is so high, especially this morning, and is there a way to eliminate or mitigate this problem in the future?
Many thanks for the assistance, and please do let me know if you need any more information from me on this.
Best,
Amanda
--
Edward Galvez
Program Evaluation Associate
Wikimedia Foundation
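For anyone reading this thread later as a post-mortem, the two database cleanup steps Dan logged on Dec 10 (the delete of old one-off reports, and the disabling of the recurrent reports owned by user_id 461) can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the commands as they were actually run: it uses Python's built-in sqlite3 in place of the production MySQL database, the `report` table is reduced to the columns named in the thread, the sample rows are invented for illustration, and the helper names `build_demo_db` and `cleanup_reports` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the `report` table (hypothetical, reduced schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE report (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "recurrent INTEGER, recurrent_parent_id INTEGER, created TEXT)"
    )
    rows = [
        (1, 10, 0, None, "2014-11-15"),   # old one-off report: step 1 deletes it
        (2, 10, 0, None, "2014-12-05"),   # recent one-off report: kept
        (3, 461, 1, None, "2014-10-01"),  # bot recurrent report: step 2 disables it
        (4, 461, 0, 3, "2014-11-20"),     # run belonging to a recurrent parent: kept
        (5, 10, 1, None, "2014-11-01"),   # another user's recurrent report: untouched
    ]
    conn.executemany("INSERT INTO report VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def cleanup_reports(conn, bot_user_id=461, cutoff="2014-12-01"):
    """Run the two cleanup statements; return (rows deleted, rows disabled)."""
    cur = conn.cursor()

    # Step 1: drop non-recurrent reports with no recurrent parent created before the cutoff.
    cur.execute(
        "DELETE FROM report "
        "WHERE recurrent_parent_id IS NULL AND recurrent = 0 AND created < ?",
        (cutoff,),
    )
    deleted = cur.rowcount

    # Step 2: remember which reports were recurrent, then switch them off.
    cur.execute("ALTER TABLE report ADD COLUMN old_recurrent TINYINT(1)")
    cur.execute(
        "UPDATE report SET recurrent = 0, old_recurrent = 1 "
        "WHERE user_id = ? AND recurrent = 1",
        (bot_user_id,),
    )
    disabled = cur.rowcount

    conn.commit()
    return deleted, disabled


if __name__ == "__main__":
    demo = build_demo_db()
    print(cleanup_reports(demo))
    for row in demo.execute(
        "SELECT id, user_id, recurrent, old_recurrent FROM report ORDER BY id"
    ):
        print(row)
```

Keeping the previous state in `old_recurrent` is what makes the second step reversible: once labs has finished re-synchronizing, the disabled reports can be found again by that flag and re-enabled, as the thread describes.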
Wikimetrics mailing list Wikimetrics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimetrics