Hey, we are currently having a problem with the ORES service, which is intermittently returning timeout errors. More information can be found at https://phabricator.wikimedia.org/T167819
We will let you know when the service is back to full capacity.
Best
Update: everything is normal now. See the ticket for more details.
Sorry for any inconvenience caused by this incident.
Best
On Jun 13, 2017 11:25 PM, "Amir Tafreshi" amir.tafreshi_ext@wikimedia.de wrote:
Hey, we are currently having a problem with the ORES service, which is intermittently returning timeout errors. More information can be found at https://phabricator.wikimedia.org/T167819
We will let you know when the service is back to full capacity.
Best
Amir Sarabadani Tafreshi Software Engineer (contractor)
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
FYI: https://wikitech.wikimedia.org/wiki/Incident_documentation/20170613-ORES
On Tue, Jun 13, 2017 at 3:19 PM, Amir Tafreshi <amir.tafreshi_ext@wikimedia.de> wrote:
Update: everything is normal now. See the ticket for more details.
Sorry for any inconvenience caused by this incident.
Best
AI mailing list AI@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ai
- 19:24 mutante: scb1001 - killed process 10971 (pdfrendering/electron)
This apparently fixed it.
I see there is already this change now: https://gerrit.wikimedia.org/r/358888
I checked the pdfrender instances on the other SCB nodes, and some were using 7G and 15G of memory. My patch limits this to 2G, which should be enough for normal operation.
Memory on the SCB nodes has been tight for a while. I think there are plans to move OCG (which uses the vast bulk of memory) to a separate cluster. There is also https://phabricator.wikimedia.org/T146664 for limiting the memory used by ORES.
On Tue, Jun 13, 2017 at 3:29 PM, Daniel Zahn dzahn@wikimedia.org wrote:
- 19:24 mutante: scb1001 - killed process 10971 (pdfrendering/electron)
This apparently fixed it.
I see there is already this change now: https://gerrit.wikimedia.org/r/358888
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Also, ORES is moving to its own dedicated nodes: https://phabricator.wikimedia.org/T165171
Best
On 14 June 2017 at 03:18, Gabriel Wicke gwicke@wikimedia.org wrote:
I checked the pdfrender instances on the other SCB nodes, and some were using 7G and 15G of memory. My patch limits this to 2G, which should be enough for normal operation.
Memory on the SCB nodes has been tight for a while. I think there are plans to move OCG (which uses the vast bulk of memory) to a separate cluster. There is also https://phabricator.wikimedia.org/T146664 for limiting the memory used by ORES.
-- Gabriel Wicke Principal Engineer, Wikimedia Foundation
I merged Gabriel's patch to limit memory usage to 2G.
Others and I are currently looking into a way to prevent this from happening and going unnoticed again in the future. If you want to keep up with the progress, please check out https://phabricator.wikimedia.org/T167830.
Zppix Volunteer developer for WMF enwp.org/User:Zppix
On Jun 14, 2017 1:53 PM, "Daniel Zahn" dzahn@wikimedia.org wrote:
I merged Gabriel's patch to limit memory usage to 2G.
wikitech-l@lists.wikimedia.org