Hello, I am investigating the 6-13-17 incident involving ORES as a volunteer for Wikimedia AI. I was wondering if there is a way for PDFRender to use fewer server resources while it shares a server with ORES, to keep ORES from failing. If not, is it possible to move either ORES or PDFRender to a different server? If you have any questions, comments, or concerns, please reply to me here or in #wikimedia-ai. Thanks!
Zppix Volunteer developer for WMF enwp.org/User:Zppix
Hello zppix,
Thank you for reaching out. We are aware of resource utilisation problems on SCB hosts. After the incident, we limited the amount of memory the PDF rendering service can use [1]. However, it is important to note that most of the time ORES is the service using the most resources on these machines, which are shared amongst 9 different services. In light of that, we have previously recommended commissioning a separate cluster for ORES. As far as I know, the budget should be there and it should be implemented in FY 2017/2018, but I will let Aaron correct me here in case I'm wrong :)
Cheers,
Marko Obrovac, PhD Senior Services Engineer Wikimedia Foundation
[1] https://gerrit.wikimedia.org/r/#/c/358888/
Services mailing list Services@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/services
See https://phabricator.wikimedia.org/T165171 for progress on the new ORES cluster. TL;DR: hardware is installed and in the process of being set up. Subscribe to https://phabricator.wikimedia.org/T168073 to track the actual switch from SCB hardware in EQIAD.
Zppix, thanks for the note. In this case, you can see that all relevant parties had been informed and had commented re. ongoing efforts in the phab cards and via wikitech-l.
See: https://lists.wikimedia.org/pipermail/wikitech-l/2017-June/088324.html & https://phabricator.wikimedia.org/T167834 & https://phabricator.wikimedia.org/T165171
Thanks everyone for the quick replies!
Zppix Volunteer developer for WMF enwp.org/User:Zppix