I have only 2 comments:
1. Please nice <https://en.wikipedia.org/wiki/Nice_(Unix)> any heavy, long-running
local processes, so that others can continue to use the machine (a minimal
sketch of one way to do this is included after these comments).
2. For large data, consider using the Hadoop cluster! I think you are
getting your data from the webrequest logs in Hadoop anyway, so you might
as well continue to do processing there, no? If you do, you shouldn’t have
to worry (too much) about resource contention:
https://yarn.wikimedia.org/cluster/scheduler
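For illustration, here is a minimal sketch of point 1, assuming the job is
driven from Python (the command shown is a placeholder, not an actual
Wikistats or wdqs job):

    import os
    import subprocess

    # Lower this process's scheduling priority; os.nice() adds the given
    # increment to the current niceness (19 = lowest priority on Linux).
    os.nice(19)

    # Child processes inherit the niceness, so a pipeline started from here
    # (e.g. unzip | perl) also runs at low priority. Placeholder command only.
    subprocess.run("echo 'heavy work would go here'", shell=True, check=True)

The same effect can be had by prefixing the command with nice, e.g.
"nice -n 19 <command>", or by renicing an already-running process with
"renice -n 19 -p <pid>".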
:)
- Andrew Otto
Systems Engineer, WMF
On Sat, Aug 12, 2017 at 2:20 PM, Erik Zachte <ezachte(a)wikimedia.org> wrote:
I will soon start the two Wikistats jobs, which run for several weeks each
month.
They might use two cores each, one for unzip, one for perl.
How many cores are there anyway?
Cheers,
Erik
*From:* Analytics [mailto:analytics-bounces@lists.wikimedia.org] *On
Behalf Of *Adrian Bielefeldt
*Sent:* Saturday, August 12, 2017 19:44
*To:* analytics(a)lists.wikimedia.org
*Subject:* [Analytics] Resources stat1005
Hello everyone,
I wanted to ask about resource allocation on stat1005. Our project
<https://meta.wikimedia.org/wiki/Research:Understanding_Wikidata_Queries>
needs quite a bit, since we process every entry in wdqs_extract, and I was
wondering how many cores and how much memory we can use without conflicting
with anyone else.
Greetings,
Adrian