[Labs-admin] Fwd: Re: Servers with GPUs

Andrew Bogott abogott at wikimedia.org
Thu Feb 23 18:33:21 UTC 2017


On 2/23/17 12:30 PM, Yuvi Panda wrote:
> Do we have the human bandwidth to commit to doing this as a team? GPUs 
> are fickle beasts.
Noooo!  I'm not telling them 'we will do this' only 'this might be 
technically possible'

>
> On Feb 23, 2017 5:31 AM, "Andrew Bogott" <abogott at wikimedia.org> wrote:
>
>     FYI, here is a thing that reading/ores people are daydreaming
>     about.  I've made it clear that bare-metal-in-labs is a
>     nonstarter; having special hardware with special VM types is not
>     necessarily all that much work.  I bet a labvirt with a GPU will
>     be super expensive though, and I don't know a damn thing about how
>     GPU resource-contention would be handled.
>
>     -A
>
>
>     -------- Forwarded Message --------
>     From: 	Adam Baso <abaso at wikimedia.org>
>     Date: 	Wed, 22 Feb 2017 10:56:30 -0600
>     Subject: 	Re: Servers with GPUs
>     To: 	Andrew Bogott <abogott at wikimedia.org>
>     Cc: 	Aaron Halfaker <ahalfaker at wikimedia.org>, Dario Taraborelli
>     <dtaraborelli at wikimedia.org>, Ellery Wulczyn
>     <ewulczyn at wikimedia.org>, Andrew Otto <otto at wikimedia.org>,
>     Corey Floyd <cfloyd at wikimedia.org>, Andrew Otto <acotto at gmail.com>
>
>
>
>     I feel comfortable saying we want one of these options. I'm trying
>     to gchat Ryan Lane to see if he has insight on the Nova support. I
>     just realized when I checked action=history he was one of the
>     editors of HeterogeneousGpuAcceleratorSupport
>     <https://wiki.openstack.org/wiki/HeterogeneousGpuAcceleratorSupport>!
>
>     -Adam
>
>     On Tue, Feb 21, 2017 at 2:16 PM, Andrew Bogott
>     <abogott at wikimedia.org> wrote:
>
>         On 2/21/17 1:36 PM, Adam Baso wrote:
>>         I think either that, or if it's easier, OpenStack-provisioned
>>         physical servers. Do you think the latter is doable?
>         It's possible, although the last time we visited that issue we
>         swiftly determined that even though people were asking for it
>         no one actually wanted it. The conclusion from that process is
>         at
>         https://wikitech.wikimedia.org/wiki/Labs_labs_labs/Bare_Metal
>
>         If GPU instance support is in nova and actually maintained,
>         that might be worth a try.
>
>         -A
>
>
>
>>         I gather the former would require different, if not deeper
>>         analysis (cf. HeterogeneousGpuAcceleratorSupport
>>         <https://wiki.openstack.org/wiki/HeterogeneousGpuAcceleratorSupport>).
>>
>>
>>         -Adam
>>
>>         On Tue, Feb 21, 2017 at 12:08 PM, Andrew Bogott
>>         <abogott at wikimedia.org> wrote:
>>
>>             Can y'all tell me a bit more about how this would relate
>>             to labs?  Is the idea that you want an option to create
>>             VMs with virtualized GPU hardware?  Or... something
>>             else?  (I'm not immediately clear on how that would work,
>>             but I'm largely ignorant on the subject.)
>>
>>             -A
>>
>>
>>
>>             On 2/21/17 12:02 PM, Aaron Halfaker wrote:
>>>             +9000 :D. (also + Andrew Bogott)
>>>
>>>             Would love to have this kind of resource in labs and
>>>             openly available.  I'm personally stoked to start
>>>             experimenting but not ready to invest in production GPUs
>>>             yet.  A few of my external collaborators have asked
>>>             about GPU resources in labs too.
>>>
>>>
>>>
>>>             On Feb 21, 2017 11:52, "Adam Baso" <abaso at wikimedia.org>
>>>             wrote:
>>>
>>>                 Following up on this here current thread, what do
>>>                 you all think about doing the GPU acceleration in
>>>                 Labs first?
>>>
>>>                 I don't know if it was halfak or marktraceur who
>>>                 suggested it first (although Aaron's brought it up a
>>>                 couple times now), but it's /probably/ less up front
>>>                 architectural overhead to start out with, even if in
>>>                 the future we'd have a strict requirement on HA (Q1
>>>                 FY 18-19 at the very latest under current guess). As
>>>                 Aaron has rightly noted, doing this in Labs also
>>>                 lets us learn and gives community members greater
>>>                 access to innovate early, too. The primary downside
>>>                 of not getting this in production with HA up front
>>>                 is if funding dries up for FY 18-19 we're stuck or
>>>                 pegged to certain workflows later on. But maybe it's
>>>                 not worth worrying about that too much.
>>>
>>>                 I think one very much open question, though, would
>>>                 be if it's possible to have a machine with the GPU
>>>                 card installed and specifically assigned in Labs.
>>>                 Does anyone know if that's actually possible?
>>>
>>>                 -Adam
>>>
>>>
>>>
>>>
>>>
>>>                 On Thu, Feb 16, 2017 at 4:56 PM, Aaron Halfaker
>>>                 <ahalfaker at wikimedia.org> wrote:
>>>
>>>                     +1 for looping me in on that thread &
>>>                     discussion. I'd like to loop in someone from
>>>                     Labs (probably andrewbogott) about purchasing
>>>                     GPUs for Labs so that we can (1) run prediction
>>>                     models in Beta and (2) empower our
>>>                     volunteers/external researchers to experiment
>>>                     with us.
>>>
>>>                     -Aaron
>>>
>>>                     On Thu, Feb 16, 2017 at 12:11 PM, Adam Baso
>>>                     <abaso at wikimedia.org> wrote:
>>>
>>>                         +Corey
>>>
>>>                         On Thu, Feb 16, 2017 at 11:15 AM, Adam Baso
>>>                         <abaso at wikimedia.org> wrote:
>>>
>>>                             We have a need for push notification
>>>                             servers already, so I've opened a thread
>>>                             with Mark and Faidon about getting those
>>>                             servers and putting in the
>>>                             Nvidia-recommended cards for TensorFlow
>>>                             (ostensibly for machine vision), for the
>>>                             sake of simplifying assumptions about
>>>                             hardware. I'm awaiting their feedback
>>>                             about whether we actually need to split
>>>                             the servers. If we /do/ need to split
>>>                             the servers for separate purposes, then
>>>                             I think that means we'd push back the
>>>                             online computer vision servers and GPUs
>>>                             purchase to early Q4 FY 17-18 rather
>>>                             than just getting it done Q1 FY 17-18 -
>>>                             which is when we need to be moving
>>>                             aggressively on push notification so it
>>>                             would be prudent to just get it done in
>>>                             one fell swoop.
>>>
>>>                             Aaron, I know last week you had said
>>>                             you'd be /open/ to collaborating on
>>>                             this...and I was quite
>>>                             noncommittal!...But I appreciate your
>>>                             saying you'd /like/ to collaborate here.
>>>                             Would you like me to loop you in on that
>>>                             thread with Mark and Faidon? *Any others
>>>                             who should / would like to join that
>>>                             thread?* Just be prepared for the thread
>>>                             to be covering two separate use cases -
>>>                             one on cross-platform push notification
>>>                             and one on basic machine vision.
>>>
>>>                             -Adam
>>>
>>>                             On Thu, Feb 16, 2017 at 10:57 AM, Aaron Halfaker
>>>                             <ahalfaker at wikimedia.org> wrote:
>>>
>>>                                 OK so I think we'll want to (1) get
>>>                                 a GPU in the stat boxes ASAP and (2)
>>>                                 decide whether we want to plan GPU
>>>                                 resources in Prod for FY2018 or FY2019.
>>>
>>>                                 For (2), I don't think my team's
>>>                                 current plans will bring us to using
>>>                                 the GPU in production in the next
>>>                                 year, but I suspect that Reading may
>>>                                 want to push some work re. image
>>>                                 processing in that time.  If that's
>>>                                 the case, I want my team to be able
>>>                                 to collaborate and support getting
>>>                                 that deployed in prod.  To do this
>>>                                 well, I want GPU resources in
>>>                                 Wikimedia Labs too.  That sounds
>>>                                 like a whole other can of worms.
>>>
>>>
>>>
>>>                                 On Wed, Feb 15, 2017 at 4:40 PM, Ellery Wulczyn
>>>                                 <ewulczyn at wikimedia.org> wrote:
>>>
>>>                                     Having GPUs for training should
>>>                                     be sufficient for now, although
>>>                                     if we end up getting a ton of
>>>                                     use, using GPUs could be a lot
>>>                                     faster and probably cheaper than
>>>                                     provisioning the same amount of
>>>                                     compute with CPUs.
>>>
>>>                                     On Thu, Feb 2, 2017 at 12:21 PM, Aaron Halfaker
>>>                                     <ahalfaker at wikimedia.org> wrote:
>>>
>>>                                         If we only need the GPU for
>>>                                         model training, it'll be OK
>>>                                         to use one stat box.  If we
>>>                                         need the GPU for
>>>                                         scoring/predictions, we'll
>>>                                         need a whole new hardware plan.
>>>
>>>                                         On Thu, Feb 2, 2017 at 1:06 PM, Andrew Otto
>>>                                         <otto at wikimedia.org> wrote:
>>>
>>>                                             Hm, a good rule of thumb
>>>                                             is: If it can be offline
>>>                                             or not running and not
>>>                                             affect end users, then
>>>                                             it is probably fine to
>>>                                             use a stat box.
>>>
>>>
>>>
>>>                                             On Thu, Feb 2, 2017 at 1:56 PM, Adam Baso
>>>                                             <abaso at wikimedia.org> wrote:
>>>
>>>                                                 Got it - I take that to be the case whether it's batched
>>>                                                 operation (e.g., on millions of files) or it's more of an
>>>                                                 in-the-user-flow sort of thing. Is that right?
>>>
>>>                                                 On Thu, Feb 2, 2017 at 10:55 AM, Andrew Otto
>>>                                                 <acotto at gmail.com> wrote:
>>>
>>>                                                     I’d say, if you are going to just do analytics-type stuff,
>>>                                                     then the single stat machine will do.  If you want to depend
>>>                                                     on a GPU for an end-user production thing, then you’ll have
>>>                                                     to work with ops to find another place to run it. :/ :)
>>>
>>>                                                     On Thu, Feb 2, 2017 at 11:40 AM, Aaron Halfaker
>>>                                                     <ahalfaker at wikimedia.org> wrote:
>>>
>>>                                                         Ellery, will we need the GPUs in order to use a NN or will
>>>                                                         we only need it for training models?
>>>
>>>                                                         On Thu, Feb 2, 2017 at 10:21 AM, Adam Baso
>>>                                                         <abaso at wikimedia.org> wrote:
>>>
>>>                                                             I envision two primary uses:
>>>
>>>                                                             1) Large scale batch offline processing of existing media
>>>                                                             assets so that the material is ready for curatorial flows.
>>>                                                             2) As part of an end user flow where multiple concurrent
>>>                                                             users are uploading media and verifying and adding
>>>                                                             structured data on the fly as part of production use.
>>>
>>>                                                             Can both of these be done on stats machines?
>>>
>>>                                                             Ought we have GPU acceleration in two machines instead of
>>>                                                             one machine?
>>>
>>>                                                             -Adam
>>>
>>>
>>>
>>>                                                             On Thu, Feb 2, 2017 at 8:15 AM, Andrew Otto
>>>                                                             <otto at wikimedia.org> wrote:
>>>
>>>                                                                 Oh ya!  If you have a use case for this too, all the
>>>                                                                 better!
>>>
>>>                                                                 Do you need it for analytics type work?  Or do you need it
>>>                                                                 to process stuff for a production feature?
>>>
>>>                                                                 On Thu, Feb 2, 2017 at 9:10 AM, Aaron Halfaker
>>>                                                                 <ahalfaker at wikimedia.org <mailto:ahalfaker at wikimedia.org>> wrote:
>>>
>>>                                                                     Hi Adam,
>>>
>>>                                                                     + a bunch of CCs
>>>
>>>                                                                     Last I heard, Dario thought we might be able to cover the cost
>>>                                                                     with Research budget.  Otto thought that we could get a top of
>>>                                                                     line GPU and load it into an analytics machine some time in Q4
>>>                                                                     of this year.  Ellery was planning to use 3rd party GPU
>>>                                                                     processing services until it was ready.
>>>
>>>                                                                     See https://phabricator.wikimedia.org/T148843
>>>                                                                     <https://phabricator.wikimedia.org/T148843>
>>>
>>>                                                                     -Aaron
>>>
>>>                                                                     On Wed, Feb 1, 2017 at 6:01 PM, Adam Baso
>>>                                                                     <abaso at wikimedia.org <mailto:abaso at wikimedia.org>> wrote:
>>>
>>>                                                                         Aaron, okay if I schedule a 20 minute meeting with you to
>>>                                                                         talk servers with GPUs?
>>>
>>>                                                                         Broadly, I'm trying to figure out what server CapEx I need
>>>                                                                         to ask of Mark (e.g., for TensorFlow object detection in
>>>                                                                         anticipation of work later in FY 17-18 / earlier FY 18-19).
>>>                                                                         I had asked him the other day about when he needs requests
>>>                                                                         for next FY, and he basically said the sooner the better.
>>>
>>>                                                                         -Adam
>>>
>>
>>
>
>
>
>     _______________________________________________
>     Labs-admin mailing list
>     Labs-admin at lists.wikimedia.org <mailto:Labs-admin at lists.wikimedia.org>
>     https://lists.wikimedia.org/mailman/listinfo/labs-admin
>     <https://lists.wikimedia.org/mailman/listinfo/labs-admin>
>
