Hi,
On Fri, Jul 27, 2012 at 8:25 AM, Sumana Harihareswara
<sumanah(a)wikimedia.org> wrote:
Thomas, am forwarding to the list - hope folks can help you out! You'll
want to subscribe so as to get replies. :)
-------- Original Message --------
Subject: RE: Status of Collection Extension in Powerpedia
Date: Mon, 16 Jul 2012 12:16:42 -0400
From: Kern, Thomas (CONTR) <Thomas.Kern(a)hq.doe.gov>
To: 'Sumana Hariharwswara' <sumanah(a)wikimedia.org>
Hello, my name is Thomas Kern. I am the Linux system administrator that
is responsible for building and maintaining the servers that support
DOE's Powerpedia (MediaWiki 1.18). During the recent upgrade of the
servers from CentOS 5 to CentOS 6 and MediaWiki from 1.16 to 1.18, the
wiki administrators wanted to switch from a PDF print add-on that
produced a PDF for that one requested wiki page to the Collection add-on
so that "books" of wiki pages could be created. I installed the
Collection add-on and the rendering services on the same dev/test server
that hosts the dev/test MediaWiki and its database. That works rather
well. The real problems arose when I tried to move it into production.
Our production environment is more complex. We have 4 servers, one for
the database (lnxwiki0), two for the wiki web servers (lnxwiki1 &
lnxwiki2) and a utility server (lnxwiki3) to host the rendering services
and the pywiki-bots. The two web servers are front-ended by our F5
appliance that provides one name and IP address for Powerpedia and
round-robins the traffic to the two web servers. When I turn on
Collection in the production system, the process of selecting wiki pages
for a book works fine and when it gets submitted for rendering, the
rendering services seem to work until the user gets the screen that says
the book is ready and presents the URL for it. When that is selected,
the user gets an error that the file doesn't seem to exist. A tcpdump of
the transaction shows the rendering request coming in from
https://powerpedia.energy.gov/wiki/Special:Book and trying to go back to
the same NAME. I think having Powerpedia being a virtual entity of two
servers is what is causing problems. I think that if Collection could
communicate with the rendering server using the REAL host name of the
web server that is setting up the book, then the rendering server could communicate
directly back to that server to deliver the book.
Any help, guidance, configuration hints for this would be greatly
appreciated.
Looks like Thomas already asked this over a month ago and no one
replied to him[0]. That list seems to be moderated? I tried bumping
the thread and it said the post was being held. (ah, now I see it went
through)
I tried the onwiki docs[1], readthedocs[2], and the WMF config for the
regular cluster[3] and didn't find any answers. I'm not sure where the
config is stored for the PDF rendering cluster. (the canonical copy
should live at gerrit so it probably needs to be moved there)
My first guess is this is related to the more common question about
storage for uploaded files. They need to be either served from a
dedicated box just for uploads (like upload.powerpedia.energy.gov), or
every node in the cluster needs to have access to every file for the
wiki (either because they propagate somehow or because there's a
shared, networked filesystem in use), or a hybrid like at Wikimedia:
there's a separate upload.wikimedia.org cluster for just that one
domain and all nodes for that cluster have access to all files through
a shared filesystem. Each of the 800+ wikis on the cluster has its
own domain for the wiki but they all share one domain for uploads.
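To make the guess above concrete, here is a toy Python sketch. It is not
taken from Collection, mwlib, or the F5 config; the class and function
names are made up for illustration. It assumes the rendered book ends up
in storage local to one web node and that the balancer sends the
follow-up download request to the other node. Whether that is what
actually happens on Powerpedia is unconfirmed.

```python
from itertools import cycle


class WebNode:
    """One web server; by default it has its own private file store."""

    def __init__(self, name, storage=None):
        self.name = name
        # Without an explicit store, each node gets a private dict of its own.
        self.storage = storage if storage is not None else {}

    def save(self, filename, data):
        self.storage[filename] = data

    def fetch(self, filename):
        # None stands in for "the file doesn't seem to exist".
        return self.storage.get(filename)


class RoundRobinBalancer:
    """Hands each incoming request to the next node in turn."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def next_node(self):
        return next(self._nodes)


def render_then_download(nodes):
    balancer = RoundRobinBalancer(nodes)
    # Request 1: the finished book is stored via whichever node is next.
    balancer.next_node().save("book.pdf", b"%PDF...")
    # Request 2: the user follows the link and is routed to the next node.
    return balancer.next_node().fetch("book.pdf")


# Per-node local storage: the download lands on the other node and fails.
local_nodes = [WebNode("lnxwiki1"), WebNode("lnxwiki2")]
print(render_then_download(local_nodes))

# Shared storage: both nodes see the same files, so the download succeeds.
shared_store = {}
shared_nodes = [WebNode("lnxwiki1", shared_store),
                WebNode("lnxwiki2", shared_store)]
print(render_then_download(shared_nodes))
```

If this is the failure mode, either a shared filesystem or serving the
finished files from a single dedicated host would avoid it.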
-Jeremy
[0] https://groups.google.com/d/topic/mwlib/ArKw7cuQAoU/discussion
[1] https://www.mediawiki.org/wiki/Extension:Collection
[2] http://mwlib.readthedocs.org/en/latest/
[3] https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=s…