On Sat, Sep 12, 2009 at 11:32 AM, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
Marcin Cieślak (saper), who was at Wikimania 2009, is excited about helping set up our OSM system on the ptolemy and ortelius servers (see http://lists.wikimedia.org/pipermail/wikitech-l/2009-September/045131.html). He wants to help set up the database / tile rendering aspect (which is pretty much all we're doing).
Here are his details as required for server admins:
* Name: Marcin Cieślak
* Email: Marcin Cieslak saper@saper.info
* Address: [CUT OUT]
And he's agreed to follow our privacy policy: http://wikimediafoundation.org/wiki/Privacy_policy
(not that there are any actual privacy concerns, these servers don't have any access to private data as explained in the wikitech-l posting)
If you approve I'll take care of setting up his accounts on those servers.
Mark gave the OK. I've created the user saper on both boxes. Marcin, you should be able to:
ssh ptolemy.esams.wikimedia.org
ssh ortelius.esams.wikimedia.org
Here's the notes I kept when setting up Cassini: https://wiki.toolserver.org/view/OpenStreetMap_server/Setup_notes
It would be very helpful if you continued updating that wiki page (or related pages) with the stuff you do.
Open issues on the servers that need to be solved:
* They need to be partitioned. There's some empty space on /dev/sd* that hasn't been partitioned (> 1TB)
* Evidently hardy's kernel doesn't like the RAID card on these boxes. So the kernel needs to be upgraded, probably to jaunty-server
Have fun!
On Mon, Sep 14, 2009 at 2:11 PM, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
I haven't had much free time lately, but I am comfortable with helping set up and configure PostgreSQL and PostGIS and getting the OSM database import going. Though, there are many more tasks to do than I have time for.
Partitioning the servers in an optimal manner for us is something I'm not as comfortable with. I could figure it out if necessary, but Saper is probably better equipped for that. One thing I do know is that the database logs should be on a separate partition from the database itself, and both kept separate from the OS.
Here is a list of tasks:
http://meta.wikimedia.org/wiki/Maps_server_setup_tasks
On Cassini, work is needed on PostgreSQL configuration.
https://jira.toolserver.org/browse/TS-302
https://jira.toolserver.org/browse/TS-303
-Kate
On Mon, Sep 14, 2009 at 2:57 PM, Aude aude.wiki@gmail.com wrote:
Another thing that might help is if we had documentation all in one place. Some of it is on the toolserver wiki, some on meta, some on the mediawiki wiki, along with items in jira and bugzilla, and various supporting info on the OSM wiki.
Is there one best place to consolidate the documentation and notes?
-Kate
On Mon, Sep 14, 2009 at 7:02 PM, Aude aude.wiki@gmail.com wrote:
It's not really a problem that they're on separate wikis as long as things are clearly separated. E.g. the stuff on mediawiki.org has to do with the extension, the toolserver wiki with the servers, and meta with the project at large.
What really needs doing, though, is to organize all these pages. Just getting a list of all of them would be a good start; they're all over the place.
On Mon, Sep 14, 2009 at 6:57 PM, Aude aude.wiki@gmail.com wrote:
I haven't had much free time lately, but I am comfortable with helping set up and configure PostgreSQL and PostGIS and getting the OSM database import going. Though, there are many more tasks to do than I have time for.
Yay!
Partitioning the servers in an optimal manner for us is something I'm not as comfortable with. I could figure it out if necessary, but Saper is probably better equipped for that. One thing I do know is that the database logs should be on a separate partition from the database itself, and both kept separate from the OS.
I can't remember what sort of RAID setup we have. Depending on the RAID setup, how we partition might not matter at all.
Without having tried it, I wouldn't think that putting logging on a separate disk helps that much. Logging doesn't take up much I/O. However, keeping the database and its indexes on separate disks evidently helps a lot. See recent discussions on osm-dev/talk about database setup.
..although most of that is outdated when compared to this: https://wiki.toolserver.org/view/OpenStreetMap_server/Setup_notes
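To make that concrete, here's a rough sketch of how such a split could look with PostgreSQL 8.3 tablespaces - the mount points, cluster path and the osm2pgsql-style table/index names below are just placeholders, not what's actually on ptolemy:

# hypothetical mount points on separate arrays/partitions
mkdir -p /srv/pg-wal /srv/pg-index
chown postgres:postgres /srv/pg-wal /srv/pg-index
# move the write-ahead log onto its own spindles (with the cluster stopped)
mv /var/lib/postgresql/8.3/main/pg_xlog /srv/pg-wal/pg_xlog
ln -s /srv/pg-wal/pg_xlog /var/lib/postgresql/8.3/main/pg_xlog
# keep the indexes in a tablespace on yet another partition
psql -U postgres -c "CREATE TABLESPACE indexspace LOCATION '/srv/pg-index'"
psql -U postgres -d gis -c "CREATE INDEX planet_osm_point_way_idx ON planet_osm_point USING gist (way) TABLESPACE indexspace"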
On Cassini, work is needed on PostgreSQL configuration.
I'll take care of this one.
I can't view this. I'll hopefully get the required JIRA permissions soon.
On Mon, Sep 14, 2009 at 9:04 PM, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
I'll take care of this one.
Okay, done.
But what do we actually need PostGIS for on Cassini now that we have ptolemy? Just to give users PostGIS access?
Ævar Arnfjörð Bjarmason wrote:
But what do we actually need PostGIS for on Cassini now that we have ptolemy? Just to give users PostGIS access?
Once ptolemy is set up, I don't think we need this for now. We can set up ident auth or something that will work well internally.
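A minimal sketch of what I mean, assuming the Debian-default config path; the database/user names and the subnet are placeholders:

cat >> /etc/postgresql/8.3/main/pg_hba.conf <<'EOF'
# local maintenance access, matched against the OS user name
local   all     all                         ident sameuser
# internal read access for the render box (placeholder network)
host    gis     osm     10.0.0.0/24         ident
EOF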
But what do we actually need PostGIS for on Cassini now that we have ptolemy? Just to give users PostGIS access?
From the tool authors' point of view we would need read access to both the osm-db mirror and the postgis db, no matter whether they reside on cassini or ptolemy.
Apart from that: having our own postgis on cassini would allow tool authors to play with different mapnik stylesheets and/or completely new rendering technologies. That's what the toolserver is for, right?
Peter
Ævar Arnfjörð Bjarmason wrote:
Partitioning the servers in an optimal manner for us is something I'm not as comfortable with. I could figure it out if necessary, but Saper is probably better equipped for that. One thing I do know is that the database logs should be on a separate partition from the database itself, and both kept separate from the OS.
I can't remember what sort of RAID setup we have. Depending on the RAID setup, how we partition might not matter at all.
It's RAID 1+0, that means:
- 16 drives, 140009 MB each
- every pair of drives is a mirror
- everything is concatenated to be a single, large volume
- 140009 MB x 8 = 1120072 MB ~ 1119200 MB is the logical volume we have (in fake megabytes)
Without having tried it, I wouldn't think that putting logging on a separate disk helps that much. Logging doesn't take up much I/O. However, keeping the database and its indexes on separate disks evidently helps a lot. See recent discussions on osm-dev/talk about database setup.
We might move the OS and important software to a separate mirror (2 drives - but we waste 280GB for that then) and consider splitting the rest for DB purposes, keeping individual mirrors for safety and concatenating as required for the database.
Ævar Arnfjörð Bjarmason wrote:
- Evidently hardy's kernel doesn't like the RAID card on these boxes.
So the kernel needs to be upgraded, probably to jaunty-server
Ptolemy cooled down a bit:
--- raid.config.20090915   2009-09-15 00:54:17.000000000 +0000
+++ raid.config.20090915a  2009-09-15 01:07:17.000000000 +0000
@@ -7,7 +7,7 @@
 Controller Model              : Sun STK RAID INT
 Controller Serial Number      : 00911AA1790
 Physical Slot                 : 0
-Temperature                   : 74 C/ 165 F (Normal)
+Temperature                   : 72 C/ 161 F (Normal)
 Installed memory              : 256 MB
 Copyback                      : Disabled
 Background consistency check  : Disabled
@@ -18,10 +18,10 @@
 --------------------------------------------------------
 Controller Version Information
 --------------------------------------------------------
-BIOS                          : 5.2-0 (15825)
-Firmware                      : 5.2-0 (15825)
+BIOS                          : 5.2-0 (16732)
+Firmware                      : 5.2-0 (16732)
 Driver                        : 1.1-5 (2449)
-Boot Flash                    : 5.2-0 (15825)
+Boot Flash                    : 5.2-0 (16732)
 --------------------------------------------------------
 Controller Battery Information
 --------------------------------------------------------
I will be upgrading the driver tomorrow CET from 2449 to 2463.
(shall we keep server logs on wiki or use some fancy irc stuff?)
On Tue, Sep 15, 2009 at 1:23 AM, Marcin Cieslak saper@saper.info wrote:
I will be upgrading the driver tomorrow CET from 2449 to 2463.
(shall we keep server logs on wiki or use some fancy irc stuff?)
I don't care. But I think the IRC format is more conducive to actually being used. Since it's easier.
I haven't used the logging bot in #wikimedia-tech though. I don't know what to use without spamming the logs with regular operations.
Ævar Arnfjörð Bjarmason wrote:
Hi,
Just a few notes from me regarding server administration of ptolemy and ortelius...
Please keep in mind that ptolemy and ortelius are meant to be WMF production boxes. That means they're (also) managed by the Wikimedia Ops team. I think that for the near future we're happy to let you play with the boxes and experiment with what OSM/integration software/architecture works best. But eventually, when these maps are integrated into our core web sites, the servers and software will need to be managed by WMF as well as you guys. Especially since you volunteers might lose interest in the long run... :)
That means:
- Please work with us; keep us informed
- Put documentation in our documentation wiki, http://wikitech.wikimedia.org. If you need access, please contact me and I'll get you set up.
- Logging of server actions can be done on #wikimedia-tech using the log bot. Just use "!log <message>" in the channel, it will work. Put the server name in the line (see the example below).
- If you have any problems/issues/needs related to managing the servers in general (RAID controller/driver issues?), as opposed to OSM software specific things, then certainly ask us! Chances are we've already solved it or have a certain way of doing things, and there is no need for you to reinvent the wheel. :)
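For example, a (made-up) log line for the driver upgrade discussed earlier would be:

!log ptolemy - upgraded STK RAID INT driver from 1.1-5 (2449) to 1.1-5 (2463)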
Ptolemy and ortelius are, in the long run, *not* meant to be used by toolserver users. Those boxes are explicitly separate. You can't run a production database when users are running all kinds of inefficient and uncoordinated queries on it. :) For now it doesn't matter, but keep this in mind.
Cassini is a toolserver, and managed by Wikimedia Germany. They do things differently than WMF, coordinate with them to see what works there.
Thanks!
Ptolemy and ortelius are, in the long run, *not* meant to be used by toolserver users. Those boxes are explicitly separate. You can't run a production database when users are running all kinds of inefficient and uncoordinated queries on it. :) For now it doesn't matter, but keep this in mind.
Okay, then we'll (*) need to mirror the dbs (postgis & osm-db) somehow, so the toolserver users can use them.
Cassini is a toolserver, and managed by Wikimedia Germany. They do things differently than WMF, coordinate with them to see what works there.
Peter
(*) we, that is, the tool authors :)
Mark Bergsma wrote:
Please keep in mind that ptolemy and ortelius are meant to be WMF production boxes. That means they're (also) managed by the Wikimedia Ops team. I think that for the near future we're happy to let you play with the boxes and experiment with what OSM/integration software/architecture works best. But eventually, when these maps are integrated into our core web sites, the servers and software will need to be managed by WMF as well as you guys. Especially since you volunteers might lose interest in the long run... :)
I think being "production" is very good - we will be under monitoring from the very beginning :-)
That means:
- Please work with us; keep us informed
So far the only updates for ptolemy:
- LOM firmware update by river
- Sun STK RAID INT firmware update from 5.2-0 (15825) to 5.2-0 (16732)
- Tool to manage the STK RAID INT installed in /usr/sbin/arcconf (Version 6.10 (B17551) from the Intel website). I think that Adaptec's version (Version 6.10 (B18359)) is a bit more informative; what are you using on other servers?
- Installed as dependencies for arcconf: libstdc++5 gcc-3.3-base libgcc1
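For reference, the sort of invocations that should work with it (controller number 1 assumed):

/usr/sbin/arcconf getconfig 1 ad    # adapter, firmware and battery status
/usr/sbin/arcconf getconfig 1 ld    # logical device (RAID volume) layout
/usr/sbin/arcconf getstatus 1       # progress of any running rebuild/verify task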
- Put documentation in our documentation wiki,
http://wikitech.wikimedia.org. If you need access, please contact me and I'll get you set up.
Can you create accounts for Aude and myself (Saper)? Is Ævar there as well?
Looking at http://wikitech.wikimedia.org/view/Platform-specific_documentation - how different is the Sun Fire X4250?
- Logging of server actions can be done on #wikimedia-tech using the log
bot. Just use "!log <message>" in the channel, it will work. Put the server-name in the line.
Cool, thanks.
- If you have any problems/issues/needs related to managing the servers
in general (RAID controller/driver issues?), as opposed to OSM software specific things, then certainly ask us! Chances are we've already solved it or have a certain way of doing things, and there is no need for you to reinvent the wheel. :)
Yes, here are my questions:
(1) It has been reported that the RAID controller has serious stability problems (causes kernel abends). I think this should be fixed in the new firmware OR the new driver, see below.
(2) What are the kernel upgrade procedures on the WMF servers?
(3) What is the OS upgrade procedure on the WMF servers?
(4) /home/saper/raid/linux_x86_x64_driver_v1.1.5-2463 contains Linux driver version 2463 for Sun STK RAID INT that we probably should be running. I can do that given (2) above :)
(5) I asked on #ts-admins about management console access; that would be beneficial to perform changes to the kernel and partitioning - see the next points for what needs to be done from there.
(6) I think we should reconfigure RAID - for now, I would like to put the current filesystem on a single RAID 1 pair of drives. It's root, so I think this shouldn't be done from the running system. I think we can disband the current RAID 10 setup for now; we will be testing one or two possible RAID setups for Postgres as soon as we have space.
(7) I'd love to have the OS repartitioned - small /, large /usr, mid-large /var, small /tmp, in the traditional UNIX way. All of this on the RAID 1 volume created in step (6).
(8) It would be nice to have a different OS (FreeBSD or Solaris) but I understand that you'd probably like to have a uniform setup across WMF, and I think I can live with that. It would be nice to have information re (3) though if we stick to Ubuntu.
I think I could do (4)...(7) myself, given access to the management console and some possibility to netboot/CD-boot from there. This leads me to:
(9) I've seen this: http://wikitech.wikimedia.org/view/Automated_installation Do you have some kind of minimal netboot/recovery system to be invoked from LOM to do stuff like total repartitioning?
Ptolemy and ortelius are, in the long run, *not* meant to be used by toolserver users. Those boxes are explicitly separate. You can't run a production database when users are running all kinds of inefficient and uncoordinated queries on it. :) For now it doesn't matter, but keep this in mind.
Cassini is a toolserver, and managed by Wikimedia Germany. They do things differently than WMF, coordinate with them to see what works there.
I hope that we can have a joint project on maps and use resources efficiently. For example, we might not have space for the full OSM database anywhere else than on ptolemy. However, I think we can find a way to provide production-level stability and stay within our resource base. Besides, I have no objections to having exactly the same production/monitoring features on cassini as well.
Uff, that's all from me for now :)
Ptolemy and ortelius are, in the long run, *not* meant to be used by toolserver users. Those boxes are explicitly separate. You can't run a production database when users are running all kinds of inefficient and uncoordinated queries on it. :) For now it doesn't matter, but keep this in mind.
Cassini is a toolserver, and managed by Wikimedia Germany. They do things differently than WMF, coordinate with them to see what works there.
I hope that we can have a joint project on maps and use resources efficiently. For example, we might not have space for the full OSM database anywhere else than on ptolemy. However, I think we can find a way to provide production-level stability and stay within our resource base. Besides, I have no objections to having exactly the same production/monitoring features on cassini as well.
Okay, just one question from my tool-author's perspective: Ortelius with a mapnik renderer will only need the PostGIS DB from Ptolemy, not the OSM-DB mirror. This mirror will be used for Query-To-Map like features (and?) from the Toolserver.
So can't we go with a PostGIS DB on Ptolemy and Cassini each, where Ptolemy's is used from Ortelius for live rendering and Cassini uses its own for customized render experiments with varying styles?
The OSM-DB however can be used directly from Cassini (r/o access of course). So what remains is the load balancing between the live PostGIS DB and the OSM-DB used from the toolservers, both residing on Ptolemy.
But I'm nearly sure I missed something. :)
Peter
Hi Marcin,
Marcin Cieslak wrote:
- Put documentation in our documentation wiki,
http://wikitech.wikimedia.org. If you need access, please contact me and I'll get you set up.
Can you create accounts for Aude and myself (Saper)? Is Ævar there as well?
Yes, I will get the details to you.
Yes, here are my questions:
(1) It has been reported that the RAID controller has serious stability problems (causes kernel abends). I think this should be fixed in the new firmware OR the new driver, see below.
(2) What are the kernel upgrade procedures on the WMF servers?
(3) What is the OS upgrade procedure on the WMF servers?
<snip>
I like your enthusiasm in managing these systems. :-) However, especially for the long run I think it would be easiest if you let us (WMF operations) deal with these system level things, and you concentrate on the OSM-specific software setup. Otherwise these systems will naturally diverge from our other servers, and for these production systems that's something we'll need to avoid.
What we normally do:
We upgrade the OS and kernels as needed (features/stability), or when serious (remote) security issues are identified. We tend to upgrade to the latest Ubuntu kernel and use that unless we experience problems in practice. In this case there is indeed an issue with that RAID controller, so we'll upgrade it. We try to stick with Ubuntu Hardy as long as we can for miscellaneous servers. If you feel you really need newer versions for the OSM setup, then let us know and we can upgrade it.
Partitioning: we know that it's traditional to separate /usr /var etc, but we have found that this usually has very little use in practice, and is more often a nuisance. These days we put everything in one large enough / and only split off data partitions on servers where it matters. Of course your databases should be running off a special partition, but for the rest there is probably no real need. If you think otherwise and have good arguments, we can surely change it, of course. We do tend to use LVM for everything non-root in those cases.
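As an illustration only (the volume group name and sizes are placeholders, not a proposal for the actual layout), the non-root data split with LVM would look something like:

pvcreate /dev/sda3                    # whatever big slice is left after /
vgcreate data /dev/sda3
lvcreate -L 400G -n pgdata data       # PostgreSQL cluster
lvcreate -L 300G -n tiles  data       # rendered tiles
mkfs.ext3 /dev/data/pgdata
mkfs.ext3 /dev/data/tiles
mount /dev/data/pgdata /srv/postgresql
mount /dev/data/tiles  /srv/tiles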
The same holds for the RAID setup: on our databases and big storage systems, most often we just run it off the same big RAID-10 array. It's more convenient and flexible and if well-configured the rest of the OS is not hitting that array much at all. If you feel there is a need, we can of course change it - but we'll need to reinstall the OS. A different RAID level would be totally fine as well of course - this is very much dependent on your needs. I picked RAID-10 as neither Aevar nor Katie knew what was necessary, and RAID-10 tends to be the best choice for databases and high performance I/O systems.
Serial console/LOM access cannot easily be handed out, but should also not be necessary usually. In the unlikely event that the system becomes unmanageable in-band, just contact us directly (ask on #wikimedia-tech for example) and we'll restore it quickly.
I hope that we can have a joint project on maps and use resources efficiently. For example, we might not have space for the full OSM database anywhere else than on ptolemy. However, I think we can find a way to provide production-level stability and stay within our resource base. Besides, I have no objections to having exactly the same production/monitoring features on cassini as well.
I really want to stress that these systems need to be *separate*, they cannot be used together at all. Ideally there is no traffic between those servers at all, except in the form of cassini generating visitor traffic like the rest of the Internet. Cassini is meant for playing around where lots of people have access, the other two are (in the end) really meant for production use with limited access. Stable operation is simply not possible when arbitrary users can do arbitrary things on a system, and that's why we intended these systems to be very isolated from the start. Cassini is also managed by WMDE / Toolserver, ptolemy and ortelius are Wikimedia Foundation managed. So I'm afraid that we really cannot use those servers in one resource pool... If those separate clusters do not have enough resources/space to do what we need, I think we should look into buying more hardware. That is really not impossible. :)
Thanks for sharing your ideas. :)
Mark Bergsma wrote:
Hi Marcin,
Thank you for your answers re OS/upgrades/kernel - 100% agreed.
(Partitioning): we know that it's traditional to separate /usr /var etc, but we have found that this usually has very little use in practice, and is more often a nuisance. These days we put everything in one large enough / and only split off data partitions on servers where it matters. Of course your databases should be running off a special partition, but for the rest there is probably no real need. If you think otherwise and have good arguments, we can surely change it, of course. We do tend to use LVM for everything non-root in those cases.
Please excuse my 1995-era UNIX thinking :)
The same holds for the RAID setup: on our databases and big storage systems, most often we just run it off the same big RAID-10 array. It's more convenient and flexible and if well-configured the rest of the OS is not hitting that array much at all. If you feel there is a need, we can of course change it - but we'll need to reinstall the OS. A different RAID level would be totally fine as well of course - this is very much dependent on your needs. I picked RAID-10 as neither Aevar nor Katie knew what was necessary, and RAID-10 tends to be the best choice for databases and high performance I/O systems.
The issue is not about separating the OS away from the rest; it's about testing how we can split the two different usage patterns across the databases we might have.
The best solution would be to have an extra pair of small drives in RAID 1 so that we can check whether 2x or 3x RAID-10 does indeed change anything in the picture. I am somehow not confident about extN filesystems doing stuff optimally.
As soon as we confirm that we do not run out of space by removing two drives from the RAID-10, I would definitely go for a reinstall on a separate RAID 1 pair (taken out of the current RAID-10 if we have nothing small available).
Serial console/LOM access cannot easily be handed out, but should also not be necessary usually. In the unlikely event that the system becomes unmanageable in-band, just contact us directly (ask on #wikimedia-tech for example) and we'll restore it quickly.
If you handle the whole OS/hardware part - fine with me. One trouble less. :)
(re multicast from the other email)
However, switches/routers handle multicast traffic specially, have group/port membership limits for it, and we've also found several bugs. So before you start using it heavily, I'd like to know what for. :) With only 2 servers communicating, would unicast not be a better idea?
Spread (the tool I am thinking of) basically requires either broadcast or multicast. The choice is yours :) Should (a) this model prove workable and (b) we quickly find out we need to start growing a farm of rendering servers (hopefully not) - you might very well decide that WMF needs to carry mcast traffic, for example across the Atlantic. For now, we are just our little family of a few boxes in the Netherlands.
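If I remember the spread.conf segment syntax right, it would be roughly like this - the multicast address, port, config path and host addresses are all placeholders:

cat > /etc/spread/spread.conf <<'EOF'
Spread_Segment 239.192.0.17:4803 {
    ptolemy   10.0.6.1
    ortelius  10.0.6.2
    cassini   10.0.6.3
}
EOF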
This is not something to even *think* about now - I would like to see how it works with our 2 or 3 servers (yes, including Cassini *for now* - see below), so multicast would certainly be an advantage.
I will probably get back to you re virtual IP addresses anyway once my ideas mature and are ready to be put into action.
(re-arranged order below)
I really want to stress that these systems need to be *separate*, they cannot be used together at all. Ideally there is no traffic between those servers at all, except in the form of cassini generating visitor traffic like the rest of the Internet. Cassini is meant for playing around where lots of people have access, the other two are (in the end) really meant for production use with limited access.
We are now in the middle of an internal discussion about the future role of Cassini. It has been raised (and I share this view) that we might not really need another toolserver box (we already have one underutilized Sun and one Linux box anyway) and that remote access to the databases and rendering infrastructure from the existing toolservers might be enough.
As I prefer to build this architecture bottom-to-the-top (i.e. ptolemy first, rendering later, user access at the end), we still need to find out what the exact role of Cassini will be.
Stable operation is simply not possible when arbitrary users can do arbitrary things on a system, and that's why we intended these systems to be very isolated from the start.
One of my ideas (this is only mine and other project members might certainly disagree) would be to have Cassini as the box that runs newer/experimental versions of the production stuff from ortelius/ptolemy. This can still benefit toolserver users (so that they have the infrastructure to test their stylesheets, for example), but will be definitely more under control, unlike "playing around a lot". It can be very useful to share some functions with ortelius *before we go into production* just to test the feasibility of the distributed rendering engine I am envisioning. This might mean that cassini will be much more closely coupled with ptolemy/ortelius than with users and their stuff.
*I* would rather have another box coupled with the two *now* to test our load distribution concepts than another toolserver. Daniel, feel free to bash me for that :)
So, from the WMF perspective, I would rather promote Cassini to be treated as an almost-production box for now (as ptolemy is), under the same administration processes we have for WMF, *until* the rendering infrastructure is ironed out to go live. After that it can be a perfect staging box to test updates to the WMF production environment - with a software setup that could be promoted to the production boxes once tested.
Cassini is also managed by WMDE / Toolserver, ptolemy and ortelius are Wikimedia Foundation managed. So I'm afraid that we really cannot use those servers in one resource pool...
Having said the above, nothing will change with Cassini without prior written consent from Wikimedia Deutschland. That's why we try to work together to have a final architecture ironed out.
If those separate clusters do not have enough resources/space to do what we need, I think we should look into buying more hardware. That is really not impossible. :)
Before we do that, I'd like to check how we can max out what we have. And I'd like to know, for example, do I need more smaller machines or just one big one? And what exactly are our storage requirements (thinking about i18n-ized tiles, for example)? I think we should be prepared for a higher demand than OSM currently has - that's where my concerns come from. I'd like to avoid unnecessary duplication of infrastructure where we could just have more power. Maps are in many ways different from the casual PHP/MediaWiki bot stuff run on the Toolserver - we have much more power to control the environment (like putting users' rendering requests at the lowest priority).
To sum up:
(1) we will be working on the architecture with the goal of making cassini work as optimally as possible for the project
(2) as soon as we find out how much PostgreSQL space we need, I would ask you to reinstall ptolemy for us
(3) at least a multicast group would be fine for now
Hi,
I would like to ask the slightly higher-level question of what the overall plan is for which services will run on which machines. So far I understand it as: Ptolemy will contain a PostGIS database and the mapnik rendering will be on Ortelius?
http://meta.wikimedia.org/wiki/Maps_server_setup_tasks shows some ideas, but it seems to have more questions than answers about the architecture and the different parts of software that are supposed to run on these machines.
As soon as we confirm that we do not run out of space by removing two drives from the RAID-10, I would definitely go for a reinstall on a separate RAID 1 pair (taken out of the current RAID-10 if we have nothing small available).
I am not sure what exactly you want to put on the disks, but if I am not mistaken, the postgis rendering database is about 100 GB in size. In addition, OSM currently has about 350 GB of tiles for its single mapnik layer. The full main OSM database is, I think, on the order of 1 TB, but a good portion of that would presumably not be relevant to Wikipedia, i.e. the GPX points and history, and isn't currently available anyway. Only the current OSM data is probably more on the order of 200 GB, but someone else can probably provide more accurate numbers if needed. On the other hand, is there any need for the OSM database on the production servers at the moment? Are there any services yet that use that data, or should that be something for the toolserver?
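(For a sense of scale, a typical import into the rendering database would be along these lines - database name, cache size and planet file are only illustrative:)

# --slim keeps the intermediate node/way data in the database rather than in RAM
osm2pgsql --slim --cache 2048 --database gis planet-latest.osm.bz2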
...
Spread (the tool I am thinking of) basically requires either broadcast or multicast. The choice is yours :) Should (a) this model prove workable and (b) we quickly find out we need to start growing a farm of rendering servers (hopefully not) - you might very well decide that WMF needs to carry mcast traffic, for example across the Atlantic. For now, we are just our little family of a few boxes in the Netherlands.
May I ask what your plans for a potential rendering farm are, and how multicast / broadcast come into this? Renderd, the rendering backend behind mod_tile, does have some ability to scale across multiple servers if needed, which may or may not be useful here, although this is currently untested as OSM has so far only used one server (currently an 8-core + hyperthreading machine). But in case it doesn't fit, it can probably be extended to do so.
Before we do that, I'd like to check how we can max out what we have. And I'd like to know, for example, do I need more smaller machines or just one big one? And what exactly are our storage requirements (thinking about i18n-ized tiles, for example)? I think we should be prepared for a higher demand than OSM currently has - that's where my concerns come from.
Which parts of the software stack do you think will need the most work to scale up? My guess would be that the static rendering, i.e. the non-slippy-map, non-tile rendering of the maps, still needs some thought. Currently the static part of the SlippyMap extension just calls directly into the rendering stack and renders the same image every time. The "export scripts" do add some caching headers, so one can put a cache in front of it to avoid rerendering it every time, but I would guess this needs some form of render-side queueing too, to deal with load spikes.
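(A sketch of the "cache in front" idea with plain Apache mod_cache/mod_expires - the export URL and file path below are made up:)

cat > /etc/apache2/conf.d/slippymap-cache <<'EOF'
# cache static map exports on disk for an hour (URL path is hypothetical)
CacheEnable disk /w/extensions/SlippyMap/export.php
CacheRoot /var/cache/apache2/mod_disk_cache
CacheDefaultExpire 3600
ExpiresActive On
ExpiresByType image/png "access plus 1 hour"
EOF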
*I* would rather have another box coupled with the two *now* to test our load distribution concepts than another toolserver. Daniel, feel free to bash me for that :)
As a tool author I'd like to have a toolserver with r/o (or, if possible, r/w) access to a rendering/PostGIS DB, as well as r/o access to an (as complete as possible) OSM-DB clone - may it be named cassini, willow, nightshade or whatsoever.
So if all toolservers *must* be separated from the live boxes, maybe cassini can be the DB server to nightshade & willow, as ptolemy is to ortelius.
Peter
Ah, I forgot one thing.
I'd like to see how IP failover may work for us in practice (I am thinking of an easy switch between ptolemy and cassini in case of some PostgreSQL problems).
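Very roughly, what I mean by failover (the address below is just a placeholder from the documentation range):

# on whichever box currently provides the service
ip addr add 192.0.2.10/32 dev eth0
arping -q -U -c 3 -I eth0 192.0.2.10   # gratuitous ARP so neighbours update
# on failover: remove it here, add it on the other box
ip addr del 192.0.2.10/32 dev eth0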
(10) Do you mind allocating us one more virtual IPv4 address that will be shared between cassini and ptolemy (for now)? This leads up to:
(11) I would like to generate multicast traffic between the OSM servers. I think it would be good to have a few IP addresses from the 239.192.0.0/14 range allocated.
Hi Marcin,
Marcin Cieslak wrote:
Ah, I forgot one thing.
I'd like to see how IP failover may work for us in practice (I am thinking of an easy switch between ptolemy and cassini in case of some PostgreSQL problems).
(10) Do you mind allocating us one more virtual IPv4 address that will be shared between cassini and ptolemy (for now)? This leads up to:
I addressed this in my previous mail.
(11) I would like to generate multicast traffic between the OSM servers. I think it would be good to have a few IP addresses from the 239.192.0.0/14 range allocated.
Well... we do use multicast a bit. We can indeed allocate IPs for you, out of our AS-allocated range. However, switches/routers handle multicast traffic specially, have group/port membership limits for it, and we've also found several bugs. So before you start using it heavily, I'd like to know what for. :) With only 2 servers communicating, would unicast not be a better idea?
Let me know,