[Admin-discuss] Murphy (long mail warning - sorry)
Hi all,

There's been talk about what's going on with murphy recently, and we're not quite sure yet, so we figured this'd be a good place to discuss it.

At the moment, it's sitting there idle, with Solaris 10 installed. Linux was taken off it for a number of reasons:

* It seems to be exhibiting kernel bugs (random system freezes for absolutely no reason, during which it is inaccessible via SSH or serial console/ALOM).
* We (redbrick) seemed to be the only people anywhere actually using Linux on a production T2000. This worries us greatly, as it means vendors are unlikely to put too much time into fixing bugs.
* A number of bits of software were known to be buggy, and we had some of them patched ourselves. (For example, libmysqlclient15off)
* Ubuntu 8.04 doesn't support SPARC at all, which kinda ties back into the second point. People aren't interested in supporting it. Debian has a SPARC port that apparently supports a Niagara CPU, but there seems to be nobody (like, absolutely nobody, anywhere) using it in production. I only came across one reference to one guy using it in testing. For a morning's worth of googling, this isn't encouraging.
* Gentoo (who seem to have the biggest Niagara uptake of all of the linuxes) makes admins break out in a cold sweat at the very mention of its name. (I quite like gentoo as it happens, but I'm weird)

Anyway, that's why we think Linux is a bad idea (tm).

Unfortunately, that leaves us with Solaris as our only real option. One or two of the BSDs support Niagara, but they seem to be even less mature than linux.

We've installed it on a test basis, to see how it goes really. We're considering breaking it into multiple LDOMs (Logical Domains), which are kinda like Xen virtual machines, only with Sun logos everywhere.

See here for more info: http://www.sun.com/bigadmin/hubs/ldoms/
And here for a fancy PDF about them (even has pictures): http://www.sun.com/blueprints/0207/820-0832.pdf

An idea is that we'd have a biggish LDOM for logins and apache (because we want users to be able to log into the WWW machine), another for MySQL, another for secondary LDAP, and another for maybe jakarta/tomcat/glassfish/whatever java webapp server we end up using. The LDAP and MySQL ones would obviously be smaller than the Apache and login one.

The main benefit of this is security, really. Of course it means remembering 4 more root passwords, but that's a minor issue :)

That's all really. Anyone have any feedback? We're completely open to suggestions on this, nothing has been decided with murphy at this point.

-Andrew
-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
For what it's worth, you're dead right to avoid being on a minority OS/platform. I've been through this with Tru64 on Alpha, and the number of problems you encounter is not funny. There are more productive ways to spend your days!

In my opinion, Solaris is the way to go on the T2000: Sun have a serious vested interest in making it run well. From an admin (and to a far lesser extent user) point of view, it's also worth having Solaris experience. That's less true today than previously, but it is still used for a lot of larger (read: important and expensive) Unix/Oracle/ERP deployments.

Just my biased opinion...

Regards,
Fergus.
On Mon, May 26, 2008 at 08:43:29PM +0100, Andrew Martin wrote lots. I agree with everything he wrote :) -- Andrew Harford CA3 Class Rep System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee 6,000 miles and all the dope I could smoke still couldn't separate me from my problems. And this was good dope. I mean it was growing everywhere. Oh my God! This one time we got so baked we ended up eating all the food at the food the World Health Organization had airlifted in. --Brian Griffin
On Mon, May 26, 2008 at 08:43:29PM +0100, Andrew Martin wrote:
* We (redbrick) seemed to be the only people anywhere actually using Linux on a production T2000. This worries us greatly, as it means vendors are unlikely to put too much time into fixing bugs.
I'm running another one too :-) but it's still got an original dapper on it.
* Ubuntu 8.04 doesn't support SPARC at all, which kinda ties back into the second point. People aren't interested in supporting it. Debian has a SPARC port that apparently supports a Niagara CPU, but there seems to be nobody (like, absolutely nobody, anywhere) using it in production. I only came across one reference to one guy using it in testing. For a morning's worth of googling, this isn't encouraging.
Yeah, it sucks :/ But using Solaris isn't so bad, handy for playing with dtrace and so on.
That's all really. Anyone have any feedback? We're completely open to suggestions on this, nothing has been decided with murphy at this point.
Sounds cool! -- colmmacc@redbrick.dcu.ie PubKey: colmmacc+pgp@redbrick.dcu.ie Web: http://devnull.redbrick.dcu.ie/
Hm.. We're likely going to be setting up the LDOMs this weekend to see how it goes. We need murphy to run apache, user logins (for web development), glassfish/tomcat and preferably backup instances of some critical services (LDAP, maybe DNS). Oh, and databases.

Here's my proposed layout:

Control domain: 4GB RAM, 4 VCPUs (1 physical CPU)
* It seems that if you want to use ZFS for storing the LDOMs, 4GB RAM for the control domain is essential. We could go lower and give more RAM to other domains if we wanted to stick with old-fashioned file systems though.
Apache/httpd/logins: 8GB RAM, 16 VCPUs (4 physical CPUs)
Service backups: 1GB RAM, 4 VCPUs (1 physical CPU)
MySQL/PGSQL: 3GB RAM, 8 VCPUs (2 physical CPUs)

Unless I'm adding wrong, that makes 16GB of RAM and 32 VCPUs/8 physical.

I'm not sure if we really want to go with ZFS for the control domain. We could just use a lesser filesystem and give more RAM to apache and maybe databases?

Aside from that though... feedback welcome.. Thoughts?

-Andrew
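The "unless I'm adding wrong" sum in the proposed layout is easy to double-check. Below is a minimal sketch in Python; the domain labels and the `totals` helper are made up for illustration, and it assumes that each "physical CPU" in the mail means one core carrying 4 VCPUs. The RAM and VCPU figures are copied from the proposal.

```python
# Hypothetical tally of the proposed LDOM layout; figures copied from the mail.
THREADS_PER_CORE = 4  # assumption: one "physical CPU" = one core = 4 VCPUs

proposed = {
    "control": {"ram_gb": 4, "vcpus": 4},
    "apache-logins": {"ram_gb": 8, "vcpus": 16},
    "service-backups": {"ram_gb": 1, "vcpus": 4},
    "databases": {"ram_gb": 3, "vcpus": 8},
}


def totals(layout):
    """Return (total RAM in GB, total VCPUs, total cores) for a layout."""
    ram = sum(dom["ram_gb"] for dom in layout.values())
    vcpus = sum(dom["vcpus"] for dom in layout.values())
    return ram, vcpus, vcpus // THREADS_PER_CORE


ram_gb, vcpus, cores = totals(proposed)
print(f"{ram_gb}GB RAM, {vcpus} VCPUs, {cores} cores")
# prints "16GB RAM, 32 VCPUs, 8 cores"
```

So the addition in the mail holds: the four domains use all 16GB and all 32 VCPUs, which means any RAM taken back from the control domain can go straight to another domain.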
On Wed, May 28, 2008 at 12:33:52PM +0100, Andrew Martin wrote:
I'm not sure if we really want to go with ZFS for the control domain, we could just use a lesser filesystem and give more RAM to apache and maybe databases?
It's an awful lot of RAM to use to have ZFS on just 2 fairly disks, I'd want to be convinced that we're getting some real benefit from the ZFS.

a.
-- Andrew Harford CA3 Class Rep System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee

I could crush him like an ant. But it would be too easy. No, revenge is a dish best served cold. I'll bide my time until ... Oh, what the hell. I'll just crush him like an ant. -- Montgomery Burns
Yeap, that was my thinking too. We could give away 2 or 3GB of that to the web server/login server. The major benefit is the ability to snapshot/clone with ease, but those are just administrative niceties when you think about it.

As for resource allocation, we can actually add or remove resources (CPU/RAM) to each LDOM after it's been created anyway, and it'll take effect the next time that LDOM is rebooted.

-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
On Wed, May 28, 2008 at 01:56:04PM +0100, Andrew Martin wrote:
Yeap, that was my thinking too. We could give away 2 or 3GB of that to the web server/login server. The major benefit is the ability to snapshot/clone with ease, but those are just administrative niceties when you think about it.
As for resource allocation, we can actually add or remove resources (CPU/RAM) to each LDOM after it's been created anyway, and it'll take effect the next time that LDOM is rebooted.
This all seems too nice and easy, there must be a catch somewhere ;) -- Andrew Harford CA3 Class Rep System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee What we were after now was the old surprise visit. That was a real kick and good for laughs and lashings of the old ultraviolence. -- Alex (A Clockwork Orange)
The whole thing runs Solaris :P

-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
Hi all,

We've gotten LDOMs up and running on murphy \o/

It's currently laid out as follows:

--------------------------------------------------------
Control      136.206.15.14   murphy.redbrick
             4 VCPUs (1 core), 1GB RAM
             This is what has become of our original solaris install
--------------------------------------------------------
Webserver    136.206.15.31   murphy-ldg1.redbrick (not set yet)
             16 VCPUs (4 cores), 10GB RAM
             To be a webserver/login machine?
--------------------------------------------------------
(Unused)     12 VCPUs (3 cores), 5GB RAM
             Maybe a databases LDOM, and another one for tiny services?
--------------------------------------------------------

This is just a starting point, and the result of me playing with things. I'd suggest a few things:

* Give the name "murphy.redbrick" to the webserver/login machine, as that's what we'll be letting users log into. Also give it the .14 IP.
* Give all LDOMs an IP in the .30-.40 range (so, give control .30, and webserver .31 *and* .14).

We can reassign resources as we see fit without reinstalling things. Memory can be added to or removed from a domain with a reboot of that domain, and cores/VCPUs can even be hotswapped :) And we can have up to 32 of these things. That said, I'd be against assigning VCPUs in chunks of less than four, otherwise you'll get contention for core resources (4 VCPUs/threads per core).

The "ldg1" thing is just a suggestion from the Sun LDOM manual.. maybe give the control domain -ldg0 or something. Then the next ldom ldg2..etc.. I'd still like us to add some more LDOMs to play with and do things with.

We saved a load of resources on the control domain by not using ZFS. LDOM images are being stored in /ldoms/ldgX/bootdisk.img on the control domain.

LDOM management stuff is stored in /opt/SUNWldm/bin on the control domain, and controlled by the service ldmd. Serial access to the guest ldoms is available by telnetting to localhost:5000 from the control domain (we should *never* let users log in here).

You'll also notice that Solaris has gone into nazi mode, being strict about password strength and things, because it's the control domain.

To any admins interested in setting up an LDOM to play with, read the LDOM manual :) We're using version 1.0.3 if anyone asks.

Oh, also, the latest T2000 firmware update (which LDOMs required) has made our CD-ROM drive disappear. Go figure.

Nothing's been installed yet in the guest domain that's been set up.

Em...yes..that's about it I think. I'm half asleep so this may not be the most coherent email I've written in a while.

-Andrew
_______________________________________________ Admin-discuss mailing list Admin-discuss@lists.redbrick.dcu.ie http://lists.redbrick.dcu.ie/mailman/listinfo/admin-discuss
-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
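The two rules of thumb in the layout mail (hand out VCPUs in whole cores of four, and stay within what the box has) can be expressed as a small check. This is only a sketch in Python: `check_layout` and the domain labels are hypothetical, the machine totals of 16GB and 32 VCPUs are taken from the earlier proposal, and nothing here talks to the real LDOM manager.

```python
# Hypothetical sanity check for an LDOM resource plan; not part of any Sun tool.
MACHINE_RAM_GB = 16   # totals assumed from the earlier proposal mail
MACHINE_VCPUS = 32
THREADS_PER_CORE = 4  # 4 VCPUs/threads per core, as stated in the mail

current = {
    "control (murphy.redbrick)": {"ram_gb": 1, "vcpus": 4},
    "webserver (murphy-ldg1)": {"ram_gb": 10, "vcpus": 16},
}


def check_layout(layout):
    """Return (list of problems, dict of unallocated resources)."""
    problems = []
    for name, dom in layout.items():
        # A domain holding part of a core shares that core with another domain.
        if dom["vcpus"] % THREADS_PER_CORE != 0:
            problems.append(f"{name}: {dom['vcpus']} VCPUs is not a whole number of cores")
    used_ram = sum(dom["ram_gb"] for dom in layout.values())
    used_vcpus = sum(dom["vcpus"] for dom in layout.values())
    if used_ram > MACHINE_RAM_GB:
        problems.append(f"RAM over-committed: {used_ram}GB of {MACHINE_RAM_GB}GB")
    if used_vcpus > MACHINE_VCPUS:
        problems.append(f"VCPUs over-committed: {used_vcpus} of {MACHINE_VCPUS}")
    free = {"ram_gb": MACHINE_RAM_GB - used_ram, "vcpus": MACHINE_VCPUS - used_vcpus}
    return problems, free


problems, free = check_layout(current)
print(problems, free)
# prints "[] {'ram_gb': 5, 'vcpus': 12}"
```

The leftover 5GB and 12 VCPUs (3 cores) agree with the "(Unused)" row in the mail, which is what would be carved up for a databases LDOM and any small-services LDOM.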
On Mon, Jun 02, 2008 at 11:14:09PM +0100, Andrew Martin wrote:
Hi all,
We've gotten LDOMs up and running on murphy \o/
Great work, sorry I wasn't more help :(
I'd suggest a few things.. one, that we give the name "murphy.redbrick" to the webserver/login machine, as that's what we'll be letting users log into. Also give it the .14 IP. Give all LDOMs an IP in the .30-.40 range (so, give control .30, and webserver .31 *and* .14).
Yes, I completely agree.
The "ldg1" thing is just a suggestion from the sun LDOM manual.. maybe give the control domain -ldg0 or something. Then the next ldom ldg2..etc..
Seems perfectly logical (pun intended)
LDOM images are being stored in /ldoms/ldgX/bootdisk.img on the control domain.
I don't really know what that means.
LDOM management stuff is stored in /opt/SUNWldm/bin on the control domain, and controlled by the service ldmd. Serial access to the guest ldoms is available by telnetting to localhost:5000 from the control domain (we should *never* let users log in here).
Indeed, we need to be quite sure this is secure. Also, can that port be changed to something obscure?
Em...yes..that's about it I think. I'm half asleep so this may not be the most coherent email I've written in a while.
Em, it was fine. You should probably not be so negative about your emails in your emails however ;) a. -- Andrew Harford System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee Let's look at this thing from a... um, from a standpoint of status. What do we got on the spacecraft that's good? -- Gene Kranz (Apollo 13)
LDOM images are being stored in /ldoms/ldgX/bootdisk.img on the control domain.
I don't really know what that means.
On the control domain, there's a file, /ldoms/ldg1/bootdisk.img, that's /dev/dsk/c0t0d0 on murphy-ldg1 :)
LDOM management stuff is stored in /opt/SUNWldm/bin on the control domain, and controlled by the service ldmd. Serial access to the guest ldoms is available by telnetting to localhost:5000 from the control domain (we should *never* let users log in here).
Indeed, we need to be quite sure this is secure. Also, can that port be changed to something obscure?
Security through obscurity makes baby jesus cry :p It's not accessible from outside of localhost, I checked. It can be changed though if you want to.

-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
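For anyone who wants to repeat the "is the console port reachable" check, here is a minimal sketch in Python. The function name `port_accepts_connections` is made up for illustration; port 5000 is the console port mentioned in the thread. To test exposure properly, it would need to be run from a different machine against the control domain's address, since a check run on the control domain itself says nothing about outside access.

```python
import socket


def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is accepting connections on that address and port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (run from another host; the address is the control domain's
# current IP as given in the thread):
#   port_accepts_connections("136.206.15.14", 5000)
# A result of False is consistent with the console only listening on localhost.
```

A refused or timed-out connection is only evidence for the address tested, so a firewall in between could give the same result as a localhost-only listener.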
On Mon, Jun 02, 2008 at 11:28:50PM +0100, Andrew Martin wrote:
On the control domain, there's a file, /ldoms/ldg1/bootdisk.img, that's /dev/dsk/c0t0d0 on murphy-ldg1 :)
Well, that makes perfect sense :p -- Andrew Harford System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee Did you hear that Meg? Guys can marry other guys now. So...this is awkward, but I mean, if they can do that, that is pretty much it for you, isn't it? I mean you as well pack it in. Game over. --Stewie Griffin
Hi guys,

Right, so murphy has been set up with LDOMs. I've set up networking in a somewhat sane way (v4 anyway). See the wonderful pretty diagram that I drew while learning xfig: http://www.redbrick.dcu.ie/~werdz/murphy_ldoms.png

ldg0 and ldg1 have been set up as shown. ldg2 doesn't exist yet. ldg1's main hostname is murphy, so when users log into it, it's the same as before for them (username@murphy:~$). Just to keep things simple really.

I've also mounted NFS on ldg1, which was far easier than I had thought.. I'd heard rumours that it eats little children, when in reality you just have to use the vers=3 mount option and it seems to work perfectly.

Currently trying to get LDAP working. There seems to be a way to get the native solaris client to work nicely with OpenLDAP, but unfortunately our schema is a custom mess of a thing that doesn't even work with newer versions of linux. (Our LDAP server is running in an ubuntu 6.06 chroot on carbon, because the version that comes with ubuntu 8.04 chokes on our schema.) So, yes, making solaris talk to that is a problem.

Once that's done, I'd like to start setting up our own apache and doing some benchmarking/performance testing to find an optimal configuration. Then, once we've figured that out, it would seem to make sense to write a few build scripts to help keep apache up to date.

Erh..yes..that's about it for now.. Anyone have any comments or suggestions?

I'm gonna start a new thread about our LDAP mess.

-Andrew
_______________________________________________ rb-admins mailing list rb-admins@lists.redbrick.dcu.ie http://lists.redbrick.dcu.ie/mailman/listinfo/rb-admins
-- RedBrick/Gamessoc/Filmsoc Webmaster 08/09
On Mon, Jun 09, 2008 at 11:22:45PM +0100, Andrew Martin wrote:
nice diagram :) -- Andrew Harford System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee During high school, I played junior hockey and still hold two league records: most time spent in the penalty box; and I was the only guy to ever take off his skate and try to stab somebody. --Happy Gilmore
On Mon, Jun 09, 2008 at 11:22:45PM +0100, Andrew Martin wrote:
Erh..yes..that's about it for now..
HUZZAH \o/
Anyone have any comments or suggestions?
I'm gonna start a new thread about our LDAP mess.
Any chance we can simply dump the moany whorebag? Don't tell anyone and simply take it on faith that they own the account they log into ;) No one will notice :) John
On Mon, May 26, 2008 at 08:43:29PM +0100, Andrew Martin wrote:
An idea is that we'd have a bigish LDOM for logins and apache (because we want users to be able to log into the WWW machine), another for MySQL, another for secondary LDAP, and another for maybe jakarta/tomcat/glassfish/whatever java webapp server we end up using. The LDAP and MySQL ones would obviously be smaller than the Apache and login one.
Myself and Cian have been talking about this a bit more, and we're thinking that containers might be more appropriate than LDOMs for this. As the LDOMs all have separate kernels etc., there's a much larger amount of work involved in keeping the thing up to date. I'm just not convinced that it's worth that.

The LDOMs do give us a greater amount of separation and isolation, but realistically, if apache goes nuts and crashes its server, very few people will be worrying about databases potentially dying.

The added bonus of this approach is that we would be able to downgrade the firmware on murphy, as the most recent version (needed for LDOMs) has caused the CD-ROM drive to disappear :/

a.
-- Andrew Harford System Administrator, DCU Networking Society Ordinary Member, Societies & Publications Committee

I'm a man who discovered the wheel and built the Eiffel Tower out of metal and brawn. That's what kind of man I am. You're just a woman with a small brain. With a brain a third the size of us. It's science. --Ron Burgundy
participants (5): Andrew Harford, Andrew Martin, Colm MacCarthaigh, Fergus Donohue, John