From: "Boyd Stephen Smith Jr."
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Linux Cluster
Date: Sat, 1 Jul 2006 00:51:36 -0500
Message-Id: <200607010051.44525.bss03@volumehost.net>

On Thursday 25 May 2006 14:13, "Bruno Lustosa" wrote about
'[gentoo-user] Linux Cluster':
> - Distributed filesystem, so that all machines can share the same
> filesystem. Something like RAID-over-ethernet.

You probably want Red Hat's GFS (there are probably other cluster-aware
filesystems available for linux that I'm not aware of) and some sort of
external storage that allows you to hook two machines to it. You might
also look into multipathing; that would help in case of a cable failure.

For maximum availability, you want your enclosure to have two SCSI disk
controllers, each with two separate SCSI ports (these ports are on
different chains). You'll hook each of the two computers in the cluster to
one port on each controller and then use multipathing to tell linux both
SCSI paths are the same device. You'll have a second external storage
connected the same way and use software mirroring across the two.
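A rough way to sanity-check that cabling is to model it and knock out one
component at a time. The following is only a sketch in Python, not
anything GFS or multipath ships with. The host, enclosure, and controller
names are made up, and it assumes every path has its own HBA and its own
cable, which the description above doesn't actually pin down.

```python
from itertools import product

HOSTS = ["host1", "host2"]          # hypothetical names
ENCLOSURES = ["encA", "encB"]       # two mirrored external storage units
CONTROLLERS = ["ctl1", "ctl2"]      # two controllers per enclosure


def build_paths(enclosures=ENCLOSURES):
    """One path per (host, enclosure, controller) combination.

    Assumption: every path has its own HBA and its own cable.
    """
    paths = []
    for host, enc, ctl in product(HOSTS, enclosures, CONTROLLERS):
        paths.append({
            "host": host,
            "hba": f"{host}:hba:{enc}:{ctl}",
            "cable": f"{host}:cable:{enc}:{ctl}",
            "controller": f"{enc}:{ctl}",
            "enclosure": enc,
        })
    return paths


def components(paths):
    """Every distinct thing that can fail."""
    return sorted({name for path in paths for name in path.values()})


def host_reaches_storage(host, paths, failed):
    """True if the host still has one path with no failed component.

    With software mirroring, reaching either enclosure is enough.
    """
    return any(
        path["host"] == host and failed.isdisjoint(path.values())
        for path in paths
    )


def single_failures_survived(paths):
    """Map each component to whether all surviving hosts keep storage."""
    result = {}
    for comp in components(paths):
        failed = {comp}
        survivors = [h for h in HOSTS if h not in failed]
        result[comp] = all(
            host_reaches_storage(h, paths, failed) for h in survivors
        )
    return result
```

Calling single_failures_survived(build_paths()) should report every single
failure as survivable under these assumptions. Calling it with
build_paths(["encA"]) instead shows the lone enclosure turning into a
single point of failure, which is the reason for the second storage unit.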
Then, partition the mirror set (you could also partition at the external
storage, but then you have to update the partitions on each storage) and
lay GFS down.

At this point, you don't lose connectivity to your storage if a cable, an
HBA, an enclosure, a controller, or a computer goes down. Of course, the
controllers will handle RAID 5 or RAID 6, so you won't lose even a single
path in case of HD failure. GFS should allow concurrent access --
possibly even with multiple r/w mounts. ext2/3, jfs, xfs, reiserfs, and
even reiser4 are not cluster-aware, so they will only work properly in a
configuration with multiple r/o mounts *OR* a single r/w mount.

> - Load balancing. Tasks should migrate between nodes.

HP's ServiceGuard for linux is the only software I know that will do this
(I'm *sure* there are other commercial solutions), and there is still
some small amount of downtime when a task migrates, so migrations aren't
generated automatically.

Also, some software (IIRC, WebLogic) is able to exist in a clustered
environment with some method to sync state across individual nodes
(possibly using the clustered FS) so that instead of
jobs/packages/daemons/tasks migrating, it just runs on all nodes all the
time.

The second option (a cluster-aware program) is usually preferable, because
the program itself is better at determining what state needs to be shared,
so you get less inter-node communication and less downtime in case a node
fails. *However*, an external failover/load-balancer may either be your
only solution (if you are already attached to a certain, non-cluster-aware
program) or provide better behavior in the case the program is buggy
(especially if its failure mode corrupts and/or brings down other nodes).

> - Redundancy, so that the death of a machine doesn't take the cluster
> or any processes down.

I believe there's a userland implementation of CARP that may work for
linux.
It allows 2 (or more) machines on the same network to share an IP and
fail over and/or load-balance the handling of packets directed to that IP.

> So, anyone doing linux clusters?

Not personally, but I was looking into them some during my last job.
(Trying to get a customer to switch to linux.)

-- 
"If there's one thing we've established over the years,
it's that the vast majority of our users don't have the slightest
clue what's best for them in terms of package stability."
-- Gentoo Developer Ciaran McCreesh