From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
	by nuthatch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-user+bounces-45238-garchives=archives.gentoo.org@gentoo.org>)
	id 1FwYUK-0004iT-1L
	for garchives@archives.gentoo.org; Sat, 01 Jul 2006 05:57:52 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
	by robin.gentoo.org (8.13.7/8.13.6) with SMTP id k615uZsk016631;
	Sat, 1 Jul 2006 05:56:35 GMT
Received: from ender.volumehost.net (adsl-69-154-123-202.dsl.fyvlar.swbell.net [69.154.123.202])
	by robin.gentoo.org (8.13.7/8.13.6) with ESMTP id k615q4fd032194
	for <gentoo-user@lists.gentoo.org>; Sat, 1 Jul 2006 05:52:04 GMT
Received: from localhost (localhost [127.0.0.1])
	by ender.volumehost.net (Postfix) with ESMTP id D407C15B94
	for <gentoo-user@lists.gentoo.org>; Sat,  1 Jul 2006 05:52:03 +0000 (UTC)
Received: from ender.volumehost.net ([127.0.0.1])
 by localhost (ender.volumehost.net [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id 25642-09 for <gentoo-user@lists.gentoo.org>;
 Sat,  1 Jul 2006 05:52:02 +0000 (UTC)
Received: from monster.krakrjak.com (ip70-178-219-141.ks.ks.cox.net [70.178.219.141])
	(using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by ender.volumehost.net (Postfix) with ESMTP id 141E715B59
	for <gentoo-user@lists.gentoo.org>; Sat,  1 Jul 2006 05:52:02 +0000 (UTC)
From: "Boyd Stephen Smith Jr." <bss03@volumehost.net>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Linux Cluster
Date: Sat, 1 Jul 2006 00:51:36 -0500
User-Agent: KMail/1.9.3
References: <b9e0c3fe0605251213n18aefb5di921b2f78debc06b1@mail.gmail.com>
In-Reply-To: <b9e0c3fe0605251213n18aefb5di921b2f78debc06b1@mail.gmail.com>
X-Eric-Conspiracy: There is no conspiracy
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/signed;
  boundary="nextPart1237024.TSJEmm9GJm";
  protocol="application/pgp-signature";
  micalg=pgp-sha1
Content-Transfer-Encoding: 7bit
Message-Id: <200607010051.44525.bss03@volumehost.net>
X-Virus-Scanned: amavisd-new at volumehost.net
X-Archives-Salt: b8fa716b-dd40-4efb-9344-8abdef5acf1b
X-Archives-Hash: 6e317b968e340366a966a6339518a8fc

--nextPart1237024.TSJEmm9GJm
Content-Type: text/plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Thursday 25 May 2006 14:13, "Bruno Lustosa" <bruno.lists@gmail.com>=20
wrote about '[gentoo-user] Linux Cluster':
> - Distributed filesystem, so that all machines can share the same
> filesystem. Something like RAID-over-ethernet.

You probably want RH's GFS (there are probably other cluster-aware=20
filesystems available for linux that I'm not aware of) and some sort of=20
external storage that allows you to hook two machines to it.  You might=20
also look into multipathing, which would help in case of a cable failure.

=46or maximum availability, you want your enclosure to have two scsi disk=20
controllers, each with two separate scsi ports (these ports are on=20
different chains).  You'll hook each of the two computers in the cluster=20
to one port on each controller and then use multipathing to tell linux=20
both scsi paths are the same device.  You'll have a second external=20
storage connected the same way and use software mirroring across the=20
two.  Then, partition the mirror set (you could also partition at the=20
external storage, but then you have to update the partitions on each=20
storage) and lay GFS down.

At this point, you don't lose connectivity to your storage if a cable, an=20
hba, an enclosure, a controller, or a computer goes down.  Of course, the=20
controllers will handle RAID 5 or RAID 6 so you won't lose even a single=20
path in case of HD failure.  GFS should allow concurrent access --=20
possibly even with multiple r/w mounts.  ext2/3, jfs, xfs, reiserfs, and=20
even reiser4 are not cluster aware so they will only work properly in the=20
configuration with multiple r/o mounts *OR* a single r/w mount.
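
To sanity-check the single-failure claim above, here is a minimal Python
sketch of the layout.  It is a toy model, not output from any real tool:
the node and enclosure names are made up, and it assumes one HBA and one
cable per host-to-controller link, which the description above does not
spell out.  A host counts as having storage if at least one path to
either mirrored enclosure is fully intact.

```python
from itertools import product

# Hypothetical names, chosen only for illustration.
HOSTS = ["node1", "node2"]
ENCLOSURES = ["enc1", "enc2"]
CONTROLLERS = ["a", "b"]


def build_paths():
    """Return one path per (host, enclosure, controller) combination.

    Each path lists every component that must work for that host to
    reach that enclosure.  Assumes one HBA and one cable per link.
    """
    paths = []
    for host, enc, ctl in product(HOSTS, ENCLOSURES, CONTROLLERS):
        link = f"{host}-{enc}{ctl}"
        paths.append({
            "host": host,
            "parts": {host, f"hba:{link}", f"cable:{link}",
                      f"ctl:{enc}{ctl}", enc},
        })
    return paths


def hosts_with_storage(paths, failed):
    """Hosts that are up and still have one fully working path."""
    return {p["host"] for p in paths if not (p["parts"] & failed)}


def single_failures_that_cut_storage(paths):
    """List single components whose failure strands a surviving host."""
    components = set().union(*(p["parts"] for p in paths))
    bad = []
    for part in sorted(components):
        # If the failed part is a host, only the other host must survive.
        survivors = set(HOSTS) - {part}
        if hosts_with_storage(paths, {part}) != survivors:
            bad.append(part)
    return bad
```

For the dual-controller, dual-enclosure layout,
single_failures_that_cut_storage(build_paths()) should come back empty.
Keep only the paths through one controller of one enclosure and the same
check starts naming components.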

> - Load balancing. Tasks should migrate between nodes.

HP's ServiceGuard for linux is the only software I know that will do this=20
(I'm *sure* there are other commercial solutions), and there is still=
=20
some small amount of downtime when a task migrates, so migrations aren't=20
triggered automatically.

Also, some software (IIRC, WebLogic) is able to exist in a clustered=20
environment with some method to sync state across individual nodes=20
(possibly using the clustered FS) so that instead of=20
jobs/packages/daemons/tasks migrating it just runs on all nodes all the=20
time.

The second option (a cluster-aware program) is usually preferable, because=
=20
the program itself is better at determining what state needs to be shared,=
=20
so you get less inter-node communication and less downtime in case a node=20
fails.  *However*, an external failover/load-balancer may either be your=20
only solution (if you are already attached to a certain, non-cluster-aware=
=20
program) or provide better behavior in the case the program is buggy=20
(especially if its failure mode corrupts and/or brings down other nodes).

> - Redundancy, so that the death of a machine doesn't take the cluster
> or any processes down.

I believe there's a userland implementation of the CARP protocol that may=20
work for linux.  It allows 2 (or more) machines on the same network to=20
share an IP and failover and/or load-balance handling packets directed to=20
that IP.
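
The failover half of that idea can be sketched in a few lines of Python.
This is only a toy model of "the preferred live machine owns the shared
IP", not the CARP wire protocol or any real implementation; the Node
class and its skew field are hypothetical names invented for the
example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    skew: int          # hypothetical priority knob; lower wins
    alive: bool = True


def owner_of_shared_ip(nodes):
    """Return the live node with the lowest skew, or None."""
    live = [n for n in nodes if n.alive]
    return min(live, key=lambda n: n.skew, default=None)
```

Marking the current owner as not alive and calling owner_of_shared_ip
again hands the address to the next-lowest skew, which is the behaviour
you want from the backup machine.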

> So, anyone doing linux clusters?

Not personally, but I was looking into them some during my last job. =20
(Trying to get a customer to switch to linux.)

=2D-=20
"If there's one thing we've established over the years,
it's that the vast majority of our users don't have the slightest
clue what's best for them in terms of package stability."
=2D- Gentoo Developer Ciaran McCreesh

--nextPart1237024.TSJEmm9GJm
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (GNU/Linux)

iD8DBQBEpg1wq72nDbhDXToRAiklAKCHOKIfKEIhsY3f0WZ6g8YEQ3+X9wCff29z
Ws0jBxltWJSKmpESwl/e2Cw=
=Gy7S
-----END PGP SIGNATURE-----

--nextPart1237024.TSJEmm9GJm--
-- 
gentoo-user@gentoo.org mailing list