From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-cluster+bounces-420-garchives=archives.gentoo.org@gentoo.org>)
	id 1JADcc-0004xN-Jv
	for garchives@archives.gentoo.org; Thu, 03 Jan 2008 00:07:43 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
	by robin.gentoo.org (8.14.2/8.14.0) with SMTP id m0306l81024995;
	Thu, 3 Jan 2008 00:06:47 GMT
Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.176])
	by robin.gentoo.org (8.14.2/8.14.0) with ESMTP id m0306jp7024990
	for <gentoo-cluster@lists.gentoo.org>; Thu, 3 Jan 2008 00:06:46 GMT
Received: by wa-out-1112.google.com with SMTP id k34so9255114wah.10
        for <gentoo-cluster@lists.gentoo.org>; Wed, 02 Jan 2008 16:06:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:received:received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references;
        bh=VC007gWMVk62KMNdzOMyMT/DW6HU3iJu3lnH6HWNNeI=;
        b=l+5UCka5qbeboF63vyrn8p8AEUrySTOFBlGY8t3LBAWZIB8+Iq0DfftpIgOAH8pVYGaOWe6zV2ytBg3Kzb6OmTq+96rAr6tWpG0+Tmz+aVHp8b2fXPtaWbb04BKehXNIj4Uqc94Z+DS+Y5KHJ7Pr3jKv1Sbds+Cx9eXL5K91e2g=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references;
        b=I1hhwPO0fAJTmJ9i/WYN8gyJ+Ppog/g8ICXerpFXHlSSZ/xKqql/zC5crHVP6+mlJo9TbKxO/w/G2sm2XSOW3q90l1cZrjX4fYkMzXE16t4u0is1+1fMSJIdh52rzIJdyM16ukogrwz92DjsUifKNaqukfmeXlIKJI9Od3xJ1Ng=
Received: by 10.114.159.1 with SMTP id h1mr2957133wae.122.1199318805435;
        Wed, 02 Jan 2008 16:06:45 -0800 (PST)
Received: by 10.114.110.16 with HTTP; Wed, 2 Jan 2008 16:06:45 -0800 (PST)
Message-ID: <5b7094580801021606y6a65804ck115731926f2ba0a8@mail.gmail.com>
Date: Wed, 2 Jan 2008 16:06:45 -0800
From: "Brian Budge" <brian.budge@gmail.com>
To: gentoo-cluster@lists.gentoo.org
Subject: Re: [gentoo-cluster] openib, no /dev/infiniband
In-Reply-To: <20080102221153.1DE4E2391C5@ece06.nas.nasa.gov>
Precedence: bulk
List-Post: <mailto:gentoo-cluster@lists.gentoo.org>
List-Help: <mailto:gentoo-cluster+help@gentoo.org>
List-Unsubscribe: <mailto:gentoo-cluster+unsubscribe@gentoo.org>
List-Subscribe: <mailto:gentoo-cluster+subscribe@gentoo.org>
List-Id: Gentoo Linux mail <gentoo-cluster.gentoo.org>
X-BeenThere: gentoo-cluster@gentoo.org
Reply-to: gentoo-cluster@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/alternative; 
	boundary="----=_Part_35661_480033.1199318805422"
References: <5b7094580801021339n22db7c35y8580642c784d2c17@mail.gmail.com>
	 <20080102221153.1DE4E2391C5@ece06.nas.nasa.gov>
X-Archives-Salt: 0c2e22bd-ed96-4f26-9cb7-a0cf733c6041
X-Archives-Hash: c6bc7ccc028be837a2416782de8f9564

------=_Part_35661_480033.1199318805422
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hi Bryan -

I don't have a 40-ib.rules file in /etc/udev/rules.d on any node.
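
To double-check that on every node, something like the following could
be run. It is only a rough Python sketch: the function name
find_ib_rules is made up for illustration, and it assumes that any rule
creating /dev/infiniband would mention "infiniband" somewhere in a
*.rules file, which may not hold for every udev setup.

```python
from pathlib import Path


def find_ib_rules(rules_dir="/etc/udev/rules.d"):
    """Return names of *.rules files whose text mentions 'infiniband'."""
    hits = []
    root = Path(rules_dir)
    if not root.is_dir():
        return hits  # no rules directory at all on this node
    for path in sorted(root.glob("*.rules")):
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than fail
        if "infiniband" in text.lower():
            hits.append(path.name)
    return hits
```

Calling find_ib_rules() with no argument checks the default directory;
an empty list means no rules file there mentions infiniband.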

My /sys/class/infiniband directory contains mthca0, which contains:
> ls -la /sys/class/infiniband/mthca0/
total 0
drwxr-xr-x 3 root root    0 Jan  2 20:54 .
drwxr-xr-x 3 root root    0 Jan  2 20:54 ..
-r--r--r-- 1 root root 4096 Jan  2 21:07 board_id
lrwxrwxrwx 1 root root    0 Jan  3 00:01 device -> ../../../devices/pci0000:20/0000:20:0a.0/0000:21:00.0
-r--r--r-- 1 root root 4096 Jan  2 21:07 fw_ver
-r--r--r-- 1 root root 4096 Jan  2 21:07 hca_type
-r--r--r-- 1 root root 4096 Jan  2 21:07 hw_rev
-rw-r--r-- 1 root root 4096 Jan  2 21:07 node_desc
-r--r--r-- 1 root root 4096 Jan  2 21:07 node_guid
-r--r--r-- 1 root root 4096 Jan  2 21:06 node_type
drwxr-xr-x 3 root root    0 Jan  2 21:07 ports
lrwxrwxrwx 1 root root    0 Jan  3 00:01 subsystem -> ../../../class/infiniband
-r--r--r-- 1 root root 4096 Jan  2 21:07 sys_image_guid
--w------- 1 root root 4096 Jan  2 20:54 uevent
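
Since the nodes are supposed to be identical, comparing the contents of
these attribute files between a working node and the broken one might
show a difference (firmware version, for example). A minimal Python
sketch for collecting them, assuming the attribute names from the
listing above; read_hca_attrs is a hypothetical helper name:

```python
from pathlib import Path

# Attribute files taken from the directory listing above.
ATTRS = ("board_id", "fw_ver", "hca_type", "hw_rev", "node_guid", "node_type")


def read_hca_attrs(sysfs_root="/sys/class/infiniband"):
    """Map each HCA name (e.g. mthca0) to a dict of its attribute values."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result  # no InfiniBand class directory on this node
    for hca in sorted(root.iterdir()):
        attrs = {}
        for name in ATTRS:
            try:
                attrs[name] = (hca / name).read_text().strip()
            except OSError:
                attrs[name] = None  # attribute missing or unreadable
        result[hca.name] = attrs
    return result
```

Printing read_hca_attrs() on each node and diffing the results by hand
would be one way to spot a mismatch.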

I don't have any ib modules loaded on any node.  All of the InfiniBand
support is built into the kernel rather than built as modules:

CONFIG_INFINIBAND=y
CONFIG_INFINIBAND_USER_MAD=y
CONFIG_INFINIBAND_USER_ACCESS=y
CONFIG_INFINIBAND_USER_MEM=y
CONFIG_INFINIBAND_ADDR_TRANS=y
CONFIG_INFINIBAND_MTHCA=y
CONFIG_INFINIBAND_MTHCA_DEBUG=y
# CONFIG_INFINIBAND_IPATH is not set
CONFIG_INFINIBAND_AMSO1100=y
# CONFIG_INFINIBAND_AMSO1100_DEBUG is not set
CONFIG_MLX4_INFINIBAND=y
CONFIG_INFINIBAND_IPOIB=y
# CONFIG_INFINIBAND_IPOIB_CM is not set
CONFIG_INFINIBAND_IPOIB_DEBUG=y
# CONFIG_INFINIBAND_IPOIB_DEBUG_DATA is not set
# CONFIG_INFINIBAND_SRP is not set
# CONFIG_INFINIBAND_ISER is not set
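
The nodes should all be running the same kernel, but to rule out a
config mismatch the InfiniBand options could be compared mechanically.
This is only a sketch in Python: parse_ib_config and diff_configs are
hypothetical helper names, and it assumes the config text from each
node has already been saved somewhere and read into a string.

```python
import re

# Matches any option whose name contains INFINIBAND, e.g. CONFIG_MLX4_INFINIBAND.
_SET = re.compile(r"^(CONFIG_\w*INFINIBAND\w*)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w*INFINIBAND\w*) is not set$")


def parse_ib_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

An empty dict from diff_configs means the InfiniBand options match
between the two nodes.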


Thanks,
  Brian

On Jan 2, 2008 2:11 PM, Bryan Green <bryan.d.green@nasa.gov> wrote:

> "Brian Budge" writes:
> >
> > Hi all -
> >
> > I'm new to infiniband and still getting my feet wet.  I am admining a very
> > small cluster of 5 nodes, and have recently installed infiniband HCAs.  I
> > have the infiniband modules built into the kernel, and I am using the
> > openib-userspace package in the gentoo-science overlay.
> >
> > The strange thing with my situation is that I have infiniband working with
> > openmpi on 4 of my 5 nodes, but the 5th one is a mystery.
> >
> > All 4 working nodes have a /dev/infiniband directory that look roughly like
> > this:
> >
> > crw-rw---- 1 root root 231,  64 Dec 31 09:13 issm0
> > crw-rw-rw- 1 root root 231, 224 Dec 31 09:13 ucm0
> > crw-rw---- 1 root root 231,   0 Dec 31 09:13 umad0
> > crw-rw-rw- 1 root root 231, 192 Dec 31 09:13 uverbs0
> >
> >
> > But the 5th node doesn't, which could indicate the problem (it isn't
> > completely the problem, as I tried making those nodes myself to match, but
> > it doesn't help).  I'm just not sure what the difference is, because I
> > installed them all the same way, they all have the same hardware, and they
> > are all running the same kernel.
>
> The '/dev/infiniband' subdir is created by the udev rules in
> '/etc/udev/rules.d/40-ib.rules'
>
> Does the '/sys/class/infiniband' directory exist?
> If so, what does it contain?  What loaded modules with an 'ib_' prefix does
> lsmod report?
>
> -bryan
>
> --
> gentoo-cluster@gentoo.org mailing list
>
>

------=_Part_35661_480033.1199318805422--