public inbox for gentoo-cluster@lists.gentoo.org
From: "Brian Budge" <brian.budge@gmail.com>
To: gentoo-cluster@lists.gentoo.org
Subject: [gentoo-cluster] openib, no /dev/infiniband
Date: Wed, 2 Jan 2008 13:39:37 -0800
Message-ID: <5b7094580801021339n22db7c35y8580642c784d2c17@mail.gmail.com>

Hi all -

I'm new to InfiniBand and still getting my feet wet.  I administer a very
small cluster of 5 nodes and have recently installed InfiniBand HCAs.  I
have the InfiniBand modules built into the kernel, and I am using the
openib-userspace package from the gentoo-science overlay.

The strange thing with my situation is that I have infiniband working with
openmpi on 4 of my 5 nodes, but the 5th one is a mystery.

All 4 working nodes have a /dev/infiniband directory that looks roughly like
this:

crw-rw---- 1 root root 231,  64 Dec 31 09:13 issm0
crw-rw-rw- 1 root root 231, 224 Dec 31 09:13 ucm0
crw-rw---- 1 root root 231,   0 Dec 31 09:13 umad0
crw-rw-rw- 1 root root 231, 192 Dec 31 09:13 uverbs0


The 5th node has no /dev/infiniband directory, which could point to the
problem.  It isn't the whole problem, though: I tried creating those device
nodes by hand to match, and it doesn't help.  I'm just not sure what the
difference is, because I installed all the nodes the same way, they all have
the same hardware, and they are all running the same kernel.
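
Creating the nodes by hand amounts to something like the sketch below.  The
major/minor numbers and modes are copied from the listing of a working node
above; the function name print_ib_mknod is hypothetical, and it only prints
the mknod commands instead of running them (running them needs root).  As
noted, the nodes alone did not fix the 5th machine.

```shell
# Print (do not run) the mknod commands that would recreate the
# /dev/infiniband listing from a working node.
# Assumption: major 231 and the minors/modes below are taken verbatim
# from the ls output above; they may differ on other kernels.
print_ib_mknod() {
    dir=${1:-/dev/infiniband}
    # columns: name, minor, octal mode
    while read -r name minor mode; do
        echo "mknod -m $mode $dir/$name c 231 $minor"
    done <<EOF
issm0 64 660
ucm0 224 666
umad0 0 660
uverbs0 192 666
EOF
}
```

Calling print_ib_mknod with no argument targets /dev/infiniband; the output
can be reviewed and then run as root after creating the directory.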

The contents of /sys/class/infiniband are the same on all 5 nodes.
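
One way to compare nodes a little further is to print the major:minor pair
that sysfs reports for each character device and check it against the
/dev/infiniband listing.  This is only a sketch: the helper name
sysfs_devnums is hypothetical, and the class directories in the usage note
(infiniband_verbs, infiniband_mad) are an assumption about where the kernel
exposes uverbs0 and umad0, not something confirmed here.

```shell
# Print "<device> <major:minor>" for every entry in a sysfs class
# directory that has a "dev" attribute file.
sysfs_devnums() {
    class_dir=$1
    for f in "$class_dir"/*/dev; do
        [ -r "$f" ] || continue   # glob matched nothing, or unreadable
        name=$(basename "$(dirname "$f")")
        echo "$name $(cat "$f")"
    done
}
```

For example, sysfs_devnums /sys/class/infiniband_verbs and
sysfs_devnums /sys/class/infiniband_mad could be run on a working node and
on the failing one, and the results compared.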

Here's the mpirun command I am running, followed by its output:

mpirun -np 2 -mca btl self,openib -machinefile burn_machine_file ./loadtest
[burn-3][0,1,1][btl_openib_component.c:437:init_one_hca] error obtaining
device context for mthca0 errno says No such file or directory

--------------------------------------------------------------------------
WARNING: There were errors during IB HCA initialization on host 'burn-3'.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There is at least on IB HCA found on host 'burn-3', but there is
no active ports detected. This is most certainly not what you wanted.
Check your cables and SM configuration.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------

Any help would be appreciated!  Thanks.

  Brian
