              Scalable Subnet Administration (SSA)
                  Version 0.0.9 Release Notes
                          August 2015

===============================================================================
Table of Contents
===============================================================================
1. Overview
2. Building and Installing
3. Known Limitations/Issues

===============================================================================
1. Overview
===============================================================================
These are the release notes for Scalable Subnet Administration (SSA)
release 0.0.9. SSA is composed of several user space software modules.

SSA forms a distribution tree with up to 4 layers. At the top of
the tree is the core layer which is coresident with the OpenSM.
Next layer in the tree is the distribution layer, which fans out to
the access layer. Consumer/compute node (ACMs) are at the lowest layer
of the tree. The size of the distribution tree is dependent on
the number of compute nodes.

SSA distributes the SM database down the distribution tree to the
access nodes. The access nodes compute the SA path record ("half-world")
database for their client (compute) nodes and populates the caches
in the ACMs. "Half-world" means paths from the client (compute) node
to every other node in the IB subnet.

New 0.0.9 features are kernel IP address support and admin support.
Kernel IP support allows for the IPv4 ARP and/or IPv6 neighbor caches
in the kernel to be prepopulated by SSA ACM.
admin support consists of ssadmin utility and admin support in SSA nodes.
The ssadmin utility is used to monitor, debug and configure the SSA
layers: core, distribution, access, and ACM.
0.0.9 also supports ConnectX-4 (as well as Connect-IB). This was
a limitation in the 0.0.8 release.

SSA has a Mellanox community page which has more information
(https://community.mellanox.com/docs/DOC-2136).


1.1 SSA 0.0.9 Prerequisites
---------------------------

1.1.1 Kernel 3.11 or later

SSA requires a kernel which contains AF_IB address family support.
This went into upstream kernel 3.11 so any kernel/distro 3.11 or
later is sufficient.

Some known distros with recent enough kernels for SSA:
Fedora Core (Rawhide, FC19 or later)
OpenSuSE 13.2 uses 3.16 going for 3.17
SLES 12.0 is 3.12 based
Ubuntu 14.04 is 3.13 based

Note that both RHEL 7.1 and RHEL 7.0 use 3.10 so these do not support SSA

Tested on:
Ubuntu 12.01.1
Kernel version: 3.12.0-031200-generic

Stable kernels with recent patch for SSA include 3.14, 3.18, and 3.19:
commit c2be9dc0e0fa59cc43c2c7084fc42b430809a0fe
Author: Ilya Nelkenbaum <ilyan@mellanox.com>
Date:   Thu Feb 5 13:53:48 2015 +0200

    IB/core: When marshaling ucma path from user-space, clear unused fields
    
    When marshaling a user path to the kernel struct ib_sa_path, we need
    to zero smac and dmac and set the vlan id to the "no vlan" value.
    
    This is to ensure that Ethernet attributes are not used with
    InfiniBand QPs.
    
    Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
    Signed-off-by: Ilya Nelkenbaum <ilyan@mellanox.com>
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>



1.1.2 libibumad 1.3.10.2

1.1.3 OpenSM 3.3.17 or later
If not running PerfMgr, OpenSM 3.3.17 or later is sufficient.
If running PerfMgr, OpenSM 3.3.19 is needed.

1.1.4 libibverbs 1.1.8

1.1.5 librdmacm 1.0.20 (AF_IB and keepalive support) or later

Note that librdmacm contains 4 AF_IB capable examples: rstream,
ucmatose, riostream, and udaddy.

1.1.6 User space library for HCAs

1.1.6.1 libmlx4 1.0.6

1.1.6.2 libmlx5 1.0.2

1.1.7 glib-2.0


1.2 SSA 0.0.9 Contents
----------------------
SSA contains the following 3 packages:
- OpenSM SSA 0.0.9 plugin (libopensmssa)
- ibssa 0.0.9 executable (for distribution and access nodes)
- ibacm 1.0.8.2 executable (for consumer/compute nodes)
- scripts and configuration files


1.3 OpenMPI with AF_IB Support
------------------------------
Not included with SSA so needs building and installing
Part of upcoming 2.0 release
On mainline of OpenMPI github tree
To build OpenMPI for use with SSA, configure as follows before building:
./configure --enable-openib-rdmacm-ibaddr --enable-mpirun-prefix-by-default --with-verbs=/usr/local --disable-openib-connectx-xrc


===============================================================================
2. Building and Installing
===============================================================================
On core nodes, libibumad, OpenSM, libibverbs, librdmacm, and HCA specific
library must be built and installed prior to libopensmssa.  

On distribution or access nodes, libibumad, libibverbs, librdmacm, and HCA
specific library must be built and installed prior to SSA. 

On consumer nodes, libibumad, libibverbs, librdmacm, and HCA specific
library must be built and installed prior to ACM.

Once the prerequisites are built and installed, the relevant SSA
tar ball(s) is/are then built and installed via:
./autogen.sh && ./configure && make && make install
in libopensmssa, distrib, and acm directories.


OpenSM (on core nodes) needs to be configured as follows in Opensm
configuration file (typically opensm.conf):
#
# Event Plugin Options
#
# Event plugin name(s)
event_plugin_name opensmssa

# Options string that would be passed to the plugin(s)
event_plugin_options (null)


SSA configuration is then performed as follows:
Core nodes have ibssa_core_opts.cfg file.
Distribution nodes have ibssa_opts.cfg file.
ACM/consumer nodes have ibacm_opts.cfg file. 
IP support is configured in core nodes via ibssa_hosts.data file.
Follow instructions in those files.

On ACM nodes, ib_acme can be run with -A and -O options
to generate ibacm_opts.cfg and ibacm_addr.data files
for that machine/system. This is only needed to be done
once (at initial install time).


===============================================================================
3. Known Limitations/Issues
===============================================================================
Only x86_64 processor architecture has been tested/qualified.

Only single P_Key (full default partition - 0xFFFF) currently supported.

Virtualization (alias GUIDs) and QoS (including QoS based routing - e.g.
torus-2QoS) is not currently supported.

Only rudimentary testing with qib (verified keepalives).

mlx4_core HW2SW_MPT -16 error requires update to recent firmware
(internal build 2.33.1230 or later, GA build 2.33.5000
for ConnectX-3 Pro).

If running with OpenSM PerfMgr, need OpenSM 3.3.19 or later.
Possible seg fault in PerfMgr was fixed there.

ACM is only tested in SSA acm_mode and not ACM acm_mode.

IP and name tables in ACM caches do not have records removed
currently; only added or replaced.

