The status for your InfiniBand Host Channel Adapter (HCA) can be found using the ‘ibstat’ command.
# ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.10.0
    Hardware version: 0
    Node GUID: 0x0002c9030031fdc0
    System image GUID: 0x0002c9030031fdc3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x0251486a
        Port GUID: 0x0002c9030031fdc1
        Link layer: InfiniBand
For proper operation, you are looking for ‘State: Active’ and ‘Physical state: LinkUp’.
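If you want to automate this check, the following is a minimal sketch that parses ibstat-style text and tests each port for the two values named above. The function names `parse_ports` and `port_is_healthy` and the shortened `SAMPLE` text are illustrative assumptions, not part of any InfiniBand tooling, and the parser assumes the output layout shown in this article.

```python
import re

# Shortened sample in the layout shown in this article (illustrative only).
SAMPLE = """\
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Port GUID: 0x0002c9030031fdc1
"""


def parse_ports(ibstat_text):
    """Return {(ca_name, port_number): {field: value}} for every port found."""
    ports = {}
    ca = None
    current = None
    for raw in ibstat_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line such as: CA 'mlx4_0' starts a new adapter section.
        m = re.match(r"CA '(.+)'$", line)
        if m:
            ca, current = m.group(1), None
            continue
        # A line such as: Port 1: starts a new port section.
        m = re.match(r"Port (\d+):$", line)
        if m:
            current = ports.setdefault((ca, int(m.group(1))), {})
            continue
        # Any other "key: value" line inside a port section is a port field.
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return ports


def port_is_healthy(fields):
    """True when the port shows State: Active and Physical state: LinkUp."""
    return (fields.get("State") == "Active"
            and fields.get("Physical state") == "LinkUp")


if __name__ == "__main__":
    for (ca_name, port), fields in parse_ports(SAMPLE).items():
        status = "OK" if port_is_healthy(fields) else "CHECK"
        print(ca_name, port, status)
```

On a real node you could feed the parser live output instead of `SAMPLE`, for example with `subprocess.run(["ibstat"], capture_output=True, text=True).stdout`, assuming ibstat is installed and on the PATH.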
The physical state field indicates the state of the cable. This is very similar to the link state on Ethernet. The values you’ll see in this field are as follows:
Polling: There is no connection from this card to another card or switch. Check that the cable is installed and that the device on the other end of the cable is on and working properly.
LinkUp: There is a link between this node and the device at the other end of the cable. This doesn’t mean the port is configured and ready to send data, just that the physical connection is up.
The state field shows whether the HCA port is up and whether it has been discovered by the subnet manager. The values you’ll see in this field are as follows:
Down: There is no physical connection between the HCA card in this node and the device at the other end of the cable. This is almost always seen when ‘Physical state’ shows the value ‘Polling’.
Initializing: A physical connection has been made between the HCA in this node and the device at the other end of the cable, but the port hasn’t been discovered by the subnet manager. Make sure you have a managed switch or, more likely, that the ‘opensm’ process is running on a node in your cluster.
Active: The physical connection is up and working, and the port has been discovered by the subnet manager. The port is in a normal operational state.
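The troubleshooting advice above can be sketched as a small lookup. This is only an illustration: the function name `diagnose` is hypothetical, the value names ‘Down’, ‘Initializing’ and ‘Polling’ are assumed to be what ibstat prints for the cases described, and any combination this article does not cover is reported as unhandled rather than guessed at.

```python
def diagnose(state, physical_state):
    """Map ibstat's State and Physical state values to the advice in this article."""
    if physical_state == "Polling":
        # No physical link; State is almost always Down in this case.
        return "no link: check the cable and that the remote device is on and working"
    if physical_state == "LinkUp":
        if state == "Initializing":
            # Link is up but the subnet manager has not discovered the port.
            return ("link up, no subnet manager: check for a managed switch "
                    "or a running opensm process")
        if state == "Active":
            return "ok: port is in a normal operational state"
    # Other values exist but are outside the cases described here.
    return "unhandled: state=%s, physical state=%s" % (state, physical_state)


if __name__ == "__main__":
    print(diagnose("Down", "Polling"))
    print(diagnose("Initializing", "LinkUp"))
    print(diagnose("Active", "LinkUp"))
```

Checking the physical state first mirrors the order you would troubleshoot in: there is no point looking for a subnet manager until the cable shows a link.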
The rate is the speed at which the port is operating, in Gb/s. It should match the speed of the slowest device on the link between the node’s HCA and the device at the other end of the cable. For example, if you have a QDR card and a DDR switch, the link will run at DDR speed, not QDR.
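As a rough sketch of the "slowest device wins" rule, the snippet below computes the rate you would expect to see for a given pair of devices. It assumes 4x links and covers only three generations; the table `RATE_4X_GBPS` and the function names are illustrative, not taken from any InfiniBand tool.

```python
# Nominal 4x link rates in Gb/s (assumption: all links are 4x wide).
RATE_4X_GBPS = {"SDR": 10, "DDR": 20, "QDR": 40}


def expected_rate(*generations):
    """Return the rate of the slowest device on the link, in Gb/s."""
    return min(RATE_4X_GBPS[g] for g in generations)


def rate_matches(reported_rate, *generations):
    """Compare the Rate value from ibstat with the expected rate."""
    return int(reported_rate) == expected_rate(*generations)


if __name__ == "__main__":
    # QDR card plugged into a DDR switch: the link should run at DDR speed.
    print(expected_rate("QDR", "DDR"))
    print(rate_matches("40", "QDR", "DDR"))
```

A mismatch between the reported and expected rate is only a hint to go and look at the cable and the devices on the link; it does not say which of them is at fault.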