Questions about RDMA tunneled atomics


I am currently reading the RDMA development manual and found an advanced feature called Tunneled Atomic: https://docs.nvidia.com/networking/display/rdmacore50/tunneled+atomic. However, the explanation on that page is vague and hard to understand, and there seems to be no detailed information about this feature elsewhere on the internet.

I wonder if anyone could provide more detailed information about this operation?

Chester Gillon

This answer records an investigation into what the Tunneled Atomic operation does.

I downloaded MLNX_OFED_SRC-4.9-7.1.0.0.tgz from Linux Drivers, which contains the sources from the NVIDIA MLNX_OFED Download Center, and manually unpacked some of the source RPMs (SRPMS) to look at the code and supporting comments.

libibverbs-41mlnx1-OFED.4.9.3.0.0.49710.src.rpm, which contains the source for the user-space ibverbs library, has:

  1. The C source code has IBV_EXP_ACCESS_TUNNELED_ATOMIC, IBV_EXP_DEVICE_ATTR_TUNNELED_ATOMIC, IBV_EXP_TUNNELED_ATOMIC_SUPPORTED and tunneled_atomic_caps. However, I was unable to find any comments explaining what they do.

  2. man/ibv_exp_reg_mr.3 which is the source for the man page for the ibv_exp_reg_mr function:

    if IBV_EXP_ACCESS_TUNNELED_ATOMIC is set, part of the MSB's of the address is set to a predefined value in order to generate a command in the CPU and do atomic for RDMA WR operations in the memory controller instead of accessing the user's memory (DMA). That is saying IBV_EXP_ACCESS_REMOTE_WRITE must be set as well to support this opration from remote side. This operation is only eligible when tunneled_atomic_caps from ibv_exp_query_device has bit mask IBV_EXP_TUNNELED_ATOMIC_SUPPORTED set.

    The above gives a bit more information than the Tunneled Atomic page in the online OFED documentation.
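To make the man page wording "part of the MSB's of the address is set to a predefined value" more concrete, here is a minimal sketch of what tagging the upper bits of an address could look like. The 16-bit field width, its position in bits 48..63, and the helper names are all assumptions made for illustration; none of them come from the OFED sources.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL layout, for illustration only: assume the top 16 bits of a
 * 64-bit bus address are reserved for the "tunnel" match value. The real
 * bit positions and match value are chosen by the platform and are not
 * documented in the sources examined in this answer. */
#define TUNNEL_SHIFT 48
#define TUNNEL_MASK  (UINT64_C(0xFFFF) << TUNNEL_SHIFT)

/* Stamp the match value into the most significant bits of an address. */
uint64_t tunnel_tag_address(uint64_t addr, uint16_t match_value)
{
    /* The plain address must not already use the reserved bits. */
    assert((addr & TUNNEL_MASK) == 0);
    return addr | ((uint64_t)match_value << TUNNEL_SHIFT);
}

/* Would a (hypothetical) host bridge treat this access as a tunneled
 * command rather than as an ordinary DMA access to memory? */
int tunnel_matches(uint64_t bus_addr, uint16_t match_value)
{
    return (bus_addr >> TUNNEL_SHIFT) == match_value;
}

/* Recover the original address by clearing the tag bits. */
uint64_t tunnel_untag_address(uint64_t bus_addr)
{
    return bus_addr & ~TUNNEL_MASK;
}
```

Under these assumptions, tagging 0x1000 with the match value 0xABCD gives 0xABCD000000001000. A bridge that compares the upper bits against the match value could then divert that write to the memory controller as a command, instead of performing it as DMA to user memory.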

mlnx-ofa_kernel-4.9-OFED.4.9.7.1.0.1.src.rpm contains the source for the OFED kernel drivers. They contain references to tunneled atomics, but again I couldn't find any comments explaining what they do.

The kernel source file mlnx-ofa_kernel-4.9/drivers/net/ethernet/mellanox/mlx5/core/main.c has the following function, which contains the only call to set_tunneled_operation:

#ifdef HAVE_PNV_PCI_AS_NOTIFY
static void mlx5_as_notify_init(struct mlx5_core_dev *dev)
{
    struct pci_dev *pdev = dev->pdev;
    u32 log_response_bar_size;
    u64 response_bar_address;
    u64 asn_match_value;
    int err;

    if (!mlx5_core_is_pf(dev))
        return;

    if (!MLX5_CAP_GEN(dev, tunneled_atomic) &&
        !MLX5_CAP_GEN(dev, as_notify))
        return;

    err = pnv_pci_enable_tunnel(pdev, &asn_match_value);
    if (err)
        return;

    err = set_tunneled_operation(dev, 0xFFFF, asn_match_value, &log_response_bar_size, &response_bar_address);
    if (err)
        return;

    if (!MLX5_CAP_GEN(dev, as_notify))
        return;

    err = pnv_pci_set_tunnel_bar(pdev, response_bar_address, 1);
    if (err)
        return;

    dev->as_notify.response_bar_address = response_bar_address;
    dev->as_notify.enabled = true;
    mlx5_core_dbg(dev,
              "asn_match_value=%llx, log_response_bar_size=%x, response_bar_address=%llx\n",
              asn_match_value, log_response_bar_size, response_bar_address);
}
#endif

The above function, which references tunneled atomics and the tunneled operation, is compiled only when HAVE_PNV_PCI_AS_NOTIFY is defined. According to the build scripts, HAVE_PNV_PCI_AS_NOTIFY is defined only when the function pnv_pci_enable_tunnel is present in asm/pnv-pci.h, an include file inside /arch/powerpc/include, i.e. it is specific to the PowerPC architecture. From searching, I think pnv_pci_enable_tunnel is used to support the Coherent Accelerator Processor Interface (CAPI) on IBM's Power8 processors.
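The early returns in mlx5_as_notify_init make it easy to miss which steps run for which capability bits. The following is a small stand-alone model of that control flow, not driver code: the struct and function names are made up for illustration, and it assumes every kernel call succeeds.

```c
#include <stdbool.h>

/* Which steps of mlx5_as_notify_init are reached. Field names are
 * illustrative and do not exist in the driver. */
struct as_notify_steps {
    bool tunnel_enabled;   /* pnv_pci_enable_tunnel() was called  */
    bool operation_set;    /* set_tunneled_operation() was called */
    bool response_bar_set; /* pnv_pci_set_tunnel_bar() was called */
};

/* Model of the early-return logic, assuming no call returns an error. */
struct as_notify_steps model_as_notify_init(bool is_pf,
                                            bool cap_tunneled_atomic,
                                            bool cap_as_notify)
{
    struct as_notify_steps s = { false, false, false };

    if (!is_pf)                                 /* only the physical function */
        return s;
    if (!cap_tunneled_atomic && !cap_as_notify) /* neither capability present */
        return s;

    s.tunnel_enabled = true;
    s.operation_set = true;

    if (!cap_as_notify)                         /* tunneled atomic only */
        return s;

    s.response_bar_set = true;                  /* as_notify needs the BAR */
    return s;
}
```

In this model, a device that reports only tunneled_atomic still gets the PCI tunnel enabled and the tunneled operation set, but the response BAR is programmed only when as_notify is also reported.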

Summary

I haven't specifically answered the question of what RDMA tunneled atomics do, but they could be intended for use with CAPI, which itself is used in large data center computers.