I'm learning how to use RDMA via Inifniband and one problem I'm having is using a connection with more than 1 thread because I cant figure out how to create another completion queue so the work completions get mixed up between the threads and it craps out, how do I create a queue for each thread using the connection?
Take this vomit for example:
void worker(struct ibv_cq* cq){
while(conn->peer_mr.empty()) Sleep(1);
struct ibv_wc wc{};
struct ibv_send_wr wr{};
memset(&wr, 0, sizeof wr);
struct ibv_sge sge{};
sge.addr = reinterpret_cast<unsigned long long>(conn->rdma_memory_region);
sge.length = RDMA_BUFFER_SIZE;
sge.lkey = conn->rdma_mr->lkey;
wr.wr_id = reinterpret_cast<unsigned long long>(conn);
wr.opcode = IBV_WR_RDMA_READ;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
struct ibv_send_wr* bad_wr = nullptr;
while(true){
if(queue >= maxqueue) continue;
for(auto i = 0ULL; i < conn->peer_mr.size(); ++i){
wr.wr.rdma.remote_addr = reinterpret_cast<unsigned long long>(conn->peer_mr[i]->mr.addr) + conn->peer_mr[i]->offset;
wr.wr.rdma.rkey = conn->peer_mr[i]->mr.rkey;
const auto err = ibv_post_send(conn->qp, &wr, &bad_wr);
if(err){
std::cout << "ibv_post_send " << err << "\n" << "Errno: " << std::strerror(errno) << "\n";
exit(err);
}
++queue;
conn->peer_mr[i]->offset += RDMA_BUFFER_SIZE;
if(conn->peer_mr[i]->offset >= conn->peer_mr[i]->mr.length) conn->peer_mr[i]->offset = 0;
}
int ne;
do{
ne = ibv_poll_cq(cq, 1, &wc);
} while(!ne);
--queue;
++number;
}
}
If I had more than one of them they would all be receiving each others work completions, I want them to receive only their own and not those of other threads.
The completion queues are created somewhere outside of this code (you are passing in an
ibv_cq *). If you'd like to figure out how to create multiple ones, that's the area to focus on.However, the "crapping out" is not (just) happening because completions are mixed up between threads: the
ibv_poll_cqandibv_post_sendfunctions are thread safe. Instead, the likely problem is that your code isn't thread-safe: there are shared data structures that are accessed without locks (conn->peer_mr). You would have the same issues even without RDMA.The first step is to figure out how to split up the work into pieces. Think about the pieces that each thread will need to make it independent from the others. It'll likely be a single
peer_mr, a separateibv_cq *, and a specific chunk of yourrdma_mr. Then code that :)