I am writing a custom Linux driver that needs to DMA memory between multiple PCIe devices. I have the following situation:
- I'm using dma_alloc_coherent to allocate memory for DeviceA
- I then use DeviceA to fill the memory buffer.
Everything is fine so far but at this point I would like to DMA the memory to DeviceB and I'm not sure the proper way of doing it.
For now I am calling dma_map_single for DeviceB with the address returned by the dma_alloc_coherent call made for DeviceA. This seems to work fine on x86_64, but it feels like I'm breaking the rules because:
1. dma_map_single is supposed to be called with memory allocated from kmalloc ("and friends"). Is it a problem to call it with an address returned from another device's dma_alloc_coherent call?
2. If #1 is "ok", I'm still not sure whether it is necessary to call the dma_sync_* functions that dma_map_single memory normally requires. Since the memory was originally allocated with dma_alloc_coherent, it should be uncached memory, so I believe the answer is "the dma_sync_* calls are not necessary", but I am not sure.
I'm worried that I'm just getting lucky here and a future kernel update will break my driver, since it is unclear whether I'm following the API rules correctly. My code will eventually have to run on ARM and PPC too, so I need to do this in a platform-independent manner rather than relying on some x86_64 architecture quirk.
I'm using this as a reference: https://www.kernel.org/doc/html/latest/core-api/dma-api.html
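To make the question concrete, here is a stripped-down sketch of what I'm currently doing. Everything here (devA, devB, BUF_SIZE, the control flow) is a simplified placeholder for the real driver, not working code:

```c
#include <linux/dma-mapping.h>
#include <linux/device.h>

#define BUF_SIZE (64 * 1024)  /* placeholder size */

static int transfer_a_to_b(struct device *devA, struct device *devB)
{
	void *vaddr;
	dma_addr_t handleA, handleB;

	/* Coherent allocation for DeviceA; DeviceA fills it by DMA. */
	vaddr = dma_alloc_coherent(devA, BUF_SIZE, &handleA, GFP_KERNEL);
	if (!vaddr)
		return -ENOMEM;

	/* ... program DeviceA to write into handleA, wait for completion ... */

	/* The questionable part: streaming-map the same CPU address for DeviceB. */
	handleB = dma_map_single(devB, vaddr, BUF_SIZE, DMA_TO_DEVICE);
	if (dma_mapping_error(devB, handleB)) {
		dma_free_coherent(devA, BUF_SIZE, vaddr, handleA);
		return -EIO;
	}

	/* ... program DeviceB to read from handleB, wait for completion ... */

	dma_unmap_single(devB, handleB, BUF_SIZE, DMA_TO_DEVICE);
	dma_free_coherent(devA, BUF_SIZE, vaddr, handleA);
	return 0;
}
```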
dma_alloc_coherent() acts similarly to __get_free_pages(), but with byte-size granularity rather than whole pages, so I would not expect a problem there. Do check dma_mapping_error() after dma_map_single() to catch any platform-specific mapping failure. The dma_sync_*() helpers are used with streaming DMA mappings to keep the device and CPU views of the buffer in sync. At a minimum, dma_sync_single_for_cpu() is required, because a buffer the device has modified must be synced before the CPU uses it.
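A minimal sketch of that ownership handoff for the streaming mapping, assuming the buffer was mapped with DMA_BIDIRECTIONAL because both the device and the CPU touch it (devB, handleB, and BUF_SIZE are placeholders):

```c
/* Sketch only. The sync direction must match the direction passed to
 * dma_map_single(); DMA_BIDIRECTIONAL is assumed here. */

/* CPU is done writing; hand buffer ownership to DeviceB. */
dma_sync_single_for_device(devB, handleB, BUF_SIZE, DMA_BIDIRECTIONAL);

/* ... start DeviceB's DMA and wait for it to complete ... */

/* DeviceB is done; hand ownership back before the CPU reads the buffer. */
dma_sync_single_for_cpu(devB, handleB, BUF_SIZE, DMA_BIDIRECTIONAL);
```

On x86_64 these calls are often no-ops, which is why skipping them can appear to work; on non-coherent ARM and PPC systems they perform real cache maintenance, so portable code must include them.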