Are there any separate Texture Caches in AMD GPUs of xDNA architecture?


I have a testbench that captures data about several types of caches, including the texture cache. To verify that the captured data is correct, I would like to know whether AMD GPUs of the xDNA architectures (that is, RDNA and CDNA) have any kind of separate texture cache. I ask because separate texture load/store units and texture filtering units are clearly visible in several block diagrams of the compute unit (CU) across different versions of these architectures.

Block diagrams: RDNA 1, RDNA 2, RDNA 3, CDNA 1, CDNA 2

If they do not, where is texture data saved? In Local Data Share (LDS)?

Answer by huseyin tugrul buyukisik

Summary (this is only a guess): texture data can live in LDS, in L0, and optionally in an internal swizzle-optimized cache (or perhaps not a cache at all, just an address calculator).

One of the RDNA 3 diagrams shows a "texture filter unit" between the ray accelerator and L0. The ray accelerator sits next to the shared memory (LDS).

I'm not sure about this (I haven't tried it), but DirectX 12 or Vulkan presumably has some API that creates textures with a ray-tracing backend rather than just loading a buffer from video RAM. In that case, when you "load" a texel from the surface of a "mirror" object in a scene, it would actually compute BVH intersections between rays and triangles/quads instead of just reading a texture buffer.

The ray accelerator keeps the BVH node data in LDS. LDS is low-latency (2-3 cycles on RDNA 3).

For plain textures, filtering is optional too. Without filtering, the access bypasses the swizzling acceleration and goes directly to the L0 cache; on a miss, the data comes from L1, then L2. So uniform access should be faster than sampling. When sampling/filtering is required, however, the unit must either have an internal cache with a smaller cache line (to avoid wasting bandwidth and to increase the hit ratio) or at least be able to swizzle the cache banks of L0.
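
To make the bandwidth argument concrete, here is a toy model. It is only a sketch and not AMD's actual texture layout: it assumes a plain Morton (Z-order) swizzle, 4-byte texels, and a 128-byte cache line, and it counts how many distinct cache lines a 2x2 bilinear footprint touches with a row-major layout versus the swizzled one. All function names and constants are made up for illustration.

```python
TEXEL_BYTES = 4    # assumption: RGBA8 texels
LINE_BYTES = 128   # assumption: cache-line size used for this illustration


def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so a zero bit sits between each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton_index(x: int, y: int) -> int:
    """Texel index in Z-order: bits of x and y interleaved (x in the even bits)."""
    return part1by1(x) | (part1by1(y) << 1)


def linear_index(x: int, y: int, width: int) -> int:
    """Texel index in a plain row-major layout."""
    return y * width + x


def lines_touched(index_fn, x: int, y: int) -> int:
    """Number of distinct cache lines hit by the 2x2 footprint anchored at (x, y)."""
    footprint = [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]
    return len({index_fn(tx, ty) * TEXEL_BYTES // LINE_BYTES for tx, ty in footprint})


def average_lines(index_fn, width: int, height: int) -> float:
    """Average cache lines per bilinear fetch over every in-bounds footprint."""
    total = 0
    count = 0
    for y in range(height - 1):
        for x in range(width - 1):
            total += lines_touched(index_fn, x, y)
            count += 1
    return total / count


if __name__ == "__main__":
    W = H = 256
    linear = average_lines(lambda x, y: linear_index(x, y, W), W, H)
    swizzled = average_lines(morton_index, W, H)
    print(f"row-major: {linear:.3f} lines per 2x2 fetch")
    print(f"morton:    {swizzled:.3f} lines per 2x2 fetch")
```

In this model the row-major layout always needs at least two lines per filtered fetch, because vertically adjacent texels are a full row apart in memory, while the swizzled layout usually needs only one. That is the kind of saving a swizzle-aware path in front of L0 would be after, whatever the real hardware layout is.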