CPU cache lines are typically 64 bytes. When a CPU (say, a modern Intel processor) reads a cache line from memory, does it read only from 64-byte-aligned blocks of memory, or from any contiguous 64-byte block? What is the alignment behavior?
In other words, does the CPU only read blocks starting at addresses where the low 6 bits are zero? Or can a block start at some other alignment, or at any address?
The processor's memory subsystem loads an entire cache-line-aligned block of memory when any location within that block is read, provided the address maps to a cacheable location in the address space (e.g. ordinary RAM). With 64-byte lines, that block always starts at an address whose low 6 bits are zero. If the read is of more than one byte and crosses a cache line boundary, more than one cache line can be read from external memory. For example, a read of a 16-bit short int starting at the last byte of a cache line can cause two whole cache lines to be read from external memory (if neither is already cached).
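
To make the address arithmetic concrete, here is a minimal sketch in Python. It only models the alignment math, not real hardware behavior. The 64-byte line size is an assumption (it matches typical current x86 parts), and the names `line_base` and `lines_touched` are hypothetical helpers made up for this illustration.

```python
CACHE_LINE_SIZE = 64  # assumption: 64-byte lines, so the low 6 address bits are the offset


def line_base(addr, line_size=CACHE_LINE_SIZE):
    """Return the aligned start address of the cache line containing addr."""
    # line_size must be a power of two for the mask trick to work.
    return addr & ~(line_size - 1)


def lines_touched(addr, size, line_size=CACHE_LINE_SIZE):
    """Return the base addresses of every cache line covered by an access
    of `size` bytes starting at `addr`."""
    first = line_base(addr, line_size)
    last = line_base(addr + size - 1, line_size)
    return list(range(first, last + line_size, line_size))


if __name__ == "__main__":
    # A 2-byte read starting at the last byte of a line straddles two lines.
    print([hex(a) for a in lines_touched(0x103F, 2)])  # ['0x1000', '0x1040']
    # The same 2-byte read one byte earlier stays inside a single line.
    print([hex(a) for a in lines_touched(0x103E, 2)])  # ['0x1000']
```

Every base address returned has its low 6 bits clear, which is the alignment the question asks about. The straddling case corresponds to the 16-bit short int example above.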