Possible ways to avoid the expensive cache-flush problem when storing large amounts of data:
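On RISC CPUs, we can use cache-management instructions to create a dirty cache line without first reading it from memory, and to flush/store cache lines explicitly (on PowerPC, for example, dcbz for the former and dcbf/dcbst for the latter).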
On modern x86, besides using write-combining regions set up via the MTRRs, we can also use the MOVNTQ SIMD instruction to store data to memory. This instruction bypasses the on-chip cache and sends the data directly to a write-combining buffer. Because MOVNTQ lets the CPU avoid reading the old contents of the destination address from memory, it can effectively double the total write bandwidth. (Note that an SFENCE is required after the data is written, to flush the write buffer.) See: Using Block Prefetch for Optimized Memory Performance.
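A minimal sketch of the streaming-store idea in C, under some assumptions: MOVNTQ itself works on 64-bit MMX registers, so this version uses its 128-bit SSE2 counterpart MOVNTDQ through the `_mm_stream_si128` intrinsic. The helper name `stream_copy` is hypothetical, and on a target without SSE2 the sketch simply falls back to `memcpy`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#define HAVE_NT_STORES 1
#else
#define HAVE_NT_STORES 0
#endif

/* Hypothetical helper: copy n bytes from src to dst, using non-temporal
 * (cache-bypassing) stores for the 16-byte-aligned middle of the buffer. */
static void stream_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

#if HAVE_NT_STORES
    /* MOVNTDQ needs a 16-byte-aligned destination, so copy the
     * unaligned head bytes with ordinary stores first. */
    size_t head = (size_t)(-(uintptr_t)d & 15u);
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Streaming stores: data goes to the write-combining buffers
     * and the destination lines are not read into the cache. */
    for (; n >= 16; d += 16, s += 16, n -= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
    }

    /* SFENCE after the streaming stores, as the note above requires. */
    _mm_sfence();
#endif

    /* Remaining tail bytes (or the whole buffer without SSE2). */
    memcpy(d, s, n);
}
```

It is called like `memcpy`, for example `stream_copy(dst, src, len)`. Whether it actually helps depends on the workload: streaming stores tend to pay off for buffers much larger than the cache that will not be read back soon, and can be slower for small or immediately reused data.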