Jason J. Gullickson

Cache to the Future

Proposal: Use macroscopic branch prediction to cache data before it’s ever requested.

Pipelined microprocessor architectures use a technique called “branch prediction” to keep their pipelines full of instructions; without it, instructions would have to be loaded from memory one at a time, rendering the pipeline mostly useless.

[Figure: Generic 4-stage pipeline]

A pipeline is something like a cache, in that it improves performance by holding instructions in fast, on-chip memory instead of loading them one at a time from slower system memory. This works fine while the processor is executing instructions one after another, but when a branch is encountered (a piece of code that can go one of two ways, like an “if” statement in a programming language), the only way to keep the pipeline full is to predict which way the branch will go and pre-load the instructions along the predicted path.

[Figure: 2-bit saturating counter]

When the prediction is correct, the next instruction executes from the pipeline at the highest possible speed. When it is incorrect, the next instruction must be loaded from slow system memory, but this is no worse than what would have happened if no prediction had been made at all.
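A common predictor is the 2-bit saturating counter pictured above. Here is a minimal Python sketch of how one behaves (the class and method names are mine, for illustration; real predictors are implemented in hardware):

    class TwoBitPredictor:
        """2-bit saturating counter: states 0-1 predict "not taken",
        states 2-3 predict "taken". Two consecutive mispredictions are
        needed before the prediction flips direction."""

        def __init__(self) -> None:
            self.state = 2  # start in "weakly taken"

        def predict(self) -> bool:
            return self.state >= 2  # True means "predict taken"

        def update(self, taken: bool) -> None:
            # Move toward 3 on taken branches, toward 0 on not-taken ones,
            # saturating at both ends of the range.
            if taken:
                self.state = min(self.state + 1, 3)
            else:
                self.state = max(self.state - 1, 0)

    predictor = TwoBitPredictor()
    for outcome in (True, True, False, True):  # a stream of observed branch results
        print("predicted:", predictor.predict(), "actual:", outcome)
        predictor.update(outcome)

Because it takes two wrong guesses in a row to change direction, the counter isn't thrown off by a single unusual outcome in an otherwise consistent pattern.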

So it’s clearly worthwhile to at least try to predict the outcome of a branch, but what does this have to do with caching, since caches contain data that has already been loaded?

What I’m proposing is to apply the model and logic of branch prediction to higher levels of programming: predict the outcome of user decisions before the user makes them, and use those predictions to load data into a cache ahead of time, reducing response time the first time the data is requested. I’m hoping to carry out some experiments to determine the value of this technique in the next few weeks, and I’ll update this post when I have results to report.
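To make the idea concrete, here is a rough sketch of what this might look like at the application level, reusing the TwoBitPredictor class from the sketch above. The two-option decision, the option names, and the fetch_data() function are hypothetical placeholders standing in for a real application’s user choices and slow data loads:

    cache = {}

    def fetch_data(choice: str) -> str:
        # Placeholder for a slow operation (database query, network request, ...).
        return f"data for {choice}"

    def before_decision(predictor: TwoBitPredictor) -> None:
        # Before the user chooses, pre-load data for the predicted option.
        predicted = "option_a" if predictor.predict() else "option_b"
        cache[predicted] = fetch_data(predicted)

    def after_decision(predictor: TwoBitPredictor, choice: str) -> str:
        # On a correct prediction the data is already cached; otherwise
        # fall back to a slow load, which is no worse than not predicting.
        data = cache.pop(choice, None)
        if data is None:
            data = fetch_data(choice)
        predictor.update(choice == "option_a")  # train on the actual outcome
        return data

    predictor = TwoBitPredictor()
    before_decision(predictor)
    print(after_decision(predictor, "option_a"))  # served from the cache

As in the hardware case, a correct prediction means the data is there the instant it’s asked for, and a misprediction simply falls back to the slow path the application would have taken anyway.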