I see. But how will latency affect branch prediction? By worsening the misprediction penalty, since the pipeline has to wait longer for instruction fetches?
It all depends on whether the alternate branch is in the cache. In the case of small branches (if/else, switch, loops), the alternative path is already in the cache, so there ought to be no discernible effect compared to a "normal" memory system. Rumor has it the 8 CPUs are divided into two groups of four, with 2 MB of shared cache (L3, I guess, maybe L2) for each group of four cores, so there's plenty of room for code and data, especially considering that most of the "big-data" tasks will be offloaded to the GPU in many cases, and so won't compete for space.
That said, there definitely will be higher latencies to memory; it's just that code that's efficient otherwise is going to amortize that initial latency pretty well, I think. If you're jumping around in code or data willy-nilly, you'll probably feel it, but you'll just do your best to avoid that. In any event, it remains to be seen what the impact on "reasonable" code might be.