The exact implementation of the C++ Standard is, well, an implementation detail: it varies from compiler to compiler, from platform to platform, and so on.
Now, even though you could in theory use a split stack for C++, major implementations use a contiguous segment of memory (of varying size).
This contiguity and frequent reuse do indeed reap the benefits of caches; however, it is not a panacea either. You can, in fact, craft artificial scenarios that cause cache bouncing: if your L1 cache is small (32 KiB?) and 2-way associative, then you can easily force trips to the L2 cache. Just put a 64 KiB array on your stack (small enough not to blow it up), and then access data at offsets 0, 16 KiB, 32 KiB, and 48 KiB repeatedly in a loop: it should trigger lots of evictions and require fetches from the L2 cache.
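A minimal sketch of that access pattern, assuming a 32 KiB, 2-way associative L1 data cache (the function name and iteration count are just for illustration): addresses 16 KiB apart map to the same set, so the four accesses compete for only two ways and keep evicting each other.

```cpp
#include <cstddef>

int cache_bounce_demo() {
    char buffer[64 * 1024] = {};    // 64 KiB on the stack: large, but not enough to overflow it
    volatile char* p = buffer;      // volatile forces a real load/store on every iteration

    for (int i = 0; i < 1000000; ++i) {
        // With a 32 KiB, 2-way L1, these four offsets map to the same set
        // but only two can be resident at once, so they evict each other.
        p[0]         = p[0] + 1;
        p[16 * 1024] = p[16 * 1024] + 1;
        p[32 * 1024] = p[32 * 1024] + 1;
        p[48 * 1024] = p[48 * 1024] + 1;
    }
    return p[0];
}
```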
So, it is not really that the stack itself is so cache-friendly, but rather that its usage is predictable and well-known. You could reap the same cache benefits with a custom-made allocator (though allocation would be slightly slower).
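For instance, a simple bump (arena) allocator hands out allocations back-to-back from one contiguous buffer, so successive objects share cache lines much like locals on the stack do. This is only a sketch (class and member names are made up for illustration; alignment must be a power of two, and deallocation is all-or-nothing):

```cpp
#include <cstddef>
#include <new>
#include <vector>

class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : storage_(capacity), offset_(0) {}

    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);  // round up to alignment
        if (aligned + size > storage_.size())
            throw std::bad_alloc();
        offset_ = aligned + size;
        return storage_.data() + aligned;   // next allocation sits right after this one
    }

    void reset() { offset_ = 0; }           // release everything at once, in O(1)

private:
    std::vector<std::byte> storage_;        // one contiguous, reused block: good locality
    std::size_t offset_;
};
```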
On the other hand, there are other advantages and disadvantages to using the stack:
- disadvantage: if you attempt to consume too much of it, you get a Stack Overflow.
- disadvantage: if you write past the end of an array on the stack, you might corrupt the stack itself, which is a debugging nightmare (it is also what so-called Stack Smashing attacks exploit).
- advantage: C++ has specific patterns (RAII, SBRM) that take advantage of the behavior of the stack. Deterministic "undo" actions are a joy to program with.
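As a quick illustration of that last point, here is a hypothetical RAII wrapper: the destructor runs when the object goes out of scope, even on an early return or an exception, so the "undo" action is deterministic.

```cpp
#include <cstdio>

class FileHandle {
public:
    explicit FileHandle(const char* path) : f_(std::fopen(path, "r")) {}
    ~FileHandle() { if (f_) std::fclose(f_); }          // guaranteed, deterministic cleanup

    FileHandle(const FileHandle&) = delete;             // non-copyable: a single owner
    FileHandle& operator=(const FileHandle&) = delete;

    std::FILE* get() const { return f_; }

private:
    std::FILE* f_;
};

void process(const char* path) {
    FileHandle file(path);          // resource acquired here
    if (!file.get()) return;        // early return still runs the destructor
    // ... use file.get() ...
}                                   // resource released here, every time
```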
So in the end I would be wary of deciding between stack and heap solely based on potential cache behavior.