In a 32-bit call, the caller pushes the arguments onto the stack and the callee reads them at %ebp+offset. This is straightforward.
In 64-bit code, however, the caller first moves the first four parameters into registers, and the callee then copies them from the registers into memory (the "shadow space", see https://en.wikipedia.org/wiki/X86_calling_conventions) before using them. This round trip through registers seems redundant to me. The caller could simply push the parameters so that they are already in the right memory locations, so why bother storing them in registers and then saving them to memory again?
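To make the comparison concrete, here is a minimal sketch of what I mean. It assumes the Microsoft x64 convention and an unoptimized build. `sum4` is a made-up function used only for illustration, and the comments describe what I understand a compiler typically emits, not verified output from any particular compiler.

```c
/* Hypothetical example function, used only to illustrate argument passing. */
int sum4(int a, int b, int c, int d)
{
    /*
     * 32-bit cdecl (typical unoptimized code):
     *   the caller pushes d, c, b, a (right to left), and the callee
     *   reads them at 8(%ebp), 12(%ebp), 16(%ebp), 20(%ebp).
     *
     * Microsoft x64 (assumed here):
     *   the caller places a, b, c, d in ecx, edx, r8d, r9d and reserves
     *   32 bytes of shadow space just above the return address.
     *   An unoptimized callee typically stores the four registers into
     *   that space and then reloads them from memory before use.
     */
    return a + b + c + d;
}
```

A call such as `sum4(1, 2, 3, 4)` is the kind of call I am asking about: in the 64-bit case the four values seem to go into registers and then straight back to memory.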
What is the reasoning behind this design? Is it due to some x64 restriction or limitation?