
My question is simple: why does the x86 architecture use two stack registers (esp and ebp)?

The length of a stack frame is determined at compile time, so I think we could use just one register (for example esp) to access the stack, and keep the base address, which normally lives in ebp, on the stack (or in another memory region, though that would incur a larger performance penalty).

Is it possible?

lllllllllllll
  • 4
    Compilers regularly do what you propose _when the stackframe is fixed_, as you point out. However, both VLAs and `alloca()` can alter the stackframe dynamically; Thus it becomes useful to have both `esp` and `ebp`. – Iwillnotexist Idonotexist Dec 05 '14 at 04:23
  • @IwillnotexistIdonotexist, hello, thank you for the reply. IMHO, `VLA` and `alloca()` are quite rare, and I also doubt compilers would produce assembly code containing them. Why would the x86 designers waste 1/8 of the general-purpose registers to handle this rare case? – lllllllllllll Dec 05 '14 at 04:30
  • 1
    They don't; The functions that don't use `alloca()` & co. (which is the vast majority) usually don't "waste" the register, while the functions that do, do. And yes, compilers do produce the corresponding code if you use VLAs as in, say, `int foo(int n){int bar[n]; /* ... */ return bar[0];}` or call `alloca()`. It's difficult to track where's the top of the stackframe otherwise. – Iwillnotexist Idonotexist Dec 05 '14 at 04:34
  • 3
    ebp is **not** a [stack register](http://stackoverflow.com/questions/22516370/what-is-the-purpose-of-ebp-in-the-following-code) but a [frame pointer](http://stackoverflow.com/questions/579262/what-is-the-purpose-of-the-frame-pointer). It's not necessary either; see http://stackoverflow.com/questions/20695203/behaviour-of-ebp-and-esp-in-stacks-using-function-with-parameter?lq=1 and http://stackoverflow.com/questions/14666665/trying-to-understand-gcc-option-fomit-frame-pointer – phuclv Dec 05 '14 at 09:44
  • 1
    In 16-bit x86 code (8086, 8088, 186, 286) `[bp+disp]` (or `[ss:bx]`, `[ss:si+disp]` or `[ss:di+disp]`) must be used to address the stack, as the addressing forms `[sp]` and `[sp+disp]` do not exist in 16-bit x86. `[esp]` is available in 386 code and `[rsp]` in x86-64 code. See my answer http://stackoverflow.com/questions/14881174/instruction-sequence-that-does-the-same-thing-as-push/14881784#14881784 for 16-bit x86 examples in which `[bp]` is used. – nrz Dec 05 '14 at 12:05
  • possible duplicate of [Is the EBP register (base frame pointer) only for Debugging in x86?](http://stackoverflow.com/questions/11252587/is-the-ebp-register-base-frame-pointer-only-for-debugging-in-x86) – Ciro Santilli OurBigBook.com May 31 '15 at 18:57

2 Answers


This all depends on the calling convention of course but it generally goes like this.

The stack pointer moves whenever you push something onto, or pop something off, the stack. This can happen at any time within a function, whenever you need to save some data on the stack temporarily.

The base pointer is generally set once on function entry and then stays fixed for the lifetime of that stack frame. It is used to access passed parameters (at positive offsets) and local variables (at negative offsets). It's also used to quickly restore the stack pointer upon exiting a function.

The reason it's done this way is to simplify the code so that you're not having to reference stack content based on a possibly-changing stack pointer. Using the base pointer considerably eases the task of code generation (you don't have to know what the stack pointer is at any given time, just use the base pointer, which remains the same for the duration of the function).

Without that, code that wanted to push two copies of a local variable before calling the next function would look like this:

    mov    eax, [esp+16]    ; get var1
    push   eax              ; push it
    mov    eax, [esp+20]    ; get var1 again
    push   eax
    call   _somethingElse

Putting aside the fact that you wouldn't reload eax in this case, the point is that addressing items relative to a moving stack pointer needlessly complicates matters: the same variable is at [esp+16] before the first push and at [esp+20] after it.
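The drift in that listing can be reproduced with a toy model. This is only a sketch in Python, not real x86: `ToyStack`, `push_var1_twice` and the 64-byte stack size are hypothetical names and values chosen for illustration, and the four filler pushes exist only so that var1 starts out at [esp+16] as in the listing.

```python
# Toy model (not real x86): a downward-growing stack addressed by byte offset.
# All names here (ToyStack, push_var1_twice) are hypothetical.

class ToyStack:
    def __init__(self, size=64):
        self.mem = {}          # address -> 32-bit value
        self.esp = size        # stack pointer starts at the top and grows down

    def push(self, value):
        self.esp -= 4          # x86 push: decrement esp, then store
        self.mem[self.esp] = value

    def load(self, addr):
        return self.mem[addr]


def push_var1_twice(stack, var1_addr):
    """Reload var1 through an esp-relative offset before each of two pushes."""
    offsets = []
    for _ in range(2):
        offset = var1_addr - stack.esp   # what must be encoded as [esp+offset]
        offsets.append(offset)
        stack.push(stack.load(stack.esp + offset))
    return offsets


stack = ToyStack()
stack.push(1234)                 # var1
var1_addr = stack.esp
for _ in range(4):               # four filler slots, so var1 sits at [esp+16]
    stack.push(0)

print(push_var1_twice(stack, var1_addr))   # prints [16, 20]
```

The variable never moves; only the offset needed to reach it changes, and a code generator working from esp alone has to track that change after every push and pop.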

For example, here's a function coded in assembly that follows a common calling convention:

    _doSomething:
        push   ebp              ; stack current base pointer
        mov    ebp, esp         ; save previous stack pointer
        sub    esp, 48          ; incl 48 bytes local storage

        ; do whatever you want here, including changing
        ;  esp, as long as it ends where it started.

        mov    esp, ebp         ; restore previous stack pointer
        pop    ebp              ; and previous base pointer

        ret                     ; then return

    _callIt:
        mov    eax, 7
        push   eax              ; push parameter for function
        call   _doSomething
        add    esp, 4           ; get rid of pushed value
        :

If you follow the code, you can see that ebp inside the function body is a fixed reference point: [ebp] (the memory that ebp points to) holds the caller's saved ebp, [ebp+4] holds the return address, [ebp+8] holds the pushed value of 7, and [ebp-N] is the local storage for _doSomething, where N ranges from 1 to 48.

This is the case regardless of how many items are pushed or popped within the body of the function.
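As a rough check of that claim, here is a toy Python model of the call sequence above. It is a sketch rather than real x86: `ToyCpu`, `run_do_something`, `call_it`, and the placeholder values `RETURN_ADDR` and `OLD_EBP` are all made up for illustration. It assumes the usual 32-bit prologue, under which the saved ebp ends up at [ebp], the return address at [ebp+4], and the first pushed argument at [ebp+8].

```python
# Toy model (not real x86) of the stack around _doSomething.
# RETURN_ADDR and OLD_EBP are made-up placeholder values.

RETURN_ADDR = 0x401000
OLD_EBP = 0xBFFF0000


class ToyCpu:
    def __init__(self, top=0x1000):
        self.mem = {}            # address -> 32-bit value
        self.esp = top
        self.ebp = OLD_EBP

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


def run_do_something(cpu, extra_pushes):
    """Prologue, some pushes in the body, then the epilogue of the listing."""
    cpu.push(cpu.ebp)            # push ebp
    cpu.ebp = cpu.esp            # mov  ebp, esp
    cpu.esp -= 48                # sub  esp, 48
    for i in range(extra_pushes):
        cpu.push(i)              # the body moves esp around; ebp is untouched
    frame = {
        "[ebp]": cpu.mem[cpu.ebp],
        "[ebp+4]": cpu.mem[cpu.ebp + 4],
        "[ebp+8]": cpu.mem[cpu.ebp + 8],
    }
    cpu.esp = cpu.ebp            # mov  esp, ebp
    cpu.ebp = cpu.pop()          # pop  ebp
    return frame


def call_it(extra_pushes):
    cpu = ToyCpu()
    cpu.push(7)                  # push parameter for function
    cpu.push(RETURN_ADDR)        # call pushes the return address
    frame = run_do_something(cpu, extra_pushes)
    assert cpu.pop() == RETURN_ADDR   # ret pops the return address
    cpu.esp += 4                 # add esp, 4
    return frame, cpu


frame, cpu = call_it(extra_pushes=3)
print(frame)
```

Calling `call_it` with a different `extra_pushes` value yields the same `frame` dictionary, which is the point: the ebp-relative slots do not depend on how far esp has moved inside the body.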

paxdiablo
  • Hello paxdiablo, thank you for your answer! I know the concepts of the 'stack' pointer and the 'base' pointer, but I am just curious: why were two registers used when x86 was designed? x86 has few registers compared to ARM, so why spend two of them on accessing the stack? – lllllllllllll Dec 05 '14 at 04:23
  • This assembly code can be rewritten by only using one stack register. – lllllllllllll Dec 05 '14 at 04:31
  • 3
    @computereasy: yes, it can but, if you're going to be _changing_ that stack pointer during the function call, the code generation will be much more complex - local and parameter access will need to be aware of how `esp` has changed during the function. – paxdiablo Dec 05 '14 at 04:35
  • That's a very good point! "need to be aware of how esp has changed during the function", it can't, right? The compiler cannot know `esp`'s runtime value at compile time... – lllllllllllll Dec 05 '14 at 04:38
  • 3
    @computereasy: not its _value,_ no, but it can possibly know the relative _offset_ of things _from_ the stack pointer. But it's one more thing it has to track and, maybe, just maybe, you've got inline assembly that screws up the compiler's assumptions :-) – paxdiablo Dec 05 '14 at 04:42
  • 5
    A good compiler can easily keep track of the apparent position of the stack pointer, and adjust offsets into the frame to access what it needs. Even if there is "inline assembler", the compiler generally sees that and converts the instructions; if there's a "mov ," the compiler is going to replace that by "mov ,k[esp]" anyway and it knows the right value of k. So this really isn't a problem. OP asks if x86 designers insisted on *two* stack pointers; they didn't. There is one supported by hardware instructions (ESP), and EBP is only by convention. – Ira Baxter Dec 05 '14 at 04:45
  • @IraBaxter, hello Ira, thank you for the answer. Could you please explain your answer in more detail? Specifically, what do you mean by "There is one support by hardware instructions (ESP), and EBP is only by convention"? What *convention*? – lllllllllllll Dec 05 '14 at 04:56
  • 3
    @computereasy By that it's meant that there are special instructions that modify `esp` _specifically_ (`push`/`pop`/`call`/`ret`), and in general x86 assumes `esp` to hold the stack pointer. No equivalents exist for other registers; That is, there is no `push` that alters `ebp` instead. `ebp` was simply _chosen_ (well, a long time ago, more-or-less designed) to hold the base pointer, but in principle any other register (`esi`, `edi`, ...) could store the base pointer. Not so with the stack pointer in `esp`, because the stack instructions alter _only_ `esp` (can't specify another register). – Iwillnotexist Idonotexist Dec 05 '14 at 05:03
  • 2
    @computereasy: Iwillnotexist Idonotexist says it pretty clearly. The x86 CPU uses ESP in a variety of special instructions. EBP IIRC is only special in its use in ENTER and LEAVE instructions, and no compiler uses those anymore. The convention is chosen by the compiler; it may or may not use EBP as a frame pointer. (I think both the MS and GCC C compilers have "omit frame pointer" options). – Ira Baxter Dec 05 '14 at 05:23

Why does x86 architecture use two stack registers (esp ; ebp)?

According to @gsg's answer to the related question "Do any languages / compilers utilize the x86 ENTER instruction with a nonzero nesting level?", the x86 architecture was designed 30 years ago as a "Pascal machine".

See The 8086/8088 Primer, by Stephen P. Morse, Chapter 8: High-Level-Language Programming (Pascal) for the rationale from one of the chip designers.

As such, it incorporates hardware support for nested and recursive subroutines (procedures, functions). In the Pascal programming language this was an important aspect of the structured programming paradigm, which was supposed to produce code that was easier to read, write and maintain.

The hardware support, in the form of special CPU instructions, made it possible to generate code using fewer instructions. Fewer instructions usually also meant faster code.

Is it possible to implement the stack frame variable access on another Turing complete machine without the use of ebp register?

Yes, but with a different execution time.

Community
xmojmr