3

The first instruction I issued without fail at the start of every program for a DEC PDP-8 Minicomputer was CLA CLL to clear the accumulator and link (overflow) bit.

That simple instruction doesn't seem to exist in the 8086 range of processors, and I have seen a lot of discussion on various technical websites about the quickest way to clear a register, such as XORing it with itself.

Is this already handled somewhere by the processor logic? So it's guaranteed to be 0 before the program starts?

phuclv
Android
  • 3
    There is no `eax` or `%eax` register in the 8086. `eax` and the other 32-bit registers first appeared in the 386, not before. Anyway, ANDing a register with itself doesn't zero it. XORing a register with itself does. `xor` is the most common way, e.g. `xor ax,ax`, but AFAIK in all Intel processors `xor` and `sub` are equally fast, so `sub ax,ax` is as fast as `xor ax,ax`. – nrz Nov 25 '13 at 18:12
  • Oops, sorry, I didn't see that you had already covered everything in my answer :) I'll delete my answer. – Igor Nov 25 '13 at 18:18
  • 3
    Whatever for would you need to zero the EAX (or any other register) at startup? Zero it when it's necessary, e.g. when returning 0 from a function, or when calling something that needs it zeroed. Clearing the registers "just in case" only wastes processor cycles and code space. It's likely something else will overwrite it very soon anyway. – Igor Skochinsky Nov 25 '13 at 18:23
  • 2
    There's been progress, clearing the accumulator is no longer necessary. – Hans Passant Nov 25 '13 at 18:27
  • sub (E)AX,(E)AX is the closest equivalent since it clears both the accumulator and carry flag. Note that the x86 series, unlike the PDP-8, has a whole slew of registers and flags and no easy way of clearing everything. In practice you virtually always rely on the operating system to put the system into a usable state. – doynax Nov 25 '13 at 18:29
  • @doynax: The standard idiom `xor eax,eax` also clears CF and OF, setting FLAGS the same way `sub` would (for similar reasons to [`test eax,eax` vs. `cmp eax,0`](https://stackoverflow.com/a/38032818/224132)). Most but not all CPUs recognize `sub same,same` as a zeroing idiom (handled more efficiently, no false dependency on the old value) so xor is preferred. [What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/q/33666617) – Peter Cordes Oct 06 '20 at 20:13

4 Answers

2

For an 8086, the fastest and shortest way to clear the AX register (not EAX) is to issue an ALU instruction that combines the register with itself so that the result is zero. That is:

xor ax,ax  ; opcode: 31 C0

Or

sub ax,ax  ; opcode: 29 C0

They are the shortest because each encodes in 2 bytes, whereas the regular mov ax,0 needs 3 bytes (B8 00 00), one byte more. They are the fastest because on the 8086 xor and sub between registers take 3 clock cycles, while mov with an immediate operand takes 4.

On the other hand, xor and sub alter the flags, while mov doesn't. Sometimes that doesn't matter when you clear a register; sometimes the flags must be preserved, and then mov is the right choice. As for code clarity, the xor/sub "trick" is widely known, and compilers use it to clear registers quickly, so any assembly programmer will understand what you mean.
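A minimal sketch of a case where the flag behaviour matters, in NASM syntax for 16-bit code. The registers BX and CX and the label names are only illustrative assumptions, not part of the original answer:

```asm
bits 16

is_below:
        cmp     bx, cx          ; FLAGS now reflect bx - cx
        mov     ax, 0           ; B8 00 00: AX = 0, FLAGS left untouched
        jae     done            ; still decides on the result of the cmp
        inc     ax              ; reached only when bx < cx (unsigned): AX = 1
done:
```

If the mov were replaced by xor ax,ax here, the carry flag would be cleared before the jump, so jae would always be taken and AX would always end up 0. Zeroing with xor before the cmp avoids the problem.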

mcleod_ideafix
  • Thank you, this is the answer I was looking for, Technically complete and clearly explained – Android Nov 26 '13 at 15:27
  • 1
    Isn't the opposite ? xor = 0x31 and sub = 0x29 ? – tigrou Jun 23 '16 at 15:36
  • Note further that `xor` of a register with itself is detected as an idiom in the frontend of modern x86 processors and generates no microinstruction at all; its latency is 0. – fuz Jun 24 '19 at 15:41
  • For details on how xor-zeroing is handled efficiently in modern CPUs, see [What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/q/33666617). In actual 8086, it just ran the same as any xor instruction and was used for code-size reasons. (Because on 8086 and especially 8088, performance was mostly determined by code fetch bottlenecks.) – Peter Cordes Oct 06 '20 at 20:15
1

It is rather irrelevant what is in the processor registers when the program starts (except for the x86 stack pointer). If you don't like garbage in your registers, simply set them to values you like.

This may cost you all of 10 instructions at the start of your code. In practice you don't need to initialize registers you aren't immediately using so it is only 1 or 2 instructions. More importantly, most assembly programs are much bigger than this so nobody cares.

You can initialize your registers by zeroing them (xor reg,reg/sub reg,reg/mov reg,0) if you insist.

It is worth knowing that the processor can take advantage of certain instructions. On modern x86 chips, "xor reg,reg" is recognized as a zeroing idiom that breaks the dependency on the old value of "reg", enabling faster code, so you should stick to using such instructions where you can.
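As a hedged illustration of zeroing a register only when it is needed, here is a common pattern for producing a 0/1 result, in NASM syntax for 32-bit or 64-bit code. The choice of EDI and ESI as inputs is an assumption made for the example:

```asm
        xor     eax, eax        ; EAX = 0; also rewrites FLAGS, so it comes before the cmp
        cmp     edi, esi        ; compare the two (hypothetical) inputs
        sete    al              ; AL = 1 if they were equal, else 0; upper bits of EAX stay 0
```

The xor is placed before the cmp because it modifies the flags that sete reads.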

Ira Baxter
0

xor and mov do different things. xor will set condition bits and mov will not. If your goal is to set a register to a specific value, mov will clearly communicate that to the reader of your software.

An optimizing assembler can sometimes safely convert mov reg, 0 to xor reg, reg or sub reg, reg.

sub reg, reg and xor reg, reg are not necessarily going to execute equally fast all of the time. Subtraction requires carrying and exclusive-or does not. Carrying is usually implemented in an extremely efficient manner and the added time it takes is difficult to measure, so for practical purposes people tend to refer to them as having equally fast execution.

Michael J. Gray
  • Thanks to Michael J. Gray, doynax and Hans Passant for good and appropriate replies. When I find out how to indicate that my question has been answered, I will do so. Thanks again guys, 'I'll be back' as somebody once said. Andy – Android Nov 25 '13 at 18:59
  • 2
    In fact xor will be recognized by the CPU and treated specially, so it won't have any false dependency. I don't know if sub also gets the same treatment – phuclv Dec 21 '13 at 11:21
  • 3
    Some CPUs only recognize xor-zeroing as special, not `sub same,same`. ([What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/q/33666617)). That's the only reason they wouldn't execute at equal speed. In the general case for non-zeroing uses of XOR and SUB, they do have the same performance as each other on any given CPU: 1 cycle latency, run on any ALU execution port. See https://www.uops.info/table.html or https://agner.org/optimize/. Synchronous pipeline stage logic means the CPU can't use the simpler XOR result any sooner. – Peter Cordes Oct 06 '20 at 20:22
0

Using "xor reg,reg" on a 32-bit register is also recommended when we first write only the 16-bit part of that register and then access the full 32 bits in the following instructions.
Example:

LOCATION DW 3

xor eax, eax
mov ax, [LOCATION] ; first a 16 bit access to the lower part of EAX
shl eax, 2         ; followed by a 32 bit access

The xor instruction enables faster execution of our code. I think a movzx instruction would execute more slowly.
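For comparison, a sketch of the movzx variant mentioned above, written in NASM syntax (the `word` size keyword is NASM's spelling and is an assumption about the assembler in use). It produces the same value in EAX as the xor/mov pair; which form is faster depends on the CPU and is not measured here:

```asm
LOCATION: dw 3

        movzx   eax, word [LOCATION]   ; load 16 bits and zero-extend into all of EAX
        shl     eax, 2                 ; 32-bit access: EAX = 3 * 4 = 12
```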

Dirk