Adding a byte from memory to the AX register

Question

I'm currently trying to figure out how to add the first byte in memory pointed to by the pointer register SI to the current contents of the AX register.

So if SI holds some address, and the values in memory at that address are: 00 and 01, I'm looking to add just 00 to the AX register.

The first instruction my assembly-noobish self tried was add ax, byte ptr [SI] but of course, no dice, as I'm trying to add operands of different sizes.

My current workaround is

mov dx,0000h             ;empty the contents of dx
mov dl,byte ptr [si]     ;get the value of the first byte in a register
add ax,dx                ;perform the originally desired addition

But this is incredibly wasteful and really hurts my executed-instruction count (this is part of a subroutine that runs many times).

I'm limited to the 8086 instruction set so this question/answer by Peter Cordes which suggests movzx to condense my first two lines is unfortunately not viable.

`xor dx, dx` will be a shorter encoding. If you have to reduce the instruction count itself, you'll have to find some register whose upper 8 bits are already zero. Maybe there's a register you're not using and you can just leave the upper 8 bits zero for the lifetime of the program. — Raymond Chen, Jul 06 '21 at 01:18
Thank you, I'll try to implement this and close the question if it works. — CompSciStudent, Jul 06 '21 at 01:29
It seems the instruction count was reduced only by a small amount (the instructor has supplied a driver program that performs grading and counts instructions executed as part of that metric). Unfortunately, also because of the driver program (which is the call*er* of my subroutine [callee]), I'm unsure which register might be empty for the lifetime of the program. Thank you for the help. It seems I'll have to look at my decision-making logic to reduce the count further. — CompSciStudent, Jul 06 '21 at 01:51
I think there needs to be a formal answer for me to close this question, which I'm happy to do, because this did help me lower my instruction count given the information I gave. — CompSciStudent, Jul 06 '21 at 02:38

score 2 · Accepted Answer · edited Jul 06 '21 at 17:40

As you say, if you can assume a 386-compatible CPU, a good option (especially for modern CPUs) is movzx dx, byte ptr [mem] / add ax, dx. If not, I guess we can pretend we're tuning for a real 8086, where code size in bytes is often more important than instruction count. (Especially on 8088, with its 8-bit bus.) So you definitely want to use xor dx, dx to zero DX (2 bytes instead of 3 for mov reg, imm16), if you can't avoid a zeroing instruction altogether.
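To make the size comparison concrete, here is a rough sketch of the 8086 encodings involved, assuming MASM/TASM-style syntax and the `[si]` addressing mode from the question. The exact opcode an assembler emits for the register-to-register forms can vary, so treat the byte values as illustrative.

```asm
        mov  dx, 0000h       ; BA 00 00          -> 3 bytes, leaves FLAGS alone
        xor  dx, dx          ; 31 D2 (or 33 D2)  -> 2 bytes, but writes FLAGS
        mov  dl, [si]        ; 8A 14             -> 2 bytes, DH is still zero
        add  ax, dx          ; 01 D0 (or 03 C2)  -> 2 bytes
```

The one difference to keep in mind is that `xor` modifies FLAGS while `mov` does not, which only matters if a flag result from earlier code is still needed at that point.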

Hoist the zeroing of DX (or DH) out of any loop, so you just mov dl, [mem] / add ax, dx. If the function only does it once, you may need to (manually) inline the function in call sites that call it in a loop, if it's small enough for that to make sense. Or pick a register where callers are responsible for having the upper half zero.
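As a minimal sketch of the hoisting idea, assume a hypothetical loop that adds CX bytes starting at [SI] into AX, with CX > 0 and DX free to clobber. The label name `sum_loop` is made up for illustration and is not from the question.

```asm
        xor  dx, dx          ; zero DX once, outside the loop
sum_loop:
        mov  dl, [si]        ; load one byte; DH is still zero, so DX = zero-extended byte
        add  ax, dx          ; 16-bit add into the running total
        inc  si              ; advance the pointer
        loop sum_loop        ; CX = CX - 1, repeat while CX != 0
```

Only `mov dl, [si]` / `add ax, dx` run per byte; the zeroing cost is paid once no matter how many iterations execute.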

As Raymond says, you can pick any other register whose high half you know to be zero at that point in your function. Perhaps you could mov cx, 4 instead of mov cl, 4 if you happened to need CL=4 for something else earlier, but you're done with CX by the time you need to add into AX. mov cx, 4 is only 1 byte longer, so you get CH zeroed with only 1 extra byte of code-size. (vs. xor cx, cx costs 2 bytes)

Another option is byte add/adc, but that isn't ideal for code size. (Or performance on later CPUs.)

  add al, [mem]      ; 2 bytes + extra depending on addr mode
  adc ah, 0          ; 3 bytes

So that's 1 byte more than if you already had a spare upper-zeroed register:

  mov  dl, [mem]     ; 2 bytes (+ optional displacement)
  add  ax, dx        ; 2 bytes

But on the plus side, add/adc doesn't need any extra register at all.
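A worked example of the carry propagation, assuming for illustration that AX starts at 00FFh and the byte at [SI] is 01h:

```asm
        mov  ax, 00FFh       ; example starting value (assumption for illustration)
        add  al, [si]        ; AL = FFh + 01h = 00h, carry out of bit 7 sets CF = 1
        adc  ah, 0           ; AH = 00h + 0 + CF = 01h, so AX = 0100h
```

After the `adc`, CF matches what a true 16-bit add would have produced, but ZF reflects only AH, so don't branch on ZF as if it described the whole of AX.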

With the pointer in SI, it's worth looking for ways to take advantage of lodsb if you're really optimizing for code-size. That does mov al, [si] / inc si (or instead dec si if DF=1), but without affecting FLAGS. Since it overwrites AL, you'd want to keep the running total in a different register and add into that.

xchg ax, reg is only 1 byte, but if you need two swaps it may not pay for itself if you actually have to return in AX, not some other register.
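One possible shape for the lodsb approach, sketched under the assumptions that the total can live in DX during a loop, that CX holds a nonzero byte count, and that the result has to end up in AX. The label `next_byte` is hypothetical.

```asm
        xor  dx, dx          ; running total lives in DX, not AX
        xor  ah, ah          ; keep AH = 0 so AX is the zero-extended byte after lodsb
        cld                  ; DF = 0 so lodsb increments SI
next_byte:
        lodsb                ; AL = [si], SI = SI + 1, FLAGS untouched (1 byte)
        add  dx, ax          ; AH is still zero, so this adds just the byte
        loop next_byte       ; CX = CX - 1, repeat while CX != 0
        xchg ax, dx          ; 1 byte: total into AX for the return value
```

Here a single `xchg` at the end is enough. If the total also had to start out in AX, a second swap before the loop would be needed, which is where the savings can disappear.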

The `lodsb` pointer increment or decrement is of course like `dec si` or `inc si` in its effect on `si`. But it does not affect any of the status flags. So it is more similar to `lea si, [si + 1]` or `lea si, [si - 1]` — ecm, Jul 06 '21 at 07:39
@ecm: Thanks, forgot to mention the no-FLAGS part. I think "inc but without FLAGS" is clearer, especially for beginners, than an actual emulation involving `lea`. There are other Q&As with answers involving `lea`, at least for push/pop, probably also `lods` and others. — Peter Cordes, Jul 06 '21 at 07:45
