
I am currently trying to write a function in assembly, and I want to move 128 bits of a string located at the memory address stored in rdx into the xmm1 register.

If I use movdqa xmm1, [rdx], I get an access violation exception while reading at position 0xFFFFFFFFFFFFFFFF.

If I use movdqu xmm1, [rdx] instead, I don't get the exception. The problem is that with movdqu, the order of the bits is inverted.

So I do not know why I get an exception when using movdqa but not when using movdqu.

Ben
  • Try aligning the memory to 16 bytes (so that the string starts at an address that is a multiple of 16). – Alexander Zhak Oct 11 '16 at 20:07
  • I don't know exactly what you mean, but my string is exactly 16 bytes long. – Ben Oct 11 '16 at 20:09
  • Unless you're running bare metal or in kernel mode, this is not an address you own. Furthermore, it's unaligned. Finally, it is likely to be the result of a `mmap()` that failed (or, god forbid, you assign `-1` to a pointer *manually* in your code). – EOF Oct 11 '16 at 20:17
  • I am passing a string from C to my assembly function, and when I debug the program, I can enter the address in rdx manually and see my string at the correct memory location. – Ben Oct 11 '16 at 20:21
  • If you expect this to work, you'd have to align the string in your _C_ code to a 16-byte boundary (using a __declspec alignment attribute). However, if you want to figure out whether alignment is in fact causing the issue, try changing the instruction to MOVDQU as a test. Oh, I see you did that according to your question. The bytes being reversed would be correct because of endianness. The first character should be in the lowest byte of xmm1 and the last character in the most significant byte of xmm1. – Michael Petch Oct 11 '16 at 21:36
  • Had your data been properly aligned and you used _MOVDQA_ the bytes would still appear the same way in _XMM1_. – Michael Petch Oct 11 '16 at 21:45
  • `The problem is if i use movdqu, the order of the bits is inverted.` That's a separate problem. If your address was aligned, you'd get identical results from MOVDQU and MOVDQA. – Peter Cordes Oct 12 '16 at 05:09

1 Answer

Most of this has been said in the comments already, but let me summarise. There are three problems raised by your code/question:

1) MOVDQA requires the addresses it deals with ([rdx] in your case) to be aligned to a 16-byte boundary and will trigger an access violation otherwise. This is what you are seeing. Alignment to a 16-byte (DQWORD) boundary means that, using your example, you should read from e.g. 0xFFFFFFFFFFFFFFF0 rather than 0xFFFFFFFFFFFFFFFF, because the latter number is not divisible by 16.

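2) The address reported in the exception, 0xFFFFFFFFFFFFFFFF, is almost certainly not a valid address. On Windows, this value is commonly what gets reported when the fault does not carry a specific address, as with a misaligned MOVDQA, so it need not be what rdx actually contains.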

3) Provided you use MOVDQA to read from a valid 16-byte-aligned memory location, the results (in xmm1 in your case) will be IDENTICAL to when you use MOVDQU. The only relevant difference between the two here is that movdqU allows you to read from Unaligned (hence the U) memory whereas movdqA requires a (16-byte) Aligned memory location. (The latter case will often be faster, but I don't think you need to worry about that at this stage.)
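
To make points 1) and 3) concrete, here is a minimal C sketch using the SSE2 intrinsics that correspond to the two instructions (`_mm_load_si128` for MOVDQA, `_mm_loadu_si128` for MOVDQU). The helper names (`is_aligned16`, `load_aligned`, `load_unaligned`, `same_bits`, `low_byte`) are made up for illustration and are not part of your code. It assumes an x86-64 compiler with SSE2 available.

```c
#include <assert.h>     /* assert */
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>     /* uintptr_t */

/* Returns 1 if p is a multiple of 16, which is what MOVDQA requires. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Loads 16 bytes with MOVDQU semantics: any readable address is fine. */
static __m128i load_unaligned(const void *p) {
    return _mm_loadu_si128((const __m128i *)p);
}

/* Loads 16 bytes with MOVDQA semantics: p must be 16-byte aligned,
   otherwise the instruction faults. The assert catches that earlier. */
static __m128i load_aligned(const void *p) {
    assert(is_aligned16(p));
    return _mm_load_si128((const __m128i *)p);
}

/* Returns 1 if both registers hold exactly the same 128 bits. */
static int same_bits(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}

/* Returns the least significant byte of the register. On x86 this is
   the byte that came from the lowest memory address, i.e. the first
   character of the string. */
static unsigned char low_byte(__m128i v) {
    return (unsigned char)(_mm_cvtsi128_si32(v) & 0xFF);
}
```

With a buffer declared as `_Alignas(16) char buf[32]` (or `__declspec(align(16))` under MSVC, as suggested in the comments), `load_aligned(buf)` and `load_unaligned(buf)` should compare equal via `same_bits`, and `low_byte` should return the first character of the string. That is the "reversed" order a debugger shows when it prints the register with the most significant byte first.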

PhiS
  • Thanks, you're right. If I align with __declspec, I can use MOVDQA. Also, I am quite new to assembly, so I did not know that the "reversed" bits problem is actually the expected behaviour. – Ben Oct 12 '16 at 15:56
  • MOVDQA and MOVDQU have identical performance if the address is aligned at run-time, on Nehalem and newer CPUs. These days, they differ only in their behaviour with unaligned addresses: either a fault or an unaligned load (which is only slower if it crosses a cache-line boundary, or much worse on pre-Skylake, a page boundary). – Peter Cordes Oct 12 '16 at 16:31
  • @PeterCordes. Thanks -- Yep, hence my "often" (and I couldn't be bothered to double-check that I remembered the details correctly). – PhiS Oct 12 '16 at 16:56