The overall design of what you seem to be trying to do with inline asm is broken.
This is not how inline asm works. This is probably an X-Y problem; there's something you want your code to do, and you've picked a non-viable approach.
> I mean only 128 bits are loaded and other half is all zeroes.
Sounds like GCC did a vzeroupper (which zeroes the upper 128 bits of every YMM register) somewhere between your asm statements, probably at a function call boundary. You didn't tell GCC that YMM10 was an output you expected to read later. (This is similar to the situation in "GCC doesn't push registers around my inline asm function call even though I have clobbers", which is also a case of using inline asm incorrectly.) In this case GCC stepped on your data; in other cases you could destroy some data GCC had put there and was going to read again later.
You could tell GCC about the data coming out of your asm statement with another __m256i variable, perhaps a register __m256i ymm10 asm("ymm10") if you really want to convince the compiler to make worse asm instead of just letting it keep __m256i variables in registers like it normally does.
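As a rough sketch of what "telling GCC about the data coming out" can look like (not a recommendation), the asm statement below declares its result as an output operand and lets the compiler pick the register instead of hard-coding YMM10. The names copy_via_asm and roundtrip_ok are hypothetical; the target("avx") attribute and the __builtin_cpu_supports check are assumptions added so the snippet builds without -mavx and skips itself on a CPU without AVX. AT&T syntax is assumed.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: passes a 256-bit value through one asm statement.
   The result is an output operand, so the compiler knows which register
   holds it and will not clobber it before the store below. */
__attribute__((target("avx")))
void copy_via_asm(const uint8_t src[32], uint8_t dst[32])
{
    __m256i in = _mm256_loadu_si256((const __m256i *)src);
    __m256i out;

    /* "=x"(out): written by the asm, in any vector register GCC chooses.
       "x"(in):   read by the asm, already in a vector register. */
    __asm__("vmovdqa %1, %0" : "=x"(out) : "x"(in));

    _mm256_storeu_si256((__m256i *)dst, out);
}

/* Returns 1 if all 32 bytes (including the upper half) survived,
   0 if they did not, and -1 if the CPU has no AVX. */
int roundtrip_ok(void)
{
    uint8_t src[32], dst[32];
    if (!__builtin_cpu_supports("avx"))
        return -1;
    for (int i = 0; i < 32; i++)
        src[i] = (uint8_t)(i + 1);
    memset(dst, 0, sizeof dst);
    copy_via_asm(src, dst);
    return memcmp(src, dst, sizeof src) == 0;
}
```

Because the value leaves the asm statement through a C variable, the compiler is free to spill it, move it, or keep it in a register across calls as needed.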
But seriously, don't. You can look at GCC's asm output with gcc -S foo.c -o- | less or whatever. (Don't forget the usual -O3 -march=native or whatever.) See also the Q&A titled How to remove "noise" from GCC/clang assembly output? for ways to make that output readable. Using your own vmovdqa instructions on some of the YMM registers, while GCC uses other YMM registers for its own purposes, is just going to make worse asm. https://gcc.gnu.org/wiki/DontUseInlineAsm
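For comparison, here is a minimal sketch of the intrinsics-only approach, where the compiler does all the register allocation. The XOR operation and the names xor32 and xor32_demo are purely illustrative assumptions, as are the target("avx2") attribute and the runtime CPU check (added so it builds without -march flags).

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical example: XOR two 32-byte buffers using only intrinsics.
   No register names appear; the compiler decides which YMM registers
   hold va and vb. */
__attribute__((target("avx2")))
void xor32(const uint8_t a[32], const uint8_t b[32], uint8_t out[32])
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_xor_si256(va, vb));
}

/* Returns 1 if every byte of the result is correct, 0 if not,
   and -1 if the CPU has no AVX2. */
int xor32_demo(void)
{
    uint8_t a[32], b[32], out[32];
    if (!__builtin_cpu_supports("avx2"))
        return -1;
    for (int i = 0; i < 32; i++) {
        a[i] = (uint8_t)i;
        b[i] = 0xFF;
        out[i] = 0;
    }
    xor32(a, b, out);
    for (int i = 0; i < 32; i++)
        if (out[i] != (uint8_t)(a[i] ^ b[i]))
            return 0;
    return 1;
}
```

Compiling something like this with -O3 and reading the asm output is usually the quickest way to see that the loads and stores you wanted are already there.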
See also https://stackoverflow.com/tags/inline-assembly/info for guides and docs that explain how to use GNU C inline asm correctly. (But you probably don't need asm at all.)
Note that "x" (addr) requires GCC to already have __m256i addr in another YMM register, so it's not even "loading" from memory, it's just copying YMM registers after GCC already loaded it from memory if necessary. That's why I said so strongly that this is pointless.
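To illustrate the difference, a hedged sketch: with an "m" input operand the asm instruction itself performs the load from memory, whereas an "x" input only ever sees a register the compiler has already filled. The function names load_via_asm and load_demo are hypothetical, vmovdqu is used so no alignment is assumed, and the target("avx") attribute, the CPU check, and AT&T syntax are assumptions of this example.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the asm statement does the actual memory load. */
__attribute__((target("avx")))
void load_via_asm(const uint8_t src[32], uint8_t dst[32])
{
    __m256i out;

    /* "m": the operand is the 32 bytes of memory at src, so the compiler
       knows exactly what the asm reads. "=x": result in a vector register. */
    __asm__("vmovdqu %1, %0"
            : "=x"(out)
            : "m"(*(const uint8_t (*)[32])src));

    _mm256_storeu_si256((__m256i *)dst, out);
}

/* Returns 1 if all 32 bytes were loaded, 0 if not, -1 if there is no AVX. */
int load_demo(void)
{
    uint8_t src[32], dst[32];
    if (!__builtin_cpu_supports("avx"))
        return -1;
    for (int i = 0; i < 32; i++)
        src[i] = (uint8_t)(0xA0 + i);
    memset(dst, 0, sizeof dst);
    load_via_asm(src, dst);
    return memcmp(src, dst, sizeof src) == 0;
}
```

Even this version gains nothing over _mm256_loadu_si256; it only shows which constraint corresponds to which behaviour.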