
Hi, I'm trying to write code that searches for patterns, ranks the top 3 most frequent patterns, and reports how many times each one occurs. I have code that ranks the top 2 patterns, but I can't get the top 3 version to work.

I allocated the registers this way:
R7 : count of top1 pattern
R8 : top1 pattern
R10 : count of top2 pattern
R11 : top2 pattern

This is my code that works:

void ex3(){
    int result[4];

    example3(0x00, 0x100, 0x0, result);
    sendstr("Top 1 pattern : ");
    printDecimal(result[0]);
    sendstr("\n");

    sendstr("Top 1 pattern count : ");
    printDecimal(result[1]);
    sendstr("\n");

    sendstr("Top 2 pattern : ");
    printDecimal(result[2]);
    sendstr("\n");

    sendstr("Top 2 pattern count : ");
    printDecimal(result[3]);
    sendstr("\n");
}


    PRESERVE8
    AREA Ex3, CODE, READONLY

    EXPORT  example3

example3
    STMFD       sp!,{r4-r9,lr}              
    MOV         R4, r2                          
    MOV         R6, R3
    MOV         R7, #0
    MOV         R8, #0

Loop2   
    MOV         r3, #0                          
    MOV         r9, r0

Loop
    LDRB        r5, [r9], #1                        
    CMP         r4, r5                          
    ADDEQ       r3, r3, #1                      
    CMP         r9, r1                          
    BLS         Loop
    CMP         R3, R7
    BLT         Com2




Com1

    MOVGT   R10,R7
    MOVGT   R11,R8
    MOVGT   R7, R3                          
    MOVGT   R8, R4
    B               Here

Com2
    CMP         R3,R10
    BLT         Here
    MOVGT   R10,R3
    MOVGT   R11,R4

Here
    CMP         R4, #0XFF
    ADDLT       R4, R4, #1
    BLT         Loop2

    STR         r8, [r6]
    STR         r7, [r6,#4]

    STR         r11, [r6,#8]
    STR         r10, [r6,#12]


    LDMFD       sp!,{r4-r9,lr}

    MOV         PC, lr

    END

But when I tried the top 3 version with the same logic, changing only the register allocation to this:

R7 : count of top1 pattern
R8 : top1 pattern
R9 : count of top2 pattern
R10 : top2 pattern
R11 : count of top3 pattern
R12 : top3 pattern

It shows strange results, which I think is caused by wrong register allocation (I need more empty registers...). What is the easy or right way to deal with running out of registers?

    PRESERVE8
    AREA Ex3, CODE, READONLY

    EXPORT  example3

example3
    STMFD       sp!,{r4-r9,lr}              
    MOV         R4, r2                          
    MOV         R6, R3
    MOV         R7, #0
    MOV         R8, #0

Loop2   
    MOV         r3, #0                          
    MOV         r9, r0

Loop
    LDRB        r5, [r9], #1                        
    CMP         r4, r5                          
    ADDEQ       r3, r3, #1                      
    CMP         r9, r1                          
    BLS         Loop
    CMP         R3, R7
    BLT         Com2




Com1
    MOVGT   R11,R9
    MOVGT   R12,R10
    MOVGT   R9,R7
    MOVGT   R10,R8
    MOVGT   R7, R3                          
    MOVGT   R8, R4
    B               Here

Com2
    CMP         R3,R9
    BLT         Com3
    MOVGT   R11,R9
    MOVGT   R12,R10
    MOVGT   R9,R3
    MOVGT   R10,R4
    B               Here


Com3
    CMP         R3,R11
    MOVGT   R11,R3
    MOVGT   R12,R4


Here
    CMP         R4, #0XFF
    ADDLT       R4, R4, #1
    BLT         Loop2

    STR         r8, [r6]
    STR         r7, [r6,#4]

    STR         r10, [r6,#8]
    STR         r9, [r6,#12]
    STR         r12, [r6,#16]
    STR         r11, [r6,#20]


    LDMFD       sp!,{r4-r9,lr}

    MOV         PC, lr

    END
Salieri

1 Answer

How would a compiler do it? Try writing some C to do the same thing, and get the compiler to generate assembly language (if you're using GCC it's gcc -S). That's a good way to learn about effective assembly language; not everything the compiler ever does is the most efficient it can be, but it will always work and be logical, and should be reasonably easy to follow if optimisation is disabled.
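As a starting point for that experiment, here is a minimal C sketch of the same top 3 search. The function name `top3_patterns` and the six-element `result` layout (pattern, count, pattern, count, ...) are assumptions made for illustration, not part of the original code. It always starts from pattern 0 rather than taking a starting pattern argument, and it treats `end` as inclusive, as the `BLS` loop in the question does.

```c
#include <stddef.h>

/* Hypothetical C counterpart of example3 (name and layout are assumptions).
   Scans the bytes in [start, end], end inclusive, and writes the three most
   frequent byte values to result as pattern/count pairs:
   result[0] = top1 pattern, result[1] = top1 count, and so on. */
void top3_patterns(const unsigned char *start, const unsigned char *end,
                   int result[6])
{
    int pattern[3] = {0, 0, 0};
    int count[3]   = {0, 0, 0};

    for (int p = 0; p <= 0xFF; p++) {
        /* Count how often byte value p occurs in the range. */
        int n = 0;
        for (const unsigned char *q = start; q <= end; q++) {
            if (*q == p)
                n++;
        }

        /* Find the first rank whose count n strictly beats.
           On a tie, the earlier pattern keeps its place. */
        int rank = 3;
        while (rank > 0 && n > count[rank - 1])
            rank--;

        if (rank < 3) {
            /* Shift lower ranks down one slot, then insert the new entry. */
            for (int i = 2; i > rank; i--) {
                pattern[i] = pattern[i - 1];
                count[i]   = count[i - 1];
            }
            pattern[rank] = p;
            count[rank]   = n;
        }
    }

    for (int i = 0; i < 3; i++) {
        result[2 * i]     = pattern[i];
        result[2 * i + 1] = count[i];
    }
}
```

Compiling this with `gcc -S` using an ARM-targeting GCC shows which values the compiler keeps in registers and which it places on the stack. The ranking lives in two small arrays here, so the compiler is free to choose either.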

If you run out of registers completely, your only option is to use the stack for local storage, and either push and pop registers as needed or allocate a bit of stack space and LDR/STR values from that area as required. However, in your case you are not calling any other functions from within your function, so there is no reason to avoid r0-r3 or r12, which are call-clobbered.

Note that the code you have presented contains at least one bug, because you are using r10-r11 and are not preserving their contents via the STMFD and LDMFD instructions.

cooperised
  • I like to recommend `gcc -O1` at least, because `-O0` treats everything as `volatile` for consistent debugging. The extra stores/reloads between every statement are a lot of noise to wade through that makes it hard to follow. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) has x86 examples, but everything there applies to ARM (https://godbolt.org/ has ARM gcc) – Peter Cordes Nov 07 '18 at 00:30
  • Probably a good call, at the very least worth a try for comparison with `-O0`. – cooperised Nov 07 '18 at 12:24