1

I'm trying to read a csr register using function macro

I have a struct array that contains name and address of csr registers

typedef struct csr_lists
{
    int address;
    const cahr* name;
} csr_lists;

csr_lists list[] = 
{
    {0xc00, "CYCLE"},
    ...
};

And the function I made:

#define csr_read(csr)                               \
({                                                  \
    register uint32_t v;                            \
    __asm__ __volatile__ ("csrr %0, %1"             \
                  : "=r" (v)                        \
                  : "n" (csr)                       \
                  : "memory");                      \
    v;                                              \
})

So it is called like uint64_t value = csr_read(list[i].address); or uint64_t value = csr_read(0xc00);

And the compiler gives me following errors

csr.h:68:2: error: asm operand 1 probably doesn’t match constraints [-Werror]
   68 |  __asm__ __volatile__ ("csrr %0, %1"    \
      |  ^~~~~~~
csr.c:127:25: note: in expansion of macro ‘csr_read’
  127 |         value[i] = csr_read(list[i].address);
      |  
csr.h:68:2: error: impossible constraint in ‘asm’
   68 |  __asm__ __volatile__ ("csrr %0, %1"    \
      |  ^~~~~~~
csr.c:127:25: note: in expansion of macro ‘csr_read’
  127 |         value[i] = csr_read(list[i].address);
      |   
cc1: all warnings being treated as errors
/mnt/d/project/riscv32-linux/buildroot-2021.02.10/output/host/lib/gcc/riscv32-buildroot-linux-gnu/9.4.0/../../../../riscv32-buildroot-linux-gnu/bin/ld: ./libhpm.so: undefined reference to `csr_read'

How can I fix this problem?

edit) The problem happens at these lines

#define MAX    7
for (i = 0; i < MAX; i++)
{
    value[i] = csr_read(list[i].address);
}
lemoncake
  • 41
  • 6
  • 1
    Compiles fine for me with GCC10 with a constant operand. https://godbolt.org/z/aoMMax138. Obviously an [`"n"` constraint](https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html) can only work with a compile-time constant since it has to become a literal number in the asm at compile-time, so `csr_read(list[i].address)` is going to require optimization enabled, and constant propagation of `i` to be possible, e.g. in a loop that gets fully unrolled. – Peter Cordes May 05 '22 at 09:14
  • @PeterCordes Thank you for the comment. But I still have the same problem even after I used -funoll-all-loops options. – lemoncake May 05 '22 at 12:27
  • Oh right, `csr_lists list[]` is a non-`const` global, so constant-propagation is only possible if you enabled `-flto` or maybe only `-fwhole-program`. Or better just make it `const` or `static const`. – Peter Cordes May 05 '22 at 12:33
  • @PeterCordes Sorry to bother you again. I have a question, -flto option works very well, but making list[] a const static global variable gives me same issues. gcc options I'm using is: -c -Wall -Werror -fpic. What could go wrong? – lemoncake May 05 '22 at 14:10
  • Those options don't include optimization, so I'm not surprised there isn't sufficient constant-propagation through a named variable. Unless `i` is also `const`. – Peter Cordes May 05 '22 at 14:11
  • @PeterCordes How could `i` be `const`? – lemoncake May 05 '22 at 14:16
  • IDK, it's your program. Your options are to enable optimization (with a `const` array), or use `const` on *everything* involved in the expression including the local vars. At `-O0`, all non-`const` variables are [similar to `volatile` as far as the optimizer is concerned](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point), defeating constant-propagation. I thought it was clear I was recommending building with `-Og` at least. – Peter Cordes May 05 '22 at 14:17
  • Works fine for me (https://godbolt.org/z/z8oYcsaaW) with optimization enabled and the array being a global `const`. – Peter Cordes May 05 '22 at 14:22
  • @PeterCordes I unrolled the loop using `#pragma`, sorry to not mention that. Anyway, using `#pragma` or `-funroll-all-loop` option changes nothing. I made `list` and `list.address` const, but not `i` and 'value' for they are used in for loop. Ah, thank you for the comment. optimization option worked for me as well, thank you – lemoncake May 05 '22 at 14:22
  • `-funroll-all-loops` doesn't override the `-O0` default and stop the compiler from treating all variables like `volatile`. [g++ O1 is not equal to O0 with all related optimization flags](https://stackoverflow.com/q/53175916). So everything in [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) still applies. Even if the compiler did unroll the loop without assuming any known length, there would still be iterations without a constant `i`. – Peter Cordes May 05 '22 at 14:27

2 Answers2

1

Maybe this snippet will be helpful to the next person looking at this.

I'm sure the original question has been well resolved by now. The main problem is you have to use a compile time constant for the 'csr' value, you can not use a register/variable to provide this. Which means you can not put it in a loop and get the CSR register number from a variable in C.

The specific ASM instruction that is generated is encoded to include the CSR register number, the CSR to work with is never taken from a general purpose register, it is encoded into the instruction itself.

Only the values (the contents of the CSR register) that are read from written to, always use a general purpose register. Allowing the value to be taken from or returned to, a variable in C.

The other matter is the use of the "memory" clobber constraint in the question is probably not what you want. This instructs the compiler to perform a barrier around the CSR access, but this contradicts a general claim (I am making here) that CSR access instructions have no side-effect that would alter memory in a way the compiler does not expect. Think of it more like an IO device, you are altering hardware state not memory state.

Most userspace of use of CSR instruction never need a memory barrier, like timers/counters/FPU-state and doing so will result in the compiler generating less optimal code. Due to excessive invalidation and reloading of memory state.

There will be specific situations where a CSR access does need a barrier, but they will be the exception and you should perform that explicitly using a separate C language statement. Rather than lower the performance of a macro primitive that has many use cases.

The macro include the usage information, since the compiler is helpful to provide the code with the compile error message and this will remind the user of this particular quirk of usage.

#define CSRR_READ(v, csr)                           \
/* CSRR_READ(v, csr):
 * csr: MUST be a compile time integer 12-bit constant (0-4095)
 */                                             \
__asm__ __volatile__ ("csrr %0, %1"             \
              : "=r" (v)                        \
              : "n" (csr)                       \
              : /* clobbers: none */ );
Darryl Miles
  • 4,576
  • 1
  • 21
  • 20
  • A `"memory"` clobber is appropriate if there are any CSR reads that should be ordered wrt. surrounding loads/stores. e.g. if there was a CSR that counted TLB misses or something. Note that private local variables aren't affected by a `"memory"` clobber (escape analysis can prove that they're not globally reachable, so arbitrary asm that didn't get their address passed to it can't access them.) So the store/reload cost is often not too high, depending on the details of the surrounding code. But sure, RISC-V has a lot of registers, so the compiler might have quite a few vars live in regs. – Peter Cordes Feb 24 '23 at 19:04
  • I might name it `CSRR_READ_NOBARRIER` to remind readers that this version isn't ordered wrt. non-volatile accesses. (If I recall, `asm volatile` is ordered wrt. access to `volatile int` etc. variables, not just other `asm volatile` statements). The potentially surprising thing should be explicit, to avoid the situation where someone's staring at the C source and not seeing the problem when trying to debug something caused by a store happening later than in the source. CSR writes are probably much more likely to need ordering than reads. – Peter Cordes Feb 24 '23 at 19:10
  • I might use an explicit memory ordering statement `BARRIER();`, before and/or after the use of a CSR access in those limited scenarios where it is necessary, and doing this better documents exactly what is happening and important around this particular CSR access. But regular U-mode use cases, reading timers,FPU state,time-of-day where the majority of CPU time is spent don't need it baked into their macros. Most people don't write operating systems dealing with SATP/PMP/TLB invalidation and how memory ordering maybe relevant, this should not colour all other uses of CSR is a point to consider. – Darryl Miles Feb 25 '23 at 18:26
  • 1
    I agree with not baking it in to a macro, I'm just saying that having the macro name remind users that it's not ordered is probably a good thing to avoid corner cases. Maybe not in the name if you don't expect any user-space uses to need ordering, but at least in a comment in the macro definition as a reminder in case it ever becomes a problem. – Peter Cordes Feb 25 '23 at 18:28
-1

Here is my code. actually, the code should changed on runtime. "csrr" instruction is 0x~~~02573 and MSB is ID of CSR (12bits) So, I tried change the code area to change CSR address.

.highcode means non-flash in my linker script.

__attribute__((naked))
__attribute__((section(".highcode")))
uint32 crs_Read()
{
    uint32 nRet;
    asm volatile("csrr %0, mtvec":"=r"(nRet));
    asm volatile("ret");
    return nRet;
}


void mcu_CsrRead(uint8 argc, char* argv[])
{
        uint32 nAddr = CLI_GetInt(argv[1]);
        *((uint32*)crs_Read) = (nAddr << 20) | 0x02573;
        asm volatile("fence");
        uint32 nVal = crs_Read();
}
Engin
  • 1
  • 1
  • It's officially unsupported to put anything except a Basic asm statement in a `naked` function. And clang refuses to compile it at all. I assume this doesn't work in a debug build, since it would use stack space without reserving it, and I don't think the standard RISC-V calling convention has a red zone. Oh actually GCC copies to `s1` and back, violating the calling convention. And doesn't emit a `ret` for `return nRet`, only a `mv` to `a0` (the return-value register) so this is super broken: execution falls off the bottom of the function. https://godbolt.org/z/PE66hqYTj – Peter Cordes May 25 '23 at 12:41
  • https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html#index-naked-function-attribute_002c-RISC-V is the GCC documentation. clang 16 errors out with `error: non-ASM statement in naked function is not supported`, for good reason. Allocate an executable buffer somewhere and store this instruction + a `ret` in it, and call it through a function pointer. Since you're always going to store new machine code before calling it, you don't need inline asm to set its initial contents. Or if you do want to do that, use `asm("csrr a0, mtvec; ret");` as the body of the `naked` function. – Peter Cordes May 25 '23 at 12:45
  • @PeterCordes Yes you are right! 'ret' is needed. (buggy code...) Anyway this code is running correctly in riscv-none-embed-gcc (xPack GNU RISC-V Embedded GCC i386) 10.2.0 – Engin May 25 '23 at 13:42
  • You can [edit] your answer to fix it. – Peter Cordes May 25 '23 at 13:53