
My understanding of the ARM MMU is limited, and I am trying to understand how the page table is organised.

The page table is created at system boot time and can be thought of as a linear, one-dimensional array in which each entry is 4 bytes long, but I don't understand what a 1 MB section corresponds to in the page table.

If we have three levels of page table, would the first-level page table (PGD) contain this 1 MB section, or would the 1 MB section only be part of the third-level page table (PTE)?

Also, what is meant by the following statement:

The start of the page table is 0x40200000, each entry is 4 bytes long and each entry corresponds to a 1MB section of memory. The first entry in the page table (0x40200000) represents what happens when you try to access memory between the range of 0x00000000 to 0x00100000, the second entry (4 bytes into the table at 0x40200004) represents the memory range 0x00100000 to 0x00200000, etc.

Therefore to find the page table entry associated with the area of memory 0x40200000 to 0x40200000+1MB (which is MB number 0x402 of memory), you need to traverse the page table list by 0x402 entries, but each entry is 4 bytes long, therefore you need to multiply this number by 4.

Amit Singh Tomar
  • Technically, you wouldn't have 1MB sections at all with 3 levels of page table - 1MB sections implies the "classic" 2-level short descriptor format, whereas 3 levels implies LPAE long descriptor format, where the block sizes are 2MB and 1GB. There could also be some ambiguity here as to whether "section" is being used in the more general sense of "region", or the specific ARM terminology for "a level 1 block entry" - a level 1 table entry still describes a 1MB region of memory, but points to a level 2 table which further divides that 1MB into 4KB pages. – Notlikethat Oct 02 '15 at 21:35
  • Thanks for your comments. Is it right to say that the first-level page table could have a 1 GB block/section as an entry that maps directly to a physical address, in which case the second and third levels can be skipped, or that a first-level entry could instead be a pointer to a second-level table? – Amit Singh Tomar Oct 02 '15 at 22:41
  • These are related questions: [How does ARM emulate the dirty bit...](http://stackoverflow.com/questions/32943129/how-does-arm-linux-emulate-the-dirty-accessed-and-file-bits-of-a-pte) and [Page table entry (PTE) descriptor in Linux kernel for ARM](http://stackoverflow.com/questions/16909101/page-table-entry-pte-descriptor-in-linux-kernel-for-arm). You need to specify which PTE you mean (the Linux one or the ARM one), whether you are using LPAE (large address space), and probably which particular architecture. You tagged **linux-kernel**, and where the information came from can make it mean different things. – artless noise Oct 06 '15 at 15:43
  • Thanks @artlessnoise for looking into it. I am not aware of the difference between the Linux and ARM PTE, so I was not sure about these tags. – Amit Singh Tomar Oct 07 '15 at 07:10

1 Answer


First off, there is a table at a known physical address in RAM that drives the MMU. When reading the docs, I prefer the diagram versions to the tables.

When the MMU is enabled and there is an access from the processor (a fetch or a data access, a read or a write), some number of bits from the virtual address are extracted and added to the base address of the MMU table above. The contents of that memory location in the MMU table tell the MMU what to do with the access. For some entries that is the only table access; for others you have to do a second table access, combining some bits from the first-level entry with more bits from the virtual address to locate the second entry.

At the end of the day you have this MMU table with a base address (which must be aligned on some boundary). Bits from the virtual address of a processor access are used to index into that table. The final value from the table supplies the replacement address bits that convert the virtual address to a physical address, plus some control bits that indicate cacheable or not, permissions and such things. They apply terminology like 1MB or whatever based, ideally, on the amount of RAM that one entry in the table controls. But of course the 16MB feature is misleading, because you need 16 1MB entries in order to use it (if you ignore the 16MB terminology and strictly look at the address bits and where they come from or go to, you know from that how much RAM is controlled by each MMU table entry).
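
To make the "replacement address bits" idea concrete, here is a minimal sketch of how a 1MB section entry could be decoded in software. It assumes the ARMv6/ARMv7 short-descriptor layout (type in bits [1:0], supersection flag in bit 18, section base in bits [31:20]); the macro and function names are made up for illustration, so check the bit positions against the docs for your core.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ARMv6/ARMv7 short-descriptor first-level entry layout.
 * All names here are hypothetical, chosen for this sketch only. */
#define L1_TYPE_MASK      0x3u         /* bits [1:0] select the entry type   */
#define L1_TYPE_SECTION   0x2u         /* 2'b10: section or supersection     */
#define L1_SUPERSECTION   (1u << 18)   /* bit 18 set: 16MB supersection      */
#define SECTION_BASE_MASK 0xFFF00000u  /* bits [31:20]: physical section base */
#define SECTION_OFFS_MASK 0x000FFFFFu  /* bits [19:0]: offset within 1MB     */

/* True if the entry describes a plain 1MB section. */
static bool is_section(uint32_t desc)
{
    return (desc & L1_TYPE_MASK) == L1_TYPE_SECTION &&
           (desc & L1_SUPERSECTION) == 0;
}

/* Replace the top 12 address bits with those from the entry and
 * keep the low 20 bits of the virtual address unchanged. */
static uint32_t section_translate(uint32_t desc, uint32_t va)
{
    return (desc & SECTION_BASE_MASK) | (va & SECTION_OFFS_MASK);
}
```

For example, with an entry value of 0x80000C02 a virtual address of 0x00012345 would come out as physical 0x80012345: only the top 12 bits change, which is why one entry controls exactly 1MB.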

So what they are saying is that, for some table that someone created at some point in time, the physical address of the table is 0x40200000. The 1MB-size entries are 4 bytes, or one word, wide, and AFAIK all the entries in the MMU table are one word wide. (For ARM I think you look at the lower two bits of the entry to figure out what type it is, but I would have to look; different architectures added different features, so there is no one rule that fits all, and you have to look at the docs for your core.) So they are saying that one particular entry is for the 1MB of virtual address space from 0x00000000 to 0x000FFFFF, and the MMU table maps it to some physical address. The next entry they are showing you is for the 1MB of virtual address space from 0x00100000 to 0x001FFFFF, and so on.
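
The arithmetic in the quoted statement can be sketched as follows. This is only an illustration under the assumption of 1MB-per-entry first-level lookups with 4-byte entries; `TTB_BASE` is the table address from the quote, and the function names are invented for the example.

```c
#include <stdint.h>

/* Table base taken from the quoted example; names are hypothetical. */
#define TTB_BASE 0x40200000u

/* VA[31:20] selects which 1MB region, and so which entry, is used. */
static uint32_t l1_index(uint32_t va)
{
    return va >> 20;
}

/* Each entry is one 32-bit word, so the byte offset is index * 4. */
static uint32_t l1_entry_addr(uint32_t ttb, uint32_t va)
{
    return ttb + (l1_index(va) << 2);
}
```

So `l1_entry_addr(TTB_BASE, 0x00000100)` is 0x40200000 (the first entry), `l1_entry_addr(TTB_BASE, 0x00100000)` is 0x40200004 (the second), and for the table's own address 0x40200000 the index is 0x402, giving 0x40200000 + 0x402*4 = 0x40201008.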

old_timer
  • OK, thanks for your answer. Can I think of each one-word (32-bit) entry as containing a pointer to a 1 MB section in main memory (a physical address)? If the CPU generates the virtual address 0x00000100, is it mapped into that 1 MB section in main memory, and can it be referenced from the first entry in the page table? – Amit Singh Tomar Oct 02 '15 at 20:23
  • So, picking an ARMv6 TRM at random: VA[31:20] determines the index into the word-sized level-one page table. For address 0x00000100, VA[31:20] is all zeros, so the first entry in the table is for that 1MB address space. The lower two bits and bit 18 (in the one I am looking at) determine what to do from there. If the lower two bits of the data in the MMU table entry are 2'b10 and bit 18 is 0, then it is a 1MB section; if bit 18 is 1, then it is a 16MB supersection, but you need 16 entries. Anyway, for 2'b10 in the lower bits, the physical address bits and the control bits come from that word. – old_timer Oct 02 '15 at 21:13
  • for address 0x001xxxxx then VA[31:20] is 0x001 so that is the second entry in the mmu table (offset VA[31:20]*4 = 0x4) – old_timer Oct 02 '15 at 21:14
  • If you think of the MMU table as an array of (32-bit) words, then VA[31:20] is the index into that array: 0x000 is index 0, or word[0], the first one; index 1 is word[1], the second one; and so on... – old_timer Oct 02 '15 at 21:15
  • If, for example, using the TRM I am looking at (they do vary, and these are legacy modes), the first lookup entry at VA[31:20] has the lower two data bits being 2'b01, then you have to look at the coarse page table. It appears you take data bits [31:10] from the first lookup, those are the offset into the MMU table, and you add VA[19:12] to that offset. The lower two bits of that second entry tell you the size of the page, and the entry holds the replacement physical address bits and so on. – old_timer Oct 02 '15 at 21:19
  • for your first time using an (ARM) mmu I highly recommend using the 1MB page entries – old_timer Oct 02 '15 at 21:20
  • Thanks, it helped. One more doubt: if the page table is represented as a three-level hierarchy, would the PGD contain these 1 MB section references/pointers? – Amit Singh Tomar Oct 02 '15 at 21:27
  • My experience with ARM is that *ALL* of the page table entries are in the same physical memory space, the 0x40200000 memory in your example. When you choose an offset for the second- and third-level entries, you have to choose offsets in this same MMU table that are not being used, or going to be used, by something else. So, for example, if you want to leave room for all the possible addresses in the ARM, then the first 4096 entries, or 16 Kbytes, are first-level lookups, and your secondary entries would have to be somewhere after that. For each secondary and tertiary entry – old_timer Oct 02 '15 at 21:39
  • you need to leave enough room for the worst case. A coarse page table is indexed by VA[19:12], so FOR EACH one the worst case is 256 entries, or 1 Kbyte, and that 1 Kbyte has to be somewhere after your first-level entries. I don't remember if it has to be aligned on a 1K boundary. In the diagram I am looking at there are no third-level table entries; after the first or second lookup in the table the MMU can deliver a physical address and permissions. – old_timer Oct 02 '15 at 21:43
  • OK. What I understood is that the first-level page table will be allocated from 0x40200000 and occupy 16 Kbytes, and the second- and third-level tables will be allocated after that, right? Which diagram are you looking at? – Amit Singh Tomar Oct 02 '15 at 21:47
  • My guess is this is why, if for example you want to use mmap on an ARM to punch through from the application layer to physical address space, you are more likely to get a pointer if you ask for a region that is 1MB sized and aligned, or a multiple of 1MB, than if you ask for access to, say, a 4096-byte space. Even if your peripheral is only one address or a small number of addresses, ask for the whole 1MB range around it and your mmap() tends to succeed more often. Just a theory; I have not dug into why the smaller requests fail. – old_timer Oct 02 '15 at 21:49
  • Figure 5-5 Backwards-compatible section, supersection, and page translation from the mpcore trm – old_timer Oct 02 '15 at 21:50
  • No, you can think of it that way, but the MMU simply takes that 0x40200000 base address and adds on (or does it *OR* on?) the remaining parts. Initially VA[31:20]<<2 is added to 0x40200000; then, if the lower bits are 01, the MMU looks at 0x40200000+(L1D[31:10]<<2)+(VA[19:12]<<2). You as the programmer get to pick the L1D[31:10] bits so that they do not hit any other entries (or maybe you want them to hit another entry, but whatever) in the 0x40200000 space. – old_timer Oct 02 '15 at 21:54
  • It is probably wise to avoid the first 16K if you are on a system that has that much to spare, in a place where you can find an aligned 16K that is out of the way of the heap, stack, bss, data and text, ideally in fast on-chip RAM. But you probably don't have 100% of the address space consumed; there are probably large gaps, so you could allow for some number of smaller blocks and re-use some of the holes in the 16K, depending on how desperate for space you are. It is not hard to do an experiment to prove me right or wrong here. – old_timer Oct 02 '15 at 21:57
  • OK, this may be a totally different question, but where does this HYP page table get allocated from? Would it be referenced from PET's? – Amit Singh Tomar Oct 02 '15 at 21:57
  • HYP? PET? not sure what you mean – old_timer Oct 02 '15 at 22:55
  • By HYP I meant the Hypervisor-mode page table, and by PTE I meant the third-level page table. – Amit Singh Tomar Oct 03 '15 at 13:54
  • I am (re-)writing an MMU example for the Raspberry Pi, and hopefully I have not led you off track. It currently appears to me that the first-level entry contains the complete top portion of the second-level table address; it is NOT an offset from the translation table base. So you can still place the second-level table anywhere that makes sense IMO (where something else isn't), and if you are conserving space you can fill in gaps in the first-level table for address spaces that will never be used. – old_timer Oct 13 '15 at 02:00
  • Grr, even worse, you wanted arm64, yes? I mentioned I was looking at an ARMv6 (32-bit ARM) TRM, so now I see why I didn't understand or recognise the terms you were using. If this legacy MMU discussion was not useful then we can wipe out my answer and all of these comments and start over. – old_timer Oct 13 '15 at 02:01
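
The coarse-table lookup discussed in the comments above can be sketched like this. It follows the corrected reading from the last comments, where bits [31:10] of the first-level entry hold the physical base of the 1 Kbyte second-level table rather than an offset from the first-level table base. The layout assumed is the ARMv6/ARMv7 short-descriptor format with 4KB small pages; the function names are hypothetical and the sketch does not check descriptor type bits.

```c
#include <stdint.h>

/* Address of the second-level entry for va, given a first-level
 * entry that points at a coarse page table (assumed layout:
 * bits [31:10] = physical base of the 256-entry, 1KB table). */
static uint32_t l2_entry_addr(uint32_t l1_desc, uint32_t va)
{
    uint32_t l2_base  = l1_desc & 0xFFFFFC00u; /* drop type/control bits */
    uint32_t l2_index = (va >> 12) & 0xFFu;    /* VA[19:12]              */
    return l2_base | (l2_index << 2);          /* 4 bytes per entry      */
}

/* Physical address for a 4KB small page entry (assumed layout:
 * bits [31:12] = page base, VA[11:0] = offset inside the page). */
static uint32_t small_page_translate(uint32_t l2_desc, uint32_t va)
{
    return (l2_desc & 0xFFFFF000u) | (va & 0x00000FFFu);
}
```

With a first-level entry of 0x40204001 and a virtual address of 0x00103456, the second-level entry would be read from 0x4020400C; if that entry were 0x80005002, the access would land at physical 0x80005456.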