17

Say I allocate a large block of memory (about 40 MB) with mmap using /dev/zero, as follows.

fd = open("/dev/zero", O_RDWR);
a = mmap(NULL, 4096 * 10000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FILE, fd, 0);

My understanding is that the kernel zero-fills pages as they are brought into physical memory (I assume modern Linux kernels use demand paging). For example, when the first page is touched and therefore brought into physical memory, the kernel initializes all 4096 of its bytes to zero; when the second page is touched, it does the same, and so on.
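The observable side of this behaviour can be checked from user space. The following is a minimal sketch, not a definitive test: it assumes a Linux-like system where /dev/zero is accessible, and the helper names `map_dev_zero`, `all_zero`, and `check_zero_fill` are made up for illustration. It cannot show when the kernel zeroes a page, only that each page reads as zero the first time it is touched and that writing to one page leaves the others zero.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes privately from /dev/zero; returns MAP_FAILED on error. */
void *map_dev_zero(size_t len)
{
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return MAP_FAILED;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p;
}

/* Return 1 if every byte in [buf, buf + len) is zero, otherwise 0. */
int all_zero(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Touch two pages of a fresh mapping.
 * Returns 1 if zero-fill was observed, 0 if not, -1 if the mapping failed. */
int check_zero_fill(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 10000 * page; /* roughly 40 MB with 4096-byte pages */
    unsigned char *a = map_dev_zero(len);
    if (a == MAP_FAILED)
        return -1;

    int ok = all_zero(a, page);            /* first read of page 0 */
    a[0] = 0xAB;                           /* first write to page 0 */
    ok = ok && a[0] == 0xAB && all_zero(a + 1, page - 1);
    ok = ok && all_zero(a + page, page);   /* page 1 has not been written */

    munmap(a, len);
    return ok;
}
```

Calling `check_zero_fill()` from a small `main` and printing its return value is enough to try it out.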

Is my understanding correct?

NPE
MetallicPriest
  • As long as touch means write, yes. Otherwise, if you just read from a zero page, there is actually a single zero page, that will be COW'ed on first write. – ninjalj Dec 14 '11 at 21:25
  • 2
    This will fail hard on Android 5.0, because SELinux will block such weirdness. That's why Inkscape in chroot does not start on Android. – pelya Mar 25 '15 at 18:36

2 Answers

14

Yes. What's more, the pages are only brought into existence when you touch them for the first time.

BTW, there is the flag MAP_ANONYMOUS that you can use in this situation, so there is no need to open a file descriptor on /dev/zero.
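As a rough illustration of the MAP_ANONYMOUS variant, here is a minimal sketch. It assumes Linux with glibc, where `_DEFAULT_SOURCE` exposes the flag; the helper name `map_anonymous` is hypothetical. Passing -1 as the file descriptor is the portable convention for anonymous mappings.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* assumption: glibc, needed for MAP_ANONYMOUS in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Request `len` bytes of private, zero-filled memory with no backing file.
 * Returns MAP_FAILED on error, exactly like mmap() itself. */
void *map_anonymous(size_t len)
{
    assert(len > 0); /* mmap() rejects a zero length with EINVAL */
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

The returned region is used just like the /dev/zero mapping and is released with munmap(); no open() or close() call is involved.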

Jens Gustedt
  • MAP_ANONYMOUS also makes the memory 0? What if I don't want the memory to be zero, so as to avoid the overhead of initializing memories to zero? What should I use then? – MetallicPriest Dec 14 '11 at 17:00
  • 7
    @MetallicPriest Nothing. Any uninitialized non-file-backed memory would be a potential leak of the contents of some other process's or the kernel's memory. – ephemient Dec 14 '11 at 17:05
  • 3
    Use /dev/urandom instead of /dev/zero if you don't want the space filled with zeroes? :) The kernel has no business giving you memory which might contain leftovers from someone else's program running in the same memory - it is a security risk. – Jonathan Leffler Dec 14 '11 at 17:28
  • 1
    @MetallicPriest, in essence there is no overhead for the kernel to create a whole page of zeros. This is not `memset` or something like this but usually much more efficient. – Jens Gustedt Dec 14 '11 at 17:36
  • @MetallicPriest: there is no way around that. Anyway, initializing a page to 0 is pretty fast, and since you'll probably be writing more things to the page you've just written to, you should consider that cost as amortized over all the writes you do. Worst case would be writing a single byte at each page (so each write costs a lot, but since it is done by the kernel, it won't show up in your user time). – ninjalj Dec 14 '11 at 21:24
  • 2
    'kernel to create a whole page of zeros. This is not memset' - are you sure? to create a readable zero page - yes, there is no memset. But when you will try to write to yet-not-written "zero" page, the pagefault handler will call memset: [clear_page func](http://lxr.linux.no/#linux+v3.1.5/arch/x86/include/asm/page_32.h#L39); called from: handle_mm_fault->handle_pte_fault->do_anonymous_page->alloc_zeroed_user_highpage_movable->clear_user_highpage-> clear_user_page->clear_page – osgx Dec 15 '11 at 16:16
  • 1
    or called from handle_mm_fault->handle_pte_fault->do_wp_page-> alloc_zeroed_user_highpage_movable->clear_user_highpage-> clear_user_page->clear_page – osgx Dec 15 '11 at 16:25
  • 2
    @Jonathan Leffler, The kernel WON'T give someone else's page (aka a page from some other process) to any other process. I think you `did` mean this but your statement makes it sound that the user can enforce that option with some sysctl/proc entry if he/she wants to.. I personally would go with -1 instead of any file descriptor for anonymous mmap. Why'd one want to use a zero source (which is what `/dev/zero` essentially is) as the virtual page that gets "paged in" anonymously is already zeroed I believe. In fact reading `/proc//maps` we see that the anon mapped pages are zeroed already. – AjB May 14 '15 at 06:42
  • MAP_ANONYMOUS is just a flag that tells OS (at least on Linux) to ignore fd and offset argument. – jclin Mar 24 '16 at 00:00
-1

Neither a /dev/zero mapping nor a MAP_ANONYMOUS mapping is backed by a file, and both start out with zeroed content. See the mmap(2) man page: http://man7.org/linux/man-pages/man2/mmap.2.html