Memory allocation for user processes is lazy: the kernel sets up the virtual mapping first and defers physical allocation to page faults (demand paging). On a fault, the buddy allocator supplies the physical page. This enables overcommitment and avoids spending RAM on memory that is never touched.
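User-space malloc (via brk/mmap) only reserves virtual address space, which the kernel records in a VMA. The kernel does not populate page table entries for the new range; the PTEs stay non-present (often the page tables themselves do not exist yet). No physical RAM is used until an access triggers #PF (page fault).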
// User malloc (C)
int *ptr = malloc(4096);  // Virtual alloc; a fresh page has no physical backing yet
*ptr = 42;                // First touch of a fresh page triggers a page fault, then physical alloc
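The deferred allocation can be observed from user space. The following is a minimal sketch, assuming Linux and its mincore() call; the helper names resident_pages and touch_and_count are made up for this example. It maps anonymous memory, writes to some of the pages, and counts how many pages are resident afterwards.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the resident pages in [addr, addr + len).
 * addr must be page-aligned. Returns -1 on error. */
long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = (len + page - 1) / page;
    unsigned char *vec = malloc(n);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;    /* bit 0 set: page is resident */
    free(vec);
    return count;
}

/* Map npages of anonymous memory, write to the first `touch` pages,
 * and return how many pages are resident afterwards (-1 on error). */
long touch_and_count(size_t npages, size_t touch)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    long resident;

    if (p == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < touch && i < npages; i++)
        p[i * page] = 1;        /* first write faults the page in */
    resident = resident_pages((void *)p, len);
    munmap((void *)p, len);
    return resident;
}
```

On a typical Linux system touch_and_count(16, 0) should report 0 and touch_and_count(16, 3) should report 3. Transparent huge pages or a sandboxed kernel can change the counts, so treat the numbers as indicative.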
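On access, the CPU raises #PF (exception vector 14) and stores the faulting address in CR2. The arch handler (e.g., do_page_fault on x86) looks up the VMA (vm_area_struct) covering the address. If the access is valid, it calls handle_mm_fault, which allocates a physical page and updates the PTE; an invalid access gets SIGSEGV instead. Lazy allocation means only pages that are actually used cost RAM.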
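# Page fault handler (assembly snippet, AT&T syntax; simplified, not the real kernel entry code)
page_fault:
    push %rax              # Save regs (a real handler saves all of them)
    push %rdi
    mov %cr2, %rdi         # Fault address in CR2, passed as first argument
    call do_page_fault
    pop %rdi
    pop %rax
    add $8, %rsp           # Drop the error code the CPU pushed for #PF
    iretq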
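The fault, fix, retry cycle has a user-space analogue that is easier to experiment with. The sketch below assumes Linux; lazy_fault_handler and lazy_region_demo are hypothetical names. A SIGSEGV handler stands in for the kernel handler, si_addr plays the role of CR2, and mprotect() stands in for installing a PTE. This is an illustration of the control flow, not how the kernel itself does it.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;  /* faults handled so far */
static uintptr_t region_base;              /* start of the lazy region */
static size_t region_len;                  /* length of the lazy region */
static size_t page_size;

static void lazy_fault_handler(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;   /* faulting address */

    (void)sig;
    (void)ctx;
    /* Like a fault outside any VMA: not ours, so terminate. */
    if (addr < region_base || addr >= region_base + region_len)
        _exit(1);
    addr &= ~((uintptr_t)page_size - 1);         /* round down to page start */
    /* "Allocate" the page by granting access to it. */
    if (mprotect((void *)addr, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    fault_count++;
    /* Returning restarts the faulting instruction, which now succeeds. */
}

/* Reserve npages without access rights, write to the first `touch` pages
 * twice, and return the number of faults handled (-1 on error). */
int lazy_region_demo(size_t npages, size_t touch)
{
    struct sigaction sa, old;
    volatile char *p;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_len = npages * page_size;
    p = mmap(NULL, region_len, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    region_base = (uintptr_t)p;
    fault_count = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = lazy_fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap((void *)p, region_len);
        return -1;
    }
    for (int pass = 0; pass < 2; pass++)
        for (size_t i = 0; i < touch && i < npages; i++)
            p[i * page_size] = 1;   /* faults on the first pass only */

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)p, region_len);
    return (int)fault_count;
}
```

Each page faults once, on its first write; the second pass runs without faults because the page is already accessible. Calling mprotect() from a signal handler is common practice on Linux but is not guaranteed by POSIX.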
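Physical allocator (buddy system in mm/page_alloc.c) splits/merges power-of-2 blocks of pages, kept on per-order free_area lists in each zone. On fault, the kernel allocates through alloc_pages(), which ends up in __alloc_pages(). The related __get_free_pages() wrapper returns a kernel virtual address instead of a struct page. Zones: DMA, DMA32, Normal, Movable; HighMem exists on 32-bit only.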
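// Kernel physical alloc (C)
struct page *page = alloc_pages(GFP_KERNEL, 0);  // Order 0: single page (user faults use GFP_HIGHUSER_MOVABLE)
if (!page)
    return -ENOMEM;                              // Allocation can fail
phys_addr_t phys_addr = page_to_phys(page);      // PFN << PAGE_SHIFT, without overflow on 32-bit PAE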
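The split/merge arithmetic of a buddy system is small enough to show in isolation. This is a user-space sketch with hypothetical helper names (buddy_pfn, merged_pfn, order_for_pages), not kernel code: a block of order k covers 2^k pages and is aligned to 2^k, so a block and its buddy differ only in bit k of the PFN.

```c
#include <assert.h>

/* PFN of the buddy of the order-`order` block that starts at pfn. */
unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
    assert((pfn & ((1UL << order) - 1)) == 0);  /* block must be aligned */
    return pfn ^ (1UL << order);                /* flip bit `order` */
}

/* Start PFN of the order+1 block formed when a block merges with its buddy. */
unsigned long merged_pfn(unsigned long pfn, unsigned int order)
{
    return pfn & ~(1UL << order);               /* the lower of the two buddies */
}

/* Smallest order whose block holds at least npages pages. */
unsigned int order_for_pages(unsigned long npages)
{
    unsigned int order = 0;

    while ((1UL << order) < npages)
        order++;
    return order;
}
```

For example, a request for 5 pages needs an order-3 block (8 pages), and the order-2 block at PFN 12 has its buddy at PFN 8; merging the two yields the order-3 block at PFN 8.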
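After alloc, the kernel writes a PTE (page table entry) holding the physical frame number (PFN) and flags (present, writable, user, accessed, dirty). x86-64: 4-level paging (PML4, PDPT, PD, PT), or 5-level with LA57. Kernel code walks the tables with the pgd/p4d/pud/pmd/pte helpers.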
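// Set PTE (C); simplified, real code takes the PTE lock via pte_offset_map_lock()
pte_t *pte = pte_offset_map(pmd, addr);
pte_t entry = pte_mkyoung(pte_mkdirty(mk_pte(page, prot)));  // PFN + protection, marked accessed and dirty
set_pte_at(mm, addr, pte, entry);                            // Install the entry
pte_unmap(pte);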
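To make the table walk and the PTE layout concrete, here is a user-space sketch of the x86-64 4-level layout with 4 KiB pages: 9 index bits per level, a 12-bit page offset, and the PFN stored from bit 12 upward. The names split_va, make_pte, pte_to_pfn and the PTE_* constants are invented for this example; the bit positions are those of the x86-64 architecture.

```c
#include <stdint.h>

/* Indices into each paging level plus the offset within the page. */
struct va_parts {
    unsigned pml4, pdpt, pd, pt, offset;
};

/* Split a virtual address: bits 47-39, 38-30, 29-21, 20-12, 11-0. */
struct va_parts split_va(uint64_t va)
{
    struct va_parts p;

    p.offset = (unsigned)(va & 0xFFF);
    p.pt     = (unsigned)((va >> 12) & 0x1FF);
    p.pd     = (unsigned)((va >> 21) & 0x1FF);
    p.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    p.pml4   = (unsigned)((va >> 39) & 0x1FF);
    return p;
}

/* x86 PTE flag bits. */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)

/* Combine a PFN with flag bits into a 64-bit PTE value. */
uint64_t make_pte(uint64_t pfn, uint64_t flags)
{
    return (pfn << 12) | flags;
}

/* Extract the PFN (bits 51-12) from a PTE value. */
uint64_t pte_to_pfn(uint64_t pte)
{
    return (pte & 0x000FFFFFFFFFF000ULL) >> 12;
}
```

The kernel hides these shifts behind the pgd/p4d/pud/pmd/pte helpers so that the same code works across architectures and paging depths.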
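Lazy alloc allows overcommit (more virtual memory promised than RAM plus swap can back). If a fault cannot be satisfied even after reclaim, the OOM killer (out_of_memory) selects and kills a process. Sysctl vm.overcommit_memory tunes behavior: 0 = heuristic (default), 1 = always allow, 2 = strict accounting against CommitLimit.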
# Tune overcommit (shell, as root)
echo 1 > /proc/sys/vm/overcommit_memory   # 1 = always allow overcommit
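In strict mode (vm.overcommit_memory=2) the kernel refuses allocations that would push committed memory past CommitLimit. The sketch below follows the formula given in the kernel's /proc documentation, CommitLimit = (RAM - hugetlb) * overcommit_ratio / 100 + swap. The function names are hypothetical, and the admission check is simplified: the real accounting also holds back admin and user reserves.

```c
/* CommitLimit in kB for strict overcommit accounting. */
unsigned long long commit_limit_kb(unsigned long long ram_kb,
                                   unsigned long long hugetlb_kb,
                                   unsigned long long swap_kb,
                                   unsigned int ratio_percent)
{
    return (ram_kb - hugetlb_kb) * ratio_percent / 100 + swap_kb;
}

/* Simplified admission check: 1 if the request keeps the total
 * committed memory below the limit, 0 otherwise. */
int would_admit(unsigned long long committed_kb,
                unsigned long long request_kb,
                unsigned long long limit_kb)
{
    return committed_kb + request_kb < limit_kb;
}
```

With 8 GiB of RAM, 2 GiB of swap, and a ratio of 50, the limit is 4 GiB + 2 GiB = 6 GiB. In mode 2 a request that fails this check gets ENOMEM from malloc/mmap up front rather than an OOM kill later.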
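Related: fork uses COW (copy-on-write). Parent and child share physical pages, with private writable pages marked read-only in both. The first write faults; the kernel allocates a new physical page, copies the data, and maps the copy writable. Pages that are never written are never copied.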
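// Fork COW fault (simplified)
if ((vmf->flags & FAULT_FLAG_WRITE) && pte_present(pte) && !pte_write(pte)) {
    // Write to a present, read-only PTE in a writable VMA:
    // handle COW (do_wp_page): alloc new page, copy, update PTE
}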
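The visible effect of COW is that a write in the child never reaches the parent's copy. A minimal user-space sketch, assuming a POSIX system; cow_isolation_demo is a hypothetical name. The copy itself happens inside the kernel and cannot be seen directly, only its result.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child overwrite a heap value, and report what each
 * process sees. Returns 0 on success, -1 on error. */
int cow_isolation_demo(int *parent_sees, int *child_sees)
{
    volatile int *value = malloc(sizeof *value);
    int status;
    pid_t pid;

    if (value == NULL)
        return -1;
    *value = 1;
    pid = fork();                /* page is now shared, read-only in both */
    if (pid < 0) {
        free((void *)value);
        return -1;
    }
    if (pid == 0) {
        *value = 2;              /* write fault: child gets a private copy */
        _exit(*value);           /* report the child's view via exit status */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        free((void *)value);
        return -1;
    }
    *child_sees = WEXITSTATUS(status);
    *parent_sees = *value;       /* parent's page was not modified */
    free((void *)value);
    return 0;
}
```

After the call the parent still reads 1 while the child saw 2, which is what the write-fault path guarantees.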
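For anonymous memory, a read fault on an untouched page maps the shared zero page read-only. A later write fault replaces it with a new zeroed physical page; if the first access is a write, the zeroed page is allocated directly. Reading untouched memory therefore costs no RAM and no zeroing time.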
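// Zero page (global); the declaration is arch-specific, this is the x86 form
extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
// ZERO_PAGE(vaddr) yields the corresponding struct page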
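User space sees the zero page only through its effect: fresh anonymous memory always reads as zero. The sketch below, assuming Linux (nonzero_bytes_in_fresh_mapping is a made-up name), maps a region read-only and scans it. Whether those reads are served by the shared zero page is a kernel detail this code cannot observe.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map npages of fresh anonymous memory read-only and count the
 * bytes that are not zero. Returns -1 on error. */
long nonzero_bytes_in_fresh_mapping(size_t npages)
{
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
    const unsigned char *p = mmap(NULL, len, PROT_READ,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    long nonzero = 0;

    if (p == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < len; i++)
        nonzero += (p[i] != 0);   /* every read fault yields zeroes */
    munmap((void *)p, len);
    return nonzero;
}
```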
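Minor fault: resolved without I/O (page already in the page cache or swap cache, or a fresh anonymous page). Major fault: the page must be read from disk or swap. Swap-in allocates the target page from the buddy allocator. File-backed mappings are filled through vm_ops->fault.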
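// Fault types: a major fault is reported in the return value (signature as in recent kernels)
vm_fault_t ret = handle_mm_fault(vma, addr, flags, regs);
if (ret & VM_FAULT_MAJOR) {
    // Page needed disk I/O (swap-in or file read)
}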
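A process can read its own minor and major fault counters with getrusage(). The sketch below assumes Linux; faults_while_touching and struct fault_counts are hypothetical names. It measures how many faults of each kind occur while touching fresh anonymous pages.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

struct fault_counts {
    long minor;   /* faults resolved without I/O */
    long major;   /* faults that required I/O */
};

/* Touch npages of fresh anonymous memory and report the faults taken
 * meanwhile. Returns 0 on success, -1 on error. */
int faults_while_touching(size_t npages, struct fault_counts *out)
{
    struct rusage before, after;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap((void *)p, len);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;          /* first write to each page faults */
    if (getrusage(RUSAGE_SELF, &after) != 0) {
        munmap((void *)p, len);
        return -1;
    }
    munmap((void *)p, len);
    out->minor = after.ru_minflt - before.ru_minflt;
    out->major = after.ru_majflt - before.ru_majflt;
    return 0;
}
```

On a typical system the minor count comes out at roughly one per touched page and the major count stays at 0, since fresh anonymous pages need no I/O. Huge pages or a sandboxed kernel can report different numbers.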
Key files: mm/memory.c (handle_mm_fault, generic fault handling), mm/page_alloc.c (buddy), arch/x86/mm/fault.c (do_page_fault, x86 entry). Structs: vm_area_struct (VMA), mm_struct (process memory), page (physical page).
// VMA struct (C)
struct vm_area_struct {
    unsigned long vm_start, vm_end;
    struct mm_struct *vm_mm;
    // ...
};
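Each line of /proc/self/maps describes one VMA of the current process and starts with its vm_start-vm_end range in hex. A minimal sketch, assuming Linux; find_mapping is a hypothetical helper that mimics the kernel's VMA lookup for an address from user space.

```c
#include <stdint.h>
#include <stdio.h>

/* Find the mapping that contains addr. On success store its bounds in
 * *start and *end and return 0; return -1 if none is found. */
int find_mapping(const void *addr, unsigned long *start, unsigned long *end)
{
    FILE *f = fopen("/proc/self/maps", "r");
    unsigned long a = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int found = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long s, e;

        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &s, &e) == 2 && a >= s && a < e) {
            *start = s;
            *end = e;
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

The range is half-open, as in the kernel: vm_end is the first address past the mapping.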
Summary: the kernel defers physical allocation to page faults. Virtual allocation is immediate (malloc/mmap); physical pages come from the buddy allocator on first access. This enables overcommit, COW, and zero-page optimizations. Low-level pieces: #PF handler, CR2 register, PTE updates.