Virtual Memory and Lazy Allocation: Why RSS Matters More Than malloc()

> $ stat metadata
Date: 2026.04.06
Time: 4 min read
Tags: [virtual-memory, linux, performance, observability, operating-systems, memory]

Memory accounting lies by default. When you call malloc(), you usually do not get physical RAM. You get a reservation in virtual address space and a kernel promise that RAM will be provided later if you touch it. If you care about production reliability, RSS matters more than malloc() counts because the kernel kills processes based on real resident memory, not on virtual promises.


The two numbers you must not confuse

| Metric | Name | What it represents | What it is not |
| --- | --- | --- | --- |
| VSZ | Virtual memory size | Virtual address space reserved for a process | Physical RAM in use |
| RSS | Resident set size | Pages of the process currently resident in physical RAM | Total memory the process could use |

VSZ is a credit limit. RSS is cash.
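On Linux, both numbers are exposed per process in /proc/<pid>/status as VmSize and VmRSS, reported in kB. A minimal sketch for reading them, assuming a Linux /proc filesystem; the helper names parse_status_kb and read_vsz_rss_kb are illustrative, not part of any library:

```python
import os


def parse_status_kb(status_text):
    """Extract VmSize (VSZ) and VmRSS (RSS) in kB from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Values look like "   123456 kB"; the first token is the number.
            fields[key] = int(value.split()[0])
    return fields


def read_vsz_rss_kb(pid="self"):
    """Read VSZ and RSS for a process (hypothetical helper, Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kb(f.read())


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    print(read_vsz_rss_kb())
```

Kernel threads have no Vm* lines at all, so the parser returns an empty dict for them rather than failing.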


Virtual memory: malloc() is a promise, not a transfer

When you call malloc(1 * 1024 * 1024 * 1024) (1 GB), the allocator typically asks the kernel for a new mapping (via mmap on Linux), and the kernel:

  1. Reserves a range of virtual addresses for your process.
  2. Records the range as valid in its own bookkeeping, without filling in page table entries.
  3. Does not allocate physical pages yet.

Result:

  • VSZ increases.
  • RSS stays flat.

This behavior is often called lazy allocation (or demand paging).
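You can watch the reservation happen without any RAM moving. The sketch below uses an anonymous private mmap as a stand-in for a large malloc() (glibc serves large requests the same way) and compares VmSize and VmRSS before and after. It assumes Linux; vm_kb and reserve_only are illustrative names, and 64 MiB is an arbitrary demo size:

```python
import mmap
import os


def vm_kb(field):
    """Return a Vm* field (in kB) from /proc/self/status; Linux only."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


def reserve_only(size):
    """Map `size` bytes of anonymous memory without touching it.

    Returns (mapping, vsz_delta_kb, rss_delta_kb).
    """
    vsz0, rss0 = vm_kb("VmSize"), vm_kb("VmRSS")
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)  # anonymous, untouched
    return buf, vm_kb("VmSize") - vsz0, vm_kb("VmRSS") - rss0


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    buf, vsz_delta, rss_delta = reserve_only(64 * 1024 * 1024)
    print(f"VSZ +{vsz_delta} kB, RSS +{rss_delta} kB")
    buf.close()
```

Expect the VSZ delta to match the mapping size, while the RSS delta stays close to zero apart from interpreter noise.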


Page faults: when you “swipe the card”

Physical memory is committed when you actually touch pages.

Example trigger:

  • You write to the buffer with memset(ptr, 0, 1GB) or you store real data into the array.

At that point:

  1. The CPU accesses a virtual address.
  2. The page table has no physical frame mapped to that page.
  3. The CPU raises a page fault.
  4. The kernel allocates a physical page (commonly 4 KB), maps it, and resumes execution.

```mermaid
sequenceDiagram
    participant App
    participant CPU
    participant Kernel
    participant RAM

    App->>CPU: write to virtual address
    CPU->>CPU: page table lookup
    CPU-->>Kernel: page fault interrupt
    Kernel->>RAM: allocate 4 KB physical page
    Kernel->>CPU: map virtual page -> physical frame
    CPU-->>App: resume instruction
```

Your RSS grows one page fault at a time.
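The fault-per-page behavior can be observed from inside a process. This sketch maps anonymous memory, writes one byte per page, and reads the minor fault counter (ru_minflt) before and after. It assumes a Unix system with Python's resource module; the function names are illustrative. With transparent huge pages enabled, the kernel may back the region with larger pages, so the fault count can come out far lower than the 4 KB page count:

```python
import mmap
import resource

PAGE_SIZE = resource.getpagesize()


def touch_pages(buf, size, page_size=PAGE_SIZE):
    """Write one byte per page so each page gets a physical frame."""
    for offset in range(0, size, page_size):
        buf[offset] = 1


def minor_faults():
    """Minor page faults taken by this process so far (ru_minflt)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def faults_for_touching(size):
    """Map `size` bytes anonymously, touch every page, return new minor faults."""
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)
    before = minor_faults()
    touch_pages(buf, size)
    taken = minor_faults() - before
    buf.close()
    return taken


if __name__ == "__main__":
    size = 32 * 1024 * 1024
    print(f"pages: {size // PAGE_SIZE}, minor faults: {faults_for_touching(size)}")
```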


Allocation cost model: malloc() vs memset()

This is the mental model that saves you in production.

What happens when you only reserve

malloc(1GB)
  VSZ: +1GB
  RSS: +0MB

What happens when you actually touch memory

memset(ptr, 0, 1GB)
  VSZ: ~unchanged (still reserved)
  RSS: +1GB (after roughly 262,144 page faults at 4 KB per page)

The expensive part is not the reservation. The expensive part is committing pages.
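The commit cost scales with the number of pages, which is plain arithmetic. A small sketch, where pages_to_commit is an illustrative helper and the 2 MiB figure assumes x86-64 huge pages:

```python
def pages_to_commit(size_bytes, page_size=4096):
    """Number of pages (and worst-case page faults) needed to touch size_bytes."""
    return -(-size_bytes // page_size)  # ceiling division


GIB = 1024 ** 3
print(pages_to_commit(GIB))                 # 262144 with 4 KiB pages
print(pages_to_commit(GIB, 2 * 1024 ** 2))  # 512 with 2 MiB huge pages
```

Each of those faults is a trip into the kernel, which is why the first pass over a fresh buffer is slower than later passes.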


Lazy allocation: why the kernel “overbooks” memory

The OS does this because programs are greedy:

  • Many processes reserve more memory than they will ever touch.
  • The kernel bets that not everyone will touch all pages at the same time.

This allows the system to run workloads whose total VSZ far exceeds physical RAM.
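Linux reports how far the bet has gone in /proc/meminfo: Committed_AS is the total memory promised to processes, whether or not they have touched it. A minimal sketch comparing it to MemTotal, assuming Linux; parse_meminfo_kb and commit_ratio are illustrative helpers:

```python
import os


def parse_meminfo_kb(meminfo_text):
    """Parse /proc/meminfo text into {field: value}, values mostly in kB."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        parts = value.split()
        if parts and parts[0].isdigit():
            info[key] = int(parts[0])
    return info


def commit_ratio(info):
    """Committed_AS / MemTotal: above 1.0, promises exceed physical RAM."""
    return info["Committed_AS"] / info["MemTotal"]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo_kb(f.read())
    if "Committed_AS" in info and "MemTotal" in info:
        print(f"committed/physical: {commit_ratio(info):.2f}")
```

How aggressively the kernel overbooks is governed by the vm.overcommit_memory sysctl, so a ratio above 1.0 is normal on default settings.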

The failure mode is straightforward:

  1. Too many processes touch too many pages at once.
  2. Physical RAM fills.
  3. The system must reclaim memory or kill something to avoid a full machine crash.


The OOM killer: the bodyguard that cares about RSS

When Linux runs out of reclaimable memory, it invokes the OOM killer.

The OOM killer does not care about VSZ in the way engineers often expect. It primarily cares about who is consuming physical memory.

High-level behavior:

  1. Evaluate processes using an OOM scoring heuristic.
  2. Favor killing the process with the highest oom_score, which is driven mainly by resident memory (plus swap and page-table usage) and shifted by oom_score_adj.
  3. Terminate a victim process to free RAM and keep the machine alive.

If your process crashes “randomly” under load, check RSS, not allocation counters.
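Linux exposes the current score for every process in /proc/<pid>/oom_score, with the tunable bias in /proc/<pid>/oom_score_adj. A minimal sketch for reading both, assuming Linux; read_oom_info and its proc_root parameter are illustrative (the parameter exists only to make the helper testable):

```python
import os


def read_oom_info(pid="self", proc_root="/proc"):
    """Return (oom_score, oom_score_adj) for a process as integers."""
    values = []
    for name in ("oom_score", "oom_score_adj"):
        with open(os.path.join(proc_root, str(pid), name)) as f:
            values.append(int(f.read().strip()))
    return tuple(values)


if __name__ == "__main__":
    try:
        score, adj = read_oom_info()
        print(f"oom_score={score} oom_score_adj={adj}")
    except OSError:
        print("oom_score files not available on this system")
```

A higher oom_score means the process is a more likely victim. oom_score_adj ranges from -1000 to 1000, and -1000 exempts the process entirely.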


Engineering takeaways (what to monitor, what to stop trusting)

What misleads teams

  • Language-level allocators and profilers showing “allocated bytes”
  • VSZ-heavy dashboards
  • Counting malloc() calls as a proxy for risk

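You can show 10 GB “allocated” in a profiler while using 2 GB of RSS. The reverse can also happen: RSS can exceed the profiler's number when memory is touched outside its view, for example by native libraries, memory-mapped files, or allocator fragmentation.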

What to monitor in production

  • Process RSS at the OS level
  • Page fault rates:
    • minor vs major faults
  • Memory pressure and OOM signals:
    • oom_score, oom_kill events
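System-wide fault and OOM counters live in /proc/vmstat as cumulative values, so a monitor samples them twice and takes the difference. A sketch under the assumption of a Linux kernel recent enough to expose the oom_kill counter (4.13 or later); parse_vmstat and rates_per_second are illustrative helpers:

```python
import os

WATCHED = ("pgfault", "pgmajfault", "oom_kill")


def parse_vmstat(vmstat_text, keys=WATCHED):
    """Pick cumulative counters out of /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in keys:
            counters[parts[0]] = int(parts[1])
    return counters


def rates_per_second(before, after, interval_s):
    """Turn two counter snapshots into per-second rates."""
    return {k: (after[k] - before[k]) / interval_s for k in before if k in after}


if __name__ == "__main__" and os.path.exists("/proc/vmstat"):
    with open("/proc/vmstat") as f:
        print(parse_vmstat(f.read()))
```

pgfault counts all faults and pgmajfault counts only those that needed disk I/O, so the minor fault rate is the difference between the two.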

Why leaks hide

A “leak” can be:

  • Virtual address space growth (reserved but never touched): VSZ grows, RSS does not.
  • Real resident growth (touched pages retained): RSS grows until the kernel intervenes.

If you only watch virtual promises, you will miss real failure risk.
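The two kinds of growth can be told apart with nothing more than periodic (VSZ, RSS) samples. A rough sketch; classify_growth is an illustrative helper and the 64 MiB threshold is an arbitrary cutoff, not a recommended value:

```python
def classify_growth(samples, threshold_kb=64 * 1024):
    """Classify memory growth from (vsz_kb, rss_kb) samples, oldest first.

    threshold_kb is an arbitrary illustrative cutoff, not a recommended value.
    """
    vsz_growth = samples[-1][0] - samples[0][0]
    rss_growth = samples[-1][1] - samples[0][1]
    if rss_growth >= threshold_kb:
        return "resident growth"      # touched pages retained: real OOM risk
    if vsz_growth >= threshold_kb:
        return "virtual-only growth"  # reserved but untouched address space
    return "stable"


print(classify_growth([(1_000_000, 200_000), (5_000_000, 210_000)]))  # virtual-only growth
print(classify_growth([(1_000_000, 200_000), (1_200_000, 400_000)]))  # resident growth
```

Only the second case moves the process toward the OOM killer, which is why alerting belongs on the RSS trend.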


Key takeaways

  • malloc() is usually a reservation, not a RAM allocation. VSZ can grow without RSS moving.
  • Touching memory triggers page faults and commits physical pages (commonly 4 KB), growing RSS.
  • Lazy allocation overbooks because most processes never spend their full virtual budget.
  • When the bet fails, Linux kills by physical reality: the OOM killer targets processes based on RSS consumption and OOM scoring.
  • For reliability, monitor RSS and page fault rates. Stop treating VSZ or allocator counters as the truth.
