UUID primary keys look perfect for distributed systems: globally unique, locally generated, and no coordination needed. Many teams switch from auto-increment integers to UUIDv4 and everything works fine for a while, right up until the table hits millions of rows and insert latency falls off a cliff. The problem is not uniqueness. The problem is how random IDs fight the storage engine’s physical layout.
How clustered indexes and B+ trees actually store your rows
If you’re using MySQL with InnoDB (and most modern relational engines behave similarly), tables are stored as clustered indexes:
- The primary key is the table: row data lives in the leaf nodes of a B+ tree, sorted by primary key.
- There is no separate “heap” of rows; the clustered index is the physical layout.
With a sequential primary key (AUTO_INCREMENT integer, timestamp-based ID):
- New rows have keys greater than any existing key.
- Inserts append to the right-most leaf page.
- When a page (typically
16 KB) fills:- The engine allocates a new page next to it.
- Subsequent inserts continue on the new right-most page.
This is sequential I/O, which:
- Is cache-friendly.
- Minimizes random writes.
- Keeps pages dense and the B+ tree shallow.
What random UUIDv4 keys do to a B+ tree
UUIDv4 is cryptographically random:
- There is no ordering relationship between consecutive values.
- Example progression:
102b4a... → f912c3... → 0a19b2...
For each insert, InnoDB must:
- Traverse the B+ tree to find the exact leaf page where that random key belongs.
- Insert the row in the middle of the page to keep keys sorted.
If the target page is full, the engine performs a page split.
Page splits: from cheap appends to expensive random I/O
When a 16 KB page is full and a new key must be inserted in the middle:
- Allocate a new
16 KBpage. - Move roughly half of the rows from the original page into the new page.
- Update parent/internal nodes to point to the new page (B+ tree rebalancing).
- Insert the new row into the appropriate page.
Instead of:
- one cheap append to the right-most page,
you get:
- multiple random writes,
- extra CPU for tree maintenance,
- and more depth in the B+ tree.
Over millions of inserts, random UUIDv4 keys turn a compact tree into a fragmented one.
Fragmentation and buffer pool churn
Page splits have two compounding side effects:
- Half-empty pages:
- After a split, both pages are roughly
50%full. - Continuous random inserts rarely fill those gaps perfectly.
- The clustered index starts to look like Swiss cheese.
- After a split, both pages are roughly
- Bloated on-disk and in-memory footprint:
- A logical
50 GBdataset can occupy~100 GBon disk, solely due to fragmentation. - The buffer pool (RAM) caches pages, not rows:
- Half-empty pages mean half your cached memory holds empty space.
- Cache hit ratio declines; the engine must fetch more pages from disk.
- A logical
The net result:
- Insert throughput degrades.
- Simple range queries touch more pages.
- The engine spends its life evicting and loading half-empty pages.
Disks are cheap. RAM and IOPS are not.
Mechanically sympathetic alternatives
You do not have to go back to a single auto-increment sequence on one node. You just need primary keys that cooperate with the storage engine.
There are two production-grade patterns that restore mechanical sympathy.
1. Time-sorted identifiers: UUIDv7 or ULID
Time-ordered IDs (e.g., UUIDv7, ULID, Snowflake-style IDs) encode:
- A timestamp prefix (e.g., first
48 bits). - Followed by random bits for uniqueness.
Properties:
- From the application side:
- Globally unique.
- Can be generated locally on any node.
- From the database side:
- Keys are monotonically increasing by time.
- New rows land on the right-most pages, just like integers.
Result:
- Inserts are sequential again.
- Page splits are rare and predictable (only at the right edge).
- Fragmentation shrinks; buffer pool utilization improves.
2. Internal vs external keys: split physical and logical IDs
If you must use UUIDv4 for:
- security (unpredictable identifiers),
- obfuscation for external clients,
you can decouple the storage key from the public key.
Pattern:
- Use an
AUTO_INCREMENT BIGINTas the clustered primary key. - Store the external
UUIDv4in a separate column with aUNIQUEsecondary index. - In APIs and URLs, only expose the UUID.
Why this works:
- The clustered index stays sequential and dense.
- The UUID uniqueness constraint lives in a secondary index, which:
- does not dictate the physical row order,
- can tolerate more random I/O without destroying the main layout.
You get:
- Fast inserts and scans.
- Stable physical layout.
- External surfaces that never reveal internal integer IDs.
Architecture view: how ID choice affects inserts
graph TD
APP["Application Nodes"] -->|"generate IDs"| IDGEN["ID Strategy"]
subgraph IDStrategy ["ID Strategy"]
IDSEQ["Sequential / time-ordered IDs<br/>(AUTO_INCREMENT, UUIDv7, ULID)"]
IDRND["Random UUIDv4"]
end
IDGEN --> IDSEQ
IDGEN --> IDRND
IDSEQ -->|"append to right-most leaf"| BSEQ["B+ Tree (clustered index)<br/>sequential inserts, few splits"]
IDRND -->|"insert into middle pages"| BRND["B+ Tree (clustered index)<br/>random inserts, frequent splits"]
BRND --> FRAG["Fragmentation & buffer pool churn"]
BSEQ --> HEALTHY["Dense pages & predictable IO"]
%% Styling to match the Brutalist dark theme
style APP fill:#111,stroke:#4ADE80,stroke-width:1px,color:#fff
style IDGEN fill:#111,stroke:#4ADE80,stroke-width:1px,color:#fff
style IDSEQ fill:#111,stroke:#4ADE80,stroke-width:1px,color:#fff
style IDRND fill:#111,stroke:#4ADE80,stroke-width:1px,color:#fff
style BSEQ fill:#111,stroke:#4ADE80,stroke-width:2px,color:#fff
style BRND fill:#111,stroke:#4ADE80,stroke-width:2px,color:#fff
style FRAG fill:#111,stroke:#EF4444,stroke-width:1px,color:#fff
style HEALTHY fill:#111,stroke:#4ADE80,stroke-width:1px,color:#fff
style IDStrategy fill:none,stroke:#333,stroke-dasharray: 5 5,color:#fff
The ID strategy directly determines whether your clustered index grows like a clean log or a fragmented random-access structure.
Why Big-O hides this problem
On paper, inserting into a B+ tree is O(log n) whether you use integers or UUIDs. In reality:
- Storage engines move data in pages (
16 KBblocks), not individual keys. - CPUs operate on cache lines and benefit from sequential access.
- Disks and SSDs are much faster at sequential writes than scattered random writes.
Random UUIDs:
- Break spatial locality.
- Cause page splits and deeper trees.
- Waste buffer pool memory on half-empty pages.
Sequential or time-ordered keys:
- Preserve locality.
- Keep trees shallow and pages dense.
- Align with how storage hardware is optimized.
Mechanical sympathy means choosing data structures and keys that match the hardware, not just the abstract algorithm.
Key takeaways
- UUIDv4 as a clustered primary key is mechanically hostile to B+ tree storage: it inflates page splits, fragmentation, and buffer pool churn.
- Time-ordered IDs (
UUIDv7,ULID, Snowflake-style IDs) give you decentralized uniqueness and sequential inserts. - When random UUIDs are non-negotiable externally, use an internal integer primary key and a secondary unique index on the UUID.
- Schema design is not just about types; it is a decision about how your disk and RAM behave at 3 AM under load.
// SPONSORSHIP
If this research saved you time or improved your architecture, consider sponsoring my work on GitHub. All sponsorships go directly toward infrastructure and further technical research.
[ Become a Sponsor ]