Big-O says traversing an array and traversing a linked list are both O(N). Real hardware disagrees. On modern CPUs, an array traversal can be ~3x faster than a linked-list traversal over the same data because of spatial locality and cache-line behavior.
## Big-O is the beginning, not the answer
Big-O assumes:
- every memory access costs the same time
- the CPU never stalls waiting for data
Both assumptions break at the memory wall.
## Latency: cache vs. RAM
| Location | Approx. access time | What it means |
|---|---|---|
| L1 cache | ~1 ns | fast enough to keep the pipeline fed |
| Main RAM | ~100 ns | if you miss, the CPU stalls |
If your data is not in cache, your CPU waits. This is why “same Big-O” does not imply “same runtime.”
## The secret unit of transfer: the cache line
A CPU does not fetch a single integer from RAM. It fetches cache lines:
- common cache line size: 64 bytes
- memory is fetched as contiguous chunks
So when you read `array[0]`:
- the CPU fetches the cache line containing `array[0]`
- that same line likely includes adjacent values (e.g. `array[0]` through `array[15]` if integers are 4 bytes)
That turns later iterations into cache hits.
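The arithmetic behind that claim can be sketched in a few lines of C (assuming 64-byte cache lines and 4-byte ints, which is typical but worth verifying on your target; `CACHE_LINE_BYTES`, `ints_per_line`, and `line_of` are illustrative names, not from any library):

```c
#include <stddef.h>

enum { CACHE_LINE_BYTES = 64 };  /* common size; check your CPU before relying on it */

/* How many 4-byte ints share one cache line: 64 / 4 = 16 on typical targets. */
int ints_per_line(void) {
    return CACHE_LINE_BYTES / (int)sizeof(int);
}

/* Which cache line (counted from the array base) holds array[i]? */
long line_of(long i) {
    return (i * (long)sizeof(int)) / CACHE_LINE_BYTES;
}
```

So `array[0]` and `array[15]` land on line 0, and `array[16]` is the first element that costs a new fetch.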
## Why arrays are fast: spatial locality
Given sequential access:
- the CPU sees the access pattern as predictable
- the hardware prefetcher starts loading the next cache line early
Result:
- first iteration pays the cache miss cost
- the next ~15 iterations execute at near-cache speed (~1 ns each)
Arrays win because they turn random “wait for RAM” into predictable cache hits.
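The access pattern the prefetcher rewards can be as simple as a stride-1 loop. A minimal C sketch:

```c
#include <stddef.h>

/* Stride-1 walk: each load is adjacent to the previous one, so the
 * hardware prefetcher can stream the next cache line into L1 before
 * the loop asks for it. */
long sum_array(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

Nothing here is cache-aware on purpose; the layout does the work.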
## The linked list penalty: pointer chasing
Linked lists break spatial locality.
What happens in the linked list loop:
- You follow a pointer from node to node.
- Each node may live anywhere on the heap.
- The next node address is not predictable.
So the CPU:
- fetches one cache line for node A
- then follows a pointer to node B
- often finds node B is not in the fetched cache line
- goes back to RAM (~100 ns) for the next node
This is pointer chasing, and it defeats the hardware prefetcher.
Every iteration can turn into a cache miss.
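The same summation over a linked list makes the dependency visible: the address of node B is data inside node A, so the CPU cannot start the next fetch until the current one completes. A sketch in C (`build_list` is a hypothetical helper added for illustration):

```c
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Pointer-chasing walk: each n->next is only known after the current
 * node's cache line arrives, so the prefetcher cannot run ahead. */
long sum_list(const Node *head) {
    long s = 0;
    for (const Node *n = head; n != NULL; n = n->next)
        s += n->value;
    return s;
}

/* Build a list holding 1..n. Each node is a separate heap allocation,
 * so consecutive nodes need not be adjacent in memory. */
Node *build_list(int n) {
    Node *head = NULL;
    for (int i = n; i >= 1; i--) {
        Node *node = malloc(sizeof *node);
        node->value = i;
        node->next = head;
        head = node;
    }
    return head;
}
```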
## Cache behavior in the real world (rule of thumb)
The point is not exact percentages. The point is the direction:
| Data structure | Cache hit tendency (typical) | Hardware effect |
|---|---|---|
| Array traversal | cache-friendly access pattern | high hit rate, prefetch works |
| Linked list traversal | random access pattern | low hit rate, prefetch fails |
In practice you often see:
- arrays closer to high cache residency (e.g. ~94% hits)
- linked lists significantly lower (e.g. ~70% misses)
That matches the benchmark gap you can observe.
## Benchmark intuition: same algorithm, different silicon outcome
For summing 100,000 integers:
- array traversal: ~68,000 ns
- linked list traversal: ~181,000 ns
The algorithm and Big-O are equal. The cost model is not.
## Engineering takeaway: choose data layout for the hardware
When you pick a data structure, you are choosing:
- how memory is laid out (contiguous vs. scattered)
- whether accesses are sequential or random
- whether prefetching and cache line utilization work for you
## Practical rules
- Prefer arrays or tightly packed structures for hot loops.
- Process data in the order it sits in memory (sequential access first).
- Avoid pointer-heavy designs on latency-critical paths.
- Remember: pointers are not free. Memory indirection is a hardware tax.
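One way to apply the "tightly packed structures" rule: when a hot loop touches only one field, a struct-of-arrays layout keeps every fetched byte useful. This sketch is an illustration of the principle, not code from the original benchmark:

```c
#include <stddef.h>

/* Array-of-structs: summing only `x` still drags the unused y and z
 * fields through the cache with every element. */
typedef struct { double x, y, z; } Point3;

double sum_x_aos(const Point3 *pts, size_t n) {
    double s = 0;
    for (size_t i = 0; i < n; i++)
        s += pts[i].x;           /* 8 useful bytes out of each 24 loaded */
    return s;
}

/* Struct-of-arrays: the hot field is packed, so every byte of every
 * fetched cache line feeds the loop. */
double sum_x_soa(const double *xs, size_t n) {
    double s = 0;
    for (size_t i = 0; i < n; i++)
        s += xs[i];
    return s;
}
```

Both functions compute the same result; only the memory traffic differs.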
## Cache line utilization vs. pointer chasing

```mermaid
graph TD
    Q[CPU asks for next element] --> A{Is data contiguous?}
    A -- Yes: array --> M1[Cache line fetch: 64 bytes]
    M1 --> H1[L1 hits for next 15 ints]
    H1 --> F1[Fast loop execution]
    A -- No: linked list --> M2[Cache line fetch for node A]
    M2 --> P[Follow pointer to node B]
    P --> R2{Is node B in the fetched cache line?}
    R2 -- No --> RAM[RAM fetch: ~100 ns]
    RAM --> S[CPU stalls]
    R2 -- Yes --> H2[L1 hit]
```
## Key takeaways
- Big-O hides the memory wall. Real runtime is dominated by cache residency and stalls.
- Arrays benefit from spatial locality: sequential reads use cache lines and hardware prefetchers.
- Linked lists suffer from pointer chasing, which makes the next access unpredictable and cache-unfriendly.
- In performance-critical code, choose data structures that match the hardware’s strengths: contiguous memory + sequential iteration.
## Sponsorship
If this research saved you time or improved your architecture, consider sponsoring my work on GitHub. All sponsorships go directly toward infrastructure and further technical research.
[ Become a Sponsor ]