Memory Ordering in Rust: What the Compiler Guarantees (and What It Doesn't)
When you write AtomicUsize::new(0) in Rust, you're doing more than allocating an integer. You're
entering a contract with the compiler, the CPU's out-of-order execution engine, and every other core on
the chip.
The Problem with Shared State
CPUs lie. Your code says:
```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

static FLAG: AtomicBool = AtomicBool::new(false);
static DATA: AtomicUsize = AtomicUsize::new(0);

// Thread A
DATA.store(42, Ordering::Relaxed);
FLAG.store(true, Ordering::Relaxed);

// Thread B
if FLAG.load(Ordering::Relaxed) {
    println!("{}", DATA.load(Ordering::Relaxed)); // might print 0
}
```
With Relaxed ordering, the store to DATA and the store to FLAG can be reordered by the CPU, and by the compiler too. Thread B may observe FLAG == true before it observes DATA == 42.
Acquire / Release
This pair is the workhorse of
lock-free programming. A Release store publishes all preceding writes. An Acquire load subscribes to
all writes that preceded the matching Release.
```rust
// Thread A
DATA.store(42, Ordering::Relaxed);
FLAG.store(true, Ordering::Release);

// Thread B
if FLAG.load(Ordering::Acquire) {
    assert_eq!(DATA.load(Ordering::Relaxed), 42); // safe
}
```
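The snippet above can be fleshed out into a complete, runnable handoff. This is a minimal sketch: the function name `handoff` and the spin loop are illustrative, not from the original.

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

// Release/Acquire handoff: the consumer spins until the flag is set,
// then is guaranteed to see the data written before the Release store.
fn handoff() -> usize {
    let flag = Arc::new(AtomicBool::new(false));
    let data = Arc::new(AtomicUsize::new(0));

    let (f, d) = (flag.clone(), data.clone());
    let producer = thread::spawn(move || {
        d.store(42, Ordering::Relaxed);
        f.store(true, Ordering::Release); // publishes the write to `data`
    });

    let consumer = thread::spawn(move || {
        while !flag.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        data.load(Ordering::Relaxed) // always 42 once the flag is seen
    });

    producer.join().unwrap();
    consumer.join().unwrap()
}

fn main() {
    println!("{}", handoff());
}
```

The consumer's Acquire load synchronizes-with the producer's Release store, so once the spin loop exits, every write that preceded the Release is visible.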
SeqCst
Sequential consistency gives a single total ordering that all threads agree on. It is not uniformly cheap: on x86, SeqCst loads compile to plain loads, but SeqCst stores require a full barrier (an mfence or a locked instruction); on ARM64, SeqCst operations map to ldar/stlr. Start with SeqCst and loosen only under measurement.
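One place Acquire/Release is not enough is the classic store-buffering (Dekker-style) pattern, which is exactly what SeqCst's total order forbids. A sketch, assuming the function name `dekker_round` is illustrative:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Store-buffering test: each thread stores its own flag, then loads the
// other's. Under SeqCst, at least one thread must observe the other's
// store; both loads returning false is impossible.
fn dekker_round() -> (bool, bool) {
    let x = Arc::new(AtomicBool::new(false));
    let y = Arc::new(AtomicBool::new(false));

    let (x1, y1) = (x.clone(), y.clone());
    let t1 = thread::spawn(move || {
        x1.store(true, Ordering::SeqCst);
        y1.load(Ordering::SeqCst)
    });
    let t2 = thread::spawn(move || {
        y.store(true, Ordering::SeqCst);
        x.load(Ordering::SeqCst)
    });
    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    for _ in 0..1000 {
        let (a, b) = dekker_round();
        assert!(a || b, "SeqCst forbids both threads seeing false");
    }
}
```

With Acquire/Release instead of SeqCst, the `(false, false)` outcome is permitted by the memory model, which is why mutual-exclusion algorithms of this shape need sequential consistency.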
Practical Rules
- Relaxed: only for counters where ordering doesn't matter.
- Acquire/Release: for producer-consumer flag patterns.
- AcqRel: on fetch_and/compare_exchange success paths.
- If unsure, start with SeqCst.
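The first rule can be demonstrated with a plain hit counter. This is a sketch; the names `count_hits` and the thread counts are made up for illustration:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Relaxed is enough here: we only need the final total, not any
// ordering relative to other memory. fetch_add is still atomic, so
// no increments are lost.
fn count_hits(threads: usize, per_thread: usize) -> usize {
    let hits = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let hits = hits.clone();
            thread::spawn(move || {
                for _ in 0..per_thread {
                    hits.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    hits.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_hits(8, 1000));
}
```

Relaxed guarantees atomicity of each individual fetch_add; what it gives up is only ordering with respect to other memory locations, which a bare counter never needed.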
The Rust compiler will not catch ordering bugs for you. Use the loom crate to exhaustively explore the thread interleavings and weak-memory behaviors the memory model permits.
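A sketch of what a loom check of the flag/data example might look like. This assumes the external loom crate (API as of loom 0.7) in Cargo.toml; it is not runnable with std alone:

```rust
// Cargo.toml: loom = "0.7" (assumed version)
use loom::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use loom::sync::Arc;
use loom::thread;

fn main() {
    // loom::model re-runs the closure under every interleaving its
    // model permits and fails if any run violates an assertion.
    loom::model(|| {
        let flag = Arc::new(AtomicBool::new(false));
        let data = Arc::new(AtomicUsize::new(0));

        let (f, d) = (flag.clone(), data.clone());
        let producer = thread::spawn(move || {
            d.store(42, Ordering::Relaxed);
            f.store(true, Ordering::Release);
        });

        if flag.load(Ordering::Acquire) {
            // loom reports a failure if any permitted execution
            // can violate this assertion.
            assert_eq!(data.load(Ordering::Relaxed), 42);
        }
        producer.join().unwrap();
    });
}
```

Swap the Release/Acquire pair back to Relaxed and loom will find the interleaving where the flag is visible but the data is not, the exact bug from the first example.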