July 01, 2026 · 7 min read · Tags: sleep tracking, Apple Watch, Garmin, wearable · Century

Wearable Sleep Tracking Accuracy: What Your Watch Can (and Can't) Tell You

How accurate is your Apple Watch, Garmin, or Oura Ring at tracking sleep? We dig into the actual science — polysomnography studies, accuracy percentages, and what you should actually pay attention to.

You wake up, check your watch, and it tells you that you got 47 minutes of deep sleep. Should you believe it? Is that number accurate, or is your wearable basically guessing?

These are fair questions. Sleep tracking has become one of the most-used features on Apple Watch, Garmin, Oura, and Whoop — but the accuracy of these devices varies a lot more than most people realize. Let's look at what the research actually says, where wearables do well, and where they struggle.

How wearables track sleep

Before we talk about accuracy, it helps to understand how your watch actually figures out whether you're asleep. Wearables don't measure brainwaves — they infer sleep from other signals:

Heart rate and heart rate variability (HRV). Your heart rate drops and your HRV shifts when you fall asleep. Different sleep stages produce different HRV patterns.

Movement. Accelerometers detect when you're still (likely asleep) vs. moving (likely awake). This is why lying perfectly still on the couch can sometimes be registered as a nap.

Breathing rate. Newer devices estimate respiratory rate from subtle body movements and breathing-related changes in heart rhythm, which helps distinguish sleep stages.

Skin temperature. Some devices track peripheral temperature changes, which naturally fluctuate across the sleep-wake cycle.

From these inputs, algorithms estimate when you fell asleep, when you woke up, and how much time you spent in light sleep, deep sleep, and REM. It's impressive technology — but it's still an estimate.
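To make the idea concrete, here is a deliberately simplified sketch of how movement and heart rate could be combined into a sleep/wake guess. It is not any vendor's algorithm: real devices use far more sophisticated models, and the `Epoch` structure, the `classify_epochs` function, and both thresholds are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Epoch:
    """One 30-second window of sensor data (hypothetical structure)."""
    movement: float    # accelerometer activity count, arbitrary units
    heart_rate: float  # beats per minute


def classify_epochs(epochs, resting_hr, movement_threshold=5.0, hr_margin=5.0):
    """Label each epoch 'asleep' or 'awake' using two illustrative thresholds.

    An epoch counts as 'asleep' only if the wearer is nearly still AND the
    heart rate is close to their resting heart rate. Both cutoffs are
    assumptions for this sketch, not published values.
    """
    labels = []
    for e in epochs:
        still = e.movement < movement_threshold
        low_hr = e.heart_rate <= resting_hr + hr_margin
        labels.append("asleep" if still and low_hr else "awake")
    return labels


# Made-up sample data for five consecutive epochs.
night = [
    Epoch(movement=12.0, heart_rate=68),  # moving, elevated heart rate
    Epoch(movement=1.0, heart_rate=66),   # still, but heart rate not yet settled
    Epoch(movement=0.5, heart_rate=58),
    Epoch(movement=0.2, heart_rate=55),
    Epoch(movement=9.0, heart_rate=63),   # brief movement
]

labels = classify_epochs(night, resting_hr=56)
print(labels)
```

Note how the second epoch only stays "awake" because of the heart rate check. A movement-only rule would have called it sleep, which is the couch-nap problem described above.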

The gold standard comparison

To measure how accurate wearables really are, researchers compare them against polysomnography (PSG), the gold standard sleep lab test that simultaneously tracks brainwaves (EEG), eye movements (EOG), muscle activity (EMG), breathing, and heart rhythm.

The results are revealing. A 2025 study published in SLEEP Advances tested six popular consumer wearables — including Apple Watch Series 8, Garmin Vivosmart 4, Whoop 4.0, and Fitbit Sense — against PSG in a controlled sleep lab.

The key finding? Even the best consumer wearables get sleep stage classification right only about 60-65% of the time for four-stage classification (wake, light, deep, REM). That's a long way from perfect — but it doesn't mean the data is useless.

Here's a rough breakdown by device based on multiple studies:

Device        Sleep stage accuracy vs. PSG (approx.)   Notes
Oura Ring     ~61-65%                                   Best overall for sleep staging
Whoop 4.0     ~60%                                      Strong on REM, weaker on awake detection
Apple Watch   ~50-53%                                   Better since watchOS 9 update
Garmin        ~50%                                      Weaker on sleep stage classification
Fitbit        ~48-51%                                   Varies by model
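If you're wondering what a number like "60% accuracy" means in practice: validation studies typically split the night into short windows (epochs, commonly 30 seconds) and check how often the device's label matches the PSG label for the same window. Here is a minimal sketch of that comparison. The `epoch_accuracy` function and the ten-epoch sample are made up for illustration and are not data from any study mentioned here.

```python
def epoch_accuracy(device_labels, reference_labels):
    """Return the fraction of epochs where the device agrees with the reference."""
    if len(device_labels) != len(reference_labels):
        raise ValueError("both recordings must cover the same number of epochs")
    if not reference_labels:
        raise ValueError("need at least one epoch to compare")
    matches = sum(d == r for d, r in zip(device_labels, reference_labels))
    return matches / len(reference_labels)


# Hypothetical labels for ten consecutive epochs.
psg   = ["wake",  "light", "light", "deep",  "deep", "deep", "light", "rem", "rem",   "wake"]
watch = ["light", "light", "light", "light", "deep", "deep", "light", "rem", "light", "light"]

accuracy = epoch_accuracy(watch, psg)
print(f"Agreement with PSG: {accuracy:.0%}")
```

In this toy example the watch agrees with PSG on 6 of 10 epochs. It also misses both wake epochs and one deep sleep epoch, which shows how an overall percentage can hide which stages a device gets wrong.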

What wearables are good at

It's not all bad news. Wearables actually do quite well at some things:

Sleep duration. Most devices are reasonably accurate at detecting when you're asleep vs. awake over the full night. Total sleep time estimates are generally within 30 minutes of PSG.

Sleep consistency. While the exact minute-by-minute breakdown might be off, wearables are excellent at tracking your sleep patterns over time. If your sleep duration is trending down, your device will catch it.

Heart rate and HRV during sleep. Optical heart rate sensors are quite accurate during sleep (when you're still). HRV measurements from Apple Watch, Oura, and Whoop show strong agreement with medical-grade ECG — often above 0.95 correlation. This is the data that Century AI relies on most heavily when calculating your recovery score.

Wake-up timing. Detecting when you actually got out of bed is straightforward for wearables, and this data is reliable.
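A claim like "within 30 minutes of PSG" can be checked with simple arithmetic: take the difference between the device and the lab for each night, then look at the average difference (bias) and the average size of the miss (mean absolute error). The sketch below uses invented numbers and a hypothetical `agreement` helper, purely to show the calculation.

```python
from statistics import mean


def agreement(device_minutes, psg_minutes):
    """Return (bias, mean absolute error) in minutes for paired nights."""
    if len(device_minutes) != len(psg_minutes):
        raise ValueError("need the same number of nights from both sources")
    diffs = [d - p for d, p in zip(device_minutes, psg_minutes)]
    bias = mean(diffs)                    # positive means the device overestimates sleep
    mae = mean([abs(x) for x in diffs])   # typical size of the error, ignoring direction
    return bias, mae


# Made-up total sleep time, in minutes, for four nights.
watch_tst = [432, 405, 468, 390]
psg_tst = [415, 410, 440, 395]

bias, mae = agreement(watch_tst, psg_tst)
print(f"Bias: {bias:+.2f} min, mean absolute error: {mae:.2f} min")
```

Here the watch overestimates sleep by about 9 minutes on average and is off by about 14 minutes on a typical night, which would sit comfortably inside the 30-minute range.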

Where wearables struggle

Deep sleep vs. light sleep. This is the hardest distinction for wrist-worn devices. PSG uses brainwaves to tell these apart; wearables use heart rate patterns and movement — which aren't as precise. This is why your deep sleep number might swing from 45 minutes one night to 90 the next without you feeling any different.

Brief awakenings. Everyone wakes up briefly multiple times per night (usually for less than a minute). Wearables often miss these micro-awakenings entirely, so your "awake time" is probably undercounted.

REM sleep. REM is tricky because your heart rate and breathing become irregular — similar to when you're awake. Devices that rely heavily on heart rate patterns struggle to distinguish REM from wakefulness.

Sleep onset latency. How long it takes you to fall asleep is hard for a wrist device to measure accurately. You might be lying perfectly still but wide awake, and your watch has no way to know.

Naps. Most wearables are inconsistent with nap detection. Some handle it well (Oura), while others barely track naps at all (older Garmin models).

The Quantified Scientist's take

Rob, the postdoctoral researcher behind The Quantified Scientist YouTube channel, has tested dozens of wearables against reference devices in controlled conditions. His work — along with a 2026 study in collaboration with the University of Salzburg that tested 15 wearables in a sleep lab — gives one of the clearest pictures we have.

The takeaway: Oura Ring and Whoop (with its newer algorithm) lead the pack for sleep staging accuracy. Apple Watch has improved significantly with watchOS updates and performs well for total sleep time and HRV. Garmin devices, while excellent for fitness tracking, tend to lag behind in sleep stage classification — though they do fine for sleep duration and consistency tracking.

How to use sleep tracking data wisely

So should you ignore your sleep data? No. But you should use it the right way:

Focus on trends, not single nights. One night of "bad" deep sleep means very little. Two weeks of declining deep sleep might mean something. Look at the big picture.

Pay most attention to what wearables measure best. Total sleep duration, resting heart rate, and HRV during sleep are the most reliable metrics. These are the foundation of meaningful recovery insights.

Don't stress about sleep stages. Obsessing over whether you got exactly 90 minutes of REM sleep is counterproductive — and the data isn't precise enough to justify the anxiety. If you're sleeping enough hours and waking up feeling rested, you're probably doing fine.

Use it to spot patterns. This is where wearables shine. Notice that your HRV drops after alcohol? That deep sleep suffers when you eat late? That your resting heart rate trends higher during stressful work weeks? These patterns are real and actionable, even if the exact sleep stage numbers aren't perfect.

Trust how you feel. Your wearable is a tool, not a judge. If you feel great but your sleep score is low, trust your body. If you feel exhausted and your score is high, pay attention to that too — there might be something the sensors aren't catching.
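If you export your data, both the "trends" and "patterns" advice above can be tried with a few lines of Python. This is a rough sketch under stated assumptions: the numbers are invented, the 7-night window is just one reasonable choice, and `rolling_mean` and `average_by_flag` are hypothetical helpers, not part of any wearable's app or API.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Average each run of `window` consecutive nights to smooth out noise."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def average_by_flag(nights):
    """Split (value, flag) pairs by flag and return the mean for each group."""
    flagged = [value for value, flag in nights if flag]
    unflagged = [value for value, flag in nights if not flag]
    return mean(flagged), mean(unflagged)


# Made-up total sleep time (minutes) for 14 nights.
sleep_minutes = [450, 432, 468, 444, 456, 438, 462,
                 420, 408, 426, 396, 414, 390, 402]

trend = rolling_mean(sleep_minutes)
print(f"First 7-night average: {trend[0]} min, latest: {trend[-1]} min")

# Made-up overnight HRV (ms), paired with whether alcohol was consumed that evening.
hrv_nights = [(62, False), (48, True), (65, False),
              (45, True), (60, False), (63, False)]

hrv_alcohol, hrv_sober = average_by_flag(hrv_nights)
print(f"Average HRV with alcohol: {hrv_alcohol} ms, without: {hrv_sober} ms")
```

Any single night in the sample looks unremarkable, but the 7-night average falls from 450 to 408 minutes. That is the kind of slow drift worth paying attention to.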

Quick summary

  • Consumer wearables estimate sleep stages with about 50-65% accuracy vs. lab polysomnography
  • They're much better at total sleep duration, resting heart rate, and HRV
  • Deep sleep and REM estimates are the least reliable individual metrics
  • Oura Ring and Whoop lead in sleep staging; Apple Watch is strong for HRV and duration
  • Focus on trends, patterns, and how you feel — not single-night sleep stage numbers
  • The real value is spotting long-term patterns that affect your recovery

Century AI helps you understand your body with a daily health score, recovery score, and sleep insights — using the watch you already wear.
