Sleep Trackers: How They Work, What They Measure, and Their Limits

Consumer sleep trackers have become ubiquitous: worn on wrists and fingers, or placed under mattresses. They promise insights into your sleep stages, HRV, recovery scores, and more. Some of this data is genuinely useful. Some of it is presented with false precision. Understanding what these devices can and cannot actually measure will help you use them constructively and avoid the growing problem of sleep anxiety caused by over-reliance on tracker data.

Types of Sleep Trackers

Wrist Wearables

The most common category. Apple Watch (watchOS 10+), Garmin (multiple models), and Fitbit/Google Pixel Watch measure sleep primarily through actigraphy (movement sensing via accelerometer), heart rate (optical photoplethysmography/PPG), and heart rate variability. Some newer models include skin temperature sensors, which add a useful circadian tracking dimension. Wrist-based measurement can be affected by wrist movement artifacts and skin contact quality.

Smart Rings

The Oura Ring (Gen 3 and 4) places sensors on the finger, a location with excellent PPG signal quality due to blood vessel density. Oura measures movement, heart rate, HRV, skin temperature, and respiratory rate. The form factor (sleeping with a ring vs. a watch) appeals to many people, and Oura's accuracy is among the most studied of any consumer device in peer-reviewed validation research.

Under-Mattress and Bedside Sensors

Products like Withings Sleep Analyzer and Eight Sleep's Pod Pro use radar or pressure sensors without body contact. Withings additionally includes an FDA-cleared sleep apnea screening feature. These devices measure breathing and movement without any wearable requirement, which is useful for people who won't consistently wear a device. Accuracy for sleep stage identification varies and tends to be lower than wearables for some metrics.

Smartphone Apps

Apps that use the phone's microphone (breath and movement sounds) or its accelerometer (with the phone placed on the mattress) have the lowest accuracy. They are best used for approximate sleep timing and sound events (snoring detection) rather than sleep staging. They are a reasonable free starting point but not reliable enough for actionable health data.

How Consumer Sleep Trackers Work

Consumer trackers use actigraphy combined with physiological signals to infer sleep states. Here's what each sensor actually measures:

  • Accelerometer (movement): Movement is used as a proxy for wakefulness vs. sleep. Stillness = likely sleep. Movement = likely wake. This is relatively reliable for total sleep time but poor for distinguishing sleep stages, because you lie still during light sleep, deep sleep, and much of REM.
  • Optical heart rate (PPG): Heart rate during sleep changes across stages — lower during deep sleep, irregular during REM. Algorithms use heart rate patterns alongside movement to infer staging.
  • Heart rate variability (HRV): HRV reflects autonomic nervous system activity. During deep NREM sleep, HRV typically increases (parasympathetic dominance). During REM, HRV becomes more variable. HRV tracking over time is one of the more reliable indicators of overall recovery and physiological stress.
  • Skin temperature: Core body temperature drops during sleep onset; skin temperature (a proxy) rises. Tracking this helps detect circadian rhythm patterns and can flag illness, menstrual cycle phases, and alcohol effects on sleep.
  • SpO2 (blood oxygen): Measured by some trackers (Oura, Garmin, Apple Watch) to estimate sleep-disordered breathing events.
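The "stillness = likely sleep" rule in the accelerometer item above can be sketched in a few lines. This is an illustration only, not any vendor's algorithm: the function name `classify_epochs`, the threshold of 20, the smoothing window, and the sample counts are all made-up assumptions.

```python
from typing import List

def classify_epochs(activity_counts: List[float],
                    threshold: float = 20.0,
                    window: int = 2) -> List[str]:
    """Label each epoch 'sleep' or 'wake' from movement counts.

    Each epoch is averaged with up to `window` neighbours on either side,
    so one brief twitch does not flip a long still period to 'wake'.
    The threshold is a hypothetical value, not a validated cutoff.
    """
    labels = []
    n = len(activity_counts)
    for i in range(n):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        smoothed = sum(activity_counts[lo:hi]) / (hi - lo)
        labels.append("wake" if smoothed > threshold else "sleep")
    return labels

# Hypothetical per-epoch movement counts: restless at first, then still,
# with a single brief movement at index 8.
counts = [80, 60, 30, 5, 0, 0, 2, 0, 90, 4, 0, 0, 0, 0]
labels = classify_epochs(counts)
print(labels)
```

Note the built-in blind spot: someone lying awake but motionless would also be labeled "sleep", which is why movement alone cannot separate sleep stages and why trackers add heart rate and HRV signals.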

What Trackers Measure Accurately (and What They Don't)

Reasonably Accurate

  • Total sleep time: Generally within 20-30 minutes of polysomnography (PSG), the clinical gold standard, for most wearables
  • Sleep efficiency (percentage of time in bed actually asleep): Reasonable approximation
  • Resting heart rate: Generally accurate to within 1-2 BPM
  • HRV trends: Individual trend tracking is more meaningful than absolute values
  • Sleep timing (when you fell asleep, when you woke): Generally good
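Sleep efficiency, as defined in the list above, is simple arithmetic once each epoch of the night has a sleep or wake label. The sketch below assumes 30-second epochs and a made-up night; `sleep_summary` is a hypothetical helper, not a tracker API.

```python
def sleep_summary(labels, epoch_seconds=30):
    """Return (minutes asleep, minutes in bed, sleep efficiency in percent).

    `labels` holds one 'sleep' or 'wake' entry per epoch spent in bed.
    The 30-second epoch length is an assumption for this example.
    """
    minutes_in_bed = len(labels) * epoch_seconds / 60
    minutes_asleep = sum(1 for x in labels if x == "sleep") * epoch_seconds / 60
    efficiency = 100 * minutes_asleep / minutes_in_bed if minutes_in_bed else 0.0
    return minutes_asleep, minutes_in_bed, efficiency

# Made-up night: 20 min to fall asleep, one 30 min awakening, 8 h in bed.
night = ["wake"] * 40 + ["sleep"] * 800 + ["wake"] * 60 + ["sleep"] * 60
asleep, in_bed, efficiency = sleep_summary(night)
print(f"{asleep:.0f} min asleep of {in_bed:.0f} min in bed = {efficiency:.1f}% efficiency")
```

For this 8-hour night, a 30-minute error in total sleep time moves the efficiency figure by roughly 6 percentage points (30 / 480), which is why the number is best read as an approximation.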

Less Accurate

  • Sleep stages: PSG validation studies consistently show 60-80% accuracy for epoch-by-epoch stage identification — better than chance, but far from clinical-grade. Confusing light sleep (N1/N2) with quiet wakefulness is particularly common.
  • Deep sleep (N3) amounts: Frequently overestimated or underestimated; the specific amounts shown should not be taken literally
  • REM detection: Better than light sleep classification, but still imperfect; some algorithms confuse quiet wakefulness with REM
  • SpO2-based sleep apnea screening: Some devices (Withings Sleep Analyzer, specific Garmin models) have FDA-cleared algorithms; most do not and should not be used as apnea screening tools

The key insight: Use your tracker for trends and overall patterns (sleep timing consistency, HRV trajectory over weeks, total sleep time trends) rather than treating individual night readouts as precise clinical data. A single night showing "only 45 minutes of deep sleep" may be inaccurate; a consistent trend showing degraded sleep across many nights is more meaningful.
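To make "epoch-by-epoch accuracy" concrete: a validation study lines up the tracker's stage label for each epoch against the PSG label for the same epoch and counts the matches. The sketch below uses ten invented epochs and hypothetical function names; real studies compare hundreds of epochs per night across many participants.

```python
from collections import Counter

def epoch_agreement(tracker, psg):
    """Fraction of epochs where the tracker's stage matches the PSG stage."""
    if len(tracker) != len(psg):
        raise ValueError("hypnograms must cover the same epochs")
    if not psg:
        raise ValueError("hypnograms are empty")
    matches = sum(t == p for t, p in zip(tracker, psg))
    return matches / len(psg)

def confusion_counts(tracker, psg):
    """Count (psg_stage, tracker_stage) pairs to show which stages get mixed up."""
    return Counter(zip(psg, tracker))

# Invented example data, one label per epoch.
psg     = ["wake", "light", "light", "deep", "deep", "deep", "light", "rem", "rem", "wake"]
tracker = ["light", "light", "light", "deep", "light", "deep", "light", "rem", "wake", "wake"]

print(f"agreement: {epoch_agreement(tracker, psg):.0%}")
for (true_stage, guessed), n in confusion_counts(tracker, psg).items():
    if true_stage != guessed:
        print(f"PSG said {true_stage}, tracker said {guessed}: {n} epoch(s)")
```

Here the tracker agrees on 7 of 10 epochs, and the mismatches show how an overall figure can hide stage-specific errors such as quiet wake scored as light sleep.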

Orthosomnia: The Dark Side of Sleep Tracking

A clinical phenomenon called orthosomnia — sleep anxiety caused by excessive focus on sleep tracker data — has been formally described in the literature. Patients arrive at sleep clinics not because of traditional sleep complaints but because their tracker told them their sleep score was poor, and now they lie in bed anxious about achieving better data.

Ironically, the anxiety generated by poor sleep scores can worsen sleep quality — creating a self-fulfilling cycle. If you notice that checking your sleep data makes you anxious, or you're changing behaviors reactively to individual night scores rather than long-term trends, consider reducing tracker use frequency or removing the score from your regular check-in.

SpO2 Monitoring and Sleep Apnea Screening

Some consumer trackers with SpO2 sensors can flag patterns consistent with sleep-disordered breathing. This is one of the most clinically useful applications of consumer sleep technology. Consistent readings below 90% SpO2 during sleep, or frequent oxygen desaturation events flagged by the device, are worth discussing with a physician. However:

  • No consumer device is FDA-cleared to diagnose sleep apnea (as of 2025 — this is evolving)
  • The Withings Sleep Analyzer has FDA clearance for sleep apnea screening (not diagnosis)
  • A positive or suspicious screening result warrants an in-lab or home sleep apnea test (HSAT) — not treatment based on wearable data alone
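If your tracker lets you export overnight SpO2 readings, the 90% figure mentioned above can be checked with a short script. This is a rough sketch, not a screening algorithm: the function name, the handling of missing samples, and the sample values are assumptions, and the output is only a prompt for a conversation with a physician.

```python
def spo2_summary(readings, threshold=90.0):
    """Summarize overnight SpO2 samples (percent), skipping missing values.

    Returns None when there are no valid samples. The 90% threshold
    follows the figure used in this article.
    """
    valid = [r for r in readings if r is not None]
    if not valid:
        return None
    below = sum(1 for r in valid if r < threshold)
    return {
        "min": min(valid),
        "percent_below": 100 * below / len(valid),
    }

# Invented samples; None marks a reading lost to poor sensor contact.
samples = [96, 95, 94, None, 89, 88, 91, 95, 96, 87, 97]
summary = spo2_summary(samples)
print(summary)
```

One low night proves little, since poor sensor contact can produce spurious dips. What matters is whether the share of readings below the threshold stays elevated night after night.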

Tracker Comparison: Oura Ring vs Apple Watch vs Garmin vs Whoop

| Device | Form Factor | Key Strengths | Limitations | Subscription Required |
|---|---|---|---|---|
| Oura Ring Gen 4 | Ring | Excellent PPG signal from finger, strong research validation, skin temperature, HRV depth | No display, requires phone, $6/mo subscription for full data | Yes ($6/mo) |
| Apple Watch Series 9/Ultra 2 | Wrist watch | Excellent ecosystem integration, evolving sleep features, FDA-cleared irregular rhythm notification | Large form factor, needs nightly charging, sleep staging is basic | No (basic); optional Fitness+ |
| Garmin (Fenix/Forerunner/Venu) | Wrist watch | Long battery life (days-weeks), Body Battery metric, good HRV tracking, SpO2 | Sleep staging accuracy similar to competitors | No |
| Whoop 4.0 | Wrist band | Strong HRV focus, recovery scoring, 24/7 continuous HR monitoring, skin temp | Subscription-only model ($30/mo), no display on device | Yes ($30/mo) |
| Fitbit/Google Pixel Watch | Wrist watch | Affordable entry point, long history of sleep data | Sleep staging accuracy debated; Google data privacy concerns | Fitbit Premium optional |
| Withings Sleep Analyzer | Under-mattress | No wearable required, FDA-cleared snoring/apnea screening, breathing disturbance data | Doesn't travel with you, less HRV depth than wearables | No |

When to See a Doctor Based on Tracker Data

The following tracker findings are worth discussing with a physician even if you feel subjectively okay:

  • Consistent low SpO2 readings during sleep (frequently below 90%)
  • Frequent breathing disruptions or irregular heart rhythms flagged by the device
  • Resting heart rate elevated 10+ BPM above your normal baseline for more than a week without explanation
  • Chronically very low HRV with a declining trend over weeks (may reflect chronic stress, illness, or autonomic dysfunction)
  • Sleep efficiency (time asleep divided by time in bed) consistently below 75%
  • Total sleep time consistently under 6 hours despite adequate time in bed (may indicate a treatable sleep disorder)
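Several items above compare recent nights against your own baseline rather than a fixed number. As one example, the resting heart rate rule (10+ BPM above normal for more than a week) might be checked as below. The 28-night baseline window, the use of a median, and the function name are assumptions made for this sketch.

```python
from statistics import median

def sustained_rhr_elevation(nightly_rhr, baseline_nights=28,
                            delta_bpm=10, min_run=8):
    """True if each of the last `min_run` nights is at least `delta_bpm`
    above the median of the `baseline_nights` nights before them.

    min_run=8 encodes "more than a week"; the baseline length is a
    hypothetical choice. Returns False when there is too little history.
    """
    if len(nightly_rhr) < baseline_nights + min_run:
        return False
    baseline = median(nightly_rhr[-(baseline_nights + min_run):-min_run])
    recent = nightly_rhr[-min_run:]
    return all(r >= baseline + delta_bpm for r in recent)

# Invented histories of nightly resting heart rate in BPM.
elevated = [58] * 28 + [70] * 8           # eight straight nights at +12
recovered = [58] * 28 + [70] * 7 + [60]   # back near baseline on the last night

print(sustained_rhr_elevation(elevated))
print(sustained_rhr_elevation(recovered))
```

Requiring every recent night to exceed the baseline keeps a single bad night (alcohol, a late workout, a loose strap) from triggering the flag, in line with the advice to act on trends rather than individual readings.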

Frequently Asked Questions

Are sleep tracker sleep stages accurate?

Moderately, but with important limitations. Validation studies comparing consumer wearables to polysomnography (PSG), the gold standard for sleep staging, consistently show 60-80% epoch-by-epoch agreement. This is better than chance but not clinically reliable. Treat the specific minutes shown for each sleep stage as approximations with error ranges of ±15-30%. Trends across many nights are more meaningful than any individual reading.

Which sleep tracker is most accurate?

The Oura Ring and Whoop have the most peer-reviewed validation research among consumer wearables, and generally perform somewhat better in head-to-head comparisons with PSG. The finger location of the Oura Ring provides a strong PPG signal. However, all consumer devices have similar broad limitations for sleep stage accuracy; the differences between leading devices are often smaller than the gap between all of them and clinical PSG.

Can a sleep tracker detect sleep apnea?

Some devices can screen for sleep-disordered breathing patterns, but currently no consumer wearable is FDA-cleared to diagnose sleep apnea. The Withings Sleep Analyzer has FDA clearance for sleep apnea screening. If your device consistently flags breathing disruptions or low SpO2 during sleep, treat this as a reason to get a clinical sleep study, not as a diagnosis. Sleep apnea requires diagnosis by a qualified clinician and ideally a home sleep apnea test or in-lab polysomnography.

My sleep score was terrible but I feel fine. Should I be worried?

How you feel is important clinical data too. A single poor sleep score combined with feeling subjectively well is most likely a tracking artifact rather than a genuine problem. Single-night sleep scores are particularly unreliable: readings can be affected by alcohol (which changes HRV patterns), medications, sleeping in an unusual position, or the device fitting differently. Use your subjective wellbeing and a trend over many nights to assess sleep quality rather than reacting to individual night scores.

What is HRV and why does my tracker measure it?

Heart rate variability (HRV) is the variation in time between consecutive heartbeats. Contrary to what you might expect, a healthy heart doesn't beat like a metronome; the intervals between beats vary naturally, driven by the autonomic nervous system. Higher HRV generally reflects better autonomic function, lower physiological stress, and better recovery. During quality sleep, HRV typically increases. Chronic low HRV is associated with physiological stress, overtraining, illness, and poor sleep quality. HRV trends over weeks are more useful than individual night readings.
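To see what "variation in time between consecutive heartbeats" means numerically, here is one common way to summarize it: RMSSD, the root mean square of successive differences between beat-to-beat (RR) intervals. This article does not specify which HRV statistic any given tracker reports, so treat the choice of RMSSD and the sample intervals as assumptions for illustration.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Both invented series average about 60 beats per minute.
metronome = [1000, 1000, 1000, 1000, 1000]   # perfectly even beats
natural = [1000, 1040, 980, 1030, 990]       # beat-to-beat variation

print(f"metronome RMSSD: {rmssd(metronome):.1f} ms")
print(f"natural RMSSD:   {rmssd(natural):.1f} ms")
```

The two series have nearly the same average heart rate but very different RMSSD values, which is why HRV carries information that resting heart rate alone does not.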

This content is for educational purposes only. Sleep trackers are not medical devices and cannot diagnose sleep disorders. See a physician or sleep specialist if you have concerns about your sleep health.