Quick Answer (TL;DR)

Apple Watch is one of the more accurate consumer sleep trackers available — but its accuracy differs significantly by sleep stage. In the most rigorous peer-reviewed validation studies, Apple Watch Series 8 correctly identifies light sleep (Core sleep) about 86% of the time and REM sleep about 83% of the time, compared to in-lab polysomnography (the clinical gold standard). Deep sleep accuracy is lower, around 51%, reflecting an inherent limitation of all wrist-based sensors rather than a flaw unique to Apple Watch. These figures come from two open-access studies published in 2024–2025 and are described in detail below.

Key Takeaways

What Apple Watch Actually Measures During Sleep

Understanding accuracy starts with understanding what the sensors do. When you sleep with Apple Watch:

Optical heart rate sensor (photoplethysmography / PPG): Measures blood volume changes in the wrist capillaries using green and infrared LEDs. Provides continuous heart rate and is the primary physiological input for sleep stage classification. Heart rate patterns differ between sleep stages — REM sleep, for example, shows more variable pulse and breathing than deep sleep, where both slow considerably [Sleep Foundation: REM Sleep].

Heart rate variability (HRV): The variation in the interval between heartbeats reflects autonomic nervous system activity. Stimulation of the parasympathetic nervous system is associated with an increase in HRV indices such as RMSSD, while sympathetic predominance reduces them [PMC10684820]. This signal is one input to sleep stage classification algorithms.
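
As a rough illustration of the RMSSD index mentioned above, the sketch below computes it from a list of beat-to-beat (RR) intervals. The function name and the sample intervals are hypothetical and chosen only for illustration. This is the standard textbook formula, not a description of how Apple Watch processes HRV internally.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Difference between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical beat-to-beat intervals in milliseconds (illustrative values only).
steady = [1000, 1002, 998, 1001, 999]
variable = [1000, 1060, 950, 1040, 970]

print(round(rmssd(steady), 1), round(rmssd(variable), 1))
```

A larger RMSSD means more beat-to-beat variation, which is the direction associated with parasympathetic activity in the passage above.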

Six-axis accelerometer: Detects movement in all three spatial dimensions. Stillness is consistent with sleep, whether light or deep; movement is associated with arousals and transitions. Accelerometer data alone cannot reliably distinguish between specific sleep stages; pairing it with heart rate improves accuracy [PMC10948771].

Blood oxygen saturation (SpO2): Available on Series 6 and later. Measures oxygen saturation via infrared and red LEDs. Periodic checks during sleep can flag oxygen desaturation events, which are a hallmark of sleep-disordered breathing.

Wrist temperature (Series 8 and later): Measures nightly temperature deviation from your personal baseline. Apple uses this data to support retrospective ovulation estimates in the Cycle Tracking feature [Apple Support]. Research confirms that continuously measured wrist skin temperature during sleep is more sensitive than basal body temperature for detecting ovulation-related temperature shifts [Zhu et al., JMIR, 2021]. Apple does not market this sensor as a general circadian rhythm tracker.

The Accuracy Research: What Published Studies Show

Study 1 — Robbins et al. 2024 (Sensors)

A single-night in-lab study enrolled 35 healthy adults (aged 20–50) who wore Apple Watch Series 8 simultaneously with full polysomnography [PMC11511193]. Polysomnography remains the clinical standard for sleep staging: it records brain waves (EEG), eye movements (EOG), muscle tone (EMG), heart rate, and breathing throughout the night, and a trained technician scores the recording [NHLBI].

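Apple Watch Series 8 showed the following sensitivity (the share of PSG-scored epochs of a given stage that the watch also labeled as that stage) and precision (the share of the watch's labels for a stage that PSG confirmed) compared to PSG: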

Sleep stage              Sensitivity   Precision (PPV)
Light (Core) sleep       86.1%         72.7%
REM sleep                82.6%         77.7%
Deep (Slow-wave) sleep   50.5%         87.8%
Sleep vs. wake           97%

The watch significantly underestimated deep sleep duration on average (43 minutes less than PSG) and overestimated light sleep duration (45 minutes more), though total sleep duration was comparable to PSG (typically within 10 minutes). The authors noted “poor concordance” for deep and REM sleep despite moderate sensitivity.
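
To make the two metrics concrete, here is a minimal sketch that derives per-stage sensitivity and precision from paired epoch labels. The function name and the label sequences are hypothetical, not data from the study. The example assumes each position is one scored epoch (PSG is conventionally scored in 30-second epochs).

```python
def stage_metrics(psg, watch, stage):
    """Return (sensitivity, precision) for one stage from paired epoch labels."""
    if len(psg) != len(watch):
        raise ValueError("label sequences must be the same length")
    pairs = list(zip(psg, watch))
    tp = sum(1 for p, w in pairs if p == stage and w == stage)  # both agree
    fn = sum(1 for p, w in pairs if p == stage and w != stage)  # watch missed it
    fp = sum(1 for p, w in pairs if p != stage and w == stage)  # watch overcalled
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision

# Hypothetical epoch labels (illustrative only, not study data).
psg   = ["deep", "deep", "deep", "deep", "core", "core", "rem", "rem"]
watch = ["deep", "deep", "core", "core", "core", "core", "rem", "core"]

for stage in ("core", "rem", "deep"):
    print(stage, stage_metrics(psg, watch, stage))
```

In this toy data the watch never calls deep sleep incorrectly (high precision) but misses half of the true deep epochs and labels them as core (low sensitivity). That is the same general pattern as in the table: deep sleep is underestimated and light sleep overestimated.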

Study 2 — Schyvens et al. 2025 (Sleep Advances)

This study compared six devices against PSG in 62 adult participants, with usable Apple Watch Series 8 data available for 20 of them [PMC12038347]. Among the six devices tested, Apple Watch Series 8 achieved the highest agreement with PSG.

The authors concluded that Apple Watch demonstrated “clinically acceptable accuracy” for measuring total sleep time and sleep efficiency, and could be useful for tracking substantial changes in sleep architecture over time — though it should not replace PSG for clinical diagnosis.

What Both Studies Tell You

No consumer wearable reliably classifies deep sleep at PSG-comparable accuracy. A 2024 systematic review of 35 studies and 62 wearable device setups found that four-stage sleep classification (wake, light, deep, REM) averaged only 65.2% accuracy across all tested devices [PMC10948771]. Apple Watch’s performance is toward the upper end of what current wrist-worn technology can achieve.

The Deep Sleep Problem Explained

Deep sleep (N3, slow-wave sleep) is characterized by high-amplitude, low-frequency delta brain waves [Sleep Foundation: Deep Sleep]. These waves are only directly measurable by scalp EEG electrodes in a polysomnography setup. Consumer wearables infer deep sleep from indirect proxies:

The problem is that these proxies are not unique to deep sleep. A person lying still in light sleep can look similar on wrist sensors to a person in deep sleep. This explains why deep sleep is consistently the least accurate stage for all consumer wearables in published studies — it is an inherent measurement limitation, not a defect specific to Apple Watch.

Practical guidance: Interpret trends over time rather than precise nightly numbers. A consistent pattern of reduced deep sleep across multiple nights is more meaningful than any single night’s figure.
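
One simple way to read trends rather than single nights is a trailing average. The sketch below assumes you have exported nightly deep sleep minutes into a list. The 7-night window, the function name, and the sample values are illustrative assumptions, not a recommendation from the studies cited here.

```python
from statistics import mean

def rolling_mean(values, window=7):
    """Trailing mean over the last `window` nights; None until enough nights exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out

# Hypothetical nightly deep sleep minutes as reported by the watch.
deep_minutes = [52, 61, 30, 58, 55, 49, 60, 41, 38, 44, 36, 40, 35, 39]
trend = rolling_mean(deep_minutes, window=7)

print(trend[6], trend[-1])
```

In this made-up series, the single low night (30 minutes) barely moves the first weekly average, while the sustained drop in the second week shows up clearly in the last one.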

Apple Watch vs. iPhone-Only Sleep Tracking

iPhone-based sleep tracking (placing the phone on a nightstand or mattress) is less capable than Apple Watch for sleep stage estimation because:

  1. No physiological signals: The iPhone accelerometer detects body movement but cannot measure heart rate, HRV, or blood oxygen.

  2. Sensor placement: A phone on a nightstand picks up room-level vibration rather than movement measured directly at the wrist. Wrist actigraphy, the movement-based approach that Apple Watch builds on, achieves high sensitivity for detecting sleep (96.5%) compared to PSG in validated studies, though specificity for wakefulness remains lower (32.9%) [Marino et al., Sleep, 2013].

  3. Multi-sensor advantage: Adding optical heart rate (PPG) to wrist accelerometer data meaningfully improves sleep stage discrimination over accelerometer data alone [PMC10948771]. The iPhone alone cannot offer this.

iPhone-only tracking is useful for estimating time in bed as a rough proxy for sleep duration, but it is substantially less reliable for a sleep stage breakdown.

How to Get the Best Accuracy from Apple Watch Sleep Tracking

Wear it correctly. The Apple Watch should be snug enough that the optical sensors maintain consistent contact with your wrist skin. A loose fit produces motion artifact that degrades heart rate readings and sleep stage classification.

Charge before bed. Low battery can cause the watch to reduce sensor sampling frequency. Starting the night at adequate charge avoids mid-night interruptions to tracking.

Enable Sleep Focus. Sleep Focus in watchOS signals to health algorithms that sleep tracking is active and reduces background disturbances.

Use an app that combines both sensors. iPhone microphone snore detection and Apple Watch sleep stage data are complementary. Apple Watch provides sleep stage estimates; the iPhone microphone provides acoustic context — including whether snoring coincided with particular sleep stages. Together, they give you richer information than either sensor provides alone.

Snollo is built specifically to combine both data streams. Apple Watch contributes sleep stage classification; the iPhone microphone contributes snore detection and audio clips — all processed on-device with no server upload.

Apple Watch Sleep Apnea Notification

Apple Watch Series 9, Series 10, Ultra 2, and SE 3 include a Sleep Apnea Notification feature that received FDA marketing authorization in September 2024 [Apple Support]. It uses the accelerometer to detect breathing disturbances during sleep over a 30-day rolling window; if a consistent elevated pattern is detected, you receive a notification.

The feature is explicitly not intended to diagnose, treat, or aid in the management of sleep apnea [Apple Support]. An official sleep apnea diagnosis requires a clinical sleep study [NHLBI]. This feature functions as a screening prompt rather than a diagnostic tool.

What the Numbers Mean for Real Users

If Apple Watch reports 90 minutes of REM sleep, the actual figure could reasonably be somewhat higher or lower given the device’s ~83% sensitivity for that stage. Deep sleep figures carry more uncertainty given the ~51% sensitivity. The Robbins 2024 study found the watch underestimated deep sleep by an average of 43 minutes compared to PSG, suggesting real deep sleep may typically exceed what the watch reports.
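
As a back-of-the-envelope illustration, the Robbins 2024 sensitivity and precision figures can be combined to see what a reported number would imply if those pooled rates applied to a given night. That is a strong assumption: the rates are study-level averages from healthy adults on a single lab night, so the result is an illustration of direction, not a correction formula. The function name and the 60-minute deep sleep input are hypothetical.

```python
def implied_true_minutes(reported_minutes, sensitivity, precision):
    """Rough estimate of PSG-scored minutes implied by watch-reported minutes.

    Assumes the pooled study rates hold for this night (an illustrative
    assumption, not something the studies claim for individual users).
    """
    correct = reported_minutes * precision  # reported minutes PSG would confirm
    return correct / sensitivity            # scale up for minutes the watch missed

rem = implied_true_minutes(90, sensitivity=0.826, precision=0.777)
deep = implied_true_minutes(60, sensitivity=0.505, precision=0.878)

print(round(rem), round(deep))
```

Under these assumptions, 90 reported minutes of REM implies about 85 actual minutes, while 60 reported minutes of deep sleep implies about 104, which is in line with the study's finding that the watch tends to underreport deep sleep.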

The practical rule — use Apple Watch sleep data for:

Do not use consumer wearable data for:

Apple Watch is among the strongest consumer sleep trackers currently available. Its limitations are real, documented in peer-reviewed research, and understood. Working with those limitations — treating the data as useful trend information rather than clinical measurement — is how to get the most value from it.

Snollo does not diagnose or treat any medical condition. Consumer sleep trackers, Apple Watch included, are useful for noticing patterns over time. A clinical assessment requires a polysomnography sleep study or home sleep apnea test ordered by a physician.

Sources

  1. Robbins R, Weaver MD, Sullivan JP, et al. Accuracy of Three Commercial Wearable Devices for Sleep Tracking in Healthy Adults. Sensors (Basel). 2024;24(20):6532. https://pmc.ncbi.nlm.nih.gov/articles/PMC11511193/

  2. Schyvens A-M, et al. A performance validation of six commercial wrist-worn wearable sleep-tracking devices for sleep stage scoring compared to polysomnography. Sleep Advances. 2025;6(2):zpaf021. https://pmc.ncbi.nlm.nih.gov/articles/PMC12038347/

  3. Marino M, Li Y, Rueschman MN, et al. Measuring sleep: accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep. 2013;36(11):1747–1755. https://pmc.ncbi.nlm.nih.gov/articles/PMC3792393/

  4. Evaluating reliability in wearable devices for sleep staging. PMC. 2024. https://pmc.ncbi.nlm.nih.gov/articles/PMC10948771/

  5. Sleep Foundation. The Stages of Sleep. https://www.sleepfoundation.org/stages-of-sleep

  6. Sleep Foundation. Deep Sleep. https://www.sleepfoundation.org/stages-of-sleep/deep-sleep

  7. Sleep Foundation. REM Sleep. https://www.sleepfoundation.org/stages-of-sleep/rem-sleep

  8. National Heart, Lung, and Blood Institute. Sleep Studies. https://www.nhlbi.nih.gov/health/sleep-studies

  9. National Heart, Lung, and Blood Institute. Sleep Apnea. https://www.nhlbi.nih.gov/health/sleep-apnea

  10. Apple Support. Sleep apnea notifications on your Apple Watch. https://support.apple.com/en-us/120031

  11. Apple Support. Track your nightly wrist temperature changes with Apple Watch. https://support.apple.com/en-us/102674

  12. Zhu J, et al. The Accuracy of Wrist Skin Temperature in Detecting Ovulation Compared to Basal Body Temperature. J Med Internet Res. 2021;23(4):e25707. https://pmc.ncbi.nlm.nih.gov/articles/PMC8238491/

  13. An Integrative Literature Review of Heart Rate Variability Measures to Determine Autonomic Nervous System Responsiveness using Pharmacological Manipulation. PMC. 2023. https://pmc.ncbi.nlm.nih.gov/articles/PMC10684820/