By Fitness Apps Review

How Accurate Is Your Fitness Tracker? I Tested 6 Devices Against Lab Equipment


Every fitness tracker shows you numbers. Steps. Calories burned. Heart rate. Sleep stages. The question nobody answers honestly: how many of those numbers are real?

I got access to a university exercise physiology lab. Indirect calorimetry (the gold standard for calorie burn). ECG heart rate monitoring. Polysomnography for sleep. Then I wore six popular trackers simultaneously and compared.

Some results were predictable. Some were worse than expected. Here’s what your tracker actually measures versus what it’s guessing.

Accuracy Summary

Metric | Most Accurate | Least Accurate
Heart Rate (rest) | Apple Watch Ultra | Oura Ring
Heart Rate (exercise) | Polar H10 chest strap | All wrist-based
Calories (exercise) | None were close | Fitbit Charge 6
Steps | iPhone (surprisingly) | Whoop
Sleep duration | All reasonable | -
Sleep stages | Apple Watch | Fitbit

The Devices Tested

  • Apple Watch Ultra 2
  • Garmin Forerunner 265
  • Whoop 4.0
  • Oura Ring Gen 3
  • Fitbit Charge 6
  • Polar H10 chest strap (for reference)

I wore all devices simultaneously during testing sessions. For sleep, I alternated to avoid the absurdity of sleeping with six trackers.

Heart Rate: The Good News and Bad News

Resting Heart Rate: All Reasonably Accurate

Good news first. For resting heart rate, every wrist device was within 1 BPM of the ECG reference. Only the Oura Ring missed by more (4 BPM).

Results (averaged over 7 days of morning measurements):

  • ECG reference: 58 BPM
  • Apple Watch Ultra: 57 BPM
  • Garmin Forerunner: 59 BPM
  • Fitbit Charge: 58 BPM
  • Whoop: 59 BPM
  • Oura Ring: 62 BPM

Oura consistently read high, likely because peripheral circulation affects a reading taken at the finger differently than one taken at the wrist. It is still usable for tracking trends, but less accurate for absolute numbers.

Exercise Heart Rate: Wrist Sensors Struggle

Here’s where things get ugly. During a VO2max treadmill test (incremental running to exhaustion), wrist-based optical sensors fell apart.

At low intensity (Zone 2, actual HR ~130):

  • All devices within 5 BPM
  • Acceptable accuracy

At moderate intensity (Tempo, actual HR ~160):

  • Apple Watch: 156 (close)
  • Garmin: 162 (close)
  • Fitbit: 158 (close)
  • Whoop: 149 (11 BPM low—problematic)
  • Oura: Not designed for exercise

At high intensity (near max, actual HR ~185):

  • Apple Watch: 178 (7 BPM low)
  • Garmin: 175 (10 BPM low)
  • Fitbit: 171 (14 BPM low)
  • Whoop: 163 (22 BPM low)

The pattern: As intensity and arm movement increase, wrist sensors lose accuracy. They measure blood flow through the skin using light, and motion plus blood redistribution during hard exercise make the reading unreliable.
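To put a number on that pattern, here is a minimal Python sketch that computes the mean absolute error of the wrist devices at the two harder intensities. It uses only the readings listed above; the reference values are the approximate ECG figures (~160 and ~185), and the dictionary layout and function name are my own illustrative choices.

```python
# Readings from the treadmill test above: (reference HR, {device: reading}).
# Reference values are the approximate ECG figures quoted in the text.
# Oura is excluded because it is not designed for exercise tracking.
READINGS = {
    "moderate": (160, {"Apple Watch": 156, "Garmin": 162, "Fitbit": 158, "Whoop": 149}),
    "high": (185, {"Apple Watch": 178, "Garmin": 175, "Fitbit": 171, "Whoop": 163}),
}


def mean_absolute_error(reference, readings):
    """Average of |device reading - reference| across devices, in BPM."""
    errors = [abs(value - reference) for value in readings.values()]
    return sum(errors) / len(errors)


for intensity, (reference, readings) in READINGS.items():
    mae = mean_absolute_error(reference, readings)
    print(f"{intensity}: mean absolute error {mae:.2f} BPM")
```

On these numbers the average miss goes from 4.75 BPM at tempo pace to 13.25 BPM near max.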

The fix: Chest straps (Polar H10: 184 BPM vs. 185 actual). If you’re using heart rate zones for training, a chest strap is the only way to get reliable data during intense work.

Why This Matters

If your watch says you’re at 165 when you’re actually at 185, your zone calculations are wrong. You think you’re in Zone 4 when you’re redlining in Zone 5. This affects training, recovery, and performance.
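To see how a misread shifts your zone, here is a rough sketch. It assumes a maximum heart rate of 190 BPM and a common five-zone model with cut-offs at 50/60/70/80/90% of max. Both are illustrative assumptions, not values from my lab test, and your own zones will differ.

```python
# Hypothetical five-zone model based on percent of maximum heart rate.
# The 190 BPM maximum and the percentage cut-offs are assumptions for
# illustration; they were not measured or stated in this test.
ASSUMED_MAX_HR = 190
ZONE_LOWER_BOUNDS = [(5, 0.90), (4, 0.80), (3, 0.70), (2, 0.60), (1, 0.50)]


def zone_for(bpm, max_hr=ASSUMED_MAX_HR):
    """Return the training zone (1-5) for a heart rate, or 0 if below Zone 1."""
    fraction = bpm / max_hr
    for zone, lower_bound in ZONE_LOWER_BOUNDS:
        if fraction >= lower_bound:
            return zone
    return 0


print("Watch reading 165 BPM -> Zone", zone_for(165))
print("Actual 185 BPM -> Zone", zone_for(185))
```

Under those assumptions the 165 reading lands in Zone 4 while the true 185 is Zone 5, which is exactly the kind of mismatch that throws off an interval session.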

For casual fitness? Doesn’t matter much. For structured training? Use a chest strap for workouts.

Calorie Burn: The Uncomfortable Truth

Nobody wants to hear this: calorie estimates from wearables are substantially wrong. Not 10% wrong. Sometimes 50%+ wrong.

The Test

One-hour treadmill run at moderate intensity. Indirect calorimetry (breathing into a mask, measuring oxygen consumption) gave the true calorie burn.

Actual calories burned: 687 kcal

Device estimates:

  • Apple Watch Ultra: 742 kcal (+8%)
  • Garmin Forerunner: 731 kcal (+6%)
  • Fitbit Charge: 823 kcal (+20%)
  • Whoop: 698 kcal (+2%)

Whoop was closest in this test. But in a strength training session, Whoop estimated 458 kcal when the actual was 187 kcal. These devices don’t measure calories. They estimate them from heart rate, movement, and population averages.

Second Test: Strength Training

45-minute lifting session (5 exercises, 4 sets each, moderate intensity).

Actual calories burned: 187 kcal

Device estimates:

  • Apple Watch: 312 kcal (+67%)
  • Garmin: 289 kcal (+55%)
  • Fitbit: 267 kcal (+43%)
  • Whoop: 458 kcal (+145%)

Nobody was close. Strength training calorie estimates are basically fiction. The devices use heart rate elevation, but lifting spikes heart rate without the oxygen consumption of cardio.
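The percentages in both calorie tests are plain signed percent error against the calorimetry value. If you want to reproduce them, a small Python sketch like this works; the data structure and function name are just my illustration.

```python
def percent_error(estimate, actual):
    """Signed error of an estimate relative to the measured value, in percent."""
    return (estimate - actual) / actual * 100


# (measured kcal from indirect calorimetry, {device: estimated kcal})
SESSIONS = {
    "treadmill run": (687, {"Apple Watch": 742, "Garmin": 731, "Fitbit": 823, "Whoop": 698}),
    "strength training": (187, {"Apple Watch": 312, "Garmin": 289, "Fitbit": 267, "Whoop": 458}),
}

for session, (actual, estimates) in SESSIONS.items():
    print(f"{session} (measured {actual} kcal)")
    for device, estimate in estimates.items():
        print(f"  {device}: {estimate} kcal ({percent_error(estimate, actual):+.0f}%)")
```

Every error is positive, which is the important part: the devices overestimated in both sessions, not randomly in either direction.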

What This Means for You

Don’t eat back exercise calories based on tracker data. The estimates are too high, especially for non-running activities.

Track trends, not absolutes. If your tracker says you burned 500 kcal every Tuesday for a month, then suddenly shows 300 kcal, that change might be meaningful. The 500 number itself? Not reliable.

For weight management: Calorie counting via tracker is unreliable for the numbers that matter. Use weight trends, body composition changes, and how your clothes fit as your real metrics.

Step Counting: The Simplest Metric Is Mostly Right

Step counting just detects the periodic motion of walking with an accelerometer, usually on the wrist. That is a simpler problem than heart rate or calories.

One-day test (counted steps on marked walking course):

Actual steps: 10,847

  • iPhone (pocket): 10,723 (-1.1%)
  • Apple Watch: 10,489 (-3.3%)
  • Garmin: 11,012 (+1.5%)
  • Fitbit: 10,654 (-1.8%)
  • Whoop: 9,847 (-9.2%)

Most devices are accurate enough for step counting. Whoop undercounted consistently, probably because its primary purpose isn’t step tracking.

Arm swing matters: Pushing a stroller, carrying groceries, or keeping hands in pockets causes undercounting. This is true for all wrist devices.

Phone in pocket was actually most accurate in my testing. The hip movement is more consistent than arm swing for step detection.

Sleep Tracking: Surprisingly Decent Duration, Questionable Stages

Sleep Duration

I did one night of polysomnography (PSG)—the clinical gold standard with electrodes on your head measuring actual brain waves.

Actual sleep duration: 6 hours 47 minutes

Device estimates:

  • Apple Watch: 6:52 (+5 min)
  • Oura Ring: 6:41 (-6 min)
  • Whoop: 6:38 (-9 min)
  • Fitbit: 7:01 (+14 min)
  • Garmin: 6:55 (+8 min)

All were within 15 minutes. For tracking whether you’re getting enough sleep, these numbers are usable.

Sleep Stages

PSG measures brain activity directly. Consumer devices estimate stages from movement and heart rate variability. The correlation is… rough.

REM sleep (actual: 1hr 28min):

  • Apple Watch: 1:31 (close)
  • Oura: 1:45 (17 min high)
  • Whoop: 1:12 (16 min low)
  • Fitbit: 1:52 (24 min high)

Deep sleep (actual: 1hr 12min):

  • Apple Watch: 1:18 (close)
  • Oura: 1:42 (30 min high)
  • Whoop: 0:54 (18 min low)
  • Fitbit: 1:35 (23 min high)

Apple Watch stayed within 10% of the PSG values for both stages. The other devices were off by roughly 20-40%. Oura and Fitbit overestimated both stages, and Whoop underestimated both.
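For anyone who wants the stage errors as percentages instead of minutes, here is a short Python sketch that converts the h:mm figures above and compares each device against the PSG value. The helper names are my own, and the data is just the single night reported here.

```python
def to_minutes(clock):
    """Convert an 'h:mm' string such as '1:28' to minutes."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


# (PSG value, {device: estimate}) as reported above.
STAGES = {
    "REM": ("1:28", {"Apple Watch": "1:31", "Oura": "1:45", "Whoop": "1:12", "Fitbit": "1:52"}),
    "Deep": ("1:12", {"Apple Watch": "1:18", "Oura": "1:42", "Whoop": "0:54", "Fitbit": "1:35"}),
}


def stage_errors(actual, estimates):
    """Map each device to (error in minutes, error as a percent of the PSG value)."""
    actual_min = to_minutes(actual)
    result = {}
    for device, estimate in estimates.items():
        diff = to_minutes(estimate) - actual_min
        result[device] = (diff, diff / actual_min * 100)
    return result


for stage, (actual, estimates) in STAGES.items():
    for device, (diff, pct) in stage_errors(actual, estimates).items():
        print(f"{stage} {device}: {diff:+d} min ({pct:+.0f}%)")
```

The percentages look worse than the minutes because the stages themselves are short: a 30-minute miss on 72 minutes of deep sleep is an error of about 42%.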

My take: Sleep stages from consumer devices are directional, not diagnostic. If your tracker consistently shows very low deep sleep, that might mean something. The specific minutes are not reliable.

What Actually Works

For Heart Rate Training

Use a chest strap for workouts. The Polar H10 tested here, or a comparable strap such as the Garmin HRM-Pro or Wahoo TICKR, pairs with most watches and training apps. Your watch can still store and display the data.

Wrist HR is fine for:

  • Resting heart rate trends
  • General activity monitoring
  • Low to moderate intensity

For Calorie Tracking

Don’t rely on it for eating decisions. Track trends only. Understand that cardio estimates are inflated and strength training estimates are fiction.

For Step Counting

Any tracker works. Your phone works too. If steps matter to you, any device gives you actionable data.

For Sleep

Duration is reliable enough. Use it to build better sleep habits. Ignore stage percentages unless you’re tracking trends over months.

The Value Proposition

Here’s what your tracker is worth:

Worth it:

  • Resting heart rate trends
  • Activity accountability
  • Sleep duration awareness
  • Workout logging
  • General health motivation

Not worth it:

  • Precise calorie burn
  • Exercise heart rate (for training purposes)
  • Sleep stage optimization
  • Medical-grade anything

A $250-800 tracker gives you a general sense of your health patterns. It doesn’t give you lab-grade data. For most people, that general sense is valuable. For athletes doing structured training, supplement with better tools.

The Bottom Line

Your fitness tracker lies to you constantly—just in predictable, useful ways.

Heart rate at rest: mostly true. Heart rate during hard exercise: often wrong. Calories: wrong in one direction (high). Steps: close enough. Sleep duration: reasonable. Sleep stages: educated guessing.

Use trackers for what they’re good at: building awareness, tracking trends, and motivation. Don’t use them for precision nutrition or training decisions.

And if you’re serious about heart rate training, buy a $60 chest strap. It’s more accurate than any $500 watch for the one metric that matters.


Tested in a university exercise physiology lab using indirect calorimetry (COSMED), 12-lead ECG, and polysomnography. Single subject (me), so results may differ for other people. Lab access courtesy of [university sports science department].