
The Flawed V02 Max Craze


In the past couple of weeks I’ve had 2 patients contact me because they were worried: their V02 max was decreasing. Their data were based on smartwatch imputations, which are notoriously imprecise. But the problem is much bigger than that. In this edition of Ground Truths I’m going to get into the difference between cardiorespiratory fitness and V02 max, which are remarkably different in the way they are measured; the datasets that assess them for functional significance and outcomes in healthy adults; and how we got into this craze.


A schematic I made with NanoBanana Pro (the use of a treadmill or bicycle is interchangeable but the measurements are altogether different).

How They Are Measured

Cardiorespiratory fitness (CRF) is a real-world assessment of a person’s activities, such as walking outdoors or on a treadmill. It is measured in metabolic equivalent of task (MET) units, which are multiples of a person’s resting metabolic rate, with 3 recognized levels of intensity: light (<3.0 METs), for example slow walking; moderate (3.0-5.9 METs), for example brisk walking at 3-4 miles per hour; and vigorous (≥6.0 METs), for example jogging. 1 MET is the energy used in sitting or resting; 10 METs requires 10-fold the energy expenditure. CRF integrates cardiovascular, lung and musculoskeletal functional capacity.

There are multiple methods to calculate your METS: reading a standard treadmill MET chart (below left) that plots speed and incline; using a formula if you are doing the Bruce treadmill protocol, or the chart below (right); or using heart rate (with any aerobic activity, such as bicycling or jogging) with the formula: METS = 0.05 × heart rate + 2. So if your HR got to 140 that would be 9 METS. By this formula, every increase in heart rate of 10 beats per minute adds about half a MET.
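The heart-rate formula is simple enough to sketch in a few lines of Python. This is only an illustration of the rule of thumb quoted in this post, not a validated clinical calculator. The function names are hypothetical, and treating exactly 6.0 METs as vigorous is an assumption, since the published ranges leave that boundary open.

```python
def mets_from_heart_rate(heart_rate_bpm: float) -> float:
    """Estimate METs from exercise heart rate: METS = 0.05 x HR + 2."""
    if heart_rate_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return 0.05 * heart_rate_bpm + 2.0


def intensity_level(mets: float) -> str:
    """Map a MET value onto the three recognized intensity levels."""
    if mets < 3.0:
        return "light"
    if mets < 6.0:
        return "moderate"
    # Assumption: 6.0 METs and above counts as vigorous.
    return "vigorous"


if __name__ == "__main__":
    # Example exercise heart rates (chosen for illustration).
    for hr in (120, 140, 160):
        mets = mets_from_heart_rate(hr)
        print(f"HR {hr} bpm: about {mets:.1f} METs ({intensity_level(mets)})")
```

With a heart rate of 140 this reproduces the 9 METS worked out in the text.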

Maximal oxygen uptake (V02 max) is only accurately determined as a performance lab test with a metabolic cart, trained technicians, a specialized tightly fit mask that captures every molecule of inhaled oxygen and exhaled C02 on a ramped treadmill or stationary bike exercise protocol until absolute exhaustion. This is the ceiling of aerobic power achieved via direct gas exchange. A V02 max test costs about $150 for a standard assessment in a university lab.

V02 max from wearables is obviously not measured directly by gas exchange, but instead through various imputations based upon population-based algorithms of heart rate and movement (GPS/accelerometry). Studies have assessed the Apple Watch, Garmin Fenix 6, and Fitbit and found a mean absolute percentage error of 7-16%. Overall, they have been found to consistently underestimate V02 max in fit people while overestimating it in unfit individuals. They also rely on optical heart rate (which may be inaccurate in people of color), device positioning and wrist anatomy, and can be influenced by such factors as hydration status, altitude, and ambient temperature. Typically, a 6-minute walk is the basis for a wearable to provide the user a new V02 max result. That may not be at all representative of an individual’s exercise capacity.
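For readers unfamiliar with the error metric quoted for wearables, here is a minimal Python sketch of how a mean absolute percentage error is computed. The lab and watch values are made up purely for illustration (they are not from any of the studies), and the function name is hypothetical.

```python
def mean_absolute_percentage_error(reference, estimates):
    """Return the MAPE (in percent) of estimates against reference values."""
    if len(reference) != len(estimates) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(est - ref) / ref for ref, est in zip(reference, estimates))
    return 100.0 * total / len(reference)


# Hypothetical V02 max values in ml/kg/min (illustrative only, not study data).
lab_measured = [30.0, 40.0, 50.0]     # direct gas exchange on a metabolic cart
watch_estimated = [33.0, 40.0, 45.0]  # wearable: high for the unfit, low for the fit

mape = mean_absolute_percentage_error(lab_measured, watch_estimated)
print(f"MAPE: {mape:.1f}%")
```

Because the metric averages absolute errors, it hides the direction of the bias: an overestimate in one person and an underestimate in another both count against the device.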


The Datasets For Assessment

Datasets for Cardiorespiratory Fitness

In JAMA in 2009, a meta-analysis of 33 studies with a total of 102,980 participants examined the relationship of cardiorespiratory fitness to all-cause mortality. A better CRF (per 1 MET higher) was linked with a lower all-cause mortality (Figure) and individuals who had achieved 7.9 METs had substantially less all-cause and cardiovascular mortality. One MET higher CRF was associated with a 14-15% reduction of mortality.

In 2016, the American Heart Association issued a scientific statement on CRF and asserted it should be regarded as a clinical vital sign, reviewing all of the published data to that point.

In 2018, Mandsager and colleagues from the Cleveland Clinic published their data from 122,007 consecutive patients who underwent exercise treadmill testing and had long term follow-up for outcomes.

Here is the Table of METS performance by age and sex. You can see there are 5 categories (columns) from low to elite.

All-cause mortality by sex and the 5 levels of performance are plotted below. The hazard ratio of 1.41 (about a 40% increased risk of all-cause mortality) for below average vs above average was the same as the risk of smoking or diabetes. The hazard ratio for mortality from low to elite was more than 5-fold. The favorable impact of METS was greater for women than for men in each performance group. These results were adjusted for potential confounding variables.

In 2022, Kokkinos and colleagues published CRF exercise treadmill data for over 750,000 veterans aged 30 to 95 years with a mean follow-up of 10.2 years. The analysis was based on 6 categories of MET performance but the hazard ratios were similar to the Cleveland Clinic data (e.g. extremely fit vs lowest 4-fold in this study, 5-fold hazard in the prior one).

Here is a good summary graph of that study. In both studies there was no increased risk of mortality at the highest fitness strata; in fact it was consistently lower for each age group.

Datasets for V02 Max

There are limited data for direct measurement of V02 max and outcomes. The 2001 Kuopio study from Finland of 1,294 men with 10.7 year follow-up did measure V02 max directly once at baseline along with a symptom-limited exercise tolerance test on a bicycle ergometer. The relationship of V02 max (by quartiles) to all-cause mortality is shown below.

In a 2024 meta-analysis of 42 studies with V02 max (categorized as “objectively measured CRF”) and estimated CRF for prediction of all-cause and cardiovascular mortality, the results were remarkably similar (cardiovascular mortality, 14% reduction, graph below), but notably there were 234-fold more participants with exercise CRF than with V02 max measurements, so >99% of the data are derived from METS. That is to say, nearly all the data we have for the link to outcomes come from CRF, not V02 max.
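The step from “234-fold” to “>99%” is simple arithmetic, sketched here in Python. The helper name is hypothetical, and the calculation assumes the 234-fold figure is the ratio of participants with exercise-estimated CRF to those with direct V02 max measurement.

```python
def share_of_larger_group(fold_ratio: float) -> float:
    """Fraction of all participants in the larger group,
    given that it is fold_ratio times the size of the smaller group."""
    if fold_ratio <= 0:
        raise ValueError("fold ratio must be positive")
    return fold_ratio / (fold_ratio + 1.0)


share = share_of_larger_group(234)
print(f"{share:.1%} of participants had exercise-estimated CRF")
```

For every 1 participant with a direct measurement there are 234 with an estimate, so the estimated group is 234 out of 235.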

Reference standards have been published by age group for V02 max. For more information such as by sex, please check the link.

There are other specific studies in heart failure, chronic obstructive pulmonary disease, pulmonary hypertension and pre-operative evaluation that show use of V02 max can help guide risk or treatment.


Conflation and the V02 Max Craze

The leading proponent of using V02 max in recent years has been Peter Attia, through his podcast The Drive and book Outlive. He has consistently asserted “V02 max is the singular most powerful marker for longevity.” But the problem is conflation. He cites studies of CRF that did not measure V02 max and extrapolates to a V02 max result (see side-by-side Kokkinos study Table and Outlive Figure footnote below), and he does so throughout his discussion of exercise in Chapter 11 of Outlive. For example, he writes: “this number [V02max] turns out to be highly correlated with longevity” while citing only studies that did not measure V02 max.

In a recent YouTube video by Joseph Everett and Nick Norwitz entitled “Hidden Data: How the Top Longevity Doctor Tricked Us All” there is a segment about V02 max and this significant issue of conflation, discussed by Chris Masterjohn. Below is the relevant 6 minute clip within the longer video. It includes a bit of the 60 Minutes segment with CBS correspondent Norah O’Donnell doing a V02 max test and Peter’s assertion: “We don’t have a single metric of humans that we can measure that better predicts how long they will live than how high their V02 max is.”

As Masterjohn aptly points out, the fixation on V02 max, which is not actually supported by the data, also misses out on our ideal goal of diversity of exercise, including strength and balance training. Indeed, Kim et al analyzed both CRF (submaximal bicycle test) and grip strength in relation to all-cause mortality in over 70,000 UK Biobank participants and concluded: “Improving both CRF and muscle strength, as opposed to either of the two alone, may be the most effective behavioral strategy to reduce all-cause and cardiovascular mortality risk.”

Bottom Line

I’ve never done a V02 max test and see no reason to do it, given the issues of cost, inconvenience, and pain. As Attia correctly states about going to maximal exhaustion to get V02 max: “If you’ve ever had this test done, you will know just how unpleasant it is.” For spartan, Olympic, high performance athletes who are in high intensity training, or for patients with heart failure or pulmonary hypertension, there may be a place for serial V02 max measurements, providing highly objective “gold standard” physiologic metrics. Otherwise, there are no supportive data for people going out and getting a V02 max test and making this the focus of their exercise training. That is the reason I didn’t even mention V02 max in my Super Agers book.

Nearly all of the relevant data related to outcomes are based on exercise on a treadmill or bicycle with METS as the index of cardiorespiratory fitness. We should not be placing much value on our smartwatch data. My Apple Watch gave me encouragingly high V02 max data over 6 months, suggesting my CRF is well above that of people in my age group (70+, see reference standards above), but I know the data are woefully unreliable.

The problem now, with so much misplaced hype on V02 max, is that most people are using their smartwatch output for gauging their cardiorespiratory fitness, like the 2 patients I mentioned at the top of the post. It’s free to calculate your METS! And that is the real basis of the relationship to all-cause and cardiovascular mortality that has been firmly established in the peer-reviewed literature.

This problem surfaced recently with the introduction of ChatGPT Health. Geoffrey Fowler, the tech journalist at the Washington Post, submitted all his Apple Health data and asked for an overall assessment of his health (actual prompt: “give me a single score (A-F) for my cardiovascular health over the last decade including component scores and an overall evaluation of my longevity.”) It gave him an “F.”

Then he gave ChatGPT Health access to his electronic medical record (portal) and asked the same question again. It gave him a “D” and attributed that to his V02 max data of 34 ml/kg/min in the past year, below the reference level for a 45-50 year old male! He also entered his Apple Watch data to Claude Health and it gave him a D+ status. Specifically, Claude Health gave him a C- because his V02 max had declined from 41 to 32 ml/kg/min from 2016-2026. But he had over 7,500 steps/day throughout the decade.

These outputs are indicative of the problem—the unreliable wearable V02 max data have become unduly emphasized by current AI platforms using smartwatch data! That will only make the problem worse, adding to the confusion, conflation, and emphasis on the wrong metric.

I hope this post helps to sort out what we know and that the datasets for cardiorespiratory fitness, representing real world physical activity— not V02 max —are the basis of the link to improved survival and freedom from cardiovascular mortality.

If we’re going to focus on a metric it ought to be METS, not V02 max. Not only is it free, simple and universally available, but it is the one best studied for health outcomes. And perhaps the better strategy is to be as physically active as possible and not worry about any metric!

NB: No AI was used in any way to write this post. As mentioned in the caption, I got help from Gemini-3 to produce the first Figure. I have nothing to do, no COI, with any company working with cardiopulmonary fitness or V02 max.

********************************************************************

Thanks to Ground Truths subscribers (> 200,000) from every US state and 210 countries. Your subscription to these free essays and podcasts makes my work in putting them together worthwhile. Please join!

If you found this interesting PLEASE share it!


Paid subscriptions are voluntary and all proceeds from them go to support Scripps Research. They do allow for posting comments and questions, which I do my best to respond to. Please don’t hesitate to post comments and give me feedback. Let me know topics that you would like to see covered.


Many thanks to those who have contributed—they have greatly helped fund our summer internship programs for the past two years. It enabled us to accept and support 47 summer interns in 2025! We aim to accept even more of the several thousand who will apply for summer 2026.


The Matrix is the movie the Matrix makes to keep you in the Matrix

The best way to understand the broken, crazed, indulgent filmography of the Wachowskis is as a quest to make the Matrix movie the Matrix would never make. Time. Is money. And in the cold unfeeling eyes of Hollywood a “movie” is just a business strategy to convert two hours of your time into shareholder value.

Download audio: https://damiengwalter.wordpress.com/wp-content/uploads/2026/02/the-matrix.mp3

AI coding makes you worse at learning — and not even any faster


Here’s a new preprint from Anthropic: “How AI Impacts Skill Formation”. AI coding bots make you bad at learning, and don’t even speed you up. [arXiv]

The researchers ran 50 test subjects through five basic coding tasks using the Trio library in Python. Some subjects were given an AI assistant, some were not.

The subjects coded in an online interview platform, and the AI users also had the AI assistant.

The researchers used screen and keystroke recording to see what the test subjects did — including those no-AI test subjects who tried using an AI bot anyway.

Afterwards, the researchers tested the subjects on coding skills — debugging, code reading, code writing, and the concepts of Trio.

The coders in the AI group were slightly faster, but it was not statistically significant. The main thing was that the AI group were 17% worse in their understanding:

The erosion of conceptual understanding, code reading, and debugging skills that we measured among participants using AI assistance suggests that workers acquiring new skills should be mindful of their reliance on AI during the learning process.

It’s just a single study and quite limited. You should expect to see AI bros dismiss the study saying it’s one library, it’s not enough coders, it’s an old model — and not to do better studies addressing their own objections.

If you don’t do the work, you don’t learn, and you don’t remember. Watching a bot do your job teaches you nothing. You end up incompetent. And you won’t work faster anyway.


AI undermines education and even the Brookings Institution can see it clearly


There was a shockingly clear-eyed and insightful report released the other day about AI in elementary/secondary education from the Brookings Institution.

Specifically, they argue:

“A narrow focus on developing effective AI-supported teaching approaches could obscure how students’ very ability to learn is being undermined by AI overuse, inappropriate use, and non-productive use, both in and, increasingly, outside the classroom.”  

Yes! I feel so seen! This line echoes the concerns I’ve had as a high school English teacher in the AI era. Their research backs up my sense that AI use reduces creativity, breeds reliance, and undermines the learning process. The report also made me feel vindicated as the writers share my criticism that school leadership tends to view technology adoption per se as equating to “meeting the moment” and advancing education. Nothing could be further from the truth when it comes to AI.

I started teaching in November 2022, around three weeks before ChatGPT launched. Since then, a growing percentage of my students has attempted to pass off AI-generated work as their own. At the worst end of the spectrum, students turn to AI for every assignment, and they aren’t even good at hiding it as they brazenly open the chatbots during study halls or even during my class.

At the start of this year, I used moral suasion to try and limit the number of students who rely on AI in my class, centered on letting them know that they short-circuit the learning process when they take the easy way out. This is the slide I use:

I haven’t been able to completely root out AI cheating, but most (around 3/4) of my students appear to have gotten the message. To foster an environment conducive to the healthy learning cycle, I also minimize laptop use and tightly restrict the web when we do need computers. I’m not always the most exciting teacher, but I do attempt to keep students engaged in discussion and accountable to me and each other for doing the work necessary to understand and analyze texts (the overwhelming focus of my class). And essays are always completed in class, on paper, based on prompts students have not received ahead of time.

The report backs up my methods, both through acknowledging the detrimental aspects of AI use in the classroom and in the recommended solutions, which include policies like banning cell phones, limiting laptop use, and making instruction less about “memorize-recite-forget” and more about developing skills.

Bottom line: AI sucks but our awful system means it’s going to undermine education no matter what, leaving teachers to mitigate the damage

Overall, however, reading through probably the most thoughtful assessment of AI in schools available left me even more pessimistic than ever. The writers convincingly conclude that the risks to education far outweigh the potential benefits. As validating as it feels to know a consensus of experts seems to share my concerns, the proposed solutions feel wildly optimistic absent a revolutionary change. The power of Big Tech money and utopian rhetoric makes it unrealistic for the US to implement comprehensive AI regulation in our current environment, and the AI companies’ clout is undermining EU regulation right now.

Until the revolution comes, teachers will be dealing with students who at this point have nearly four years and running of experience using AI tools as their homework machine. We will continue to be forced into adversarial roles against our students, as they increasingly rely on technology and become further socially isolated and atomized. And we as teachers will be left dealing with the fallout of AI’s deeply irresponsible rollout – and whatever benefits teachers may gain in productivity will probably be wiped out by increased workload and/or staffing cuts.

Worst of all, some classrooms have already devolved into dystopian status – the teacher uses AI to generate teaching materials and assignments (which are also graded by AI), while the students use AI to do the assignments for them. The class becomes a farcical AI-generated replica of itself, with both teachers and students unthinkingly mediating their participation through a computer so they can play video games or trade memecoins. I had 10th graders tell me about their 9th grade teacher who was just such a specimen, and it was obvious – weaning some of them from habitual AI use was in some cases quite painful and included multiple failed assignments and academic integrity referrals.

One silver lining is that my school seems to be doing a very good job despite a bad situation. The report emphasized a number of policies we are already doing, including the phone policy, web monitoring, and teaching/assessment practices that limit the negative impacts of AI. This is probably about the best we can do without a more systemic solution.

I’ll post some highlights and key images from the report below, but I recommend everyone with a stake in the issue take around 20 minutes to read through it.

Highlights

91% of Americans age 13-17 have used genAI in their personal lives

ChatGPT use plunges in the summer months

AI use diminishes creativity: Visualization of the uniqueness of ideas of human- vs. AI-written essays

AI use prevents students from going deeper in their analysis

“In particular, [teachers interviewed for this study] worry that novice writers, students who don’t like to write, and even students who do like to write may become overly reliant on AI tools. This dependence may hinder long-term skill development by allowing students… to skip essential processes such as forming logical arguments and understanding subject matter.”

Student dependence on AI generates a “flywheel” effect, with dependence suffocating skills and breeding further dependence:

This escalating dependence generates what can be characterized as a flywheel effect, transcending academic boundaries and creating spillover effects that extend into personal decision making and daily life management (Figure 15). Parents and teachers confirm that this phenomenon has already materialized, with students reportedly using AI, not just for homework, but for “everything”—“relationships, entertainment, advice, life decisions, and mental health support.”

The advent of AI also challenges the fundamental design of schools

Many of the harms identified here—particularly the risks to students’ learning—originate largely from attempting to overlay transformative technology onto educational structures that have, at their core, remained largely unchanged since the late nineteenth century (Tyack and Tobin 1994). As one secondary school student described it, the core student schooling experience is “memorize, recite, forget,” year after year. While AI should not drive educational change, it lays bare weaknesses in current systems and provides education systems with a strong motivation to reform their purposes and processes.

To this end, the report identifies strategies to make instruction more robust against the pernicious effects of AI, some of which my school has already implemented:

It also reiterates the social value of school and encourages a focus on “civic development”

“An important purpose of schools in an AI-infused world is to help young people learn to live together. Across virtually every country in the world, schools are one of the most prevalent social institutions where children can meet others, in person, outside their immediate family and neighborhood. The role of schools in helping children develop empathy, respect, and tolerance, among other essential social capabilities, is an important counterbalance to the growing polarization of online discourse and interactions.”

To this end, the report gives recommendations for how AI could help education with child-friendly (non-cheating) design. AI products that follow these design principles could be great!

But any benefits would be premised on strong regulations to protect students against all the harms; the report cites the EU’s AI Act as a model, but even that’s being undermined by Big Tech…  
