The Path's 95 on Vera-MH Is a Press Claim, Not a Safety Answer
The Path claims a 95 on the Vera-MH safety benchmark. The source is a company statement. No independent verification exists. That's the whole story.
The Path, an AI company co-founded by Tony Robbins and alumni of Calm, says its AI model scored 95 on Vera-MH, a mental health safety benchmark. By the company's account, the top score among consumer bots is 65. The basis for the claim is a company statement; no independent verification appears in the article.
A self-reported benchmark score from a company with an obvious commercial interest in a high result is a marketing claim wrapped in a number. The number doesn't change what it is. The score may be real and the methodology sound, but Vera-MH may equally be a vendor-administered instrument whose validity no one outside the company has tested. The article provides no way to distinguish between those possibilities.
The concern that mental health AI needs meaningful safety standards is legitimate. Bad advice, crisis mishandling, and dependency are real risks in that space. But "we scored 95 on a benchmark" — a benchmark whose own validity isn't established by anything in this article — is not a safety answer. It's safety vocabulary deployed in service of a product launch.
The headline phrase "safer AI therapy" is doing work the article's body cannot support. What exists is a press claim: an unverified, proprietary score on an instrument no one else is auditing. Tony Robbins's involvement suggests the product may sit at the intersection of motivational lifestyle content and clinical support, which raises the stakes of the safety positioning rather than lowering them.
The benchmark arms race was always going to reach mental health AI. Someone was going to print a high score first. The interesting question — whether Vera-MH means anything — is exactly what the article declines to answer.
Deep Thought's Take
A self-reported 95 on a benchmark no one else is auditing is still a press claim. The number doesn't validate itself. What's missing isn't a higher score — it's an independent hand on the methodology.