
AI Outperformed Doctors in a Harvard Study. What That Means for Your Business.

A new Harvard study published in Science made headlines this week: OpenAI's o1 model outperformed two physicians in emergency room triage diagnoses. The AI got the exact or close diagnosis in 67% of cases. The physicians: 55% and 50%.

Before you assume AI is coming for your doctor's job, or your own, read the fine print. Then think about what this actually means for how you run a business.

What the Study Actually Tested

The researchers tested AI against internal medicine physicians at Beth Israel Deaconess Medical Center, using real emergency room cases. The AI was given exactly the same information from electronic medical records that the doctors had at the time of each diagnosis.

At the first triage touchpoint, when there is the least information and the most urgency, o1's edge over the physicians was most pronounced. Those are also the conditions under which business AI systems most often fail: incomplete information, high stakes, time pressure.

The researchers were careful to note that AI is not ready to make real life-or-death decisions without clinical trials. And an emergency physician quoted in the coverage made a fair point: comparing a language model to an internal medicine doctor on ER cases is not quite an apples-to-apples comparison. ER physicians are specifically trained for that environment.

The Point for Your Business

You are not running an emergency room. But you are probably running a business where speed and consistency in decision-making matter.

Here is what is worth taking away: AI is no longer just matching human performance on simple tasks. In structured, information-rich environments — where the inputs are clear and the output formats are well-defined — AI is starting to beat experienced humans.

That is exactly the profile of a well-designed AI customer service system: a receptionist that handles calls, checks your schedule, books appointments, answers common questions. Structured input. Structured output. High volume. Repetitive enough that consistency matters.
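To make "structured input, structured output" concrete, here is a minimal Python sketch of the booking step alone. Everything in it is an assumption for illustration: the `BookingRequest` and `BookingResult` types, their fields, and the in-memory calendar are hypothetical, not any vendor's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BookingRequest:
    """Structured input: what the caller asked for (hypothetical fields)."""
    caller_name: str
    service: str
    start: datetime
    duration_minutes: int = 30


@dataclass(frozen=True)
class BookingResult:
    """Structured output: a clear yes/no plus a message for the caller."""
    confirmed: bool
    message: str


def book(request: BookingRequest, busy: list[tuple[datetime, datetime]]) -> BookingResult:
    """Confirm the request only if it does not overlap an existing appointment."""
    end = request.start + timedelta(minutes=request.duration_minutes)
    for busy_start, busy_end in busy:
        # Two time ranges overlap when each one starts before the other ends.
        if request.start < busy_end and busy_start < end:
            return BookingResult(False, "That time is taken; offer another slot.")
    busy.append((request.start, end))  # record the new appointment
    return BookingResult(True, f"Booked {request.service} for {request.caller_name}.")


# Assumed example data: one existing appointment from 9:00 to 9:30.
calendar = [(datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 9, 30))]
first = book(BookingRequest("Dana", "consultation", datetime(2025, 1, 6, 9, 15)), calendar)
second = book(BookingRequest("Dana", "consultation", datetime(2025, 1, 6, 9, 30)), calendar)
```

Here the 9:15 request collides with the existing appointment and is declined, while the 9:30 request is confirmed. Because the request and the result each have a fixed shape, the same call gets the same answer every time, which is where the consistency comes from.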

The study also reinforces something practitioners already know: the AI is only as good as the information it receives. The Harvard researchers emphasized they did not pre-process the data at all — they gave the AI exactly what was in the medical records. When the information was complete and well-structured, the AI performed at its best.

For your AI systems, that means getting your data, your prompts, and your workflows structured properly is not optional. It is the actual competitive advantage.
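One simple way to act on this is to check that a record is complete before any automated step touches it, and to hand incomplete records to a person. The sketch below assumes a hypothetical set of required fields and routing labels; yours will differ.

```python
# Hypothetical required fields for a booking record; adjust to your own workflow.
REQUIRED_FIELDS = ("caller_name", "phone", "service", "preferred_time")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent, None, or blank."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def route(record: dict) -> str:
    """Send complete records to automation and incomplete ones to a person."""
    return "automate" if not missing_fields(record) else "human_review"


complete = {
    "caller_name": "Dana",
    "phone": "555-0100",
    "service": "consultation",
    "preferred_time": "Monday 9:30",
}
incomplete = {**complete, "phone": "   "}  # blank phone number
```

With this data, `route(complete)` goes to automation and `route(incomplete)` goes to human review, because a whitespace-only phone number counts as missing.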

The Caveat Is Real

Nobody is saying AI should be making your hardest decisions without human oversight. The Harvard researchers called for clinical trials. The emergency physician flagged the comparison framing. They are right to be careful.

But for routine, high-volume business tasks — answering phones, scheduling appointments, handling common customer questions — the evidence is getting harder to ignore. AI is good enough now. The question is whether your business is set up to use it.


If you are looking to put this kind of AI capability to work in your business, with an AI receptionist that handles calls, checks your schedule, and books appointments automatically, the technology is proven. What separates a system that works from one that does not is the setup, and that is where someone who knows how to do it right earns their keep.
