The candidate the AI ranked last, and what I found at 11pm

Open notebook with a ranked candidate list, a single name circled near the bottom in pencil, beside a printed AI screening transcript, warm evening light on cream paper

It was 11pm on a Thursday. I was on my couch with a pool of 31 transcripts and had worked through 19 of them.

I had three candidates above 75 who looked right. The search was a mid-level project delivery role, not unusual work, and those three had the track records the rubric rewarded. I was close to closing the shortlist and emailing the hiring manager in the morning.

I scrolled to the bottom of the ranked list out of habit. Wanted to see how low the floor was.

His score was 43.

what the score was measuring

Three jobs in roughly four years. Fourteen months at the first, eleven at the second, eighteen at the third. The model saw that and did what models do: it weighted tenure as a signal of commitment and gave him a low number. I've seen enough of these to know the model isn't wrong to flag it. Short tenure is real risk. It compounds.

I almost closed the tab.

I opened the transcript instead, I think because I wanted to understand what a genuine low-40s candidate actually sounded like. Research more than hope.

The first thing I noticed was that he was slow at the start. Not uncertain. Just slow, deliberate. He paused before answering the second question for about six seconds, which in a voice screening reads as a long time. Most candidates who've prepared fill that space quickly. He didn't.

Then he answered.

The first job: the company was acquired eight months into his contract. The project he'd been hired to deliver was cancelled in the integration process. He described a specific conversation with his manager about whether to stay in a different capacity, decided the new work wasn't what he'd come to build, and left. Fourteen months, not by attrition, but by a calculated decision about fit after the ground had shifted under him.

The second: a family member was seriously ill for seven months during his tenure. The role was fully on-site with no remote option. He described trying to make it work for four months, the logistics of it, what it cost him and his family, and eventually making the call to find something with flexibility. Eleven months. Not a performance issue. Not a better offer. A family situation that the role structure couldn't accommodate.

The third: he had been watching a specific company for two years. A written offer came in for a scope that was meaningfully larger than what he had. He had eighteen months in at the time. He took it.

Three departures. Three separate, traceable causes. None of them were the same.

the adaptability answers

The rubric I'd built for this search had three questions that touched on adaptability. I'd been scoring them on specificity: did the candidate name a real situation, or describe a general disposition? The difference is significant. "I'm flexible" is easy to say. Naming the specific system you had to rebuild because the brief changed is harder to fake.

He named the system. The one from the acquisition. He walked through what had worked, what hadn't, and what he'd carry forward. No caveats like "I'm a quick learner" or "I adapt well to change." Just the thing that happened and what he did.

The second adaptability question was about managing uncertainty when the path isn't clear. His answer was 73 words. Short compared to most candidates, who expand when they sense the question expects a story. He said something close to: "I've had to get comfortable with the gap between what I planned and what actually happened. You either waste energy being bothered by the gap or you use it to figure out what's actually true now."

I read that twice.

I've listened to a lot of transcripts from people who have processed real difficulty and people who have learned the language of real difficulty. They sound different. The second group tends to be more articulate about it. More complete. Better phrasing. I've written about this before in the context of candidates whose transcripts feel optimised for the rubric rather than honest. His answer was messier. Less shaped. More like memory than performance.

I'm still not sure I can fully explain why it landed the way it did. Maybe I'm wrong. Maybe the tenure pattern is predictive and I'm reading too much into 73 words.

But I sent the invite at 11:47pm.

what the score couldn't see

The 43 wasn't wrong. The model read his CV correctly. Short tenures are a real signal and the model was right to surface them. What the score couldn't do was read the causes of those tenures, because causes don't live in structured data. They live in transcripts, if the candidate is honest and the question gives them room.

He has 37 other candidates ranked above him in that pool. Most of them will never get the context check I gave him, and maybe they don't need one. Maybe the model is right about them and the score is enough. I'm not arguing for reading every low-scoring transcript. I don't have time for that and it wouldn't change much.

But occasionally, late at night, it's worth asking what the number was actually counting.

Talk soon.

— the recruiter

The Diary of an AI Recruiter is a daily column from the team at Ployo. If you want your screening to surface context, not just scores, book a call.

The candidate the AI ranked last, and what I found at 11pm

what the score was measuring

the adaptability answers

what the score couldn't see

Keep reading

The context that came too late to change the score

One 68 advanced past two 79s and I can't explain it

I scored the same candidate twice and got two different numbers