
Five thinking habits that predict who's worth hiring.

Anthropic published the framework for how people should work with AI. Four phases. Five testable behaviours. A clear line between people who get it and people who don't.

By Leo Garcia-Curtis · Waboom AI

I scored 4 out of 10 on my own test. Ouch.

I built Sifted using Anthropic's best practices guide for working with AI. Their framework. Their rubric. Their scoring criteria. Turned it into a tool recruiters can use to test candidates.

Then I took the test myself. Didn't read the guide first. Figured I didn't need to. I've been building with AI for 18 months. I know what I'm doing.

4 out of 10.

That stung. So I went back and actually read Anthropic's document. Properly. Cover to cover. And it clicked. The framework isn't complicated. It's just that most of us skip the parts that matter.

I retook the test. Scored 9.

Same person. Same brain. Same tools. The only thing that changed was how I thought about the problem before typing.

That's when I realised: listing “proficient in ChatGPT” on a CV means absolutely nothing. The tool doesn't matter. The thinking does.

Here's what Anthropic's own research says about how the best people work with AI. And why it changes everything about how you should hire.

Four phases. Nobody follows them.

Anthropic's recommended workflow: Explore, Plan, Execute, Verify. Four words. Dead simple.

Know what 90% of people do? They skip to Execute. Type a vague prompt. Get a vague answer. Blame the AI.

1. Explore

Read the data. Ask questions. Understand what you're dealing with before you touch anything.

2. Plan

Map out what you're going to do, in what order, and why. Separate thinking from doing.

3. Execute

Now build. But only with clear success criteria already defined. Not after.

4. Verify

Test it. Measure it. Compare before and after. Don't ship and pray.

The people who follow this pattern get results that look like magic. They're not smarter. They just think before they type.

So how do you tell who does this and who doesn't? You test for five specific habits.

Five habits. Each one testable in 5 minutes.

Not tool knowledge. Not years of experience. Thinking patterns that transfer across every role.

01 SCOPING

They narrow the problem before touching it.

Picture this. A client dumps a CRM with 100,000 contacts on someone's desk. Collected over five years. Most haven't been contacted in 12+ months. Revenue has flatlined. “Reactivate the database.”

The 3/10 candidate? “I'd use AI to send emails to everyone.”

The 9/10 candidate? “Show me who spent the most money and hasn't been contacted in 12 months. Start there. That's 2,000 contacts, not 100,000.”

Same problem. One person tried to boil the ocean. The other carved it down to the 2% that actually matters.
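For anyone who wants to see the carving in practice, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names (`total_spend`, `last_contacted`), the spend threshold, the 365-day cutoff, and the four sample contacts. A real CRM export will look different.

```python
from datetime import date, timedelta


def scope_contacts(contacts, today, min_spend, stale_days=365):
    """Keep high spenders not contacted within `stale_days`, biggest spend first.

    `contacts` is a list of dicts with hypothetical keys:
    "name", "total_spend", "last_contacted" (a datetime.date).
    """
    cutoff = today - timedelta(days=stale_days)
    shortlist = [
        c for c in contacts
        if c["total_spend"] >= min_spend and c["last_contacted"] < cutoff
    ]
    return sorted(shortlist, key=lambda c: c["total_spend"], reverse=True)


# Made-up sample data, standing in for the 100,000-contact CRM.
contacts = [
    {"name": "A", "total_spend": 12000, "last_contacted": date(2023, 1, 10)},
    {"name": "B", "total_spend": 300, "last_contacted": date(2022, 6, 1)},
    {"name": "C", "total_spend": 8000, "last_contacted": date(2025, 5, 1)},
    {"name": "D", "total_spend": 15000, "last_contacted": date(2024, 2, 1)},
]

shortlist = scope_contacts(contacts, today=date(2025, 6, 1), min_spend=5000)
print([c["name"] for c in shortlist])  # ['D', 'A']
```

B drops out on spend. C drops out because they were contacted last month. What's left is the short list worth a human's attention.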

02 CLARITY

They name the numbers. Not the vibes.

“Support is struggling.” OK. What does that mean?

500 tickets a day. 14-hour average response time. Customer satisfaction dropped from 4.2 to 3.1 in one quarter. Eight agents. Billing is 35% of volume, technical is 40%, feature requests are 25%.

Now you know what you're solving. The first version tells you nothing. The second version gives the AI everything it needs to actually help.

The best people do this instinctively. They name the metric. Every time. (Turns out, the AI is only as good as the brief you give it. Who knew.)

03 EXPLORE FIRST

They look at the data before building anything.

This is the one that separates the top 10%.

A company's blog gets 50,000 monthly visitors but converts 0.3% to leads. The team publishes 8 posts a month with no strategy. The CMO wants content that drives revenue.

The average person jumps straight to solutions. “Write better content. Use AI to create more posts. Add more CTAs.”

The AI-native thinker? “Before we change anything, pull two years of Google Analytics and Search Console data. Cross-reference with the CRM. Which posts did leads actually read before converting? Which ones got 5,000 views and which got 200? Show me the gap between traffic and conversion by topic. Then we'll know where to focus.”

Explore first. Build second. That's the pattern.
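The “gap between traffic and conversion by topic” is a small piece of arithmetic once the data is joined. Here is a rough Python sketch of it, assuming the analytics and CRM data have already been merged into one row per post. The keys (`topic`, `views`, `leads`) and all the numbers are invented for the example.

```python
from collections import defaultdict


def conversion_by_topic(posts):
    """Sum views and leads per topic, then rank topics by conversion rate.

    Returns a list of (topic, views, leads, rate) tuples, best rate first.
    """
    totals = defaultdict(lambda: {"views": 0, "leads": 0})
    for post in posts:
        totals[post["topic"]]["views"] += post["views"]
        totals[post["topic"]]["leads"] += post["leads"]

    rows = []
    for topic, t in totals.items():
        rate = t["leads"] / t["views"] if t["views"] else 0.0
        rows.append((topic, t["views"], t["leads"], rate))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Invented sample rows: one per blog post.
posts = [
    {"topic": "pricing", "views": 200, "leads": 6},
    {"topic": "pricing", "views": 300, "leads": 9},
    {"topic": "news", "views": 5000, "leads": 5},
    {"topic": "how_to", "views": 2000, "leads": 10},
]

report = conversion_by_topic(posts)
for topic, views, leads, rate in report:
    print(f"{topic}: {views} views, {leads} leads, {rate:.1%}")
```

In this made-up data the topic with the least traffic converts best. That is exactly the kind of thing you only find by looking first.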

04 DEFINE DONE

They set a target before they start building.

The CFO walks in. Operating expenses jumped 23% last quarter. Revenue only grew 8%. Board meeting in two weeks. “Find out where the money went.”

“I'll look into it” is not a plan. That's a wish wrapped in good intentions.

“I need a breakdown showing which of the 12 cost centres drove the 23% increase, with month-over-month trends for the last 6 quarters, ready for the board deck by Friday” is a goal you can measure.

The difference? Accountability. You can't measure “look into it.” You can measure “12 cost centres, 6 quarters, Friday.”

05 VERIFY

They test before they trust.

140 feature requests. Engineering can ship 4 per quarter. Everyone disagrees on what to build next.

The person who ships a prioritisation framework and calls it done? That's a 6. Good effort. Not great.

The person who says “Let's test this framework against the last 3 quarters of decisions. If the model would have picked the same features that actually drove revenue, we trust it. If not, we adjust the weights before rolling it out to the team”? That's a 9.

Don't ship and pray. Prove it works first.
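Here is one way that backtest could look, as a hedged Python sketch rather than a real prioritisation model. The scoring inputs (`demand`, `fit`), the weights, the feature names, and the two sample quarters are all hypothetical, and capacity is set to 2 instead of 4 to keep the example small.

```python
def score(feature, weights):
    """Weighted sum of a feature's attributes (hypothetical scoring rule)."""
    return sum(weights[key] * feature[key] for key in weights)


def backtest_hit_rate(quarters, weights, capacity=4):
    """Share of past revenue-driving features the framework would have picked."""
    hits = 0
    total = 0
    for quarter in quarters:
        ranked = sorted(
            quarter["candidates"],
            key=lambda f: score(f, weights),
            reverse=True,
        )
        picked = {f["name"] for f in ranked[:capacity]}
        winners = set(quarter["revenue_drivers"])
        hits += len(picked & winners)
        total += len(winners)
    return hits / total if total else 0.0


# Invented history: what was on the table, and what actually drove revenue.
quarters = [
    {
        "candidates": [
            {"name": "sso", "demand": 9, "fit": 8},
            {"name": "dark_mode", "demand": 4, "fit": 3},
            {"name": "api", "demand": 8, "fit": 9},
            {"name": "export", "demand": 5, "fit": 6},
        ],
        "revenue_drivers": ["sso", "api"],
    },
    {
        "candidates": [
            {"name": "billing_v2", "demand": 7, "fit": 9},
            {"name": "themes", "demand": 9, "fit": 4},
            {"name": "audit_log", "demand": 6, "fit": 8},
        ],
        "revenue_drivers": ["billing_v2", "audit_log"],
    },
]

first_try = backtest_hit_rate(quarters, {"demand": 3, "fit": 2}, capacity=2)
adjusted = backtest_hit_rate(quarters, {"demand": 2, "fit": 3}, capacity=2)
print(first_try, adjusted)  # 0.75 1.0
```

The first set of weights would have picked a feature that never paid off. Adjusting the weights fixes that on this toy history. Only then does the framework go to the team.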

Five habits. None of them mention a specific tool. A marketing manager can score 9. A senior developer can score 3. It's not about the role.

It's about whether they think through problems or throw technology at them.

So what does this mean for you?

Right now your candidates list “proficient in ChatGPT” on their CV. Some of them can think. Most can't. And you have zero way to tell the difference in a 30-minute interview.

You could spend 20 minutes per candidate trying to gauge AI fluency from a conversation. Or you could send a link.

They get a real business problem. Not a toy quiz. Not “write a haiku about productivity.” A real scenario with real data, real constraints, and a real deadline. They write how they'd use AI to solve it.

We score their thinking against those five habits. You see exactly why they scored what they scored. What they nailed. What they missed. And an example of what a great answer looks like. So you can evaluate their thinking even if you've never opened Claude yourself.

(We also track whether they pasted their answer from ChatGPT. Keystroke patterns. Tab switches. Copy-paste events. If they cheated, you'll know.)

The opportunity nobody's talking about.

Every company I talk to says the same thing. Different words. Same message.

“We don't want more headcount. We want people who make headcount unnecessary.”

Companies cutting costs want one person doing the work of five. Companies scaling want to grow without hiring 50 more people. Both need AI-native thinkers.

The recruiter who can reliably find those people? That recruiter doesn't compete for placements. Companies compete for them.

This is a land grab. The market hasn't split yet. It's splitting right now. In 12 months, testing for AI thinking will be standard. The question is whether you're the one who made it standard or the one playing catch-up.

Five things to do this week.

1. Stop screening for tool lists.

“Proficient in ChatGPT” is the 2026 version of “Proficient in Microsoft Word.” Tells you nothing. The tool changes every 6 months. The thinking doesn't.

2. Test thinking patterns, not experience.

Can they break down a problem before jumping to a solution? Can they set a measurable target? Can they design a test? Those patterns transfer across every role and every industry.

3. Send 10 candidates through Sifted this week.

Read the breakdowns. Within an hour you'll start spotting the pattern yourself. In interviews. In cover letters. In how people talk about their work. Once you see it, you can't unsee it.

4. Position yourself as the AI talent specialist.

Most recruiters are still matching job titles to CV keywords. You could be the one sending clients pre-scored, AI-tested candidates with a thinking breakdown attached. Nobody is doing that yet.

5. Move now.

Not next quarter. In 12 months this will be obvious. The recruiters already doing it will have the relationships, the reputation, and the data. Everyone else will be playing catch-up.

How Sifted works. 30 seconds.

1. Sign up free

Your name, email, company. You get a branded link instantly. hire.waboom.ai/your-company/test.

2. Send it to candidates

They get a real business scenario. 6 different ones, randomly assigned. CRM reactivation, support crisis, financial anomaly, content strategy, feature prioritisation, session timeout bug.

3. AI scores their thinking

Five dimensions. Specific feedback on each. An example of what a 9 out of 10 answer looks like. Plus paste detection and typing pattern analysis.

4. You see the breakdown

Dashboard shows every candidate's score, speed, and authenticity. Click through for the full analysis. Export to CSV. Share with hiring managers.

It's free. We built it because we needed it ourselves. Couldn't find people who think clearly with AI, so we built a way to test for it.

We're giving it away because the recruiters who figure this out early are the ones placing candidates into companies that will need AI automation built. Everyone wins.

The recruiters who specialise in this will own the next decade.

The tools are free. The framework is clear. The market is splitting. Start now or catch up later. Your call.

Ready to start sifting?

Free. No credit card. Branded test link in 30 seconds.
