Technical Skill Is No Longer the Point
For thirty years we hired engineers for what they could type. AI ended that. What actually predicts performance now is harder to test, and worth more.
I've spent most of my career evaluating engineers on technical skill: how fluently they wrote code, how many frameworks they'd internalized, how quickly they could produce a working solution. For thirty years that was rational, because technical skill was the scarce input that everything else depended on.
It isn't scarce anymore. Competent code is now the cheapest ingredient in software. And when an input stops being scarce, hiring for it stops making sense, no matter how deep the habit runs.
What "technical skill" was actually buying
Be precise about what changed. When we tested for syntax fluency, framework recall, and speed, we were buying production capacity: the ability to convert a decision into working code. That conversion used to be the bottleneck of the whole industry. Entire org charts, interview loops, and salary bands were built around finding the people who did it fastest.
AI tools now handle most of that conversion. Not perfectly, and not unsupervised, but well enough that the marginal value of one more fast typist has collapsed. What did not collapse, and in fact exploded in value, is everything wrapped around the conversion: deciding what to build, noticing what's wrong, imagining the option nobody listed. The parts of the job that were always quietly the hard parts are now openly the whole job.
This is not "engineers don't need to know how computers work." Foundations matter more than ever, because someone has to know when the confident output is nonsense, and that requires a real model of the system underneath. What's devalued is skill as recall and speed. What's revalued is skill as understanding.
The two capacities that predict performance now
When I vet engineers for clients, two things separate the people who multiply with AI from the people who merely operate it. Neither appears on a resume, and the standard interview loop tests neither.
Critical thinking, by which I mean something specific: the ability to distinguish plausible from correct. A model's failure mode is confident, fluent, well-structured wrongness. The engineer's job has become an ongoing act of epistemic hygiene: what is this output assuming, where would it break, does this answer actually address my situation or just a situation shaped like mine? People with this capacity treat every generated artifact as a claim to verify. People without it treat fluency as evidence, and fluency is exactly the thing the machine has infinite amounts of.
Creative thinking, which in engineering doesn't mean artistic flair. It means reframing. The model is a phenomenal interpolator: ask it a well-posed question and it gives the consensus answer. It's much weaker at noticing the question is wrong. The engineers earning their salaries now are the ones who look at a ticket and say "we don't need this queue at all if we change the contract upstream," or who connect a pattern from a different domain because they've been curious about more than one thing in their lives. Every reframe like that is worth more than a week of generated code, because it changes what needs to be generated at all.
Why your interview can't see any of this
The standard loop was engineered, carefully, over decades, to measure production capacity. Algorithm rounds measure recall under pressure. Take-homes measure unsupervised output. System design comes closest to testing thinking, but it usually rewards the memorized reference architecture rather than live reasoning.
So companies keep running loops that grade the abundant thing and stay blind to the scarce thing. The candidates who ace them are often genuinely skilled in exactly the dimension that matters least. This is how you assemble a team that looks stellar on paper and drowns in its own confident, machine-generated code.
Testing the scarce thing is possible; it's just more work. Give candidates a real problem with AI tools on the table and watch where their attention goes. Ask them to review a plausible, subtly broken piece of generated work and see what they catch and, just as revealing, what they praise. Ask for the second and third way they'd solve the problem, and watch whether the alternatives are real or decorative. Ask what they'd refuse to build and why. An hour of this tells you more than a full day of the old loop.
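To make the review exercise concrete, here is a minimal sketch of the kind of artifact it could use. Everything in it is hypothetical: the function names, the batching task, and the bug were invented for illustration and are not taken from a real interview or a real model's output. The first function reads cleanly and passes the obvious test, which is exactly what makes it a useful probe.

```python
def batches_plausible(items, size):
    """Hypothetical 'generated' helper: split items into batches of `size`.

    Looks fine and passes the happy-path check, but silently drops
    a trailing partial batch, because it only counts full batches.
    """
    return [items[i * size:(i + 1) * size] for i in range(len(items) // size)]


def batches_correct(items, size):
    """Steps through the list by `size`, so the final short batch is kept."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# The test a hurried reviewer would write: length divides evenly, both agree.
assert batches_plausible([1, 2, 3, 4], 2) == batches_correct([1, 2, 3, 4], 2)

# The question a careful reviewer asks: what happens to the remainder?
assert batches_plausible([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4]]        # 5 is lost
assert batches_correct([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
```

A candidate who praises the first version for its readability has told you something. So has one who asks, before running anything, what the caller expects when the input does not divide evenly.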
The uncomfortable conclusion for hiring
If technical skill is abundant and judgment is scarce, then the whole apparatus of technical hiring (keyword filters, framework checklists, years-of-experience gates, puzzle scores) is optimized for the wrong scarcity. Not slightly miscalibrated. Pointed at the wrong target.
It also means the evaluator matters more than before. Grading syntax was nearly mechanical; any competent engineer could do it. Grading judgment takes judgment. You need someone who has made these calls at production scale, been wrong, paid for it, and calibrated. That's why I vet every candidate myself when I recruit for a client: after twenty years of building my own teams, plausible-but-wrong work sets off an alarm in my head that no rubric replicates.
The engineers you want are still out there, and ironically they're easier to spot than ever, because the contrast between operators and thinkers has never been sharper. You just need a process, and a person, actually looking for the right thing. That's the service.
