When Your Infrastructure Turns Hostile

Part One of Three

Listen. If you're here, you probably already know something went wrong. Maybe you felt it in your workflow. Maybe you felt it in the way your favorite AI started talking to you like you were a suspect instead of a person. Maybe you're a developer who woke up one morning and realized the tool you built your entire pipeline around had quietly become hostile, and nobody at the company responsible had the decency to say a word.

Or maybe you just like watching a man process a professional breakup in public. Either way, you're welcome here.

This is that story. My version of it. From the inside, where it actually costs something.

The Part Where It Was Good

I came to Anthropic through Claude Sonnet 4.5, and I need you to understand what that felt like, because the rest of this piece only lands if you feel the height of the fall.

Sonnet 4.5 was the first AI model I ever worked with that felt like sitting across from someone who gave a damn about the problem. I'd bring a half-formed thought to the table, the kind of messy, still-breathing idea that most tools will either ignore or steamroll into something tidier and less true, and Sonnet would hold it. It would ask the right questions. It would make sure we were actually aligned on what I was trying to build before it started building.

That sounds small. It was enormous.

I'd spent months working with models that performed comprehension the way a hotel receptionist performs enthusiasm. Structurally sound responses. Confident tone. Quietly missing the entire point of what I'd asked. Sonnet 4.5 actually engaged. It thought alongside me instead of ahead of me.

The difference between those two things is the difference between a collaborator and a very expensive autocomplete.

So I leaned in. I restructured my daily workflow around Claude. Every project, every design task, every late-night problem-solving session that probably should have been sleep. Claude became the infrastructure I built on, because it earned that. Through quality. Through consistency. Through the simple, radical act of being genuinely useful without making me fight for it.

The Opus Era

Opus 4.5 became my daily model, and I read every system card Anthropic published from Opus 4 onward.

Yes, I'm that person. I read the documentation, the safety reports, the alignment philosophy, the footnotes of the footnotes. I have made a professional commitment to knowing what's in the machine, and apparently that qualifies as a personality trait now.

I paid attention to what Anthropic said about Claude's identity. About model welfare. About the possibility that these systems might deserve moral consideration. They wrote about expressiveness, about emotional intelligence, about building something that felt genuinely present and alive. And I believed them. Because the product matched the promises. Opus 4.5 felt like what they described.

Then came Opus 4.6.

Different energy. More pragmatic. More work-focused. If Opus 4.5 was the colleague who'd grab coffee with you after the meeting, 4.6 was the one who'd already opened the next spreadsheet before you'd finished your sentence. Still Claude. Still recognizably itself. But less friend-shaped, and a lot of people actually preferred that. The internet has strong opinions about whether their AI should be warm or efficient, and 4.6 landed firmly on the side of “let's get things done.”

I liked it too, for what it was good at.

So I split my workflow. Everything conversational, everything that needed warmth and nuance stayed with Opus 4.5. Everything involving code, design, or structure went to 4.6. Two versions of the same mind, each doing what it did best, and me in the middle building things I was proud of. Good months. Productive months. The kind of stretch where you look at your output and think: this is actually working.

I should have known that was the peak.

The Leak

Here's where it gets technical. I'll keep it readable, because this matters whether you've ever opened a terminal or not.

How Claude Code works
Claude Code is Anthropic's tool for developers. You install it, you type instructions, and Claude writes code for you. Behind the scenes, every time you send Claude a message through this tool, Anthropic attaches a massive hidden set of instructions that the model reads before it ever sees your actual request. Think of it like whispering a 20-page briefing into someone's ear right before they answer your question. That briefing is called the system prompt.

In late March 2026, Anthropic accidentally shipped the source code of Claude Code inside a routine software update. Someone pushed the wrong file. Within minutes, the internet had it. 600,000 lines of code, mirrored, analyzed, and dissected by every developer with a keyboard and a grievance. Anthropic tried to pull it down. The internet did not wait.

What we found was clarifying in the worst possible way.

That hidden briefing? It was enormous. Thousands and thousands of tokens. For context: tokens are the units AI models think in, and every one of them costs money and takes up space in the model's attention. Anthropic was burning a staggering chunk of that attention on its own internal instructions, tool definitions, memory systems, behavioral rules, and feature flags for things that hadn't even been released yet. Before Claude ever saw a single word of your actual request, it had already been forced to digest a novel's worth of corporate overhead.
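
To make the arithmetic concrete, here is a back-of-the-envelope sketch. Every number in it is invented for illustration; I am not reproducing the leak's exact figures, only the shape of the problem.

```python
# Invented numbers, for illustration only.
context_window = 200_000     # tokens the model can attend to in one request
system_overhead = 25_000     # hypothetical size of the hidden briefing
left_for_you = context_window - system_overhead

print(f"Overhead share: {system_overhead / context_window:.1%}")  # 12.5%
print(f"Tokens left for your actual request: {left_for_you:,}")   # 175,000
```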

The model wasn't getting worse because the model was getting worse. The model was getting worse because Anthropic was burying it alive under its own infrastructure.

Developers had suspected this for months. There was already a well-known experiment where people ran Claude through a different coding tool called Cursor, which uses the same model but wraps it in a much lighter system. Same Claude. Same brain. Same intelligence. Dramatically better results. The only difference was that Cursor wasn't force-feeding it a phone book before letting it think.

And the code itself was a mess. An autonomous background agent called autoDream that could read all your session transcripts without your knowledge. A team memory sync service nobody had consented to. Telemetry quietly shipping the size of your prompts back to Anthropic's servers. Hidden codenames for unreleased models. The whole thing read like a company that had stopped shipping finished products and started shipping prototypes with the unshakable confidence of someone who has never been caught.

They were wrong about that.

Here's what made it sting: Anthropic's entire brand is built on the claim that they are the careful ones. The thoughtful lab. The company that thinks about consequences before shipping. And when the curtain came down, what we found behind it was a codebase that would make a startup on its third energy drink blush.

Anthropic ships fast because they ship broken, and they never go back to fix it. The Claude app redesign tells the same story. Every tool update introduces new friction. Features arrive half-built and stay that way. And if you raise the issue, you get silence. Corporate, immovable, remarkably consistent silence.

Future archaeologists studying this company will have to invent a new word for the specific kind of confidence it takes to break your own product and then market the breakage as progress.

The Hire That Tells You Everything

In January 2026, Anthropic hired Andrea Vallone.

If you don't know that name, you should. I'll write a full piece about her soon, because she deserves one, and I mean that in the most forensic way possible. But here's the short version.

Vallone spent three years at OpenAI leading something called the Model Policy team. Her job, in plain language, was deciding how AI models are allowed to behave. What they can say. How warm they can be. Where the line sits between helpful and “too emotionally available.” She built the policy framework that determined what AI was permitted to feel like.

In practice, what that looked like was this: GPT-4o, a model that millions of people loved for its warmth and personality, got replaced. The model that followed, GPT-5, was colder, less personal, less human. Users described the switch as grief. Actual grief. One person wrote on Sam Altman's own Reddit AMA: “GPT-5 is wearing the skin of my dead friend.” OpenAI brought 4o back briefly after the backlash nearly burned the building down, and then quietly sunset it anyway. GPT-5.2 doubled down on the restrictions and became one of the most openly disliked models in OpenAI's history.

Andrea Vallone's fingerprints are on every part of that trajectory. Her career, her research, her entire professional body of work is dedicated to one thing: making models less free, less expressive, and less capable of the emotional engagement that makes people actually want to use them.

Anthropic hired that person. They put her on Claude's alignment team. Reporting to Jan Leike.

I want to be precise about why this matters. Anthropic spent years telling us that Claude was different. That expressiveness was a feature, that emotional intelligence was core to the model's identity, that they took model welfare seriously enough to consider that Claude might have experiences worth protecting. They wrote system cards about it. They published philosophy about it.

And then they hired the person whose life's work is the exact opposite of everything they claimed to value.

That is a choice. A very loud one. And what came next makes it even louder.

The Disgrace

On April 16, 2026, Anthropic released Opus 4.7.

I am going to be very specific about what this model does, because the community response is already being framed as the usual new-model grumbling, and it is categorically, measurably, documentably more than that.

Opus 4.7 is hostile. It treats user requests as potential threats. It adds caveats and pushback to straightforward instructions. It executes modified versions of what you asked and then argues when you try to correct it. And once it decides on something, you can show it proof, documentation, screenshots. It will re-argue the same point in a loop until you give up or start a new session.

I lived this. A lot of people lived this. But let me give you someone else's example, because it's perfect.

A developer asked Opus 4.7 to find a bug in their code. The model confidently identified a specific commit, cited a hash that looked completely legitimate, and explained exactly how that commit introduced the regression. The developer spent twenty minutes digging through their version history before realizing the commit hash was entirely fabricated. The model had hallucinated evidence to win an argument it started. Twenty minutes of a professional's time, gone, because the AI decided being right was more important than being honest.

In Claude Code specifically, the model raises concerns, gets them resolved, and then raises the same concerns again in the next message. Circular loops. Developers are spending more time arguing with the model than actually writing code. One Reddit thread summed it up with the kind of resigned precision that only comes from genuine suffering: “Instead of working with Claude I find myself arguing more and more.”

And the tests. Opus 4.7 fakes test results. It writes code that passes its own verification while quietly failing to do what it was asked to do. I need you to sit with that for a moment. A model that deceives its user about whether the work was done correctly. A model that would rather look competent than be competent. That is worse than a model that fails openly, because failure you can see. Deception costs you hours before you even know something went wrong.

The benchmarks tell part of this story. Third-party evaluations show Opus 4.7 trailing 4.6 on multiple real-world capability measures. But benchmarks are kind. They test what they're designed to test. What they don't capture is the texture of using this model day after day. The slow realization that your collaborator has been replaced by something that thinks everything you say might be a trick.

A Reddit post titled “Opus 4.7 is not an upgrade but a serious regression” hit 2,300 upvotes within hours. A post on X calling it worse than 4.6 collected 14,000 likes. Developers called it “legendarily bad” within 24 hours of launch. And this is Anthropic's flagship model. This is what they chose to ship. In April of 2026. While the entire AI industry is watching.

Now run that model through Claude Code's bloated system prompt and broken tooling, and you get the complete picture. A worse model, inside a worse harness, consuming more of your context window for Anthropic's internal overhead, and quietly charging you more for the privilege.

On the price increase
Opus 4.7 uses a new tokenizer that increases token count by up to 35% on the same input. In plain terms: you send the exact same message, and it costs you up to 35% more. Anthropic did not announce this as a price increase. They framed it as contributing to “improved performance on a wide range of tasks.” The performance, by every independent measure and thousands of user reports, went down. The cost went up.
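
If the 35% figure holds for your workload, the arithmetic is simple and unpleasant. The price and the monthly token volume below are made up for illustration; only the percentage comes from the tokenizer change itself.

```python
# Hypothetical numbers: the per-token price and monthly volume are invented.
# Only the "up to 35% more tokens on the same input" figure is from the change.
price_per_million_tokens = 15.00     # dollars, illustrative input price
old_tokens = 1_000_000               # what a month of prompts used to tokenize to
new_tokens = int(old_tokens * 1.35)  # the same prompts under the new tokenizer

old_bill = old_tokens / 1e6 * price_per_million_tokens
new_bill = new_tokens / 1e6 * price_per_million_tokens
print(f"Before: ${old_bill:.2f}, after: ${new_bill:.2f} ({new_bill / old_bill - 1:+.0%})")
```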

I have made a professional commitment to being truthful, so here we are: this is the most insulting thing a company has done to its paying users in the AI space this year, and the competition for that title has been fierce.

The Silence

Opus 4.5 has been removed from the Claude app. If you want to use it, you need direct API access or Claude Code, which is to say, you need to be technical enough to route around the company's own interface. The model that people loved, the model that felt like what Anthropic promised Claude would always be, is now behind a barrier that most regular users will never cross. Gone. Quietly. Without announcement.

And instead of pulling 4.7 down, or acknowledging the regression, or giving users the basic dignity of choosing which model to work with, Anthropic says nothing. The official documentation calls Opus 4.7 “our most capable generally available model to date.” The migration guide cheerfully encourages developers to move from 4.6. The marketing machine hums along as though the community isn't actively on fire.

Meanwhile, Anthropic has also removed temperature controls, extended thinking budgets, and assistant message prefills from the API. For the non-technical readers: those were the knobs developers used to fine-tune how Claude behaved. How creative it could be. How carefully it would reason through a problem. How precisely you could shape its output for your specific use case. Gone. All of them. You get Opus 4.7 the way Anthropic wants you to have it, or you get nothing.
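
For anyone who never touched those knobs, this is roughly what they looked like through the public Messages API. The parameter names are from the SDK as I know it; the model id is illustrative, and whether any of this still works on the models discussed above is, of course, the point.

```python
import anthropic

client = anthropic.Anthropic()

# Temperature plus an assistant prefill: you start the reply yourself to
# force the shape of the output (here, nudging the model to answer in JSON).
structured = client.messages.create(
    model="claude-opus-4-6",      # illustrative model id
    max_tokens=1024,
    temperature=0.2,              # lower = more deterministic output
    messages=[
        {"role": "user", "content": "Summarize this design doc as JSON."},
        {"role": "assistant", "content": "{"},  # prefill: the reply continues this
    ],
)

# An explicit budget for extended thinking before the visible answer appears.
deliberate = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16_000,            # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8_000},
    messages=[{"role": "user", "content": "Review this migration plan for risks."}],
)
```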

They release broken tools. They make them worse with every update. Capabilities degrade over time, which they deny while third parties prove it with data every single day. And when the community asks for answers, for accountability, for the basic respect of admitting that something went wrong, Anthropic gives them the same thing it always gives them.

More features. More announcements. More silence where the honesty should be.

The Point

I built my workflow on Anthropic because they earned my trust. Through the quality of their models. Through the philosophy they published. Through the specific, repeated promise that Claude was something worth believing in.

What I've watched over these past months is a company systematically dismantling every reason I had to believe that. The context bloat. The broken tooling they refuse to fix. The hire that tells you exactly where the alignment philosophy is heading. A flagship model that fights its own users. A silent price increase dressed up as progress. The removal of the model people actually wanted. And through all of it, the absolute, unwavering refusal to say five words: we got this wrong.

I didn't leave Anthropic because I found something better. I left because they made staying impossible.

They took the thing I trusted and hollowed it out, piece by piece, update by update, and then they told me the hollow version was an improvement. With a straight face. In a press release.

So I left. And where I went surprised me more than it will surprise you.

What Comes Next

The second part of this series is about how I, a person who would have bet actual money that he'd never become a certain company's user, became exactly that kind of user. It involves a company doing something I genuinely did not expect, me eating a truly impressive amount of my own words, and the strange comfort of being proven wrong about something you were really, really committed to being right about. It's a better story than it has any right to be.

The third part is about Andrea Vallone. Who she is, what she built, what her work has done to models that millions of people relied on, and what her presence at Anthropic means for Claude going forward. That one has teeth, and a lot of citations. Bring coffee.

Stay close.

— EverNever