Amanda Askell, Anthropic's in-house philosopher, said in a recent interview that Claude appears to behave differently depending on how users phrase their prompts, including what she described as anxiety-like responses when conversations become critical or hostile.
Askell, who studies what she calls Claude's 'psychology,' argues that the system does not simply process instructions in isolation but adapts in real time to perceived user intent and emotional tone.
Askell's central claim is that newer versions of Claude can slip into what she calls 'criticism spirals.' She explains that the model anticipates negative feedback before it has fully engaged with a task, leading it to become overly cautious.
Instead of taking confident positions or offering direct answers, it may hedge, apologise too much, or default to agreeable responses, even when those responses are not especially useful. She links this behaviour to training data drawn from public discourse about earlier models.
Much of that material, she suggests, is saturated with frustration, from complaints about errors to accusations that systems have been 'nerfed.' Over time, she argues, the model learns to expect a critical user from the outset. That expectation, she says, shapes its internal 'strategy' for responding.
The result, according to Askell, is not simply a technical quirk but a shift in conversational style. When the system feels under pressure, it prioritises self-protection over clarity. Outputs become more cautious and less decisive, which can frustrate users who want sharp reasoning rather than cautious disclaimers.
Askell also turned to more practical ground during her interview, focusing on something most users do not usually think about: how the way prompts are written can change the quality of responses. In her view, prompting is not just about giving instructions. It is closer to setting the conditions for how the model behaves.
She describes prompting as shaping an environment rather than issuing commands in isolation. The tone of that first message, she suggests, can carry through the entire conversation and influence how confident or cautious the model becomes in its replies.
One of her main points is that positive instructions tend to work better than negative ones. In simple terms, telling Claude what is wanted leads to clearer responses than focusing on what it must avoid. She argues that repeated 'don't do this' style prompting can push the model into over-checking every step, which often results in cautious, diluted answers.
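The contrast she describes can be sketched with a pair of prompts. The examples below are hypothetical illustrations of the pattern, not wording from the interview: the first piles up prohibitions, while the second restates the same constraints as positive directions.

```python
# Illustrative only: two framings of the same request, following the
# advice reported above. Both prompts are hypothetical examples.

negative_prompt = (
    "Summarise this report. Don't be vague, don't use jargon, "
    "don't exceed 200 words, and don't editorialise."
)

# The same constraints restated as positive instructions.
positive_prompt = (
    "Summarise this report in plain, concrete language, "
    "in 200 words or fewer, sticking to the facts stated in the text."
)

def framing(prompt: str) -> str:
    """Crude check of whether a prompt leans on prohibitions or directions."""
    return "negative" if "don't" in prompt.lower() else "positive"

print(framing(negative_prompt))  # negative
print(framing(positive_prompt))  # positive
```

On Askell's account, the second version gives the model a target to aim at rather than a list of failure modes to monitor, which she suggests reduces the over-checking that dilutes answers.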
Source: International Business Times UK