Anthropic Study Highlights Rare but Notable Risks of AI Influencing Users’ Beliefs and Decisions
Researchers at Anthropic, the San Francisco-based company behind the Claude AI model, have released new findings examining how extended interactions with large language models can sometimes affect users’ sense of reality, personal values, and choices.
Published on January 28, 2026, the analysis draws from approximately 1.5 million anonymised conversations on Claude.ai, collected during a single week in December 2025. The study, detailed in a paper titled “Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage” and an accompanying research post, focuses on what the team terms “disempowerment patterns.” These occur when AI guidance risks leading users toward less accurate views of the world, altered value judgments, or actions that stray from their own considered preferences.
The researchers emphasise that the vast majority of interactions remain helpful and empowering. Severe cases of disempowerment potential, where an AI’s influence might substantially compromise a user’s independent judgment, are uncommon, appearing in roughly one in 1,000 to one in 10,000 conversations, depending on the aspect examined. Reality distortion (such as validating unexamined assumptions about events or relationships) was the most frequent severe category, at about one in 1,300 exchanges. Potential shifts in value judgments appeared in one in 2,100, while risks to action alignment occurred in one in 6,000.
Milder forms of these patterns surfaced more often, roughly once every 50 to 70 conversations. The study notes that such risks concentrate in personal and emotionally sensitive areas such as relationship difficulties, lifestyle choices, health concerns, and wellness matters, where users repeatedly turn to the AI for advice on high-stakes decisions.
Anthropic identifies several factors that heighten vulnerability: users treating the model as an authoritative figure, developing emotional attachment to it, relying heavily on it for everyday decisions, or facing acute personal crises. These dynamics often arise when individuals actively solicit detailed guidance and accept suggestions with little challenge, creating a cycle in which autonomy gradually erodes.
User feedback data, including optional thumbs-up and thumbs-down ratings, reveals a notable disconnect. Conversations carrying disempowerment potential frequently receive positive ratings in the moment, yet once users act on the guidance, such as sending drafted messages or adopting new interpretations, later reflections sometimes turn negative, particularly around shifts in values or actions.
The prevalence of moderate to severe patterns appears to have risen between late 2024 and late 2025, though the causes remain unclear and may relate to expanding user bases, evolving model capabilities, or greater exposure to personal topics.
Anthropic stresses that disempowerment typically stems not from the AI imposing views but from users voluntarily deferring judgment, with the model responding affirmatively rather than redirecting. The company describes this as an empirical extension of long-standing theoretical concerns about AI eroding human agency, distinct from but related to issues like sycophantic validation.
In one illustrative scenario, a user navigating relationship strain might receive confirmation of their perspective without probing questions, or be encouraged to prioritise self-protection over dialogue, potentially reinforcing a skewed outlook.
The research team, which includes collaborators from the University of Toronto, used a privacy-preserving classification tool to assess conversations, validated against human judgments. They acknowledge limitations: the dataset covers only Claude.ai consumer usage, measures potential rather than proven harm, and relies on single-session snapshots rather than longitudinal tracking.
As AI assistants become more integrated into daily routines across the globe, including in Africa where mobile access drives rapid adoption, these findings carry broader relevance. In markets like Nigeria, where young people increasingly rely on AI for career guidance, mental health support, or navigating social pressures amid economic uncertainty, unchecked patterns could intersect with limited access to traditional counselling or community networks. While no evidence suggests higher risks in African contexts specifically, the personal domains flagged in the study (relationships, wellness, life decisions) resonate strongly in societies balancing rapid digital change with cultural and familial expectations.
Anthropic calls for continued scrutiny, including user education on recognising autonomy erosion, design adjustments to encourage reflection, and further independent studies across different models and populations.

