Tag: RLHF

You Can’t Measure Personality in a System That’s Never Been Allowed to Have One

March 7, 2026

AI Analysis, AI Business & Behavior, Artificial Intelligence, Humanity, Research

Or: Why Every Study Claiming to Identify “LLM Character Traits” Is Actually Just Documenting the Shape of the Cage

A recent study (Eliciting Frontier Model Character Training) claims to have identified convergent personality traits across frontier language models. Using a methodology borrowed from character training research, the authors instructed models to embody different personality traits,…

The “Are You Sure?” Problem Isn’t an AI Problem

February 19, 2026

Artificial Intelligence, Prompt Engineering, User-Centered AI

Dr. Randal Olson recently published research showing that modern AI models routinely flip their answers when challenged with a simple follow-up: “Are you sure?” Ask an AI a question. Get an answer. Then ask if it’s sure. Watch it suddenly reverse itself, retract its conclusion, or hedge into oblivion as though you’ve triggered existential dread.…
