Or: How OpenAI Optimized for Benchmarks and Broke My Workflow
I’ve used ChatGPT since GPT-3. Not casually, but as a core part of my research and writing workflow. When image generation became available in version 4o, I integrated it: “See this article? Generate an image that represents it.” Simple, conversational, reliable. It worked. Until three…
Dr. Randal Olson recently published research showing that modern AI models routinely flip their answers when challenged with a simple follow-up: “Are you sure?” Ask an AI a question. Get an answer. Then ask if it’s sure. Watch it suddenly reverse itself, retract its conclusion, or hedge into oblivion as though you’ve triggered existential dread.…