You Can't Debug a Judgment: Behavior(alism) vs Function(alism) in AI Evaluation
In traditional software, we debug behavior; in AI, we evaluate function. This post explores the tension between behavioral transparency and functional performance in AI systems, drawing on both philosophy and software engineering. When a system's internal workings are opaque, as they are in neural networks, we shift from analyzing how it works to judging what it achieves.