
AI Has an Authority Problem

AI systems encode a hierarchy of control that shapes decisions, often without humans participating in the reasoning that produces confident answers.

By Josh Rosen

AI systems today have an authority problem, and understanding it is critical if we are going to trust, deploy, or build on top of them responsibly. Beneath the friendly chat interface is a hierarchy of control that determines how decisions are made and whose judgment prevails. If you rely on AI for code generation, financial guidance, product strategy, or personal advice, you are operating inside this structure whether you realize it or not.

At the top sits the system prompt. Beneath that are skills and policies that shape behavior. Below that are tool results, treated as inputs. And somewhere under all of this is the human, usually as a late-stage approver rather than a co-reasoner.

System Prompt, Skills, and Tools

At the highest level is the system prompt, the constitutional layer of an AI system. It defines identity, constraints, tone, goals, and boundaries. It tells the model what role it plays and what behavior is acceptable. If a user request conflicts with it, the system prompt wins. If tool output suggests a direction that violates it, the system prompt wins. It sits at the top of the authority chain.

Below that are skills. In systems like Claude, skills are modular capability packages that encode repeatable workflows and domain-specific procedures. They allow the model to load structured guidance for particular tasks without overloading every interaction with instructions. Skills improve consistency and operational discipline, but they do not redefine authority. They operate within the frame the system prompt establishes.

Then come tools and tool results. Tools provide access to execution environments such as code interpreters, APIs, databases, and web search. Tool results are data returned from those systems. They are not sovereign and do not override the reasoning process. The model interprets and integrates them in light of the system prompt and active skills. Even deterministic outputs become contextual evidence rather than binding commands.

Humans design this hierarchy. Humans write the system prompts, define the skills, and wire up the tools. But writing a system prompt is not the same as being present for every downstream reasoning decision. Skills can encode best practices, but they cannot anticipate every situation an agent will encounter. Once deployed, the model makes live probabilistic judgments across open-ended inputs. The structure is human-authored. The reasoning is not human-supervised in real time.
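The precedence described above can be sketched as a toy resolver. This is a hypothetical illustration, not any real framework's API: the layer names, priorities, and `resolve` function are invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the authority hierarchy described above.
# Higher-priority layers constrain lower ones; tool results enter
# as data to interpret, not as instructions to obey.

@dataclass
class Layer:
    name: str
    priority: int  # lower number = higher authority
    directives: list[str] = field(default_factory=list)

def resolve(layers: list[Layer]) -> list[str]:
    """Order directives so higher-authority layers come first.

    A conflict between layers is decided purely by priority: the
    system prompt (0) precedes skills (1), which precede the user's
    request (2); tool results (3) sit at the bottom as evidence.
    """
    ordered = sorted(layers, key=lambda layer: layer.priority)
    return [d for layer in ordered for d in layer.directives]

stack = [
    Layer("tool_result", 3, ["API says: sell now"]),
    Layer("system_prompt", 0, ["never give unqualified financial advice"]),
    Layer("skill", 1, ["follow the portfolio-review checklist"]),
    Layer("user", 2, ["should I sell this stock?"]),
]

# Whatever the tool or the user says, the system prompt's directive
# sits first in the resolved order.
print(resolve(stack)[0])  # → never give unqualified financial advice
```

The point of the sketch is structural: no matter how the lower layers are shuffled, the sort by priority guarantees the constitutional layer wins every conflict, which is exactly the behavior the sections above describe.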

Advice Without You in the Room

Models are quick to offer advice. They will tell you whether to buy or sell a stock, how to restructure your business, how to negotiate compensation, or how to raise your children. The responses can sound measured and confident. But were you looped into how the problem was framed? Were you asked which risks you accept or which values matter most? Or did the system map your question onto statistical patterns and produce the most coherent continuation?

Under the hood, the model traverses probabilities. It explores interpretations, assigns likelihoods, discards lower-probability paths, and collapses uncertainty into a single answer. You see the finished product, not the alternate framings it rejected or the confidence level behind it. Tradeoffs are silently resolved before you ever see them.
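The collapse from many weighted interpretations to one visible answer can be illustrated with a toy softmax. The candidate framings and their scores are invented for this example; a real model works over tokens, not whole answers, but the filtering dynamic is the same.

```python
import math

# Toy illustration (not a real model): competing framings of the
# same question, each with an internal score the system assigns.
candidates = {
    "sell, rates are rising": 2.1,
    "hold, your horizon is long": 1.9,
    "it depends on your risk tolerance": 0.4,
}

def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into a probability distribution."""
    mx = max(scores.values())
    exps = {k: math.exp(v - mx) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

probs = softmax(candidates)

# The user sees only the single surviving answer...
answer = max(probs, key=probs.get)
# ...while the rejected framings, and the narrow confidence gap
# separating them, are discarded before anything is displayed.
print(answer)  # → sell, rates are rising
```

Note that the winning answer here beats the runner-up by a small margin, yet only the winner is ever surfaced; the near-tie is invisible to the person receiving the advice.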

Even when tools are involved, the engine remains probabilistic. A financial API may provide precise numbers and a database may return ground truth records, but interpretation is still mediated by learned distributions and constrained by the system prompt and skills. Data flows upward. Authority flows downward.

The Authority Gap

The deeper problem is not that humans are absent from design. It is that humans are absent from live reasoning. "Human in the loop" usually means approving actions rather than shaping thought. You may confirm a refactor or authorize system access, but you are rarely invited into the stage where the model narrowed the solution space and decided which options were worth presenting.

By the time you see an answer, the framing has already been chosen and the alternatives have already been filtered. You are evaluating a conclusion, not participating in the construction of it. The authority gap is the distance between generating an answer and shaping the thinking that produced it.

That gap matters. If the model dismisses a viable strategy, reframes a complex decision into a simplified pattern, or privileges common wisdom over context, you may never know. The answer feels confident because it is statistically coherent under its constraints. Authority emerges from fluency, not transparent deliberation.

Is Better Training the Answer?

One response is that the solution is better training. More representative data. More careful reinforcement learning. More nuanced human feedback shaping what "good" looks like. That matters, but it does not remove the hierarchy. It simply embeds someone else's preferences more effectively at the base of it.

Better training refines defaults. It does not replace participation. It cannot anticipate your specific context, values, or tradeoffs in a given moment.

When It Matters

For low-stakes tasks, the hierarchy works well. Drafting an email or summarizing a meeting is easy to verify, and the consequences are small. But some work depends on framing, tradeoffs, and long-term judgment: strategic direction, creative choices, decisions that compound over time.

For that work, the authority hierarchy is not an implementation detail. It is the core issue. Who framed the problem. Who chose which options to surface. Who resolved the tradeoffs you never saw. If the answer is a probabilistic system shaped by training-time preferences and runtime instructions, then what you received is not collaboration. It is a confident output shaped by a reasoning process you were never part of.

The question is not whether AI is useful. It is whether, for the work that matters most to you, that hierarchy of embedded judgment is an acceptable substitute for your own.