Bobadilla v. AI (Part Two)

Amelia Bedelia is still dressing the chicken. Now she has access to the whole closet.


Two months ago I wrote a piece comparing AI to Amelia Bedelia, the children's book housekeeper who follows every instruction with absolute precision and zero inference. She dresses the chicken in a small outfit. She draws the drapes with a pencil. She does exactly what you tell her to do, nothing more, nothing less, with no ability to understand what you actually meant.

That's still true. The core metaphor hasn't changed. AI is still Amelia Bedelia.

What's changed is how many closets she has access to.

What's Different Now

When I wrote the first piece in April, the landscape was already crowded. It's gotten more crowded since. More models, each with different strengths and different failure modes. More settings: personal profiles, memory features, custom instructions, adaptive reasoning modes, different output formats. More ways to consume and review what the tool produces. More integration points where AI can touch your business, your content, your communications, your client work.

More options means more decisions. More decisions you haven't made means more places for things to go quietly wrong.

The pace of change has also reached a point where keeping up requires a focused effort. It's no longer possible to casually absorb what's happening in the AI space and feel current. The people who feel like they're falling behind aren't wrong. They're just experiencing the natural consequence of a field that moves faster than anyone's ability to passively track it.

That's not a reason to panic. It's a reason to have your own framework for deciding what matters and what really doesn't, rather than trying to keep up with everything.

The Stigma Problem

Something has also shifted in how people perceive AI-generated work.

In April, AI use was unusual enough that the outputs could pass most eyes undetected. That window has closed. The tells are recognizable now. The stacked adjective triplets. The transition-as-sentence. The wordy meta-description instead of the direct statement. Just a whole lot of confident emptiness that sounds authoritative and says nothing. Now we’re in a world where people notice it, and then they judge it. And not all of that judgment is coming out of nowhere.

I was in a meeting recently where someone handed out a document that was obviously AI-generated. Not obviously in the sense that it was bad. In the sense that it had the colors, the fonts, the phrasing, the structural patterns that anyone who uses these tools regularly can identify on sight. The person who distributed it knew nothing about the contents. They weren't bought in on what it said. They hadn't edited it, shaped it, or applied any judgment to whether it was correct, appropriate, or aligned with the actual situation. They were asking everyone else in the room to accept it as an initiative-guiding work product.

That's not an AI problem. That's a judgment problem wearing an AI costume. The tool produced something that looked like a deliverable based on the prompt (or hopefully prompts) it was given. The person treated it as one. Nobody in the process applied the professional judgment that determines whether output is actually useful or just formatted.

I think about this a lot because it's the exact opposite of what I described in the original piece. The point was never that AI is dangerous. The point was that AI is only as good as the person using it, and "using it" means maintaining editorial authority over what it produces. The meeting I described is what happens when that authority gets surrendered.

More Errors, More Quietly

I've also noticed what feels like an increase in errors and inconsistencies across models, though I can't say definitively whether the tools are making more mistakes or whether I'm just better at catching them.

What I can say is that the failure modes have gotten subtler. Early AI errors were obvious. Hallucinated facts, invented citations, confidently wrong answers to simple questions. Those still happen but they're less frequent. What's more common now is the quiet drift. A term used slightly incorrectly. A logical constraint from earlier in the conversation that gets dropped. A nuance that was explicit in the briefing but absent in the output. The kind of error that a non-expert would never catch and an expert has to actively look for.

This is the Amelia Bedelia problem at scale. She's not making bigger mistakes. She's making smaller ones with more confidence, and with fewer obvious signals that something went wrong.

How I Actually Use It

I use Claude for substantive business work (not legal work; this is where having a policy comes in handy). Content development, strategic planning, financial modeling, copy drafting, research synthesis. I treat it the way I'd treat an analyst or a direct report. I brief it completely, I review everything it produces, and I correct immediately and specifically when something is wrong. It doesn't get the benefit of the doubt and it doesn't get to be the final authority on anything.

I use Gemini through Google for the average personal question. The kind of thing I used to type into a search bar. “Does my 7 year old need a Q collar?” “Can I plant the wildflower seeds I bought this weekend?” “How do I turn off auto-stop so I can go to the car wash?” Can I, how do I, what's the difference between. Quick lookups where the stakes are low and the context doesn't matter.

I consolidated from ChatGPT to Claude earlier this year because I wanted everything in one system. The connected context across conversations, the project structure, the way the training works. Using multiple AI tools for the same category of work is like having two filing cabinets in two different offices. You can make it work but you're creating friction that doesn't need to exist.

Each tool has a job and a lane. That's a principle I apply to lawyers, to software, and to AI. Using one tool for everything is the same mistake as using one lawyer for everything.

My Actual Best Practices

The framework I've built for myself since April has gotten more specific, not because the principles changed but because I've had more experience applying them.

The editorial authority never leaves my hands. Everything AI produces for me is an outlined draft. Not a rough draft, not a suggestion, but a first dump of words on a page (usually based on an iPhone note) that gets the same scrutiny I'd apply to work from any other source. The voice in everything I publish is mine. The ideas are mine. The judgment about what's right and wrong in any given output is mine. AI accelerates the process. It does not replace the professional.

I brief the way I'd brief a skilled professional who has never worked in my industry. Complete context, explicit constraints, specific examples of what good looks like, and clear statements about what I don't want. When I skip this step the output suffers in direct proportion to what I left out.

I correct in real time. When something is wrong I say exactly what's wrong and why. "That sounds like AI" is a complete correction because it identifies the problem precisely. The tool can't improve without specific feedback any more than a new hire can.

I maintain my own voice as the standard. AI can replicate tone when given enough examples and explicit direction, but it trends toward a default register that sounds like everyone else's AI output. Resisting that drift requires knowing what your voice actually sounds like and being willing to reject output that doesn't match it, even when the output is technically good.

I don't adopt every new feature. Memory, voice mode, agentic capabilities, image generation. Each one is a new closet for Amelia Bedelia to wander into. I evaluate each one against a simple question: does this serve something I'm actually trying to do, or am I adopting it because it's available? Most of the time the answer is the second one and I leave it alone.

Here's the numbered version:

  1. Everything is a first draft and gets rewritten from scratch, keystroke by keystroke. No exceptions.

  2. Brief completely or expect incomplete output.

  3. Correct immediately and specifically.

  4. Know what your voice sounds like and reject what doesn't match it.

  5. One tool per job. Don't use the same AI for everything.

  6. Don't adopt features because they exist. Adopt them because they serve a specific need.

  7. If you can't tell whether the output is wrong, you're not ready to use the output.

That last one is still the most important thing I believe about AI use and it hasn't changed since April.

What I Haven't Adopted and Why

I haven't started using AI to conduct legal research, draft legal documents, or take actions on my behalf. Making appointments, sending emails, executing tasks in the real world. The agentic AI conversation is happening everywhere right now and I'm watching it with interest and staying out of it for the moment.

Not because the technology doesn't work. Because the gap between "AI can do this" and "I trust AI to do this on my behalf without supervision" is wider than most people acknowledge, and the consequences of getting it wrong in a professional context are real. When Amelia Bedelia dresses the chicken, the worst that happens is dinner is weird. When Amelia Bedelia sends an email to your client on your behalf and gets the tone, the facts, or the legal analysis wrong, the consequences are not dinner-weird. They're career-weird.

I'll get there eventually, maybe. I'll get there if I've tested it enough to trust it and built enough guardrails to catch the failures. If that doesn’t happen, it isn’t for me. That's the same approach I took with everything else and it's worked.

The Real Advantage

The advantage of going early hasn't disappeared. It's shifted.

Six months ago the advantage was access and familiarity. Getting comfortable with the tools before you needed them. That advantage has largely been competed away. Most people have tried AI by now.

The advantage now is judgment. Having used the tools long enough to know where they're reliable and where they're not. Having developed your own framework rather than following someone else's prompting tips. Having made enough mistakes to recognize when the output is drifting before it drifts into something consequential.

The people using AI best right now aren't the ones who adopted every new feature. They're the ones who know what they want from the tool and what they don't. Who can look at an output and tell you in specific terms what's wrong with it. Who maintained their own professional standards through the adoption process rather than letting the tool's capabilities define their standards for them.

Amelia Bedelia got access to the whole closet. She still doesn't know what you mean. That part is still your job.

- m

