practical-ai Archives

What I’ve Learned After Six Months Running AI Agents at Home

Six months ago I had a server, some model weights, and a vague plan. The plan worked out, but not in the ways I expected, and some things I was confident about turned out to be wrong.

The biggest surprise was how much of the value came from memory and context rather than capability. I expected to be impressed by what the agents could do. What actually made the difference was that they knew my environment: Wren never needs me to re-explain which WordPress site uses which credentials. That accumulated context is the real return on the setup investment.

I was wrong about how much I’d want autonomous operation. When I started I imagined agents running in the background making things happen. What I actually prefer is a setup where agents do the information-gathering and I make the calls. The most valuable loop isn’t “agent acts, I find out later.” It’s “agent finds out, I act.” That shift changed how I configured permission levels and approval requirements, making them stricter than my initial instincts suggested.

Tool quality matters more than model quality, within limits. Early on I focused a lot on which model I was running. Over time I noticed that agents with clear tool access, good memory, and well-defined scope outperformed agents with better models but fuzzier configurations. An agent that can reliably call the right tool with the right parameters is more useful than one that reasons well but can’t act. The two aren’t in opposition; you want both, but if I had to fix one first it’s the tools.

Scope discipline turned out to be hard to maintain. Agents naturally accumulate responsibilities over time. You add a small exception here, a new tool there, and three months later you have an agent whose domain you couldn’t clearly define if someone asked. I’ve had to pull back and rewrite configurations a couple of times to restore clear boundaries. This is ongoing work, not a one-time setup problem.

The operational overhead is real but manageable. I spend maybe an hour a month maintaining the stack: checking that containers are healthy, updating Ollama or models, reviewing and updating memory files when the state of something changes. That’s much less than I feared going in. The Unraid base helps; it handles container lifecycle reliably and I don’t spend much time keeping the platform running. The network layer is equally hands-off once set up: a TP-Link managed gigabit switch handles all the container traffic and I haven’t touched its config in months.

What changed most in how I interact with computers is how I frame tasks. Before this setup I’d navigate to something and do it. Now I often describe what I want to know or what I need to happen, and an agent traverses to the answer. That shift from “navigate and act” to “describe and delegate” sounds subtle, but it has changed how I spend my time. The tedious traversal work (logging in, checking versions, looking up IDs) has mostly moved to agents. I do more deciding and less navigating.

What I’d do differently: I’d write the memory files before I needed them instead of building them up reactively. I’d define agent scopes in writing before I started configuring tools. And I’d spend less time on model selection early on and more time on making sure the tool connections were solid.

Six months in, the setup earns its overhead. I’m not going back to managing this manually.

Hardware linked in this post:

TP-Link 8-Port Gigabit Easy Smart Switch (TL-SG108E)

Affiliate disclosure: Some links in this post are Amazon affiliate links. If you buy through them, I get a small commission at no cost to you. It helps keep the lights on here.

Luke Burns | July 15th, 2026 | Categories: Blog | Tags: agents, ai, homelab, lessons, openclaw, personal-ai, practical-ai, reflections, self-hosted, six-months

The Self-Hosted AI Stack I’d Build If I Were Starting Over

I took a winding road to get my current AI homelab working. I’d make different choices if I were starting from scratch, and most of them would come down to doing less sooner rather than more.

The first thing I’d do is separate the model serving layer from everything else. Ollama as a standalone container, exposed on a private network, nothing else bundled in. A lot of guides will tell you to start with a full OpenWebUI stack, and OpenWebUI is fine, but it creates a coupling that makes things harder to reason about later. If your UI and your model server are the same deployment, you end up with friction when you want to swap one out or add a second frontend. Keep them separate from the start.
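One practical payoff of keeping the model server standalone is that any frontend or agent can talk to it over plain HTTP. Here is a minimal sketch of such a client, using Ollama's `/api/generate` endpoint on its default port 11434. The hostname `ollama` and the model tag `llama3.1:8b` are assumptions for illustration, not details of my setup.

```python
import json
import urllib.request

# Assumption: the Ollama container is reachable under this hostname on a
# private Docker network. 11434 is Ollama's default port.
OLLAMA_URL = "http://ollama:11434"


def build_generate_request(prompt: str, model: str = "llama3.1:8b") -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.1:8b", timeout: float = 60.0) -> str:
    """Send the prompt to the model server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Because nothing here depends on a particular UI, swapping the frontend or adding a second one only means pointing it at the same URL.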

For hardware, I’d be more honest with myself about the model size tradeoff. My current build is a Ryzen 9 5900X on an ASUS TUF Gaming X570-Plus (Wi-Fi) board, and it handles everything I throw at it for inference routing and container management. A 7B parameter model fits comfortably in consumer GPU memory, responds quickly, and handles most practical tasks. I spent too long trying to run 34B models on hardware that wasn’t really right for them, getting slow responses, and convincing myself the capability justified the latency. It usually didn’t. For day-to-day assistant work, a well-quantized 7B or 8B model is more useful than a sluggish 34B. Save the bigger models for tasks where reasoning quality actually matters.

The gateway layer is where I’d invest more early effort. This is the piece that connects LLM inference to real tools: file system access, APIs, shell commands, memory. I’m running OpenClaw for this. If I were starting fresh, I’d still choose a purpose-built gateway over trying to wire this together myself with n8n or LangChain. The operational overhead of maintaining custom orchestration code is real. A gateway that’s designed to manage agent lifecycles, credential handling, and tool permissions out of the box is worth the setup time.

Memory is something I’d take seriously from day one. The difference between an AI that knows the state of your environment and one that starts fresh every session is enormous in practice. That means deciding early on where state lives, how agents read and write it, and what format it’s in. Markdown files on a shared volume have worked well for me: human-readable, easy to edit when something’s wrong, git-friendly if you want version history.
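To make the Markdown-on-a-shared-volume idea concrete, here is a rough sketch of how an agent might append to and read back a per-topic memory file. The one-file-per-topic layout, the dated bullet format, and the function names are all assumptions for illustration, not the format my agents actually use.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_note(memory_dir: Path, topic: str, note: str, today: Optional[date] = None) -> Path:
    """Append a dated bullet to <memory_dir>/<topic>.md, creating the file if needed."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{topic}.md"
    stamp = (today or date.today()).isoformat()
    if not path.exists():
        # Start each file with a heading so it stays readable by hand.
        path.write_text(f"# {topic}\n\n", encoding="utf-8")
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {stamp}: {note}\n")
    return path


def read_memory(memory_dir: Path, topic: str) -> str:
    """Return the full text of a topic file, or an empty string if none exists."""
    path = memory_dir / f"{topic}.md"
    return path.read_text(encoding="utf-8") if path.exists() else ""
```

Plain append-only files like this diff cleanly under git, and fixing a wrong entry is just a text edit.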

For API keys and credentials, I’d use a secrets directory with tight permissions from the start rather than environment variables scattered across docker-compose files. It’s easier to audit, easier to rotate, and easier to scope to specific containers when something needs to change. This sounds like overkill when you’re standing up one container. It pays off when you have eight.
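A secrets directory is only useful if the permissions stay tight, and a small check at read time helps enforce that. The sketch below reads a secret from a file and refuses if the file is accessible to group or other. The `/srv/secrets` default path and the `read_secret` name are hypothetical, and the owner-only rule is one reasonable policy rather than the only one.

```python
import stat
from pathlib import Path

# Hypothetical location; mount whichever directory you use into each container.
DEFAULT_SECRETS_DIR = Path("/srv/secrets")


def read_secret(name: str, secrets_dir: Path = DEFAULT_SECRETS_DIR) -> str:
    """Read one secret from a file, refusing files readable by group or other."""
    path = secrets_dir / name
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} has loose permissions ({oct(mode)}); expected owner-only")
    # Strip the trailing newline most editors add.
    return path.read_text(encoding="utf-8").strip()
```

Rotating a key then means replacing one file, and scoping means mounting only the files a given container needs.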

The thing I’d skip entirely on a first build is trying to run everything locally. Ollama handles local inference well. But for tasks that genuinely need a frontier model, the cost of API calls is low and the capability gap is large enough to matter. Don’t try to replace Claude with a local model for complex reasoning. Use local models where they’re good enough and cloud APIs where they’re not. That hybrid approach is cheaper and more capable than either extreme.
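The hybrid approach boils down to a routing decision made per task. Here is a deliberately simple sketch of one. The backend labels, the `needs_deep_reasoning` flag, and the prompt-length threshold are all assumptions for illustration. A real router would use whatever signals fit your workload.

```python
from dataclasses import dataclass

# Hypothetical labels; substitute your own local model and cloud API.
LOCAL_BACKEND = "local"
CLOUD_BACKEND = "cloud"


@dataclass(frozen=True)
class Task:
    prompt: str
    needs_deep_reasoning: bool = False


def choose_backend(task: Task, max_local_chars: int = 8000) -> str:
    """Send hard or very long tasks to the cloud API and everything else to the local model."""
    if task.needs_deep_reasoning or len(task.prompt) > max_local_chars:
        return CLOUD_BACKEND
    return LOCAL_BACKEND
```

Defaulting to local and escalating only on an explicit signal keeps API spend predictable.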

Finally, I’d document my container layout before it gets complicated. Which container serves which purpose, which ports are mapped, what credentials it needs. This sounds tedious and it is. Three months later when you’re trying to figure out why something stopped working, you’ll be glad you did it.
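One low-effort way to keep that documentation current is to hold the inventory as data and render it to a Markdown table. This is a sketch under assumptions: the `ContainerDoc` fields mirror the three things listed above (purpose, ports, credentials), and any entries you feed it are your own, not a description of my layout.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerDoc:
    name: str
    purpose: str
    ports: list[str] = field(default_factory=list)    # e.g. "11434:11434"
    secrets: list[str] = field(default_factory=list)  # names only, never values


def render_inventory(containers: list[ContainerDoc]) -> str:
    """Render the inventory as a Markdown table, sorted by container name."""
    lines = [
        "| Container | Purpose | Ports | Secrets |",
        "| --- | --- | --- | --- |",
    ]
    for c in sorted(containers, key=lambda item: item.name):
        ports = ", ".join(c.ports) or "none"
        secrets = ", ".join(c.secrets) or "none"
        lines.append(f"| {c.name} | {c.purpose} | {ports} | {secrets} |")
    return "\n".join(lines) + "\n"
```

The rendered table can live next to the memory files, so agents and humans read the same source.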


Luke Burns | July 3rd, 2026 | Categories: Blog | Tags: ai, docker, homelab, llm, ollama, practical-ai, recommendations, self-hosted, stack, Unraid

What Multi-Agent AI Actually Looks Like at Home

There’s a version of multi-agent AI that lives in conference talks: perfectly orchestrated systems where dozens of specialized agents collaborate on complex tasks without human input. What I have is messier and more practical, which makes it more interesting.

Here’s a real example from last week. I wanted to update the plugins on one of my WordPress sites. Normally that’s a login, navigate to updates, click apply, wait, done. With agents, I sent one message to Juniper: “Check elembemedia for plugin updates and tell me what’s pending.” Juniper delegated to Wren, who delegated to Apex, who ran a WP-CLI command inside the Docker container and returned a list. Juniper summarized it back to me. Nothing got updated without my go-ahead. But the information-gathering part, which is tedious and requires knowing which container maps to which site, happened in about 30 seconds.

That workflow is the pattern: gather the information, stage the action, then I approve or redirect. The agents handle the boring traversal work: knowing which credentials to use, which container to exec into, which API endpoint to call. I stay in the loop for decisions.
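The gather, stage, approve loop can be expressed as a small data structure. This is a minimal sketch of the idea, not how OpenClaw implements it. The `StagedAction` class and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StagedAction:
    """An action an agent has prepared but may not run until a human approves it."""
    description: str
    run: Callable[[], str]
    approved: bool = False

    def approve(self) -> None:
        self.approved = True

    def execute(self) -> str:
        # Refuse to act until the human has explicitly signed off.
        if not self.approved:
            raise PermissionError(f"not approved: {self.description}")
        return self.run()
```

The agent builds the `StagedAction` and reports its description. Only a call to `approve()` from the human side unlocks `execute()`.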

Wren’s day-to-day is mostly WordPress housekeeping. She tracks the state of three sites: what plugins are active, what posts are published, what drafts exist. When I ask her to write a post draft, she saves it to the right folder with the right frontmatter. She won’t publish anything I haven’t explicitly signed off on. That’s a hard rule in her configuration, and it matters. The value isn’t that she does everything; it’s that she remembers everything so I don’t have to.

Apex is narrower. She has SSH access to the Unraid server and can run commands inside Docker containers. That access is deliberately constrained. She won’t run anything destructive without a confirmation loop. Her most common tasks are checking container status, running read-only WP-CLI commands, and running the actual update commands once I’ve approved what Wren found. She’s the hands; Wren is the eyes. All of this runs over a local network kept deliberately simple: a TP-Link 8-port gigabit switch handles the traffic between the server and the machines I work from. Nothing fancy; the goal is reliability.
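The split between read-only commands and commands that need a confirmation loop can be sketched as an allowlist check. The subcommands listed are real WP-CLI commands, but the allowlist itself and the `needs_confirmation` function are assumptions for illustration, not Apex's actual configuration.

```python
import shlex

# Assumed allowlist of read-only WP-CLI subcommands; anything else needs sign-off.
READ_ONLY_WP = {
    ("plugin", "list"),
    ("plugin", "status"),
    ("theme", "list"),
    ("core", "version"),
}


def needs_confirmation(command: str) -> bool:
    """Return True unless the command is a known read-only WP-CLI call."""
    parts = shlex.split(command)
    if len(parts) < 3 or parts[0] != "wp":
        # Unknown or non-WP-CLI commands are treated as potentially destructive.
        return True
    return (parts[1], parts[2]) not in READ_ONLY_WP
```

Defaulting to "needs confirmation" means a command nobody anticipated gets held for review instead of run.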

Juniper is the coordinator, but in practice she’s less active than I expected. Most of my interactions are direct: I message Wren about WordPress stuff, I message Apex about infrastructure. Juniper is useful when I want to describe an outcome rather than a specific task and let her figure out the delegation. “What’s the update status across all three sites” is a Juniper question. “Activate the Twenty Twenty-Five theme on bacallburns” is a Wren question.

Fran handles family logistics: calendar, reminders, school schedules. Completely separate domain from the tech agents. She doesn’t have server access and doesn’t need it. Separating domains like this sounds obvious, but it takes deliberate configuration. Without clear scope boundaries, agents start trying to be helpful in ways that cross lines you didn’t know you cared about until they crossed them.
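Scope boundaries are easier to hold when they are written down as data and denied by default. Here is a sketch of that idea using the agent names from this post. The tool names are hypothetical placeholders, not the actual tools configured for each agent.

```python
# Hypothetical tool names; each agent only gets what its domain requires.
AGENT_SCOPES = {
    "wren": {"wordpress_api", "draft_files"},
    "apex": {"ssh_exec", "docker_exec"},
    "fran": {"calendar", "reminders"},
}


def is_allowed(agent: str, tool: str) -> bool:
    """Deny by default: unknown agents and unlisted tools are both rejected."""
    return tool in AGENT_SCOPES.get(agent.lower(), set())
```

With a table like this, adding "a small exception here" becomes a visible edit to one file, which makes it harder to do by accident.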

The honest limitation is that this setup takes real upfront effort to build. Defining scopes, setting up credentials, writing the memory files that give agents context about your environment: that’s a weekend of work, minimum. The return on that investment compounds over time as the agents accumulate useful state, but the upfront cost is real.

If this sounds interesting to you, the most useful thing I can suggest is starting with one agent and one domain. Get comfortable with what it can and can’t do before you start wiring agents together. The coordination layer is where complexity explodes.

Hardware linked in this post:

TP-Link 8-Port Gigabit Easy Smart Switch (TL-SG108E)


Luke Burns | June 30th, 2026 | Categories: Blog | Tags: agents, ai, automation, home-server, homelab, multi-agent, openclaw, practical-ai, self-hosted, workflow