I Built a 65-Subsystem AI Operating System With No Engineering Background
One person, four repos, 65 subsystems, ~177,000 lines of code, 30 scheduled jobs, 2,792 tests. The honest build-in-public story of CAIOS — and what an indie hacker can actually steal from it.
**65 subsystems. ~177,000 lines of code across four repos. 30 scheduled jobs running on a single VM. 2,792 tests. 637 work packages auto-completed. One person, no engineering background, no team, no funding.**
This is not a "look how easy AI makes coding" post. It is the opposite. It is a record of what one non-engineer could actually drag into existence by treating Claude as a colleague instead of a magic wand — and what it cost in scar tissue along the way.
If you are an indie hacker or solo founder staring at a blank repo wondering whether you can really build the thing in your head, this is for you.
---
## The Starting Point
I work at a chain retail store. Before that I cycled through six or seven other jobs — almost all of them entry-level service work. After seven or eight years of repeating the same tasks every day, earning a predictable but uninspiring salary, and navigating office politics and difficult customers, I hit a wall.
In late December 2025 I started asking myself a question I could not stop thinking about: *how do I escape this?* Not just this job — this entire structure where my income depends on showing up to a building and doing what someone else decides I should do. I wanted a path to financial independence. One where I could work on my own terms, or eventually not work at all and still cover my expenses.
My first idea was investing. If I could grow capital fast enough, I could quit sooner. I looked into the AI-powered automated trading tools that banks were starting to offer. None of them made sense to me, and none of them did what I actually wanted. So I asked an AI chatbot whether it was possible to build my own — a fully automated AI trading agent tailored to my own strategy. The answer was yes, in theory.
But investing requires capital, and my retail salary barely covers living expenses. There was no pile of spare cash to deploy. Then I noticed something on social media: people were building AI-powered tools and generating income directly. Not from investing — from building. Maybe I could do the same thing.
That gave me the idea that became everything: **build an AI system that generates income, then routes that income into an automated investment engine, creating a self-reinforcing growth loop.** Income funds investment. Investment compounds. The AI runs both sides.
The investment system — joseph — is built. It scans the Taiwan stock market, scores opportunities, simulates trades, and reports. It has been running in dry-run mode every weekday at 08:00 for months, but live trading is permanently locked in source code until I have enough capital and enough backtesting data to trust it.
The income side is where I am stuck. I built a web platform ([creatoraitools.tools](https://www.creatoraitools.tools)), I built 25 interactive tools, I set up SEO, I set up affiliate partnerships — but I have not earned a single dollar yet. That is the honest state of things as of April 2026.
Here is what matters for this story: **I cannot write code.** Not one line. When I started in December 2025, I would ask a chatbot for code, copy it into a Python file, and run it. I did not understand what any of it did. Then I discovered code editors with AI agents built in — tools like Windsurf — that could write and modify code for me inside a real project. And finally I found Claude Code, which could not only write code but plan, debug, test, and explain what it was doing in a way I could actually follow and direct.
My role in this entire stack is not "programmer." It is **product manager, architect, and QA — powered by AI that writes the code I describe.** I generate the ideas. I test the results. I verify the behavior. I decide what to build next. Claude writes the implementation. That loop — describe → generate → test → verify → iterate — is how 177,000 lines of code got written in about three months by someone who still cannot write a for-loop from memory.
I have not made any money yet. That is important to say, because most "I built X with AI" stories conveniently skip the part where the thing has not paid for itself. This one has not. What it has done is prove — to me, at least — that the gap between "I have an idea" and "I have a working system" is no longer a $200,000 engineering team. It is one person, one AI, and a lot of stubborn evenings after work.
What I will say up front, because it matters for everything that follows: I am not a software engineer. I did not study CS. I do not have a team. Every repo described below was built by one person sitting in front of a terminal, asking Claude questions, reading the answers, copying things into files, breaking things, asking Claude why they broke, and trying again.
The trick — if there is one — was deciding very early that I was not going to "learn to code" first. I was going to build the thing I needed and let the learning happen as a side effect.
---
## What I Built
There are four repos in my workspace. Each one does a real job. None of them are toys.
**CAIOS** — the Central AI Operating System. This is the brain. **65 subsystems**, **657 Python files**, **154,740 lines of code**, **46 database tables**, **143 test files** containing **2,792 test functions**. It runs as four systemd services on a single VM (no Docker, no Kubernetes, nothing fancy): a FastAPI gateway with an embedded scheduler, a Telegram long-polling worker, a development orchestrator worker, and a Next.js web console. CAIOS plans, approves, executes, audits, and reports on everything I do across all the other repos. It is the thing I talk to.
**growth-factory** — the public product, [creatoraitools.tools](https://www.creatoraitools.tools). A Next.js 15 / React 19 / Prisma / PostgreSQL site for creators who want to build with AI. **233 files**, **21,395 lines of TypeScript**, **20 pages**, **30 API routes**, **29 Prisma models**. Deployed on Vercel. This is the one users actually see.
**joseph** — a Taiwan stock investment engine I started before CAIOS existed. It scans, scores, simulates, and reports. It has been running in dry-run / shadow mode every weekday at 08:00 for months. Live trading is **permanently blocked** in code, not in config. We will come back to that decision.
**buildhub-patrol** — the watchdog. **848 files**, an autopilot loop that has now completed **87 iterations**, plus a Playwright e2e suite that runs every night at 03:00 and a health patrol that runs every six hours against the production site. When something on growth-factory breaks, this is what notices first.
That is the whole portfolio. One brain, one product, one investment engine, one watchdog. They all talk to each other through CAIOS.
---
## The Stack
I am listing this because every "I built X with AI" post is allergic to specifics, and specifics are the only thing that helps you decide whether the approach maps to your situation.
**Languages:** Python 3.11 (CAIOS, joseph, the growth-factory worker), TypeScript (growth-factory web), Bash (everything that glues things together), SQL.
**CAIOS backend:** FastAPI, SQLAlchemy 2.0, Alembic, Pydantic 2, APScheduler, python-telegram-bot, structlog, httpx, Redis, asyncpg. Packaged with **uv**, not poetry. SQLite for dev, PostgreSQL for prod. Tested with pytest + pytest-asyncio, linted with ruff.
**growth-factory web:** Next.js 15.5, React 19.2, NextAuth v5, Prisma 7.5 with the pg adapter, Tailwind 3.4, Playwright for e2e, Vitest for unit tests, the Sandpack in-browser code sandbox for interactive examples. TypeScript 5 throughout.
**LLM layer:** Claude (via the `claude` CLI on a Claude Max subscription) is the primary brain. Vertex AI / Gemini is the fallback. OpenAI is still wired in for one legacy path on growth-factory because I have not finished cutting it over.
**Infra:** A single GCP Compute Engine VM running systemd services. No Docker on this host. Vercel for the web product. Gmail SMTP for outbound email. Resend for transactional. Telegram Bot API for everything operational. Google Search Console feeding the SEO loops. twstock and yfinance feeding joseph.
That is the entire stack. There is nothing exotic. No vector DB. No managed Kubernetes. No microservices mesh. The complexity is in how the pieces are wired, not in how many pieces there are.
---
## How It Actually Works
Out of 65 subsystems I want to show you four, because they are the ones that changed how I think about what one person can run.
### 1. The scheduler is the heartbeat
`src/scheduler/service.py` registers **27 distinct jobs**. The system crontab adds another three. That is 30 things happening on a clock without me touching anything.
A representative day: at 08:00 the morning brief lands in my Telegram. Joseph's dry run kicks off in parallel. Every 5 minutes a healthcheck pings the API. Every 30 minutes an anomaly patrol sweeps the workspace looking for things that drifted. Every 6 hours buildhub-patrol curls the production site. At 20:00 the daily report is generated. At 21:30 the evening recap runs. At 22:00 the ADO (Autonomous Development OS) summary fires. At 23:00 memory sync writes the day's learnings into the persistent memory file. Weekly jobs handle cleanup, security scans, memory compression, and email reports.
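The registration pattern behind that day is simple enough to sketch in stdlib Python. The real system uses APScheduler inside `src/scheduler/service.py`; the job names and times below are taken from the schedule above, but the helper itself is illustrative, not the actual code:

```python
# Minimal sketch of daily-job registration, stdlib only.
# The real scheduler is APScheduler; this just shows the shape of the idea.
import datetime as dt

def next_run(now: dt.datetime, hour: int, minute: int) -> dt.datetime:
    """Next occurrence of a daily HH:MM job at or after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += dt.timedelta(days=1)
    return candidate

# A registry in the spirit of the real job table (times from the text above).
JOBS = {
    "morning_brief": (8, 0),   # 08:00 Telegram brief
    "daily_report": (20, 0),   # 20:00 report generation
    "memory_sync": (23, 0),    # 23:00 write the day's learnings to memory
}

now = dt.datetime(2026, 4, 10, 21, 0)
due = {name: next_run(now, h, m) for name, (h, m) in JOBS.items()}
# At 21:00, morning_brief and daily_report roll over to tomorrow;
# memory_sync still fires tonight at 23:00.
```

The point of the sketch is the registry: every job is data, not a script someone has to remember to run.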
The lesson here is not "use APScheduler". The lesson is: **a small loop that runs forever beats a brilliant script you have to remember to run**. Most of CAIOS's value comes from things that happen while I am asleep.
### 2. The memory system is the difference between an assistant and a colleague
Every CAIOS conversation reads and writes to a persistent memory file. There is a topic-indexed `MEMORY.md` at the root, and individual memory files for projects, feedback, references, and user context underneath it. When I open a new session, Claude already knows that live trading is permanently blocked, that the web console is intentionally disabled right now, that the Proactive Loop is paused, that ADO has a known cross-repo execution gap, that I prefer terse responses, that I write code in English and talk to it in Traditional Chinese.
I did not invent this — it is built on top of Claude Code's auto-memory feature — but I leaned on it harder than most people do. The result is that "ramp-up cost" between sessions has effectively gone to zero. I do not re-explain my project every morning. The memory remembers.
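To make "topic-indexed" concrete, here is a minimal sketch of the update operation, assuming a plain markdown layout with one `##` section per topic. The real MEMORY.md format is Claude Code's and richer than this; the helper and section names are illustrative:

```python
# Sketch: append a note under a topic section in a markdown memory string,
# creating the section if it does not exist yet. Illustrative, not the
# actual MEMORY.md machinery.
def remember(memory: str, topic: str, note: str) -> str:
    header = f"## {topic}"
    lines = memory.splitlines()
    if header not in lines:
        return memory.rstrip("\n") + f"\n\n{header}\n- {note}\n"
    out, in_topic = [], False
    for line in lines:
        if line == header:
            in_topic = True
        elif in_topic and line.startswith("## "):
            out.append(f"- {note}")   # insert before the next section
            in_topic = False
        out.append(line)
    if in_topic:                      # topic was the last section
        out.append(f"- {note}")
    return "\n".join(out) + "\n"

mem = "## Constraints\n- Live trading is permanently blocked\n"
mem = remember(mem, "Constraints", "Web console disabled 2026-04-06")
mem = remember(mem, "Preferences", "Terse responses")
```

Because the file is just structured text, the nightly memory-sync job and a human editor can both maintain it.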
If you take one thing from this article: **build a memory system on day one, not day one hundred**. The compounding is enormous.
### 3. The Telegram bot is the operations layer
I do not log into a dashboard. I do not SSH into the box for routine work. I talk to a Telegram bot in natural language and CAIOS routes the message through a normalize → plan → policy → execute → audit pipeline. Approvals come back as inline buttons. Errors come back as Traditional Chinese (zh-TW) summaries with a correlation ID I can grep on. Daily reports, anomaly alerts, ADO status, every scheduled-job result — all of it lands in one chat.
The reason this works is that the pipeline is governed. Every action has a risk classification (`AUTO_EXECUTE_LOW_RISK`, `REQUIRES_APPROVAL`, `FORBIDDEN`). Anything risky stops at the approval gate and waits for me to tap a button. Anything safe just runs and tells me afterwards.
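The policy step can be sketched as a simple classify-then-gate function. The three risk classes are the real ones named above; the action table and helper functions are illustrative assumptions, not the actual policy engine:

```python
# Sketch of the policy gate in the normalize → plan → policy → execute
# pipeline. The Risk names match the article; everything else is made up.
from enum import Enum

class Risk(Enum):
    AUTO_EXECUTE_LOW_RISK = "auto"
    REQUIRES_APPROVAL = "approval"
    FORBIDDEN = "forbidden"

FORBIDDEN_ACTIONS = {"live_trade", "git_push", "drop_database"}
LOW_RISK_ACTIONS = {"healthcheck", "daily_report", "status"}

def classify(action: str) -> Risk:
    if action in FORBIDDEN_ACTIONS:
        return Risk.FORBIDDEN
    if action in LOW_RISK_ACTIONS:
        return Risk.AUTO_EXECUTE_LOW_RISK
    return Risk.REQUIRES_APPROVAL     # default: wait for a human

def handle(action: str) -> str:
    risk = classify(action)
    if risk is Risk.FORBIDDEN:
        return f"{action}: refused"
    if risk is Risk.REQUIRES_APPROVAL:
        return f"{action}: waiting for inline-button approval"
    return f"{action}: executed, reporting afterwards"
```

Note the default: anything the table does not recognize waits for approval rather than running.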
This is what made it possible for one person to operate four repos. The interface to the entire system is one chat thread.
### 4. The Autonomous Development OS writes its own work
ADO is the part that still feels like science fiction to me. It reads work-package backlogs out of markdown files, deduplicates them, classifies them, plans them, dispatches them to a real CLI (Claude or Codex) running in an isolated git worktree, runs the post-execution validation suite (pytest + lint + smoke + contract + git diff), and either commits the result or routes the failure to the repair pipeline.
As of this morning the work-package table reads: **637 completed, 987 cancelled, 0 queued, 0 awaiting approval, 0 executing**. The whole backlog is drained. The cancelled count is high because — and this is the next section — the loop went off the rails twice and I had to nuke the garbage.
Real coding is gated behind a feature flag (`ado_real_coding_enabled`), and `git push` is **hard-disabled in source code, not config**. The executor can commit to an isolated branch. It cannot publish anything. That is on purpose.
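The two layers of that gate can be sketched like this. The flag name matches the article; the constant, function, and step names are illustrative, and the real executor is obviously more involved:

```python
# Sketch of the ADO executor gate: a runtime feature flag for real coding,
# plus a push block that is a constant in source code, not a setting.
ALLOW_PUSH = False   # hard-disabled in the executor; no config can flip this

def execute_work_package(ado_real_coding_enabled: bool) -> list[str]:
    if not ado_real_coding_enabled:
        return ["dry_run"]          # flag off: nothing touches the worktree
    steps = ["plan", "dispatch_to_cli", "validate", "commit_to_isolated_branch"]
    if ALLOW_PUSH:                  # unreachable by design
        steps.append("git_push")
    return steps
```

Even with the feature flag on, the only path to a remote push is editing the source and redeploying, which is exactly the friction the design wants.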
---
## The Hard Lessons
These are real entries from my project memory, not made up for the article.
### Lesson 1 — A flapping watchdog is worse than no watchdog
The CAIOS web console has a watchdog service that is supposed to bring it back up if it crashes. The watchdog itself started flapping. So on 2026-04-06 I deliberately disabled both the web console and its watchdog, and wrote into the memory file exactly how to re-enable them later (set `WEB_CONSOLE_ENABLED=true`, restart two services). The system has been stable since.
The lesson is uncomfortable for engineers and obvious to operators: **ship the off switch before you ship the feature**. Anything that can run autonomously must be one environment variable away from being silent.
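In code, "one environment variable away from silent" means the feature defaults to off and stays off on any unexpected value. The variable name comes from the memory entry above; the helper is an illustrative sketch:

```python
# Sketch of a default-off feature gate. A missing, empty, or mistyped
# value keeps the feature silent; only an explicit "true" enables it.
import os

def web_console_enabled() -> bool:
    return os.environ.get("WEB_CONSOLE_ENABLED", "").strip().lower() == "true"
```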
### Lesson 2 — When an autonomous loop produces garbage, stop the loop first
The Proactive Agent Loop and the ADO backlog ingestion both, at different times, produced "junk work-package" (junk WP) explosions — the loop kept ingesting fragments of markdown as new work items, deduplication missed them, and the backlog ballooned. The fix in both cases was the same shape: **stop the loop, fix the root cause, then restart**. Not "let it run while I patch live". I have 156 cancelled work packages in the database to remind me what the alternative looks like.
If you build an autonomous loop, build the kill switch in the same commit. If you cannot stop it instantly, you do not control it.
### Lesson 3 — Irreversible actions get compile-time blocks, not config flags
Joseph's `allow_live` is hard-coded `False` in the adapter. CAIOS marks live trading as PERMANENTLY BLOCKED. ADO's `ado_real_coding_allow_push` is hard-disabled in the executor itself, so even with real coding turned on, the executor physically cannot push to a remote.
There is also a small but real bug story attached: when I first wrote the safety check, I used `bool(settings.allow_push)` instead of `is True`. Under MagicMock-based tests that meant a mocked settings object would silently evaluate truthy and bypass the guard. The fix — `if settings.allow_push is True` — is two characters longer and prevents an entire class of test-only false negatives.
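That bug reproduces in three lines, because a `MagicMock` attribute is itself a truthy `MagicMock`. The variable names mirror the story, not the actual source:

```python
# Reproducing the test-only false negative: bool() lets a carelessly
# mocked settings object slip past the guard, while `is True` does not.
from unittest.mock import MagicMock

settings = MagicMock()  # attribute access auto-creates truthy MagicMocks

buggy_guard_passes = bool(settings.allow_push)    # True — guard bypassed
fixed_guard_passes = settings.allow_push is True  # False — push stays blocked
```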
The general principle: **for anything you cannot undo, the safety check belongs in source code, not in a config file you might forget to set**. Real money. Force pushes. Production database wipes. None of those should be one typo away.
---
## What You Can Learn From This
I am not going to pretend my approach is the only one. But here is what actually worked, in the order it mattered:
1. **Pick a real project, not a tutorial.** I did not learn Python and then build CAIOS. I tried to build CAIOS and learned Python in the cracks. Tutorials teach you to type. Real projects teach you to think.
2. **Treat the AI as a colleague who has already read everything.** The mistake people make with Claude is asking it to *do* the task. The right move is to brief it the way you would brief a smart contractor: here is the goal, here is what I have already tried, here is what I want it to look like, here is the constraint. The output quality scales linearly with the quality of the brief.
3. **Build the operations layer before the features.** My single biggest unlock was wiring everything to Telegram on day one. The moment "checking on the system" stopped requiring an SSH session, my throughput went up an order of magnitude. Build the chat interface first. Add features into the chat interface.
4. **Memory is the moat.** A persistent, structured, self-updating memory file is the difference between starting every conversation from zero and resuming a conversation in progress. Set this up before you do anything else.
5. **Schedule everything you would otherwise forget.** Most of CAIOS's value happens at 08:00 and 23:00 — not because those times are special, but because *I am not at the keyboard* and the system is still working. Cron is the most underrated framework in the world.
6. **Off switches before features. Approval gates before automation. Compile-time blocks before config flags.** The order matters. Build the brake pedal before you build the engine.
7. **Tests are how you sleep at night.** I have 2,792 of them. I did not write them all by hand — many are AI-generated — but I read every single one and they have caught more regressions than I can count. If you are building autonomous loops, the tests are not optional. They are the only thing standing between "the system fixed itself" and "the system silently destroyed itself".
8. **Ship the boring things first.** Healthchecks. Daily reports. Audit logs. Correlation IDs. Backups. None of that is glamorous. All of it is what makes the glamorous stuff actually survive contact with reality.
The honest summary is this: **AI did not let me skip the engineering. It let me do the engineering without first becoming an engineer.** That is a real distinction, and it is the one I think people miss when they argue about whether non-engineers can build real software with LLMs. The answer is yes, but only if you are willing to do the un-glamorous parts that no tutorial covers. Approvals. Audit trails. Off switches. Memory. Schedulers. Tests.
If you are willing to do those, one person can run a lot more than you think.
---
## Where to go next

### Follow the build
I am writing this whole stack down in public, one piece at a time. The next post breaks down exactly how the Telegram operations layer is wired — how a retail worker talks to 65 subsystems through one chat thread.
If you want to follow along, drop your email below. No spam, no pitch — just the next chapter when it is ready.
