There's an irony that I've sat with for a while: I spend my days building AI systems for healthcare organizations, writing specs for AI agents that automate complex clinical workflows - and then I come home and manually write, format, and publish every piece of content by hand.

That gap bothered me. Not because manual content work is beneath me, but because I knew exactly what was possible. I'd seen AI agents handle complex multi-step workflows in enterprise environments. Why was I still copying and pasting HTML into a CMS?

So about eight months ago, I started building what I now call Jarvis - a personal AI operating system that runs, among other things, my entire content operation. This is a behind-the-scenes look at what I built, what works, what doesn't, and what I'm still figuring out.

Full transparency: This post itself was partially generated using the system I'm describing. I wrote the outline, contributed the personal anecdotes and opinions, and reviewed and edited the draft. The system handled structure, expansion, and formatting. I'll explain exactly how at the end.


The Problem I Was Solving

My content goals are straightforward: write and publish substantive, high-quality content about AI product management across multiple platforms (this blog, LinkedIn, Medium, Substack) consistently enough to build an audience and establish credibility in my field.

The gap between that goal and my available time was brutal. I work full-time leading AI product at HCLTech, I run Ethlore (a D2C brand) on the side, and I'm a human with a need for sleep and a social life. I could carve out maybe 2-3 hours per week for content creation.

2-3 hours per week is not enough to write, edit, format, optimize, publish, and distribute one high-quality long-form post, let alone the volume needed to build meaningful platform presence.

The traditional answer is to hire a content team. I'm not at that stage yet. The AI answer is to build a system.

The Architecture: What Jarvis Actually Is

Jarvis is a collection of Python scripts and AI agents that live in a private GitHub repository. It runs primarily on Modal (a serverless GPU/CPU cloud platform) with Claude as the primary LLM. The whole thing is orchestrated through a combination of scheduled jobs, webhook triggers, and manual invocations via Claude Code in my terminal.
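In simplified form, the orchestration boils down to a registry of jobs keyed by cadence, which a scheduler (Modal's `schedule=` parameter, in my case) triggers on a timer. The sketch below uses only the standard library, and the job names and return values are illustrative, not the real codebase:

```python
# Minimal sketch of the job-registry pattern behind the scheduler.
# Names (register, run_cadence, idea_engine) are illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Job:
    name: str
    cadence: str              # e.g. "weekly", "daily"
    run: Callable[[], str]

JOBS: list[Job] = []

def register(name: str, cadence: str):
    """Decorator that adds a function to the job registry."""
    def wrap(fn: Callable[[], str]) -> Callable[[], str]:
        JOBS.append(Job(name, cadence, fn))
        return fn
    return wrap

@register("idea-engine", "weekly")
def idea_engine() -> str:
    return "ranked content opportunities"

def run_cadence(cadence: str) -> list[str]:
    """What a cron trigger or Modal schedule would fire for one cadence."""
    return [job.run() for job in JOBS if job.cadence == cadence]
```

In production, each registered function is a Modal function with a weekly or daily schedule rather than a local callable, but the shape is the same.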

The content operation specifically has five components:

1. The Idea Engine

I maintain a running notes file where I dump article ideas, observations, half-formed thoughts, and things I've seen that I want to write about. The Idea Engine runs weekly, reads this file, pulls recent AI news and LinkedIn trending topics, and produces a ranked list of content opportunities with suggested angles.

It factors in: what I've already written (to avoid repetition), what's trending in my niche (so I'm timely), what has strong SEO opportunity (keyword gap analysis), and what's likely to resonate with my specific audience (based on historical engagement patterns).

I spend 15 minutes on Monday reviewing the list and picking 2-3 ideas for the week. That's the primary human-in-the-loop step for idea selection.
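The ranking step itself is simple. In sketch form (the real weights and signal names differ, and the signals are computed upstream), it looks something like:

```python
# Illustrative scoring pass for the Idea Engine. The weights and the
# signal names (trend_score, seo_gap, audience_fit) are assumptions,
# not the production values.
def score_idea(idea: dict, already_written: set[str]) -> float:
    if idea["topic"] in already_written:       # avoid repetition
        return 0.0
    return (0.4 * idea["trend_score"]          # timeliness in the niche
            + 0.3 * idea["seo_gap"]            # keyword-gap opportunity
            + 0.3 * idea["audience_fit"])      # historical engagement match

def rank_ideas(ideas: list[dict], already_written: set[str]) -> list[dict]:
    """Return ideas best-first, dropping anything I've already covered."""
    return sorted(ideas,
                  key=lambda i: score_idea(i, already_written),
                  reverse=True)
```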

2. The Research Agent

Once I've selected an idea and provided a brief (usually 3-5 bullet points of what I want to cover), the Research Agent does a structured research pass. It searches for relevant recent articles, pulls data from my saved library of papers and references, identifies counterarguments worth engaging with, and compiles a research brief.

The output is a document that includes key facts, statistics, relevant company examples, and potential structural approaches. I review this (10-15 minutes) and add my own perspectives, personal examples, and the specific opinions I want to express.
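The brief-assembly step is mostly plumbing: the agent's findings get merged into a single markdown document I can annotate. A simplified version, with illustrative field names:

```python
# Hedged sketch: compiling the Research Agent's findings into one
# markdown brief. Section names mirror the prose above; the function
# signature is an assumption.
def compile_brief(topic: str, facts: list[str],
                  counterarguments: list[str], examples: list[str]) -> str:
    lines = [f"# Research brief: {topic}", "", "## Key facts"]
    lines += [f"- {f}" for f in facts]
    lines += ["", "## Counterarguments worth engaging"]
    lines += [f"- {c}" for c in counterarguments]
    lines += ["", "## Company examples"]
    lines += [f"- {e}" for e in examples]
    return "\n".join(lines)
```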

3. The Draft Generator

The Draft Generator takes the research brief plus my annotations and produces a full draft in my voice. This is where the system earns its keep most obviously.

"In my voice" is not trivial to achieve. I spent about three months fine-tuning the system prompt and few-shot examples to get drafts that sound like me - direct, first-person, opinion-forward, with specific examples over vague generalities. I trained it on my previous writing and had it internalize my structural preferences (lead with the problem, use specific examples early, don't hedge, state opinions clearly).

The drafts are consistently good now. They're not yet good enough to publish unedited - I always spend 20-40 minutes revising - but they're good enough that I'm not rewriting from scratch.
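Mechanically, "voice" lives in how the prompt is assembled: style rules plus brief/draft pairs from my own published posts, sent as a few-shot conversation. The style rules and example pairs below are placeholders; the real ones took months of iteration:

```python
# Sketch of the few-shot "voice" prompt assembly. STYLE_RULES and the
# example pairs are placeholders, not the production prompt.
STYLE_RULES = ("Write in first person. Lead with the problem. "
               "Use specific examples early. State opinions plainly; "
               "do not hedge.")

def build_messages(brief: str, examples: list[tuple[str, str]]) -> list[dict]:
    """Return a chat-style message list: few-shot pairs, then the task."""
    messages = []
    for sample_brief, sample_draft in examples:
        messages.append({"role": "user", "content": sample_brief})
        messages.append({"role": "assistant", "content": sample_draft})
    messages.append({
        "role": "user",
        "content": f"{STYLE_RULES}\n\nDraft a post from this brief:\n{brief}",
    })
    return messages
```

The resulting list is what gets passed to the Claude messages endpoint, with the system prompt carried separately.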

4. The Publishing Pipeline

This is where I really saved time. Once a draft is approved, the Publishing Pipeline:

  • Formats the HTML appropriately for Ghost CMS
  • Generates SEO metadata (meta title, meta description, excerpt)
  • Suggests tags and categories based on existing taxonomy
  • Generates a LinkedIn post version (professional, opinion-forward, 1500-3000 characters)
  • Generates a Twitter/X thread version (10-15 tweets, thread format)
  • Generates a Medium cross-post version with appropriate formatting
  • Schedules all of these for publication at optimal times

Before this system, formatting and distribution alone took 1.5-2 hours per post. The pipeline reduces it to about 15 minutes of review and approval.
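The per-platform packaging step is mundane code. Two representative pieces, simplified: the Ghost payload follows the Admin API's posts resource (a `posts` array with title, HTML, tags, metadata, and a scheduled publish time), and the LinkedIn variant gets trimmed to the platform's character limit. Field names beyond the core ones are assumptions:

```python
# Illustrative shape of the pipeline's packaging step. The Ghost
# payload mirrors the Admin API posts resource; exact fields used in
# production may differ.
def ghost_payload(title: str, html: str, tags: list[str],
                  meta_description: str, publish_at: str) -> dict:
    return {"posts": [{
        "title": title,
        "html": html,
        "tags": tags,
        "meta_description": meta_description,
        "status": "scheduled",
        "published_at": publish_at,   # ISO 8601 timestamp
    }]}

def linkedin_version(draft: str, limit: int = 3000) -> str:
    """Trim the LinkedIn variant to the platform's character limit."""
    return draft if len(draft) <= limit else draft[:limit - 1] + "…"
```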

5. The Performance Monitor

Weekly, the Performance Monitor pulls engagement data from all platforms, identifies which posts are performing well (and why), surfaces posts worth promoting further, and feeds learnings back into the Idea Engine's content opportunity scoring.

This closes the feedback loop that most content creators never close - your best content informs your next content. The system learns what resonates with your audience in a structured way.
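The feedback computation is deliberately simple: normalize engagement per post, then average it by topic so the Idea Engine has a weight to fold into its scoring. Metric names here are illustrative:

```python
# Sketch of the feedback loop: turn raw platform stats into per-topic
# engagement weights the Idea Engine can reuse. Field names (views,
# likes, comments, shares, topic) are assumptions.
from collections import defaultdict

def engagement_rate(post: dict) -> float:
    """Interactions per view; zero-view posts contribute nothing."""
    views = post.get("views", 0)
    if views == 0:
        return 0.0
    return (post.get("likes", 0) + post.get("comments", 0)
            + post.get("shares", 0)) / views

def topic_weights(posts: list[dict]) -> dict[str, float]:
    """Average engagement rate per topic across all tracked posts."""
    totals, counts = defaultdict(float), defaultdict(int)
    for p in posts:
        totals[p["topic"]] += engagement_rate(p)
        counts[p["topic"]] += 1
    return {t: totals[t] / counts[t] for t in totals}
```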

The Tech Stack

For those who want the specifics:

  • LLM: Claude 3.5 Sonnet for drafting and complex reasoning; Claude 3 Haiku for classification tasks and metadata generation
  • Orchestration: Claude Code (Anthropic's CLI) for local development and ad-hoc tasks; Modal for scheduled cloud execution
  • Infrastructure: Python throughout; python-dotenv for config; custom resilience layer with exponential backoff for all API calls
  • CMS: Ghost with Admin API for publishing
  • Social distribution: LinkedIn API (company page + personal), Twitter API v2, Medium import API
  • Analytics: Custom script pulling from Ghost Stats, LinkedIn Analytics, and Twitter Analytics into a unified dashboard
  • Storage: Google Sheets as the lightweight database for post tracking and performance data
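The resilience layer is worth a closer look, because every external API in the list above fails occasionally. A minimal version of the retry wrapper, with assumed defaults (the production values and exception handling are more specific):

```python
# Minimal version of the resilience layer's retry-with-backoff wrapper.
# Retry count, base delay, and the broad `except` are assumptions; in
# practice you'd catch only transient errors (timeouts, 429s, 5xx).
import random
import time

def with_backoff(fn, retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Call fn(), retrying on failure with capped exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise                      # out of retries: surface the error
            delay = min(cap, base * 2 ** attempt)
            time.sleep(delay * random.random())  # full jitter
```

Full jitter (multiplying the delay by a random factor) keeps many scheduled jobs from retrying in lockstep against the same rate-limited API.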

What Works Really Well

Distribution automation is the biggest win. Writing the content is the interesting part. Reformatting it for six platforms, generating the right metadata for each, scheduling at optimal times - that's pure mechanical work. Automating it fully gives me back 2+ hours per post.

Research aggregation saves enormous time. The Research Agent is particularly valuable for posts that require surveying a space ("AI in fintech", "edtech personalization") rather than purely expressing my opinion. It assembles a research brief in minutes that would take me 2-3 hours of reading and note-taking.

The feedback loop improves quality over time. I'm now on my eighth month of running this system. The Idea Engine's content recommendations have gotten measurably better at predicting what gets engagement, because it has a richer performance data set to learn from.

What Doesn't Work (Yet)

Personal stories and specific experiences. The system can't generate the story I told earlier about the 94% accuracy mistake at Edxcare. It can't know what I observed at HIMSS26 or what I felt when a product I built flopped. The personal experience layer still requires me, which is as it should be. But it means the drafts need significant human enrichment to be excellent rather than merely competent.

Hot takes and genuine opinion. I can instruct the system to be opinion-forward, but genuine contrarianism - the "I disagree with the conventional wisdom here" moments that make content memorable - still comes from me. The system gives me well-organized, defensible positions. I inject the provocative takes.

Complex multi-platform coordination. When a post goes viral on LinkedIn and I want to react in real-time - expanding on a point, responding to counterarguments, publishing a follow-up - the system can't handle the improvisation. The scheduled workflow doesn't accommodate viral moments well.

SEO keyword research at depth. My metadata generation is good but not great. I'm still doing manual keyword research for my most important SEO-focused posts.

The Honest ROI

Before Jarvis: roughly 3-4 hours per post, 2-3 posts per month = ~8-10 hours per month on content.

After Jarvis: roughly 1-1.5 hours per post for higher-quality output, 4-6 posts per month = ~5-7 hours per month on content.

I'm publishing more, spending similar total time, and the quality (based on engagement metrics) is higher because more of my time goes into the high-value parts (ideas, opinions, personal examples) rather than the low-value parts (formatting, distribution, metadata).

The system took about 6 weeks to build and tune. That's a significant upfront investment that only makes sense if you're committed to sustained content production. I am, so it was worth it.

About This Post

Here's the specific process for this post: I wrote a 6-bullet brief covering what I wanted to say. The Research Agent pulled relevant data on AI content tools and creator stacks. I added my personal examples (the Edxcare story, the specific numbers, the architecture details). The Draft Generator produced a first draft. I rewrote approximately 30% of it - mostly the opening, all the personal anecdotes, and the "What Doesn't Work" section, which required genuine self-reflection rather than AI synthesis.

Is that authentic? I think it is. The ideas are mine. The opinions are mine. The experiences are mine. The system helped me write them down more efficiently and present them more clearly. That's not different in kind from using a word processor or a research assistant.

The question worth asking: Would this post exist without the system? Probably not this week. Possibly not this month, given my schedule. The system didn't replace my voice - it gave my voice more opportunity to be heard.
