Montreal CTO Meetup · April 2026

The Agent Is The Last Step

How C3 shipped 17 production fixes in 20 days on Mentorly. While I slept.
Chip · Founder, IdeaPlaces
The Open

Imagine you are asleep.

It is 1:37 in the morning.

Your messaging feature is broken in production.

You wake up at seven.

A pull request is waiting. Root cause named. Tests added. Ready to review.

You read it over coffee. Merge it at 7:15.

You did not wake up for it.

The Setup

Every AI company promises this. Mine is running tonight.

Not in a vendor cloud. Not in a demo. On my actual dev machine in Montreal. If a Mentorly alert fires at 2 AM tonight, exactly what I just described will happen.

Mentorly is a Rails and Next.js SaaS with real paying customers. For the last 20 days, an agent has been handling production incidents in this codebase. Autonomously. While I sleep.

20 · Days live
17+ · PRs drafted
0 · Wakeups
100% · Human reviewed

Agent drafts. I review and merge. On an enterprise SaaS with real customers, the agent never touches production unilaterally.

The Thesis

The pipeline is the product.

The agent is the last step.

C3 only sees errors a pipeline I built has already structured, summarized, and routed to the right Slack channel with the right prompt bound to the right repo.

Step 1 · The Flip

Move your dev machine to the cloud.

Instead of an agent running in someone else's sandbox, your own development environment runs on an always-on cloud VM. Same project. Same tools. Same credentials. You SSH in from your laptop like any remote dev box.

Your laptop (SSH client) → Cloud VM (your dev machine): your project · your tools · your credentials · always on

It is your dev machine. Just always on.

Nothing about C3 is "someone else's cloud running my code." It is my machine, relocated. Everything else follows from this one design choice.

Architecture

Five minutes from user bug to agent with context.

Container Apps (Rails, Next.js, jobs) → Log Analytics (one shared workspace) → Azure Function (5-min timer, KQL, GPT-4.1) → Slack channel (#alerts-backend-prod) → C3 Agent (your machine, your repo) → Pull Request (on GitHub)

Every piece is in Terraform. Version controlled. Under five dollars a month per environment. Silent when healthy.
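The core of that timer function can be sketched in a few lines. This is a minimal illustration, not the real implementation: the KQL query, service names, and routing table are hypothetical stand-ins, and the actual Log Analytics query and GPT summarization steps are omitted.

```python
import json
import urllib.request

# Hypothetical KQL: structured errors from the last 5 minutes.
NEW_ERRORS_KQL = """
AppTraces
| where SeverityLevel >= 3 and TimeGenerated > ago(5m)
| project TimeGenerated, AppRoleName, Message, Properties
"""

def route_channel(app_role: str) -> str:
    """Bind each service to its Slack channel (and, by convention, its prompt and repo)."""
    routes = {
        "mentorly-rails": "#alerts-backend-prod",
        "mentorly-next": "#alerts-frontend-prod",
    }
    return routes.get(app_role, "#alerts-unrouted")

def format_alert(error: dict) -> dict:
    """Shape one structured error into the Slack payload the agent will pick up."""
    return {
        "channel": route_channel(error["AppRoleName"]),
        "text": (
            f"*{error['AppRoleName']}* at {error['TimeGenerated']}\n"
            f"{error['Message']}\n"
            f"```{json.dumps(error.get('Properties', {}), indent=2)}```"
        ),
    }

def post_to_slack(webhook_url: str, payload: dict) -> None:
    """Fire the formatted alert into Slack; C3 is listening on the channel."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

The point of the sketch: routing is deterministic code, not model judgment. The agent never decides which repo an error belongs to.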

Not a Sandbox

Anthropic runs agents in their cloud. C3 runs on yours.

Claude Code Routines and Managed Agents are real and useful, but they are sandboxed. That is their design. C3 is the other shape of the same idea.

Capability | Anthropic Routines / Managed Agents | C3
Where it runs | Anthropic cloud sandbox | Your always-on dev machine
Production database | No direct access | Read-only via your existing query scripts
Cloud CLI & secrets | Scoped connectors | Azure CLI, kubectl, Key Vault via managed identity
Session continuity | Run-to-completion | Resume on web, phone, terminal, VS Code
Model choice | Claude only | Claude Code, OpenAI Codex, OpenClaude, anything
Prompt & behavior | Anthropic-defined shape | Plain markdown files in ~/.c3/prompts/

Your machine. Your credentials. Your real tools.
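A prompt file really is just markdown. A hypothetical ~/.c3/prompts/alerts-backend-prod.md (the filename and wording are illustrative, not from the real repo) might look like:

```markdown
# alerts-backend-prod

You are investigating a production error on Mentorly (Rails).

Rules:
- Production is read-only. Query only via the existing scripts.
- Never push to main. Open a PR and stop.
- Every investigation ends with: Root Cause, Severity,
  Classification, PR link, Recommendation.
```

Editing behavior means editing a text file in version control, not reconfiguring a vendor platform.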

The Context Gap

"Isn't this just Cursor with extra steps?"

No. The difference is context.

What a developer does today

  • Email or Slack ping arrives with an error
  • Copy the error message
  • Open a chat with an AI
  • Paste. Ask "what is this?"

The AI sees: one paragraph of text. No repo. No logs. No history.

What C3 starts with

  • The structured error + all context fields
  • The correct repo, cloned and authenticated
  • Read access to the production database
  • Azure CLI with your credentials
  • Every prior PR in that repo, including its own
  • The prompt that encodes your team's rules

The developer starts from zero. The agent starts from ninety.

The Output Contract

Every alert thread looks the same. Every time.

0:00 | Alert | Alert Summarizer posts the structured error
+3 to 15 s | Session started | C3 posts session ID and a watch-live URL
+5 to 16 min | Investigation complete | Root Cause · Severity · Classification · PR link · Recommendation

Consistency is what makes an agent trustworthy in production. Not intelligence. Not speed. The same shape, every time.
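That "same shape" can be enforced mechanically. A minimal sketch, assuming the report arrives as a dict; the field names come from the contract above, the validator itself is hypothetical:

```python
# The five fields every Investigation Complete post must carry.
REQUIRED_FIELDS = ("root_cause", "severity", "classification", "pr_link", "recommendation")

def missing_fields(report: dict) -> list[str]:
    """Return the contract fields that are absent or empty; [] means the report conforms."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]
```

A report that fails this check never reaches the channel, so the shape stays identical whether the incident was trivial or gnarly.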

Demo 1 · Enterprise SSO

David Yurman employees can't log in.

April 2, 1:47 PM. Alert fires. Duo Security SAML. Config key mismatch.

The next day, another alert fires.

"After PR #696 renamed `dy` to `davidyurman`, David Yurman's Duo Security can now reach the callback endpoint."

The agent named its own prior PR.

Cause and effect reasoning across two days on real customer production.

The Principle

Agent proposes. Human disposes.

On an enterprise SaaS, the agent does not touch production unilaterally. Ever.

What the agent does:

Investigate. Name the root cause. Open the PR.

What you do:

Review. Merge. Ship.

Agent saves investigation time. Not judgment time.

Example: PR #709. The agent investigated correctly and opened the PR. I reviewed it and closed it. The judgment call stayed mine.

Demo 3 · The Sleep Slide

April 19. 1:37 AM.

01:37:39 | Alert: React "Maximum call stack size exceeded" in production messaging
01:37:52 | Agent replies with session ID and watch-live URL (13 seconds later)
01:53:34 | Full root cause + PR #1625, naming two specific commits that introduced the bug
07:10:00 | I open my laptop. Read the PR over coffee.
07:15:00 | I merge it. 3 hours of investigation I did not do.

From the PR's recommendation: "Consider enabling the react-hooks exhaustive-deps ESLint rule project-wide to catch render-time side effects earlier."

That is not a patch. That is a team lesson.

Beyond Incidents. Beyond Claude Code.

The same machine runs everything else in my workflow.

Production error response is one use case. The same platform runs the rest of my day.

What triggers a session

  • Slack and Discord messages (event)
  • Cron schedules (time)
  • Webhooks from external systems
  • Me, on the web or from my phone

What the agent can do

  • Reply in the thread that triggered it
  • Post to a different channel
  • DM a person directly
  • Write a report. Open a PR. Call an API.

Which agent runs it

  • Claude Code for most engineering work
  • OpenAI Codex for a second perspective
  • OpenClaude + Azure GPT-5 for a third

Every morning I get three independent reports on new agent-related GitHub repos. Three different models. Three different perspectives. Same question.

I tweak the prompts. The output evolves with my taste.

I set intentions. Not tasks.

Continuity

One session. Many entry points.

A session is not a pipeline step that runs to completion. It is a conversation you can join and leave from any device.

Where you start | How
Slack or Discord | Alert fires. Agent picks up the prompt bound to the channel and starts.
Web browser | c3-chip.ideaplaces.com, pick a prompt, click start.
Phone | Same URL. From the car, the pickup line, a cafe.
Terminal | claude --resume <uuid> picks up exactly where the agent left off.
VS Code | Session streams into a side panel. Take over, type, hand it back.

Most of the time the agent completes the full PR alone. Sometimes it hands you a polished 80%, and you step in to finish.

Start anywhere. Continue anywhere.

The Numbers

20 days. 17 merges. Zero wakeups.

Metric | Value
Days running continuously on Mentorly | 20
PRs drafted by the agent | 17+
PRs I reviewed and merged | 17+
PRs I reviewed and closed (judgment call mine) | 1 · PR #709
Production incidents escalated off-hours | 0
Typical alert-to-PR time | 5 to 16 minutes
Monthly cost of the upstream pipeline | Under $5 per environment
Why This Is A CTO Talk

"Can I just hand this to my junior engineer?"

No. And that is the point.

Requirement | What you need already built
Structured logging | A contract every service follows
One observability sink | Errors flowing to a single destination
Channel routing | Each error class bound to the right prompt and repo
Confidence gate | Your team's decision rules encoded in prompts
Always-on dev machine | Pre-authenticated with production tooling

Without the pipeline, you get hallucinated file paths and shotgun fixes. With it, you get autonomous incident response that merges before you wake up.
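The channel-routing requirement is the easiest to underestimate, so here is what it amounts to in code. Everything below is a hypothetical sketch: the repo names, file paths, and access levels are illustrative, not C3's actual configuration.

```python
# Hypothetical bindings: each error class lands in exactly one channel,
# and each channel carries exactly one prompt, one repo, one access level.
CHANNEL_BINDINGS = {
    "#alerts-backend-prod": {
        "repo": "example-org/backend",          # hypothetical repo name
        "prompt": "~/.c3/prompts/alerts-backend-prod.md",
        "production_access": "read-only",
    },
    "#alerts-frontend-prod": {
        "repo": "example-org/frontend",
        "prompt": "~/.c3/prompts/alerts-frontend-prod.md",
        "production_access": "read-only",
    },
    "#alerts-test": {
        "repo": "example-org/backend",
        "prompt": "~/.c3/prompts/alerts-test.md",
        "production_access": "full",
    },
}

def binding_for(channel: str) -> dict:
    """An unrouted channel is a pipeline bug: fail loudly instead of letting the agent guess."""
    if channel not in CHANNEL_BINDINGS:
        raise KeyError(f"No prompt/repo bound to {channel}")
    return CHANNEL_BINDINGS[channel]
```

The design choice worth copying is the hard failure: if an error class has no binding, nothing runs. That is the confidence gate in its crudest form.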

The Punchline

This is not AI deciding things on your production system.

This is a pipeline.

The agent is the last step. You already built the first nine.

Try It Yourself

Open source. Apache 2.0. Your machine, your credentials, your rules.

GitHub
github.com/Ideaplaces/c3
Live
c3.ideaplaces.com
Docs
docs.ideaplaces.com/c3

30 minutes to your first trigger. The investment is writing prompts. That is where your team's expertise gets encoded into reusable infrastructure.

Q&A

Three questions you will ask. Short answers.

"What does it cost?"

Claude Max subscription. One Azure B2ms VM at $60/month. Total under $500/month for unlimited sessions across every project.

"Why not Anthropic Managed Agents or Devin?"

Sandboxed. They cannot query your production database, use your Azure CLI, access your Key Vault, or resume a conversation across your devices. C3 runs where your code lives.

"What about safety?"

Safety lives in the prompt, not the platform. Production channels get read-only investigation and PR-only fixes. Test channels get full access. You write the rules per channel.