kestrelsnest-blog/2025-09-22-lessons-from-the-solo-developer-trenches.md at 718ffb16385cdd1de4958cd9a9cd3cfe8d00fc6c

eric/kestrelsnest-blog

Fork 0

Files

Eric Wagoner 062c54a26b Locallygrown part five

2025-09-22 18:20:24 -04:00

45 KiB

Raw Blame History

title, description, date, preview, draft, tags, categories, lastmod, keywords, slug

title

description

date

preview

draft

TL;DR

I didn't use "AI" to write my app. I used code-aware tools to multiply my reach by researching options, generating tests and docs, and catching patterns. I stayed the architect and quality gate.

What worked: Test generation, documentation, technology research, code review What failed: Following architectural rules, understanding business logic, knowing when to stop What made it safe: Specs first, human review always, specialized agents for audits, hard CI/CD guardrails

The Starting Point — Rails-to-SvelteKit rewrite and the impossible solo deadline
The Evolution of Our Working Relationship — From architecture analysis to rule-breaking intern
Page-by-Page Migration — Password upgrades, pixel-perfect clones, and CSS survival
Claude Learns to Drive — Puppeteer testing and theme validation
Reimagining the Back Office — Modern tools for managers with sub-agent Ray
Building a Team of Specialized Code Analyzers — Human-named agents and the orchestra effect
The Parallel Universe Gambit — Running legacy and modern front-ends side by side
What Claude Excels At — Boilerplate, testing, review, documentation
What Claude Struggles With — Business logic, refactoring, debugging
The Two-Week Launch Gauntlet — Structured audits, blockers, and the go/no-go ritual
Lessons Learned — How to keep ML tools safe, useful, and human-centered
A Critical Warning — Why “no-code AI SaaS” launches are dangerous
What’s Next — Building on stable foundations for 20+ years ahead

The Starting Point

March 2025: Day one of the Rails-to-SvelteKit rewrite. I knew from the start I'd need the latest generation of developer tools:

23 Rails controllers, 23 models, 173 view templates, and 23 helpers to convert
All the business logic embedded in those views and models
27 database tables to migrate with all their relationships
Payment system to completely rebuild for modern Stripe
Authentication to reimplement in TypeScript
23 years of accumulated features and edge cases to preserve (Rails since 2006)
Documentation to write from scratch
Test coverage at 0%
Launch deadline: August 14, 2025 (5 months away)

I was one person trying to rebuild my own 23-year-old platform in a completely new framework while keeping the existing one running for thousands of users across dozens of markets. I needed tools that could help me plan work, organize tasks, document designs, write tests that match those designs, and review code for compliance.

I started with Cursor, an IDE with advanced code completion and pattern recognition. But after a few weeks, I switched to Claude Code. Unlike the "AI" that floods the internet with generic articles and stolen art styles, Claude Code is closer to what we used to call machine learning. It's trained on programming patterns, not creative works. It understands code structure, can trace logic flows, and most importantly, helps maintain consistency across thousands of files.

I chose Claude Code for its code-aware workflow and Anthropic's published emphasis on safety; that alignment mattered to me.

That decision to go all-in with Claude transformed everything. What started as smart autocomplete evolved into something unexpected: a programming partner that could hold the entire codebase in context.

The Evolution of Our Working Relationship

Day 1-7: The Architecture Phase

Before writing a single line of production code, I had Claude analyze the entire Rails application:

Me: "Here's my Rails schema and Gemfile. I'm moving to SvelteKit.
     What's the modern equivalent stack?"

Claude: "Looking at your ActiveRecord usage, consider these ORMs:
- Prisma: Most popular, good DX but heavy
- Drizzle: Lightweight, SQL-like, perfect for your use case
- TypeORM: Familiar if coming from ActiveRecord

For your use case with complex queries and performance needs,
I'd recommend Drizzle because..."

Me: "What about replacing Paperclip for image uploads?"

Claude: "Modern options:
- Uploadthing: Simple but locked to their service
- Local storage + Sharp for processing
- S3 + CloudFront for scale

Given your farmer's market context and need for simplicity,
local storage with Sharp gives you full control..."

We spent the first week evaluating every gem's modern equivalent:

ActiveRecord → Drizzle ORM (after comparing Prisma, TypeORM, and others)
Paperclip → Sharp for processing + S3/CloudFront for serving (scalable image handling)
Prawn → PDFKit (for label generation)
Geokit → Custom distance calculations (simple haversine formula)
gmaps4rails → OpenStreetMap/Leaflet (privacy-respecting)
will_paginate → Custom pagination (simpler with Drizzle)
delayed_job → BullMQ (modern job queues with Redis)
Stripe gem → Stripe SDK for Node (official, well-maintained)

Week 2-3: The Documentation and Testing Foundation

Before writing any Svelte code, I had Claude help me create a complete blueprint of the existing system:

Me: "Let's document the User model. Here's the Rails code.
     Create a TypeScript service that replicates every method."

Claude: "I'll create UserService with type-safe equivalents:
```typescript
// Replicating User.find_by_email_and_market
static async findByEmailAndMarket(email: string, marketId: number): Promise<User[]> {
  // Exact behavior from Rails, now in TypeScript
}
```"

Me: "Now write tests that verify this matches Rails behavior exactly"

Claude: "Creating comprehensive test suite:
```typescript
describe('UserService.findByEmailAndMarket', () => {
  it('returns empty array when no user found', async () => {
    // Test matches Rails: User.find_by_email_and_market('nonexistent', 1) => nil
    const result = await UserService.findByEmailAndMarket('nonexistent', 1);
    expect(result).toEqual([]);
  });

  it('handles multiple users with same email in different markets', async () => {
    // Documenting Rails edge case behavior
  });
});
```"

Me: "Perfect. Now document this in markdown with examples"

Claude: "Creating comprehensive documentation:
```markdown
## UserService.findByEmailAndMarket

Finds users by email within a specific market context.

### Rails Equivalent
`User.find_by_email_and_market(email, market_id)`

### Behavior Notes
- Returns array (Rails returned single record or nil)
- Case-sensitive email matching (preserves Rails behavior)
- Used in authentication flow and order assignment
```"

Another important output of this process was the CLAUDE.md file that emerged, containing rules for the codebase:

Me: "We keep having to repeat these patterns. Let me document the rules."

// Early CLAUDE.md entries:
- NEVER make database calls outside service files
- ALL database logic goes in /src/lib/server/services/
- Use camelCase for variables, PascalCase for types
- Every service method needs error handling
- Always return typed objects, never raw database rows

Me: "Claude, you JUST made a DB call in a route handler. AGAIN.
     The rules are RIGHT THERE in CLAUDE.md!"

Claude: "You're right, let me fix that and use the service layer..."

Me (next day): "Another DB call in a route. Are you even reading CLAUDE.md?"

Claude: "I apologize, I should have used the UserService..."

If Claude were a human intern, I'd have fired them for repeatedly ignoring the CLAUDE.md rules: DB calls in routes, inconsistent naming, bypassed service layers. But as a research assistant, test generator, and documentation writer? The value far outweighed the frustration of constantly correcting architectural violations.

By the end of week 3, we had:

27 TypeScript services matching Rails models
500+ unit tests documenting exact Rails behavior
100+ pages of markdown explaining every method
CLAUDE.md - our collaboration constitution
Zero UI code - but complete confidence in the foundation

Week 4-5: Page-by-Page Migration

We started converting the Rails application one page at a time, beginning with authentication:

Me: "We need to support the old Rails password hashes. I can't force
     thousands of users to reset passwords simultaneously."

Claude: "Rails uses bcrypt with a specific salt format. We could..."

Me: "Right, but these are OLD Rails hashes. Some might be MD5 from 2006."

Claude: "Interesting challenge. We could implement a versioning system..."

This led to one of my favorite solutions: a seamless password migration system.

Me: "Here's my idea: detect the hash version, validate using the old
     algorithm, then after X successful logins, silently upgrade them
     to modern Argon2."

Claude: "That's elegant. Let me help implement:

class PasswordService {
  // Detect hash version
  static getHashVersion(hash: string): 'legacy_md5' | 'legacy_bcrypt' | 'modern_argon2' {
    if (hash.startsWith('$2a$')) return 'legacy_bcrypt';
    if (hash.startsWith('$argon2')) return 'modern_argon2';
    return 'legacy_md5';
  }

  // Validate with appropriate algorithm
  static async verify(password: string, hash: string): Promise<boolean> {
    const version = this.getHashVersion(hash);
    switch(version) {
      case 'legacy_md5': return this.verifyMD5(password, hash);
      case 'legacy_bcrypt': return bcrypt.compare(password, hash);
      case 'modern_argon2': return argon2.verify(hash, password);
    }
  }

  // After successful login, maybe upgrade
  static async maybeUpgrade(userId: number, password: string, loginCount: number) {
    if (loginCount >= 3) { // After 3 successful logins
      const newHash = await argon2.hash(password);
      await UserService.updatePasswordHash(userId, newHash);
    }
  }
}"

Me: "Perfect! And we'll set a deadline 6 months out for forced reset
     if they haven't hit the threshold."

This wasn't a real discussion. Claude doesn't truly understand the human impact of forcing password resets on farmers at 5 AM trying to print harvest sheets. But the rubber ducking process (and yes, rubber ducking with actual rubber ducks works for a reason) helped me think through an elegant solution.

The result: Zero support tickets about passwords during migration. Users never knew their security was being upgraded behind the scenes.

Month 2: The Methodical Page-by-Page Clone

This phase wasn't glamorous, but it was crucial. Every market had custom CSS, and breaking their styling would destroy trust:

Me: "We need to match the HTML structure EXACTLY. Same IDs, same classes,
     same nesting. Markets have 19 years of custom CSS depending on this."

Claude: "So we're creating a pixel-perfect clone in Svelte?"

Me: "Exactly. My 2007 design looks ugly as sin now, but that's not the point.
     This is about trust and continuity."

We developed a methodical routine for each page:

Me: "Here's the Rails orders/index.erb view. Let's break it down."

Claude: "I see several role-based sections:
- Admin sees all orders with edit buttons
- Growers see only their products
- Customers see only their orders
- The HTML structure uses #order-list > .order-item > .order-details"

Me: "Right. And notice how the classes change based on order status?"

Claude: "Yes: class='order-item status-#{order.status}'
We'll need to replicate that exact pattern."

Me: "Here's what the page.server.ts needs to fetch..."

Claude: "Based on the view, we need:
- User's orders with items and products
- Market configuration for display options
- Grower details if user is a grower
- All maintaining the exact div structure"

Page after page, we followed this pattern:

Examine Rails views: understand every conditional, every loop
Document data requirements: what the server needs to fetch
Map role-based variations: how admins/growers/customers see differently
Match HTML exactly: same IDs, classes, structure
Test with real market CSS: ensure their customizations still work

<!-- The Svelte version had to produce identical HTML -->
<div id="order-list">
  {#each orders as order}
    <div class="order-item status-{order.status}" id="order_{order.id}">
      <div class="order-details">
        <!-- Every class, every ID, exactly as Rails produced -->
        {#if userRole === 'admin'}
          <div class="admin-controls">...</div>
        {/if}
      </div>
    </div>
  {/each}
</div>

By the end of Month 2, we had a complete visual clone. It looked exactly like the 2007 design: dated, clunky, but perfectly familiar to every user.

I couldn't test every single market's custom CSS. There wasn't time. But I had a secret weapon: a collection of themes from a Tennessee designer who'd gotten out of the web business years ago, selling me his LocallyGrown theme collection as he exited. These themes varied widely in style and gave me a solid baseline for testing. I knew if I could make these work, I'd have a foundation for helping individual markets adapt their custom CSS later.

Some markets had gone far beyond these themes with their own deep customizations. They'd need tweaks post-launch. But having those baseline themes working meant the core structure was solid. The IDs and classes were right, the nesting was correct. Markets could adapt their CSS as needed rather than having everything break at once.

This wasn't the exciting rewrite I'd imagined, but it was the right approach. Trust first, modernization later, and realistic about what could be tested before launch.

Month 3: Claude Learns to Drive

Everything changed when Claude stopped reading screenshots and started driving the browser itself. We integrated Puppeteer, letting Claude use a browser directly:

Me: "I'm tired of taking screenshots for you. Here's Puppeteer access.
     Drive yourself to the dev server and look for issues."

Claude: "Navigating to https://localhost:3000/orders...
        Taking screenshot...
        I can see the order table is missing the status column for growers.
        Let me check if it's a CSS issue or missing data..."

[Claude navigates to inspect mode, checks elements]

Claude: "The data is there but hidden. The CSS class .grower-view is
        overriding .status-column with display:none"

Me: "Finally! I've been staring at this for an hour."

This changed everything. Claude could now:

Self-verify fixes: Make a change, navigate to the page, confirm it worked
Test user flows: Click through multi-step processes like checkout
Compare Rails vs Svelte: Open both versions side-by-side, spot differences
Hunt for visual bugs: Systematically visit every page looking for issues

Me: "Check if all the theme variations work with the new order page"

Claude: "I'll test all 12 themes:
[Switches theme]
- Theme 1: ✓ Order table displays correctly
[Switches theme]
- Theme 2: ✓ Working
[Switches theme]
- Theme 3: ✗ ISSUE: Order totals overlapping with sidebar
  The float:left on .order-summary conflicts with .sidebar-right
[Continues through all themes]

Found issues in themes 3, 7, and 11. Here are the fixes..."

Month 4: Reimagining the Back Office

After achieving the visual clone, I realized an opportunity: market manager pages didn't need customer-facing branding. These were tools, not storefronts.

Me: "The manager pages (order management, product admin, reports)
     only managers see these. They don't need their market's CSS.
     We could make these actually modern, mobile-first."

This coincided perfectly with Anthropic introducing sub-agents to Claude Code. That same day, I created Ray, named after legendary designer Ray Eames.

Me: "Ray, you specialize in modern responsive design philosophy.
     We're reimagining back-office pages that have been stuck in 2007.
     They need to work on phones, tablets, and desktops."

Ray: "I'll establish design principles first:
- Mobile-first breakpoints (320px base, enhance upward)
- Touch-friendly targets (minimum 44px)
- Information hierarchy over decoration
- Consistent spacing system (8px base unit)
- Accessible color contrast (WCAG AA minimum)

For the order management page, instead of a cramped table,
consider cards that stack on mobile and grid on desktop..."

Claude: "I'll bring Ray in whenever we're working on these pages.
        She can establish patterns we'll reuse."

Ray created a cohesive design system for the back office:

/* Ray's spacing system */
:root {
  --space-xs: 0.5rem;  /* 8px */
  --space-sm: 1rem;    /* 16px */
  --space-md: 1.5rem;  /* 24px */
  --space-lg: 2rem;    /* 32px */
  --space-xl: 3rem;    /* 48px */
}

/* Mobile-first responsive grid */
.admin-grid {
  display: grid;
  gap: var(--space-md);
  grid-template-columns: 1fr;
}

@media (min-width: 768px) {
  .admin-grid {
    grid-template-columns: repeat(auto-fit, minmax(300px, 1fr));
  }
}

The result: Market managers finally got modern tools. They could manage orders from their phones at the farmers market. They could update inventory from tablets in the field. The ancient tables and tiny form fields were gone, replaced with touch-friendly, responsive interfaces.

Customer-facing pages kept their familiar 2007 look. Manager tools entered 2025.

Building a Team of Specialized Code Analyzers

The Sub-Agents Revolution

By month three, I discovered Claude's most powerful feature: specialized sub-agents. Instead of one generalist tool, I could spawn focused code analyzers with specific expertise:

Me: "I need a comprehensive security audit of our authentication system"

Claude: "I'll use the security-focused agent for this task..."

SecurityAgent: "Beginning authentication audit. Checking:
- Session management
- CSRF protection
- Authorization boundaries
- SQL injection vectors
- XSS vulnerabilities
..."

Our Developer Tools Roster

Over time, I assembled a team of specialized agents, each trained on different aspects of software engineering:

1. Ray - The Design Systems Expert (First agent, created Month 4)

// Named after Ray Eames, pioneering designer
"Create responsive, mobile-first design system for admin pages"

Ray: "Establishing design tokens:
- 8px spacing grid
- Touch targets minimum 44px
- Mobile-first breakpoints: 320px, 768px, 1024px
- Consistent typography scale"

2. Erma - The User-Facing Text Consistency Expert

// Named after Erma Bombeck, humorist who made the everyday relatable
"Review all user-facing text for consistency and clarity"

Erma: "Analyzing user-facing content:
- Standardized 47 inconsistent error messages
- Unified tone across 23 email templates
- Simplified checkout flow language
- Added helpful context to 18 form fields"

3. Diderot - The Documentation Synchronizer

// Named after Denis Diderot, encyclopedist who organized knowledge
"Update documentation to reflect the new payment flow"

Diderot: "Analyzing recent changes...
- Updated PaymentService.processStripeWebhook()
- New error codes added: PAYMENT_METHOD_DECLINED, INSUFFICIENT_FUNDS
- Updating API docs, error handling guide, and integration tests"

4. Maya - The Marketing Tone Specialist

// Named after Maya Angelou, poet who knew how words make people feel
"Create welcoming, inclusive marketing copy for the homepage"

Maya: "Crafting messaging that resonates:
- Hero: 'Fresh from your neighbors, delivered to your door'
- Value prop focuses on community, not just convenience
- Farmer stories highlight real people, not statistics
- Call-to-action: 'Join your local food community'"

5. Agatha - The Security Auditor

// Named after Agatha Christie, master of finding what others miss
"Examine data flow between server and client for security leaks"

Agatha: "Found 3 critical issues:
1. User objects expose internal IDs in /api/users endpoint
2. Password reset tokens visible in logs
3. Market admin can access other markets' data via direct API calls"

6. Ada - The Performance Optimizer

// Named after Ada Lovelace, first to see computing's true potential
"Profile the market homepage and optimize load time"

Ada: "Performance analysis complete:
- Initial load: 3.2s (target: <1s)
- Largest Contentful Paint: 2.8s
- 47 database queries (18 are duplicates)
- Recommendations:
  1. Implement query batching
  2. Add Redis caching for product lists
  3. Lazy load images below fold"

7. Carver - The Payment Integration Expert

// Named after George Washington Carver, who transformed simple inputs into valuable systems
"Implement subscription management with Stripe"

Carver: "Designing subscription architecture:
- Webhook handlers for subscription lifecycle
- Proration for mid-cycle changes
- Grace periods for failed payments
- Automatic retry logic with exponential backoff"

Why Human Names Matter

Most guides recommend technical names for sub-agents like "css-design-agent" or "auth-flow-agent." I deliberately chose to name mine after real people, and this decision shaped the entire development experience.

First, it was my way of acknowledging gratitude. Every time I worked with Ray, I remembered Ray Eames and the design revolution she helped create. When Agatha found security holes, I thought of Agatha Christie's meticulous attention to detail. These names kept me grounded, aware that I stand in a long line of innovators and problem-solvers.

More importantly, human names constantly reminded me who this platform serves. This isn't a faceless SaaS product. It's a community platform. My market managers are people with names and stories. The farmers wake up at 4 AM to harvest vegetables. The customers are neighbors feeding their families. I am a person, building tools for people.

// This feels different:
"css-optimizer-agent: Reduced bundle size by 47KB"

// Than this:
"Ada: I found 18 duplicate queries slowing down checkout.
      These farmers deserve faster tools. Let me fix this."

In an age of increasing automation, these human names became my north star. Every feature, every optimization, every bug fix serves real people building real communities around local food. The internet makes it easy to forget that. My named agents never let me.

The Orchestra Effect

The real power came from using multiple agents in parallel:

Me: "We're adding a new feature: recurring subscriptions for CSA boxes.
     Analyze all aspects."

// Agents working simultaneously:
Agatha: "Security review: Need to encrypt stored payment methods..."
Ada: "Performance impact: Subscription checks will add 50ms to each request..."
Erma: "User messaging: Need clear subscription terms and renewal notices..."
Carver: "Payment flow: Implementing Stripe subscriptions with webhooks..."
Diderot: "Documentation needed: API changes, user guides, webhook reference..."

// 2 hours later: Complete implementation plan with no gaps

The Parallel Universe Gambit

Mobile was killing us. Markets were leaving LocallyGrown because customers couldn't use their phones to shop. With my agent orchestra as a safety net, I attempted something audacious: building a completely modern front-end alongside the legacy-compatible one.

The Problem That Demanded a Solution

Our legacy system required pinch-zooming on phones. Market customizations made responsive design impossible. But forcing markets to abandon their customizations would break trust. I had to solve mobile without breaking anything.

Building Two Platforms in One

I created a parallel front-end that broke free from market customization constraints:

// Legacy route (preserves market CSS):
/market/products → old HTML structure, custom styles work

// Modern route (new theming system):
/shop/products → mobile-first, responsive, beautiful

The new theming system expanded beyond "anything goes" CSS:

interface MarketTheme {
  season: 'spring' | 'summer' | 'autumn' | 'winter';
  palette: 'earth' | 'ocean' | 'prairie' | 'mountain';
  accent: 'vibrant' | 'subtle' | 'warm' | 'cool';
}

// Generated 12 distinct, professional themes
// Each one tested on devices from iPhone SE to desktop

Progressive Rollout Strategy

Market managers could enable modern views by role:

marketSettings: {
  modernViews: {
    admin: true,      // Try it yourself first
    volunteers: true, // Then your helpers
    growers: false,   // Then your vendors
    customers: false  // Finally, everyone
  }
}

This meant markets could migrate at their own pace. No forced transitions, no angry customers confused by sudden changes.

The Complexity Explosion

This decision nearly broke me. The two months before launch became a complexity nightmare:

Duplicate Svelte server code for parallel routes
Double the testing surface
Context switching between legacy and modern patterns
State management across two UI paradigms

I mitigated this with a robust API layer:

// Instead of duplicating business logic:
// Legacy: +page.server.ts calls ProductService
// Modern: +page.server.ts calls ProductService

// I created:
// Legacy: +page.server.ts → /api/products → ProductService
// Modern: +page.server.ts → /api/products → ProductService

The Agent Safety Net

Without my agents, this parallel development would have failed:

Me: "I need to duplicate the checkout flow for modern UI"

Ray: "I'll ensure consistent responsive design across all new pages"
Agatha: "I'll verify both flows have identical security checks"
Ada: "I'll monitor performance - no flow should be slower"
Diderot: "I'll track which APIs serve both versions"
Erma: "I'll ensure error messages match between versions"

# 4 hours later: complete parallel checkout with zero security gaps

The Price of Innovation

Almost every bug in the frantic two weeks after launch traced back to this parallel UI period. Not because the tools failed, quite the opposite. The agents caught hundreds of potential issues. But the sheer complexity of maintaining two parallel systems while racing toward launch introduced edge cases even our vigilant orchestra couldn't predict.

// Real bug from launch week:
// Modern UI: Displayed product edit button for growers
// Legacy UI: Also displayed the button
// Backend: Correctly validated grower could edit their own products
// Frontend: {#if user.isAdmin || user.isManager} - forgot grower check
// Result: Growers couldn't see button to edit their own products

No Regrets

Despite the complexity and the post-launch firefighting, this parallel approach proved essential:

15 markets opted into modern UI within first week
Mobile usage noticeably increased in first month
Markets reported some customer confusion, but it was minimal
Gradual migration path preserved trust

The modern front-end fixed LocallyGrown's biggest weakness while respecting its greatest strength: each market's unique identity. Without the agent orchestra giving me confidence to attempt this parallel universe approach, we'd still be forcing markets to choose between mobile usability and their customizations.

What Claude Excels At

1. Boilerplate and Patterns

Claude wrote 80% of our API endpoints. They all follow the same pattern:

// I'd write:
"New endpoint: POST /api/markets/{id}/announcement
- Validate user is market manager
- Create announcement
- Notify all market users
- Return created announcement"

// Claude generates complete endpoint with:
- Input validation
- Authorization checks
- Error handling
- TypeScript types
- Transaction wrapping
- Logging

2. Test Generation

Me: "Generate comprehensive tests for OrderService pricing calculations"

Claude:
- "should charge only fixed fees when no products are delivered"
- "should maintain pricing consistency across spot and regular calculations"
- "should handle mixed delivery scenarios correctly"
- "changing product price after order creation does not affect order item price"
- "calculates spot price correctly when all items are missing"

3. Automated Code Review with BugBot

Even as a solo developer, I maintained professional workflows: proper branching, full pull requests, code reviews. Both Claude and Cursor's BugBot automatically reviewed PRs as they were created.

PR #287: "Fix order calculation for spot prices"

Claude: "The logic looks correct, but consider adding a test for
        when all items are marked as missing."

BugBot: "⚠️ Potential issue in line 147:
        - Missing validation: what if orderItem.delivered is negative?
        - Edge case: calculateSpotPrice doesn't handle null deliveredQuantity
        - Error handling: division by zero when market.growerPercentage is 100"

Me: [Adds validation, handles edge cases, prevents division by zero]

BugBot became my most valuable reviewer. While Claude had fresh context and could see things differently from the session that created the code, BugBot consistently caught:

Uncovered edge cases
Incomplete validations
Unhandled error conditions
Security implications I'd missed

I've long since stopped using Cursor to write code, but BugBot remains essential for automated review. It's like having a paranoid QA engineer who never sleeps and loves finding problems.

4. Documentation

Claude wrote most of my documentation by analyzing code:

/**
 * Calculates the final order total including all adjustments
 *
 * @param order - The order entity with items
 * @param market - Market configuration for fees
 * @param user - User for membership and credits
 *
 * @returns OrderTotal with breakdown of all components
 *
 * @throws InvalidOrderError if order contains invalid items
 * @throws InsufficientCreditsError if user credits don't cover total
 *
 * @example
 * const total = await calculateOrderTotal(order, market, user);
 * // Returns: { subtotal: 50.00, fees: 5.00, credits: -10.00, total: 45.00 }
 */

What Claude Struggles With

1. Business Logic Consistency

Claude Monday: "Membership fees should be charged when expired"
Claude Wednesday: "Membership fees should be charged after trial period"

Me: "We discussed this Monday. The rule is: charge when expired AND
     after trial period."

2. Large-Scale Refactoring

Claude can't hold enough context to safely refactor across 20+ files. I learned to break refactoring into smaller chunks:

Me: "Step 1: List all files importing the old UserService"
Claude: [Lists files]
Me: "Step 2: Update just the authentication methods in these 5 files"
Claude: [Updates those files]

3. Debugging Complex State

When bugs involved multiple services, queues, and timing issues, Claude would often fixate on the wrong area:

Claude: "The issue is in the email service"
Me: "Actually, I found it. The queue processor was using the wrong timezone"
Claude: "Ah yes, that would cause this exact symptom because..."

The Two-Week Launch Gauntlet

The Go/No-Go Framework

For two weeks before launch, I treated each day like an airline cockpit checklist: systematic, comprehensive, no shortcuts. Each day, I'd run the entire agent orchestra through production readiness checks, categorizing every finding:

# LAUNCH READINESS AUDIT - T-14 Days

## Issue Categories:
🚨 LAUNCH BLOCKERS - Must fix or launch fails
⚠️ CRITICAL - Fix within 24 hours
🔶 HIGH PRIORITY - Fix before Day 3 post-launch
🔷 MEDIUM - Fix within first week
🟢 LOW - Can wait for regular development

## Launch Decision Matrix:
- 0 Launch Blockers = GO
- 1-2 Launch Blockers (<12 hours to fix) = CONDITIONAL GO
- 3+ Launch Blockers OR any >12 hours = NO GO (postpone)

Each agent had specific focus areas and delivered structured reports:

Ada (T-72 hours): "Performance Audit Complete
🔴 CRITICAL: Severe Page Load Performance
  - First Contentful Paint: 27.0s
  - Largest Contentful Paint: 46.7s
  - Time to Interactive: 47.2s
  - Root cause: 325KB JavaScript chunks, no code splitting

🔴 Database Connection Pool Risk:
  - Production: 100 connections (67% of DB capacity)
  - Risk: Multiple batch operations could exhaust pool
  - Mitigation: Emergency cleanup functions exist ✅

🟠 Bundle Size: 16MB total, needs optimization

Launch Recommendation: CONDITIONAL - Can launch but needs immediate attention"

Agatha (T-72 hours): "Security Audit Complete
🚨 LAUNCH BLOCKERS: NONE IDENTIFIED ✅

All critical security components properly implemented:
✅ Modern password hashing (Argon2 with proper salts)
✅ Legacy password migration (MD5/bcrypt → Argon2) with deadlines
✅ Redis-backed sessions with secure cookies
✅ Rate limiting (3 attempts/24h for password resets)
✅ Multi-level authorization framework
✅ Market isolation with subdomain-based context

Launch Recommendation: GO from security perspective"

The audit wasn't just about finding problems. It tested actual user journeys with production data, verified every integration, and simulated failure scenarios. By T-24 hours, each agent would produce a 10-page markdown report with metrics, evidence, and specific remediation steps

Day 14: First Audit (July 31, 2025)

Me: "Full system audit. Are we ready to launch?"

Agatha: "BLOCKER: SQL injection possible in search endpoint
         BLOCKER: User sessions don't expire
         BLOCKER: No rate limiting on API"

Ada: "BLOCKER: Homepage takes 8 seconds to load
      WARNING: No database connection pooling"

Claude: "BLOCKER: 0% test coverage on payment system
        WARNING: Only 22% overall coverage"

Recommendation: NO-GO (Confidence: 95%)
Blockers must be resolved before launch.

Day 10: Progress Check

Me: "Re-audit with focus on blockers from day 14"

Agatha: "Previous blockers RESOLVED
         New issue: CORS configuration too permissive
         Status: CONDITIONAL-GO"

Ada: "Homepage load: 2.1s (acceptable)
      Database pooling implemented
      Status: GO"

Claude: "Payment system: 78% coverage
        Overall: 45% coverage
        Status: CONDITIONAL-GO (minimum viable)"

Recommendation: CONDITIONAL-GO (Confidence: 70%)
Can launch with monitoring for remaining issues.

Day 7: Cutting Scope to Ensure Stability

Me: "Agents are flagging Paraglide build warnings. Review needed."

Claude: "Build warnings from internationalization library:
- Paraglide/inlang throwing compilation errors
- Not breaking builds but creating noise
- Library for future multi-language support"

Me: "We're 7 days from launch. This is for future features.
     Ripping it out completely. Multi-language can wait."

[Removes entire i18n library and dependencies]

Me: "Better to launch with solid English-only than risk
     instability from unused future features."

Recommendation: GO
Reduced scope = reduced risk. Focus on what matters now.

Day 3: Final Systems Check

Me: "Final comprehensive audit before launch"

Team Consensus:
- Security: 14 issues fixed, 2 low-priority remaining
- Performance: All pages <2s load time
- Testing: 67% coverage, critical paths covered
- Documentation: API docs complete, user guides updated
- Payments: Fully functional with error handling

Recommendation: CONDITIONAL-GO (Confidence: 75%)

Remaining items (non-blocking):
1. Add rate limiting to search endpoint
2. Implement Redis caching for product lists
3. Add monitoring for payment webhooks

Day 1: Launch Eve Audit (August 13, 2025)

Me: "Final go/no-go decision needed"

All Agents: "Analyzing 2,847 files, 50,000+ lines of code..."

Results:
✅ Authentication: Secure
✅ Authorization: Market isolation verified
✅ Payments: Stripe integration stable
✅ Performance: <2s page loads
✅ Data integrity: No corruption paths found
✅ Error handling: All critical paths covered
⚠️ Test coverage: 87% (acceptable for launch)
⚠️ Documentation: 85% complete (acceptable)

Recommendation: GO (Confidence: 82%)

Agatha's final note: "No critical vulnerabilities. Ship it."
Ada's final note: "Performance acceptable. Monitor after launch."
Erma's final note: "User messaging is clear and helpful. Ready for real users."

The Launch Day Reality Check

The reports created by my agents were detailed enough that I could see where I needed to focus my attention. The issues they flagged as non-blocking remained non-blocking. The systems they certified as ready performed flawlessly. When the crisis hit (as described in Part 4), it wasn't in any of the areas they'd audited. It was in the simple things: math errors in calculations, UI buttons hidden from the wrong roles, invoices not grouping by grower, and input fields that couldn't accept decimal values.

Lessons Learned

1. These Are Tools, Not Developers

Claude Code doesn't understand business requirements, can't make product decisions, and won't catch when it's solving the wrong problem. The human is still the architect, product manager, and quality gatekeeper.

How to Keep ML Tools Safe in Production

✅ Write specs first: Markdown in-repo. Tools get the what; you keep the why. ✅ Enforce architecture with lint rules + CI: e.g., "no DB calls in routes" ✅ Require tests for every business logic change: No exceptions ✅ Security agent + human review for auth, payments, data exports ✅ Treat generated code like a junior dev's PR: Review, annotate, refine ✅ Maintain a "red file" of irreversible actions (deletes, refunds) with extra checks ✅ Monitor metrics: If an endpoint is 10x slower, investigate

2. Trust but Verify

Every piece of machine-generated code needs review. The agents sometimes flag false positives, like when they reported "missing" deployment automation that had been tested for weeks, or build warnings that weren't actually breaking anything. Human oversight catches both their mistakes and their over-caution.

3. Context Is Everything (Documentation as Specification)

The more context you provide, the better the output. I learned to document specs and implementation plans in markdown files that become part of every prompt:

# Feature: Order Refunds
## Requirements
- Partial and full refunds supported
- Audit trail for all refund actions
- Email notifications to customer and grower
## Implementation
- Use PaymentService.refundOrder()
- Store refund records in orderRefunds table
- Queue email via RefundNotificationJob

This markdown becomes Claude's blueprint. No ambiguity, no guessing.

CLAUDE.md serves as a readily consulted set of rules and standards for the entire project. When Claude remembers to check it, it helps enforce patterns like "use the service layer" and "follow our error handling conventions." When it forgets, at least the rules are documented for me to reference.

4. Specialized Agents > General Purpose

One Claude trying to do everything is good. Five specialized agents each doing what they're best at is transformative:

Security audits need security focus (Agatha)
Performance needs profiling mindset (Ada)
User messaging needs consistent clarity (Erma)
Payments need domain expertise (Carver)
Docs need synchronization rigor (Diderot)

5. Iterative Refinement Works

Don't expect perfect code on first generation. Treat it like pair programming:

Claude suggests
You critique
Claude refines
You approve

6. The Go/No-Go Ritual Is Essential

Daily automated audits during the two weeks before launch drove critical infrastructure improvements:

Database connection pooling (was at 67% capacity, risked saturation)
Redis caching implementation for product lists and market data
Health monitoring endpoints for external services
Emergency cleanup functions for connection pool exhaustion
Bundle size optimization (reduced from 325KB chunks)
Page load improvements (from 27s down to <3s)

Without the daily "are we ready?" question pushing infrastructure improvements, we would have launched with a system that couldn't handle production load.

7. These Tools Make You a Better Developer

Explaining problems clearly to Claude forced me to think more clearly. Reviewing machine-generated code taught me patterns I didn't know. Debugging tool mistakes strengthened my understanding. Managing a team of specialized analyzers strengthened my project management skills.

A Critical Warning: The "Vibe Coding" Disaster Waiting to Happen

I need to address something that terrifies me: YouTube videos of non-technical entrepreneurs using these tools to "build a SaaS in 10 minutes" and then immediately deploying to production.

Yes, it's amazing that someone with no coding experience can prompt their way to a working demo. But watching them charge customers for these applications fills me with horror. Here's what those videos don't show:

What "Working" Actually Means

Non-technical founder: "Look! The payment form works!"
Reality: - No input validation
         - Credit card numbers logged to console
         - SQL injection vulnerabilities everywhere
         - No error handling
         - Race conditions in order processing
         - No transaction rollbacks
         - Passwords stored in plain text

The LocallyGrown.net Reality Check

Even with three decades of programming experience, even with comprehensive tests, even with specialized security agents reviewing code, we STILL had:

Critical vulnerabilities that Agatha (security agent) missed
Payment edge cases that only surfaced in production
Performance issues that emerged under real load
Data integrity problems in complex workflows

And this was WITH me reviewing every single line, understanding the architecture, and knowing what to look for.

Why Human Expertise Is Non-Negotiable

These tools are powerful, but they:

Don't understand business consequences - They'll happily delete all user data if prompted wrong
Can't detect logical flaws - The code runs but the business logic is backwards
Miss security context - They don't know your specific threat model
Generate convincing-looking disasters - Code that appears professional but has fatal flaws
Have no concept of liability - When customer data leaks, you're responsible, not Claude

A skilled developer using these tools can build amazing things 3-5x faster. An unskilled person can build a lawsuit generator that happens to have a nice UI.

What's Next: The Future of Development

These tools won't replace developers. But developers who don't use them will be outpaced by those who do.

The division of labor:

Humans: Architecture, business logic, product decisions, quality control, ethics, liability
Tools: Pattern implementation, boilerplate reduction, test generation, documentation, consistency checking
Together: 3-5x productivity of either alone

The bottom line: If you're not technical enough to spot when these tools are generating disasters, you're not ready to deploy to production. Period.

With the platform stable and development velocity high, I can finally think beyond survival mode.

Part 6 will cover:

The feature roadmap enabled by modern architecture
Building for the next generation of local food systems
Open-sourcing components for other food communities
The business model evolution
Why small software serving real communities matters
The 20-year vision for local food infrastructure

After nearly losing everything, we're now positioned to build something that could outlive us all.

This is part five of a series documenting the rescue and modernization of LocallyGrown.net.

The Series

From Accidental Discovery to Agricultural Infrastructure (2002-2011)
The 23-Year Rescue Mission: Saving Agricultural Innovation from Technical Extinction
The Architecture Challenge: Translating 19 Years of Rails Logic to Modern SvelteKit
The Reality of Production: When Hope Meets Live Users
Lessons from the Solo Developer Using Modern Tools ← You are here
The Future: Building on Modern Foundations

45 KiB Raw Blame History Unescape Escape