← Ben Lai

Blog

14 notes

    • 1 min read

    Local models won the long tail

    The frontier wins demos. A 70B model on one good GPU wins the two-hundred-calls-a-day workflow nobody tweets about. The economics flipped this year.

    • 2 min read

    Customer support is product research wearing a costume

    I answered support tickets for the first six months of the company. Then I stopped, and within a quarter we shipped three features nobody asked for. The inbox knew what to build. I had stopped reading it.

    • 1 min read

    The eval you don't have, production already wrote for you

    Every customer-reported failure of an LLM feature is a test case you refused to file. The dataset is free; you're throwing it away nightly.

    • 2 min read

    Observability is a tax you pay before you owe it

    The first incident is when you discover you don't have logs. The second is when you discover the logs aren't searchable. By the third you have observability, and a story about how expensive it would have been not to.

    • 1 min read

    One worktree per agent

    Sub-agents and tool-loop trees are not parallelism. Git worktrees are. The unit of useful concurrency in a coding swarm is the working tree, not the model call.

    • 2 min read

    Your first hire is a multiplier or a manager. Pick.

    The two failure modes of first hires are "the second me," who duplicates what I already do, and "the senior person," who manages me instead of working. The good first hire does neither.

    • 1 min read

    The output token tax

    Inputs got cheaper this year. Outputs didn't. Your verbose agent is paying both halves of a bill that quietly stopped being symmetric.

    • 2 min read

    Postgres is a queue. Stop reaching for Kafka.

    A team I know spent six weeks operationalising Kafka for a workload doing two hundred messages per second. Postgres did the same job with forty lines and a SELECT FOR UPDATE SKIP LOCKED.

    • 2 min read

    Every agent should have a passport

    We onboard humans with identity, scope, audit, and approval. Agents get an env var and a system prompt. The next round of incidents will be about this.

    • 2 min read

    Pricing pages are written for people who won't buy

    I spent two weeks rewriting our pricing page. Conversion didn't move. What moved conversion was rewriting the sales email — the document everyone who actually paid had already read by then.

    • 2 min read

    The 3-person team is the new 50-person team

    A friend's startup did $4M in revenue last year with three engineers and a tax contractor. Their competitor did $6M with forty-two people. The competitor has more meetings.

    • 2 min read

    Your prompt isn't cache-shaped

    I asked a team what their cache hit rate was. The lead said "we cache responses, right?" That single misunderstanding was costing them about fourteen thousand dollars a month.

    • 2 min read

    The cheapest dependency is the one you delete

    I removed a logging library last quarter. Build time dropped twelve seconds, bundle dropped four hundred KB, the on-call rotation forgot it existed. We had been paying rent on it for three years.

    • 2 min read

    You're optimizing the wrong axis of agent cost

    I watched a team spend a quarter dropping their per-call latency by 40%. The bill kept going up. The bottleneck was never speed.