Key takeaways
- CelX used Skene as a continuous build loop — not a one-off audit — running it at key milestones to detect friction, surface architecture gaps, and translate product intent into implementation-ready changes.
- Three focused runs (onboarding, security, repo-deep diagnostics) reframed onboarding from "teach the product" to "get the user to an irreversible value moment fast," and pushed analytics to be a first-class product surface rather than "charts later."
- Skene's growth + skills runs triggered the biggest outcome: a pivot from seat-based SaaS to Agents-as-a-Service (AaaS), where users pay for work delivered and outcomes achieved — not software access.
- Concrete systems shipped under this cadence: a two-layer analytics + experimentation backbone; usage-based pricing with metering, limits, dunning, and an invoice explainer; a ranked experimentation backlog with two experiments shipped end-to-end; an API lifecycle wrapper; and approval-gated agent execution.
- The repo-deep phase (after moving from Google AI Studio to Codex in Terminal) made Skene decisive rather than merely helpful: diagnostics became a product capability rather than an engineering bottleneck.
Company / Product
CelX is an operations platform designed to route intake, execute work through both humans and agents, and track measurable outcomes end-to-end.
From the start, CelX was built with Skene as a continuous product + engineering copilot: not as a one-off audit tool, but as an iterative "build loop" we ran at key milestones to detect friction, surface architecture gaps, and translate product intent into implementation-ready changes.
The problem we were solving
Early-stage product teams don't fail because they can't ship features. They fail because they ship the wrong features in the wrong order, without instrumentation, without security discipline, and without a business model that matches reality.
We had three core risks:
- Onboarding leakage. Users drop before first value, and we don't know why.
- Analytics blindness. We can't reliably measure activation, friction, or retention because events aren't designed for product learning.
- Security and operational risk. Agentic systems amplify exposure — mis-handled secrets, user-data boundaries, and unsafe automation become existential.
Skene became the system we used to repeatedly answer: "What's the highest-leverage fix to ship next, and how do we implement it cleanly?"
How we used Skene, iteration by iteration
Run 1 — Onboarding drop-off + analytics build path
The first Skene run focused on activation reality: where users were dropping, and what we needed to measure to stop guessing.
Skene outputs pushed us toward:
- identifying onboarding friction points and missing steps in the "first value" path
- prioritising an analytics architecture that could support funnel diagnostics, not just operational telemetry
- implementing an activation loop designed around rapid value delivery (the "deploy immediately" mindset), rather than long tutorial flows
What changed in the product as a result:
- We reworked onboarding thinking from "teach the product" to "get the user to an irreversible value moment fast."
- We treated analytics as a first-class product surface and built toward event clarity and experimentation readiness — not "charts later."
This was also when we were building in Google AI Studio, where speed was high but repo-level depth was limited.
Run 2 — Security leaks (user data + API keys)
We ran Skene again with a security lens, specifically looking for leakage patterns that are common in early agentic builds:
- user data boundary mistakes
- unsafe handling of credentials and keys
- anything that could cause cross-tenant exposure or accidental external actions
This run drove concrete security hardening and, critically, reinforced that agentic products need explicit operational safety rules, not "best effort." That philosophy maps directly to how CelX implements approval-gated automation and side-effect controls.
What changed in the product as a result:
- We moved from "agent features" to agent governance: explicit states, approvals, audit trails, and event traceability.
- We aligned workflows with risk-based execution — high-risk actions are gated and observable.
Run 3 — Moving from Google AI Studio to Codex in Terminal (deep repo + diagnostics)
Once we moved away from Google AI Studio to Codex in Terminal, Skene went from helpful to decisive — because Skene could now operate repo-deep:
- tracing flows across UI entry points and service layers
- mapping instrumentation gaps to exact insertion points
- producing implementation plans that were minimal, local-runnable, and adapter-friendly
At this stage, Skene helped us build in-platform diagnostics for the agent architecture, so we weren't dependent on bug reports or terminal-only debugging to understand failures.
That is a major shift in how agentic products mature: diagnostics become a product capability, not an internal developer luxury. This aligns with CelX's architecture split — frontend shells + service layer + adapters — so diagnostic signals could be emitted through the same event surfaces used for analytics and experimentation.
The breakthrough: SaaS thinking → Agents-as-a-Service
The biggest outcome wasn't a feature. It was a model shift.
Skene's growth + skills runs didn't just propose "monetisation tactics" — they helped us reframe the product as outcome delivery, where:
- users don't pay for software access
- they pay for work delivered (jobs executed) and outcomes achieved
That change cascaded into:
- how we define value
- how we package the product
- how we design the "Jobs" system
- how we measure success
CelX pivoted from a conventional SaaS posture to Agents-as-a-Service (AaaS), and Skene became the tool that continuously "understood the new model," reinforcing the build direction toward outcome-based execution. This is visible in how the platform is defined: routing intake, executing work via humans, agents, and the hybrid model, and tracking measurable outcomes.
What we actually built with Skene (concrete system components)
1. A real analytics + experimentation backbone (not just telemetry)
Skene pushed CelX from "some events exist" to an analytics model designed for:
- conversion (freemium → paid)
- usage-based pricing
- experimentation (exposure / conversion)
- funnel drop-off diagnostics
This includes:
- a two-layer event concept (operational events + analytics / experimentation events)
- experiment analysis utilities (exposures, conversions, lift, statistical tests)
- funnel drop-off computation with recommended baseline steps
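The funnel drop-off computation can be sketched as a small utility over the analytics-layer events (not operational telemetry). The step names and event shape below are illustrative assumptions, not CelX's actual schema:

```python
from collections import defaultdict

# Hypothetical baseline funnel steps; CelX's actual step names are not public.
FUNNEL_STEPS = ["signup_started", "signup_completed", "first_job_created", "first_value_reached"]

def funnel_dropoff(events):
    """Compute per-step reach and drop-off for an ordered funnel.

    `events` is an iterable of (user_id, event_name) pairs from the
    analytics/experimentation event layer.
    """
    users_at_step = defaultdict(set)
    for user_id, name in events:
        if name in FUNNEL_STEPS:
            users_at_step[name].add(user_id)

    report = []
    prev = None
    for step in FUNNEL_STEPS:
        # A user counts at a step only if they also reached every earlier step.
        reached = users_at_step[step] if prev is None else users_at_step[step] & prev
        dropoff = 0.0 if prev is None or not prev else 1 - len(reached) / len(prev)
        report.append({"step": step, "users": len(reached), "dropoff": round(dropoff, 3)})
        prev = reached
    return report
```

The intersection with the previous step's cohort is what makes this a funnel report rather than raw event counts: it surfaces exactly where ordered progress stops.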
2. Usage-based pricing built as a first-class system
Skene didn't just suggest pricing — it helped implement a metering + limits + dunning + invoice-explanation system that is testable behind flags and safe to run locally.
Key outcomes:
- clear usage dimensions (API calls, events tracked, storage, job outcome fees)
- threshold notifications and limit-reached events
- overage prediction logic
- a mock-safe dunning automation state machine
- an invoice explainer for transparency
This matters for AaaS because pricing has to map to workload and outcomes, not seats.
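A "mock-safe" dunning state machine of the kind described above can be sketched as a transition table that only records events, never charges cards or sends emails. The state names and transitions here are illustrative assumptions:

```python
# Illustrative dunning transitions; real grace periods and retry policy would
# live in billing configuration, not in code.
DUNNING_TRANSITIONS = {
    ("current", "payment_failed"): "grace",
    ("grace", "payment_succeeded"): "current",
    ("grace", "grace_expired"): "past_due",
    ("past_due", "payment_succeeded"): "current",
    ("past_due", "retries_exhausted"): "suspended",
    ("suspended", "payment_succeeded"): "current",
}

class DunningMachine:
    """Mock-safe: transitions only emit event markers (an audit trail)."""

    def __init__(self):
        self.state = "current"
        self.events = []

    def handle(self, signal):
        nxt = DUNNING_TRANSITIONS.get((self.state, signal))
        if nxt is None:
            return self.state  # ignore signals that are invalid in this state
        self.events.append({"from": self.state, "signal": signal, "to": nxt})
        self.state = nxt
        return self.state
```

Keeping the side effects out of the machine is what makes it testable behind flags and safe to run locally: the event list is the only output, and a separate adapter decides whether anything external actually happens.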
3. Experimentation as a shipping discipline (backlog → top 2 shipped)
Skene drove an experimentation system that's not theoretical:
- a ranked backlog
- two experiments implemented end-to-end (assignment, exposure, conversion, analysis)
This turned growth from "ideas" into a repeatable shipping loop — the same shape as a well-designed growth loop.
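The assignment → exposure → conversion → analysis chain can be sketched with deterministic bucketing and a relative-lift calculation. The hashing scheme and variant names are assumptions for illustration:

```python
import hashlib

def assign_variant(experiment_id, user_id, variants=("control", "treatment")):
    """Deterministic bucketing: the same user always lands in the same variant."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def lift(exposures, conversions):
    """Relative lift of treatment over control.

    `exposures` and `conversions` map variant name -> count, as accumulated
    from exposure and conversion events.
    """
    rates = {v: conversions.get(v, 0) / exposures[v] for v in exposures}
    if rates["control"] == 0:
        return None  # lift is undefined with zero control conversions
    return rates["treatment"] / rates["control"] - 1
```

Deterministic assignment matters because exposure events can then be logged at render time without a server round trip, and re-exposures never flip a user between variants mid-experiment.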
4. API lifecycle as a product contract (versioning, deprecation, templates)
For an agentic platform, APIs aren't just endpoints — they are long-lived contracts. Skene guided an API lifecycle wrapper:
- endpoint catalog
- versioning + deprecation policies
- changelog + migration templates
- deprecation notice generation
- webhook contract testing
- integration health scoring
That's the kind of maturity usually added far later. Skene pulled it forward.
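The catalog + deprecation-notice pieces of such a wrapper can be sketched as below. The field names and dates are hypothetical, not CelX's actual catalog schema:

```python
from datetime import date

# Hypothetical endpoint catalog; fields mirror the lifecycle concerns
# (versioning, deprecation, sunset) but the entries are illustrative.
CATALOG = [
    {"path": "/v1/jobs", "version": "v1",
     "deprecated_on": date(2024, 6, 1), "sunset_on": date(2024, 12, 1)},
    {"path": "/v2/jobs", "version": "v2",
     "deprecated_on": None, "sunset_on": None},
]

def deprecation_notices(catalog, today):
    """Generate notices for endpoints that are deprecated but not yet sunset."""
    notices = []
    for ep in catalog:
        if ep["deprecated_on"] and ep["deprecated_on"] <= today:
            if ep["sunset_on"] is None or today < ep["sunset_on"]:
                notices.append(
                    f"{ep['path']} is deprecated; migrate before {ep['sunset_on']}"
                )
    return notices
```

Because the catalog is data rather than code, the same source can drive changelog generation, migration templates, and integration health checks without duplicating the version policy.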
5. Safe automation by default (approval-gated agent runs)
Skene reinforced that agentic execution must be observable and controllable:
- no external side effects before approval
- explicit task states
- retry limits, heartbeats, audit trails
- event markers for every transition
This is a core unlock: you can ship agents without inheriting chaos.
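The approval gate and explicit task states described above can be sketched as follows; the state names, retry limit, and API are illustrative assumptions rather than CelX's implementation:

```python
class AgentTask:
    """Sketch of approval-gated agent execution: the external side effect
    cannot run before an explicit approval, and every transition is
    recorded as an audit-trail event marker."""

    MAX_RETRIES = 3  # illustrative retry limit

    def __init__(self, action):
        self.action = action      # callable that performs the external side effect
        self.state = "proposed"
        self.retries = 0
        self.audit = []

    def _move(self, new_state):
        self.audit.append({"from": self.state, "to": new_state})
        self.state = new_state

    def approve(self):
        if self.state != "proposed":
            raise RuntimeError("only proposed tasks can be approved")
        self._move("approved")

    def run(self):
        if self.state != "approved":
            raise RuntimeError("side effects are blocked until approval")
        try:
            result = self.action()
            self._move("done")
            return result
        except Exception:
            self.retries += 1
            self._move("failed" if self.retries >= self.MAX_RETRIES else "retryable")
            raise
```

Emitting the audit trail through the same event surface as analytics is what lets diagnostics and experimentation read from one stream instead of two.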
Why this worked: Skene as a continuous build loop
Skene functioned less like a static toolkit and more like a repeatable operating system for product progress:
- Detect what's leaking (onboarding, activation, security, monetisation).
- Translate into a minimal implementation plan that fits the repo architecture.
- Implement safely (flags, mocks, adapters, approval gating).
- Measure impact with experiments and funnels.
- Iterate with the next run, now grounded in real product reality.
This is why we were able to get to a V1 build and a full business-model pivot: Skene kept us out of the trap of building "SaaS features" and pushed us toward an outcome-native, diagnostics-first, measurable agent platform.
Results (qualitative outcomes, without overclaiming numbers)
With Skene integrated from the start, CelX achieved:
- clear onboarding + activation direction grounded in "first value fast" principles
- a structured analytics and experimentation foundation capable of funnel diagnosis and iteration
- security hardening and operational safety appropriate for agentic execution
- in-product diagnostics for agent execution, so debugging becomes a product capability — not an engineering bottleneck
- a successful pivot from SaaS to Agents-as-a-Service, with monetisation and Jobs design aligned to outcomes
What we'd tell another team considering Skene
If you're building an agentic product, Skene is most valuable when you treat it as a recurring operating cadence:
- run it early to expose activation + instrumentation gaps
- run it again when security and execution safety become non-negotiable
- run it repo-deep once your architecture stabilises
- re-run after every major product model shift (like SaaS → AaaS) so the build path stays aligned with the new value function
