SP Engineering Strategy · Cheat Sheet

Why · Who · How · Quick Reference
The strategic creed in one page. Tape it next to your monitor.
For the full doc → Strategy · Daily tactics → Playbook · Tactical lookups → Tactical Cheat Sheet

The Short Version

Modular systems owned by empowered streams who ship small, safe, frequent changes that deliver customer outcomes. We measure what matters. We own what we build. We learn from what breaks.

Six Guiding Principles

Priority order: when two principles conflict, the lower-numbered one wins (#1 outranks #6).
  1. Customer Outcomes over Internal Convenience: We exist to serve users. Everything else enables this.
  2. Small, Safe, Frequent over Big & Risky: Make change safe by making it small.
  3. Clarity over Speed: Confusion is more expensive than deliberation.
  4. Ownership over Delegation: You build it, you run it, you fix it.
  5. Sustainable Pace over Heroics: Sustainable velocity beats max output in one sprint.
  6. Learning over Knowing: "I don't know, let's find out" is the smart move.

Platform Teams

Shared capabilities streams consume.
Core Platform: auth · infra · shared services
SP Admin: Internal admin tools (read-only into all)
Skills Validation: Assessments · skills · certifications

Value Streams

Each pod owns a stream end-to-end: build, test, deploy, operate.
  • Library P1 — source of truth for what content exists
  • Learning Experience P1 — engaging learning moments
  • Live-Training (SPLT) P2 — real-time instructor learning
  • VAR P2 — reseller management
  • Talent Marketplace P3 — capability ↔ perception gap
  • Platform Access P4 — the "front door"
  • Organized Learning P4 — institutions & structured programs
  • Performance Mgmt P5 — manager-driven upskilling

Who Owns This?

Login / SSO UX → Platform Access
Auth infra / SDK → Core Platform
Subscriptions UX → Platform Access
Stripe SDK → Core Platform
Courses · lessons · paths → Library
Classes · groups · LTI → Organized Learning
Quizzes · skills · certs → Skills Validation
Reseller orders → VAR
SPLT sessions → Live-Training
Profiles · portfolios → Talent Marketplace
Internal admin → SP Admin
CI/CD · shared infra → Core Platform
Unclear? Streams resolve directly. Stuck > 48h → escalate to Eng Leadership.
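
The ownership mapping and the 48-hour rule can be encoded as a small lookup, for instance to route tickets. This is a minimal sketch: the dictionary keys and the names OWNERS, owner_of and should_escalate are hypothetical, and only the mappings and the 48-hour threshold come from this sheet.

```python
from datetime import timedelta
from typing import Optional

# Capability -> owning team, copied from the ownership list.
# The keys are illustrative labels, not identifiers from a real system.
OWNERS = {
    "login-sso-ux": "Platform Access",
    "auth-infra-sdk": "Core Platform",
    "subscriptions-ux": "Platform Access",
    "stripe-sdk": "Core Platform",
    "courses-lessons-paths": "Library",
    "classes-groups-lti": "Organized Learning",
    "quizzes-skills-certs": "Skills Validation",
    "reseller-orders": "VAR",
    "splt-sessions": "Live-Training",
    "profiles-portfolios": "Talent Marketplace",
    "internal-admin": "SP Admin",
    "ci-cd-shared-infra": "Core Platform",
}

ESCALATION_THRESHOLD = timedelta(hours=48)


def owner_of(capability: str) -> Optional[str]:
    """Return the owning team, or None when ownership is unclear."""
    return OWNERS.get(capability)


def should_escalate(unresolved_for: timedelta) -> bool:
    """True once an ownership question has been stuck for more than 48 hours."""
    return unresolved_for > ESCALATION_THRESHOLD
```

A None result means the streams involved settle ownership between themselves; should_escalate only turns true after the 48-hour mark has passed.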

Three Team Types

  • Stream-aligned — own a business segment end-to-end. Build, test, deploy, operate.
  • Platform — self-service capabilities. Streams are customers. Thinnest Viable Platform.
  • Enabling — temporary. Teach to fish. Success = no longer needed.

Interaction Modes

Collaboration: Days-weeks · novel problems · pairing
X-as-a-Service: Ongoing · mature capability · self-serve
Facilitating: Weeks-months · transfer skill · ends
Evolution: Collaboration → Facilitating → X-as-a-Service.

Team Size · 7 ± 2

Size  Paths  Effect
5     10     Tight, shared context
7     21     Manageable
9     36     Subgroups forming
12    66     Two teams pretending
Below 5: no resilience, vacations break it. Above 9: split.
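
The Paths figures match the number of pairwise links among n people, n(n-1)/2. The sheet does not state this formula; it is inferred from the four rows, all of which it reproduces. A minimal sketch, with communication_paths as a hypothetical helper name:

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct person-to-person links in a team of `team_size`."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


for size in (5, 7, 9, 12):
    print(size, communication_paths(size))
# prints:
# 5 10
# 7 21
# 9 36
# 12 66
```

Growing from 9 to 12 people adds only three members but 30 paths, which is why the sheet says to split above 9.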

Modular Monolith · The Rules

  • One artifact, one CI/CD pipeline, atomic deploys.
  • Domain-centric modules aligned with stream ownership.
  • Explicit interfaces only — public APIs, events, queues. No reaching into internals.
  • Shared infra in Core Platform — auth, authz, DB, cache.
  • No direct DB access across module boundaries.
  • Stack: Laravel · Vue 3 · Tailwind/UnoCSS · Pinia · Vite · AWS · GitHub Actions
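
The "explicit interfaces only" rule can be enforced mechanically, for example as a lint step in CI. The sketch below is an illustration in Python rather than the Laravel stack, and it assumes a convention this sheet does not define: each module exposes its public surface under a segment named api (e.g. library.api), and everything else is internal. All names are hypothetical.

```python
PUBLIC_SEGMENT = "api"  # assumed convention: <module>.api is the only public surface


def violates_boundary(importing_module: str, imported_path: str) -> bool:
    """True when `importing_module` reaches into another module's internals.

    `imported_path` is a dotted path such as "library.api.courses".
    """
    parts = imported_path.split(".")
    target_module = parts[0]
    if target_module == importing_module:
        return False  # a module may use its own internals freely
    # Cross-module access is allowed only through the public segment,
    # so a bare "library" or "library.models..." both count as violations.
    return len(parts) < 2 or parts[1] != PUBLIC_SEGMENT


print(violates_boundary("learning", "library.api.courses"))    # False
print(violates_boundary("learning", "library.models.course"))  # True
```

The same check covers the "no direct DB access across module boundaries" rule if models and repositories live outside the public segment.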

Daily Flow · Idea → Customer

Commit: trunk
Build: < 5 min
Test: < 5 min
Review: < 3 hr
Merge: daily
Deploy: < 5 min
Monitor: always
Trunk-based · short-lived branches · feature flags · daily deploys per stream · no code freezes.

Deployment Strategies

Rolling: Default
Feature flags: Gradual rollout
Blue-green: Zero-downtime infra
Canary: High-risk · subset traffic
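
Gradual rollout behind a feature flag is commonly done by hashing each user into a stable bucket and enabling the flag for the first N percent of buckets. The sheet does not say how SP implements flags, so this is a sketch of one common technique; rollout_bucket, is_enabled and the flag name are hypothetical.

```python
import hashlib


def rollout_bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """True when the user falls inside the first `rollout_percent` buckets."""
    return rollout_bucket(flag, user_id) < rollout_percent


# A user keeps the same bucket across requests, so raising the percentage
# only ever adds users; nobody who already has the feature loses it.
print(is_enabled("new-checkout", "user-42", 0))    # False
print(is_enabled("new-checkout", "user-42", 100))  # True
```

Including the flag name in the hash keeps different flags from always selecting the same early users.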

Incident Response

Sev  Respond      Resolve
P1   < 15 min     < 4 hr
P2   < 1 hr       < 24 hr
P3   < 4 hr       < 1 week
P4   Next sprint  As prioritized
Blameless post-mortems for P1/P2. Fix forward. Share learnings.
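
The response and resolution targets translate directly into deadlines from the moment an incident is opened. A minimal sketch, assuming clock time (not business hours) and using the hypothetical names SLA and deadlines; P4 is left out because its targets are not time-based.

```python
from datetime import datetime, timedelta

# (respond within, resolve within) per severity, from the incident table.
# P4 has no clock-based target ("Next sprint" / "As prioritized"), so it is omitted.
SLA = {
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(hours=1), timedelta(hours=24)),
    "P3": (timedelta(hours=4), timedelta(weeks=1)),
}


def deadlines(severity: str, opened_at: datetime):
    """Return (respond_by, resolve_by), or None for severities without timed targets."""
    targets = SLA.get(severity)
    if targets is None:
        return None
    respond_within, resolve_within = targets
    return opened_at + respond_within, opened_at + resolve_within


opened = datetime(2024, 1, 1, 9, 0)
respond_by, resolve_by = deadlines("P1", opened)
print(respond_by.time(), resolve_by.time())  # 09:15:00 13:00:00
```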

Causal Chain · How Metrics Connect

Healthy Teams → Efficient Dev → Healthy Systems → Great Experience → Business Results
Tier 1 problems show up in Tiers 2-5 in 3-6 months. Treat team health as a leading indicator.

Metrics That Matter

Tier           Key Metric · Target
T1 · Team      DevEx ≥ 4.0 · Stability ≥ 95%
T2 · Dev       Daily deploys · cycle < 24h · CFR < 5%
T3 · System    MTTR < 4h (P1) · API p95 < 200ms
T4 · Customer  Reliability ≥ 99.5% · Adoption ≥ 25%
T5 · Business  MAU · CLTV · NPS ≥ 50
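
As a worked example for the T2 row, the check below reads CFR as change failure rate (failed deploys divided by total deploys) and "cycle" as median cycle time in hours. Both readings are assumptions, since the sheet does not define either term, and the function names are hypothetical.

```python
def change_failure_rate(deploys: int, failed_deploys: int) -> float:
    """Share of deployments that caused a failure, as a fraction between 0 and 1."""
    if deploys <= 0:
        raise ValueError("deploys must be positive")
    return failed_deploys / deploys


def meets_t2_targets(deploys: int, failed_deploys: int, median_cycle_hours: float) -> bool:
    """True when CFR is under 5% and cycle time is under 24 hours."""
    cfr_ok = change_failure_rate(deploys, failed_deploys) < 0.05
    cycle_ok = median_cycle_hours < 24
    return cfr_ok and cycle_ok


# 40 deploys with 1 failure is a 2.5% CFR; 2 failures is exactly 5%, which misses "< 5%".
print(meets_t2_targets(40, 1, 18))  # True
print(meets_t2_targets(40, 2, 18))  # False
```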

Anti-Metrics · Refuse to Measure

  • Lines of code — incentivizes bloat
  • Features shipped — without adoption, it's maintenance burden
  • Overall coverage % — 90% on critical paths beats 80% of everything
  • Hours worked — inversely correlated with sustainable pace

Team State · Diagnostic

Falling Behind: Backlog grows · firefighting
Treading Water: Critical only · no improvement
Repaying Debt: Active improvement · velocity ↑
Innovating: Debt low · new value
Quarterly diagnostic, by number of ✓: 10-12 healthy · 7-9 watch · <7 intervene.
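
The quarterly thresholds map a count of passed checks to an action. A minimal sketch; the 12-check maximum is inferred from the top band (10-12) and is not stated outright, and diagnose is a hypothetical name.

```python
MAX_CHECKS = 12  # inferred from the "10-12" band, not stated in the sheet


def diagnose(checks_passed: int) -> str:
    """Classify a quarterly diagnostic score as healthy, watch, or intervene."""
    if not 0 <= checks_passed <= MAX_CHECKS:
        raise ValueError(f"checks_passed must be between 0 and {MAX_CHECKS}")
    if checks_passed >= 10:
        return "healthy"
    if checks_passed >= 7:
        return "watch"
    return "intervene"


print(diagnose(11), diagnose(8), diagnose(4))  # healthy watch intervene
```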

Anti-Patterns · Avoid

  • Long-lived branches
  • Manual deployments
  • Testing only in production
  • Hero deployments (one-person ship)
  • Ignoring flaky tests
  • Big bang releases
  • Cross-module DB access
  • God modules · premature service extraction
  • Platform team as ticket queue
  • Enabling that creates dependency

Decide When Stuck

  1. Customer outcome — does this serve users?
  2. Smallest slice — can we ship a thin version?
  3. Sustainable — can we sustain this pace?
  4. Document — write down the decision (ADR).
  5. Communicate — tell affected teams.