Engineering Strategy

Purpose · SPENG · Confluence 1826422789

Why We Exist

We translate vision into code — building the technical foundation for the 3D design community so creators, educators, and engineers can focus on creativity, not complexity.

Our Core Challenge

A fragmented technical foundation that slows our ability to deliver value and makes change expensive.

Our Strategic Response

Consolidate around a modular, well-documented architecture with empowered teams who own outcomes end-to-end.

The Diagnosis · Four Observable Symptoms

Symptom	How It Manifests
Inconsistent Architecture	Multiple patterns across applications; difficult onboarding. No shared architectural contract.
Outdated Technology	Legacy systems hard to update, secure, or scale. Fragmentation makes modernization high-risk.
Slow Velocity	Features take longer than they should. Engineers spend cognitive load navigating inconsistency.
Quality Concerns	Insufficient testing; bugs escape to production. No unified quality bar.

The Bottleneck · Stocks & Flows

The bottleneck is not any single technology or team — it's the rate at which we can safely change the system. When change is expensive and risky, everything slows.

↓ Inflow · Grows the stock

Rushed features
Inconsistent patterns
Team churn
Skipped documentation

Stock · Accumulates

Technical Debt

Tribal knowledge
Undocumented decisions

Outflow · Drains the stock ↑

Documentation
Refactoring
Knowledge sharing
Small, safe deploys
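
The stock-and-flow framing can be made concrete with a toy model. The sketch below is illustrative only: the function name `simulate_debt` and every number in it are assumptions, not measurements. It shows that the debt stock shrinks only while outflow exceeds inflow.

```python
def simulate_debt(initial: float, inflow: float, outflow: float, periods: int) -> list[float]:
    """Return the debt stock after each period; the stock never drops below zero."""
    stock = initial
    history = []
    for _ in range(periods):
        # Inflow (rushed features, churn) adds debt; outflow (docs, refactoring) removes it.
        stock = max(0.0, stock + inflow - outflow)
        history.append(stock)
    return history


# Hypothetical rates, in arbitrary "units of debt" per period.
growing = simulate_debt(initial=100, inflow=12, outflow=8, periods=4)
draining = simulate_debt(initial=100, inflow=8, outflow=12, periods=4)
```

With these made-up rates, the first scenario gains 4 units per period and the second loses 4. The level of effort is the same; only the balance between inflow and outflow differs.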

Outcomes · Magic Wand · 2-year horizon

What Success Looks Like

If everything went perfectly over the next two years, what would be unmistakably different?

For Customers

Faster time-to-value — first meaningful milestone in the first session.
Reliable experience — zero friction in critical learning moments.
Continuous improvement — the platform visibly gets better.

For Engineering Teams

Confident shipping — daily deploys without fear. Rollbacks rare and painless.
Clear ownership — every system has a team; every team knows their boundaries.
Sustainable pace — velocity over quarters, not bursts.

For the Business

Predictable delivery — leadership trusts engineering timelines.
Efficient investment — every dollar traceable to impact.
Scalable foundation — new products build on the platform, not around it.

Decision Framework · Ordered by priority · 1 outranks 6

Six Guiding Principles

Principles only matter when they're hard — when following them has a cost. Each names a real tension and declares which side we choose when forced to pick.

#	Principle	Rationale
01	Customer Outcomes over Internal Convenience	We exist to serve users. Everything else enables this.
02	Small, Safe, Frequent over Big and Risky	The best way to serve customers is to deliver continuously and safely.
03	Clarity over Speed	Sustainable delivery requires shared understanding.
04	Ownership over Delegation	Clarity means little without accountability.
05	Sustainable Pace over Heroics	Ownership only works if teams can sustain the load.
06	Learning over Knowing	Continuous learning enables everything above.

1. Customer Outcomes Over Internal Convenience

Tension: It's easier to optimize for our own workflows than for user experience.

Our stance: When internal convenience conflicts with customer value, customer value wins.

What This Looks Like

Kill features that don't serve users — even if we spent months building them.
Instrument user behavior, not just system health.
Talk to customers regularly — not just about customers.
Measure by user outcomes (learning velocity, time-to-value), not engineering output.

Anti-pattern: Building what's easy instead of what's needed. Celebrating launches instead of adoption.

When it bends: When serving customers requires unsustainable pace (see #5). Burnout doesn't serve customers long-term.

2. Small, Safe, and Frequent Over Big and Risky

Tension: Large releases feel significant. Big launches carry big risk.

Our stance: Small, incremental changes deployed frequently. We make change safe by making it small.

What This Looks Like

Features ship behind feature flags, enabled incrementally.
Deploy to production daily (or more).
PRs small enough to review in one sitting.
Break large projects into shippable slices, not phases.
Celebrate learning from small experiments — not just big wins.

Anti-pattern: The "big bang" release. Branches that live for weeks. "We'll test it all at the end."

When it bends: Some changes (infra migrations, security patches) genuinely can't be sliced small — invest extra in testing, rollback, and blast-radius limits.

3. Clarity Over Speed

Tension: Moving fast feels productive. Stopping to document feels like overhead.

Our stance: Slower with clarity beats faster with ambiguity. Confusion is more expensive than deliberation.

What This Looks Like

Write down decisions (ADRs) and share them.
Define "done" before starting, not after finishing.
Ownership boundaries are explicit and documented.
Ask clarifying questions even when it slows the meeting.
Say "I don't understand" without embarrassment.

Anti-pattern: Tribal knowledge. Hallway decisions that never get documented. Two teams accidentally building the same thing.

When it bends: In genuine emergencies (P1 incidents, security breaches) — act first, document after. Always come back to document.

4. Ownership Over Delegation

Tension: It's tempting to push problems to other teams. Fragmented ownership creates fragmented accountability.

Our stance: Teams own outcomes end-to-end — from user need to production operation.

What This Looks Like

Stream-aligned teams own their entire domain — not just pieces.
The team that builds it runs it. No handoffs to "ops."
On-call rotation is within the owning team.
Fix the problem first, then discuss root cause.
Ownership is documented and discoverable.

Anti-pattern: "That's not my job." Systems with no clear owner. Finger-pointing post-mortems.

When it bends: True cross-cutting concerns (auth, infrastructure) belong to platform teams. But stream-aligned teams still own integration and experience.

5. Sustainable Pace Over Heroics

Tension: Crunch delivers short-term results. Heroics create burnout, attrition, and hidden quality debt.

Our stance: Sustainable velocity over time beats maximum output in any single sprint.

What This Looks Like

Staff for sustainable capacity, not minimum viable headcount.
Celebrate teams that maintain velocity, not individuals who sacrifice weekends.
Say "no" or "not yet" to protect focus.
Measure team health as seriously as system health.
Burnout = systemic failure, not personal weakness.

Anti-pattern: Hero programmers. "We'll rest after launch." Celebrating 80-hour weeks.

When it bends: Genuine, rare emergencies (critical security, major outages). Acknowledged, followed by recovery.

6. Learning Over Knowing

Tension: It feels good to have the answer. Admitting uncertainty feels weak.

Our stance: We value learning over performing expertise. The smartest thing we can say is often "I don't know — let's find out."

What This Looks Like

Run experiments before committing to large investments.
Say "I was wrong" and "I changed my mind" publicly.
Post-mortems are blameless and focused on learning.
Hire for growth mindset, not just current skill.
Create space for failure in service of learning.

Anti-pattern: "We've always done it this way." Punishing reasonable risks that didn't pan out. Only celebrating successes.

When it bends: Some decisions are not experiments — they're commitments. Once we've learned enough, we commit and execute.

Worked Example

When Principles Conflict

Scenario: A team has discovered a significant user pain point. Fixing it requires a large, complex change that's hard to slice. The team is already at capacity.

Applying the Hierarchy

#1 Customer Outcomes says: we should fix it.
#2 Small, Safe, Frequent says: find a way to slice it.
#5 Sustainable Pace says: don't overload the team.

Resolution: Find the smallest slice that addresses the core pain (honor #2). If no safe slice exists, negotiate scope reduction or timeline rather than crunching (honor #5). Document why and communicate to stakeholders (honor #3).

Measurement · Outcomes > activity

How We Measure Success

We measure outcomes, not activity. We connect engineering metrics to business results. We treat team health as a leading indicator.

The Causal Chain

Our metrics connect engineering health to business outcomes. Each tier predicts the next.

Tier	Focus	Signals
Tier 1 · Leading	Healthy Teams	DevEx, clarity, stability, sustainable pace.
Tier 2	Efficient Dev	Deploy frequency, cycle time, change-failure rate.
Tier 3	Healthy Systems	MTTR, latency, Core Web Vitals, coverage.
Tier 4	Great Experience	Time-to-value, adoption, reliability.
Tier 5 · Lagging	Business Results	MAU, CLTV, NPS, engineering ROI.

Tier 1 → Tier 2 → Tier 3 → Tier 4 → Tier 5

↩ Business results fund continued investment in team health · the loop closes

Tier 1 · Team Health (leading — problems here show up elsewhere in 3-6 months)

Metric	Target	Why It Matters
Developer Experience	≥ 4.0 / 5.0	Poor DevEx creates friction that slows everything.
Clarity Score	≥ 4.2 / 5.0	"I know what's expected and why my work matters."
Sustainable Pace	≥ 80%	% sprints where planned ≈ completed (±15%).
Team Stability	≥ 95%	Quarterly retention. Churn destroys velocity.
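
The Sustainable Pace metric can be computed directly from sprint history. This is a minimal sketch, assuming each sprint is recorded as a (planned, completed) pair in the same unit (points, tickets, or similar); the function name and the sample data are hypothetical.

```python
def sustainable_pace(sprints: list[tuple[float, float]], tolerance: float = 0.15) -> float:
    """Share of sprints whose completed work is within +/- tolerance of planned work."""
    if not sprints:
        raise ValueError("need at least one sprint")
    on_pace = sum(
        1
        for planned, completed in sprints
        if planned > 0 and abs(completed - planned) <= tolerance * planned
    )
    return on_pace / len(sprints)


# Hypothetical history: (planned, completed) per sprint.
history = [(40, 38), (40, 30), (35, 36), (42, 50), (40, 41)]
score = sustainable_pace(history)  # 3 of 5 sprints land inside the band
```

Both over-delivery and under-delivery count as misses here, which matches the "planned ≈ completed" wording: a sprint that finishes far more than planned is also a planning signal.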

Tier 2 · Development Efficiency

Metric	Target	Why It Matters
Deploy Frequency	Daily / team	Frequent small releases reduce risk.
PR Cycle Time	< 24 hours	Long cycles = process friction.
Change Failure Rate	< 5%	Speed without quality = recklessness.
Onboarding Time	< 2 weeks	First prod commit. Slow = doc/complexity debt.
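
Two of the Tier 2 metrics fall out of a simple deploy log. The sketch below assumes a hypothetical `Deploy` record with a date and a flag for deploys that later needed a rollback, hotfix, or incident; how that flag is set is left to the team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deploy:
    day: str                  # ISO date of the deploy
    needed_remediation: bool  # rollback, hotfix, or incident attributed to it


def change_failure_rate(deploys: list[Deploy]) -> float:
    """Fraction of deploys that needed remediation."""
    if not deploys:
        raise ValueError("no deploys recorded")
    return sum(d.needed_remediation for d in deploys) / len(deploys)


def days_without_deploy(deploys: list[Deploy], working_days: list[str]) -> list[str]:
    """Working days on which nothing shipped (misses against the daily target)."""
    shipped = {d.day for d in deploys}
    return [day for day in working_days if day not in shipped]


# Hypothetical week for one team.
week = ["2024-03-04", "2024-03-05", "2024-03-06", "2024-03-07", "2024-03-08"]
log = [
    Deploy("2024-03-04", False),
    Deploy("2024-03-04", False),
    Deploy("2024-03-05", False),
    Deploy("2024-03-06", True),
    Deploy("2024-03-08", False),
]
cfr = change_failure_rate(log)
gaps = days_without_deploy(log, week)
```

In this made-up week the team misses both targets: one failed deploy out of five is 20%, and one working day had no deploy.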

Tier 3 · Technical Health

Metric	Target	Why It Matters
MTTR	< 4h P1 / < 24h P2	Recovery speed beats prevention.
API p95	< 200 ms	Performance is a feature; slow is broken.
Core Web Vitals	"Good" per Google	User-perceived performance drives satisfaction.
Critical-Path Coverage	≥ 90%	Coverage on what matters > overall %.
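
For the API p95 target, it helps to be explicit about how the percentile is computed. The sketch below uses the nearest-rank definition, which is one of several in common use; monitoring tools may interpolate differently, so treat it as an illustration rather than a description of any tool's behavior. The latency samples are invented.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [80, 95, 110, 120, 90, 105, 130, 150, 85, 100,
                115, 125, 140, 160, 170, 180, 190, 210, 450, 98]
p95 = percentile_nearest_rank(latencies_ms, 95)
meets_target = p95 < 200
```

Most of these requests are fast, yet the slow tail pushes p95 over the 200 ms target. That is why the target is set on a percentile and not on the mean.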

Tier 4 · Customer Experience · Tier 5 · Business Impact

Customer (T4)	Target
Time to First Value	Falling YoY
Learning Velocity	Baseline + X% YoY
Perceived Reliability	≥ 99.5%
Feature Adoption (30d)	≥ 25%

Business (T5)	Target
MAU	Business targets
CLTV	Increase YoY
NPS	≥ 50
Engineering ROI	Quarterly review

Anti-Metrics · What We Refuse to Measure

Lines of code: incentivizes bloat.
Features shipped: features without adoption are a maintenance burden.
Overall test coverage %: 80% coverage of the wrong code is worth less than 50% coverage of the critical paths.
Hours worked: incentivizes presence over impact and is inversely correlated with sustainable pace.

Team State · Diagnostic

Adapted from Will Larson's team-state model. Diagnose where each team needs support.

Falling Behind

Backlog grows weekly · morale declining · constant fire-fighting.

Fix: Add capacity or reduce scope. Find quick wins to rebuild confidence.

Treading Water

Critical work gets done but no improvement work happens.

Fix: Reduce WIP. Consolidate focus. Shift to team-based metrics.

Repaying Debt

Active improvement work · velocity increasing.

Fix: Protect the time. Don't interrupt with new priorities.

Innovating

Debt is low · team focused on new value creation.

Fix: Maintain slack. Make work visible and valued.

Governance & Cadence

Cadence	Activity	Participants
Weekly	Team-level Tier 2 review	Pod leads
Bi-weekly	Cross-team blockers, dependencies	Eng leadership
Monthly	Tier 1 + Tier 3 review	Eng + People
Quarterly	Full review · OKR check-in · T4-T5	Eng + Product + Exec
Annually	Strategy refresh · target updates	Eng leadership

Ownership · If you can't point to exactly one stream, that's a gap

SP Ecosystem · Who Owns What

Every system, module, and data domain has exactly one owning stream. When something breaks or evolves, there's no ambiguity. Streams own, consume, or collaborate.

Platform Teams · Capabilities streams consume

Core Platform · Platform

Identity, auth, infrastructure, shared services. Provides the foundation every stream depends on.

auth · infrastructure · users (core)

SP Admin · Platform

Internal admin tools for content, users, accounts, orders. Read-only access into all streams.

admin

Skills Validation · Platform

Skill evaluation, assessments, certifications. Source of truth for skills data consumed by other streams.

assessments · skills · achievements

Value Streams · Direct customer value

Library · Pod 1

Single source of truth for what learning content exists. Help users find the right content.

content · library · search

Learning Experience · Pod 1

World-class learning that helps users build skills through engaging, measurable learning moments.

learning-paths

Live-Training (SPLT) · Pod 2

Exceptional live learning — connecting learners with expert instructors in real time.

splt

VAR · Pod 2

Enable resellers to efficiently manage their customers and grow on our platform.

var

Talent Marketplace · Pod 3

Close the gap between a professional's true capability and how hiring managers perceive them.

career · profiles

Platform Access · Pod 4

The "front door" — how users discover, access, and manage their relationship with SolidProfessor.

access · onboarding · subscriptions · billing

Organized Learning · Pod 4

Empower educational institutions and organizations to deliver structured, measurable learning programs.

classes · groups · LTI

Performance Management · Pod 5

Empower managers to develop their teams — identify skill gaps, track growth, drive targeted upskilling.

analytics · reports

Stream Dependency Graph · Core Platform sits at the center

Every stream consumes from Core Platform. Bold arrows mark the heaviest integrations.

Resolved Boundary Decisions · Avoid Future Ambiguity

Area	Decision
Payments	Core Platform owns Stripe SDK · Platform Access owns payment UI & subscription logic.
Skills data	Skills Validation owns evaluation · Talent Marketplace consumes for display.
Class content	Library owns content · Organized Learning references & tracks class-specific progress.

"Who Owns This?" · Quick Lookup

If you need to know about…	Owned by
User login, SSO, authentication UX	Platform Access
Auth infrastructure, Auth SDK	Core Platform
Subscriptions, billing UX	Platform Access
Payment infra, Stripe SDK	Core Platform
Course content, lessons, paths	Library
Classes, groups, assignments, LTI	Organized Learning
Quizzes, assessments, certifications, skills	Skills Validation
Reseller management, VAR orders	VAR
Live training, SPLT sessions	Live-Training
Professional profiles, portfolios	Talent Marketplace
Internal admin tools	SP Admin
Infrastructure, CI/CD, shared services	Core Platform

Escalation Path · When Ownership Is Unclear

Streams attempt to resolve directly.
Unresolved within 48 hours → escalate to Engineering Leadership.
Decision documented within 1 week. ADR created if significant.

Team Topologies · Conway's Law in reverse

Team Design Philosophy

We design teams around cognitive load: the amount of complexity a team can effectively own. We want stable, empowered teams with clear boundaries that deliver value without constant coordination overhead.

The test of good team design: Can each team wake up, understand their mission, do meaningful work, and ship it to users without waiting on other teams?

Cognitive Load · The Organizing Constraint

Domain

Business rules, workflows, edge cases.

Technical

Codebase size, patterns, integrations.

Operational

Deploys, monitoring, on-call.

Coordination

Cross-team dependencies, comms overhead.

Team Size · 7 ± 2 Principle

Communication paths grow quadratically with team size. We target 5–9 people.

Team size	Communication paths	Effect
5	10	Tight, full shared context.
7	21	Manageable; some info loss.
9	36	Overhead rising; subgroups form.
12	66	Two teams pretending to be one.
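
The path counts above follow the standard pairwise formula n(n − 1) / 2: every person can talk to every other person, and each pair is counted once. A minimal sketch (the function name is ours):

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct person-to-person pairs in a team of the given size."""
    if team_size < 0:
        raise ValueError("team size cannot be negative")
    return team_size * (team_size - 1) // 2


paths = {n: communication_paths(n) for n in (5, 7, 9, 12)}
```

Going from 9 to 12 people adds three members but 30 paths, which is why growth past the upper bound costs more than it appears to.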

Three Team Types

Stream-aligned

Direct customer value

Own a business segment end-to-end. Build, test, deploy, operate. Minimal dependencies on other stream teams.

Healthy: ships frequently, clear boundaries, on-call manageable, new engineers productive in 2-4 weeks.

Platform

Self-service capabilities

Treat platform as an internal product. Stream teams are customers. Aim for the Thinnest Viable Platform.

Healthy: streams self-serve, docs current, changes rarely break consumers, capacity goes to capability not requests.

Enabling

Capability transfer

Temporary. Teach teams to fish. Success = no longer needed. Work across teams but not simultaneously.

Healthy: stream becomes self-sufficient, engagement has a clear end date, capability transferred (not performed).

Three Interaction Modes

Mode	Purpose	Duration	Best For	Warning Sign
Collaboration	Discover solutions together	Days-weeks	New APIs, novel problems	Never ends → split or graduate
X-as-a-Service	Consume capabilities self-serve	Ongoing	Mature, stable capabilities	Can't self-serve → still collaboration
Facilitating	Transfer knowledge & skills	Weeks-months	Building new capabilities	Creates dependency, not capability

Evolution pattern: Collaboration → Facilitating → X-as-a-Service.
Example: Core Platform introduces feature flagging. (1) Collaborate with one stream to design the API. (2) Facilitate adoption by the other streams. (3) Offer it X-as-a-Service: all streams self-serve via docs.

Team Health · Quarterly Diagnostic

10-12 ✓ = Healthy · 7-9 ✓ = Watch closely · < 7 ✓ = Intervention needed.

Cognitive Load

New engineer productive in < 4 weeks?
Team can explain domain in 10 minutes?
Team owns a coherent domain?
Members can vacation without heroics?

Autonomy

Can ship without waiting on other teams?
Controls own roadmap priorities?
Resolves most incidents without escalation?
Clear ownership with adjacent teams?

Sustainability

On-call ≤ 1 week per 4-6 weeks per person?
Completing planned work most sprints?
Team stable past 6 months?
Attrition / burnout absent or rare?
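
The scoring bands can be applied mechanically once the twelve questions are answered. A minimal sketch, assuming one check mark per "yes" answer; the function name is hypothetical and the thresholds are the ones stated at the top of this diagnostic.

```python
def team_health_status(checks_passed: int, total_checks: int = 12) -> str:
    """Map the number of 'yes' answers to the diagnostic band."""
    if not 0 <= checks_passed <= total_checks:
        raise ValueError("checks_passed must be between 0 and total_checks")
    if checks_passed >= 10:
        return "Healthy"
    if checks_passed >= 7:
        return "Watch closely"
    return "Intervention needed"


statuses = [team_health_status(n) for n in (12, 9, 6)]
```

The band is a prompt for a conversation, not a verdict: two teams with the same score can be failing entirely different questions.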

Common Failure Modes

Failure Mode	Warning Signs	Response
Team too large	Subgroups · comms overhead · diluted ownership	Split along domain boundaries
Team too small	Unsustainable on-call · single points of failure	Merge with adjacent team or hire
Platform bottleneck	Streams waiting on Platform · Platform backlog growing	Invest in self-service · push back on custom requests
Enabling = dependency	Same team repeatedly enabled · no end date	Set exit criteria · assess transfer
Fuzzy ownership	"I thought your team owned that" · dropped balls	Document & publicize ownership
Cognitive overload	Long onboarding · tribal knowledge · firefighting	Split team · transfer ownership · simplify domain

Architecture · Early draft

Modular Monolith

A single deployable application organized into well-bounded domain modules. It provides modularity's organizational benefits without the operational complexity of distributed systems.

The rule: Modules have clear boundaries and communicate through explicit interfaces. No reaching into another module's internals.

Why We Chose It

Benefit	How It Helps
Simpler ops	One deployment pipeline. No service mesh. No distributed tracing complexity.
Easier debugging	Single process. Straightforward request tracing.
Strong boundaries	Enforces separation of concerns without network overhead.
Evolutionary path	Modules can be extracted to services later if needed.

Four Core Principles

Domain-centric modules — each module owns its routes, services, models, migrations, APIs. Module ownership aligns with stream ownership.
Explicit interfaces — public APIs, events (Laravel events), queues for async. Never access another module's internals.
Shared infrastructure — auth, authz, design system, DB, cache, queues all live in Core Platform.
Single deployment — one artifact, unified CI/CD, atomic deploys, shared versioning.

Module Structure

# Each module owns its full vertical slice
app/
└── Modules/
    ├── Library/              # Library stream
    │   ├── Controllers/
    │   ├── Models/
    │   ├── Services/
    │   ├── Routes/
    │   └── Database/
    ├── Assessment/           # Skills Validation
    ├── OrganizedLearning/    # Organized Learning
    ├── PlatformAccess/       # Platform Access
    ├── VAR/                  # VAR
    ├── SPLT/                 # Live-Training
    ├── SolidCareer/          # Talent Marketplace
    └── Core/                 # Shared infra (Core Platform)

Module ↔ Stream Mapping

Module	Stream
auth	Core Platform
infrastructure	Core Platform
users	Core Platform
admin	SP Admin
access / onboarding	Platform Access
subscriptions / billing	Platform Access
content / library	Library
learning-paths	Library
classes / groups	Organized Learning
assessments / quizzes	Skills Validation
skills / achievements	Skills Validation
var	VAR
splt	Live-Training
career / profiles	Talent Marketplace
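
The "exactly one owning stream" rule is easy to check automatically if ownership is kept as data. The sketch below is an assumption about how that could look, not a description of existing tooling: `CLAIMS` is a partial copy of the mapping above, and `ownership_gaps` is a hypothetical helper that reports any module with zero owners or more than one.

```python
from collections import defaultdict

# Partial copy of the module-to-stream mapping, as (module, stream) claims.
CLAIMS = [
    ("auth", "Core Platform"),
    ("users", "Core Platform"),
    ("billing", "Platform Access"),
    ("library", "Library"),
    ("classes", "Organized Learning"),
]


def ownership_gaps(modules: list[str], claims: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return modules that do not have exactly one owner, with whatever owners they do have."""
    owners = defaultdict(set)
    for module, stream in claims:
        owners[module].add(stream)
    return {m: sorted(owners[m]) for m in modules if len(owners[m]) != 1}


# "analytics" has no claim in this partial list, and the extra claim
# below creates a deliberate double-ownership for illustration.
gaps = ownership_gaps(
    ["auth", "billing", "analytics"],
    CLAIMS + [("billing", "Core Platform")],
)
```

An empty result means every listed module resolves to one stream. Anything else is a gap to take through the escalation path.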

Tech Stack

Layer	Technology
Backend	Laravel (PHP) · primary framework
Frontend	Vue 3 (target) · Vue 2/Nuxt 2 migrations ongoing
Styling	TailwindCSS · UnoCSS (utility-first)
State	Pinia
Build	Vite
Infrastructure	AWS · Terraform (IaC via Core Platform)
CI/CD	GitHub Actions · templated pipelines

When to Deviate

Modular Monolith is our default. Deviate only when:

Scenario	Possible Response
Extreme isolated scaling	Extract to service (e.g., real-time analytics)
Fundamentally different tech	Separate service (e.g., Python for ML inference)
Strict team autonomy	Micro-frontend or microservice

Before deviating: Document in an ADR. Extraction is the exception, not the norm.

Anti-Patterns

Direct DB access across modules · hidden coupling.
Circular dependencies · modules must have clear dependency direction.
God modules · one module doing everything defeats the purpose.
Premature extraction · don't build microservices until you've proven you need them.
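
Circular dependencies are the one anti-pattern here that a small script can detect. The sketch below runs a depth-first search over a module dependency map. The module names come from the structure above, but the dependency edges are illustrative assumptions, and `find_cycle` is a hypothetical helper, not part of any existing tooling.

```python
from typing import Optional


def find_cycle(deps: dict[str, set[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of modules, or None if the graph is acyclic."""
    unvisited, in_progress, done = 0, 1, 2
    state = {module: unvisited for module in deps}
    path: list[str] = []

    def visit(module: str) -> Optional[list[str]]:
        state[module] = in_progress
        path.append(module)
        for dep in sorted(deps.get(module, ())):
            if state.get(dep, unvisited) == in_progress:
                # Reached a module that is still on the current path: a cycle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, unvisited) == unvisited:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[module] = done
        return None

    for module in sorted(deps):
        if state[module] == unvisited:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


healthy = {"Core": set(), "Library": {"Core"}, "OrganizedLearning": {"Core", "Library"}}
tangled = {**healthy, "Library": {"Core", "OrganizedLearning"}}
```

Here `find_cycle(healthy)` returns `None`, while `find_cycle(tangled)` returns `["Library", "OrganizedLearning", "Library"]`. A check like this could run in CI once module dependencies are declared somewhere machine-readable.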

Delivery Flow · Idea → Customer · in hours, not days

Operating Practices

We optimize for flow — the speed at which value moves from idea to customer. Small batches, fast feedback, automated pipelines, clear ownership.

The CI/CD Pipeline

Stage	Target
Commit	trunk-based
Build	< 5 min
Test	< 5 min
Review	< 3 hr
Merge	daily
Deploy	< 5 min
Monitor	continuous

Development Principles

Principle	Practice
Small batches	PRs small, focused, reviewable in one sitting
Trunk-based	Short-lived branches (< 1 day) · frequent merges to main
Feature flags	Decouple deploy from release · ship dark · enable incrementally
Continuous testing	Tests run on every commit · broken builds block merges
Ownership	The stream that builds it runs it and fixes it
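
"Enable incrementally" usually means a percentage rollout in which each user gets a stable yes/no answer. The sketch below shows one common way to do that with a hash bucket. It is not a description of any particular flagging product, and the flag name, user IDs, and function names are all hypothetical.

```python
import hashlib


def rollout_bucket(flag: str, user_id: str) -> int:
    """Deterministically map a (flag, user) pair to a bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """True if the user falls inside the current rollout percentage for this flag."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return rollout_bucket(flag, user_id) < rollout_percent


# Code ships dark at 0%, then the percentage is raised step by step.
users = [f"user-{i}" for i in range(200)]
enabled_at_10 = {u for u in users if is_enabled("new-editor", u, 10)}
enabled_at_50 = {u for u in users if is_enabled("new-editor", u, 50)}
```

Because the bucket depends only on the flag and the user, raising the percentage only ever adds users: everyone enabled at 10% is still enabled at 50%. Including the flag name in the hash keeps different flags from always selecting the same early cohort.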

Deployment & Release

Strategy	When Used
Rolling	Default for most changes
Feature flags	New features · gradual rollout
Blue-green	Infrastructure · zero-downtime
Canary	High-risk · subset validation

Cadence	Rule
Target	Daily deploys per stream
Hotfixes	Ship immediately when needed
Code freezes	None — except critical incidents

Incident Response

Severity	Definition	Response	Resolution
P1	Service down · major user impact	< 15 min	< 4 hr
P2	Degraded · significant impact	< 1 hr	< 24 hr
P3	Minor · workaround exists	< 4 hr	< 1 week
P4	Low impact · scheduled fix	Next sprint	As prioritized
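
The response and resolution targets can be checked against incident timestamps. This is a minimal sketch under stated assumptions: the `TARGETS` table copies P1 to P3 from above (P4 has no fixed clock, so it is left out), and the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# (response target, resolution target) per severity, from the table above.
TARGETS = {
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(hours=1), timedelta(hours=24)),
    "P3": (timedelta(hours=4), timedelta(weeks=1)),
}


def sla_breaches(severity: str, opened: datetime, acknowledged: datetime, resolved: datetime) -> list[str]:
    """Return which targets were missed; targets are strict ('<'), so equal counts as a miss."""
    response_target, resolution_target = TARGETS[severity]
    breaches = []
    if acknowledged - opened >= response_target:
        breaches.append("response")
    if resolved - opened >= resolution_target:
        breaches.append("resolution")
    return breaches


# Hypothetical P1: acknowledged in 10 minutes, resolved after 5.5 hours.
opened = datetime(2024, 3, 4, 9, 0)
p1_result = sla_breaches("P1", opened, opened + timedelta(minutes=10), opened + timedelta(hours=5, minutes=30))
```

In this example the response target is met but the 4-hour resolution target is missed, so the post-mortem would focus on time to resolve, not time to notice.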

Post-incident: Blameless post-mortems for all P1/P2 · root cause + timeline + remediation documented · action items tracked to completion · learnings shared across engineering.

Observability

Pillar	Tool	Purpose
Monitoring	Datadog	System health, dashboards
Logging	Datadog	Centralized logs, search
Alerting	Datadog	Notify on-call for issues
Tracing	Datadog	Request flow across services

Strategic Initiatives · Current Priorities

Initiative	Goal	Metrics Impacted
Unified Observability	Consolidate monitoring in Datadog	MTTR ↓
Enhanced IaC	All infra via Terraform	Deployment reliability ↑
Vue 3 Migration	Complete frontend modernization	DevEx ↑ · Web Vitals ↑
Modular Monolith	Align codebase with stream boundaries	Cycle time ↓ · Onboarding ↓
Automated Incident Response	Self-healing for common issues	MTTR ↓ · Incident rate ↓

Anti-Patterns

Long-lived branches · merge conflicts, delayed feedback.
Manual deployments · error-prone bottlenecks.
Testing in production only · customers find bugs.
Hero deployments · one person knows how to ship.
Ignoring flaky tests · erodes trust in the suite.
Big bang releases · high risk, hard to debug, stressful.

For the daily playbook · processes, rituals, and standards

This section is the strategy view of how we ship. The tactical view — branching, code review, PR sizing, feature flags, testing standards, work types, on-call recipes — lives in the Playbook.

Guide

How to Use This Document

These documents are living. If something isn't working, raise it. We review quarterly.

New to engineering?

Start with Why We Exist and SP Ecosystem to understand who owns what.

Making a decision?

Check Six Guiding Principles — they're ordered by priority. #1 outranks #6.

Building something?

Modular Monolith defines our approach. The Playbook defines the daily process.

Proposing a change?

Raise it. Documents are reviewed quarterly. If a principle isn't helping you decide, tell us — it needs to be fixed or removed.

Modular systems. Empowered streams. Small, safe, frequent changes.
