Engineering Q&A
Companion to the SP Engineering Playbook — answers to questions that come up in real engineering work at SolidProfessor. This is a living document; add to it as new questions arise.
Technical Practices
Value Streams ↗ playbook
What is a value stream?
A product-facing unit of ownership at SolidProfessor. Each stream owns a set of systems, data, and capabilities — and is responsible for the end-to-end customer experience in that area.
SP has eight product streams (Library, Learning Experience, Organized Learning, Platform Access, VAR, Live-Training, Performance Management, Talent Marketplace) and three capability teams (Skills Validation, Core Platform Team, SP Admin) that the product streams consume from.
I'm not sure which stream owns my system. What do I do?
Start at Value Streams (Experiences) on Confluence. Each stream's page lists its Systems Owned and Boundary Clarifications.
If you use Claude, the /value-stream skill loads all stream pages at once so you can ask "who owns X?" against the full set.
How do I handle cross-stream changes?
Read the relevant streams' Boundary Clarifications first — many cross-stream questions are already resolved there. If yours isn't, loop in both streams' tech leads before merging. Don't push changes to another stream's owned systems without their review.
Build vs. Adopt ↗ playbook
When should we build something custom?
Only when it creates meaningful differentiation — usually the learning experience itself, or a domain problem nothing existing solves well. The default for everything else is to adopt a proven solution.
Ask: "Does building this ourselves give us a meaningful advantage over adopting a proven solution?" If no, every hour spent building is traded away from work that moves the needle.
What's "undifferentiated heavy lifting"?
An AWS term for work that's necessary for the product to function but creates zero competitive advantage. SSR, CI/CD, auth flows, build pipelines — table-stakes infrastructure.
Differentiation is what we build on top of that infrastructure: library search, recommendations, lesson player, retention features.
We could build it faster than evaluating frameworks. Why not just build?
Because the build is the minority of the cost. Industry data: 70%+ of total cost of ownership is maintenance, 40% of IT budget is consumed by tech debt (Gartner), and 35% of large custom IT projects are abandoned (McKinsey).
Judging the decision by the initial build estimate is a category mistake — maintenance, not the build, dominates the cost. Adopting the framework is rarely slower over the year that follows.
How do I document what we're choosing not to build?
Add a short "What we are choosing not to build, and why" section to the design doc or PR description. Keeping the opportunity cost visible is the discipline — it prevents quiet accumulation of custom infrastructure and makes the trade-off explicit for future decisions.
Trunk-Based Development ↗ playbook
Why must branches live less than a day?
Because long branches accumulate divergence from main. Every additional day increases the chance of merge conflicts, integration bugs, and stale assumptions about the current state of the codebase. Accelerate identifies short-lived branches as one of the strongest predictors of elite engineering performance.
What if my work isn't done in a day?
Decompose it. The work isn't "done in a day" — but a small, releasable slice of it is. Use SPIDR slicing or the Hamburger Method to find one shippable slice.
If the feature can't be safely exposed yet, ship the code behind a feature flag and merge to main dark.
Can I use a long-lived feature branch for big features?
No. Big features are built incrementally on main, with feature flags gating user-visible behavior. Long-lived branches are how you get the merge-hell problems trunk-based development is explicitly designed to avoid.
Do I rebase or merge from main?
Rebase, daily, while your branch is still open. It keeps history linear and conflicts small. The longer you wait, the worse the rebase. Squash on merge.
Branching Strategy ↗ playbook
What's the prefix for what kind of work?
feature/ for new capability, enhancement/ for improving existing features, bugfix/ for prod bugs, defect/ for in-sprint defects (caught before prod), hotfix/ for urgent prod fixes, techdebt/ for debt repayment, maintenance/ for short-lived cleanup. Format: <prefix>/SPPLT-XXXXX.
What if I don't have a JIRA ticket?
Get one before branching. If the work is too small to ticket, it's probably also too small to branch — bundle it with the next related work. If it's truly necessary, escalate to your PM to scope and ticket it; the trail matters for traceability and release notes.
bugfix or defect — what's the difference?
Bug = caught in production. Defect = caught before production (in dev, QA, or UAT). Same fix, different prefix — and the distinction matters for QA metrics and what we learn from it.
Semantic Versioning ↗ playbook
Why do platforms and packages have different versioning rules?
Because the version number serves different audiences. Platforms ship to end users on a calendar, so the version answers when did this go live (year/sprint/in-sprint cadence). Packages ship to other apps that depend on them, so the version answers what kind of change is this — can I upgrade safely? (breaking/feature/fix).
What counts as a "breaking change" in a package?
Any change that could break a consumer using the documented public API: removed exports, changed function signatures, renamed props, removed config options, changed default behavior. If a consumer's working code might stop working after upgrading, it's MAJOR — full stop, regardless of how small the change feels.
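A concrete sketch of the boundary, using a hypothetical package function (not a real SP API):

<?php
// Hypothetical package function — illustrative only.

// v1.4.0 shipped:       renderLesson(string $lessonId): string
// MINOR bump (v1.5.0):  add an optional parameter — every existing call still works.
// MAJOR bump (v2.0.0):  make that parameter required — every existing call now breaks.

function renderLesson(string $lessonId, bool $autoplay = false): string
{
    return "<video data-lesson=\"{$lessonId}\"" . ($autoplay ? ' autoplay' : '') . '></video>';
}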
When do I bump a platform version?
You don't — the release process does. Platform versions follow the calendar/sprint cadence: MAJOR at the start of each year, MINOR at the end of each sprint, PATCH within a sprint for fixes. You never pick the number based on the change type.
Feature Flagging ↗ playbook
When does something need a feature flag?
Default to flagging anything that could go wrong in prod for paying customers — schema-adjacent changes, new integrations, behavior changes, anything you'd want to flip off without redeploying. Also: anything that lets you ship dark to main while iterating.
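A minimal sketch of flag-gated code, assuming Laravel Pennant as the flag client — substitute whatever flag library the repo actually uses. The flag, controller, and view names are hypothetical:

<?php

namespace App\Http\Controllers;

use Laravel\Pennant\Feature;

class LessonPlayerController extends Controller
{
    public function show(string $lessonId)
    {
        if (Feature::active('new-lesson-player')) {
            // New code path ships dark on main; flipping the flag exposes it
            // to users without a redeploy — and flips it back off the same way.
            return view('lessons.player-v2', ['lessonId' => $lessonId]);
        }

        return view('lessons.player', ['lessonId' => $lessonId]);
    }
}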
What's the cleanup stage and why does it get skipped?
After a feature is fully rolled out, you delete the flag and remove the branching logic from the code. It gets skipped because shipping is done and there's no urgency — but stale flags are tech debt, and the conditional code rots.
The discipline: open a techdebt/SPPLT-XXXXX branch immediately after rollout to remove the flag.
Can a flag be permanent?
Yes — flags that gate genuinely permanent differences (e.g., free vs paid tiering, region-specific behavior) live forever. Make the permanence explicit in the flag's description so it doesn't get cleaned up by mistake.
Database Changes ↗ playbook
Can I write raw SQL for a one-time fix?
No. Use a OneTimeOperation class — the laravel-one-time-operations package runs each operation exactly once and tracks completion. Raw SQL against prod is unauditable, unrepeatable across environments, and skips the safety net of testing in preview first.
What's an OTO?
A One-Time Operation. A class for production data changes (backfills, conversions, cleanups) that runs as part of a deploy, exactly once. It's tracked by the package, named descriptively, and committed to the repo as the audit trail.
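Roughly the shape the package scaffolds via php artisan operations:make — the operation name, table, and backfill logic here are hypothetical:

<?php
// operations/2025_06_01_000000_backfill_lesson_durations.php — hypothetical example.

use Illuminate\Support\Facades\DB;
use TimoKoerber\LaravelOneTimeOperation\OneTimeOperation;

return new class extends OneTimeOperation
{
    // Process through the queue so a long backfill doesn't block the deploy.
    protected bool $async = true;

    public function process(): void
    {
        // Chunked so a large table doesn't exhaust memory; the package
        // guarantees this runs exactly once per environment.
        DB::table('lessons')
            ->whereNull('duration_seconds')
            ->chunkById(500, function ($lessons) {
                foreach ($lessons as $lesson) {
                    DB::table('lessons')
                        ->where('id', $lesson->id)
                        ->update(['duration_seconds' => 0]);
                }
            });
    }
};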
Should I delete an OTO file after running?
No. Keep the file. The package tracks completion, and the file is the permanent audit record of what data change ran in production. Deleting it loses the history.
Modifying a column — what gets dropped?
Anything you don't re-specify. When you write a Laravel migration that modifies a column, you must repeat all previously-defined attributes; otherwise they get dropped silently. Always check the existing column definition before writing the modify migration.
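A sketch of the gotcha against a hypothetical users.phone column:

<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::table('users', function (Blueprint $table) {
            // Existing definition:  $table->string('phone', 20)->nullable();
            // Widening to 32: nullable() must be repeated — write only
            // $table->string('phone', 32)->change() and the column silently
            // becomes NOT NULL.
            $table->string('phone', 32)->nullable()->change();
        });
    }
};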
Continuous Integration ↗ playbook
Can I rerun the build to skip the flake?
Not without investigating. "Just rerun it" is how flakes become permanent — every successful retry trains the team to ignore the signal. Root-cause it (real failure vs flake) before the rerun. If it's truly flaky, quarantine the test with a JIRA link in the same PR, then fix in a follow-up.
How do I handle a flaky test?
Quarantine via it.skip() (or your framework's equivalent) with a JIRA link in the comment, in the same PR where you discovered the flake. Don't just delete the test. Then fix the root cause in a follow-up on a defect/ branch tied to that JIRA ticket.
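In a PHPUnit suite, the equivalent looks like this — hypothetical test, placeholder ticket number:

<?php
// Inside the existing test class — the test body stays for the follow-up fix.

public function test_progress_syncs_after_lesson_complete(): void
{
    // Quarantined: intermittent timing failure, see SPPLT-XXXXX.
    // Root cause gets fixed in a follow-up defect/ branch; don't delete the test.
    $this->markTestSkipped('Flaky — SPPLT-XXXXX');

    // ...original arrange/act/assert stays in place...
}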
Should I copy a workflow into my repo or reuse the central one?
Reuse the central one in solidprofessorhub/.github via uses: solidprofessorhub/.github/.github/workflows/<name>.yml@main. Duplicating workflow logic across repos creates drift — when we improve the central workflow, every consumer benefits automatically.
Testing ↗ playbook
Why "test behavior, not implementation"?
Tests that bind to implementation break every time you refactor — even when the behavior is correct. Tests that bind to behavior change only when requirements change. The first kind erodes trust and slows refactoring; the second kind enables it.
Should I mock the database?
Usually no. Integration tests that hit a real DB catch migration issues, schema drift, and ORM surprises that mocks miss. Mock external services (Stripe, Algolia, Segment) and slow operations; don't mock the system you're actually testing against.
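A sketch of where that boundary sits — the endpoint and table are hypothetical; Http::fake and RefreshDatabase are standard Laravel testing tools:

<?php

use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Support\Facades\Http;
use Tests\TestCase;

class EnrollmentTest extends TestCase
{
    use RefreshDatabase; // real schema and migrations — not a mocked DB

    public function test_enrollment_persists_and_charges(): void
    {
        // Fake the external boundary (Stripe), not the system under test.
        Http::fake(['api.stripe.com/*' => Http::response(['status' => 'succeeded'])]);

        $this->postJson('/api/enrollments', ['course_id' => 42])
            ->assertCreated();

        // The real database catches schema drift a mock would hide.
        $this->assertDatabaseHas('enrollments', ['course_id' => 42]);
    }
}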
DAMP > DRY — what does that mean for tests?
Descriptive And Meaningful Phrases. In production code, DRY (Don't Repeat Yourself) wins. In tests, a little duplication is fine if it makes the test self-contained and obvious. A reader should understand a test without scrolling elsewhere.
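A sketch of the trade, with hypothetical models — the point is the inlined setup:

<?php

use App\Models\User;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class LessonAccessTest extends TestCase
{
    use RefreshDatabase;

    // DAMP: setup is inlined even though sibling tests would repeat most of it —
    // the reader sees exactly which detail (the expired date) drives the outcome.
    public function test_expired_subscription_blocks_lesson_access(): void
    {
        $user = User::factory()->create();
        $user->subscriptions()->create(['expires_at' => now()->subDay()]);

        $this->actingAs($user)
            ->getJson('/api/lessons/1')
            ->assertForbidden();
    }
}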
What's the Beyoncé Rule?
"If you liked it, you should have put a CI test on it." (From Software Engineering at Google.) If a behavior matters enough that someone would be upset when it breaks, it deserves a test that catches the break automatically.
Tech Debt ↗ playbook
Whose job is tech debt?
Yours. Every pod owns its own debt. Debt isn't routed to a separate "platform team to clean up later" — that's how it accumulates indefinitely.
How do I get debt prioritized?
Make it visible to your PM, framed as a quality win for users (faster page loads, fewer bugs, faster feature delivery). Quantify the cost of doing nothing — what feature work is it currently blocking or slowing? Debt repayment is roadmap work, not slack-time work.
Should I bundle debt fixes into feature PRs?
No. "While I was here…" cleanup bloats PR size and obscures review. If you spot adjacent debt during feature work, decide explicitly: fix in scope (rare) or open a separate techdebt/ ticket (usually).
AI Coding ↗ playbook
Can I commit AI-generated code?
Yes — but only after reading every line and verifying you can explain it. AI is a first-draft generator, not an author. If you can't explain it, you can't ship it.
Should I include "Co-Authored-By: Claude"?
No. SP commits and PRs do not include AI authorship trailers. If AI assistance was significant, mention it in the PR description with a review checklist (business logic verified, security checked, edge cases tested, patterns match standards).
How rigorously do I review AI-generated code?
At least as rigorously as hand-written code — often more. AI generates plausible-looking but subtly wrong logic: imagined APIs, wrong types, off-by-ones, security oversights. Run it locally before sharing. Test the edge cases the AI wouldn't have known to test.
How does AI affect PR sizing?
It makes decomposition discipline more important, not less. AI accelerates the production of large, risky batches. If you skip the decomposition step because "the AI made it fast," you've traded engineering for programming. Small PRs are the safety net for AI-assisted velocity.
Types of Work ↗ playbook
Bug or Defect — what's the difference?
Bug = caught in production. Defect = caught before production (dev, QA, or UAT). The fix is the same; the distinction matters for QA metrics, the root-cause conversation, and what we learn about how it slipped through.
Should I open a Tech Debt ticket for cleanup during a feature?
Yes — separate ticket, separate PR. Bundling "while I was here…" debt fixes into a feature PR violates the One Concern Rule and inflates review time. Open the ticket, scope it explicitly, and pull it next sprint.
What's a Spike?
Time-boxed research or discovery work — typically to answer a specific technical question before committing to a build. The deliverable is a recommendation (a doc, a prototype, a decision), not shipped code. If the box runs out, escalate; don't quietly extend.
Process Practices
Working in Small Batches ↗ playbook
What's an Elite-tier PR?
Under 225 lines changed. LinearB's 2026 benchmarks (8.1M PRs, 4,800+ orgs) put Elite at <1% change-failure rate and <11h review time. Good: 225–370. Fair: 371–698. Needs Focus: >698. In the Needs Focus tier, change-failure rate exceeds 17% and review time exceeds 38 hours.
My change is naturally bigger than 225 LOC. What do I do?
It's almost certainly not — it just hasn't been decomposed yet. Try one of the techniques: SPIDR Slicing (split by Spike/Paths/Interface/Data/Rules), Hamburger Method (thinnest end-to-end slice first), Walking Skeleton (full path with stubs), One Concern Rule (if you'd describe it with "and", split it).
What is SPIDR slicing?
A decomposition framework: split the work along one of Spike (research first), Paths (one user path before the others), Interface (the API/contract before the implementation), Data (one entity type before the others), Rules (one business rule before the others).
Does the LOC count include tests?
Yes — tests count, generated migrations count, fixtures count. Auto-generated files (lock files, minified output, snapshots) are excluded by the PR Size Check action. The number you see in the PR comment is the number reviewers actually have to read.
What is the Hamburger Method?
Ship the thinnest end-to-end slice across all layers (frontend → API → backend → DB) first; layer additional capability on after. Contrast with horizontal slicing, where you'd ship a complete frontend layer before any backend — the Hamburger gives you a working system earlier.
Preview-First Workflow ↗ playbook
What is the preview environment?
An ephemeral environment that spins up automatically when you open a PR. It runs your branch's code against a production-like setup, with its own URL, so reviewers (and you) can validate behavior before merge. Pushes to the PR auto-update the preview.
Why isn't staging the quality gate anymore?
Because the old workflow — merge to main, then validate in staging — meant defects became new tickets, the original engineer moved on, and main was perpetually unstable. Preview-first puts validation before merge, keeps main always-releasable, and keeps the engineer in context for fast fixes.
Merging now auto-deploys to staging, preprod, and production. Staging is a deploy target, not a gate.
An issue was found in QA. Do I file a new ticket?
No. Issues found during QA & UAT in preview are fixed in the same PR. That's the whole point of preview-first — keep one piece of work in flight until it's actually done.
What does "stay with your PR" mean?
Don't context-switch away after opening the PR. You're not done — you're in the middle of finishing. Notify reviewers, be available to answer questions, fix issues immediately when found. Treat "waiting for QA & UAT" as active work, not a handoff.
Code Review ↗ playbook
What's an "orthogonal lens"?
A lens that covers a dimension the others explicitly ignore. For code changes: Correctness (ignore style/security), Security & edge cases (ignore happy path/perf), Maintainability & clarity (ignore correctness/security). If two lenses would flag the same issue, sharpen the boundary or merge them.
Should I comment on style issues in code review?
No. Pint, oxlint, and Prettier own formatting. If style passed CI, don't bikeshed it in review. Save your reviewer attention for correctness, security, and design.
The PR is huge — should I just approve?
No. Either review it properly (and likely send back for decomposition) or push back on the size before reviewing. Rubber-stamping large PRs is how the high-failure-rate change happens. The PR Size Check on the PR will tell you the tier — if it's Needs Focus, the right comment is "this needs to be decomposed."
When should I use /deep-review?
For architectural changes or risky refactors where one reviewer reading sequentially is the wrong shape. /deep-review launches parallel agents, each assigned one orthogonal lens, and synthesizes the findings into a deduplicated report. Overkill for an Elite-tier PR; right tool for a system-design change.
What's the 90-minute rule?
If a PR can't be reviewed, tested, and understood in under 90 minutes, it hasn't been decomposed enough. Reviewers can hold ~200–400 LOC in working memory; past that, they skim instead of analyze and bugs slip through.
Platform Backend ↗ playbook
What endpoint tests are required?
Every new endpoint covers 200 / 401 / 403 / 422: success, unauthorized, forbidden, validation error. Plus tests for any associated Actions, Jobs, Mailables, Notifications. The PR description must say how to run them.
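A sketch of the four required cases against a hypothetical /api/widgets endpoint (the role column is illustrative):

<?php

use App\Models\User;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class WidgetStoreTest extends TestCase
{
    use RefreshDatabase;

    public function test_store_returns_200_for_a_valid_request(): void
    {
        $this->actingAs(User::factory()->create())
            ->postJson('/api/widgets', ['name' => 'ok'])
            ->assertOk();
    }

    public function test_store_returns_401_when_unauthenticated(): void
    {
        $this->postJson('/api/widgets', ['name' => 'ok'])
            ->assertUnauthorized();
    }

    public function test_store_returns_403_without_permission(): void
    {
        $this->actingAs(User::factory()->create(['role' => 'viewer']))
            ->postJson('/api/widgets', ['name' => 'ok'])
            ->assertForbidden();
    }

    public function test_store_returns_422_on_an_invalid_payload(): void
    {
        $this->actingAs(User::factory()->create())
            ->postJson('/api/widgets', [])
            ->assertUnprocessable();
    }
}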
Why must system tests be under 2 seconds?
Because slow tests destroy the inner loop. If your test takes 30 seconds, you'll run it less, the team will run it less, and CI will get slower. If yours doesn't fit in 2s, refactor the logic or mock the slow interaction (external API, file I/O) — don't accept the slowness.
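For example, faking file I/O removes the slow round-trip while keeping the logic real — ReportExporter and the output path are hypothetical; Storage::fake is standard Laravel:

<?php

use App\Services\ReportExporter; // hypothetical service under test
use Illuminate\Support\Facades\Storage;
use Tests\TestCase;

class ReportExportTest extends TestCase
{
    public function test_export_writes_the_course_report(): void
    {
        // Fake the slow interaction (real S3 round-trips); keep the logic real.
        Storage::fake('s3');

        app(ReportExporter::class)->export(courseId: 42);

        Storage::disk('s3')->assertExists('reports/course-42.csv');
    }
}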
New business logic — Service or Action?
Service. New business logic goes in Service classes. Existing Actions (lorisleiva/laravel-actions with the AsObject trait) stay where they are — don't refactor them, but don't create new ones either.
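A minimal sketch of the shape — the class, model, and collaborator names are hypothetical:

<?php

namespace App\Services;

use App\Models\Enrollment; // hypothetical model

// New business logic lives here — a plain class with injected dependencies.
class EnrollmentService
{
    public function __construct(
        private readonly PaymentGateway $payments, // hypothetical, mockable boundary
    ) {}

    public function enroll(int $userId, int $courseId): Enrollment
    {
        $this->payments->charge($userId, $courseId);

        return Enrollment::create([
            'user_id' => $userId,
            'course_id' => $courseId,
        ]);
    }
}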
Bug Investigation ↗ playbook
Why write the failing test first?
Three reasons: (1) it proves you've actually reproduced the bug, not just guessed; (2) it gives you a green-light signal when the fix lands; (3) it becomes the regression guard so the bug can't return silently. The Red → Green loop is the discipline that turns "I think it's fixed" into "I know it's fixed and it'll stay fixed."
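A sketch of the loop, against a hypothetical bug report (rewatching a lesson resets progress):

<?php

use App\Models\User;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class LessonProgressRegressionTest extends TestCase
{
    use RefreshDatabase;

    // Written BEFORE touching the fix — it must fail (Red) to prove the repro.
    public function test_progress_is_not_reset_when_a_lesson_is_rewatched(): void
    {
        $user = User::factory()->create();
        $user->progress()->create(['lesson_id' => 7, 'percent' => 100]);

        // Rewatching the lesson is what triggered the reported reset.
        $this->actingAs($user)->postJson('/api/lessons/7/start');

        // Turns Green when the fix lands, then guards against regression forever.
        $this->assertDatabaseHas('progress', ['lesson_id' => 7, 'percent' => 100]);
    }
}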
I see the symptom. Do I really need to find the root cause?
Yes. Symptom-fixes (catch the exception, null-coalesce the value, add a special case) hide bugs that resurface elsewhere with a different shape. Trace it: Route → FormRequest → Controller → Service → Model. Don't stop until you can explain why it failed, not just where.
Can I refactor while fixing a bug?
No. Refactors are a separate PR. Bug fixes stay minimal so reviewers can verify the diff is the fix. Refactor-bundled-with-fix is the classic "now I have two bugs" pattern — the fix introduces a regression in the refactor that's easy to miss.
Culture Practices
Engineer Mindset ↗ playbook
What's the difference between developer mode and engineer mode?
Developer mode solves the problem in front of you — requirements in, code out, PR merged. Primary axis: correctness right now.
Engineer mode solves the problem and the class of problems it belongs to. Primary axis: sustainability over time.
Both modes are necessary. The shift we're asking for is which one is your default.
How do I know which mode I'm in?
Ask: when this code breaks in 18 months, will the next person be able to understand it, change it, and trust it? Developer-mode answers "I haven't thought about it." Engineer-mode answers "yes, because…" Same code can pass or fail this question depending on the choices you made at write-time.
"Code is read more than written" — what does that mean for daily work?
Optimize for the reader, not the writer. Choose clearer over clever, even if "clever" is shorter. Name things for the next person, not the current task. Write the comment that explains why, not the comment that explains what (the code already says what).
What's the Boy Scout Rule?
Leave the code cleaner than you found it. Not by refactoring during a bug fix (don't), but by treating small improvements — better names, removed dead code, clearer comments — as part of every change you touch. The codebase compounds in whichever direction the team leans.
How do I justify time for engineer-mode work to my PM?
Frame it as quality wins for users, not engineering hygiene. "This refactor pays back in faster feature delivery for the next 6 months." "This test catches the bug class before it ships, not after." Engineer-mode is a roadmap line item, not slack-time work — see tech debt.
What's the 1-on-1 question?
Stop asking "what did you ship?" Start asking "what did you make easier to change?" Asked consistently in 1-on-1s and reviews, that question rewires what the team believes the job actually is.
The Product Trio ↗ playbook
What does it mean to be in the trio as an engineer?
You're a co-discoverer, not a receiver of specs. You're in customer interviews. You contribute technical possibilities the PM and Designer don't yet know exist. You ask "what if we could…?" to expand the option space, not just deliver the one inside it.
What if PM hands me a spec?
Push back. That's not a trio operating — that's an assembly line. Ask: "what customer evidence drove this?" and "have we explored other approaches?" If the trio isn't doing discovery together, surface it to your manager.
Who decides what gets built?
The trio, by consensus through evidence. When consensus fails: PM has the final call on what problem to solve (business value), Designer on how to solve it (UX integrity), Engineer on how to build it (technical sustainability). Final call is the tiebreaker, not the starting point.
We disagree — how do we resolve it?
Ask "what evidence would change your mind?" Then go get that evidence. Most disagreements aren't really disagreements about taste — they're disagreements about what's true. The unlock is making the truth-question explicit and falsifiable.
What's the unlock question?
"What evidence would change your mind?" It works in trio disagreements, in code review disagreements, in architecture debates. It moves the conversation from positions to falsifiable claims, and gives both sides a concrete next step.