Knowledge base

Keel Overview

Part I

Business & Strategy

What Keel is, who we serve, how we make money

§1

What Keel is

Keel balances financial wellness with responsible gambling. It is a specialized financial wallet that bridges personal financial management (PFM) with real-time sports-betting analytics and provides guardrails that encourage responsible play.

As the only platform giving players a unified view of their spending, bets, and wins/losses across multiple platforms and timeframes, Keel is positioned to become the de facto sports-betting financial dashboard.

Three differentiated mechanisms:

  • KeelScore. Proprietary risk score that helps users understand the level of risk they're taking based on betting history and current financial profile. Deterministic and traceable — every score links to specific signal ids.
  • Auto-sweep on a cadence. Funds are swept on a regular rhythm so redeploying capital becomes a deliberate act, not a tap. That builds in a moment of reflection before new bets and also drives the transaction-fee revenue model.
  • Cross-platform allocation. One spending limit calculated from bank-linked cash flow, honored across every sportsbook the user spends on. Set once, follows the user everywhere.

Two product surfaces, one platform:

  • Risk and allocation engine. Every consumer gets a calculated monthly discretionary-spending budget based on bank-linked income and obligations. Every partner-initiated debit runs through a deterministic decision engine that approves, declines, or pauses based on remaining allocation, active interventions, and current risk score. Sub-300ms p99.
  • Custodial wallet rail. Partners who opt into custodial mode route consumer funds through Keel instead of pulling ACH directly. Funds rest in a consumer-owned liability balance. We settle to the partner on a configurable cadence, net of fees and a holdback against returns. PayPal-style pay-in with guardrails baked in.
Adjacent verticals (potential)
Keel's primitives — bank-linked allocation, cross-platform enforcement, deterministic risk scoring — apply equally to active retail trading (options, leverage, crypto) where the same in-session-impulse-vs.-out-of-session-intent dynamic plays out. Not Phase I scope; the rail is built so the second vertical doesn't require a rebuild.
§2

The problem we solve

Mobile sports betting is exploding without tools in place to encourage responsible play. Legal US sports betting went from near-zero before PASPA was struck down (2018) to over $167B in annual handle generating $14B in revenue across 54M Americans. The infrastructure for consumer protection has not scaled with the market.

Three concrete gaps:

  • No unified visibility. A bettor's “money in, money out” is partitioned across 4–8 platforms that don't coordinate. None can see the whole.
  • No balance between financial wellness and gambling behavior. Spending limits exist on every sportsbook's marketing page but rarely bind in the moment they'd need to. Without real-time, cross-platform visibility, consumers — from core bettors to college students — are highly vulnerable to financial imbalance.
  • Incentive mismatch inside operators. An operator's revenue tracks the consumers a serious protection program would slow down. Self-policing puts protection in conflict with topline.

Keel solves the visibility gap with custodial mode plus a risk engine that spans platforms. It changes the incentive mismatch by making protection a consumer-facing wallet rather than something an operator owes its users, and by aligning revenue (transaction fees on the rail) with the act of redeploying funds, not the act of losing them.

The same dynamic plays out in active retail trading (options, leverage, crypto), where in-session impulse and out-of-session intent diverge in the same way. Adjacent vertical, same primitives; see §1.

§3

Trust & safety philosophy

This is the product. Three principles drive every architectural decision.

  • Users cannot change their own limits in the moment. They can request a review. The allocation engine recalculates from bank-linked cash flow on its own cadence. There is no “raise my limit by $500” button. Enforced in code at AllocationService.
  • Interventions are sticky. A cooling-off period or spending pause is a one-way door. It can only be lifted by its own expiry timer or by an operator override with a written reason recorded in the audit log.
  • Neutral framing on partner-facing wires. Partners never see cooling_off, self_exclusion, or problem_gambler on the wire. They see a universal restriction_period_active decline code. The internal vocabulary preserves regulator terminology where required. The wire deliberately does not. A sportsbook UI and a trading-platform UI render the same outcome with their own appropriate copy.
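The neutral-framing principle can be sketched as a one-way mapping at the wire boundary. This is a minimal illustration, not the production code: the three internal names and restriction_period_active come from the text, while the const-object shape, the WireDecline fields, and the function name toWireDecline are assumptions.

```typescript
// Internal vocabulary (names taken from the text). The const-object shape is an assumption.
const InterventionKind = {
  CoolingOff: "cooling_off",
  SelfExclusion: "self_exclusion",
  ProblemGambler: "problem_gambler",
} as const;
type InterventionKind = (typeof InterventionKind)[keyof typeof InterventionKind];

// The single restriction code partners see on the wire.
const WIRE_RESTRICTION_CODE = "restriction_period_active";

interface ActiveIntervention {
  kind: InterventionKind;
  expiresAt: Date;
}

// Hypothetical wire shape; only the decline_code value comes from the text.
interface WireDecline {
  decision: "declined";
  decline_code: typeof WIRE_RESTRICTION_CODE;
}

// Every intervention kind collapses to the same neutral code, so a partner
// cannot infer which internal category applies.
function toWireDecline(_intervention: ActiveIntervention): WireDecline {
  return { decision: "declined", decline_code: WIRE_RESTRICTION_CODE };
}
```

Because the function ignores the intervention kind entirely, the internal vocabulary cannot leak through it, whichever kind is active.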
Non-negotiable
Risk scores are deterministic and traceable. Every score links to the specific signal ids that produced it. No black-box ML scoring. A regulator must be able to reconstruct any decision from the audit log.
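One way to read "deterministic and traceable" is that the score is a pure function of the signals that fired, and the result carries their ids. The sketch below assumes a 0 to 100 scale where higher is safer and integer penalties per signal; the scale, the penalty model, and all names are hypothetical, not the KeelScore formula.

```typescript
interface FiredSignal {
  id: string;      // stable signal id, as recorded in the audit log
  penalty: number; // hypothetical: integer points subtracted from the score
}

interface ScoredResult {
  score: number;
  signalIds: string[]; // every id that contributed, in a stable order
}

// Pure function: same set of signals in, same result out, regardless of load order.
function computeScore(signals: readonly FiredSignal[]): ScoredResult {
  const ordered = [...signals].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const totalPenalty = ordered.reduce((sum, s) => sum + s.penalty, 0);
  const score = Math.min(100, Math.max(0, 100 - totalPenalty));
  return { score, signalIds: ordered.map((s) => s.id) };
}
```

Storing signalIds next to the score is what would let an auditor recompute the number from the log alone.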
§4

Who we serve

Two-phase go-to-market. Each phase has its own audience and acquisition engine.

Phase I — The core bettor
Men 22–45 with disposable income, juggling multiple sportsbooks
  • Premium dashboard tracking money in, money out, bets won, bets lost
  • One bank link, every sportsbook the user spends on
  • KeelScore as the recurring reason to come back
  • Acquisition via Paul Charchian's audience: 100K Twitter, 30K+ YouTube viewers, weekly radio in 15+ markets. Near-zero CAC.
Phase II — The enterprise moat
Universities and higher-ed institutions
  • Students get a free budgeting tool
  • Institutions get a sanitized aggregate view of campus financial health
  • Intervention path when a student's KeelScore drops into a risk range
  • Pitch reframes from “bettor's dashboard” to pedagogy-aligned student-financial-wellness platform

The architecture is vertical-agnostic. The same allocation engine that backstops a sportsbook deposit also backstops any future partner-integrated rail. “Deposit” on the wire is neutral. Vertical-specific copy lives in the partner's UI, not in our schema — leaves the door open to the adjacent verticals called out in §1.

§5

How we make money

Two phases, two engines.

Phase I — Consumer rail

  • 2.50% transaction fee whenever a user transacts on the platform. The auto-sweep cadence creates a regular redeployment of funds, and each redeployment registers a new transaction. The fee compounds with engagement without rewarding the kind of spending the protection product is designed to slow.

Phase II — Enterprise & institutional scaling

  • Per-user fees from universities. Students get a free budgeting tool. Universities pay for the sanitized aggregate view of campus financial health, plus the ability to intervene when a student's KeelScore drops into a range that signals risk.
  • SaaS platform subscription fees. Tiered access to the institutional dashboard, reporting, and intervention configuration.

Optional partner-side revenue (Phase III, opportunistic)

The custodial rail is built to support partner-side monetization too if a sportsbook partner contract makes sense — transaction fees on deposits routed through us, implementation and integration fees, settlement-reserve float at scale. Treated as opportunistic rather than core, behind the Phase I consumer-rail revenue.

Design constraint
Keel's revenue must not scale with how much an individual consumer loses. A take-rate on volume of money moving through the rail is fine; a take-rate tied to losses, or upsell pressure on the consumers most likely to need expensive protection, would reproduce the operator-internal incentive conflict.

Illustrative custodial-mode parameters

For the ledger primer and the technical-architecture sections we use a concrete example of the custodial pay-in flow. These rates are illustrative; the consumer-rail 2.50% above is the committed Phase I price.

  • Fee per deposit (illustrative). ~150 bps + $0.20 flat. Configurable per partner; pinned at contract.
  • Settlement holdback (illustrative). 0–1000 bps. Routed to PARTNER_RESERVE against ACH-return exposure.
  • Settlement cadence. Daily / hourly / instant. Instant means batched at minute scale, not literal per-deposit.
  • Hold window. 5 business days (default). ACH return window before a payable becomes settleable.
  • Reserve release. 2× hold-days after last deposit. v1 policy; per-cohort schedule deferred.
  • Float economics. Reserve balance earns at scale. Treasury function activates once volume justifies it.
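To make the illustrative fee and holdback figures concrete, here is a small arithmetic sketch in integer cents. The rounding rule (half up to the nearest cent) and the choice to compute the holdback on the gross deposit are assumptions; the text does not specify either. The 500 bps holdback in the usage note is an arbitrary value inside the 0–1000 bps range.

```typescript
interface FeeSchedule {
  feeBps: number;       // e.g. 150 = 1.50%
  feeFlatCents: number; // e.g. 20 = $0.20
  holdbackBps: number;  // 0..1000 per the illustrative range
}

// Basis points of an amount, rounded half up to whole cents (assumed rounding rule).
function bpsOf(amountCents: number, bps: number): number {
  return Math.floor((amountCents * bps + 5000) / 10000);
}

// Splits a deposit into fee, holdback, and the remainder that is settleable to the partner.
function splitDeposit(amountCents: number, schedule: FeeSchedule) {
  const feeCents = bpsOf(amountCents, schedule.feeBps) + schedule.feeFlatCents;
  const holdbackCents = bpsOf(amountCents, schedule.holdbackBps); // assumed: on gross amount
  const settleableCents = amountCents - feeCents - holdbackCents;
  return { feeCents, holdbackCents, settleableCents };
}
```

For a $25.00 deposit with 150 bps + $0.20 and a 500 bps holdback, this yields a 58-cent fee (37.5 cents rounded to 38, plus 20), a 125-cent holdback, and 2317 cents settleable.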
§6

Regulatory posture

Keel is a fintech, not a gambling or trading operator. We never take a wager and we never execute a trade. The strategic question we engineered around: what is the minimum regulatory footprint that enables consumer protection plus the custodial rail?

  • Money transmitter framework for the custodial rail. State-by-state licensing strategy. The custodial wallet is gated on legal clearance and stays dark until cleared in the consumer's home state.
  • BSA/AML obligations follow from holding consumer funds. The AML hook (Slice 9) logs every confirmed transfer with structured fields. The velocity and structuring rule engine ships in Phase 4 compliance.
  • KYC and identity verification handled by Plaid IDV at onboarding, including watchlist and sanctions screening. Same vendor as the bank link, so the integration surface is one product, not two.
  • State-level gambling regulation. We sit adjacent, not inside. Sportsbooks in regulated states already do their KYC. We layer guardrails on top of their license. We do not duplicate it.
  • Securities and brokerage regulation. Same posture for the trading-platform vertical. We're not a broker-dealer. We don't hold securities. The FINRA suitability machinery sits inside the brokerage. We provide the spending-side controls, not the trading-side ones.
Gating decision
No wallet funding or money routing until the wallet/ module is explicitly enabled per state. Phase 1 is non-custodial. Signal mode works without any money-transmitter posture. The custodial-mode launch is staged behind a legal clearance gate enforced in code.
§7

Phases & strategy

The build phases are not arbitrary. Each unlocks the next. Phase 1 proves the protection engine works without taking custody. Phase 2 layers the custodial rail. Phase 3 opens the partner surface. Phase 4 closes the compliance loop.

  1. Phase 1: protection engine. User onboarding, KYC, bank link, cash-flow analysis, allocation, risk engine, interventions, notifications. No money movement. Consumer benefit without regulatory exposure.
  2. Phase 2: custodial wallet. Ledger primitives, BaaS pay-in and pay-out, settlement processor, AML monitoring. Gated behind legal clearance.
  3. Phase 3: partner integration. Partner management, API keys, signed embed and redirect surface, outbound webhooks. The product becomes shippable to operators.
  4. Phase 4: compliance build. Regulator-facing exports, right-to-access fulfillment, retention enforcement, AML rule engine, settlement reconciliation reports. The product becomes auditable at the operator and regulator level.
§8

Competitive shape

Four rough categories of adjacent companies.

  • Operator-internal protection tooling. Every major sportsbook (DraftKings, FanDuel, BetMGM) and every active-trading platform (Robinhood, Webull, IBKR) ships in-house consumer-protection features. Deposit limits, cooling-off, pattern-day-trader gates. They cannot, by design, see cross-operator activity. Their incentives compete with their protection team's mandate. We complement them.
  • Advocacy and non-profits. NCPG and GamCare on the gambling side. Financial-literacy organizations on the trading side. Important policy voice. They don't ship infrastructure. We are infrastructure they want to exist.
  • Wallet and pay-in providers. PayPal, Sightline, Nuvei. They run pay-in rails for one vertical or the other. They're rails, not gatekeepers. Limits and interventions aren't in their product. The combination of custodial rail plus guardrails is the wedge.
  • Personal-finance and budgeting apps. Monarch, Copilot, YNAB. They show the consumer their own data, read-only, after the fact. They don't enforce anything at the operator level. Keel sits one layer deeper: the budget doesn't just show up in a chart, it binds at the platform's deposit-authorization call.

What is distinctly Keel: the deterministic risk engine, the dual signal-or-custodial mode, and the cross-platform allocation. No competitor has all three.

§9

North-star metrics

What good looks like, quantitatively. These are the dashboards we'll build first. The Slice 9 AML log stream is the foundation.

  • Active consumers. Onboarded plus at least one linked partner. Consumer-side health.
  • Partner-integrated volume. Confirmed deposit cents per period. Business-volume health.
  • Allocation adherence rate. % of consumers within their monthly budget. Product effectiveness; should rise over time.
  • Intervention acceptance rate. % of suggested restrictions the consumer accepts. Consumer-trust signal.
  • Risk-engine decision latency p99. < 300ms. Hard SLO.
  • Settlement success rate. % of SETTLEMENT transfers reaching Confirmed without operator action. Operational health; should be > 99%.
  • AML alert rate. TBD (Phase 4 rule engine). Compliance signal.
  • Reserve return rate. % of holdback released without claim. Over-holding is wasted capital; trends downward as we calibrate.
§10

Brand & voice

The name Keel (rebranded from StakeGuard, 2026-05-22) is a maritime metaphor. The keel is the structural element that lets a boat carry sail without capsizing. Stability that enables ambition, not a brake. The visual identity leans toward clarity. We don't use sportsbook iconography or stock-chart iconography. We read as a payments tool.

To partners, we sound like:

  • Precise. Wire shapes are documented. Idempotency contracts are explicit. Errors are RFC 7807. We act like a payments company.
  • Vertical-neutral on the wire. Decline codes and event names are universal across sportsbooks and brokerages. A partner integrates once. The same shapes work for whichever vertical they're in.

To consumers, we sound like:

  • Calm and direct. No gamification. No “you've earned a streak!” copy. The product job is to keep someone from making a decision they'll regret. The voice has to match.
  • Specific. “Restriction ends Friday at 9pm” beats “your account is paused.” Numbers and dates, not vague language.
  • Vertical-neutral framing. Consumer-facing surfaces talk about “discretionary spending” and “restrictions,” not “bets” or “cooling-off.” Partner-facing surfaces use their own vocabulary. Ours stays neutral.
Internal naming drift (follow-up)
The codebase still has some gambling-coded internal names from earlier slices: FundingService.creditWinnings, WinningsCredited domain event, WalletWinningsCredited audit event. These are internal. The partner wire and consumer copy are already vertical-neutral. The rename pass to neutral terms (creditPayout, PayoutCredited) is a tracked follow-up in Part III §29 Open decisions.
Domain canonical
Production domain is stakeguard.info (retained through the rebrand). Never .io. Never a keel.* variant. All brand-name references on the frontend resolve from BRAND.name via @/lib/branding. Never hardcoded.
Part II

Architecture

How the system is built

§11

Technology stack

Conservative defaults. We pick boring, proven tools and reserve novelty for the places where the product is novel (the ledger, the risk engine). Every choice below has been load-bearing for at least one production-grade fintech we've seen up close.

  • Edge. TLS, DNS, request routing. Route 53, ACM, ALB.
  • Application. HTTP / GraphQL / embed surfaces. Next.js 15, Apollo Client, NestJS 11, class-validator.
  • Worker. Separate Node process for async work. BullMQ, NestJS @Processor.
  • Auth. Three identity surfaces. Auth0, HMAC, API keys.
  • Data. Durable + ephemeral state. PostgreSQL 16, TypeORM, Redis, Plaid.
  • Infrastructure. AWS us-east-1; account 606096000902. ECS Fargate, ECR, Secrets Manager, CloudWatch.
  • Runtime. Node 22.22.2. Pinned via .nvmrc + Docker base image.
  • API framework. NestJS 11. Class-based DI, decorator-driven HTTP/GraphQL/BullMQ.
  • ORM. TypeORM 0.3. Repository pattern; QueryBuilder for cases data-mapper can't express.
  • Database. PostgreSQL 16 (RDS Aurora). PG enums for stable state machines; varchar + runtime validation for evolving catalogs.
  • Queue. BullMQ on Redis 7. Separate worker process; no in-process job execution.
  • Cache / session. Redis. Consumer sg_session + embed sg_embed_session; same instance, different key prefixes.
  • Web framework. Next.js 15 (App Router) + React 19. Turbopack dev; server components used sparingly; most pages are client-rendered.
  • GraphQL client. Apollo Client 3. Consumer surface only; partners never touch /graphql.
  • Auth. Auth0. JWT for consumers + admin; HMAC for partner embed launches.
  • Validation. Zod (env + service DTOs) + class-validator (HTTP inputs). Two libs because Nest's validation pipe is class-validator-shaped.
  • Logging. pino via nestjs-pino. Structured JSON; AML hook + audit_entry are the durable surfaces.
  • Infra. AWS: ECS, RDS Aurora, ElastiCache, ECR, Secrets Manager, Route53. us-east-1 only; keel profile, account 606096000902.
  • CI/CD. GitHub Actions. ci.yml for PR checks; deploy.yml for env-targeted deploys.
§12

Repository layout

Single repo, two workspaces, role-based directory structure inside each module. The monorepo decision pays off when the wire shape changes on the API: the web codegen catches the drift on the next pull. There is no library version to coordinate.

stakeguard/
├── api/                      # NestJS HTTP/GraphQL + BullMQ worker
│   └── src/
│       ├── <module>/         # one dir per domain module
│       │   ├── MODULE.md     # always at the root; read first
│       │   ├── <module>.module.ts
│       │   ├── entities/     # TypeORM @Entity classes
│       │   ├── services/     # business logic + .service.spec.ts
│       │   ├── resolvers/    # GraphQL resolvers
│       │   ├── controllers/  # REST + embed
│       │   ├── jobs/         # BullMQ processors (worker-only)
│       │   ├── inputs/       # class-validator DTOs / GraphQL inputs
│       │   ├── models/       # GraphQL @ObjectType output projections
│       │   └── types/        # service-layer interfaces
│       ├── shared/           # by-concern, not by-role
│       └── worker.ts         # entry point for the worker process
├── web/                      # Next.js consumer + admin + embed
│   └── src/
│       ├── app/
│       │   ├── app/          # consumer surface
│       │   ├── admin/        # staff surface (this page lives here)
│       │   └── embed/        # partner-iframed pages
│       ├── components/{ui,shared,layout}/
│       └── generated/        # codegen output; never edit by hand
├── docs/                     # cross-cutting specs + plans
├── infrastructure/aws/       # IAM, networking, scripts
└── .github/workflows/        # ci.yml + deploy.yml

Files are grouped by role, not by feature. A new service in the wallet module lives in api/src/wallet/services/ alongside every other wallet service. Cross-module imports use the full path (from '../user/services/user.service'). Never a barrel re-export.

§13

Module map

Phase ordering matters. Each module depends on the ones earlier in the chain, so the build order is the dependency order. Hover a card below to highlight that module's imports (upstream) and its consumers (downstream).

[Interactive module graph: for a selected module, upstream marks the modules it imports from and downstream marks the modules that import from it.]

Three things to notice in the graph: shared/ sits at the root of every chain, audit/ is next because every other module writes to it, and allocation/ plus risk-engine/ are the two engines every partner-facing decision routes through.

§14

The two API surfaces

Strict separation. /graphql is consumer-only. /api/v1/* is partner-only. Cross-traffic is a security violation. Partner keys cannot hit GraphQL. Consumer sessions cannot hit the partner REST surface. Enforced via separate guards (SessionOrJwt vs. ApiKeyGuard + ScopeGuard) and separate Nest modules, not by URL pattern alone.

Consumer: /graphql
  1. Web client. Apollo Client in /app/*.
  2. Auth0 JWT or sg_session. Cookie is httpOnly + Secure.
  3. SessionOrJwt guard. @CurrentUser() resolves to user.id.
  4. GraphQL resolver. camelCase wire shape.
  5. Consumer-scoped data. RLS + service-layer scoping.
Partner: /api/v1/*
  1. Server-to-server. No browser; backend HTTP only.
  2. sk_test_ / sk_live_ bearer. Partial-unique hash in DB.
  3. ApiKeyGuard + ScopeGuard. @RequireScope per route.
  4. REST controller. snake_case wire shape.
  5. Partner-scoped data. Linked-users only.
Embed: /api/v1/embed/*
  1. Iframed in partner site, or top-level redirect.
  2. HMAC on launch. X-Signature + X-Timestamp + X-Nonce.
  3. sg_embed_session cookie. Path=/api/v1/embed; can’t leak.
  4. EmbedSessionGuard. Cookie-bound user + partner.
  5. Snapshotted session data. No live DB read on hot path.
snake_case on the partner wire, camelCase on GraphQL
Partners get amount_cents. Consumers get amountCents. Code at the boundary (controller layer) handles the mapping. Service-layer DTOs are always camelCase.
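A minimal sketch of that boundary mapping, assuming a deposit-authorization payload. Only amount_cents / amountCents appear in the text; the other two fields and both function names are hypothetical. The mapping is written out field by field rather than with a generic case converter, so a renamed field shows up as a compile error.

```typescript
// Partner wire shape (snake_case). Fields other than amount_cents are hypothetical.
interface DepositAuthWire {
  amount_cents: number;
  partner_user_id: string;
  idempotency_key: string;
}

// Service-layer DTO (always camelCase).
interface DepositAuthDto {
  amountCents: number;
  partnerUserId: string;
  idempotencyKey: string;
}

// Controller-layer mapping, inbound.
function fromWire(wire: DepositAuthWire): DepositAuthDto {
  return {
    amountCents: wire.amount_cents,
    partnerUserId: wire.partner_user_id,
    idempotencyKey: wire.idempotency_key,
  };
}

// Controller-layer mapping, outbound.
function toWire(dto: DepositAuthDto): DepositAuthWire {
  return {
    amount_cents: dto.amountCents,
    partner_user_id: dto.partnerUserId,
    idempotency_key: dto.idempotencyKey,
  };
}
```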
§15

Domain events

Inter-module communication runs through DomainEventBus, a thin typed wrapper over the underlying Nest event emitter, so the compiler enforces name and payload pairing. Every emit is type-checked against the payload interface in shared/events/domain-events.ts. Every listener uses @OnDomainEvent(DomainEventName.X) with the payload typed as the corresponding event interface.

Two emission patterns. Sync (emit) for fire-and-forget AML or notifications, where the caller doesn't need the listener to complete. Async (emitAsync) when downstream listener state is part of the same logical operation. Example: WalletService.handleBaasEvent waits for the settlement-aggregate sync before reporting the callback applied.

Post-commit emission only
Events emit after the transaction commits, gated on ownedTransaction from runInTransaction. Listeners always see durable state. They never see a rolled-back event.
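The name-and-payload pairing can be sketched with a mapped event catalog. This is a standalone illustration, not the actual DomainEventBus: it uses a plain listener map instead of the Nest event emitter, and the event names and payload fields are hypothetical.

```typescript
// Hypothetical event catalog; the real one lives in shared/events/domain-events.ts.
interface DomainEventPayloads {
  "transfer.confirmed": { transferId: string; amountCents: number };
  "intervention.started": { userId: string; expiresAt: string };
}
type DomainEventName = keyof DomainEventPayloads;

type Listener<K extends DomainEventName> = (payload: DomainEventPayloads[K]) => void | Promise<void>;
type AnyListener = (payload: unknown) => void | Promise<void>;

class TypedEventBus {
  private readonly listeners = new Map<string, AnyListener[]>();

  on<K extends DomainEventName>(name: K, listener: Listener<K>): void {
    const list = this.listeners.get(name) ?? [];
    list.push(listener as AnyListener);
    this.listeners.set(name, list);
  }

  // Sync pattern: listeners are invoked, but the caller does not wait for async ones.
  emit<K extends DomainEventName>(name: K, payload: DomainEventPayloads[K]): void {
    for (const listener of this.listeners.get(name) ?? []) void listener(payload);
  }

  // Async pattern: resolves only after every listener has completed, in registration order.
  async emitAsync<K extends DomainEventName>(name: K, payload: DomainEventPayloads[K]): Promise<void> {
    for (const listener of this.listeners.get(name) ?? []) await listener(payload);
  }
}
```

With this shape, bus.emit("transfer.confirmed", { userId: "u_1" }) fails to compile because the payload does not match the interface registered for that name.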
§16

Wallet architecture

The custodial rail is a double-entry ledger plus a state-machine on every transfer. Six account types, nine transfer types, four reservation states, four settlement statuses. All const-objects in shared/types/index.ts. The ledger is insert-only at the DB layer (trigger rejects UPDATE/DELETE/TRUNCATE on ledger_entry). Balance snapshots are maintained transactionally per post.

Two partner modes. Signal mode never touches money. We approve or decline, the partner runs their own pay-in rail. Custodial mode routes deposits through our omnibus: capture against a 15-minute reservation, post the DEPOSIT (plus optional FEE and RESERVE_HOLD), aggregate into a SETTLEMENT after the configured hold-days, push to the partner via the BaaS provider.

Account types
  • OMNIBUS_CASH: provider × currency, asset
  • CONSUMER_LIABILITY: user × currency
  • PARTNER_PAYABLE: partner × currency, liability
  • PARTNER_RESERVE: partner × currency, holdback
  • FEE_REVENUE: singleton revenue account
  • SUSPENSE: break-glass; never expected
Three-layer immutability
Audit and ledger entities enforce insert-only at three layers. TypeScript (readonly fields, no .save() path). Service (only .insert() is exposed). Database (PG trigger raises on UPDATE/DELETE/TRUNCATE). Each layer alone is bypassable. Three together aren't.

Pay-in modes side by side

Signal mode (“We tell, you decide”) versus custodial mode (“We hold, you collect”):

  • Money movement. Signal: none; partner runs their own pay-in. Custodial: through Keel omnibus to partner payable.
  • Pricing. Signal: per-decision SaaS + tiered volume. Custodial: fee per deposit + holdback float.
  • Legal posture. Signal: no money transmitter obligations. Custodial: MT framework + BSA/AML obligations.
  • Partner integration. Signal: deposit-auth REST endpoint only. Custodial: same endpoint + custodial settlement + winnings.
  • Consumer wallet balance. Signal: not visible to Keel; held at the partner. Custodial: held in CONSUMER_LIABILITY; cross-partner allocation.
  • Settlement. Signal: N/A. Custodial: daily / hourly / instant cadence; post hold-days.
  • Reversal exposure. Signal: partner bears. Custodial: mitigated by PARTNER_RESERVE holdback.
  • AML / compliance. Signal: partner-owned. Custodial: Keel-monitored (TRANSFER_COMPLETED + rule engine).

Ledger primer: a $25 deposit, step by step

Click through to watch the chain post. The same five accounts hold all the state. Each transfer-type is a different combination of debits and credits against them.

Initial state. Consumer has funded their wallet via a separate FUNDING transfer; partner accounts exist but are empty. No ledger entries posted yet.

  • CONSUMER_LIABILITY (liability): $100.00
  • PARTNER_PAYABLE (liability): $0.00
  • FEE_REVENUE (revenue): $0.00
  • PARTNER_RESERVE (liability): $0.00
  • OMNIBUS_CASH (asset): $100.00
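The rest of the chain can be approximated in code. The sketch below posts a $25.00 deposit against the initial state using three of the transfer types named in the text (DEPOSIT, FEE, RESERVE_HOLD). The debit and credit sides chosen for each transfer, the 58-cent fee (150 bps + $0.20, rounded), and the 125-cent holdback (500 bps) are assumptions for illustration, not the documented postings.

```typescript
type Account =
  | "CONSUMER_LIABILITY" | "PARTNER_PAYABLE" | "PARTNER_RESERVE"
  | "FEE_REVENUE" | "OMNIBUS_CASH";

interface Entry {
  transferType: "DEPOSIT" | "FEE" | "RESERVE_HOLD";
  debit: Account;
  credit: Account;
  amountCents: number;
}

// Liability and revenue accounts grow on credit; the asset account grows on debit.
const CREDIT_NORMAL: Record<Account, boolean> = {
  CONSUMER_LIABILITY: true, PARTNER_PAYABLE: true, PARTNER_RESERVE: true,
  FEE_REVENUE: true, OMNIBUS_CASH: false,
};

// Applies entries to a balance snapshot and returns a new snapshot (inputs are not mutated).
function post(balances: Readonly<Record<Account, number>>, entries: readonly Entry[]): Record<Account, number> {
  const next = { ...balances };
  for (const e of entries) {
    if (!Number.isInteger(e.amountCents) || e.amountCents <= 0) {
      throw new Error("amount must be a positive integer number of cents");
    }
    next[e.debit] += CREDIT_NORMAL[e.debit] ? -e.amountCents : e.amountCents;
    next[e.credit] += CREDIT_NORMAL[e.credit] ? e.amountCents : -e.amountCents;
  }
  return next;
}

const initial: Record<Account, number> = {
  CONSUMER_LIABILITY: 10000, PARTNER_PAYABLE: 0, PARTNER_RESERVE: 0,
  FEE_REVENUE: 0, OMNIBUS_CASH: 10000,
};

// Assumed postings for a $25.00 custodial deposit.
const depositChain: Entry[] = [
  { transferType: "DEPOSIT", debit: "CONSUMER_LIABILITY", credit: "PARTNER_PAYABLE", amountCents: 2500 },
  { transferType: "FEE", debit: "PARTNER_PAYABLE", credit: "FEE_REVENUE", amountCents: 58 },
  { transferType: "RESERVE_HOLD", debit: "PARTNER_PAYABLE", credit: "PARTNER_RESERVE", amountCents: 125 },
];

const afterDeposit = post(initial, depositChain);
```

Under these assumptions the consumer balance drops to $75.00, the partner payable holds $23.17, and OMNIBUS_CASH is unchanged until settlement, so assets still equal liabilities plus revenue.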

Custodial pay-in over time

The capture and the settlement live on the same flow but happen at different times. Capture in milliseconds. Settlement on the hold-days cadence. One diagram for both:

[Diagram: custodial pay-in over time]
T0: capture at confirm. T+hold-days: settlement cron pays the partner.

BaaS callback lifecycle

What happens when the BaaS provider tells us a transfer changed state. Same handler for provisional, confirmed, failed. The per-status branch in handleBaasEvent is the central dispatcher.

[Diagram: BaaS callback lifecycle]
Settlement aggregate sync runs as a listener post-commit. The BaaS gets its ack before any reconciliation work.
§17

Partner integration flows

Partners interact with the platform through four surfaces, in roughly the order they encounter them:

  1. Connect. Hosted consent flow at /connect. Partner signs an HMAC launch. The user signs into Keel and grants the partner access. Output: a partner_user_link row that maps the partner's opaque user id to our internal user.id.
  2. Deposit authorize. POST /api/v1/deposit-authorizations. Hot path: p99 < 300ms. Returns approved or declined. Decline codes are neutral-framed. Idempotent on a partner-supplied key.
  3. Embed checkout. Partner mints a launch token via POST /api/v1/embed/checkout-launch, then iframes us at the resulting URL (or top-level-navigates in redirect mode). The user reviews and confirms. We run the deposit-auth decision. The iframe postMessages the parent (embed mode) or 302s back with the decision on the query string (redirect mode).
  4. Settlements and payouts. GET /api/v1/settlements for reconciliation. POST /api/v1/withdrawals when the partner platform pays funds back to the consumer (sportsbook winnings, trade-settlement credits, same wire shape; book-only credit to CONSUMER_LIABILITY).
HMAC contract
Embed-launch and connect-launch endpoints are HMAC-signed: X-Signature, X-Timestamp, X-Nonce headers over the raw body. Timestamp window: ±5 min. Nonce is partner-scoped, 24h-unique. Same-key replay returns the original launch URL. Different body plus same nonce returns 401.
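A hedged sketch of how a verifier for this contract might look. The header names, the ±5 minute window, and the nonce behavior come from the text. The HMAC-SHA256 algorithm, the canonical string (timestamp, nonce, and raw body joined by "."), the epoch-millisecond timestamp format, and the reading of "same-key replay" as same nonce with an identical body are all assumptions. The 24h nonce expiry is omitted for brevity.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

interface LaunchHeaders {
  signature: string; // X-Signature, assumed hex-encoded
  timestamp: string; // X-Timestamp, assumed unix epoch milliseconds
  nonce: string;     // X-Nonce
}

const WINDOW_MS = 5 * 60 * 1000;

// Assumed canonical string: "<timestamp>.<nonce>.<rawBody>".
function sign(secret: string, timestamp: string, nonce: string, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${nonce}.${rawBody}`).digest("hex");
}

type Verdict = "ok" | "stale_timestamp" | "bad_signature";

function verifyLaunch(secret: string, headers: LaunchHeaders, rawBody: string, nowMs: number): Verdict {
  const ts = Number(headers.timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowMs - ts) > WINDOW_MS) return "stale_timestamp";
  const expected = Buffer.from(sign(secret, headers.timestamp, headers.nonce, rawBody), "hex");
  const given = Buffer.from(headers.signature, "hex");
  // Constant-time comparison; the length check guards timingSafeEqual's equal-length requirement.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return "bad_signature";
  return "ok";
}

type NonceVerdict = "fresh" | "replay_same_body" | "conflict";

// Partner-scoped nonce check. "conflict" is the case the text maps to a 401.
function classifyNonce(seen: Map<string, string>, partnerId: string, nonce: string, rawBody: string): NonceVerdict {
  const key = `${partnerId}:${nonce}`;
  const digest = createHash("sha256").update(rawBody).digest("hex");
  const prior = seen.get(key);
  if (prior === undefined) {
    seen.set(key, digest);
    return "fresh";
  }
  return prior === digest ? "replay_same_body" : "conflict";
}
```

Signing the raw body rather than a re-serialized object matters here: any re-encoding on either side would change the bytes and break the signature.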

Deposit-auth decision pipeline

The hot path. What happens when a partner posts a deposit authorization. All reads are pre-computed snapshots. No external API calls on the path. p99 < 300ms.

[Diagram: deposit-auth decision pipeline]
Pre-computed reads, p99 < 300ms. Idempotent on the partner's key.
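The shape of that hot path can be sketched as a pure function over a pre-computed snapshot, wrapped in an idempotency cache. Only restriction_period_active is a decline code from the text; the other decline codes, the check order, the risk threshold, and every type and class name are hypothetical. A real implementation would keep idempotency records in durable storage, not in memory.

```typescript
// Pre-computed inputs; nothing here requires an external call at decision time.
interface DecisionSnapshot {
  remainingAllocationCents: number;
  restrictionActive: boolean;
  riskScore: number; // assumed scale: higher is safer
}

type Decision =
  | { outcome: "approved" }
  | { outcome: "declined"; decline_code: string };

const RISK_FLOOR = 40; // hypothetical threshold

// Pure and deterministic: the same amount and snapshot always give the same decision.
function decide(amountCents: number, snap: DecisionSnapshot): Decision {
  if (snap.restrictionActive) {
    return { outcome: "declined", decline_code: "restriction_period_active" };
  }
  if (snap.riskScore < RISK_FLOOR) {
    return { outcome: "declined", decline_code: "risk_threshold" }; // hypothetical code
  }
  if (amountCents > snap.remainingAllocationCents) {
    return { outcome: "declined", decline_code: "allocation_exceeded" }; // hypothetical code
  }
  return { outcome: "approved" };
}

// Replays of the same partner-supplied key return the first decision unchanged.
class IdempotentDecider {
  private readonly decisions = new Map<string, Decision>();

  authorize(partnerId: string, idempotencyKey: string, amountCents: number, snap: DecisionSnapshot): Decision {
    const cacheKey = `${partnerId}:${idempotencyKey}`;
    const prior = this.decisions.get(cacheKey);
    if (prior !== undefined) return prior;
    const decision = decide(amountCents, snap);
    this.decisions.set(cacheKey, decision);
    return decision;
  }
}
```

Keeping decide free of I/O is what makes a tight latency budget plausible: the only work on the path is a handful of comparisons against data that was computed ahead of time.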

Embed checkout: redirect mode

The same decision pipeline, without an iframe. Decision is returned via a 302 to the partner's return_url with query-string parameters.

[Diagram: embed checkout in redirect mode]
mode=embed (default) replaces the final 302 with a postMessage to the parent window. mode=redirect is shown here.
§18

Auth & multi-tenancy

Four identity surfaces share three auth mechanisms, each with its own cookie or header convention (staff admin reuses the consumer Auth0 JWT):

  • Consumer web. Auth0 JWT or sg_session cookie. PassportJS + custom session strategy; web app trades JWT for cookie at login.
  • Staff admin. Same Auth0 JWT + role claim. admin:* permissions enforced via @RequirePermissions decorator + PermissionsGuard.
  • Partner REST. sk_test_ / sk_live_ bearer. Partial-unique hash in DB; scopes enforced via @RequireScope.
  • Partner embed. HMAC on launch; sg_embed_session cookie after exchange. Path=/api/v1/embed scoped so it can't leak into the consumer surface.

Multi-tenancy is enforced at the service layer and by RLS on the tenant-scoped tables. Consumer queries scope to the authenticated user via @CurrentUser(). Partner queries scope to the authenticated partner via @CurrentPartner() plus the linked-users relationship. Admin queries see everything, but write actions are audit-logged.

§19

Worker process

BullMQ jobs run in a separate Node process (api/src/worker.ts), never in-process with the API. The reasoning is operational. A runaway job shouldn't take down the request path, and the worker scales independently of the API (different CPU and memory profile, different scaling triggers).

A BullMQ @Processor registered in a module loaded by main.ts is a guaranteed bug. The API process would also try to consume the queue, splitting jobs across two pools. Rule enforced by convention and CI lint.

Queue catalog (incomplete; extended per slice)
  • RiskEvaluation: recompute signals + score
  • CashFlowAnalysis: rebuild snapshot
  • AllocationRecalc: budget refresh
  • BankSync: Plaid transaction pull
  • Notifications: SES + Pinpoint delivery
  • WebhookDelivery: outbound to partners
  • WalletSimulatedBaas: sim callback fanout
  • WalletTransferWatchdog: stuck-transfer reconcile
  • WalletReconciliation: hourly drift check
  • WalletSettlement: on-demand + cron
§20

Codegen pipeline

The backend is the single source of truth for wire types. Two generators, one umbrella command (pnpm --filter keel-web codegen):

  • GraphQL. graphql-codegen reads the live SDL from /graphql and emits web/src/generated/graphql.ts. Enums emit as string-literal unions (enumsAsTypes: true) so the project's ban on TS enum applies to generated code too.
  • REST. openapi-typescript reads api/openapi.json (regenerated via pnpm --filter keel-api openapi:export) and emits web/src/generated/api.d.ts. Used by the embed checkout page and any future REST-driven web surfaces.

Web-side enum const-objects (web/src/lib/wire-enums.ts) wrap the generated literal unions with as const satisfies Record<string, GeneratedUnion> so backend drift becomes a compile error on the web.
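A minimal sketch of the wrapper pattern, assuming TypeScript 4.9 or later for the satisfies operator. GeneratedTransferStatus stands in for a union that graphql-codegen would emit; its members and all names here are hypothetical.

```typescript
// Stand-in for a generated string-literal union (enumsAsTypes: true). Members are hypothetical.
type GeneratedTransferStatus = "PENDING" | "CONFIRMED" | "FAILED";

// Const-object wrapper. If the backend renames or removes a value, the literal
// below no longer satisfies the generated union and the web build fails.
const TransferStatus = {
  Pending: "PENDING",
  Confirmed: "CONFIRMED",
  Failed: "FAILED",
} as const satisfies Record<string, GeneratedTransferStatus>;

type TransferStatus = (typeof TransferStatus)[keyof typeof TransferStatus];

// Runtime guard for values arriving as plain strings (for example, query-string parameters).
function isTransferStatus(value: string): value is TransferStatus {
  return Object.keys(TransferStatus).some(
    (key) => TransferStatus[key as keyof typeof TransferStatus] === value,
  );
}
```

Using as const before satisfies keeps each property's literal type, so TransferStatus.Confirmed is typed as "CONFIRMED" rather than widened to string.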

§21

Infrastructure & deployment

AWS, us-east-1, single account (keel profile, 606096000902). One region today. Multi-region is a Phase 4-or-later concern.

  • Compute. ECS (Fargate). Separate services per env; API + worker as sibling services.
  • DB. RDS Aurora PostgreSQL. Multi-AZ; reads + writes on the cluster endpoint.
  • Cache / queue. ElastiCache Redis. BullMQ + Redis-backed sessions.
  • Container registry. ECR (keel-api). Image-immutable; deploys reference a SHA tag.
  • Secrets. AWS Secrets Manager. DB password rotated automatically; HMAC + JWT signing keys live there.
  • DNS. Route53, stakeguard.info. Never .io, never keel.*.
  • TLS. ACM-issued certs on the ALB. Auto-renewed.
  • Migrations. TypeORM CLI via bastion + Session Manager. deploy.yml tunnels through the bastion before rolling the services.

Deploy pipeline: .github/workflows/deploy.yml runs migrations first (so the new schema is in place before the new code starts), then rolls the ECS services, then waits for them to stabilize. Rollback is by re-running deploy with the previous SHA tag. No separate rollback action.

§22

Observability

Three durable surfaces, one transient.

  • pino logs (transient). Structured JSON, shipped to CloudWatch. The Slice 9 AML hook's aml.transfer.observed entry is the first rule-engine-shaped log. More domain-specific log shapes follow the same field-catalog discipline.
  • audit_entry (durable). Every state change writes one. Insert-only by trigger. The regulator-facing audit trail.
  • BullMQ job records (durable until retention). Completed and failed jobs retained per-queue. Failure traces are first-stop debugging for cron paths.
  • RDS Performance Insights + CloudWatch metrics (durable). DB-level query timing, ECS service health, ElastiCache memory pressure.

Dashboards we'll build first, in priority order: settlement throughput by partner, deposit-auth p99 latency, BaaS callback success rate, AML observation volume, reservation churn (created / captured / expired ratios).

§23

Security model

Three threat surfaces, three sets of mitigations:

  • Partner ↔ API. HMAC for launches; sk_live_ / sk_test_ bearer for REST. ±5min timestamp window; partner-scoped 24h nonce; secrets encrypted at rest, shown plaintext once on rotation.
  • Consumer ↔ web. Auth0 JWT or sg_session cookie. Cookie httpOnly + Secure + SameSite=Strict; CSRF via origin checks; sessions revocable via Redis sentinel.
  • Embed ↔ partner iframe. sg_embed_session cookie, Path=/api/v1/embed. frame-ancestors CSP set per-partner from preflight; postMessage targets registered origins only, never '*'.
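The Partner ↔ API launch checks (HMAC signature, ±5min timestamp window, partner-scoped 24h nonce) could be sketched as below. This is an illustration under assumptions, not the production verifier. The signed string format, the hex encoding, the in-memory NonceStore, and all function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const WINDOW_MS = 5 * 60 * 1000; // ±5 min timestamp window
const NONCE_TTL_MS = 24 * 60 * 60 * 1000; // nonce unique per partner for 24 h

interface Launch {
  partnerId: string;
  timestampMs: number;
  nonce: string;
  signature: string; // hex HMAC-SHA256 (assumed encoding)
}

type LaunchResult = "ok" | "bad_signature" | "stale_timestamp" | "replayed_nonce";

// Assumed canonical string: partnerId.timestamp.nonce
function signLaunch(secret: string, partnerId: string, timestampMs: number, nonce: string): string {
  return createHmac("sha256", secret).update(`${partnerId}.${timestampMs}.${nonce}`).digest("hex");
}

// In-memory stand-in for a shared store such as Redis.
class NonceStore {
  private seen = new Map<string, number>(); // key -> expiry (epoch ms)

  // Returns false when this partner already used the nonce within the TTL.
  claim(partnerId: string, nonce: string, nowMs: number): boolean {
    const key = `${partnerId}:${nonce}`;
    const expiry = this.seen.get(key);
    if (expiry !== undefined && expiry > nowMs) return false;
    this.seen.set(key, nowMs + NONCE_TTL_MS);
    return true;
  }
}

function verifyLaunch(launch: Launch, secret: string, nonces: NonceStore, nowMs: number): LaunchResult {
  const expected = Buffer.from(
    signLaunch(secret, launch.partnerId, launch.timestampMs, launch.nonce),
    "hex",
  );
  const given = Buffer.from(launch.signature, "hex");
  // Signature first, so unauthenticated callers cannot consume nonces.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return "bad_signature";
  if (Math.abs(nowMs - launch.timestampMs) > WINDOW_MS) return "stale_timestamp";
  if (!nonces.claim(launch.partnerId, launch.nonce, nowMs)) return "replayed_nonce";
  return "ok";
}
```

The length check before timingSafeEqual matters because that function throws when the two buffers differ in length.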

Encryption. TLS 1.2+ in transit. RDS and ElastiCache encrypted at rest (KMS). HMAC signing secrets stored encrypted in the partner table. The column is decrypted only inside PartnerService.

Partner-key compromise playbook
A leaked partner API key is revoked via the admin surface. Revocation invalidates the DB row and emits a session-revocation sentinel, so any session issued before the revocation is destroyed on its next read. Partners rotate HMAC secrets the same way. The old secret stays valid for 30 days so the partner can roll forward without downtime.
§24

Data retention & PII

We hold:

  • PII. Name, DOB, address, last-4 SSN (KYC), phone, email. Stored on user plus KYC tables. Right-to-access export includes everything. Right-to-delete walks the dependency graph and replaces with placeholders where deletion would break audit integrity.
  • Financial. Bank account tokens (Plaid; never the account number directly), transaction history (90 days rolling per Plaid), cash-flow snapshot, allocation history.
  • Ledger. Every transfer and ledger_entry retained indefinitely. The regulatory minimum is 7 years for the BSA filings the AML rule engine eventually generates.
  • Audit. Insert-only, indefinite. Audit can't be purged without also breaking the regulator's ability to reconstruct a decision.

Retention enforcement and right-to-access fulfillment are Phase 4 (compliance/ module). v1 keeps everything. The consequence is a larger DB footprint, accepted as a tradeoff for not having to design retention before we have a regulator to satisfy.

§25

Performance budgets

Two latency budgets in the codebase today: a hard p99 limit on deposit auth and a p99 target on embed exchange. More to come as partners onboard.

  • Deposit auth p99. < 300ms (hard). Hot path; no external API calls; all reads pre-computed snapshots.
  • Embed exchange p99. < 500ms target. Cookie roundtrip + Redis session create.
  • Settlement run. No SLA; async cron. BaaS call + per-partner-payable lock; bounded by BaaS provider, not us.
  • Webhook delivery. Best-effort + retry-with-backoff. Signed; partners ack via 2xx; failures retry up to 24h.
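One way to picture the webhook retry-with-backoff policy is as a schedule of retry offsets bounded by the 24h horizon. The sketch below assumes exponential doubling with a per-delay cap. The base delay and cap are illustrative parameters, not values from the codebase, and jitter is left out for clarity.

```typescript
const RETRY_HORIZON_MS = 24 * 60 * 60 * 1000; // failures retry for up to 24 h

// Returns the offsets (ms after the first failure) at which retries fire.
// Delays double each attempt until they reach maxDelayMs, then stay flat.
function retrySchedule(
  baseMs: number,
  maxDelayMs: number,
  horizonMs: number = RETRY_HORIZON_MS,
): number[] {
  if (baseMs <= 0 || maxDelayMs <= 0) {
    throw new Error("delays must be positive");
  }
  const offsets: number[] = [];
  let delay = Math.min(baseMs, maxDelayMs);
  let elapsed = 0;
  // Stop once the next retry would land past the horizon.
  while (elapsed + delay <= horizonMs) {
    elapsed += delay;
    offsets.push(elapsed);
    delay = Math.min(delay * 2, maxDelayMs);
  }
  return offsets;
}
```

For example, retrySchedule(1000, 4000, 10000) yields retries at 1s, 3s, and 7s. The next one would land at 11s, past the horizon, so the delivery would move to dead-letter handling at that point.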
§26

Testing & non-negotiable rules

Tests run at three levels: unit (pure service logic), integration (real DB via the local PG instance, mocked external APIs), and a full bundle that runs everything in one run-in-band invocation. The bundle holds about 234 tests across wallet, deposit-auth, embed, partner-api, admin, and aml, and is green on every commit that touches these surfaces.

The non-negotiables. Rules the codebase enforces by lint, by CI, by DB constraint, or by review:

  • Const-object enums only. No TS enum. All values in shared/types/index.ts.
  • No bare strings for typed fields. Status, direction, type, actor, event-type, scope, decision-code. All routed through the const-object.
  • AppConfig over process.env. Two exceptions (data-source.ts, pino.config.ts) are documented.
  • audit_entry is insert-only. Three-layer immutability. Every state change writes one.
  • Transactional discipline. Business write + audit + event in one tx. Event emission post-commit, gated on ownedTransaction.
  • Worker process is separate from API. No @Processor in any module loaded by main.ts.
  • Repository or QueryBuilder for DB access. Raw .query() only in three documented cases (migrations, cross-entity reads QB can't express, measured hot paths).
  • Insert-only entities use .insert(), never .save(). save() can silently update; insert() can't.
  • No backward-compat shims pre-production. Renames delete the old name. No @deprecated re-exports.
  • Partner API is REST only. Consumer GraphQL is consumer only. The two surfaces are first-class citizens of the auth model.
Why these are non-negotiable
Each rule was written down because at least one near-miss happened during development: a partial migration, a silent enum drift, an audit row that didn't write because a save() was used instead of insert(). The rules came from real incidents, not theory.
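The save()-versus-insert() rule can be shown with an in-memory stand-in, no ORM required. InMemoryTable and the row shape are hypothetical. The only point is the difference between upsert and insert-only semantics.

```typescript
interface AuditRow {
  readonly id: string;
  readonly event: string;
}

// Hypothetical in-memory table, standing in for a repository.
class InMemoryTable {
  private rows = new Map<string, AuditRow>();

  // Upsert semantics: an existing id is overwritten without any error.
  save(row: AuditRow): void {
    this.rows.set(row.id, row);
  }

  // Insert-only semantics: a duplicate id is an error, never an update.
  insert(row: AuditRow): void {
    if (this.rows.has(row.id)) {
      throw new Error(`duplicate audit id ${row.id}`);
    }
    this.rows.set(row.id, row);
  }

  get(id: string): AuditRow | undefined {
    return this.rows.get(id);
  }
}
```

With save(), a second write under the same id replaces the first row and nothing fails, which is the near-miss the rule guards against. With insert(), the same mistake surfaces as an exception.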
III
Part III

Status & Roadmap

What is built, what is next, what is decided

§27

What’s built

Phase 1 is feature-complete. Phase 2 wallet shipped end-to-end across nine slices. Phase 3 partner surface is live across REST, embed, and both SDKs. Several personal-finance modules expanded the product beyond the original twelve-module Phase 1 plan.

Phase 1 — Protection engine

  • Twelve foundational modules: shared, audit, user, identity, bank-connection, cash-flow, allocation, risk-engine, intervention, notifications, admin, health.
  • Plaid IDV for KYC and Plaid Link for bank connection. Cooling-off, deposit-pause, and hard-stop interventions with sticky-by-design lift rules.
  • Email via AWS SES and SMS via AWS Pinpoint. Per-event consumer notification preferences stored per-user.
  • authorization module: RBAC for staff. Permissions stored on role_permission rows; admin endpoints gated by @RequirePermissions.

Phase 2 — Custodial wallet (Slices 1–9)

  • Ledger primitives plus double-entry posting (Slice 1).
  • Reservation primitives plus state machine (Slice 2).
  • Account provisioning, reconciliation cron, transfer watchdog (Slice 3).
  • Consumer top-up, withdraw, web wallet page (Slice 4).
  • Custodial deposit-auth integration plus captureDeposit (Slice 5).
  • Embed redirect mode and insufficient-balance recovery (Slice 6).
  • Settlement processor, BaaS callbacks, reversal scenario 3, partner-API, admin UI (Slice 7).
  • Winnings credit endpoint (Slice 8).
  • AML monitoring hook (Slice 9): every confirmed transfer logs a structured aml.transfer.observed entry for the future rule engine.
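A sketch of how a captured deposit might post under double-entry, using the account names from the glossary. It assumes the deposit first lands in full on partner payable and a separate reserve-hold transfer then moves the holdback to partner reserve. The basis-point holdback parameter, the rounding rule, and the function names are assumptions for illustration.

```typescript
type Account = "consumer_liability" | "partner_payable" | "partner_reserve";

interface Entry {
  account: Account;
  debitCents: number;
  creditCents: number;
}

// Builds the ledger entries for one captured deposit. Amounts are integer cents.
function captureEntries(amountCents: number, holdbackBps: number): Entry[] {
  if (!Number.isInteger(amountCents) || amountCents <= 0) {
    throw new Error("amount must be a positive integer number of cents");
  }
  if (!Number.isInteger(holdbackBps) || holdbackBps < 0 || holdbackBps > 10_000) {
    throw new Error("holdback must be between 0 and 10000 basis points");
  }
  // Round the holdback down so the reserve never exceeds the stated share.
  const reserveCents = Math.floor((amountCents * holdbackBps) / 10_000);
  const entries: Entry[] = [
    // Deposit: consumer liability -> partner payable.
    { account: "consumer_liability", debitCents: amountCents, creditCents: 0 },
    { account: "partner_payable", debitCents: 0, creditCents: amountCents },
  ];
  if (reserveCents > 0) {
    // Reserve hold: partner payable -> partner reserve.
    entries.push(
      { account: "partner_payable", debitCents: reserveCents, creditCents: 0 },
      { account: "partner_reserve", debitCents: 0, creditCents: reserveCents },
    );
  }
  return entries;
}

// Double-entry invariant: total debits equal total credits.
function isBalanced(entries: Entry[]): boolean {
  const debits = entries.reduce((sum, e) => sum + e.debitCents, 0);
  const credits = entries.reduce((sum, e) => sum + e.creditCents, 0);
  return debits === credits;
}

// Net credit (credits minus debits) posted to one account.
function netCredit(entries: Entry[], account: Account): number {
  return entries
    .filter((e) => e.account === account)
    .reduce((sum, e) => sum + e.creditCents - e.debitCents, 0);
}
```

A $100.00 deposit with a 5% holdback leaves a net credit of 9,500 cents on partner payable and 500 cents on partner reserve, with the posting balanced overall.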

Phase 3 — Partner surface

  • REST API. Deposit authorizations (the hot path), settlements, withdrawals, user-status / limits / restriction reads, partner-API key issuance and rotation, webhook endpoint CRUD plus the catalog.
  • Hosted embed. Checkout iframe, redirect-mode checkout, hosted consent at /connect, mobile WebView with custom-scheme returns, partner widget surface. HMAC-signed launches with ±5min timestamp and 24h nonce uniqueness per partner.
  • Outbound webhooks framework. Endpoint CRUD, signed delivery, retry-with-backoff, replay, dead-letter, partner-template rendering.

Phase 3 — Partner SDKs

  • @keel/sdk (Node). Typed client for every partner REST resource (account, deposit-auth, settlements, users, withdrawals, webhook endpoints / deliveries / catalog, launches). HMAC launch-signing helper. Inbound-webhook verifier with timing-safe compare and secret-rotation support. KeelError surfaces RFC 7807 problem details including retryAfterSeconds on rate limits. Dual ESM + CJS, zero runtime deps.
  • @keel/sdk-client (Browser). Keel.checkout.mount() renders a centered modal on desktop, full-bleed on mobile, with backdrop, ESC handling, body-scroll lock, and a polished shield-and-rails loading animation that holds for a minimum window so the loader feels like a UI moment, not a flicker. onReady, onUnavailable, and dismissOnResult options. Auto-falls back to redirect-mode when third-party cookies are blocked. Inline (partner-container) mode also available.
  • Brand theming. Both SDKs expose --keel-sdk-* CSS variables for partner overrides. The iframe content streams its resolved brand tokens (the user's palette preference) back to the SDK overlay via the sg.checkout.ready postMessage, so chrome and content stay in lockstep.
  • parlaypro-demo workspace exercises both SDKs end-to-end as the reference integration.
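The inbound-webhook verifier described for @keel/sdk combines a timing-safe compare with secret-rotation support. The sketch below shows the general technique only; it is not the SDK's actual API. The function name, the hex signature encoding, and signing the raw body directly are all assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: signs a raw body the way the sender is assumed to.
function signBody(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Accepts the signature if it matches ANY currently valid secret, so a
// partner can verify with both old and new secrets during a rotation window.
function verifyWebhook(
  rawBody: string,
  signatureHex: string,
  secrets: readonly string[],
): boolean {
  const given = Buffer.from(signatureHex, "hex");
  let matched = false;
  // Check every secret rather than returning early, so the time taken does
  // not reveal which secret (if any) matched.
  for (const secret of secrets) {
    const expected = createHmac("sha256", secret).update(rawBody).digest();
    if (given.length === expected.length && timingSafeEqual(given, expected)) {
      matched = true;
    }
  }
  return matched;
}
```

During a rotation the caller would pass both secrets, for example verifyWebhook(body, signature, [newSecret, oldSecret]), and drop the old one once the overlap period ends.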

Personal-finance expansion

  • forecast. Forward-looking cash projection (P50 + P10 daily balance trajectory plus a 30-day calendar of expected bills and income). Feeds the savings recommender and the LowBalanceProjected insight detector.
  • insights. Proactive non-risk pattern detection (“your grocery spending is up sharply,” “paycheck just landed”). Append-only events; separate from the risk pipeline.
  • savings. User-facing savings goals and advisory-mode savings recommendations. Currently recommender-only; auto-savings money-movement is deferred.
  • trusted-person. Consumer-named accountability contacts that receive an email alert when the system observes a serious risk moment (monthly limit reached, cooling-off starts, risk score elevates, hard-stop applied).
  • compliance. Regulatory reporting scaffolding: aggregate responsible-gaming reports, user data exports (CCPA/GLBA right-to-access), intervention history exports for regulators. Phase 4 surface area.

Embed + infrastructure hardening

  • CHIPS partitioned cookies. sg_embed_session ships SameSite=None; Secure; Partitioned in production so the cross-site iframe session survives Chrome's third-party cookie phaseout. Dev (HTTP) falls back to SameSite=Lax for the redirect-mode flow.
  • Rate limiting on embed launches. Per-partner buckets (600/min launch, 1200/min exchange, 3000/min preflight). 429 responses include Retry-After, X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset.
  • Brand picker. Per-user palette preference in localStorage applied via inline boot script. Flows through to the SDK overlay automatically.
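The per-partner buckets and 429 headers from the rate-limiting item can be sketched with a fixed-window counter. This assumes a fixed one-minute window and treats X-RateLimit-Reset as seconds until the window resets. The real guard may use a different algorithm or header semantics, and the class name is hypothetical.

```typescript
interface ThrottleDecision {
  allowed: boolean;
  status: 200 | 429;
  headers: Record<string, string>;
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windows = new Map<string, { startMs: number; count: number }>();

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Counts one request for the partner and reports whether it is allowed.
  check(partnerId: string, nowMs: number): ThrottleDecision {
    let w = this.windows.get(partnerId);
    if (w === undefined || nowMs - w.startMs >= this.windowMs) {
      w = { startMs: nowMs, count: 0 }; // start a fresh window
      this.windows.set(partnerId, w);
    }
    const resetSeconds = Math.ceil((w.startMs + this.windowMs - nowMs) / 1000);
    const allowed = w.count < this.limit;
    if (allowed) w.count += 1;
    const headers: Record<string, string> = {
      "X-RateLimit-Limit": String(this.limit),
      "X-RateLimit-Remaining": String(Math.max(0, this.limit - w.count)),
      "X-RateLimit-Reset": String(resetSeconds),
    };
    if (!allowed) headers["Retry-After"] = String(resetSeconds);
    return { allowed, status: allowed ? 200 : 429, headers };
  }
}
```

A launch limiter at the documented rate would be new FixedWindowLimiter(600). Each partner id gets its own counter, so one partner exhausting its bucket does not affect another.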

Engineering infrastructure

  • ~98 spec files in the api covering wallet, deposit-auth, embed, partner-api, admin, notifications, intervention, allocation, risk-engine, forecast, insights, trusted-person, savings, and webhooks.
  • 6 spec files across the SDKs: 17 tests on @keel/sdk, 55 on @keel/sdk-client.
  • Dual codegen pipeline: graphql-codegen for the consumer GraphQL surface plus openapi-typescript for the partner REST surface. Generated artifacts are checked in so deploys don't need a reachable api.
§28

What’s next

Ordered by what unblocks what, not by what feels exciting.

To go live

  • Real BaaS contract plus production client. SimulatedBaasClient covers every code path the production client will hit. The DI swap is one line once the contract lands. Provider shortlist is being evaluated separately.
  • Hot-path rate limiting. POST /deposit-authorizations, POST /withdrawals, and GET /settlements are currently uncapped. A runaway loop from any single partner could DoS the cluster. Apply the same PartnerThrottlerGuard pattern; target ~6,000/min on deposit-auth, 1,200/min on reads, with per-partner tier overrides on the partner row.
  • Per-partner settlement scheduler. SettlementProcessor accepts on-demand ProcessSettlement jobs but no cron driver fires them at each partner's settlement_timezone end-of-day.
  • Notification listeners for wallet events. WALLET_FUNDED, WALLET_WITHDRAWN, WINNINGS_CREDITED, DEPOSIT_FUNDED all emit; no consumer email or SMS listener is wired. Four @OnDomainEvent handlers plus templates against the established NotificationOrchestrator pattern.
  • Outbound webhook deliveries for wallet events. Webhooks framework is built; per-event mapping and templates for deposit.funded, settlement.confirmed, settlement.failed, winnings.credited are missing.

SDK polish + distribution

  • npm publish flow. Both @keel/sdk and @keel/sdk-client are "private": true today. CI release pipeline (changesets or semantic-release), npm-org provisioning under @keel, signed-tarball verification before partners can install.
  • Partner-facing integration docs. docs-site renders the OpenAPI reference automatically. What's missing is a sequenced “your first integration” guide that walks connect, then deposit-auth, then embed, then settlement, then winnings. parlaypro-demo's README is the working draft.
  • Idempotency cache exempt from rate limits. A retry hitting the idempotency cache should not count against the partner's bucket. Currently it does.
  • Embed jti binding on confirm. Today the cookie alone authorizes confirm. Sending the launch JWT's jti as a header (and verifying server-side that the session matches that jti) closes the multi-tab race where two checkouts for the same partner could complete each other's launch.

Local dev parity

  • Local HTTPS via mkcert. Dev currently can't set Secure cookies, so the embed iframe session is dropped by Chrome and the demo falls back to redirect mode every time. Production works fine over HTTPS; mkcert closes the dev/prod gap.
  • Safari iOS verification. Partitioned cookies on iOS Safari behave slightly differently from Chrome. Needs an end-to-end pass on a real device before any partner sees the demo.

Deferred per the plan

  • AML velocity and structuring rules (Phase 4 compliance). The Slice 9 log stream is forward-compatible; rules read the log-sourced data lake and produce aml.alert.raised rows.
  • Per-cohort reserve release schedule. v1 releases the full PARTNER_RESERVE balance once 2× hold-days has elapsed since the most recent deposit. A per-cohort schedule would proportionalize the release.
  • Compliance exports + right-to-access fulfillment. Scaffolding in the compliance module; report generation and the export pipeline are Phase 4.
  • Auto-savings money movement. The savings recommender produces recommendations only; routing funds to a savings sub-account is deferred until after the partner surface is contracted and stable.
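The v1 reserve-release rule and the settlement eligibility rule both reduce to comparisons against hold-days. A minimal sketch follows, assuming millisecond timestamps and an inclusive boundary (eligible at exactly N days). The function names and the boundary choice are assumptions.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// A confirmed deposit becomes settleable once it is at least holdDays old.
function isSettleable(confirmedAtMs: number, holdDays: number, nowMs: number): boolean {
  return nowMs - confirmedAtMs >= holdDays * DAY_MS;
}

// v1 rule: release the full reserve balance once 2x hold-days has elapsed
// since the most recent deposit; otherwise release nothing.
function releasableReserveCents(
  reserveBalanceCents: number,
  mostRecentDepositAtMs: number,
  holdDays: number,
  nowMs: number,
): number {
  const elapsedMs = nowMs - mostRecentDepositAtMs;
  return elapsedMs >= 2 * holdDays * DAY_MS ? reserveBalanceCents : 0;
}
```

Because the clock restarts on every new deposit, an active consumer base keeps the whole reserve locked under this rule. A per-cohort schedule would release each deposit's share on its own timer instead.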

Operations

  • Operator runbooks under /knowledge/runbooks/*. Every failure mode the watchdog and reconciliation cron handle, plus the steps for unsticking. First knowledge-base entry to land after this Overview.
  • Production dashboards. Settlement throughput by partner, deposit-auth p99 latency, BaaS callback success rate, AML observation volume, reservation churn. The Slice 9 AML log stream is the data source for several of these.
§29

Open decisions

Decisions we have not made yet. Each is a place where the answer will be load-bearing for at least one downstream slice. The deferral is intentional but should not become indefinite.

  • BaaS provider selection. Gated on legal sign-off plus a contract review, not an engineering question.
  • Settlement cadence per partner. Daily, hourly, or instant exists as a const-object enum, but no partner has been onboarded yet. The first real partner integration will pin down what “instant” actually means in terms of batching granularity.
  • SDK distribution. npm scope (@keel may not be available), versioning strategy (semver with deprecation policy), release cadence (per-PR vs. weekly cut). Decision needed before the first partner integration ships.
  • Mobile-native SDKs. The mobile launch surface today is a WebView wrapped by the partner's native app. Question is whether to ship first-party Swift / Kotlin / React Native SDKs or keep the WebView story long-term. Triggering condition: first partner whose iOS / Android team asks.
  • Hot-path rate-limit tiers. Default 6,000/min on deposit-auth (100 per second) is fine for small partners; large sportsbooks at peak do 100+ decisions per second and need a higher ceiling. Per-partner overrides on the partner row would close this. Triggering condition: first high-volume partner contract.
  • Notification copy. Listener wiring is straightforward; the copy (tone, what to send when, opt-in vs. opt-out semantics) is a brand decision that hasn't happened yet.
  • Internationalization. Currency is hardcoded to USD in the schema and services. Ship without it; revisit when a partner outside the US asks.
  • Feature flags. No flag system today. Slice-by-slice shipping has worked without one. Will add when there are two partners with diverging needs.
  • Internal-naming rename pass. Gambling-coded internal names linger from earlier slices: creditWinnings, WinningsCredited domain event, WalletWinningsCredited audit event. The partner wire and consumer copy are already vertical-neutral. This is internal-only. Rename to creditPayout and PayoutCredited in one pass. Triggering condition: first non-sportsbook partner integration.
The deferred-decision discipline
Open decisions deferred without dates become wishlist items. Every entry here should either have a date or have a triggering condition (e.g. “when partner #1 signs,” “when state X is cleared”). If neither applies, it's a feature we've decided not to build, not a decision we're still making.
IV
Part IV

Reference

Quick lookups

§31

Glossary

Project-specific vocabulary. Standard fintech terms are not redefined here; unfamiliar ones are.

Signal mode
Partner integration shape where Keel returns an approval decision but does not take custody of funds. Lower-commitment integration; partner runs their own pay-in rail.
Custodial mode
Partner integration shape where deposits route through Keel. Consumer funds rest in a consumer liability account between deposits; we settle to the partner on a configurable cadence.
Allocation
A consumer’s calculated monthly and weekly discretionary-spending budget across linked partner platforms. Derived from bank-linked cash flow; recalculated on a cadence. Users cannot directly set their own; they request a review.
Intervention
A one-way restriction applied to a consumer’s ability to deposit. Examples: cooling-off, deposit-pause, hard-stop. Only liftable by its own expiry timer or an operator override with reason recorded.
Risk score
A deterministic 0–100 score per consumer, derived from concrete signals (allocation utilization, deposit velocity, intervention history). Deterministic by design: no ML, every score traceable to the signals that produced it.
Reservation
A 15-minute hold on a consumer’s available balance between deposit approval and capture. Three terminal states beyond Active: Captured, Released, Expired. Each terminal state is one-way.
Partner payable
Ledger account holding deposits collected for a partner, pre-settlement. One per partner and currency. Drained by settlement; debited by fees and reserve hold.
Partner reserve
Holdback against ACH returns. Released back to partner payable after 2× hold-days has elapsed since the most recent confirmed deposit.
Consumer liability
Ledger account holding a consumer’s funds. One per user and currency. Credited by funding and payouts; debited by deposits and cash-outs.
Omnibus cash
Asset account at the BaaS provider. Where real cash sits. Debited by inflows; credited by outflows and settlements.
Settlement
Aggregate of confirmed deposit transfers rolled into a single payout to the partner. Created when the settlement processor selects an eligible cohort (past hold-days, unsettled).
Hold-days
The ACH return window. Per-partner config. Deposits aren’t settleable until older than this; reserves aren’t releasable until 2× this.
Holdback
A percentage of each deposit routed to partner reserve instead of partner payable at capture time. Mitigates ACH-return exposure after the partner has been paid.
BaaS
Banking-as-a-Service provider. Where real money moves. The production provider is under contract review; the simulator covers every code path until the swap lands.
Embed
Hosted partner-iframed UI surface. Three flavors: Checkout (deposit confirm), Widget (partner ops dashboards), Mobile (native WebView).
Connect
Hosted consent flow where a consumer links their Keel account to a partner. HMAC-launched by the partner; results in a stored link when consent is given.
Neutral framing
Convention for partner-facing wire shapes: vertical-neutral labels only. Avoid gambling-coded terms like “cooling_off” or “self_exclusion” and trading-coded terms like “PDT_flagged” in decline codes; emit universal labels.
Transfer type
Nine values: deposit, withdrawal, funding, return, fee, reserve hold, reserve release, settlement, reversal. Each has a strict source and destination account-type contract.
Transfer status
Five values: initiated → provisional → confirmed; initiated or provisional → failed. Book-only types (deposit, withdrawal, fee, reserve hold and release) skip provisional.
Audit entry
Append-only log row written on every state change. Insert-only at three layers: TypeScript readonly, service-layer .insert()-only, and a database trigger that rejects UPDATE / DELETE / TRUNCATE.
Keel — A safety net for the way you spend today