Hidden-counterparty backfill — verification report

PR #24456 · feat(cash): admin backfill to supersede gasless-fee activities
Branch · lisherwin/backfill-activity-hide-counterparties
Linear · CASH-3669
Tests · 55/55 passing
Sample · 300 prod rows, 202 unique users, RPC-verified
99% · Superseded by parent · 297 / 300 sampled rows
Paired to the user's parent Swap or Transfer in the ±5s window. Writes supersededBy and a history row, then republishes the parent envelope with the fee in supersededIds.

1% · Left untouched · 3 / 300 sampled rows
No parent activity in the ±5s window. The row stays as-is in the DB; the read-time filter (#24373) still hides it from the Cash service's own API.

Adds POST /cash/v1/admin/backfill/hidden-counterparty (admin role, 202), scoped by userIds and/or an occurredAt window. The backfill operates on a set of fee-collector addresses = FIREBLOCKS_FEE_PAYER_ADDRESS ∪ the admin-managed dynamic list (GET /cash/v1/admin/hidden-counterparty-addresses). Every matching row is paired to a parent Swap or Transfer in a ±5s window and superseded. Closes the gap behind William's read-time filter (#24373) for historical babykiss22 fee rows that pre-date his live-ingest fixes (eager stub #24244 / correlator #24317). Idempotent end-to-end: scope hashed to a stable jobId, setSupersededByIfNull won't clobber a value the live path set.

Correlation

For each babykiss22 fee row, find the user's most recent non-superseded parent activity (type ∈ {swap, transfer}) with occurredAt in [fee.occurredAt − 5s, fee.occurredAt + 5s], excluding rows whose counterparty is itself babykiss22. If one is found, write supersededBy = parent.id, add a history audit row, and publish the parent envelope with the fee in supersededIds. Otherwise leave the row untouched: there is no tombstone fallback, and the read-time filter (#24373) keeps it hidden.
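The pairing rule above can be sketched as a pure function over in-memory rows. This is an illustration only: the `Activity` shape, the `findParentForFee` name, and the `feeCollectors` parameter are assumptions made for the example, not the repository's actual types or query.

```typescript
// Hypothetical in-memory row shape; the real rows live in cash_activity.
interface Activity {
  id: string;
  userId: string;
  type: string; // "swap", "transfer", ...
  counterparty: string;
  occurredAt: Date;
  supersededBy: string | null;
}

const WINDOW_MS = 5000; // symmetric ±5s tolerance from the report

// Returns the most recent eligible parent inside the window, or null.
function findParentForFee(
  fee: Activity,
  candidates: Activity[],
  feeCollectors: string[],
): Activity | null {
  const feeMs = fee.occurredAt.getTime();
  let best: Activity | null = null;
  for (const row of candidates) {
    if (row.id === fee.id || row.userId !== fee.userId) continue;
    if (row.type !== "swap" && row.type !== "transfer") continue;
    if (row.supersededBy !== null) continue;
    // A fee row must never pair to another fee-collector row.
    if (feeCollectors.indexOf(row.counterparty) !== -1) continue;
    // Symmetric check: the parent may land before or after the fee.
    if (Math.abs(row.occurredAt.getTime() - feeMs) > WINDOW_MS) continue;
    if (best === null || row.occurredAt.getTime() > best.occurredAt.getTime()) {
      best = row;
    }
  }
  return best;
}
```

Because the comparison uses the absolute difference, a swap recorded 420 ms after a fee whose timestamp was rounded to the whole second still pairs, which is the case a bounded-below window would miss.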

| Decision | What we do | Why |
| --- | --- | --- |
| Window is symmetric ±5s, not bounded-below | Allow parent occurredAt to land slightly after fee's | Fee row's occurredAt from SimpleHash transfer feed is rounded to whole seconds; parent has millisecond precision. Even on the same Solana block, the DB-recorded values can be ordered either way at sub-second resolution. Bounded-below missed 78% of real same-block siblings (raised the rate from 21% → 99% when fixed). |
| Tolerance = 5s (not wider) | Tight window per William's review | Same-block siblings settle within 1s. Sensitivity sweep on the 300-row sample: ±5s = 99%, ±10s = 99%, ±30s = 99%. The rate plateaus immediately because the timestamp drift is sub-second in practice. Wider windows would add cross-pair ambiguity without coverage. |
| Parent type = swap OR transfer | Both can trigger a gasless fee | The cash service charges a babykiss22 fee for gasless swaps and gasless transfers. Excludes rows whose counterparty IS babykiss22 so a fee row can't pair to another fee row. |
| Race-safe write | UPDATE … WHERE supersededBy IS NULL | Live-ingest path (William's eager stub) might have already set supersededBy. The NULL guard makes the backfill a no-op in that case, and re-runs of the backfill are idempotent. |
| Address set | FIREBLOCKS_FEE_PAYER_ADDRESS ∪ admin endpoint list | Every entry in the admin-managed list is treated as a gasless fee collector and goes through the same supersede path. There is no tombstone fallback: rows with no parent in the window are left as-is (and stay hidden by the read-time filter on the Cash service API). |
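The race-safe write row can be illustrated with an in-memory stand-in for the NULL-guarded UPDATE. The function name `setSupersededByIfNull` appears in this report, but the signature and the `FeeRow` shape below are assumptions for the sketch; the real guard is enforced in SQL, not in application memory.

```typescript
// Hypothetical in-memory stand-in for a guarded statement of the form
//   UPDATE ... SET supersededBy = <parent> WHERE ... AND supersededBy IS NULL
interface FeeRow {
  id: string;
  supersededBy: string | null;
}

// Returns true only when this call performed the write.
function setSupersededByIfNull(row: FeeRow, parentId: string): boolean {
  if (row.supersededBy !== null) {
    // Live ingest or an earlier backfill run already won; do not clobber.
    return false;
  }
  row.supersededBy = parentId;
  return true;
}
```

A caller would record history and republish only when the function returns true, so a re-run over the same rows writes nothing and publishes nothing new.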

Architecture

POST /cash/v1/admin/backfill/hidden-counterparty
  body: { userIds?, startDate?, endDate? }   (≥1 of userIds/startDate)
               │
               ▼
AdminService validates scope, sha256(scope) → jobId
               │
               ▼
┌────────────────────────────────────────────────────────────┐
│ {hidden-counterparty-backfill-coordinator} (BullMQ)        │
│   snapshot hidden set := baseline ∪ dynamic                │
│   if empty → return early                                  │
│   else keyset-paginate cash_activity by id (500/page)      │
│   checkpoint afterId via job.updateData                    │
└──────────────┬─────────────────────────────────────────────┘
               │ one batch per page
               ▼
┌────────────────────────────────────────────────────────────┐
│ {hidden-counterparty-backfill-batch} (BullMQ)              │
│                                                            │
│ for each row in batch:                                     │
│   skip if outside hidden set / cutoff / already-superseded │
│                                                            │
│   if counterparty == babykiss22:                           │
│     parent := findRecentParentForGaslessFee(               │
│         userId, fee.occurredAt ± 5s)                       │
│     if parent && setSupersededByIfNull(fee, parent):       │
│       recordSupersededHistory(fee, parent)                 │
│       publish parent envelope w/ supersededIds=[fee.id]    │
│     else:                                                  │
│       publish deleted-true envelope                        │
│                                                            │
│   else (Bridge baseline, etc.):                            │
│     publish deleted-true envelope                          │
└──────────────┬─────────────────────────────────────────────┘
               │ Kafka, partition key = envelope.id
               ▼
Activity Service feed (dedups by envelope id)
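The coordinator's scan loop (keyset pagination by id with a checkpoint per page) could look roughly like the sketch below. It is synchronous and uses numeric ids purely to keep the example small; `scan`, `FetchPage`, and the callback parameters are hypothetical names, and the real worker would query the database and persist the checkpoint through job.updateData.

```typescript
interface Row {
  id: number;
}

// Checkpoint fields mirror the ones named in this report.
interface Checkpoint {
  afterId: number;
  scanned: number;
  nextBatchIndex: number;
}

// Hypothetical page fetcher: rows with id > afterId, ascending, at most `limit`.
type FetchPage = (afterId: number, limit: number) => Row[];

function scan(
  fetchPage: FetchPage,
  pageSize: number,
  start: Checkpoint,
  onBatch: (rows: Row[], batchIndex: number) => void,
  saveCheckpoint: (cp: Checkpoint) => void,
): Checkpoint {
  let cp = start;
  for (;;) {
    const rows = fetchPage(cp.afterId, pageSize);
    if (rows.length === 0) return cp; // scan complete
    onBatch(rows, cp.nextBatchIndex); // one batch job per page
    cp = {
      afterId: rows[rows.length - 1].id,
      scanned: cp.scanned + rows.length,
      nextBatchIndex: cp.nextBatchIndex + 1,
    };
    saveCheckpoint(cp); // a retry can resume from the last saved page
  }
}
```

Keying each page on the last seen id, rather than an offset, means a resumed scan neither skips nor repeats rows that were already handed to a batch.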

Test inventory

| Spec | Tests | Covers |
| --- | --- | --- |
| hidden-counterparty-backfill-batch.worker.spec.ts | 12 | Supersede via Swap parent, via Transfer parent, history audit, race with live ingest, no-parent → tombstone, skip-already-superseded, EVM normalize, cutoff guard, publish error |
| hidden-counterparty-backfill-coordinator.worker.spec.ts | 8 | Baseline always present, keyset pagination, checkpoint resume, date-range scope, malformed date, empty-snapshot short-circuit, scan-failure propagation |
| hidden-counterparty-backfill.parity.spec.ts | 7 | Backfill ↔ read-time filter parity for William's scenarios |
| admin.service.hidden-counterparty-backfill.spec.ts | 8 | Scope validation, BadRequest on empty body, date validation, queue wiring |
| queue.service.hidden-counterparty.spec.ts | 6 | sha256 jobId collision-safety, queue stats inclusion |
| hidden-counterparty-backfill.dto.spec.ts | 6 | Cross-field scope guard, userIds/startDate/endDate validation |
| event-consumer.module.spec.ts | 8 | Consumer-types include the two new queues |

Risk & mitigation

| Risk | Mitigation |
| --- | --- |
| Backfill clobbers supersededBy already set by live ingest | UPDATE … WHERE supersededBy IS NULL, which is race-safe at the SQL level. |
| Wrong-pair attribution within a 5s window | Same-block siblings settle in <1s; ±5s gives almost no ambiguity in practice (0 multi-candidate matches in the 300-row sample at 5s). |
| Duplicate admin enqueue | jobId = sha256(sortedUserIds + start + end).slice(0,16). Identical scope is a BullMQ no-op. |
| Coordinator crash mid-scan | job.updateData(afterId, scanned, nextBatchIndex) per page; retry resumes from checkpoint. |
| Empty hidden snapshot triggers full scan | Coordinator short-circuits with a warn log before any scan. |
| History write fails after supersededBy is set | Non-fatal warn; supersession stays in the DB (better than rolling back a successful supersede). |
| Activity Service double-counts on re-publish | Dedups by envelope id; supersede path re-publishes the parent envelope (idempotent upsert). |
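The duplicate-enqueue mitigation depends on the jobId being a stable function of the scope. A minimal sketch of that idea follows, using Node's built-in crypto module. The `backfillJobId` name, the `BackfillScope` shape, and the "|" and "," separators are assumptions; the report only states that sorted user ids, start, and end are hashed with sha256 and truncated to 16 characters.

```typescript
import { createHash } from "node:crypto";

// Hypothetical scope shape matching the request body fields.
interface BackfillScope {
  userIds?: string[];
  startDate?: string;
  endDate?: string;
}

// Derives a stable id: the same scope always yields the same 16 hex chars.
function backfillJobId(scope: BackfillScope): string {
  // Sort a copy so the caller's array order does not change the id.
  const sortedUserIds = (scope.userIds ?? []).slice().sort();
  // Assumed canonical form; the separators are illustrative.
  const canonical = [
    sortedUserIds.join(","),
    scope.startDate ?? "",
    scope.endDate ?? "",
  ].join("|");
  return createHash("sha256").update(canonical).digest("hex").slice(0, 16);
}
```

With an id derived this way, submitting the same scope twice produces the same jobId, which is what lets the queue treat the second enqueue as a no-op.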

Operations

POST /cash/v1/admin/backfill/hidden-counterparty
Authorization: Bearer <admin token>
Content-Type: application/json

# By user ids
{ "userIds": ["3fa85f64-5717-4562-b3fc-2c963f66afa6"] }

# By occurredAt window
{ "startDate": "2026-01-01T00:00:00Z",
  "endDate":   "2026-04-15T23:59:59Z" }

# At least one of userIds (non-empty) or startDate is required.

Returns 202 Accepted with the BullMQ jobId. Progress streams to logs; watch queue state at GET /cash/v1/admin/bullmq-cache/queues.
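The scope rule for the request body (at least one of a non-empty userIds or a startDate, with parseable dates) can be sketched as a plain validator. This is not the project's DTO: `validateScope` and `BackfillBody` are hypothetical names, and the real service may apply additional checks.

```typescript
// Hypothetical body shape matching the documented request fields.
interface BackfillBody {
  userIds?: string[];
  startDate?: string;
  endDate?: string;
}

// Returns a list of problems; an empty list means the scope is acceptable.
function validateScope(body: BackfillBody): string[] {
  const errors: string[] = [];
  const users = body.userIds ?? [];
  const hasStart = typeof body.startDate === "string" && body.startDate.length > 0;
  if (users.length === 0 && !hasStart) {
    errors.push("at least one of userIds (non-empty) or startDate is required");
  }
  // Reject dates that Date.parse cannot read.
  const dates: Array<[string, string | undefined]> = [
    ["startDate", body.startDate],
    ["endDate", body.endDate],
  ];
  for (const [name, value] of dates) {
    if (value !== undefined && isNaN(Date.parse(value))) {
      errors.push(name + " is not a parseable date");
    }
  }
  return errors;
}
```

Both example bodies shown above pass this check, while an empty body or `{ "userIds": [] }` is rejected before anything is enqueued.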