Analytics DB — Duplication Incident Report
Date: April 15, 2026
Status: Root cause confirmed; safeguards implemented in the codebase; production investigation and duplicate-day audit documented below. No analytics DB reset or data repair has been run in production yet — that remains after merge/deploy and operational sign-off.
1. Summary
32 of 122 projects (26.9% of the 119 checked credentials — see Section 3.3) have duplicated checkin_submissions records in their analytics databases. The remaining 87 are clean, and 3 have no analytics DB. The worst-affected project, Project 58, had 634,210 rows in analytics against only 16,596 in source — 617,619 duplicate rows, a 38.2x inflation.
The duplication was caused by repeated analytics sync operations (admin batch actions and scheduled background jobs) writing to analytics tables that were missing a unique constraint on the id column. Without this constraint, the intended upsert logic silently degraded into repeated inserts.
The duplication traces back to a code refactor on April 15, 2025 (commit cc4c6bf) that changed the populate logic from a safe check-then-insert pattern to Sequel's batch import with on_duplicate_key_update: { conflict_target: [:id] } — without adding the unique constraint this approach requires. Projects that had active scheduled jobs during or shortly after this transition accumulated duplicates before the last_synced_at delta mechanism stabilized. See Section 2.5 for full timeline.
2. Root Cause Analysis
2.1 How the analytics pipeline works
Pepcore Source DB Analytics DB (per project)
┌──────────────────────┐ ┌──────────────────────────┐
│ check_in_submission │ ──sync──▶ │ checkin_submissions │
│ (PK on id, unique) │ │ (no PK, no unique index) │
└──────────────────────┘ └──────────────────────────┘
- Each project has an analytics database (e.g. `analytics_44`) with a `checkin_submissions` table.
- Data is copied from the Pepcore source (`check_in_submission` table) into analytics via a sync/populate process.
- The populate code uses Sequel's `import` with `on_duplicate_key_update: { conflict_target: [:id] }` — an upsert that requires a unique constraint on `id` to detect conflicts.
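The failure mode can be shown without a database at all. The sketch below is a plain-Ruby analogy (names are illustrative, not the app's code): without a unique key there is nothing to conflict against, so each "sync" appends every fetched row again, while a keyed upsert stays idempotent.

```ruby
# Broken path: no uniqueness, so every sync appends all fetched rows again.
def broken_sync(analytics_rows, source_rows)
  source_rows.each { |row| analytics_rows << row.dup }
  analytics_rows
end

# Intended path: a unique key on :id (modeled here as a Hash key) makes
# repeat runs update in place instead of inserting.
def upsert_sync(analytics_by_id, source_rows)
  source_rows.each { |row| analytics_by_id[row[:id]] = row.dup }
  analytics_by_id
end

source = [{ id: 1, mood: "good" }, { id: 2, mood: "ok" }]

broken = []
3.times { broken_sync(broken, source) }   # three populate runs
broken.size # => 6 rows for 2 source records

safe = {}
3.times { upsert_sync(safe, source) }
safe.size   # => 2
```

Three runs triple the broken table's row count; the keyed version stays at the source cardinality no matter how often it runs.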
2.2 The defect
When analytics tables were created via init_analytics_db, the id column was added as a plain nullable integer — no primary key, no unique index, no constraints at all.
File: app/models/db_credential.rb — create_checkin_submissions_attributes
connection.alter_table :checkin_submissions do
add_column :id, :integer unless attributes.include?(:id)
# ... other columns ...
end
# ← No unique index or PK was created here
2.3 The trigger
Without a unique constraint, conflict_target: [:id] has nothing to conflict against. Every populate run inserted all fetched rows as new records instead of updating existing ones.
Four code paths trigger populate:
| Trigger | File | Method |
|---|---|---|
| Admin batch action: "Populate Analytics Data" | app/admin/projects.rb | populate_analytics_data |
| Admin batch action: "Kickoff Data Fetching" | app/admin/projects.rb | kickoff_data_fetching |
| Recurring Delayed Job (hourly) | app/models/db_credential.rb | fetch_data_and_schedule_next_fetch |
| ProjectGroup Delayed Job (hourly) | app/models/project_group.rb | fetch_data_and_schedule_next_fetch |
Each execution added duplicate copies of the same source records. Over days/weeks, this compounded into massive inflation.
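A back-of-envelope check using Project 58's figures from Section 3.3 shows how linear the compounding is: each full-fetch run adds one more copy of the entire source table, so the observed inflation ratio approximates the number of full-fetch runs.

```ruby
# Project 58 (Section 3.3): 16,596 source rows, observed ratio ~38.2x.
source_rows = 16_596
full_runs   = 38                 # ≈ the observed inflation ratio
source_rows * full_runs          # => 630_648, same order as the 634,210 analytics rows
```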
2.4 Why not all projects were affected equally
Production audit confirmed: both affected and clean projects have last_synced_at set to today (April 16, 2026) and active scheduled jobs. The duplication is historical, not ongoing. The hourly jobs now fetch only delta records (usually empty), so no new duplicates are being created — but old duplicates remain.
Latest production audit (Section 3.3): 119 DbCredential rows checked — 32 duplicated, 87 clean, 0 errors. (122 directory projects minus 3 without a credential / analytics link in that run.)
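The delta behaviour described above can be sketched in plain Ruby (illustrative names only; the real logic lives in fetch_data_and_schedule_next_fetch): a nil last_synced_at means a full fetch, while a stabilized timestamp narrows each run to recent rows.

```ruby
# Hedged sketch of the last_synced_at delta mechanism (an assumption about
# its shape, not a copy of the production method).
def delta_fetch(source_rows, last_synced_at)
  return source_rows if last_synced_at.nil?            # first run: full fetch
  source_rows.select { |r| r[:updated_at] > last_synced_at }
end

rows = [
  { id: 1, updated_at: Time.utc(2026, 3, 1) },
  { id: 2, updated_at: Time.utc(2026, 4, 16) },
]

delta_fetch(rows, nil).size                   # => 2 (nil → everything refetched)
delta_fetch(rows, Time.utc(2026, 4, 1)).size  # => 1 (stabilized: small delta)
```

This is why duplication is historical: once last_synced_at is set, hourly runs fetch only the delta, which is usually empty.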
2.5 Root cause timeline (from git history)
| Date | Event | Impact |
|---|---|---|
| Mar 14, 2025 | Analytics DB feature created (65e4512). Populate used safe check-then-insert: if existing_record.present? → update, else → insert. | No unique constraint needed. No duplicates possible. |
| Mar 17, 2025 | Most DB credentials created. Jobs kicked off for early projects. | Safe pattern still in use. |
| Apr 15, 2025 | Refactor commit cc4c6bf: populate switched to batch import with on_duplicate_key_update: { conflict_target: [:id] }. last_synced_at column added (defaults to nil for all existing credentials). | Breaking change: batch upsert requires a unique constraint on id, which was never added. |
| Apr 15, 2025 onward | First job runs after deploy: last_synced_at = nil → fetches ALL source records → import runs against table with no unique constraint → duplicates inserted. | 32 projects with active jobs accumulated duplicates during this window. |
| Eventually | last_synced_at stabilized for most projects → subsequent runs fetch empty delta → no new duplicates. | Duplication stopped growing, but existing duplicates remained. |
Why only 32 of 82 projects with jobs were affected:
The original check-then-insert code (if existing_record.present?) was safe without a unique constraint. The April 15 refactor removed this safety. Projects that had active hourly jobs during the transition window re-inserted all records before last_synced_at could stabilize. Projects whose first post-refactor populate completed cleanly (setting last_synced_at) only ever fetched empty deltas afterward, avoiding duplication.
3. Evidence (sampled April 15, 2026)
3.1 Project 44 (originally reported)
| Metric | Value |
|---|---|
| Source DB (check_in_submission) rows | 3,154 (all unique) |
| Analytics DB (checkin_submissions) total rows | 55,513 |
| Analytics DB distinct IDs | 3,308 |
| Duplicate rows | 52,205 |
| Inflation ratio | 16.8x |
| Duplicate window | March 24 – April 2, 2026 |
| Data after April 2 | Clean (dup factor 1.0) |
| Last data timestamp | April 10 (no data since April 11) |
| Top duplicated ID (447565) | 1,189 copies |
Note: The duplicate-heavy calendar days for Project 44 are 2026-03-24 through 2026-04-02 (8 days). That is not the same as the April 15, 2025 engineering date when the unsafe batch upsert shipped — see Section 3.4 for all projects’ duplicate-day ranges.
3.2 Random 10-project sample
| Project ID | Project Name | Source Count | Analytics Total | Distinct IDs | Duplicate Rows | Ratio |
|---|---|---|---|---|---|---|
| 86 | Team ML08x | 2,388 | 171,357 | 2,388 | 168,969 | 71.8x |
| 79 | Team FRA 134/135 | 11,435 | 442,778 | 11,430 | 431,348 | 38.7x |
| 72 | Team Newport | 9,055 | 259,518 | 9,050 | 250,468 | 28.7x |
| 51 | Mace Pure Dub01 | 3,702 | 52,435 | 3,700 | 48,735 | 14.2x |
| 113 | Yondr Ops DC FRA1 | 704 | 656 | 656 | 0 | 0.93x |
| 111 | Equinix/DPR DC17 Virginia | 1,055 | 971 | 971 | 0 | 0.92x |
| 100 | Project Advance | 1,386 | 1,274 | 1,274 | 0 | 0.92x |
| 112 | Yondr Ops DC NOV1 | 943 | 803 | 803 | 0 | 0.85x |
| 110 | STO QTS DFW2 DC4 | 74 | 53 | 53 | 0 | 0.72x |
| 122 | Yondr Collen Building C | 0 | — | — | — | N/A |
4 out of 9 projects with analytics DBs had severe duplication (initial sample).
3.3 Full production audit (counts)
| Metric | Value |
|---|---|
| Total projects (directory) | 122 |
| DbCredential records checked (prod console) | 119 |
| Clean (total_rows == distinct_ids) | 87 |
| Duplicated | 32 (26.9% of checked credentials) |
| No checkin_submissions table / other | 0 (in this run) |
| Errors | 0 |
Three projects have no DbCredential in this run (122 − 119 = 3), e.g. not yet linked to an analytics DB.
Top 10 most affected projects:
| Project ID | Project Name | Source Count | Analytics Total | Duplicate Rows | Ratio |
|---|---|---|---|---|---|
| 58 | Team GBL6 | 16,596 | 634,210 | 617,619 | 38.2x |
| 25 | SigmaRoc | 20,601 | 483,924 | 460,713 | 23.5x |
| 79 | Team FRA 134/135 | 11,435 | 442,778 | 431,348 | 38.7x |
| 74 | Team Madrid | 14,930 | 436,722 | 421,799 | 29.3x |
| 52 | SAM3A Team | 8,495 | 396,996 | 388,502 | 46.7x |
| 73 | Team Ersbo | 23,743 | 406,843 | 383,110 | 17.1x |
| 50 | Team Powergate | 44,615 | 417,783 | 373,172 | 9.4x |
| 43 | Team PA14 | 3,891 | 329,637 | 325,748 | 84.7x |
| 71 | IPS Rahway MACS | 8,890 | 296,270 | 287,382 | 33.3x |
| 72 | Team Newport | 9,057 | 259,521 | 250,468 | 28.7x |
All 32 duplicated projects (sorted by duplicate rows): see Section 3.4 for per-project duplicate-day windows.
3.4 Per-project duplicate-day windows (production)
Method: For each project’s analytics DB, days where COUNT(*) > COUNT(DISTINCT id) on checkin_submissions grouped by created_at::date (duplicate-heavy calendar days in analytics).
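The per-day check runs as SQL in production (GROUP BY created_at::date HAVING COUNT(*) > COUNT(DISTINCT id)); a plain-Ruby equivalent of the same grouping, for rows already fetched into memory, looks like this:

```ruby
require "date"

# For each calendar day, flag it when total rows exceed distinct ids.
# (In-memory illustration of the production SQL, not the audit script itself.)
def duplicate_heavy_days(rows)
  rows.group_by { |r| r[:created_at] }
      .select { |_day, day_rows| day_rows.size > day_rows.map { |r| r[:id] }.uniq.size }
      .keys.sort
end

rows = [
  { id: 1, created_at: Date.new(2026, 3, 20) },
  { id: 1, created_at: Date.new(2026, 3, 20) },  # same source id twice on one day
  { id: 2, created_at: Date.new(2026, 3, 21) },
]

duplicate_heavy_days(rows) # => [Date.new(2026, 3, 20)]
```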
Executive summary
- March 20, 2026 onward — majority of the volume: In the production audit, 20 of 32 duplicated projects show their first duplicate-heavy day on exactly 2026-03-20; several others show their first such day a few days later in the same window (2026-03-23 to 2026-03-30, e.g. Projects 44, 41, 53). Almost all heavy duplication calendar days (where `total_rows > distinct_ids` per day) fall between 2026-03-20 and 2026-04-06. That pattern is consistent with a bulk operational action around that time — in particular Admin → Projects → batch "Populate Analytics Data" (or equivalent) run across many projects: each run re-fetched a large slice of rows and inserted it again because `id` had no unique constraint, so the same source IDs appear many times on the same `created_at` dates.
- Earlier traces do not contradict that story: 7 of 32 projects have at least one duplicate-heavy day before 2026-03-20 (e.g. Projects 12, 16, 18, 21, 71, 78, 1 — dates in 2024–2025 or scattered). Those are expected from the same structural bug after the April 2025 batch-upsert change: any earlier hourly job or populate that re-fetched overlapping rows could create duplicate-heavy days on whatever calendar dates those check-ins were stamped. So the big March–April 2026 spike aligns with a deliberate bulk populate, while smaller or older spikes show the defect was exploitable before that action as well.
- Project 44 (narrower window): Duplicate-heavy days 2026-03-24 through 2026-04-02 only (8 days) — first dup day after 2026-03-20, so not every project participated on the exact same first day.
- Minimal / single-day (2026): Project 38 (2026-03-24 only), Project 2 (2026-04-02 only), Project 53 (2026-03-23–24).
Evidence vs. proof: The calendar alignment across dozens of projects is strong evidence for a shared trigger (bulk populate). Server logs / admin audit trails would be the definitive proof of who ran which batch action when; the database alone cannot name the user.
Per-project summary (first duplicate-heavy day → last; count of such days)
| Project ID | Name | Dup days | First dup day | Last dup day |
|---|---|---|---|---|
| 58 | Team GBL6 | 14 | 2026-03-20 | 2026-04-04 |
| 25 | SigmaRoc | 16 | 2026-03-20 | 2026-04-04 |
| 79 | Team FRA 134/135 | 14 | 2026-03-20 | 2026-04-05 |
| 74 | Team Madrid | 11 | 2026-03-20 | 2026-04-01 |
| 52 | SAM3A Team | 14 | 2026-03-20 | 2026-04-04 |
| 73 | Team Ersbo | 18 | 2026-03-20 | 2026-04-06 |
| 50 | Team Powergate | 17 | 2026-03-20 | 2026-04-06 |
| 43 | Team PA14 | 14 | 2026-03-20 | 2026-04-04 |
| 71 | IPS Rahway MACS | 17 | 2025-05-05 | 2026-04-04 |
| 72 | Team Newport | 14 | 2026-03-20 | 2026-04-04 |
| 32 | Team GRQ | 18 | 2026-03-20 | 2026-04-06 |
| 86 | Team ML08x | 13 | 2026-03-20 | 2026-04-03 |
| 42 | Team LD14 | 15 | 2026-03-20 | 2026-04-04 |
| 81 | Project NVA05D | 11 | 2026-03-20 | 2026-04-02 |
| 75 | Team Frankfurt | 14 | 2026-03-20 | 2026-04-05 |
| 21 | Peps Peeps | 16 | 2025-01-23 | 2026-04-02 |
| 48 | StructureTone NTT TX4 | 13 | 2026-03-20 | 2026-04-04 |
| 44 | Team FR8.2 | 8 | 2026-03-24 | 2026-04-02 |
| 51 | Mace Pure Dub01 | 12 | 2026-03-20 | 2026-04-03 |
| 80 | Holder SVY01C-F | 8 | 2026-03-20 | 2026-04-04 |
| 76 | Team Harlow | 9 | 2026-03-20 | 2026-04-02 |
| 77 | DPR Expansion Team | 9 | 2026-03-20 | 2026-04-03 |
| 53 | Lebanon Peptide | 2 | 2026-03-23 | 2026-03-24 |
| 14 | Lebanon Gen Site | 5 | 2026-03-20 | 2026-04-03 |
| 41 | AIB | 4 | 2026-03-30 | 2026-04-02 |
| 16 | Yondr Redkite | 100 | 2025-02-13 | 2025-05-31 |
| 38 | StructureTone Hawk | 1 | 2026-03-24 | 2026-03-24 |
| 2 | CS Demos | 1 | 2026-04-02 | 2026-04-02 |
| 78 | PepTalk Live | 31 | 2025-03-20 | 2026-01-22 |
| 12 | Team PepTalk | 46 | 2023-11-10 | 2025-04-01 |
| 18 | John Paul Drogheda | 8 | 2025-06-04 | 2025-06-12 |
| 1 | Legacy | 1 | 2024-03-06 | 2024-03-06 |
Full day lists for each project are in the production console output archive (same order as above).
3.5 Source DB integrity confirmed
- `check_in_submission` in Pepcore Base has a primary key (check_in_submission_pkey) and a unique index (check_in_submission_couchdb_id_unique).
- For Project 44: `COUNT(*) == COUNT(DISTINCT id)` — source IDs are fully unique.
- The `CheckInSubmission` model is read-only (`def readonly?; true; end`) and no write paths exist in this codebase.
- Conclusion: source data is clean. The duplication is entirely analytics-side.
4. Safeguards Implemented
4.1 Shared concern: AnalyticsUpsertSafety
File: app/models/concerns/analytics_upsert_safety.rb
A reusable module included by both DbCredential and ProjectGroup that:
- Checks whether a unique index or primary key already exists on the table's `id` column.
- If the index is missing and no duplicates exist: creates the unique index automatically.
- If the index is missing and duplicates exist: logs a warning and returns `false` so the caller skips populate gracefully (no crash, no further duplication).
# Returns true → safe to upsert
# Returns false → duplicates found, skip populate, log warning
def ensure_unique_id!(connection, table)
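The guard's decision logic can be sketched as follows. This is a hedged reconstruction of the three branches described above, not the real implementation in app/models/concerns/analytics_upsert_safety.rb; the methods called on `conn` are duck-typed stand-ins, not Sequel API.

```ruby
# Returns true when upserting is safe, false when duplicates block it.
def ensure_unique_id!(conn, table)
  return true if conn.unique_index_on_id?(table)     # already safe to upsert
  if conn.duplicate_ids?(table)
    warn "#{table}: duplicates present, skipping populate"
    return false                                     # caller skips gracefully
  end
  conn.add_unique_index_on_id(table)                 # clean table: add the index
  true
end

# Fake connection exercising all three branches.
class FakeConn
  def initialize(index:, dupes:)
    @index = index
    @dupes = dupes
  end

  def unique_index_on_id?(_table); @index; end
  def duplicate_ids?(_table); @dupes; end
  def add_unique_index_on_id(_table); @index = true; end
end

ensure_unique_id!(FakeConn.new(index: false, dupes: true),  :checkin_submissions) # => false
ensure_unique_id!(FakeConn.new(index: false, dupes: false), :checkin_submissions) # => true
```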
4.2 Guard applied to all upsert paths
Both DbCredential and ProjectGroup now call ensure_unique_id! before every upsert on:
- checkin_submissions
- morale_data
If the guard returns false, the populate step is skipped for that table — the process continues with remaining tables and other projects.
4.3 Error isolation in admin batch actions
File: app/admin/projects.rb
All four batch actions (init_analytics_db, populate_analytics_data, reset_analytics_data, kickoff_data_fetching) now wrap each project in rescue => e:
- Failed projects are skipped rather than crashing the batch.
- Admin sees which projects were skipped and why.
- Remaining projects proceed normally.
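The shape of that isolation is a per-project rescue inside the batch loop. The sketch below is illustrative (made-up names; the real code is in app/admin/projects.rb): one failing project is recorded and the loop continues.

```ruby
# Each "project" here is a hash with an id and a callable standing in for
# the real per-project batch work.
def run_batch(projects)
  skipped = []
  projects.each do |project|
    begin
      project[:work].call                     # the real per-project action
    rescue => e
      skipped << { id: project[:id], error: e.message }  # record, keep going
    end
  end
  skipped
end

projects = [
  { id: 1, work: -> { :ok } },
  { id: 2, work: -> { raise "no analytics DB" } },
  { id: 3, work: -> { :ok } },
]

run_batch(projects) # => [{ id: 2, error: "no analytics DB" }]
```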
5. Remediation Steps
Step 1: Deploy the safeguard code
Push the changes to production. Files changed:
- app/models/concerns/analytics_upsert_safety.rb (new)
- app/models/db_credential.rb (updated)
- app/models/project_group.rb (updated)
- app/admin/projects.rb (updated)
Step 2: Reset all analytics databases
SSH into production, open rails console, and run:
errors = []
DbCredential.includes(:project).find_each do |cred|
project = cred.project
next unless project
puts "Resetting project #{project.id} (#{project.name})..."
cred.reset_analytics_data
puts " Done."
rescue => e
errors << { project_id: project&.id, error: e.message }
puts " SKIPPED: #{e.message}"
end
puts "\n=== COMPLETE ==="
puts "Errors: #{errors.size}"
errors.each { |e| puts " Project #{e[:project_id]}: #{e[:error]}" }
What reset_analytics_data does per project:
1. Deletes all rows from all 6 analytics tables.
2. Resets last_synced_at to nil (so next populate fetches all source records).
3. Calls populate_database_with_morale_and_checkin_submissions which:
- Runs ensure_unique_id! → creates unique index on the now-empty table.
- Upserts all source records (now properly deduplicated by the index).
Step 3: Verify all projects are clean
clean = 0; dirty = 0
DbCredential.includes(:project).find_each do |cred|
next unless cred.project
cred.with_analytics_connection do |conn|
next unless conn.table_exists?(:checkin_submissions)
row = conn.fetch("SELECT COUNT(*) AS total, COUNT(DISTINCT id) AS distinct_ids FROM checkin_submissions").first
dupes = row[:total].to_i - row[:distinct_ids].to_i
if dupes > 0
dirty += 1
puts "DIRTY project #{cred.project_id}: #{dupes} duplicates remain"
else
clean += 1
end
end
rescue => e
puts "ERROR project #{cred.project_id}: #{e.message}"
end
puts "\nClean: #{clean}, Dirty: #{dirty}"
Expected result: Dirty: 0
Step 4: Verify unique index exists
DbCredential.includes(:project).find_each do |cred|
next unless cred.project
cred.with_analytics_connection do |conn|
next unless conn.table_exists?(:checkin_submissions)
idx = conn.fetch("SELECT indexname FROM pg_indexes WHERE tablename='checkin_submissions' AND indexname='idx_checkin_submissions_unique_id'").first
status = idx ? "HAS INDEX" : "MISSING INDEX"
puts "Project #{cred.project_id}: #{status}"
end
rescue => e
puts "Project #{cred.project_id}: ERROR #{e.message}"
end
Expected result: all projects show HAS INDEX.
6. Prevention (going forward)
| Risk | Mitigation |
|---|---|
| Repeated populate creating duplicates | Unique index on id enforced before every upsert |
| Dirty DB silently getting worse | ensure_unique_id! detects existing duplicates and skips populate |
| One project error killing entire batch | Per-project rescue in all admin batch actions |
| Same bug in morale_data table | Same safeguard applied to morale_data upserts |
| Same bug in ProjectGroup DB | Same concern included in ProjectGroup model |
7. Files Changed
| File | Change |
|---|---|
| app/models/concerns/analytics_upsert_safety.rb | New. Shared concern for unique-constraint enforcement. |
| app/models/db_credential.rb | Include concern; replace old guard; add morale_data guard; use next unless pattern. |
| app/models/project_group.rb | Include concern; replace old guard; add morale_data guard; use next unless pattern. |
| app/admin/projects.rb | Add per-project error handling in all 4 batch actions. |