Hypertext Rails

Fallback Channel Implementation

Overview

Goal: Implement automatic fallback to secondary communication channels when primary channels fail, ensuring important messages reach stakeholders even when their preferred channel encounters issues.

Solution: Two-tier fallback logic:

  1. At creation time: try the secondary channel if the primary can't be used.
  2. At send time: create a fallback delivery if the primary send fails.

Executive Summary

Our communication system is designed to prioritize delivery while strictly adhering to stakeholder "noise" constraints. We utilize a Two-Tier Fallback system (Creation-time and Send-time) to ensure that if a preferred channel is blocked or fails, the system automatically attempts delivery via a secondary channel before opting to queue the message for a later date.

Core System Logic

  • Independent Channel Limits: Daily, weekly, and monthly limits are tracked separately for Email, SMS, and Teams.
  • Unified Cooldown: To prevent "spamming" a user across different platforms, a single cooldown period (e.g., 2 hours) is shared by all channels, so a send on any one channel starts the cooldown for every channel.
  • Fallback Requirements: A fallback only triggers if the secondary channel is included in the template, the stakeholder has valid contact info, and they haven't opted out.
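
The two limit rules above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the send_log and limits shapes, the method names, and the use of rolling windows are all hypothetical and are not taken from the real codebase.

```ruby
# Window lengths in seconds. Rolling windows are an assumption of this sketch;
# the real system may use calendar days, weeks, and months instead.
WINDOWS = { daily: 86_400, weekly: 7 * 86_400, monthly: 30 * 86_400 }.freeze

# True when `channel` is still under every limit configured for it.
# send_log: array of { channel:, sent_at: } hashes (hypothetical shape).
# limits:   e.g. { email: { daily: 1, weekly: 3 }, sms: { daily: 1 } }
def within_channel_limits?(send_log, channel, limits, now)
  WINDOWS.all? do |window, seconds|
    limit = limits.dig(channel, window)
    next true if limit.nil? # no limit configured for this window

    # Only sends on the SAME channel count against that channel's quota.
    sent = send_log.count { |e| e[:channel] == channel && now - e[:sent_at] < seconds }
    sent < limit
  end
end

# True when a send on ANY channel falls inside the cooldown,
# because the cooldown is shared across channels.
def cooldown_active?(send_log, cooldown_seconds, now)
  send_log.any? { |e| now - e[:sent_at] < cooldown_seconds }
end
```

With one email sent an hour ago and a limit of 1/day per channel, email is at its limit, SMS is still within its own limit, and a 2-hour cooldown is active for both.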

Implementation Details

Database Changes

Migrations:

  • add_secondary_channel_to_stakeholders: adds the secondary_channel column
  • add_fallback_tracking_to_comms_deliveries: adds fallback_from_delivery_id and fallback_attempted (with index)

Model Updates

Stakeholder Model:

  • get_effective_secondary_channel: checks the stakeholder-level setting first, then the persona-level setting
  • Helper methods: secondary_channel_display, secondary_channel_class
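
The stakeholder-then-persona lookup order might look roughly like the sketch below. The Struct classes are plain-Ruby stand-ins for the real ActiveRecord models, and the persona attribute, the blank-string handling, and the nil return value are assumptions made for illustration.

```ruby
# Plain-Ruby stand-ins for the real models (hypothetical attributes).
Persona = Struct.new(:name, :secondary_channel, keyword_init: true)

Stakeholder = Struct.new(:name, :secondary_channel, :persona, keyword_init: true) do
  # Returns the stakeholder's own secondary channel when set,
  # otherwise the persona's, otherwise nil.
  def get_effective_secondary_channel
    [secondary_channel, persona&.secondary_channel]
      .map { |value| value.to_s.strip }   # treat nil and blank strings alike
      .find { |value| !value.empty? }     # first non-blank value wins
  end
end
```

A stakeholder with "teams" set keeps "teams" even if the persona says "sms"; a stakeholder with nothing set inherits "sms" from the persona.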

CommsDelivery Model:

  • belongs_to :fallback_from_delivery: links a fallback delivery to the original delivery
  • has_many :fallback_deliveries: links an original delivery to its fallback deliveries

Service Logic

DeliveryScheduleService (app/services/delivery_schedule_service.rb):

  • determine_channel_for_stakeholder: tries the preferred channel first, then the secondary
  • check_channel_availability: validates that a channel can be used
  • Logs skipped stakeholders with specific reasons
  • Logs when the fallback channel is used

CommsDeliverySendJob (app/jobs/comms_delivery_send_job.rb):

  • should_attempt_fallback?: checks whether a fallback should be attempted
  • attempt_fallback_delivery: creates the fallback delivery with proper linking
  • generate_fallback_payload: creates the initial payload for the fallback delivery
  • Prevents infinite loops with the fallback_attempted flag

Fallback Triggers

Creation-time fallback:

  • Preferred channel not in template → Try secondary
  • Stakeholder opted out of preferred → Try secondary
  • No contact info for preferred → Try secondary
  • Both fail → Skip stakeholder, log reason
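
The creation-time triggers above can be sketched as follows. The method names match the ones listed for DeliveryScheduleService, but the signatures, the hash-based stakeholder, the reason strings, and the return shape are assumptions for illustration, not the service's actual interface.

```ruby
# Returns nil when the channel is usable, otherwise a reason string.
# stakeholder: { preferred_channel:, secondary_channel:, opted_out: [], contact: {} }
# (hypothetical shape).
def check_channel_availability(stakeholder, template_channels, channel)
  return "channel not in template" unless template_channels.include?(channel)
  return "stakeholder opted out" if stakeholder[:opted_out].include?(channel)
  return "no contact info" if stakeholder[:contact][channel].to_s.strip.empty?

  nil
end

# Tries the preferred channel, then the secondary. `uniq` avoids checking the
# same channel twice when preferred and secondary are identical.
def determine_channel_for_stakeholder(stakeholder, template_channels)
  preferred = stakeholder[:preferred_channel]
  candidates = [preferred, stakeholder[:secondary_channel]].compact.uniq
  skipped = {}

  candidates.each do |channel|
    reason = check_channel_availability(stakeholder, template_channels, channel)
    return { channel: channel, was_fallback: channel != preferred, skipped: skipped } if reason.nil?

    skipped[channel] = reason # kept so the caller can log why a channel was skipped
  end

  { channel: nil, was_fallback: false, skipped: skipped } # both failed: skip stakeholder
end
```

For a stakeholder who prefers email when the template only has SMS content, this returns the SMS channel with was_fallback set and records "channel not in template" against email.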

Send-time fallback:

  • Email bounces/invalid → Create SMS if secondary=SMS
  • SMS gateway error → Create email if secondary=email
  • Secondary also fails → Mark as failed, no further retry
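
A minimal sketch of the send-time path, assuming a plain Struct in place of the CommsDelivery model. handle_send_failure and its next_id argument are hypothetical helpers, and refusing to fall back from a delivery that is itself a fallback is an assumption about how "no further retry" is enforced.

```ruby
# Stand-in for CommsDelivery, limited to the fields named in this document.
Delivery = Struct.new(:id, :channel, :status, :fallback_attempted,
                      :fallback_from_delivery_id, keyword_init: true)

def should_attempt_fallback?(delivery, secondary_channel)
  return false if delivery.fallback_attempted                 # already tried once
  return false unless delivery.fallback_from_delivery_id.nil? # is itself a fallback (assumption)
  return false if secondary_channel.nil? || secondary_channel == delivery.channel

  true
end

# Marks the delivery as failed and returns a linked fallback delivery, or nil.
def handle_send_failure(delivery, secondary_channel, next_id:)
  delivery.status = "failed"
  return nil unless should_attempt_fallback?(delivery, secondary_channel)

  delivery.fallback_attempted = true # guard against repeated fallback attempts
  Delivery.new(id: next_id, channel: secondary_channel, status: "pending",
               fallback_attempted: false, fallback_from_delivery_id: delivery.id)
end
```

If the fallback SMS then fails as well, handle_send_failure marks it as failed and returns nil, so the chain stops after one hop.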


Cadence Limits & Fallback

How Cadence Limits Work

Per-Channel Limits (Daily/Weekly/Monthly/Type):

  • Each channel has its own quota (for example, email: 1/day and SMS: 1/day, so up to 2 messages per day in total)
  • Allows fallback to work: if email hits its limit, SMS can still be used
  • Prevents one channel from blocking another

Combined Cooldown:

  • The cooldown applies across ALL channels, which prevents rapid-fire messages on different channels
  • If an email was sent recently, an SMS may be queued until the cooldown expires

Note: When a preferred channel hits its cadence limit at creation time, the system will attempt fallback to the secondary channel if it's available and within its own limits. This ensures maximum delivery success while respecting per-channel quotas.
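
One way to combine the per-channel limits and the shared cooldown into a single scheduling decision is sketched below. plan_delivery and all of its parameters are hypothetical. The order of the checks (cooldown first), the choice to keep the preferred channel while waiting out the cooldown, and "same time tomorrow" as the next slot are assumptions, not documented behavior.

```ruby
DAY = 24 * 60 * 60 # seconds

# sent_today / daily_limit: hashes keyed by channel name (hypothetical shape).
# last_sent_at: time of the most recent send on ANY channel, or nil.
def plan_delivery(preferred:, secondary:, sent_today:, daily_limit:, last_sent_at:, cooldown:, now:)
  # Shared cooldown: a recent send on any channel delays every channel.
  if last_sent_at && (now - last_sent_at) < cooldown
    return { channel: preferred, send_at: last_sent_at + cooldown, was_fallback: false }
  end

  # Per-channel quotas: try the preferred channel, then the secondary.
  [preferred, secondary].compact.uniq.each do |channel|
    next unless sent_today.fetch(channel, 0) < daily_limit.fetch(channel, 0)

    return { channel: channel, send_at: now, was_fallback: channel != preferred }
  end

  # Nothing is available today: queue on the preferred channel for tomorrow.
  { channel: preferred, send_at: now + DAY, was_fallback: false }
end
```

With email at 1/1 and SMS unused, the plan is an immediate SMS. With both channels at their limit, it is an email tomorrow. With an email sent one hour ago and a 2-hour cooldown, the delivery is queued for one hour from now.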


Operational Scenarios & Examples

The following scenarios illustrate how the system handles real-world delivery hurdles:

Scenario 1: Primary Channel at Capacity

Context: Alice is set to Email (Primary) and SMS (Secondary). She already received her 1 daily email.

Outcome: The system detects the email limit is reached, checks SMS availability, and creates an SMS delivery for today.

Key Logic: The message is not delayed; it simply switches channels to meet the immediate schedule.

Technical Details:

  • Creation-time check: Email limit 1/1 reached → Checks SMS limit (0/1 used, so available) → Creates SMS delivery
  • Delivery created with channel: 'sms', send_at: today, was_fallback: true

Scenario 2: Total Capacity Reached

Context: Bob has hit his daily limit for both Email and SMS.

Outcome: The system queues the delivery for tomorrow using the Primary (Email) channel.

Key Logic: When no channels are available today, the system defaults back to the preferred channel for the next available slot.

Technical Details:

  • Creation-time check: Email limit 1/1 reached → Checks SMS limit (1/1 also reached) → Queues email delivery for tomorrow
  • Delivery created with channel: 'email', send_at: tomorrow, status: 'scheduled'

Scenario 3: Unified Cooldown Block

Context: Charlie received an email 1 hour ago. A new report is triggered now, but the system has a 2-hour cooldown.

Outcome: Even though Charlie hasn't received an SMS today, the SMS delivery is blocked because the cooldown is shared.

Key Logic: The delivery is queued to send in 1 hour (when the cooldown expires).

Technical Details:

  • Creation-time check: Cooldown check fails (only 1 hour since last email, need 2 hours) → Queues delivery
  • Cooldown applies across ALL channels, so SMS is also blocked
  • Delivery created with send_at: 1 hour from now (when cooldown expires)

Scenario 4: Template Content Mismatch

Context: Diana's preferred channel is Email, but the specific Report Template only contains SMS content.

Outcome: The system bypasses Email and sends via SMS today.

Key Logic: Lack of channel-specific content in a template acts as an automatic trigger for the fallback channel.

Technical Details:

  • Creation-time check: Email not in template → Checks SMS (available in template) → Creates SMS delivery
  • Delivery created with channel: 'sms', was_fallback: true, reason: "channel not in template"

Scenario 5: Send-Time Failure (The "Bounce" Fallback)

Context: An email is successfully created for Eve, but the delivery fails (e.g., a bounce or invalid address).

Outcome: The system marks the email as failed and immediately generates a new SMS delivery as a fallback.

Key Logic: We track fallback_attempted: true to prevent infinite loops if the secondary channel also fails.

Technical Details:

  • Send-time check: Email send fails → Checks if SMS fallback available → Creates SMS delivery
  • Original delivery: status: 'failed', fallback_attempted: true
  • Fallback delivery: channel: 'sms', fallback_from_delivery_id: [original_id], status: 'pending'

Scenario 6: Independent Automation Handling

Context: Frank has two different automations running at 9:00 AM and 2:00 PM. His limit is 1 report per day on each channel.

Outcome: 9:00 AM sends via Email. 2:00 PM detects the email limit is reached and sends via SMS.

Key Logic: Each automation run re-evaluates the "best available path" for that specific moment.

Technical Details:

  • 9:00 AM: Email delivery created and sent (email count: 1/1)
  • 2:00 PM: Email limit check fails (1/1 reached) → SMS limit check passes (0/1 used, so available) → SMS delivery created
  • Each run independently evaluates channel availability and limits


Edge Cases Handled

  • ✅ Same channel for preferred/secondary (no duplicate attempt)
  • ✅ Infinite loop prevention (fallback_attempted flag)
  • ✅ Missing contact info
  • ✅ Opt-out handling
  • ✅ Template availability checks
  • ✅ Per-channel cadence limits
  • ✅ Combined cooldown across channels

Key Implementation Notes

  1. N+1 Query Consideration: get_effective_secondary_channel calls Persona.find_by per stakeholder. Monitor in production, optimize if needed.

  2. SMS/Teams Jobs: SMS and Teams send jobs may not exist yet. Fallback deliveries will be created but won't send until jobs are implemented.

  3. Logging: All skipped stakeholders and fallback attempts are logged with specific reasons for debugging.

  4. Safety: fallback_attempted flag prevents infinite fallback loops.
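
If the per-stakeholder persona lookup in note 1 does become a problem, one common remedy is to load the personas once and index them in memory before the loop. The sketch below shows the idea with plain hashes. The persona_name key and the data shapes are hypothetical, since this document does not say which attribute Persona.find_by matches on.

```ruby
# stakeholders: [{ id:, persona_name:, secondary_channel: }, ...] (hypothetical shape)
# personas:     [{ name:, secondary_channel: }, ...], loaded once up front
def effective_secondary_channels(stakeholders, personas)
  # Build the lookup table once instead of searching per stakeholder.
  personas_by_name = personas.to_h { |persona| [persona[:name], persona] }

  stakeholders.to_h do |stakeholder|
    own = stakeholder[:secondary_channel]
    inherited = personas_by_name.dig(stakeholder[:persona_name], :secondary_channel)
    [stakeholder[:id], own || inherited] # stakeholder-level setting wins
  end
end
```

The result maps each stakeholder id to its effective secondary channel, using one pass over the personas regardless of how many stakeholders there are.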