Trigger Events and Cadence Rules - Complete Guide

Table of Contents

  1. Introduction
  2. Understanding Trigger Events
  3. The 5-Minute Cron Job
  4. Two Layers of Rate Limiting
  5. Real-World Scenario Walkthrough
  6. Kiosk Health Status - The 5 Attributes
  7. Multiple Issues vs Multiple Emails
  8. Multiple Trigger Events in One Workflow
  9. How Cadence Rules Work
  10. Common Questions

Introduction

This guide explains how trigger-based automation workflows work, focusing on:

  • How the system detects trigger events (like kiosk health issues)
  • How the 5-minute cron job checks for these events
  • How cadence rules prevent email spam
  • How everything works together

Think of it like a security guard system:

  • The cron job is like a guard checking every 5 minutes
  • Trigger events are like alarms going off
  • Cadence rules are like "don't call the same person too many times" rules


Understanding Trigger Events

What is a Trigger Event?

A trigger event is something that happens in your system that should send an email. Examples:

  • A kiosk stops charging
  • A kiosk's battery gets low
  • Employee morale drops
  • A kiosk goes offline

How Trigger Events Work

  1. You create a workflow that says "When X happens, send an email to Y people"
  2. The system checks every 5 minutes if X has happened
  3. If X happened, the system creates email deliveries
  4. Cadence rules check if it's okay to send emails to each person
  5. Emails are sent (or queued for later if cadence rules block them)

The 5-Minute Cron Job

What is a Cron Job?

A cron job is a scheduled task that runs automatically at set intervals. Think of it like an alarm clock that goes off every 5 minutes.

Our Cron Job: automation_workflows:process_triggers

Location: lib/tasks/automation_workflows.rake

What it does:

  1. Finds all active trigger-type workflows
  2. For each workflow, checks if any trigger events have occurred
  3. If events occurred, creates email deliveries (if allowed by rate limits)

Frequency: Runs every 5 minutes (configured in Cloud 66)

Why Every 5 Minutes?

  • Too frequent (every 1 minute): Wastes server resources, checks things that haven't changed
  • Too infrequent (every hour): Issues might be resolved before we detect them
  • 5 minutes: Good balance - catches issues quickly without overloading the system

Two Layers of Rate Limiting

This is the most important concept to understand! There are TWO separate layers that prevent email spam:

Layer 1: Workflow-Level Rate Limiting (Trigger Event Cooldown)

Purpose: Prevents the same trigger event from creating deliveries too frequently

How it works:

  • When a trigger event fires, the system records: "This workflow sent a kiosk_health_status trigger at 10:00 AM"
  • Before creating new deliveries, it checks: "Has it been at least X hours since the last time this workflow sent this trigger?"
  • If not enough time has passed → Skip creating deliveries
  • If enough time has passed → Create deliveries

Default cooldown: 1 hour (3600 seconds)

Where it's configured:

  • Uses the cooldown_hours from persona cadence rules (if personas are in the workflow's audience)
  • If no personas or no cooldown configured → defaults to 1 hour

Code location: app/services/trigger_evaluator_service.rb, within_rate_limit? method

Important: Each trigger event type has its own rate limit tracking!
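The Layer 1 check described above can be sketched in plain Ruby. This is a minimal illustration, not the real within_rate_limit? implementation: the method signature, the DEFAULT_COOLDOWN_SECONDS constant, and the way the cooldown is passed in are all assumptions made for the example.

```ruby
# Hypothetical sketch of a Layer 1 cooldown check; the real
# TriggerEvaluatorService may look quite different.
DEFAULT_COOLDOWN_SECONDS = 3600 # 1 hour, the documented default

# last_sent_at:   Time of the last send for this workflow + trigger, or nil
# cooldown_hours: persona cooldown if one is configured, otherwise nil
# Returns true while the trigger is still inside its cooldown window,
# which means delivery creation should be skipped.
def within_rate_limit?(last_sent_at, now, cooldown_hours = nil)
  return false if last_sent_at.nil? # never sent before, nothing to wait for

  cooldown_seconds = cooldown_hours ? cooldown_hours * 3600 : DEFAULT_COOLDOWN_SECONDS
  (now - last_sent_at) < cooldown_seconds
end

ten_am = Time.utc(2024, 1, 1, 10, 0)
puts within_rate_limit?(nil, ten_am)                  # first detection
puts within_rate_limit?(ten_am, ten_am + 10 * 60, 2)  # 10 minutes later, 2-hour cooldown
puts within_rate_limit?(ten_am, ten_am + 2 * 3600, 2) # exactly 2 hours later
```

The three calls print false, true, and false: the first detection passes, a repeat 10 minutes later is blocked, and the trigger is allowed again once the full cooldown has elapsed.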

Layer 2: Persona-Level Cadence Rules

Purpose: Prevents sending too many emails to the same person

How it works:

  • For each person who should receive an email, the system checks their persona's cadence rules
  • Note: Stakeholder-specific cadence overrides (saved in the cadence_rules table) are currently NOT used; only persona cadence rules are enforced
  • Rules include:
    • Cooldown hours: Minimum time between ANY emails to this person
    • Daily limit: Max emails per day to this person
    • Weekly limit: Max emails per week to this person
    • Monthly limit: Max emails per month to this person
    • Type daily limit: Max emails per day for specific types (e.g., max 5 "alert" emails per day)

If cadence allows: Create delivery immediately

If cadence blocks: Queue delivery for later (when cadence allows)

Code location: app/services/delivery_schedule_service.rb, can_send_to_stakeholder? method

Important: The stakeholder form allows overriding cadence rules, and these overrides are saved to the database, but they are not currently enforced. Only the persona's cadence rules are used for delivery scheduling.


Real-World Scenario Walkthrough

Let's walk through a complete example step-by-step.

Setup

  • Workflow: "Kiosk Health Alert" (trigger type)
  • Trigger Event: kiosk_health_status
  • Audience: All stakeholders with persona "Manager" (cooldown_hours = 2)
  • Stakeholders:
    • Alice (Manager persona, cooldown_hours = 2, max_per_day = 5)
    • Bob (Manager persona, cooldown_hours = 2, max_per_day = 5)
  • Cron job: Runs every 5 minutes

Timeline

10:00 AM - First Detection

Cron job runs:

  1. Checks all kiosks for health issues
  2. Finds: Kiosk #123 has NOT_CHARGING issue
  3. Layer 1 Check (Workflow-level):
    • Context: kiosk_health_status:123:45:NOT_CHARGING
    • Last time this workflow sent this specific context: Never (first time)
    • Rate limit check: ✅ PASSED (never sent before)
  4. Create deliveries:
    • For Alice: Layer 2 Check (Stakeholder-level)
      • Last email to Alice: Never
      • Daily count: 0/5
      • Cooldown: N/A (no previous email)
      • ✅ PASSED → Create delivery immediately
    • For Bob: Layer 2 Check
      • Last email to Bob: Never
      • Daily count: 0/5
      • Cooldown: N/A
      • ✅ PASSED → Create delivery immediately
  5. Record: last_trigger_sent_at('kiosk_health_status:123:45:NOT_CHARGING') = 10:00 AM
  6. Result:
    • CommsInstance #1 created
    • 2 deliveries created (Alice + Bob)
    • Emails sent immediately

10:05 AM - Second Detection (Different Issue)

Cron job runs:

  1. Checks all kiosks for health issues
  2. Finds: Kiosk #123 now has LOW_BATTERY issue (charging fixed, but battery low)
  3. Layer 1 Check (Workflow-level):
    • Context: kiosk_health_status:123:45:LOW_BATTERY
    • Last time this workflow sent this specific context: Never
    • Rate limit check: ✅ PASSED (different issue, independent!)
  4. Create deliveries:
    • For Alice: Layer 2 Check (Stakeholder-level)
      • Last email to Alice: 10:00 AM (5 minutes ago)
      • Daily count: 1/5
      • Cooldown: ❌ Only 5 minutes since last email, need 2 hours
      • ❌ BLOCKED → Queue delivery for 12:00 PM
    • For Bob: Layer 2 Check
      • Last email to Bob: 10:00 AM (5 minutes ago)
      • Daily count: 1/5
      • Cooldown: ❌ Only 5 minutes since last email, need 2 hours
      • ❌ BLOCKED → Queue delivery for 12:00 PM
  5. Record: last_trigger_sent_at('kiosk_health_status:123:45:LOW_BATTERY') = 10:05 AM
  6. Result:
    • CommsInstance #2 created
    • 2 deliveries created (Alice + Bob), queued for 12:00 PM
    • Emails will be sent when cadence allows

10:10 AM - Third Detection (Same Issue as First)

Cron job runs:

  1. Checks all kiosks for health issues
  2. Finds: Kiosk #123 has the NOT_CHARGING issue again
  3. Layer 1 Check:
    • Context: kiosk_health_status:123:45:NOT_CHARGING
    • Last time: 10:00 AM (10 minutes ago)
    • Rate limit: 2 hours
    • Time since last send: 10 minutes
    • ❌ BLOCKED → Same kiosk, same project, same issue, within cooldown
  4. Result: NO deliveries created (same context, within cooldown)

12:00 PM - 2 Hours Later (Cooldown Expired)

Cron job runs:

  1. Checks all kiosks for health issues
  2. Finds:
    • Kiosk #123 STILL has NOT_CHARGING issue
    • Kiosk #456 has LOW_BATTERY issue
  3. Layer 1 Check:
    • Context 1: kiosk_health_status:123:45:NOT_CHARGING
      • Last time: 10:00 AM (2 hours ago) ✅ PASSED
    • Context 2: kiosk_health_status:456:45:LOW_BATTERY
      • Last time: Never ✅ PASSED (different kiosk, independent!)
  4. Create deliveries for both contexts:
    • For Alice: Layer 2 Check
      • Last email to Alice: 10:00 AM (2 hours ago)
      • Daily count: 1/5
      • Cooldown: 2 hours have passed ✅
      • ✅ PASSED → Create deliveries immediately (for both contexts)
    • For Bob: Layer 2 Check
      • Last email to Bob: 10:00 AM (2 hours ago)
      • Daily count: 1/5
      • Cooldown: 2 hours have passed ✅
      • ✅ PASSED → Create deliveries immediately (for both contexts)
  5. Record:
    • last_trigger_sent_at('kiosk_health_status:123:45:NOT_CHARGING') = 12:00 PM
    • last_trigger_sent_at('kiosk_health_status:456:45:LOW_BATTERY') = 12:00 PM
  6. Result:
    • CommsInstance #3 and #4 created (one per context)
    • 4 deliveries created (Alice + Bob for each context)
    • Emails sent immediately

12:05 PM - Too Soon Again

Cron job runs:

  1. Finds same issues
  2. Layer 1 Check:
    • Last time: 12:00 PM (5 minutes ago)
    • ❌ BLOCKED → Need to wait 1 hour 55 minutes
  3. Result: NO deliveries created

Key Takeaways from This Scenario

  1. Layer 1 (Workflow-level) prevents the workflow from creating deliveries more than once per cooldown period for each unique context (kiosk + project + issues)
  2. Different kiosks, different projects, and different issues each have their own independent rate limits
  3. Same kiosk + same project + same issue respects the cooldown period
  4. Layer 2 (Stakeholder-level) is checked AFTER Layer 1 passes, so it only matters if deliveries are being created
  5. The cron job runs every 5 minutes, but deliveries are only created when both layers allow it

Kiosk Health Status - The 5 Attributes

How Rate Limiting Works for Kiosk Health Status

The system uses context-aware rate limiting for kiosk_health_status triggers. Each unique combination of kiosk + project + health issue(s) has its own independent rate limit.

The 5 Attributes

The KioskHealthStatusEvaluator checks these 5 things:

  1. NOT_CHARGING - Kiosk is not plugged in (charging: false)
  2. SCREEN_OFF - Kiosk screen is off (screen_state: false)
  3. LOW_BATTERY - Battery level is below 20%
  4. OFFLINE - Kiosk hasn't sent a heartbeat in 5 minutes OR connection_status: false
  5. MODE - (Currently not implemented, placeholder for future use)

How It Works

# Simplified version of what happens
def check_kiosk_health(kiosk)
  issues = []

  issues << 'NOT_CHARGING' if kiosk.charging == false
  issues << 'SCREEN_OFF' if kiosk.screen_state == false
  issues << 'LOW_BATTERY' if kiosk.battery_level < 20

  # no_heartbeat_in_5_minutes?(kiosk) is a placeholder for the real heartbeat check
  if kiosk.connection_status == false || no_heartbeat_in_5_minutes?(kiosk)
    issues << 'OFFLINE'
  end

  # If ANY issues found, return the kiosk as "unhealthy"
  issues.any? ? kiosk : nil
end

Important Points

  1. A kiosk can have MULTIPLE issues at once:

    • Example: Kiosk #123 is NOT_CHARGING AND LOW_BATTERY AND OFFLINE
    • All issues are included in the email context
  2. Each unique combination has its own rate limit:

    • Each kiosk is tracked separately
    • Each project is tracked separately
    • Each health issue (or combination of issues) is tracked separately
    • Rate limit key format: kiosk_health_status:{kiosk_id}:{project_id}:{sorted_issues}
  3. Different issues trigger independently:

    • If Kiosk #123 (Project #45) has LOW_BATTERY at 10:00 AM → Email sent ✅
    • If Kiosk #123 (Project #45) has NOT_CHARGING at 10:05 AM → Email sent ✅ (different issue, independent!)
    • If Kiosk #123 (Project #45) still has LOW_BATTERY at 10:10 AM → Blocked ❌ (same kiosk, same project, same issue, within cooldown)
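To make the rate limit key format concrete, here is a small self-contained sketch that collects a kiosk's issues and builds the key from them. The Kiosk struct, the last_heartbeat_at field, the HEARTBEAT_TIMEOUT_SECONDS constant, and both method names are assumptions for illustration only; the real KioskHealthStatusEvaluator may differ.

```ruby
# Hypothetical stand-in for the real kiosk record. last_heartbeat_at is an
# assumed field name; the other fields follow the attributes listed above.
Kiosk = Struct.new(:id, :project_id, :charging, :screen_state,
                   :battery_level, :connection_status, :last_heartbeat_at,
                   keyword_init: true)

HEARTBEAT_TIMEOUT_SECONDS = 5 * 60

# Returns the list of health issues currently affecting the kiosk.
def health_issues(kiosk, now)
  issues = []
  issues << 'NOT_CHARGING' if kiosk.charging == false
  issues << 'SCREEN_OFF' if kiosk.screen_state == false
  issues << 'LOW_BATTERY' if kiosk.battery_level < 20
  stale = (now - kiosk.last_heartbeat_at) > HEARTBEAT_TIMEOUT_SECONDS
  issues << 'OFFLINE' if kiosk.connection_status == false || stale
  issues
end

# Builds the key in the documented format:
# kiosk_health_status:{kiosk_id}:{project_id}:{sorted_issues}
def rate_limit_key(kiosk, issues)
  "kiosk_health_status:#{kiosk.id}:#{kiosk.project_id}:#{issues.sort.join(',')}"
end

now = Time.utc(2024, 1, 1, 10, 5)
kiosk = Kiosk.new(id: 123, project_id: 45, charging: false, screen_state: true,
                  battery_level: 15, connection_status: true,
                  last_heartbeat_at: now - 60)
puts rate_limit_key(kiosk, health_issues(kiosk, now))
```

This prints kiosk_health_status:123:45:LOW_BATTERY,NOT_CHARGING. Sorting the issues before joining them means the same set of issues always maps to the same key, regardless of the order in which they were detected.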

Multiple Issues vs Multiple Emails

Your Question: "Will we send separate emails for different issues?"

Scenario:

  • 10:00 AM: Kiosk #123 (Project #45) has NOT_CHARGING
  • 10:05 AM: Kiosk #123 (Project #45) has NOT_CHARGING + LOW_BATTERY
  • 10:10 AM: Kiosk #123 (Project #45) has NOT_CHARGING + SCREEN_OFF

Answer: YES, separate emails for different issue combinations!

How it works:

  1. Each unique combination has its own rate limit:

    • 10:00 AM: kiosk_health_status:123:45:NOT_CHARGING → Email sent ✅
    • 10:05 AM: kiosk_health_status:123:45:LOW_BATTERY,NOT_CHARGING → Email sent ✅ (new combination, independent!)
    • 10:10 AM: kiosk_health_status:123:45:NOT_CHARGING,SCREEN_OFF → Email sent ✅ (new combination, independent!)
  2. Same combination respects cooldown:

    • If kiosk_health_status:123:45:NOT_CHARGING was sent at 10:00 AM
    • And the same combination is detected at 10:05 AM
    • It's blocked until cooldown expires (e.g., 1 hour later)
  3. Different kiosks trigger independently:

    • Kiosk #123 (Project #45) with LOW_BATTERY → Email sent ✅
    • Kiosk #456 (Project #45) with LOW_BATTERY → Email sent ✅ (different kiosk, independent!)
    • Kiosk #123 (Project #67) with LOW_BATTERY → Email sent ✅ (different project, independent!)

What Gets Included in Each Email?

Each email includes all current issues for that specific kiosk at that moment:

Example:

  • Email 1 (10:00 AM): Kiosk #123 has NOT_CHARGING
  • Email 2 (10:05 AM): Kiosk #123 has NOT_CHARGING + LOW_BATTERY (includes both issues)
  • Email 3 (10:10 AM): Kiosk #123 has NOT_CHARGING + SCREEN_OFF (includes both issues)

Visual Timeline

10:00 AM: NOT_CHARGING detected
  → Context: kiosk_health_status:123:45:NOT_CHARGING
  → Rate limit check: never sent ✅
  → Email sent ✅
  → last_trigger_sent_at('kiosk_health_status:123:45:NOT_CHARGING') = 10:00 AM

10:05 AM: NOT_CHARGING + LOW_BATTERY detected
  → Context: kiosk_health_status:123:45:LOW_BATTERY,NOT_CHARGING
  → Rate limit check: new combination, never sent ✅
  → Email sent ✅ (includes both issues)

10:10 AM: NOT_CHARGING + SCREEN_OFF detected
  → Context: kiosk_health_status:123:45:NOT_CHARGING,SCREEN_OFF
  → Rate limit check: new combination, never sent ✅
  → Email sent ✅ (includes both issues)

12:00 PM: NOT_CHARGING + LOW_BATTERY + SCREEN_OFF present
  → Context: kiosk_health_status:123:45:LOW_BATTERY,NOT_CHARGING,SCREEN_OFF
  → Rate limit check: new combination, never sent ✅
  → Email sent ✅ (includes ALL 3 current issues)

Multiple Trigger Events in One Workflow

Your Question: "If a workflow has multiple trigger events, will each send an email?"

Scenario:

  • Workflow has these triggers enabled:
    • kiosk_offline
    • morale_drop_baseline
    • kiosk_health_status
  • Cooldown: 1 hour
  • Timeline:
    • 10:00 AM: kiosk_offline fires
    • 10:05 AM: morale_drop_baseline fires
    • 10:10 AM: kiosk_health_status fires
    • 10:15 AM: kiosk_health_status fires again (different issue)

Answer: YES, each trigger event type has its own rate limit!

Why?

Looking at the code:

# app/services/trigger_evaluator_service.rb
trigger_events.each do |event_key|
  # ... evaluate trigger ...
  if within_rate_limit?(workflow, event_key)  # ← event_key is passed here!
    # Skip
  else
    process_triggered_entities(workflow, event_key, triggered_entities)
    workflow.update_last_trigger_sent_at!(event_key, Time.current)  # ← Each event_key tracked separately!
  end
end

Key points:

  1. Each event_key (like kiosk_offline, morale_drop_baseline, kiosk_health_status) is tracked separately
  2. last_trigger_sent_at(event_key) stores a timestamp for EACH event_key
  3. Rate limit check uses the specific event_key to look up its last sent time
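The per-key tracking can be illustrated with a small in-memory stand-in. The TriggerLog class below is hypothetical: in the application the timestamps live on the workflow record, and only the method names last_trigger_sent_at and update_last_trigger_sent_at! are taken from this guide.

```ruby
# Hypothetical in-memory stand-in for the workflow's per-trigger timestamps.
class TriggerLog
  def initialize
    @sent_at = {} # event_key => Time of last send
  end

  def last_trigger_sent_at(event_key)
    @sent_at[event_key]
  end

  def update_last_trigger_sent_at!(event_key, time)
    @sent_at[event_key] = time
  end
end

log = TriggerLog.new
ten_am = Time.utc(2024, 1, 1, 10, 0)

# Recording one event type leaves the others untouched.
log.update_last_trigger_sent_at!('kiosk_offline', ten_am)
log.update_last_trigger_sent_at!('morale_drop_baseline', ten_am + 5 * 60)

puts log.last_trigger_sent_at('kiosk_offline').inspect
puts log.last_trigger_sent_at('kiosk_health_status').inspect # nil, never sent
```

Because each lookup uses its own key, a recent kiosk_offline send has no effect on whether morale_drop_baseline or kiosk_health_status is allowed to fire.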

Detailed Timeline

10:00 AM - Kiosk Offline Detected

Cron job runs:

  1. Checks kiosk_offline trigger
  2. Finds: Kiosk #123 is offline
  3. Rate limit check for kiosk_offline:
    • last_trigger_sent_at('kiosk_offline') = Never
    • ✅ PASSED (never sent before)
  4. Create deliveries → Emails sent ✅
  5. Record: last_trigger_sent_at('kiosk_offline') = 10:00 AM

State after 10:00 AM:

  • last_trigger_sent_at('kiosk_offline') = 10:00 AM
  • last_trigger_sent_at('morale_drop_baseline') = nil (never sent)
  • last_trigger_sent_at('kiosk_health_status') = nil (never sent)

10:05 AM - Morale Drop Detected

Cron job runs:

  1. Checks kiosk_offline trigger → No new offline kiosks
  2. Checks morale_drop_baseline trigger → Finds: Team morale dropped
  3. Rate limit check for morale_drop_baseline:
    • last_trigger_sent_at('morale_drop_baseline') = nil (never sent)
    • ✅ PASSED (never sent before)
  4. Create deliveries → Emails sent ✅
  5. Record: last_trigger_sent_at('morale_drop_baseline') = 10:05 AM

State after 10:05 AM:

  • last_trigger_sent_at('kiosk_offline') = 10:00 AM
  • last_trigger_sent_at('morale_drop_baseline') = 10:05 AM
  • last_trigger_sent_at('kiosk_health_status') = nil (never sent)

10:10 AM - Kiosk Health Status Detected (First Time)

Cron job runs:

  1. Checks kiosk_offline trigger → No new offline kiosks
  2. Checks morale_drop_baseline trigger → No new morale drops
  3. Checks kiosk_health_status trigger → Finds: Kiosk #123 has NOT_CHARGING
  4. Rate limit check for kiosk_health_status:
    • last_trigger_sent_at('kiosk_health_status') = nil (never sent)
    • ✅ PASSED (never sent before)
  5. Create deliveries → Emails sent ✅
  6. Record: last_trigger_sent_at('kiosk_health_status') = 10:10 AM

State after 10:10 AM:

  • last_trigger_sent_at('kiosk_offline') = 10:00 AM
  • last_trigger_sent_at('morale_drop_baseline') = 10:05 AM
  • last_trigger_sent_at('kiosk_health_status') = 10:10 AM

10:15 AM - Kiosk Health Status Detected Again (Different Issue)

Cron job runs:

  1. Checks kiosk_offline trigger → No new offline kiosks
  2. Checks morale_drop_baseline trigger → No new morale drops
  3. Checks kiosk_health_status trigger → Finds: Kiosk #123 now has NOT_CHARGING + LOW_BATTERY
  4. Rate limit check for kiosk_health_status:
    • last_trigger_sent_at('kiosk_health_status') = 10:10 AM (5 minutes ago)
    • Rate limit: 1 hour
    • Time since last send: 5 minutes
    • ❌ BLOCKED → Only 5 minutes have passed, need 1 hour
  5. Result: NO deliveries created

Summary Table

| Time | Trigger Event | Last Sent At | Rate Limit Check | Result |
|---|---|---|---|---|
| 10:00 AM | kiosk_offline | Never | ✅ Passed | Email sent ✅ |
| 10:05 AM | morale_drop_baseline | Never | ✅ Passed | Email sent ✅ |
| 10:10 AM | kiosk_health_status | Never | ✅ Passed | Email sent ✅ |
| 10:15 AM | kiosk_health_status | 10:10 AM | ❌ Blocked (5 min < 1 hour) | Email NOT sent ❌ |
| 11:10 AM | kiosk_health_status | 10:10 AM | ✅ Passed (1 hour passed) | Email sent ✅ |

Key Takeaways

  1. Each trigger event type has its own rate limit tracking

    • kiosk_offline can fire at 10:00 AM
    • morale_drop_baseline can fire at 10:05 AM (even though kiosk_offline fired 5 minutes ago)
    • kiosk_health_status can fire at 10:10 AM (even though other triggers fired recently)
  2. Same trigger event type shares the same rate limit

    • If kiosk_health_status fires at 10:10 AM
    • Another kiosk_health_status detection at 10:15 AM will be blocked
  3. Different trigger event types are independent

    • They don't affect each other's rate limits
    • Each has its own last_trigger_sent_at timestamp

How Cadence Rules Work

What Are Cadence Rules?

Cadence rules are limits that prevent sending too many emails to the same person. They're configured per Persona (a group of stakeholders with similar roles).

Types of Cadence Rules

1. Cooldown Hours

What it means: "Wait at least X hours between sending ANY email to this person"

Example:

  • Cooldown: 2 hours
  • Last email to Alice: 10:00 AM
  • New email attempt: 11:00 AM
  • Result: ❌ BLOCKED (only 1 hour has passed, need 2 hours)
  • Next attempt: 12:00 PM
  • Result: ✅ ALLOWED (2 hours have passed)

2. Daily Limit

What it means: "Send at most X emails per day to this person"

Example:

  • Daily limit: 5 emails
  • Emails sent to Alice today: 4
  • New email attempt
  • Result: ✅ ALLOWED (4 < 5)
  • Emails sent to Alice today: 5
  • New email attempt
  • Result: ❌ BLOCKED (5 = 5, limit reached)
  • Next day: ✅ ALLOWED (counter resets)

3. Weekly Limit

What it means: "Send at most X emails per week to this person"

Example:

  • Weekly limit: 20 emails
  • Emails sent to Alice this week: 19
  • New email attempt
  • Result: ✅ ALLOWED (19 < 20)

4. Monthly Limit

What it means: "Send at most X emails per month to this person"

Example:

  • Monthly limit: 80 emails
  • Emails sent to Alice this month: 79
  • New email attempt
  • Result: ✅ ALLOWED (79 < 80)

5. Type Daily Limit

What it means: "Send at most X emails per day of TYPE Y to this person"

Example:

  • Alert type daily limit: 5 emails
  • Alert emails sent to Alice today: 4
  • New alert email attempt
  • Result: ✅ ALLOWED (4 < 5)
  • Alert emails sent to Alice today: 5
  • New alert email attempt
  • Result: ❌ BLOCKED (5 = 5, alert limit reached)
  • BUT: A "report" type email could still be sent (different type!)

Priority Order

When multiple limits are exceeded, the system uses this priority:

  1. Cooldown (most restrictive) - "Wait X hours"
  2. Type Daily Limit - "Max X alerts per day"
  3. Daily Limit - "Max X emails per day"
  4. Weekly Limit - "Max X emails per week"
  5. Monthly Limit - "Max X emails per month"
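The priority order above can be sketched as a function that returns the first rule blocking a send. This is an illustrative sketch only: the method names, the shape of the stats hash, and the decision to treat 0 as "unlimited" for every limit are assumptions, not the actual DeliveryScheduleService logic.

```ruby
# Treats nil or 0 as "unlimited". Applying that convention to every
# limit (not just a monthly one) is an assumption of this sketch.
def limit_reached?(limit, count)
  !limit.nil? && limit.positive? && count >= limit
end

# rules: hash with cooldown_hours, max_per_day, max_per_week,
#        max_per_month and type_limits (hypothetical shape)
# stats: counts of emails already sent to the stakeholder (hypothetical shape)
# Returns the highest-priority blocking rule, or nil if the send is allowed.
def blocking_rule(rules, stats, email_type, now)
  last = stats[:last_sent_at]
  return :cooldown if last && (now - last) < rules[:cooldown_hours] * 3600

  type_limit = rules.fetch(:type_limits, {})[email_type]
  sent_of_type = stats.fetch(:sent_today_by_type, {}).fetch(email_type, 0)
  return :type_daily_limit if limit_reached?(type_limit, sent_of_type)
  return :daily_limit if limit_reached?(rules[:max_per_day], stats.fetch(:sent_today, 0))
  return :weekly_limit if limit_reached?(rules[:max_per_week], stats.fetch(:sent_this_week, 0))
  return :monthly_limit if limit_reached?(rules[:max_per_month], stats.fetch(:sent_this_month, 0))

  nil
end

rules = { cooldown_hours: 2, max_per_day: 5, max_per_week: 20,
          max_per_month: 0, type_limits: { alert: 3 } }
now = Time.utc(2024, 1, 1, 12, 0)
stats = { last_sent_at: now - 30 * 60, sent_today: 1, sent_today_by_type: { alert: 1 } }
p blocking_rule(rules, stats, :alert, now)
```

With an email sent 30 minutes earlier and a 2-hour cooldown, the call reports :cooldown, even though the daily and type limits still have room.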

Default Cadence Rules

If no cadence rules are configured, the system uses these defaults:

{
  max_per_day: 1,
  max_per_week: 2,
  max_per_month: 0,  # 0 means unlimited
  cooldown_hours: 1,
  type_limits: {
    report: 2,
    alert: 5,
    story_so_far: 1,
    custom: 2
  }
}

How Cadence Rules Are Applied

Important: Cadence rules are checked at TWO different times to ensure compliance:

1. Creation Time (Primary Check)

When: During delivery creation in DeliveryScheduleService

Step-by-step process:

  1. Workflow triggers (Layer 1 passes)
  2. System creates CommsInstance (groups all deliveries)
  3. For each stakeholder:
    • Get stakeholder's persona
    • Get persona's cadence rules
    • Check all limits:
      • Cooldown hours: ✅ or ❌
      • Daily limit: ✅ or ❌
      • Weekly limit: ✅ or ❌
      • Monthly limit: ✅ or ❌
      • Type daily limit: ✅ or ❌
    • If ALL limits pass: Create delivery with send_at = now
    • If ANY limit fails: Create delivery with send_at = next_available_at (queued for later)

Result: Most deliveries are properly scheduled at creation time based on current cadence limits.
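As a rough illustration of how send_at might be chosen at creation time, the sketch below handles the cooldown rule only. The schedule_send_at name and its parameters are hypothetical, and the daily, weekly, monthly, and type limits are deliberately left out to keep the example short.

```ruby
# Hypothetical helper: picks send_at for a new delivery based on the
# cooldown rule alone. Other cadence limits are not modelled here.
def schedule_send_at(last_sent_at, cooldown_hours, now)
  return now if last_sent_at.nil? # no previous email, send immediately

  next_available_at = last_sent_at + cooldown_hours * 3600
  [now, next_available_at].max # queue for later if the cooldown is still running
end

ten_am = Time.utc(2024, 1, 1, 10, 0)
puts schedule_send_at(nil, 2, ten_am)             # sends now
puts schedule_send_at(ten_am, 2, ten_am + 5 * 60) # queued until 12:00
```

The second call mirrors the 10:05 AM step of the walkthrough: the last email went out at 10:00 AM, the cooldown is 2 hours, so the delivery is queued for 12:00 PM.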

2. Send Time (Safety Check)

When: Right before sending in CommsDeliverySendJob

Why: Edge cases can occur where:

  • A delivery was created when limits allowed it, but limits were reached by send time
  • Multiple workflows create deliveries simultaneously
  • Cadence rules were changed after delivery creation

What happens:

  1. Before sending, the system re-checks cadence limits for the current channel
  2. If limit is hit:
    • Checks if secondary/fallback channel is available and within limits
    • If fallback available → Creates fallback delivery for secondary channel
    • If no fallback → Marks delivery as failed with cadence reason
  3. If limits pass: Delivery proceeds normally

Result: Ensures no delivery violates cadence rules, even if created before limits were reached.
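The send-time decision can be sketched as follows. This is only an assumption-laden illustration: resolve_send, the per-channel count and limit hashes, and the returned action symbols are invented for the example and do not reflect the actual CommsDeliverySendJob code.

```ruby
# counts: emails already sent today per channel (hypothetical shape)
# limits: daily limit per channel (hypothetical shape)
# Returns a hash describing what the send job should do.
def resolve_send(channel, fallback_channel, counts, limits)
  if counts.fetch(channel, 0) < limits.fetch(channel)
    return { action: :send, channel: channel }
  end

  # Primary channel is at its limit: try the fallback, if there is one.
  if fallback_channel && counts.fetch(fallback_channel, 0) < limits.fetch(fallback_channel)
    return { action: :fallback, channel: fallback_channel }
  end

  { action: :fail, reason: 'cadence_limit' }
end

limits = { email: 1, sms: 2 }
p resolve_send(:email, :sms, { email: 0 }, limits) # within limits, send by email
p resolve_send(:email, :sms, { email: 1 }, limits) # email limit hit, fall back to SMS
p resolve_send(:email, nil, { email: 1 }, limits)  # no fallback, mark as failed
```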

Example Scenario:

  • 10:00 AM: Delivery created (limit: 1/day, current count: 0) → ✅ Allowed, send_at = 10:00 AM
  • 10:30 AM: Another delivery created (limit: 1/day, current count: 0) → ✅ Allowed, send_at = 10:30 AM
  • 10:30 AM: First delivery attempts to send → ❌ Limit check: 1/1 already sent today → Attempts fallback to SMS
  • 10:30 AM: Second delivery attempts to send → ❌ Limit check: 1/1 already sent today → Attempts fallback to SMS

Example: Cadence Rules in Action

Setup:

  • Alice has persona "Manager"
  • Manager persona cadence:
    • Cooldown: 2 hours
    • Daily limit: 5 emails
    • Alert type limit: 3 emails per day

10:00 AM - First Email:

  • Kiosk health alert triggered
  • Check Alice's cadence:
    • Cooldown: ✅ (no previous email)
    • Daily: ✅ (0/5)
    • Alert type: ✅ (0/3)
  • Result: ✅ Create delivery immediately

10:30 AM - Second Email (Different Workflow):

  • Morale drop alert triggered
  • Check Alice's cadence:
    • Cooldown: ❌ (only 30 minutes since last email, need 2 hours)
    • Daily: ✅ (1/5)
    • Alert type: ✅ (1/3)
  • Result: ❌ Queue delivery for 12:00 PM (2 hours after 10:00 AM)

12:00 PM - Queued Email Sends:

  • The delivery queued at 10:30 AM is now sent
  • Check Alice's cadence:
    • Cooldown: ✅ (2 hours since 10:00 AM)
    • Daily: ✅ (1/5, now becomes 2/5)
    • Alert type: ✅ (1/3, now becomes 2/3)
  • Result: ✅ Email sent

12:05 PM - Third Email:

  • Another kiosk health alert
  • Check Alice's cadence:
    • Cooldown: ❌ (only 5 minutes since 12:00 PM, need 2 hours)
    • Daily: ✅ (2/5)
    • Alert type: ✅ (2/3)
  • Result: ❌ Queue delivery for 2:00 PM


Common Questions

Q1: Why does the cron job run every 5 minutes if rate limiting blocks most attempts?

Answer: The cron job needs to run frequently to:

  1. Detect new issues quickly - If a kiosk goes offline at 10:03 AM, we want to know by 10:05 AM, not 10:30 AM
  2. Check if issues are resolved - If a kiosk was offline but comes back online, we want to know quickly
  3. Handle rate limit expiration - When the cooldown period expires, we want to send the email as soon as possible (within 5 minutes)

Think of it like: A security guard checking every 5 minutes. Even if they don't call the police every time (rate limiting), they still need to check frequently to catch new problems.

Q2: What if I want different rate limits for different health issues?

Answer: Currently, this is not supported. All health issues (NOT_CHARGING, LOW_BATTERY, OFFLINE, etc.) trigger the same kiosk_health_status event type, so they all use the same cooldown setting, even though each kiosk + project + issue combination is tracked separately.

To implement this, you would need to:

  1. Create separate trigger event types (e.g., kiosk_not_charging, kiosk_low_battery)
  2. Modify the evaluator to return different event types based on the issue
  3. Update the workflow to support multiple related trigger events

Q3: What happens if a kiosk has multiple issues at once?

Answer: All issues are included in the email context. The system doesn't create separate emails for each issue detected at the same time; it creates one email that lists all current issues.

Example:

  • Kiosk #123: NOT_CHARGING + LOW_BATTERY + OFFLINE
  • Email sent includes all 3 issues in the context
  • Rate limiting still applies once per cooldown period for that combination of issues

Q4: Can I have different rate limits for different stakeholders?

Answer: Yes! Rate limits are configured per Persona, and stakeholders are assigned personas. So:

  • Managers might have: cooldown = 2 hours, daily limit = 5
  • Executives might have: cooldown = 4 hours, daily limit = 3
  • Technicians might have: cooldown = 1 hour, daily limit = 10

Q5: What if the cron job is delayed or misses a run?

Answer: The next run will catch up. The system checks the current state of kiosks, not a queue of events. So:

  • If the cron job is delayed by 10 minutes, the next run will check all kiosks and find any current issues
  • Rate limiting is based on last_trigger_sent_at, which is stored in the database, so it persists across delays

Q6: How do I know if an email was blocked by Layer 1 vs Layer 2?

Answer:

  • Layer 1 (Workflow-level) blocking: Check logs for: "Rate limit not yet reached for workflow X, trigger 'kiosk_health_status'. Skipping delivery."
  • Layer 2 (Stakeholder-level) blocking: Check the CommsDelivery record - if send_at is in the future and status is 'scheduled', it was queued due to cadence rules
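The Layer 2 check described above amounts to a simple predicate. The sketch below uses a plain Struct in place of the real CommsDelivery model, and queued_by_cadence? is a hypothetical helper name, so treat it as an illustration of the rule rather than code from the application.

```ruby
# Hypothetical stand-in for a CommsDelivery record.
Delivery = Struct.new(:status, :send_at, keyword_init: true)

# A delivery counts as "queued by cadence" when it is still scheduled
# and its send_at lies in the future.
def queued_by_cadence?(delivery, now)
  delivery.status == 'scheduled' && delivery.send_at > now
end

now = Time.utc(2024, 1, 1, 10, 5)
queued = Delivery.new(status: 'scheduled', send_at: now + 2 * 3600)
puts queued_by_cadence?(queued, now)
```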

Q7: What's the difference between "cooldown hours" in Layer 1 and Layer 2?

Answer:

  • Layer 1 cooldown: Prevents the workflow from creating deliveries too frequently (workflow-level)
  • Layer 2 cooldown: Prevents sending emails to a specific person too frequently (stakeholder-level)

They can be different:

  • Layer 1 might be: 1 hour (workflow uses default)
  • Layer 2 might be: 2 hours (persona has cooldown_hours = 2)

Both must pass for an email to be sent immediately.

Q8: If multiple trigger events fire in the same cron run, will they all send emails?

Answer: Yes! Each trigger event type is evaluated independently. If:

  • kiosk_offline fires → Email sent ✅
  • morale_drop_baseline fires → Email sent ✅ (even if kiosk_offline just fired)
  • kiosk_health_status fires → Email sent ✅ (even if other triggers just fired)

As long as each trigger event type hasn't fired recently (within its own cooldown), it will send.


Summary

Key Concepts

  1. Trigger events are detected by evaluators (like KioskHealthStatusEvaluator)
  2. Cron job runs every 5 minutes to check for trigger events
  3. Layer 1 (Workflow-level) rate limiting prevents the workflow from creating deliveries too frequently
  4. Layer 2 (Stakeholder-level) cadence rules prevent sending too many emails to the same person
  5. Both layers must pass for an email to be sent immediately
  6. For kiosk_health_status: Each unique combination of kiosk + project + health issue(s) has its own rate limit
  7. Different trigger event types (like kiosk_offline, morale_drop_baseline, kiosk_health_status) have separate rate limit tracking

The Flow (Simple Version)

Every 5 minutes:
  1. Cron job runs
  2. Check all kiosks for health issues
  3. If issues found:
     a. Group by context (kiosk + project + issues) for kiosk_health_status
     b. For each unique context:
        - Check Layer 1: Has this workflow sent this specific context recently?
          - If YES → Skip (don't create deliveries)
          - If NO → Continue to step c
     c. Create CommsInstance
     d. For each stakeholder:
        - Check Layer 2: Can we send to this person?
          - If YES → Create delivery with send_at = now
          - If NO → Create delivery with send_at = future (queued)
     e. Record: last_trigger_sent_at = now (for this specific context)
  4. Wait 5 minutes, repeat
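
The flow above can be approximated with a small in-memory simulation. Everything in it is hypothetical: the FlowSketch class, its single shared cooldown for both layers, and the simplification that Layer 2 only tracks the last immediately sent email per person. It is meant to show the order of the checks, not to reproduce the real services.

```ruby
# Hypothetical in-memory sketch of the two-layer flow.
class FlowSketch
  attr_reader :deliveries

  def initialize(stakeholders, cooldown_hours:)
    @stakeholders = stakeholders
    @cooldown = cooldown_hours * 3600 # same cooldown used for both layers here
    @last_trigger_sent_at = {}        # Layer 1: context key => Time
    @last_email_at = {}               # Layer 2: stakeholder => Time
    @deliveries = []
  end

  # contexts: context keys detected during this cron run
  def run(now, contexts)
    contexts.each do |context|
      # Layer 1: skip the whole context while it is inside its cooldown.
      next if in_cooldown?(@last_trigger_sent_at[context], now)

      @stakeholders.each { |person| create_delivery(person, context, now) }
      @last_trigger_sent_at[context] = now
    end
  end

  private

  def in_cooldown?(last, now)
    !last.nil? && (now - last) < @cooldown
  end

  # Layer 2: send now if the person is free, otherwise queue for later.
  def create_delivery(person, context, now)
    last = @last_email_at[person]
    if in_cooldown?(last, now)
      @deliveries << { to: person, context: context, send_at: last + @cooldown }
    else
      @deliveries << { to: person, context: context, send_at: now }
      @last_email_at[person] = now
    end
  end
end

flow = FlowSketch.new(%w[Alice Bob], cooldown_hours: 2)
ten_am = Time.utc(2024, 1, 1, 10, 0)
flow.run(ten_am, ['kiosk_health_status:123:45:NOT_CHARGING'])
flow.run(ten_am + 5 * 60, ['kiosk_health_status:123:45:LOW_BATTERY'])
flow.run(ten_am + 10 * 60, ['kiosk_health_status:123:45:NOT_CHARGING'])
flow.deliveries.each { |d| puts "#{d[:to]} #{d[:context]} #{d[:send_at]}" }
```

Running the first three steps of the walkthrough yields four deliveries: two sent at 10:00 AM, two queued for 12:00 PM by Layer 2, and none from the 10:10 AM run, which Layer 1 blocks.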

Code Locations (For Reference)

  • Cron job: lib/tasks/automation_workflows.rake, process_triggers task
  • Trigger evaluator: app/services/trigger_evaluator_service.rb
  • Kiosk health evaluator: app/services/triggers/kiosk_health_status_evaluator.rb
  • Layer 1 rate limiting: app/services/trigger_evaluator_service.rb, within_rate_limit?
  • Layer 2 cadence rules: app/services/delivery_schedule_service.rb, can_send_to_stakeholder?
  • Persona cadence rules: app/models/persona.rb, cadence_data
  • Trigger event tracking: app/models/automation_workflow.rb, last_trigger_sent_at(event_key)
