What stack does Sentinel run on?

Sentinel runs on AWS Bedrock with Anthropic Claude Sonnet 4.6 as the underlying model, invoked via the US-bounded inference profile for data residency. Each agent is an investigate-only workload with a discrete surveillance scope, a scheduled execution cadence, and structured Bedrock tool access. Detection events serialize to Slack via webhook, where the approval gate runs. Sentinel runs on the same Bedrock substrate iSimplifyMe deploys for client validator-architecture engagements.

How does the Slack approval gate work?

When a Sentinel agent identifies an incident requiring action, it posts a Block Kit card to the relevant Slack channel with the diagnosis, the recommended remediation, and a small set of action buttons. A human reviewer clicks one, and only then does the remediation fire. The pattern came from a 2026 incident where an unguarded automated drip in the Retell Phone Bridge sent the same follow-up email 48 times to three leads. Sentinel design rule since: automated detection is fine, automated remediation requires a human in the loop.

What does the Diagnostics Agent catch?

The Diagnostics Agent investigates client tenant sites that have failed three consecutive uptime checks. It runs curl, dig, and Cloudflare 5xx-breakdown probes via custom Bedrock tools, classifies root cause, and files a markdown bug-report ticket with timeline, evidence, and recommended fix. Verified cost: six cents per incident on synthetic test cases. It runs in production today across the iSM tenant fleet.

How does Sentinel differ from Nexus and Apex?

Three distinct layers. Nexus is the customer-facing AI intelligence platform, with nine modules including the Synapse orchestration layer for customer agent workflows. Apex is the multi-tenant client portal, with nine modules including Bot Analytics for tenant-facing AI-crawler intelligence. Sentinel runs internally on AWS Bedrock to keep the rest of the iSM stack honest — it is iSM's own production proof point of the validator-architecture pattern offered to regulated-industry clients. The same architecture is available to clients as a productized Sentinel-pattern monitoring retainer.

Sentinel — iSimplifyMe Labs

Production AI ops layer on AWS Bedrock — investigate-only Claude agents, Slack-gated escalation.

Abstract

Sentinel is iSimplifyMe's production AI operations layer — a fleet of investigate-only Claude agents running on AWS Bedrock that monitor iSM's own infrastructure for regressions, anomalies, and operational incidents. It is internal infrastructure, not a customer-facing product, and runs to keep the rest of the platform honest. The same architecture is offered to clients as a productized Sentinel-pattern monitoring retainer.

Problem

Production AI infrastructure has more silent failure modes than monitorable ones. A Bedrock model that responds with semantically wrong answers passes a 200 OK health check. A retrieval pipeline that surfaces stale data clears every uptime probe.

Manual log review does not scale across a multi-site network with thirty in-production engagements. Status pages tell you what is on; they do not tell you what is wrong.

Approach

The agent topology

Each Sentinel agent is an investigate-only Bedrock-hosted workload with a discrete surveillance scope and a defined cadence. The agent reads from a constrained set of operational signals — logs, recent error events, model-call traces — runs a Claude Sonnet 4.6 inference pass via the us. US-bounded inference profile to classify the situation, and decides whether the finding warrants escalation. No agent writes to client systems; the architecture is investigate-and-notify only.

Slack as the approval gate

When an agent identifies something worth escalating, it posts a Block Kit card to the appropriate channel with the diagnosis, the recommended remediation, and a small set of action buttons. A human reviewer clicks one. Only then does any remediation fire.

The design rule came directly from a 2026 incident where an unguarded automated drip in the Retell Phone Bridge sent the same follow-up email 48 times to three leads. Sentinel's discipline since then: automated detection is fine, automated remediation requires a human in the loop.

Workload #1: Diagnostics Agent

The Diagnostics Agent investigates client tenant sites that have failed three consecutive uptime checks — running curl, dig, and Cloudflare 5xx-breakdown probes via custom Bedrock tools — then files a markdown bug-report ticket with timeline, root cause, evidence, and recommended fix. Verified cost: $0.06 per incident on synthetic test cases (Claude Sonnet 4.6, ~50 seconds active runtime, ~28k tokens).

Workload #2: GH Triage Agent

The GH Triage Agent polls iSimplifyMe org repository workflow runs every fifteen minutes, detects failures, and runs an inference pass classifying root cause across eight categories: test_flake, regression, infrastructure, auth, dependency, lint_or_typecheck, build_config, and unknown. Output is a structured ticket with markdown body covering Failure Summary, Classification, Failed Jobs, Recent Commits, and investigator Notes. Verified cost: $0.065 per run (Claude Sonnet 4.6, ~44 seconds active runtime).

Idempotent — once a failed run is investigated, a 24-hour DDB lock prevents re-investigation, so flapping CI does not produce duplicate tickets.

Workload #3: Pipeline Hang Detector

The Pipeline Hang Detector watches the iSM multi-site content pipeline for anomaly states — stuck topic-proposal runs, malformed MDX rejections, frontmatter envelope drift, write-post Lambda failures — and runs an inference pass to classify the cause and identify the affected tenants. Output uses the same structured ticket format as the other agents and routes to a content-pipeline-specific Slack channel.

The three workloads share infrastructure: one generic SQS-triggered runner Lambda dispatches the right agent based on a SENTINEL_AGENT_SLUG kickoff message, an atomic conditional-write lock at INCIDENT#OPEN race-protects parallel detection paths, and the same file_ticket and notify_slack tools serve all three. Adding a new Sentinel workload is a registry entry plus a detector handler; everything else is shared.

Eat-our-own-dogfood proof point

Sentinel runs on the same AWS Bedrock substrate (BedrockRuntimeClient + ConverseStreamCommand + DynamoDB ticket store + EventBridge cron + SQS queue + IAM-scoped Bedrock perms) that iSimplifyMe deploys for client validator-architecture engagements. iSM operates Sentinel as a production proof point of the architecture it proposes for regulated-industry clients — every workload type is in production at iSM before being offered to clients.

Status

Sentinel runs in production on AWS Bedrock as iSM's internal AI operations infrastructure. Three workloads are live: Diagnostics Agent, GH Triage Agent, and Pipeline Hang Detector.
Total platform cost: under $50/month across all three workloads at current activity volume.
Architecture is investigate-only by design — no agent writes to client systems, no agent fires remediation without human approval through the Slack gate.

Roadmap

Sentinel's roadmap continues across two tracks: additional workloads against iSM's own properties, and productization as a client-facing service line.

iSM property monitoring (internal expansion)

Lighthouse regression detector — nightly Lighthouse audits across the iSM editorial atlas network (Marque Cars, Subdial, Eldercare Atlas, RoofingTechPro) and client websites; threshold-based detection of performance regressions before they affect AEO rankings.
AEO drift and citation surveillance — schedule-driven probes against ChatGPT, Gemini, AI Overview, and Perplexity for the citation-protected substrate pages currently cited as authoritative sources; alerts on framing or citation drift.
Cost anomaly detector — CloudWatch billing and Cost Explorer probes for AWS spend spikes across the iSM project portfolio.
Weekly audit agent — cross-repo health rollups across the thirty-seven iSimplifyMe org repositories.
DNS watcher — Cloudflare zone monitoring for the brand-citation infrastructure across all iSM-managed domains.

Client engagements (productized)

The Sentinel architecture is available to client engagements as a productized retainer: Sentinel-pattern monitoring. iSM operates the same investigate-only agent topology on the buyer's AWS Bedrock infrastructure to provide continuous validator-gate hit/miss telemetry, drift detection, and incident response. The retainer pairs with the Validator Architecture build engagement — audit, architecture, then operate — and is priced as a custom monthly retainer sized to scope.

The pattern is repeatable per-client: discovery (which validator gates does the buyer deploy?), Sentinel deployment (investigate-only Claude agents on the buyer's Bedrock account), Slack-gated escalation (findings route to a buyer-designated channel; remediation requires human approval), and quarterly reviews against iSM's reference architecture. Mid-market and enterprise regulated industries only.

Sentinel

What is Sentinel?