Monitoring Lytics
Introduction
Lytics provides a variety of valuable metrics that downstream monitoring tools, such as Google Stackdriver or New Relic, can consume. Using industry-standard, inexpensive tools built for alerting and monitoring, you can have visibility into Lytics within your existing ecosystem.
Consuming Metrics Downstream
Once you have connected Lytics to your monitoring tool, there are various ways you can apply metrics from Lytics to support your operational and IT audit processes. Below are a few examples:
- Active monitoring - alert on Lytics metrics; this requires additional configuration within your downstream tools.
- On-call distribution lists - control who within your operational teams is informed.
- Quiet hours - manage them in the tool where you already do so for other metrics.
- Correlation of metrics - show existing metrics (e.g., website performance) in the context of Lytics metrics.
- Operational users - watch for signals without creating a Lytics admin user account.
- Anomaly detection - use the threshold-based alerts that most monitoring tools provide, which go beyond the capabilities Lytics offers natively.
Platform Monitoring via Metric API
The Metric API provides access to a variety of metrics that are recorded in the Lytics platform. This API allows you to access segment size metrics, events received per hour, and many workflow-specific metrics.
- Heartbeats: metrics with a value of 1 for "up & healthy" and 0 for "not healthy" (or missing).
Name | Description | Updated |
---|---|---|
monitoring_heartbeat | A simple 1 (up) reported for each minute a workflow runs, indicating the overall health of the integrations platform. | Every minute* |
collection_count | A gauge of the total events ingested in each 1-minute window (web collection or import workflows). | Every minute |
stream_count | Metric per stream for a count of events seen this cycle. | Every hour |
Coming Soon | Lytics has several more metrics (API Status Heartbeat, Backlog/Latency) that are coming soon or available on request. | N/A |
*Availability of Metrics: the Lytics Metric API and all Lytics export workflows run inside Lytics' work runtime system in Kubernetes. During deploys or scaling events, these processes can move between servers, potentially resulting in gaps of 1 or 2 minutes in metrics. Therefore, alerting on a single missed heartbeat is not recommended. Instead, alert only when heartbeats are missing across a window of 5 consecutive minutes.
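The windowed check recommended above can be sketched in a few lines of Python. This is a minimal illustration rather than anything Lytics ships: the function names and the shape of the input (one sample per minute, oldest first, with `None` for a minute that produced no data point) are assumptions made for the example.

```python
from typing import Optional, Sequence

# Assumption: one sample per minute, so a window of 5 samples spans 5 minutes.
MISSING_WINDOW = 5


def is_healthy(sample: Optional[int]) -> bool:
    """A sample counts as healthy only when it is exactly 1."""
    return sample == 1


def should_alert(samples: Sequence[Optional[int]], window: int = MISSING_WINDOW) -> bool:
    """Return True when the most recent `window` samples are all unhealthy.

    `samples` is ordered oldest to newest. A value of 0 or None (missing)
    is treated as unhealthy, matching the heartbeat definition.
    """
    if len(samples) < window:
        return False  # not enough history to judge yet
    return not any(is_healthy(s) for s in samples[-window:])
```

With this logic, a one- or two-minute gap caused by a deploy does not fire an alert, while five consecutive missing or zero heartbeats do. Most monitoring tools can express the same rule natively as an "absent for N minutes" condition, which is usually preferable to custom code.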
Account Activity Monitoring via System Events API
System Events provide visibility into system changes such as creating segments, updating schema, deleting items, adding new authorizations, and adding users to roles. Together they form a wealth of audit data about changes in your account. These events are shown in many places inside Lytics, such as the history of work sync events, status changes, and work failures due to expiring authorization or to password changes on the source for OAuth tokens. See our System Events API documentation for more information.
Status Events Monitoring via Webhooks
Work status events can be observed by creating a webhook subscription that POSTs JSON data to a specific URL. These updates, like email alerting and reporting, can be consumed downstream for your monitoring use cases. Some common examples include listening for audience exports being created, updated, or deleted, or being notified whenever a batch import or export for a given integration fails.
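As a rough illustration of the receiving side, the sketch below uses only the Python standard library to accept POSTed events. It assumes the webhook body is a JSON object; the handler name, the port, and the response codes are choices made for this example and are not specified by Lytics.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(body: bytes) -> dict:
    """Decode a webhook body, rejecting anything that is not a JSON object."""
    event = json.loads(body.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical receiver for webhook POSTs."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_event(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad UTF-8
            self.send_response(400)
            self.end_headers()
            return
        print("received event:", event)  # forward to your monitoring tool here
        self.send_response(204)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer((host, port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener locally. In production you would typically place this behind HTTPS and authenticate incoming requests before trusting the payload.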
These events have three attributes used for filtering:
- Subject Type: what the event is about, such as work, workflow, user, or campaign. See the list of subject types below.
- Subject ID: identifier of a subject, such as a work ID, workflow ID, or campaign ID.
- Verb: action described by the event performed on a subject. See the list of available verbs below.
Verb | Description | Frequency |
---|---|---|
synccomplete | For the completion of one synchronization cycle. Emitted when a work cycle finishes successfully. Shown at the end when there are multiple cycles per scheduled sync or when there is a sleep cycle. | Real-time |
update | For when work configuration is modified. It may occur multiple times per work. | Real-time, batch |
created | For when a work is created. This occurs only once per work. | Real-time, batch |
deleted | For when a work is deleted. | Real-time, batch |
synced | For the completion of one sync unit (multiple units may happen per sleep cycle). | Real-time |
completed | For the final successful completion of a work. This occurs once per work. | Batch |
started | For the first time a work is started. This occurs once per work. | Real-time, batch |
failed | For the final failure of a work. This occurs once per work unless the work is bounced. | Real-time, batch |
syncing | For the start of a series of sync cycles for a work. | Real-time |
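To show how the three filter attributes combine, here is a small sketch of matching received events on the consumer side. The payload keys `subject_type`, `subject_id`, and `verb` are hypothetical names chosen for this example; check the actual event payload before relying on them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventFilter:
    """Match events on any combination of the three attributes.

    A field left as None acts as a wildcard. The dictionary keys used
    below are assumed names, not confirmed payload fields.
    """

    subject_type: Optional[str] = None
    subject_id: Optional[str] = None
    verb: Optional[str] = None

    def matches(self, event: dict) -> bool:
        wanted = {
            "subject_type": self.subject_type,
            "subject_id": self.subject_id,
            "verb": self.verb,
        }
        return all(
            value is None or event.get(key) == value
            for key, value in wanted.items()
        )


# Example: notify only on the final failure of any work.
failed_work = EventFilter(subject_type="work", verb="failed")
```

For instance, `failed_work.matches(event)` is true for a `failed` event on any work and false for a `synced` event on the same work, which keeps high-frequency real-time verbs out of your alert channel.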
Subject Types
- account
- auth
- campaign
- data
- entity
- experience
- journey
- program
- provider
- query
- report
- rollup
- schema
- schematable
- scoring
- segment
- segmentcollection
- segmentml
- stream
- subscription
- topic-document
- user
- variation
- work
- workflow