Technical | April 3, 2026 | 6 min read | Updated April 3, 2026

What Is Regime Classification and Why Does It Matter for Anomaly Detection?

By Roger Hahn | JD | MBA | MS Engineering | USPTO Reg. No. 46,376

Key Takeaways

  • Binary anomaly/not-anomaly detection misses the gradual degradation path that precedes most industrial equipment failures.
  • Canary Edge classifies each detection into one of four regimes: HEALTHY, ACTIVE, TRANSITION, and SHOCK, providing a graded warning system.
  • Per-machine calibration means regime thresholds are tuned to each specific piece of equipment, reducing the false alarms that come from one-size-fits-all models.
  • Real-world experience shows that bearing failures typically progress through all four regimes over 3 to 6 weeks, giving maintenance teams a long intervention window.

Why Is Binary Anomaly Detection Insufficient?

Traditional anomaly detection systems give you one bit of information per reading: anomaly or not anomaly. This binary output made sense when computing was expensive and models were simple. It does not make sense for industrial equipment health monitoring.

Consider a pump bearing that is failing. The failure does not happen suddenly. It develops over weeks through a progression of increasingly severe vibration patterns. A binary system tells you nothing until the fault crosses a threshold. By then, you have days or hours, not weeks.

What engineers actually need is a graded signal. Is the machine healthy and operating normally? Is something starting to change? Is it actively degrading? Is an imminent failure likely?

SCADA threshold alarms have the same limitation. You set a high-vibration alarm at 0.5g and wait. The bearing produces 0.48g for three weeks while it degrades, then spikes to 1.2g and fails. The alarm did not help because it only captured the final stage.

Regime classification addresses this directly by replacing the binary output with a four-level health state that reflects where the machine is in its degradation trajectory.

What Are the Four Regimes?

Canary Edge uses four regime labels that correspond to distinct stages of machine health:

HEALTHY: The machine is operating within its learned normal envelope. Prediction error is low. No intervention is needed. This is the baseline state that the model learned from your historical healthy operating data.

ACTIVE: The machine has moved outside its normal envelope in a statistically meaningful but not yet alarming way. Something has changed. The shift may be a new operating condition, a minor process change, or the earliest signs of developing wear. Monitor more closely and note the date.

TRANSITION: The machine is in a clearly degraded state. Prediction error is elevated and sustained. This is not a transient fluctuation. A developing fault is likely. Schedule inspection and maintenance within days to weeks, depending on criticality.

SHOCK: The machine is in a severely anomalous state. Prediction energy has spiked well above the normal range. Imminent failure is possible. Immediate action is warranted.

Each regime maps to a z-score range of the prediction energy: HEALTHY is below 2.0, ACTIVE is from 2.0 up to but not including 3.0, TRANSITION is from 3.0 up to but not including 5.0, and SHOCK is 5.0 and above. These thresholds are calibrated per machine during baseline creation.
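The mapping from z-score to regime label can be sketched in a few lines of Python. This is an illustration only, not Canary Edge's implementation: the function name classify_regime and the DEFAULT_THRESHOLDS dictionary are hypothetical, and treating each lower bound as inclusive is an assumption inferred from "SHOCK is 5.0 and above."

```python
# Hypothetical sketch: map a prediction-energy z-score to a regime label.
# Cutoff values come from the article; names and boundary handling are assumptions.
DEFAULT_THRESHOLDS = {"active": 2.0, "transition": 3.0, "shock": 5.0}


def classify_regime(z_score, thresholds=DEFAULT_THRESHOLDS):
    """Return the regime whose z-score band contains z_score."""
    # Check the most severe band first so each value lands in exactly one regime.
    if z_score >= thresholds["shock"]:
        return "SHOCK"
    if z_score >= thresholds["transition"]:
        return "TRANSITION"
    if z_score >= thresholds["active"]:
        return "ACTIVE"
    return "HEALTHY"


for z in (0.7, 2.4, 3.84, 6.1):
    print(z, classify_regime(z))
```

Passing the thresholds as an argument keeps the door open for per-machine cutoffs rather than hard-coding one set for every asset.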

How Does Temporal Learning Enable Regime Detection?

Regime classification is only possible because Canary Edge learns how each machine behaves over time, not just what values it produces.

A static threshold system knows that 0.5g is the alarm limit. It does not know that this bearing typically runs at 0.08g, that it has been at 0.11g for the past week, and that it reached 0.13g today. The trend is diagnostic. The absolute value is not.
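The contrast can be illustrated with the numbers above. The sketch below is a simplification: it scores raw vibration against the bearing's own baseline rather than a model's prediction error, and the baseline standard deviation of 0.01g is an assumed value chosen for illustration.

```python
# Simplified illustration: static alarm limit vs. deviation from a machine's own baseline.
# Readings and the 0.5 g limit come from the article; BASELINE_STD_G is an assumed value.
STATIC_LIMIT_G = 0.5
BASELINE_MEAN_G = 0.08   # typical level for this bearing
BASELINE_STD_G = 0.01    # assumption, not from the article

readings_g = {"typical": 0.08, "past week": 0.11, "today": 0.13}

results = {}
for label, value in readings_g.items():
    static_alarm = value >= STATIC_LIMIT_G
    # Distance from this machine's normal level, in baseline standard deviations.
    z = (value - BASELINE_MEAN_G) / BASELINE_STD_G
    results[label] = (static_alarm, z)
    print(f"{label:>9}: {value:.2f} g  static_alarm={static_alarm}  z={z:.1f}")
```

None of the readings trips the static limit, yet the baseline-relative score climbs steadily, which is the trend the paragraph describes.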

The JEPA model (Joint Embedding Predictive Architecture) learns the temporal dynamics of your machine. It compresses each sensor window into a latent representation and predicts what the next latent state should be. Prediction error accumulates as the machine deviates from its learned pattern.
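The encode, predict, compare loop described here can be sketched with toy stand-ins. This is emphatically not the JEPA model: in a trained system the encoder and predictor are learned, whereas the encode and predict_next functions below are trivial, hypothetical placeholders, and the sample windows are made up.

```python
# Toy illustration of latent-space prediction error. NOT the JEPA model:
# encode() and predict_next() are deliberately trivial, hypothetical stand-ins.
def encode(window):
    """Compress a window of samples into a 2-number latent: (mean, peak-to-peak)."""
    return (sum(window) / len(window), max(window) - min(window))


def predict_next(latent):
    """Stand-in predictor: assume the next latent equals the current one."""
    return latent


def prediction_error(predicted, actual):
    """Squared Euclidean distance between predicted and actual latents."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


# Made-up vibration windows in g.
steady = [0.08, 0.09, 0.07, 0.08]
also_steady = [0.08, 0.07, 0.09, 0.08]
drifted = [0.12, 0.15, 0.10, 0.13]

err_normal = prediction_error(predict_next(encode(steady)), encode(also_steady))
err_drift = prediction_error(predict_next(encode(steady)), encode(drifted))
print(err_normal, err_drift)
```

The point of the sketch is only the shape of the computation: the error is measured between a predicted latent and the latent of what actually arrived, so it grows when behavior departs from the expected pattern.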

This temporal prediction error is what drives regime classification. A machine that was HEALTHY last week and is ACTIVE today has undergone a real behavioral change. The regime label captures that change even if every individual sensor reading is still within a normal range.

This is fundamentally different from anomaly scoring systems that evaluate each point independently with no memory of what came before.

Why Does Per-Machine Calibration Matter?

No two machines are identical, even when they are the same model from the same manufacturer installed on the same day.

One pump runs at the end of a long discharge line against high head. Another runs at the front of a short circuit at low head. The vibration signatures are different. The bearing load is different. The thermal profile is different. A regime threshold calibrated for the first pump would generate constant false alarms on the second.

Canary Edge calibrates regime thresholds individually for each machine during baseline creation. You provide a sample of healthy operating data. The model learns the prediction error distribution for that specific machine at its specific operating point. Regime thresholds are then set at statistically meaningful multiples of the baseline error standard deviation.
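A minimal sketch of this calibration step, assuming the baseline is summarized by the mean and sample standard deviation of healthy-period prediction errors. The function names, the multiples in Z_MULTIPLES (taken from the z-score bands quoted earlier), and all sample values are assumptions for illustration, not the product's actual procedure.

```python
# Hypothetical sketch of per-machine calibration; names and sample values are assumptions.
from statistics import mean, stdev

Z_MULTIPLES = {"active": 2.0, "transition": 3.0, "shock": 5.0}


def calibrate(healthy_errors):
    """Derive a baseline and regime cutoffs from healthy-period prediction errors."""
    mu = mean(healthy_errors)
    sigma = stdev(healthy_errors)  # sample standard deviation; needs >= 2 points
    if sigma == 0:
        raise ValueError("baseline has zero variance; provide more varied data")
    cutoffs = {name: mu + k * sigma for name, k in Z_MULTIPLES.items()}
    return {"mean": mu, "std": sigma, "cutoffs": cutoffs}


def z_score(error, baseline):
    """Express a prediction error in baseline standard deviations."""
    return (error - baseline["mean"]) / baseline["std"]


# Two pumps of the same model with different healthy error levels (made-up numbers).
pump_a = calibrate([1.0, 1.2, 0.9, 1.1, 1.0, 0.8])
pump_b = calibrate([2.0, 2.6, 1.7, 2.3, 2.1, 1.9])

same_raw_error = 1.6
print(z_score(same_raw_error, pump_a))  # well above pump A's baseline
print(z_score(same_raw_error, pump_b))  # below pump B's baseline mean
```

The same raw error is a strong deviation for one pump and unremarkable for the other, which is why a shared fixed cutoff would mislabel at least one of them.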

The result is that HEALTHY means healthy for this machine at this installation, not healthy according to a generic industry table. SHOCK means a severe deviation from what this machine normally does, not an exceedance of a fixed number.

Per-machine calibration is what allows regime classification to work across diverse fleets where identical thresholds would produce meaningless results.

What Does Regime Progression Look Like on a Real Pump?

Here is a representative example drawn from a centrifugal pump with a developing outer race bearing defect, monitored over five weeks.

Weeks 1 to 3: The pump operates in HEALTHY regime throughout. Vibration is stable at 0.08g on the X-axis. The model predicts each window accurately. Prediction error remains below the ACTIVE threshold.

Week 4, day 1: The regime shifts to ACTIVE. X-axis vibration has risen to 0.11g. Individual readings are still within what most threshold systems would consider normal. But the model detects that the vibration-to-temperature correlation has changed. The bearing is producing more vibration without generating proportional heat, consistent with early rolling contact fatigue.

Week 5, day 3: The regime shifts to TRANSITION. Vibration is at 0.19g. Temperature is beginning to rise as friction increases. Per-channel scores show the X-axis channel contributing 71% of prediction error. The maintenance team schedules a bearing inspection for the following weekend.

Week 5, day 5 (before the scheduled maintenance): The regime reaches SHOCK briefly during a load spike, then returns to TRANSITION. The team moves the inspection to that day. They find a spall on the outer race. The bearing is replaced. Estimated cost of unplanned failure avoided: $45,000 in emergency repair and downtime.

Compare this to a SCADA threshold monitor set at 0.5g vibration. That system would have generated zero alerts during this entire five-week period.
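The comparison can be replayed in a short sketch. The vibration values and regime labels below are the ones narrated in this example; the tuple layout is an assumption, the first row stands in for the whole HEALTHY period of weeks 1 to 3, and None marks a vibration level the narrative does not report, which the static check therefore skips.

```python
# Replay of the narrated pump timeline. Vibration values and regimes come from the
# article; the tuple layout is an assumption. None marks a value the article omits.
STATIC_LIMIT_G = 0.5

timeline = [
    # (week, day, x_vibration_g, regime)
    (1, 1, 0.08, "HEALTHY"),     # representative of weeks 1 to 3
    (4, 1, 0.11, "ACTIVE"),
    (5, 3, 0.19, "TRANSITION"),
    (5, 5, None, "SHOCK"),       # brief spike; vibration level not reported
    (5, 5, None, "TRANSITION"),
]

# Static alarm: fires only when a reported reading reaches the fixed limit.
static_alarms = [
    (week, day) for week, day, vib, _ in timeline
    if vib is not None and vib >= STATIC_LIMIT_G
]

# Regime changes: record every point where the label differs from the previous one.
regime_changes = []
previous = None
for week, day, _, regime in timeline:
    if regime != previous:
        regime_changes.append((week, day, previous, regime))
        previous = regime

print("static alarms:", len(static_alarms))
for week, day, old, new in regime_changes[1:]:
    print(f"week {week} day {day}: {old} -> {new}")
```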

What Does the Regime Classification Look Like in the API Response?

The regime field is returned directly in every detection response. Here is a representative response from the univariate detection endpoint:

```json
{
  "machine_id": "pump-circuit-7",
  "timestamp": "2026-04-03T14:32:00Z",
  "regime": "TRANSITION",
  "prediction_energy": 4.12,
  "z_score": 3.84,
  "is_anomaly": true,
  "channel": "vibration_x",
  "thresholds": {
    "active": 2.0,
    "transition": 3.0,
    "shock": 5.0
  }
}
```

The regime field gives you the interpretable label. The prediction_energy and z_score give you the raw numbers if you want to build your own logic downstream. The thresholds block shows the calibrated cutoffs for this specific machine.
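As a sketch of what downstream logic might look like, the snippet below parses the sample response above and maps the regime to a follow-up action. The field names match the sample; the handle_detection function, the action strings, and the z_to_shock headroom figure are hypothetical additions for illustration.

```python
# Hypothetical downstream handler. Field names match the sample response;
# the action strings and function name are assumptions for illustration.
import json

RESPONSE_TEXT = """
{
  "machine_id": "pump-circuit-7",
  "timestamp": "2026-04-03T14:32:00Z",
  "regime": "TRANSITION",
  "prediction_energy": 4.12,
  "z_score": 3.84,
  "is_anomaly": true,
  "channel": "vibration_x",
  "thresholds": {"active": 2.0, "transition": 3.0, "shock": 5.0}
}
"""

ACTIONS = {
    "HEALTHY": "no action",
    "ACTIVE": "monitor more closely and note the date",
    "TRANSITION": "schedule inspection",
    "SHOCK": "act immediately",
}


def handle_detection(text):
    """Parse one detection response and decide what to do with it."""
    detection = json.loads(text)
    regime = detection["regime"]
    # Headroom: how far the z-score is from this machine's SHOCK cutoff.
    headroom = detection["thresholds"]["shock"] - detection["z_score"]
    return {
        "machine_id": detection["machine_id"],
        "action": ACTIONS.get(regime, "unknown regime, review manually"),
        "z_to_shock": round(headroom, 2),
    }


print(handle_detection(RESPONSE_TEXT))
```

Reading the cutoffs from the response itself, rather than hard-coding them, keeps custom logic aligned with whatever was calibrated for that machine.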

For multivariate detection, the response also includes per-channel contribution scores that show which sensor is driving the anomaly. This combination of regime label plus contribution scores is what turns a detection event into actionable diagnostic information.
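One plausible way to read contribution scores is as each channel's share of the total prediction error. The sketch below illustrates that idea only: the multivariate response schema is not shown in this article, so the channel names, the error values, and the contribution_shares function are all made up.

```python
# Hypothetical sketch: the multivariate response schema is not shown in this article,
# so the channel names and per-channel error values below are made up.
def contribution_shares(channel_errors):
    """Normalize per-channel prediction errors into fractions that sum to 1."""
    total = sum(channel_errors.values())
    if total <= 0:
        return {name: 0.0 for name in channel_errors}
    return {name: err / total for name, err in channel_errors.items()}


channel_errors = {"vibration_x": 71.0, "vibration_y": 19.0, "temperature": 10.0}
shares = contribution_shares(channel_errors)
top_channel = max(shares, key=shares.get)

# Print channels from largest to smallest contribution.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {share:.0%}")
print("dominant channel:", top_channel)
```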

See the API documentation for the full response schema, or schedule a call to discuss your specific monitoring use case.
