Proactive soft-failure prediction in optical transport networks via physics-inspired features and Infrastructure-as-Code orchestration

Main comparison on real data

Table 2 summarizes the test-set performance of all seven models on the real Mendeley benchmark. Figures are mean ± 95% CI across seeds for learned models.

Table 2 Model comparison on the Ghosh–Adhya (2025) real-data benchmark. All models share identical trajectory-level train/val/test splits and feature vectors. MAE reported in seconds; “approaching MAE” filters to samples where failure is within the trajectory window (non-censored targets).

Three findings stand out. First, the tree-ensemble methods (RF and XGBoost) produce the lowest MAE on both the overall and approaching-failure subsets, with an inter-estimator gap below 1%. This suggests that the physics-inspired feature set, not the specific learner, is the dominant driver of performance. Second, the \(\sim\)6\(\times\) MAE gap between tree ensembles and heuristic baselines indicates that the proposed approach provides substantial value over industry-standard threshold rules. The concern that the baselines were weak is addressed both by the added XGBoost, LSTM, and 1D-CNN comparisons and by the larger gap observed on real data. Third, deep sequence models (LSTM, 1D-CNN) do not improve on tree ensembles under fair-comparison conditions, which is consistent with the bimodal censored-regression structure of the TTF target (64% of test samples are censored at the cap). We note that 1D-CNN exhibits substantial seed variance (\(\pm 37\) s), reinforcing the importance of multi-seed evaluation.
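The distinction between overall MAE and approaching MAE can be illustrated with a minimal sketch. It assumes that a target is censored when it equals the cap, and it takes the cap to be the 880 s ceiling mentioned in the case studies; the function names and the toy values are hypothetical and are not taken from the benchmark.

```python
from statistics import mean

# Assumption: the censoring cap equals the 880 s ceiling quoted in the case studies.
TTF_CAP_S = 880.0


def mae(y_true, y_pred):
    """Mean absolute error over all samples, censored or not."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def approaching_mae(y_true, y_pred, cap=TTF_CAP_S):
    """MAE restricted to non-censored targets (true TTF strictly below the cap)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t < cap]
    if not pairs:
        raise ValueError("no non-censored samples to evaluate")
    return mean(abs(t - p) for t, p in pairs)


# Toy values only: two censored samples and two approaching-failure samples.
y_true = [880.0, 880.0, 120.0, 40.0]
y_pred = [880.0, 870.0, 150.0, 60.0]

overall = mae(y_true, y_pred)
approaching = approaching_mae(y_true, y_pred)
print(overall, approaching)
```

Because censored samples are often predicted almost exactly at the cap, the overall figure can look much better than the approaching-failure figure, which is why both are worth reporting.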

Per-class performance

Table 3 decomposes MAE by failure class, restricted to approaching-failure samples where a non-trivial regression target exists.

Table 3 Approaching-failure MAE (seconds) per failure class, real benchmark.

Three observations follow. First, the near-zero MAE on ECL and no-failure classes for tree ensembles reflects correct identification of the “ceiling” regime: these trajectories never cross threshold in the observation window, and the models correctly predict the cap. Deep models show \(\sim\)30–50 s calibration error on these classes, consistent with softer output activations. Second, the EDFA class is the hardest, with \(\sim\)127 s MAE even for the best models; this reflects the wide variance in EDFA decay rates across the 756 lightpaths (ranging from marginal \(\sim\)3 dB declines to full exponential collapses). Third, the NLI class benefits more from XGBoost than from RF, with approaching \(R^2\) rising from 0.08 to 0.14, which is consistent with boosting’s better handling of accelerating, non-monotonic degradation signatures.

Synthetic-data cross-physics validation

On the synthetic multi-physics benchmark (details in supplementary material), the same Random Forest model achieves 17.9 s MAE with \(R^2 = 0.914\). Cross-model validation (train on one physics, test on another) yields 10–32 s MAE on transfers between gradual modes (OU \(\leftrightarrow\) exponential \(\leftrightarrow\) Weibull \(\leftrightarrow\) oscillatory), and 38–52 s MAE on the step-failure class, which is physically unpredictable from pre-failure telemetry and serves as a control case. These results are consistent with the real-data findings in that gradual degradation modes are learnable and transferable, while catastrophic step changes are not (Figs. 4 and 5).
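The train-on-one, test-on-another protocol can be sketched as a loop over ordered pairs of physics modes. To stay dependency-free, the example below substitutes a one-feature least-squares line for the Random Forest used in the paper; the mode names, the helper functions, and the toy data are all assumptions made for illustration.

```python
from itertools import permutations
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in for the RF)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def transfer_mae(datasets):
    """Fit on each source mode and report MAE on every other (target) mode."""
    results = {}
    for src, dst in permutations(datasets, 2):
        slope, intercept = fit_line(*datasets[src])
        xs, ys = datasets[dst]
        errors = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
        results[(src, dst)] = mean(errors)
    return results


# Toy data: (feature values, TTF targets) per hypothetical physics mode.
toy = {
    "exponential": ([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]),
    "weibull": ([1.0, 2.0, 3.0], [12.0, 22.0, 32.0]),
}
scores = transfer_mae(toy)
for (src, dst), value in sorted(scores.items()):
    print(f"train={src:12s} test={dst:12s} MAE={value:.1f}")
```

The resulting dictionary is the off-diagonal part of a transfer matrix; a step-failure mode would appear as rows and columns with systematically higher error.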

Per-alarm interpretability: case studies

To demonstrate that the framework provides operationally meaningful explanations at decision time, we examine four trajectories drawn from the real-data test set representing the four qualitative outcomes an operator may encounter:

Case 1 – EDFA, true positive (traj 70): OSNR declines steadily from 19 dB over 800 samples. The alarm fires at \(t{=}724\), 44 s before the 15 dB crossing (Fig. 6). SHAP attribution at the alarm moment identifies current OSNR (−490 SHAP), rolling standard deviation (−85), SNR\(_{t-1}\) (−55), and rolling mean (−45) as the dominant drivers. This gives an operator-readable diagnostic: “current signal is low, recent history is consistently low with little noise, and this is a genuine degradation rather than a measurement transient.”

Case 2 – NLI, true positive (traj 2087): OSNR declines sharply from 17 dB to 15 dB in 350 samples. The alarm fires at \(t{=}337\), 30 s before failure (Fig. 7). The shorter lead time reflects the accelerating degradation characteristic of NLI. SHAP attribution at the alarm follows the same ranking as in the EDFA case, indicating that a single decision logic applies across physically distinct failure modes, a property that is valuable for operator training.

Case 3 – stable link, true negative (traj 1872): OSNR remains at 19 dB for the full 900 samples. The predicted TTF remains pegged at the 880 s ceiling with small (\(\sim\)30 s) transient dips that never approach the 60 s alarm threshold. This demonstrates that the persistence filter and learned decision boundary combine to produce “trusted silence” on stable links.

Case 4 – false positive (traj 1215): OSNR declines slowly from 18.8 dB to \(\sim\)15.2 dB over the full trajectory without crossing the 15 dB threshold within the observation window. An alarm fires at \(t{=}894\), six samples before trajectory end (Fig. 8). SHAP attribution reveals that current OSNR and rolling mean drive the decision, and this reasoning would likely resolve to a true positive had the observation continued beyond sample 900. This case illustrates the intended use of SHAP in production: operators can examine the explanation, classify marginal alerts, and feed disposition back for continuous learning.
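The alarm behaviour in these four cases can be approximated by a threshold on predicted TTF combined with a persistence requirement. The sketch below is illustrative rather than the authors' implementation: it assumes that the alarm fires only after the prediction has stayed below the 60 s threshold on the triggering poll and on two further consecutive polls, and the function name and toy sequences are hypothetical.

```python
def first_alarm_index(predicted_ttf, threshold_s=60.0, extra_polls=2):
    """Return the poll index at which the alarm fires, or None if it never does.

    Assumption: the alarm requires `extra_polls` confirmations after the first
    sub-threshold prediction, all consecutive.
    """
    needed = extra_polls + 1
    run = 0  # length of the current streak of sub-threshold predictions
    for i, ttf in enumerate(predicted_ttf):
        run = run + 1 if ttf < threshold_s else 0
        if run >= needed:
            return i
    return None


# Toy sequences of predicted TTF in seconds, one value per poll.
stable_link = [880.0, 850.0, 880.0, 55.0, 880.0, 880.0]  # single transient dip
degrading_link = [200.0, 90.0, 58.0, 50.0, 41.0, 30.0]   # sustained decline

print(first_alarm_index(stable_link))
print(first_alarm_index(degrading_link))
```

Under these assumptions an isolated dip resets the streak and produces no alarm, while a sustained decline triggers on the third consecutive sub-threshold poll.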

Global feature importance

Aggregating SHAP values across 5000 real test samples produces a markedly different distribution from the synthetic-data analysis (Fig. 9). On real data, current OSNR contributes 77.9% of total mean-absolute SHAP, with SNR\(_{t-10}\) (6.9%), rolling mean (6.3%), and rolling standard deviation (5.1%) forming the next tier; velocity contributes 0.75% and acceleration 0.1%. This concentration reflects the smoother, less jittery character of real telemetry compared to stochastic-simulator trajectories (which inject controlled noise to differentiate failure modes).
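The percentage shares quoted above correspond to a normalization of the mean absolute SHAP value per feature. A minimal sketch of that aggregation follows; it assumes the per-sample SHAP values are already available as rows of numbers, and the feature names and values are toy placeholders rather than results from the paper.

```python
def shap_shares(shap_rows, feature_names):
    """Percentage share of total mean-absolute SHAP attributed to each feature."""
    n = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * value / total for name, value in zip(feature_names, mean_abs)}


# Toy SHAP matrix: one row per sample, one column per (hypothetical) feature.
names = ["current_osnr", "rolling_mean", "velocity"]
rows = [
    [-6.0, 1.0, 1.0],
    [-8.0, 3.0, -1.0],
]
shares = shap_shares(rows, names)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Taking absolute values before averaging matters: positive and negative contributions of the same feature would otherwise cancel and understate its importance.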

Interpretation: at the aggregate level, current OSNR is the strongest single predictor. At alarm moments specifically (see “Per-alarm interpretability: case studies”), rolling statistics contribute \(\sim\)25–30% of per-decision attribution. The derivative features remain informative for distinguishing among failure modes (as demonstrated in the cross-physics synthetic validation), but their contribution is concentrated at the decision boundary rather than uniformly across the operational envelope.

End-to-end latency budget

Table 4 reports stage-wise wall-clock latency of the proposed pipeline, measured over 200 iterations per stage. Stages 1–4 are directly measured; stages 5–6 are estimated from published Kubernetes controller benchmarks and Sgambelluri et al.’s¹⁶ reported Terraform-to-OpenROADM apply times, as we do not currently operate a physical optical device in the loop (Fig. 10).

Table 4 Measured end-to-end latency budget (mean, milliseconds).

Three observations follow. First, ML inference is negligible: feature extraction plus RF inference totals 25.2 ms, less than 0.5% of the end-to-end budget. Physics-inspired tabular features plus tree inference yield a decision pipeline that is operationally invisible. Second, the persistence filter is the dominant deliberate delay, representing a design choice (two additional polls at 1 s intervals) that reduces the synthetic-data false-alarm rate from 12% to 2%. Third, orchestration stages (Kubernetes + Terraform) together account for 68% of the total budget. These stages are the legitimate optimization target for future work; options include pre-compiled Terraform plans, event-driven reconciliation instead of polling, and direct NETCONF APIs bypassing the Terraform layer for time-critical migrations.
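Stage-wise wall-clock latency of the kind reported for the directly measured stages can be collected with a simple timing harness. The sketch below averages 200 timed calls of a stage, matching the iteration count stated above; the harness itself, the function names, and the stand-in workload are assumptions and do not reproduce the authors' measurement code.

```python
import time
from statistics import mean


def mean_latency_ms(stage, iterations=200):
    """Mean wall-clock time of `stage()` in milliseconds over `iterations` calls."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        stage()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def toy_stage():
    """Hypothetical stand-in for a pipeline stage such as feature extraction."""
    return sum(i * i for i in range(1000))


latency = mean_latency_ms(toy_stage)
print(f"mean latency: {latency:.3f} ms")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for interval timing; reporting a percentile alongside the mean would additionally expose tail latency.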

The 6.7 s end-to-end budget fits comfortably within the lead times observed in the per-alarm case studies: 44 s for the EDFA case and 30 s for the NLI case. At the 30 s NLI lead time, the pipeline consumes 22% of the available time; at the 44 s EDFA lead time, 15%. The orchestration layer is therefore fast enough to act within the prediction horizon for gradual failures. Although the real-data MAE of 73.2 s is larger than the simulation-only result (17.9 s), the shortest observed lead time (30 s for NLI failures) is still about 4.5\(\times\) the 6.7 s budget, that is, it exceeds the budget by roughly 3.5 budget lengths, providing adequate operational margin for make-before-break migration.
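The margin arithmetic in this paragraph can be reproduced directly from the quoted figures. The sketch below uses only numbers stated in the text (a 6.7 s budget and lead times of 44 s and 30 s); the function name is hypothetical.

```python
PIPELINE_S = 6.7  # end-to-end budget quoted in the text, in seconds


def budget_use(lead_time_s, pipeline_s=PIPELINE_S):
    """Return (fraction of lead time consumed, lead-time-to-budget ratio)."""
    return pipeline_s / lead_time_s, lead_time_s / pipeline_s


edfa_fraction, edfa_ratio = budget_use(44.0)
nli_fraction, nli_ratio = budget_use(30.0)

print(f"EDFA: {edfa_fraction:.0%} of lead time used, ratio {edfa_ratio:.1f}x")
print(f"NLI:  {nli_fraction:.0%} of lead time used, ratio {nli_ratio:.1f}x")
```

The fraction consumed and the lead-time-to-budget ratio are reciprocals of each other, so either can be reported, provided the text states which one is meant.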
