ML Models
TwinEdge's AutoML platform enables you to build, train, and deploy machine learning models for anomaly detection, predictive maintenance, and forecasting without writing code.
Overview
ML Capabilities
| Capability | Description | Use Cases |
|---|---|---|
| Anomaly Detection | Detect unusual patterns | Equipment faults, process deviations |
| Classification | Categorize data points | Fault classification, quality control |
| Regression | Predict continuous values | Energy consumption, yield prediction |
| Time Series Forecasting | Predict future values | Demand forecasting, capacity planning |
| RUL Prediction | Estimate remaining useful life | Predictive maintenance scheduling |
Supported Algorithms
Anomaly Detection:
- Isolation Forest
- Autoencoder Neural Network
- One-Class SVM
Classification & Regression:
- XGBoost
- Random Forest
- Gradient Boosting
Time Series:
- Prophet (Facebook)
- ARIMA
- LSTM Neural Network
RUL Prediction:
- XGBoost with survival analysis
- Weibull distribution fitting
Dataset Management
Before training models, you need to prepare your data.
Creating a Dataset
- Go to ML → Datasets → New Dataset
- Choose data source:
  - From Telemetry: Query historical data
  - File Upload: CSV or Parquet files
  - Manual Entry: Direct input
From Telemetry
Query your existing sensor data:
```sql
SELECT timestamp, vibration_x, vibration_y, temperature, current
FROM telemetry
WHERE asset_id = 'Pump_001'
  AND timestamp >= NOW() - INTERVAL '30 days'
```
File Upload
Supported formats:
- CSV (comma or tab separated)
- Parquet (recommended for large datasets)
Requirements:
- Header row with column names
- Timestamp column (if time-series)
- Numeric columns for features
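If you assemble the file yourself, a minimal pandas sketch produces a file that satisfies these requirements (pandas and pyarrow are assumed to be installed; the column names and values are illustrative):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Header row, a timestamp column, and numeric feature columns.
df = pd.DataFrame({
    "timestamp": pd.date_range("2026-01-01", periods=1000, freq="1min"),
    "vibration_x": rng.normal(2.5, 0.3, 1000),
    "vibration_y": rng.normal(2.1, 0.3, 1000),
    "temperature": rng.normal(65.0, 2.0, 1000),
    "current": rng.normal(12.5, 1.0, 1000),
})

df.to_csv("pump_data.csv", index=False)          # header row written by default
df.to_parquet("pump_data.parquet", index=False)  # Parquet needs pyarrow or fastparquet
```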
Data Profiling
After upload, TwinEdge automatically profiles your data:
- Row count: Total records
- Column types: Numeric, categorical, datetime
- Missing values: Percentage per column
- Statistics: Min, max, mean, std dev
- Distributions: Histograms for each column
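The same profile can be reproduced offline; a rough pandas equivalent (pandas assumed, and the histograms additionally need matplotlib):

```python
import pandas as pd

df = pd.read_parquet("pump_data.parquet")  # illustrative file name

print(len(df))                 # row count
print(df.dtypes)               # column types
print(df.isna().mean() * 100)  # missing values: percentage per column
print(df.describe())           # min, max, mean, std dev
df.hist(figsize=(10, 6))       # distributions (requires matplotlib)
```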
Data Labeling
For supervised learning, label your data:
Manual Labeling
- Open dataset → Labels tab
- Select rows to label
- Apply label (e.g., "Normal", "Fault", "Bearing Failure")
Rule-Based Labeling
Create automatic labeling rules:
```
IF vibration_x > 7 THEN label = "High Vibration"
IF temperature > 90 AND current > 30 THEN label = "Overload"
```
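Evaluated offline, the two rules above amount to roughly this pandas logic (a sketch only; TwinEdge applies the rules for you, and the "Normal" default is an assumption):

```python
import numpy as np
import pandas as pd

df = pd.read_parquet("pump_data.parquet")  # illustrative file name

# Rules are checked in order; rows matching neither get the default label.
conditions = [
    df["vibration_x"] > 7,
    (df["temperature"] > 90) & (df["current"] > 30),
]
labels = ["High Vibration", "Overload"]
df["label"] = np.select(conditions, labels, default="Normal")
```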
Import Labels
Upload a CSV with row IDs and labels.
Training Models
Starting a Training Job
- Go to ML → Training → New Training Job
- Complete the wizard:
Step 1: Select Dataset
- Choose a prepared dataset
- Review data summary
Step 2: Choose Algorithm
- Select model type (anomaly detection, classification, etc.)
- Pick specific algorithm
Step 3: Configure Hyperparameters
- Use defaults (recommended for beginners)
- Or customize:
- n_estimators: Number of trees (50-500)
- max_depth: Tree depth (3-20)
- contamination: Expected anomaly rate (0.01-0.1)
Step 4: Review & Launch
- Confirm settings
- Name your model
- Click Start Training
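Training runs server-side, but the Step 3 hyperparameters map onto a standard Isolation Forest. A local sketch with scikit-learn (an assumption for illustration, not necessarily TwinEdge's internal implementation):

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

df = pd.read_parquet("pump_data.parquet")  # illustrative file name
features = df[["vibration_x", "vibration_y", "temperature", "current"]]

# n_estimators and contamination mirror Step 3; contamination is the
# expected fraction of anomalies in the training data.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(features)

scores = -model.score_samples(features)  # higher score = more anomalous
```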
Hyperparameter Optimization
Enable Auto-Optimize to automatically find the best hyperparameters:
- Uses Bayesian optimization
- Runs multiple training iterations
- Selects best performing configuration
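Conceptually this is the loop that libraries such as Optuna implement; a hedged sketch (Optuna and the toy dataset are illustrative assumptions, not TwinEdge's internals):

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=4, random_state=42)

def objective(trial):
    # Search the same ranges shown in Step 3.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 20),
    }
    model = RandomForestClassifier(**params, random_state=42)
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")  # TPE, a Bayesian-style sampler
study.optimize(objective, n_trials=30)
print(study.best_params)
```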
Training Progress
Monitor training in real-time:
- Progress percentage
- Current epoch/iteration
- Training loss curve
- Validation metrics
Training Logs
View detailed logs:
```
[2026-01-06 10:00:01] Loading dataset: pump_data_30d
[2026-01-06 10:00:05] Dataset loaded: 50,000 rows, 15 columns
[2026-01-06 10:00:06] Starting feature preprocessing
[2026-01-06 10:00:10] Training Isolation Forest model
[2026-01-06 10:05:22] Training complete - Accuracy: 94.2%
[2026-01-06 10:05:25] Model exported to ONNX format
```
Model Evaluation
Performance Metrics
After training, review model performance:
Classification/Anomaly Detection:
- Accuracy: Overall correct predictions
- Precision: True positives / (true positives + false positives)
- Recall: True positives / (true positives + false negatives)
- F1 Score: Harmonic mean of precision and recall
Regression:
- MAE: Mean Absolute Error
- RMSE: Root Mean Square Error
- R²: Coefficient of determination
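These definitions match scikit-learn's, which is handy for re-checking exported predictions offline (scikit-learn assumed; the values below are toy data):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_absolute_error,
                             mean_squared_error, r2_score)

# Classification / anomaly detection (1 = anomaly)
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred))

# Regression
actual = [10.0, 12.5, 9.8]
predicted = [9.5, 13.0, 10.1]
print(mean_absolute_error(actual, predicted),
      mean_squared_error(actual, predicted) ** 0.5,  # RMSE
      r2_score(actual, predicted))
```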
Confusion Matrix
Visual breakdown of predictions:
```
                    Predicted
                 Normal   Anomaly
Actual  Normal      450        50
        Anomaly      30       470
```
Feature Importance
See which features most influence predictions:
```
vibration_x: █████████░░░░░░░░░░░ 45%
vibration_y: ███████░░░░░░░░░░░░░ 35%
temperature: ████░░░░░░░░░░░░░░░░ 20%
```
Testing Models
Test with sample data before deployment:
- Go to model details → Test Model
- Enter sample values:
```json
{
  "vibration_x": 2.5,
  "vibration_y": 2.1,
  "temperature": 65.0,
  "current": 12.5
}
```
- View prediction and confidence score
Model Deployment
Deploying to Cloud
Deploy models to run on incoming telemetry:
- Go to model details
- Click Deploy → Cloud
- Configure:
  - Data sources: Which assets to monitor
  - Inference interval: How often to run (1s - 1h)
  - Threshold: Anomaly score threshold (0.0-1.0)
- Click Deploy
Deploying to Edge
Deploy models to edge devices for local inference:
- Go to model details
- Click Deploy → Edge
- Select target devices
- Click Deploy
The ONNX model is pushed via OTA to edge devices.
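On the device, local inference against the pushed ONNX file looks roughly like this (onnxruntime assumed; the file name is illustrative, and input/output names depend on how the model was exported):

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("pump_anomaly_v1.onnx")  # illustrative path

# Feature order must match the training dataset.
sample = np.array([[2.5, 2.1, 65.0, 12.5]], dtype=np.float32)
input_name = sess.get_inputs()[0].name

outputs = sess.run(None, {input_name: sample})
print(outputs)  # label and/or anomaly score, depending on the export
```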
Deployment Status
Monitor deployed models:
- Active: Running on X data sources
- Inference rate: Predictions per minute
- Avg latency: Prediction time
- Alert count: Anomalies detected
Working with Predictions
Viewing Predictions
See model predictions in dashboards:
- Add a Prediction Chart widget
- Select the deployed model
- View predictions over time
Prediction Alerts
Create alerts based on model output:
```
When anomaly_score > 0.8 for 60 seconds
Trigger Critical alert
```
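The "for 60 seconds" clause makes this a sustained-threshold check rather than a single-sample trigger; one way to express that logic (a sketch, not TwinEdge's alert engine):

```python
import time

THRESHOLD = 0.8
HOLD_SECONDS = 60
_breach_start = None

def should_alert(anomaly_score: float, now: float) -> bool:
    """True once the score has stayed above THRESHOLD for HOLD_SECONDS."""
    global _breach_start
    if anomaly_score > THRESHOLD:
        if _breach_start is None:
            _breach_start = now
        return now - _breach_start >= HOLD_SECONDS
    _breach_start = None  # score dropped; reset the timer
    return False

# Call on every new prediction, e.g. should_alert(score, time.time())
```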
Prediction Export
Export predictions for analysis:
- ML → Models → Select model
- Click Export Predictions
- Choose date range and format (CSV/JSON)
Model Lifecycle
Version Control
Each training creates a new model version:
- pump_anomaly_v1.0.0 - Initial model
- pump_anomaly_v1.1.0 - Retrained with more data
- pump_anomaly_v2.0.0 - New algorithm
Model Comparison
Compare multiple model versions:
- Select models to compare
- View side-by-side metrics
- A/B test in production
Retiring Models
When a model is no longer needed:
- Undeploy from all data sources
- Archive or delete the model
- Historical predictions are preserved
Best Practices
Data Quality
- Sufficient data: 1000+ rows minimum
- Representative: Include normal and anomalous examples
- Clean: Handle missing values and outliers
- Realistic ratio: Keep anomalies at roughly 1-10% of the data (consistent with the contamination setting)
Model Selection
| Scenario | Recommended Algorithm |
|---|---|
| Equipment fault detection | Isolation Forest |
| Complex pattern recognition | Autoencoder |
| Fault classification | XGBoost |
| Time-series forecasting | Prophet |
| RUL prediction | XGBoost + survival |
Retraining Schedule
- Monthly: For stable processes
- Weekly: For dynamic environments
- On-demand: After process changes
Monitoring Performance
- Track prediction accuracy over time
- Monitor for model drift
- Compare predicted vs actual outcomes
- Set up alerts for degraded performance
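A simple drift check compares recent feature distributions against the training window, for example with a two-sample Kolmogorov-Smirnov test (a sketch; scipy assumed, and the 0.05 cutoff is a common but arbitrary choice):

```python
from scipy.stats import ks_2samp

def has_drifted(train_values, recent_values, alpha=0.05) -> bool:
    """Flag drift when the two distributions differ significantly."""
    statistic, p_value = ks_2samp(train_values, recent_values)
    return p_value < alpha

# Example: has_drifted(train_df["vibration_x"], last_week_df["vibration_x"])
```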
API Reference
List Models
```http
GET /api/v1/ml/models
Authorization: Bearer YOUR_API_KEY
```
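The same call from Python with requests (the base URL is an illustrative assumption):

```python
import requests

BASE_URL = "https://twinedge.example.com"  # illustrative
headers = {"Authorization": "Bearer YOUR_API_KEY"}

resp = requests.get(f"{BASE_URL}/api/v1/ml/models", headers=headers)
resp.raise_for_status()
print(resp.json())
```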
Get Prediction
```http
POST /api/v1/ml/models/{id}/predict
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "features": {
    "vibration_x": 2.5,
    "vibration_y": 2.1,
    "temperature": 65.0
  }
}
```
Response:
```json
{
  "prediction": "normal",
  "confidence": 0.94,
  "anomaly_score": 0.12
}
```
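And the prediction call, parsing the response fields shown above (base URL and model ID are illustrative):

```python
import requests

BASE_URL = "https://twinedge.example.com"  # illustrative
headers = {"Authorization": "Bearer YOUR_API_KEY"}
payload = {"features": {"vibration_x": 2.5, "vibration_y": 2.1, "temperature": 65.0}}

resp = requests.post(f"{BASE_URL}/api/v1/ml/models/MODEL_ID/predict",
                     headers=headers, json=payload)
resp.raise_for_status()
result = resp.json()
print(result["prediction"], result["confidence"], result["anomaly_score"])
```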
Next Steps
- Query Builder - Create custom dataset queries
- Scheduled Reports - Automate ML reports
- Edge Devices - Deploy models to edge