Version: 1.0.0

Troubleshooting

This guide helps you diagnose and resolve common issues with TwinEdge Edge devices.

Quick Diagnostics

System Status Check

Run the diagnostic script:

twinedge diagnose

Or manually check:

# Check all services
docker compose ps

# View resource usage
docker stats

# Check disk space
df -h

# Check memory
free -h

Common Status Indicators

Status	Meaning	Action
🟢 Healthy	All systems operational	None needed
🟡 Warning	Non-critical issue	Investigate soon
🔴 Critical	System impaired	Immediate action

Connection Issues

Cannot Connect to Device

Symptoms:

Dashboard not loading
SSH connection refused
Ping timeouts

Solutions:

Verify network connectivity

# From another device on the network
ping DEVICE_IP

Check if device is powered on
- Verify power LED
- Check power supply voltage
Verify IP address
```
# On the device
hostname -I
ip addr show
```
Check firewall rules
```
sudo ufw status
sudo iptables -L
```

Cloud Connection Failed

Symptoms:

"Disconnected" status in dashboard
Data not syncing to cloud
Remote access unavailable

Solutions:

Verify internet connectivity

ping 8.8.8.8
curl -I https://api.twinedgeai.com

Check DNS resolution
```
nslookup api.twinedgeai.com
```
Verify API credentials
```
grep API_KEY /opt/twinedge/.env
```
Check MQTT connection
```
docker compose logs mqtt-client
```
Review firewall/proxy settings
- Port 443 (HTTPS) must be open outbound
- Port 8883 (MQTTS) must be open outbound
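
The outbound port requirements above can be checked from the device with a short script. This is a minimal sketch using only the Python standard library; `port_open` is a hypothetical helper name, and the MQTT broker hostname is not given in this guide, so substitute your own.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts
        return False
```

For example, `port_open("api.twinedgeai.com", 443)` tests the HTTPS path, and the same call with port 8883 against your MQTT broker host tests MQTTS. A `False` result only shows that the TCP connection failed, not why, so follow up with the DNS and firewall checks above.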

OPC UA Connection Issues

Symptoms:

"Connection refused" errors
"BadSecurityChecksFailed" errors
Timeout errors

Solutions:

Verify server endpoint

# Test connectivity
nc -zv SERVER_IP 4840

Check security settings
- Match security mode/policy with server
- Verify certificates are installed
Test with UA Expert or another client
- Isolate whether issue is TwinEdge or server
Review server logs
```
docker compose logs opcua-server
```

Certificate issues

# List trusted certificates
ls /opt/twinedge/certs/trusted/

# Import server certificate
cp server_cert.der /opt/twinedge/certs/trusted/
docker compose restart opcua-server

Modbus Connection Issues

Symptoms:

No response from device
Incorrect values
Timeout errors

Solutions:

Verify network connectivity

ping MODBUS_DEVICE_IP
telnet MODBUS_DEVICE_IP 502

Check unit ID
- Verify slave address matches device configuration
- Try unit ID 1 (common default)

Test with mbpoll

# Install mbpoll
sudo apt install mbpoll

# Read 10 holding registers starting at register 1 from unit ID 1
mbpoll -a 1 -r 1 -c 10 MODBUS_DEVICE_IP

Verify register addresses
- Check device documentation
- Note: Some devices use 0-based, others 1-based addressing
Check byte order
- Test different byte order settings
- Common: big_endian, little_endian
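
The 0-based versus 1-based addressing note is a frequent source of off-by-one reads. Many device manuals list registers as Modicon-style references such as 40001, while the Modbus protocol itself carries a 0-based offset. The sketch below converts one to the other; `split_reference` and the table names are hypothetical, and your device documentation remains the authority on which convention it uses.

```python
from typing import Tuple

# Modicon-style reference prefixes and the register table each one selects
TABLES = {
    "0": "coil",
    "1": "discrete_input",
    "3": "input_register",
    "4": "holding_register",
}


def split_reference(reference: str) -> Tuple[str, int]:
    """Convert a 1-based reference such as '40001' into (table, 0-based offset)."""
    if len(reference) not in (5, 6) or not reference.isdigit():
        raise ValueError(f"expected a 5- or 6-digit reference, got {reference!r}")
    table = TABLES.get(reference[0])
    if table is None:
        raise ValueError(f"unknown table prefix in {reference!r}")
    number = int(reference[1:])
    if number < 1:
        raise ValueError("reference numbers start at 1")
    # The protocol-level offset is one less than the documented number
    return table, number - 1
```

With this convention, `split_reference("40001")` gives `("holding_register", 0)`. If a read returns the value of the neighbouring register, try the address one higher or one lower.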

Data Issues

No Data Appearing

Symptoms:

Dashboard shows no data
Graphs are empty
"No data" messages

Solutions:

Verify data source connection

# Check if data is being collected
docker compose logs opcua-server | tail -50

Check storage service

docker compose logs storage-service

# Query recent data
docker compose exec storage-service sqlite3 /data/telemetry.db \
  "SELECT * FROM sensor_data ORDER BY timestamp DESC LIMIT 10"

Verify tag subscription
- Ensure tags are correctly configured
- Check tag paths in data source settings

Check Node-RED flows

Open the Node-RED editor at http://DEVICE_IP:1880 and check the status of each flow, looking for error indicators on the nodes.

Incorrect Data Values

Symptoms:

Values don't match device display
Random or nonsensical numbers
Off by a factor of 10, 100, or 1000

Solutions:

Check data type configuration
- Verify int16 vs uint16 vs float32
- Match to device documentation
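
A wrong signedness setting is easy to spot once you know what to look for: small negative values show up as numbers just below 65536. The following sketch, which uses a hypothetical helper name, reinterprets the same 16-bit register both ways.

```python
import struct


def reinterpret_uint16_as_int16(raw: int) -> int:
    """Reinterpret an unsigned 16-bit register value as a signed int16."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("a Modbus register holds 16 bits (0..65535)")
    # Pack as unsigned, then unpack the same two bytes as signed
    return struct.unpack(">h", struct.pack(">H", raw))[0]


raw = 0xFFF6
print(raw)                               # read as uint16: 65526
print(reinterpret_uint16_as_int16(raw))  # read as int16: -10
```

If the dashboard shows 65526 where the device display shows -10, the tag is probably configured as uint16 and should be int16.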

Verify byte order

# Common configurations
byte_order: big_endian        # AB CD
byte_order: little_endian     # DC BA
byte_order: mid_big_endian    # BA DC
byte_order: mid_little_endian # CD AB
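
To find the right setting without trial and error in the dashboard, you can decode one known reading under all four layouts and see which result is plausible. This is a sketch using the letter notation from the comments above, where A is the most significant byte of the 32-bit value; `decode_float32` is a hypothetical helper, not part of TwinEdge.

```python
import struct


def decode_float32(registers, layout: str) -> float:
    """Decode two 16-bit registers into a float32.

    layout names the order in which the bytes of the big-endian value
    A B C D arrive on the wire: "ABCD", "DCBA", "BADC", or "CDAB".
    """
    if sorted(layout) != list("ABCD"):
        raise ValueError(f"layout must be a permutation of ABCD, got {layout!r}")
    wire = struct.pack(">HH", *registers)  # bytes exactly as received
    # Pick each of A, B, C, D from its position on the wire
    ordered = bytes(wire[layout.index(ch)] for ch in "ABCD")
    return struct.unpack(">f", ordered)[0]


# The same two registers under every layout; usually only one looks sensible
registers = (0x0000, 0x41C8)
for layout in ("ABCD", "DCBA", "BADC", "CDAB"):
    print(layout, decode_float32(registers, layout))
```

For these example registers the `CDAB` layout yields 25.0, while the other three yield values that are clearly not a real measurement.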

Check scale factor

scale: 0.1   # Divide raw value by 10
scale: 10    # Multiply raw value by 10
offset: -40  # Subtract 40 from value
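
The arithmetic behind these two settings can be sketched as follows. This assumes the scale is applied before the offset, which this guide does not state explicitly, and both function names are hypothetical.

```python
def to_engineering_units(raw: float, scale: float = 1.0, offset: float = 0.0) -> float:
    """Convert a raw register value, assuming scale is applied before offset."""
    return raw * scale + offset


def implied_scale(raw: float, displayed: float, offset: float = 0.0) -> float:
    """Work backwards: which scale maps raw onto the value shown on the device?"""
    if raw == 0:
        raise ValueError("cannot infer a scale from a raw value of 0")
    return (displayed - offset) / raw
```

If the dashboard shows a raw 652 while the device display reads 65.2, `implied_scale(652, 65.2)` comes out at roughly 0.1, which points to a missing `scale: 0.1`.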

Verify register address
- Off-by-one errors are common
- Check device documentation carefully

Data Delays

Symptoms:

Dashboard shows old data
Significant lag between device and display
Updates come in batches

Solutions:

Check polling interval

polling_interval_ms: 1000  # Reduce for faster updates

Check network latency
```
ping -c 10 OPC_UA_SERVER
```

Review queue settings

docker compose logs -f node-red | grep queue

Check storage service performance
```
docker stats storage-service
```

Service Issues

Service Won't Start

Symptoms:

Container keeps restarting
"Exit code 1" in docker ps
Service unavailable

Solutions:

Check container logs

docker compose logs SERVICE_NAME
docker compose logs --tail 100 SERVICE_NAME

Check for port conflicts
```
sudo netstat -tulpn | grep PORT_NUMBER
```

Verify configuration files

docker compose config  # Check for YAML errors

Check resource limits
```
docker stats
free -h
```
Restart the service
```
docker compose restart SERVICE_NAME
```

Recreate the container

docker compose up -d --force-recreate SERVICE_NAME

High CPU Usage

Symptoms:

Device running hot
Slow response times
Fan running constantly

Solutions:

Identify culprit
```
docker stats
top
htop
```

Check for runaway processes

docker compose logs SERVICE_NAME | grep -i error

Reduce polling frequency

polling_interval_ms: 5000  # Increase from 1000

Disable unused services

docker compose stop ml-inference  # If not using ML

High Memory Usage

Symptoms:

Out of memory errors
Services being killed
System slowdown

Solutions:

Check memory usage
```
docker stats --no-stream
free -h
```

Set memory limits

# docker-compose.yml
services:
  ml-inference:
    deploy:
      resources:
        limits:
          memory: 512M

Clear old data

# Clear old telemetry (keeps last 7 days)
docker compose exec storage-service python cleanup.py --days 7

Restart services to release memory
```
docker compose restart
```

Disk Space Issues

Symptoms:

"No space left on device" errors
Services failing to write
Database errors

Solutions:

Check disk usage
```
df -h
du -sh /opt/twinedge/*
```

Clear Docker resources

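# Removes all unused images, stopped containers, and unused networks
docker system prune -a
# Removes volumes not used by any container; back up data you need first
docker volume prune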

Clear old logs

# Save the last 1000 log lines, then empty the container's log file
docker compose logs --no-log-prefix SERVICE_NAME | tail -1000 > temp.log
sudo truncate -s 0 "$(docker inspect --format '{{.LogPath}}' CONTAINER_ID)"

Reduce data retention

# Reduce from 30 to 7 days
retention_days: 7

Alert Issues

Alerts Not Triggering

Symptoms:

Conditions met but no alert
Alert history empty
No notifications

Solutions:

Verify alert configuration
- Check condition thresholds
- Verify data source is correct
Check alert service
```
docker compose logs alert-service
```
Verify data is flowing
- Check if tag has recent data
- Verify tag name matches exactly
Check notification channels
- Verify email/Slack configuration
- Test notifications manually

Too Many Alerts

Symptoms:

Alert fatigue
Constant notifications
Flapping alerts

Solutions:

Adjust thresholds
- Increase threshold values
- Add hysteresis

Add duration requirement

duration_seconds: 60  # Must persist for 1 minute

Use deadband
```
deadband: 5  # Ignore changes < 5%
```
Implement alert grouping
- Group related alerts
- Rate limit notifications
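
Hysteresis and a duration requirement work together to stop flapping: the alert fires only after the value has stayed above one threshold for a set time, and clears only once it drops below a second, lower threshold. The class below is an illustrative sketch of that logic, not TwinEdge's alert engine; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HysteresisAlert:
    trigger_above: float           # value must exceed this to start the timer
    clear_below: float             # an active alert clears only below this
    duration_seconds: float = 0.0  # how long the excess must persist
    active: bool = False
    _exceeded_since: Optional[float] = field(default=None, repr=False)

    def update(self, value: float, now: float) -> bool:
        """Feed one sample (value, timestamp in seconds); return alert state."""
        if self.active:
            # Stay active inside the band between the two thresholds
            if value < self.clear_below:
                self.active = False
                self._exceeded_since = None
        elif value > self.trigger_above:
            if self._exceeded_since is None:
                self._exceeded_since = now
            if now - self._exceeded_since >= self.duration_seconds:
                self.active = True
        else:
            # Dropped back before the duration elapsed: restart the timer
            self._exceeded_since = None
        return self.active
```

With `HysteresisAlert(trigger_above=80, clear_below=75, duration_seconds=60)`, a value hovering between 79 and 81 never fires, and an active alert does not clear until the value falls below 75.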

ML Inference Issues

Model Not Loading

Symptoms:

"Model not found" errors
Inference returns errors
No predictions

Solutions:

Check model file exists
```
ls -la /opt/twinedge/models/
```

Verify ONNX format

python -c "import onnx; onnx.checker.check_model('/opt/twinedge/models/model.onnx')"

Check ML service logs
```
docker compose logs ml-inference
```
Verify input shape
- Check that the model expects the number of features you are sending
- Verify feature names match

Slow Inference

Symptoms:

Predictions delayed
High latency
Timeouts

Solutions:

Check model size
```
ls -lh /opt/twinedge/models/
```
Use quantized model
- Use INT8 instead of FP32
- Reduces size and speeds inference
Reduce batch size
```
batch_size: 1  # Reduce from 10
```
Check CPU load
```
docker stats ml-inference
```

Getting Help

Collecting Diagnostics

Before contacting support, collect:

# Run full diagnostic
twinedge diagnose --full > diagnostics.txt

# Or manually
docker compose ps > docker_status.txt
docker compose logs > docker_logs.txt
sudo dmesg > system_log.txt

Support Channels

Documentation: docs.twinedgeai.com
Community: discord.gg/twinedge
Email: support@twinedgeai.com
Enterprise: Dedicated support portal

Information to Include

Device type and specs
TwinEdge version
Error messages (exact text)
Steps to reproduce
Diagnostic output
Recent changes made

Next Steps

Installation - Reinstall if needed
Protocols - Protocol-specific troubleshooting
OTA Updates - Update to fix known issues
