Version: 1.0.0

Troubleshooting

This guide helps you diagnose and resolve common issues with TwinEdge Edge devices.

Quick Diagnostics

System Status Check

Run the diagnostic script:

twinedge diagnose

Or manually check:

# Check all services
docker compose ps

# View resource usage
docker stats

# Check disk space
df -h

# Check memory
free -h

Common Status Indicators

Status | Meaning | Action
🟢 Healthy | All systems operational | None needed
🟡 Warning | Non-critical issue | Investigate soon
🔴 Critical | System impaired | Immediate action

Connection Issues

Cannot Connect to Device

Symptoms:

  • Dashboard not loading
  • SSH connection refused
  • Ping timeouts

Solutions:

  1. Verify network connectivity

    # From another device on the network
    ping DEVICE_IP
  2. Check if device is powered on

    • Verify power LED
    • Check power supply voltage
  3. Verify IP address

    # On the device
    hostname -I
    ip addr show
  4. Check firewall rules

    sudo ufw status
    sudo iptables -L

Cloud Connection Failed

Symptoms:

  • "Disconnected" status in dashboard
  • Data not syncing to cloud
  • Remote access unavailable

Solutions:

  1. Verify internet connectivity

    ping -c 4 8.8.8.8
    curl -I https://api.twinedgeai.com
  2. Check DNS resolution

    nslookup api.twinedgeai.com
  3. Verify API credentials

    # Shows the configured key; redact it before sharing the output
    grep API_KEY /opt/twinedge/.env
  4. Check MQTT connection

    docker compose logs mqtt-client
  5. Review firewall/proxy settings

    • Port 443 (HTTPS) must be open outbound
    • Port 8883 (MQTTS) must be open outbound
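
The two outbound port requirements can also be checked from the device with a short script. This is a minimal sketch, not part of the TwinEdge tooling: `can_connect` and `check_outbound` are hypothetical helper names, and `MQTT_BROKER_HOST` is a placeholder because this guide does not name the broker host.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_outbound(targets):
    """Map each (host, port) pair to True/False reachability."""
    return {(host, port): can_connect(host, port) for host, port in targets}


# Example targets. MQTT_BROKER_HOST is a placeholder, not a real hostname.
TARGETS = [
    ("api.twinedgeai.com", 443),  # HTTPS
    ("MQTT_BROKER_HOST", 8883),   # MQTTS
]
```

Calling `check_outbound(TARGETS)` returns one entry per target. A `False` result only shows that the TCP connection did not open; it does not distinguish a firewall block from a DNS failure, so pair it with the `nslookup` check in step 2.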

OPC UA Connection Issues

Symptoms:

  • "Connection refused" errors
  • "BadSecurityChecksFailed" errors
  • Timeout errors

Solutions:

  1. Verify server endpoint

    # Test connectivity
    nc -zv SERVER_IP 4840
  2. Check security settings

    • Match security mode/policy with server
    • Verify certificates are installed
  3. Test with UaExpert or another client

    • Isolate whether issue is TwinEdge or server
  4. Review server logs

    docker compose logs opcua-server
  5. Certificate issues

    # List trusted certificates
    ls /opt/twinedge/certs/trusted/

    # Import server certificate
    cp server_cert.der /opt/twinedge/certs/trusted/
    docker compose restart opcua-server

Modbus Connection Issues

Symptoms:

  • No response from device
  • Incorrect values
  • Timeout errors

Solutions:

  1. Verify network connectivity

    ping -c 4 MODBUS_DEVICE_IP
    telnet MODBUS_DEVICE_IP 502
  2. Check unit ID

    • Verify slave address matches device configuration
    • Try unit ID 1 (common default)
  3. Test with mbpoll

    # Install mbpoll
    sudo apt install mbpoll

    # Read 10 holding registers from unit 1, starting at reference 1
    mbpoll -a 1 -r 1 -c 10 MODBUS_DEVICE_IP
  4. Verify register addresses

    • Check device documentation
    • Note: Some devices use 0-based, others 1-based addressing
  5. Check byte order

    • Test different byte order settings
    • Common: big_endian, little_endian
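
The 0-based versus 1-based distinction in step 4 is a frequent source of off-by-one reads. As a rough sketch, converting a documented register number into the zero-based offset sent on the wire could look like this. The function names are hypothetical, and the 4xxxx handling assumes the common Modicon-style notation in which 40001 is the first holding register.

```python
def wire_offset(documented: int, one_based: bool) -> int:
    """Convert a documented register number to a zero-based wire offset."""
    if one_based:
        if documented < 1:
            raise ValueError("one-based register numbers start at 1")
        return documented - 1
    if documented < 0:
        raise ValueError("zero-based offsets cannot be negative")
    return documented


def from_modicon_reference(reference: int) -> int:
    """Convert a 4xxxx holding-register reference to a wire offset.

    Assumes 5-digit Modicon notation: 40001 is offset 0.
    """
    if not 40001 <= reference <= 49999:
        raise ValueError("expected a reference between 40001 and 49999")
    return reference - 40001
```

For example, `from_modicon_reference(40010)` gives offset 9. If a value read through TwinEdge matches the neighbouring register in the device manual, the two sides probably disagree on this convention.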

Data Issues

No Data Appearing

Symptoms:

  • Dashboard shows no data
  • Graphs are empty
  • "No data" messages

Solutions:

  1. Verify data source connection

    # Check if data is being collected
    docker compose logs opcua-server | tail -50
  2. Check storage service

    docker compose logs storage-service

    # Query recent data
    docker compose exec storage-service sqlite3 /data/telemetry.db \
    "SELECT * FROM sensor_data ORDER BY timestamp DESC LIMIT 10"
  3. Verify tag subscription

    • Ensure tags are correctly configured
    • Check tag paths in data source settings
  4. Check Node-RED flows

    # Open Node-RED editor
    http://DEVICE_IP:1880

    # Check flow status
    # Look for error indicators
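
When the query in step 2 returns rows, the age of the newest row shows whether collection has stalled or was never working. The sketch below assumes the `timestamp` column of `sensor_data` holds Unix epoch seconds, which this guide does not confirm; `newest_sample_age_seconds` is a hypothetical helper.

```python
import sqlite3
import time


def newest_sample_age_seconds(conn: sqlite3.Connection, now=None):
    """Return seconds since the newest sensor_data row, or None if empty.

    Assumption: timestamp is stored as Unix epoch seconds.
    """
    row = conn.execute("SELECT MAX(timestamp) FROM sensor_data").fetchone()
    if row[0] is None:
        return None  # table exists but holds no rows
    current = time.time() if now is None else now
    return current - float(row[0])


# Example (run where the database file is readable, ideally on a copy):
#   conn = sqlite3.connect("/data/telemetry.db")
#   print(newest_sample_age_seconds(conn))
```

A result of `None` points to a collection or subscription problem, while a large age suggests that data stopped arriving at some point.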

Incorrect Data Values

Symptoms:

  • Values don't match device display
  • Random or nonsensical numbers
  • Off by factor of 10/100/1000

Solutions:

  1. Check data type configuration

    • Verify int16 vs uint16 vs float32
    • Match to device documentation
  2. Verify byte order

    # Common configurations (byte order of a 32-bit value ABCD)
    byte_order: big_endian         # AB CD
    byte_order: little_endian      # DC BA
    byte_order: mid_big_endian     # BA DC (bytes swapped within each word)
    byte_order: mid_little_endian  # CD AB (words swapped)
  3. Check scale factor

    scale: 0.1   # Divide raw value by 10
    scale: 10 # Multiply raw value by 10
    offset: -40 # Subtract 40 from value
  4. Verify register address

    • Off-by-one errors are common
    • Check device documentation carefully
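
To narrow down which of these settings is wrong, it can help to decode a raw register pair offline and compare the result with the device display. The following is a sketch under stated assumptions, not TwinEdge code: the layout labels describe the order in which the device sends the four bytes of a value whose most significant byte is A, all function names are hypothetical, and scaling is assumed to apply `scale` before `offset`.

```python
import struct

# Each entry lists which received byte (0-3) supplies A, B, C, D,
# where A is the most significant byte of the 32-bit value.
BYTE_LAYOUTS = {
    "ABCD": (0, 1, 2, 3),
    "DCBA": (3, 2, 1, 0),
    "BADC": (1, 0, 3, 2),
    "CDAB": (2, 3, 0, 1),
}


def as_int16(raw: int) -> int:
    """Reinterpret an unsigned 16-bit register as a signed value."""
    return raw - 0x10000 if raw >= 0x8000 else raw


def decode_float32_all(reg0: int, reg1: int) -> dict:
    """Decode two consecutive registers as float32 under each layout."""
    received = struct.pack(">HH", reg0, reg1)  # bytes in arrival order
    return {
        name: struct.unpack(">f", bytes(received[i] for i in order))[0]
        for name, order in BYTE_LAYOUTS.items()
    }


def apply_scaling(raw: float, scale: float = 1.0, offset: float = 0.0) -> float:
    """Assumed order: multiply by scale first, then add offset."""
    return raw * scale + offset
```

If the device display reads 12.5 and only the `CDAB` entry of `decode_float32_all(reg0, reg1)` gives 12.5, the device is sending the two words in swapped order. Similarly, a raw 65535 that should read -1 indicates a signed register configured as uint16.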

Data Delays

Symptoms:

  • Dashboard shows old data
  • Significant lag between device and display
  • Updates come in batches

Solutions:

  1. Check polling interval

    polling_interval_ms: 1000  # Reduce for faster updates
  2. Check network latency

    ping -c 10 OPC_UA_SERVER
  3. Review queue settings

    docker compose logs -f node-red | grep queue
  4. Check storage service performance

    docker stats storage-service

Service Issues

Service Won't Start

Symptoms:

  • Container keeps restarting
  • "Exit code 1" in docker ps
  • Service unavailable

Solutions:

  1. Check container logs

    docker compose logs SERVICE_NAME
    docker compose logs --tail 100 SERVICE_NAME
  2. Check for port conflicts

    sudo netstat -tulpn | grep PORT_NUMBER
  3. Verify configuration files

    docker compose config  # Check for YAML errors
  4. Check resource limits

    docker stats
    free -h
  5. Restart the service

    docker compose restart SERVICE_NAME
  6. Recreate the container

    docker compose up -d --force-recreate SERVICE_NAME

High CPU Usage

Symptoms:

  • Device running hot
  • Slow response times
  • Fan running constantly

Solutions:

  1. Identify the culprit

    docker stats
    top
    htop
  2. Check for runaway processes

    docker compose logs SERVICE_NAME | grep -i error
  3. Reduce polling frequency

    polling_interval_ms: 5000  # Increase from 1000
  4. Disable unused services

    docker compose stop ml-inference  # If not using ML

High Memory Usage

Symptoms:

  • Out of memory errors
  • Services being killed
  • System slowdown

Solutions:

  1. Check memory usage

    docker stats --no-stream
    free -h
  2. Set memory limits

    # docker-compose.yml
    services:
      ml-inference:
        deploy:
          resources:
            limits:
              memory: 512M
  3. Clear old data

    # Clear old telemetry (keeps last 7 days)
    docker compose exec storage-service python cleanup.py --days 7
  4. Restart services to release memory

    docker compose restart

Disk Space Issues

Symptoms:

  • "No space left on device" errors
  • Services failing to write
  • Database errors

Solutions:

  1. Check disk usage

    df -h
    du -sh /opt/twinedge/*
  2. Clear Docker resources

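    # Removes stopped containers, unused networks, and all unused images
    docker system prune -a
    # Removes unused volumes; confirm none of them holds data you need
    docker volume prune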
  3. Clear old logs

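    # Save the last 1000 lines before truncating
    docker compose logs --no-log-prefix SERVICE_NAME | tail -1000 > temp.log
    # CONTAINER_ID is the full ID from: docker inspect --format '{{.Id}}' CONTAINER_NAME
    sudo truncate -s 0 /var/lib/docker/containers/CONTAINER_ID/CONTAINER_ID-json.log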
  4. Reduce data retention

    # Reduce from 30 to 7 days
    retention_days: 7

Alert Issues

Alerts Not Triggering

Symptoms:

  • Conditions met but no alert
  • Alert history empty
  • No notifications

Solutions:

  1. Verify alert configuration

    • Check condition thresholds
    • Verify data source is correct
  2. Check alert service

    docker compose logs alert-service
  3. Verify data is flowing

    • Check if tag has recent data
    • Verify tag name matches exactly
  4. Check notification channels

    • Verify email/Slack configuration
    • Test notifications manually

Too Many Alerts

Symptoms:

  • Alert fatigue
  • Constant notifications
  • Flapping alerts

Solutions:

  1. Adjust thresholds

    • Increase threshold values
    • Add hysteresis
  2. Add duration requirement

    duration_seconds: 60  # Must persist for 1 minute
  3. Use deadband

    deadband: 5  # Ignore changes < 5%
  4. Implement alert grouping

    • Group related alerts
    • Rate limit notifications
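
Hysteresis and a duration requirement work together to suppress flapping. As an illustration only, the logic can be sketched as a small state machine; `ThresholdAlert` and its field names are hypothetical and do not describe how the TwinEdge alert service is implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThresholdAlert:
    trigger_above: float            # value must exceed this to start the timer
    clear_below: float              # active alert clears only below this
    duration_seconds: float = 60.0  # condition must persist this long
    active: bool = field(default=False, init=False)
    _above_since: Optional[float] = field(default=None, init=False)

    def update(self, value: float, now: float) -> bool:
        """Feed one sample; return True while the alert is active."""
        if self.active:
            # Hysteresis: stay active until the value drops below clear_below.
            if value < self.clear_below:
                self.active = False
                self._above_since = None
        elif value > self.trigger_above:
            if self._above_since is None:
                self._above_since = now
            if now - self._above_since >= self.duration_seconds:
                self.active = True
        else:
            # A dip below the trigger resets the duration timer.
            self._above_since = None
        return self.active
```

With `trigger_above=80` and `clear_below=75`, a value that oscillates between 78 and 82 raises at most one alert instead of one per crossing, and a spike shorter than `duration_seconds` raises none.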

ML Inference Issues

Model Not Loading

Symptoms:

  • "Model not found" errors
  • Inference returns errors
  • No predictions

Solutions:

  1. Check model file exists

    ls -la /opt/twinedge/models/
  2. Verify ONNX format

    python -c "import onnx; onnx.checker.check_model('/opt/twinedge/models/model.onnx')"
  3. Check ML service logs

    docker compose logs ml-inference
  4. Verify input shape

    • Check model expects correct number of features
    • Verify feature names match
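
The input check in step 4 can be done before a sample ever reaches the model. The sketch below is a generic illustration with hypothetical names (`check_features`, and the feature names in the usage note); it assumes you know the ordered list of feature names the model was trained on.

```python
def check_features(expected: list, sample: dict) -> list:
    """Return a list of human-readable problems; empty means the sample fits."""
    problems = []
    missing = [name for name in expected if name not in sample]
    extra = [name for name in sample if name not in expected]
    if missing:
        problems.append(f"missing features: {missing}")
    if extra:
        problems.append(f"unexpected features: {extra}")
    if len(sample) != len(expected):
        problems.append(
            f"expected {len(expected)} features, got {len(sample)}"
        )
    return problems
```

For instance, `check_features(["temperature", "vibration"], {"temperature": 71.2})` reports the missing `vibration` feature and the count mismatch, which is usually easier to act on than a shape error from the inference service.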

Slow Inference

Symptoms:

  • Predictions delayed
  • High latency
  • Timeouts

Solutions:

  1. Check model size

    ls -lh /opt/twinedge/models/
  2. Use quantized model

    • Use INT8 instead of FP32
    • Reduces size and speeds inference
  3. Reduce batch size

    batch_size: 1  # Reduce from 10
  4. Check CPU load

    docker stats ml-inference
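
Before and after changing the model or batch size, it helps to measure latency the same way each time. The following is a minimal timing sketch; `measure_latency_ms` is a hypothetical helper, and `predict` stands for whatever callable runs a single inference in your setup.

```python
import statistics
import time


def measure_latency_ms(predict, sample, runs: int = 50, warmup: int = 5) -> dict:
    """Time repeated calls to predict(sample) and summarize in milliseconds."""
    for _ in range(warmup):
        predict(sample)  # warm caches before timing
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return {
        "median_ms": statistics.median(timings),
        "p95_ms": timings[int(0.95 * (len(timings) - 1))],
        "max_ms": timings[-1],
    }
```

Comparing `median_ms` with `p95_ms` separates a model that is uniformly slow from one that is usually fast but stalls under CPU contention.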

Getting Help

Collecting Diagnostics

Before contacting support, collect:

# Run full diagnostic
twinedge diagnose --full > diagnostics.txt

# Or manually
docker compose ps > docker_status.txt
docker compose logs > docker_logs.txt
sudo dmesg > system_log.txt
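
If `twinedge diagnose` is unavailable, the manual commands can be wrapped in a small script so that one failing command does not stop the rest. This is a sketch, not an official tool: `collect` and `COMMANDS` are hypothetical names, and the command list simply mirrors the manual steps above.

```python
import subprocess
from pathlib import Path

# Output file name -> command to run (mirrors the manual steps).
COMMANDS = {
    "docker_status.txt": ["docker", "compose", "ps"],
    "docker_logs.txt": ["docker", "compose", "logs"],
    "system_log.txt": ["dmesg"],
}


def collect(commands: dict, out_dir: str, timeout: float = 60.0) -> dict:
    """Run each command, write its output to out_dir, return exit codes.

    A value of None means the command could not be run or timed out.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for filename, argv in commands.items():
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            text = proc.stdout + proc.stderr
            results[filename] = proc.returncode
        except (OSError, subprocess.TimeoutExpired) as exc:
            text = f"failed to run {argv}: {exc}\n"
            results[filename] = None
        (out / filename).write_text(text)
    return results
```

Running `collect(COMMANDS, "diagnostics")` from the directory that contains the compose file writes one file per command; review the files for credentials before sending them to support.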

Support Channels

Information to Include

  1. Device type and specs
  2. TwinEdge version
  3. Error messages (exact text)
  4. Steps to reproduce
  5. Diagnostic output
  6. Recent changes made
