
Debugging OCPP WebSocket Disconnections

The 6 most common reasons a Charge Point drops from the CSMS — Heartbeat timeout, TLS errors, proxy interference, cellular IP changes — and exactly how to diagnose each one with logs and the Simulator.

Published · Reviewed against official OCA specification
By Rodolfo Carrillo
Quick answer

The six most common causes of OCPP disconnections are: missed Heartbeat, WebSocket ping/pong timeout, expired or mismatched TLS certificate, reverse proxy blocking the WebSocket upgrade, CSMS rate limiting, and cellular IP address change. Each has a distinct log signature — this guide shows you how to identify and fix each one.

How disconnections happen

An OCPP connection is a WebSocket connection — a persistent, full-duplex TCP channel established with an HTTP upgrade handshake. Once open, the connection stays alive until one side closes it or the underlying TCP session times out.

There are two distinct failure modes. A graceful close sends a WebSocket close frame (opcode 0x8, seen on the wire as first byte 0x88) before dropping, so both sides know the connection ended intentionally. An ungraceful close happens when the TCP connection drops without warning: the remote side is unresponsive, a firewall silently kills the connection, or the physical link fails. Most production disconnections are ungraceful.

The CSMS detects an ungraceful disconnect via keepalive timeout — either the WebSocket-layer ping/pong mechanism or the OCPP Heartbeat — whichever fires first. Understanding which layer is triggering the drop is the first step in debugging.

1. Heartbeat timeout

The OCPP Heartbeat is an application-level keepalive. A Charge Point must send a Heartbeat request if no other message has been sent within HeartbeatInterval seconds. If the CSMS receives nothing within a configurable dead-time window — commonly two to three times the Heartbeat interval — it marks the Charge Point offline and may close the connection.

Common trigger: long transaction where MeterValues replace Heartbeat

During an active charging session the firmware may suppress Heartbeat because it assumes MeterValues messages count as keepalive. Some CSMS implementations disagree — they only reset the dead-time counter on explicit Heartbeat or BootNotification messages. Send a Heartbeat even during active sessions if your CSMS has a strict timeout.
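
The dead-time rule and the strict/lenient difference described above can be sketched in a few lines of Python. This is an illustration only, not how any particular CSMS implements it: the class name LivenessTracker, the strict flag, and the default multiplier of 3 are assumptions made for the example.

```python
import time

# Actions a strict CSMS treats as proof of life (an assumption for this sketch).
STRICT_KEEPALIVE_ACTIONS = {"Heartbeat", "BootNotification"}


class LivenessTracker:
    """Tracks the last keepalive per Charge Point and flags dead-time expiry."""

    def __init__(self, heartbeat_interval_s, multiplier=3, strict=False,
                 clock=time.monotonic):
        self.dead_time_s = heartbeat_interval_s * multiplier
        self.strict = strict
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}

    def on_message(self, cp_id, action):
        # Strict mode: only Heartbeat/BootNotification reset the timer.
        # Lenient mode: any inbound message resets it.
        if not self.strict or action in STRICT_KEEPALIVE_ACTIONS:
            self.last_seen[cp_id] = self.clock()

    def is_offline(self, cp_id):
        seen = self.last_seen.get(cp_id)
        return seen is None or self.clock() - seen > self.dead_time_s
```

With strict=True, a charger that sends only MeterValues during a long session is reported offline once the dead-time window passes, which is the failure pattern this section describes.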

How to diagnose

  • In the CSMS log: look for heartbeat_timeout, connection_idle_timeout, or offline events on the Charge Point record.
  • In the Charge Point log: confirm the firmware is sending Heartbeat at the configured interval. A firmware bug may stop sending after a specific event (e.g. RFID tap, connector lock).
  • Cross-check the HeartbeatInterval config key on the device vs. the dead-time value configured in the CSMS.

2. WebSocket ping/pong timeout

The WebSocket protocol defines its own keepalive: the server sends a ping frame (opcode 0x9, first byte 0x89) and expects a pong frame (opcode 0xA, first byte 0x8A) back within a timeout. If the pong never arrives, the server closes the connection, with code 1001 (Going Away) in some implementations, or simply drops the TCP socket.

This is independent of the OCPP Heartbeat. A network path that passes OCPP messages correctly can still time out WebSocket pings if the ping interval or timeout is misconfigured — particularly on high-latency cellular links where round-trip times can be 300–800 ms.
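
When reading a packet capture, the first byte of each WebSocket frame tells you which kind of frame it is. The sketch below decodes that byte using the opcode values from RFC 6455; the function name describe_first_byte is made up for this example.

```python
# Opcode values defined by RFC 6455, section 5.2.
OPCODES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


def describe_first_byte(byte):
    """Return (fin, frame_type) for the first byte of a WebSocket frame."""
    fin = bool(byte & 0x80)      # top bit: final fragment of the message
    opcode = byte & 0x0F         # low nibble: frame type
    return fin, OPCODES.get(opcode, "reserved")
```

For example, describe_first_byte(0x89) identifies a ping and describe_first_byte(0x8A) a pong. A capture that shows pings leaving the server with no pongs coming back points at the WebSocket layer rather than at the OCPP Heartbeat.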

How to diagnose

  • WebSocket close code 1001 or 1006 (abnormal closure) in the Charge Point log.
  • CSMS log shows ping_timeout or ws_close_1001.
  • Check the CSMS WebSocket server's ping interval and timeout settings. For cellular chargers, a ping interval under 30 seconds with a timeout under 10 seconds is too aggressive.
  • Use the Simulator on a cellular hotspot to reproduce the latency and confirm the threshold.
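
Close codes are easier to interpret with a small lookup. The sketch below lists the codes registered in RFC 6455 and marks the ones that are never sent in a close frame and are instead reported locally by the WebSocket library. The helper name describe_close_code is hypothetical.

```python
# Status codes from RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "Normal Closure",
    1001: "Going Away",
    1002: "Protocol Error",
    1003: "Unsupported Data",
    1005: "No Status Received",
    1006: "Abnormal Closure",
    1007: "Invalid Frame Payload Data",
    1008: "Policy Violation",
    1009: "Message Too Big",
    1010: "Mandatory Extension",
    1011: "Internal Error",
    1015: "TLS Handshake",
}

# Reserved codes: must not appear in a close frame on the wire.
NEVER_ON_WIRE = {1005, 1006, 1015}


def describe_close_code(code):
    name = CLOSE_CODES.get(code, "unregistered or application-defined")
    if code in NEVER_ON_WIRE:
        origin = "reported locally, no close frame carried this code"
    else:
        origin = "carried in a close frame"
    return f"{code} {name}: {origin}"
```

A 1006 in the Charge Point log therefore means the socket went away without a close frame, which fits an ungraceful drop such as a ping timeout or a NAT expiry, while a 1001 means the peer closed deliberately.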

3. TLS / certificate errors

If the CSMS URL uses wss://, the WebSocket is wrapped in TLS. Certificate issues are a leading cause of connections that fail at the handshake stage — the TCP session establishes but the TLS handshake fails, so no OCPP message is ever exchanged.

Expired server certificate

The CSMS certificate passed its Not After date. Charge Points with strict TLS validation will refuse to connect. Fix: renew the certificate and redeploy.

Hostname mismatch

The certificate's Common Name or Subject Alternative Names don't match the hostname in the WebSocket URL. Happens when moving from a dev to a production domain. Fix: issue a new certificate that includes the exact hostname used by Charge Points.

Incomplete certificate chain

The server doesn't send intermediate CA certificates. Desktop browsers fetch them automatically; Charge Point firmware usually doesn't. Fix: configure the CSMS to serve the full chain (leaf + intermediates).

Clock skew on the Charge Point

If the Charge Point's real-time clock is wrong, the certificate's Not Before date may appear to be in the future, causing an immediate TLS failure. Fix: ensure the device syncs NTP before attempting the WebSocket connection, or configure a tolerance window for validity checks if the TLS stack supports one.

How to diagnose

  • Charge Point log: TLS error codes like SSL_ERROR_RX_RECORD_TOO_LONG, CERTIFICATE_VERIFY_FAILED, HANDSHAKE_FAILURE.
  • Run openssl s_client -connect <csms-host>:443 -showcerts from a machine with the same root CA bundle as your Charge Points.
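  • Use curl -v https://<csms-host>/ocpp/CP001 to see the full TLS negotiation and any certificate errors in the output. The https:// scheme exercises the same TLS handshake as wss://, which curl builds without WebSocket support reject as an unsupported protocol.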
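
The expiry and clock-skew cases can also be checked programmatically. The sketch below uses Python's standard ssl module; the function names classify_validity and fetch_peer_cert are made up for this example, and the host you pass in is your own CSMS hostname.

```python
import socket
import ssl
import time


def classify_validity(not_before, not_after, now=None):
    """Classify a certificate's validity window against a clock reading.

    not_before / not_after use the string format returned by
    SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    Pass the Charge Point's clock as `now` to reproduce clock skew.
    """
    now = time.time() if now is None else now
    if now < ssl.cert_time_to_seconds(not_before):
        return "not_yet_valid"   # typical symptom of a wrong device clock
    if now > ssl.cert_time_to_seconds(not_after):
        return "expired"
    return "ok"


def fetch_peer_cert(host, port=443, timeout=10):
    """Fetch the server certificate as a dict (performs a real TLS handshake)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

A typical use is cert = fetch_peer_cert("<csms-host>") followed by classify_validity(cert["notBefore"], cert["notAfter"], now=device_clock). Note that fetch_peer_cert validates against the local CA bundle, so an expired certificate, a hostname mismatch, or an incomplete chain raises ssl.SSLCertVerificationError before any dict is returned. The exception message names the reason.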

4. Proxy blocking the WebSocket upgrade

Many deployments place a reverse proxy (Nginx, HAProxy, AWS ALB) or a corporate firewall in front of the CSMS. If the proxy is not configured to pass WebSocket connections, it may:

  • Return HTTP 200 instead of 101 Switching Protocols
  • Strip the Connection: Upgrade or Upgrade: websocket headers
  • Close the connection after an idle timeout (e.g. 60 seconds) that hits during a quiet session

Correct WebSocket upgrade response (HTTP 101)
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: ocpp1.6
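
The Sec-WebSocket-Accept header is derived from the Sec-WebSocket-Key the client sent, so a test client can verify that a real WebSocket server answered and not a proxy. The value in the response above is the one RFC 6455 uses as its worked example for the key dGhlIHNhbXBsZSBub25jZQ==. The sketch below recomputes it and checks the other required headers; the function names are hypothetical, and the ocpp1.6 default is taken from the response shown above.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed constant from RFC 6455


def expected_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a given request key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_problems(status, headers, sent_key, requested_subprotocol="ocpp1.6"):
    """Return a list of problems; an empty list means the upgrade looks valid."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    problems = []
    if status != 101:
        problems.append(f"status {status}, expected 101")
    if h.get("upgrade", "").lower() != "websocket":
        problems.append("Upgrade header missing or not 'websocket'")
    if "upgrade" not in h.get("connection", "").lower():
        problems.append("Connection header does not contain 'Upgrade'")
    if h.get("sec-websocket-accept") != expected_accept(sent_key):
        problems.append("Sec-WebSocket-Accept does not match the key sent")
    if h.get("sec-websocket-protocol") != requested_subprotocol:
        problems.append("requested subprotocol was not echoed back")
    return problems
```

A proxy that answers with HTTP 200 and strips the upgrade headers produces several entries in the returned list at once, which makes the cause easy to spot.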

Nginx configuration fix

nginx.conf — required WebSocket headers
location /ocpp/ {
    proxy_pass http://csms-backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 86400s;  # keep idle connections alive
}

How to diagnose

  • Check the HTTP response code during the WebSocket upgrade. Anything other than 101 is a proxy or server-side rejection.
  • If disconnections happen at exactly regular intervals (e.g. every 60 s), suspect a proxy idle timeout — not OCPP.
  • Test a direct connection to the CSMS backend port (bypassing the proxy) to isolate the issue.
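
Spotting "exactly regular intervals" by eye is unreliable across a long log. The sketch below flags a series of disconnect timestamps as periodic when the spread of the gaps between them is small relative to their mean. The function name and the 5 % tolerance are assumptions; tune the tolerance to your own logs.

```python
from statistics import mean, pstdev


def looks_periodic(disconnect_times_s, tolerance=0.05, min_events=4):
    """Return (is_periodic, mean_interval_s) for disconnect timestamps in seconds."""
    if len(disconnect_times_s) < min_events:
        return False, None          # too few samples to judge
    times = sorted(disconnect_times_s)
    intervals = [b - a for a, b in zip(times, times[1:])]
    avg = mean(intervals)
    if avg <= 0:
        return False, avg
    # Coefficient of variation: standard deviation relative to the mean gap.
    return pstdev(intervals) / avg <= tolerance, avg
```

If the result is periodic and the mean interval sits close to a round number such as 60 s, compare it with the proxy's idle timeout before looking at OCPP settings. The gaps include the time the charger takes to reconnect, so expect the mean to be slightly above the timeout itself.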

5. CSMS rate limiting or connection cap

Under high load or after a large-scale power outage (all chargers reconnecting simultaneously), the CSMS may start rejecting new WebSocket connections with HTTP 429 Too Many Requests or 503 Service Unavailable. Charge Points that don't implement backoff will hammer the CSMS repeatedly, delaying recovery — the thundering herd problem.

Signs in the logs

  • Rapid connect → disconnect cycles with no OCPP messages exchanged
  • HTTP 429 or 503 in the WebSocket upgrade response
  • CSMS metrics show CPU or connection count spikes correlated with large-scale events (power restoration, CSMS restart)

Mitigation

  • Charge Point firmware: implement exponential backoff with jitter. Start at 1–5 s and cap at 5–10 minutes, then randomize each delay by ±10–30 % so that retries from different chargers do not synchronize.
  • CSMS side: implement a connection queue with a configurable rate limit per IP range. Prioritize reconnections from Charge Points with active sessions.
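
The firmware-side mitigation can be sketched as a single function. The starting delay of 2 s, the 300 s cap, and the 20 % jitter are example values picked from the ranges above, and backoff_delay is a hypothetical name, not part of any OCPP library.

```python
import random


def backoff_delay(attempt, base_s=2.0, cap_s=300.0, jitter_frac=0.2, rng=random):
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    # Clamp the exponent so a very long outage cannot overflow the float math.
    raw = min(cap_s, base_s * (2 ** min(attempt, 32)))
    # Jitter is applied after the cap so chargers that have all reached the
    # cap still spread out; the result can exceed cap_s by up to jitter_frac.
    jitter = rng.uniform(-jitter_frac, jitter_frac)
    return max(0.0, raw * (1 + jitter))
```

A reconnect loop would call backoff_delay(attempt) after each failed upgrade, sleep for the returned time, and reset attempt to 0 once the CSMS accepts the connection.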

6. Cellular network IP address change

LTE/5G chargers are among the most common sources of unexplained disconnections. Cellular networks are not designed for persistent long-lived TCP connections:

  • The carrier's NAT state expires after a period of inactivity (30–120 seconds on some carriers). Packets sent to the old mapping are then silently dropped.
  • During a cell-tower handoff the device may receive a new IP address, terminating all existing TCP sessions.
  • Some carriers use carrier-grade NAT (CGNAT) which adds a second NAT layer and further restricts inbound keepalive behaviour.

Rule of thumb for cellular deployments

Set the Heartbeat interval to 60 seconds or less on cellular Charge Points. This keeps the carrier NAT mapping alive and gives the CSMS a fast signal when a charger goes offline. See the Heartbeat Interval guide for a full breakdown by connectivity type.

How to diagnose

  • Charge Point log: TCP connection reset around the same time as IP_CHANGE or CELL_HANDOFF events in the cellular modem log.
  • CSMS: connection drops from cellular Charge Points cluster around network-intensive times of day (rush hour, large events nearby).
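  • Use the modem's AT command interface, if accessible, to monitor IP reassignment events: AT+CGPADDR reads the currently assigned address on most modems, and AT+CGDCONT? (or an equivalent) lists the PDP context definitions.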
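
Matching disconnects against modem events is tedious by hand once the logs grow. The sketch below pairs each disconnect timestamp with the nearest modem event inside a time window. The function name correlate and the 10 s window are assumptions, and both inputs are plain lists of timestamps in seconds that you would extract from your own logs.

```python
from bisect import bisect_left


def correlate(disconnects_s, modem_events_s, window_s=10.0):
    """Pair each disconnect with the nearest modem event within window_s.

    Returns a list of (disconnect_time, event_time_or_None).
    """
    events = sorted(modem_events_s)
    pairs = []
    for t in disconnects_s:
        i = bisect_left(events, t)
        # Only the neighbours on either side of t can be the nearest event.
        candidates = events[max(0, i - 1):i + 1]
        nearest = min(candidates, key=lambda e: abs(e - t), default=None)
        if nearest is not None and abs(nearest - t) > window_s:
            nearest = None
        pairs.append((t, nearest))
    return pairs
```

If most disconnects pair with an event, the cellular link is the likely cause. Disconnects that pair with None deserve a look at the other five causes in this guide.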

Reconnect logic

Per OCPP 1.6 §3.3, when a Charge Point loses its WebSocket connection it should attempt to reconnect. Key rules:

  1. Do not send BootNotification on every reconnect. Only send BootNotification if the Charge Point has actually rebooted since the last successful connection. Reconnecting after a transient outage should resume normal operations silently.
  2. Use exponential backoff. The spec does not mandate a specific algorithm but warns against overwhelming the CSMS. A doubling delay from 1 s to a maximum of 5 minutes is a common production pattern.
  3. Add random jitter. If your fleet is large, synchronized reconnects (e.g. all Charge Points reconnecting exactly 60 s after a CSMS restart) can crash the recovering CSMS. Add ±10–30 % random variation to the backoff delay.
  4. Resume queued messages. Store any messages generated while offline (StatusNotification, MeterValues, transaction events) and resend them after the connection is re-established and accepted.
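
Rules 1 and 4 can be sketched together. This is a simplified illustration under stated assumptions: OfflineQueue, needs_boot_notification, the boot counter, and the send callback are all hypothetical names, and real firmware would persist the queue to flash and treat transaction messages with more care.

```python
from collections import deque


class OfflineQueue:
    """Bounded FIFO of OCPP messages generated while disconnected."""

    def __init__(self, max_len=1000):
        # When full, deque silently drops the oldest entry. Real firmware may
        # prefer to protect transaction-related messages from eviction.
        self._items = deque(maxlen=max_len)

    def push(self, action, payload):
        self._items.append((action, payload))

    def flush(self, send):
        """Send queued messages oldest first; stop at the first failure.

        `send(action, payload)` must return True on success.
        Returns the number of messages delivered.
        """
        sent = 0
        while self._items:
            action, payload = self._items[0]
            if not send(action, payload):
                break               # keep the message for the next attempt
            self._items.popleft()
            sent += 1
        return sent


def needs_boot_notification(boot_count_now, boot_count_at_last_accept):
    """True if the device rebooted since the CSMS last accepted a BootNotification."""
    return boot_count_now != boot_count_at_last_accept
```

After a reconnect, the firmware would first check needs_boot_notification, send BootNotification only if it returns True, and then call flush once the connection is accepted. A message is removed from the queue only after send reports success, so a second drop during the flush does not lose data.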

Log checklist

Use this quick-reference table to map a log signature to the most likely root cause:

Log signature                  | Most likely cause                | First check
WS close 1001 / ping timeout   | WebSocket ping/pong timeout      | CSMS ping interval setting
heartbeat_timeout / offline    | Heartbeat not sent by CP         | HeartbeatInterval config key
CERTIFICATE_VERIFY_FAILED      | Expired / mismatched certificate | openssl s_client output
HTTP 200 (expected 101)        | Proxy not passing upgrade        | Nginx/HAProxy proxy headers
HTTP 429 / 503                 | CSMS rate limiting               | CSMS connection queue metrics
Drop every ~60 s on cellular   | Carrier NAT state expiry         | Heartbeat interval vs NAT timeout
Drop after IP_CHANGE event     | Cellular tower handoff           | Reconnect backoff in firmware
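
The same mapping can be applied to a log file automatically. The sketch below covers the rows that have a literal log token and matches them case-insensitively. The token strings are the illustrative ones used in this guide; your CSMS and firmware will likely log different strings, so treat SIGNATURES as a template to edit.

```python
# (tokens to look for, most likely cause, first check)
# Token strings are illustrative; adapt them to your own log format.
SIGNATURES = [
    (("ping_timeout", "ws_close_1001"),
     "WebSocket ping/pong timeout", "CSMS ping interval setting"),
    (("heartbeat_timeout",),
     "Heartbeat not sent by CP", "HeartbeatInterval config key"),
    (("certificate_verify_failed",),
     "Expired / mismatched certificate", "openssl s_client output"),
    (("http 429", "http 503"),
     "CSMS rate limiting", "CSMS connection queue metrics"),
    (("ip_change", "cell_handoff"),
     "Cellular tower handoff", "Reconnect backoff in firmware"),
]


def diagnose(log_line):
    """Return (cause, first_check) for the first matching signature, else None."""
    line = log_line.lower()
    for tokens, cause, first_check in SIGNATURES:
        if any(token in line for token in tokens):
            return cause, first_check
    return None
```

Running diagnose over every line of a log and counting the causes gives a quick picture of which failure mode dominates for a given charger.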

Rodolfo Carrillo

OCPP integration engineer and creator of OCPP Tools. All articles are verified against the official Open Charge Alliance specification and tested using the on-site tools.

Frequently asked questions

What is the OCPP-recommended way to detect that a Charge Point has disconnected?
The CSMS should track the time since the last message received from each Charge Point (Heartbeat, status update, or any response). If no message arrives within a configurable timeout, typically 2–3× the Heartbeat interval, the CSMS should mark the Charge Point as offline.

Does OCPP define a maximum time a Charge Point can go without sending a message?
OCPP 1.6 defines HeartbeatInterval (configuration key) as the maximum time between Heartbeats. The spec says a Charge Point SHALL send a Heartbeat if no other message has been sent within that interval. There is no spec-mandated reconnect timeout; CSMS vendors set their own.

What HTTP status code should a Charge Point get when the CSMS rejects the WebSocket upgrade?
A successful WebSocket upgrade returns HTTP 101 Switching Protocols. Common error codes: 401 (authentication failed), 403 (Charge Point not provisioned), 404 (unknown endpoint path), 503 (CSMS overloaded). A 200 response instead of 101 usually indicates a proxy intercepting the connection.

What OCPP message should a Charge Point send after reconnecting?
Per OCPP 1.6 §3.3, after re-establishing a WebSocket connection the Charge Point should only send a BootNotification if it has restarted since the previous connection. If it reconnects after a transient network outage without rebooting, it should resume normal operations without a new BootNotification, although some CSMS vendors require one regardless.

Can a Charge Point change IP address mid-session and keep the WebSocket connection alive?
No. WebSocket connections are TCP connections. An IP address change terminates the TCP session, which closes the WebSocket. The Charge Point must reconnect and re-establish the WebSocket from the new IP. For cellular chargers this is a common scenario during carrier handoff.

What is exponential backoff and why should Charge Points use it?
Exponential backoff means each consecutive reconnect attempt waits twice as long as the previous one (e.g., 1s, 2s, 4s, 8s…) up to a maximum cap. This prevents hundreds of chargers from hammering a recovering CSMS simultaneously after an outage, a "thundering herd" problem that can keep the CSMS from recovering.

Sources & further reading

  • Open Charge Alliance. (2015). Open Charge Point Protocol 1.6, Edition 2, §3: WebSocket. https://openchargealliance.org/my-oca/ocpp/
  • Fette, I., & Melnikov, A. (2011). The WebSocket Protocol. RFC 6455. https://www.rfc-editor.org/rfc/rfc6455
  • Rescorla, E. (2018). The Transport Layer Security (TLS) Protocol Version 1.3. RFC 8446. https://www.rfc-editor.org/rfc/rfc8446
Last technical review: May 12, 2025
