
2025-04-25 Web Socket connections for ATL, GRR and LAS servers failing to connect

Written by Mira Beltre

Updated at May 8th, 2025



Affected Services 

  • SNAPmobile Web

Event Timeline

April 25, 2025

  • 6:19 PM ET – OIT's on-call team was alerted to failed web socket connections for the ATL, GRR, and LAS core servers.
  • 6:30 PM ET – OIT support verified the connection failure and began investigating.
  • 6:40 PM ET – NetSapiens support was engaged after identifying an SSL failure on the web socket service.
  • 6:55 PM ET – OIT support contacted NetSapiens support by phone to escalate the ticket.
  • 7:56 PM ET – NetSapiens support forced a reload of the web socket SSL into system memory in an attempt to clear the error.
  • 8:06 PM ET – After the web socket memory refresh completed, SNAPmobile Web was retested on all servers. GRR and LAS were confirmed resolved, but ATL was still failing to connect.
  • 8:45 PM ET – OIT support continued working with NetSapiens support to determine why the resolution for GRR and LAS did not work on the ATL server.
  • 9:07 PM ET – After further investigation with NetSapiens support, we identified a misconfiguration in the default SSL configuration file. After correcting it, OIT engineers restarted the web socket service and Apache to load the changes into memory.
  • 9:10 PM ET – OIT support confirmed that SNAPmobile Web on the ATL server was connecting successfully.

Root Cause

The default SSL configuration file on all core servers had inadvertently been updated to reference an SSL file whose name did not match the core server's FQDN, which caused a common name mismatch during the certificate check. This was due to a previously unidentified bug in a new certificate management system that NetSapiens had provided for use.

Due to a limitation in the web socket service's core functionality, the service could not renegotiate the SSL certificate on its own. Recovery required both correcting the common name mismatch and forcing a reload of the SSL certificate into memory.
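
As a rough illustration of the common name mismatch described above, the following Python sketch checks whether the names on a certificate cover a given FQDN. It is not the check performed by the web socket service or by the NetSapiens certificate management system. The function names and the example.com hostnames are hypothetical, and the certificate is assumed to be a dictionary shaped like the output of Python's ssl.SSLSocket.getpeercert().

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name (optionally a left-most wildcard) covers hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # A wildcard covers exactly one left-most label.
        _, _, rest = hostname.partition(".")
        return bool(rest) and rest == pattern[2:]
    return pattern == hostname


def cert_names(cert: dict) -> list[str]:
    """Collect DNS names from a dict shaped like ssl.SSLSocket.getpeercert() output."""
    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    if not names:
        # Fall back to the subject common name only when no SAN DNS entries exist.
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName":
                    names.append(value)
    return names


def cert_covers(cert: dict, fqdn: str) -> bool:
    """Return True if any name on the certificate matches the server's FQDN."""
    return any(hostname_matches(name, fqdn) for name in cert_names(cert))


# Hypothetical certificate issued for a different name than the server's FQDN.
example_cert = {
    "subject": ((("commonName", "core.example.com"),),),
    "subjectAltName": (("DNS", "core.example.com"),),
}
print(cert_covers(example_cert, "core.example.com"))  # names agree
print(cert_covers(example_cert, "atl.example.com"))   # common name mismatch
```

A check along these lines, run against each core server after a certificate change, could flag a mismatch before clients encounter it.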

Impact Summary

  • Web phones were unable to fully connect to the ATL, GRR, and LAS servers due to an SSL certificate common name mismatch error.
  • Automatic failover to alternate servers did not activate because the web phones were partially registered. Manual failover was not an option at the time, as all core servers were impacted.
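
The failover behavior summarized above can be sketched in Python. This is a hypothetical model, not the NetSapiens failover mechanism: the Registration states, the function names, and the treat_partial_as_failure flag are assumptions introduced only to show why a partial registration can keep a client from moving to another server.

```python
from enum import Enum


class Registration(Enum):
    """Hypothetical registration states for a web phone on one server."""
    FULL = "full"        # assumed: every connection step succeeded
    PARTIAL = "partial"  # assumed: some, but not all, connection steps succeeded
    NONE = "none"        # assumed: no registration at all


def should_fail_over(state: Registration, treat_partial_as_failure: bool) -> bool:
    """Decide whether a client should leave a server in the given state."""
    if state is Registration.NONE:
        return True
    if state is Registration.PARTIAL:
        return treat_partial_as_failure
    return False


def pick_server(states: dict[str, Registration], order: list[str],
                treat_partial_as_failure: bool = True):
    """Return the first server in preference order that is usable, or None."""
    for name in order:
        if not should_fail_over(states[name], treat_partial_as_failure):
            return name
    return None


# All three servers partially registered, mirroring this incident.
incident = {name: Registration.PARTIAL for name in ("ATL", "GRR", "LAS")}
order = ["ATL", "GRR", "LAS"]
print(pick_server(incident, order, treat_partial_as_failure=False))  # stays on ATL
print(pick_server(incident, order, treat_partial_as_failure=True))   # no usable server
```

In this model, counting a partial registration as a failure lets the client move on when a healthy server exists. When every server is affected, as in this incident, no failover target remains either way.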

Future Preventative Action

Long-term action

  • With the migration to the new IAD and PHX servers and the decommissioning of the GRR and LAS servers, SSL certificates will be handled directly by OIT, removing this potential cause of failure in the future.
  • An improvement request was submitted to NetSapiens to account for partial registrations and to include these scenarios in the automatic failover mechanism.
  • OIT also submitted a bug report to NetSapiens to address the recently found bug in the certificate management system.

 


