Performance Degradation across multiple systems

Minor incident · Web Application · EU data center · Access Control System · Partner Integrations & Developer Portal (Open API) · Member Platform
2025-12-10 14:36 CET · 5 hours, 54 minutes

Updates

Post-mortem

The incident, which occurred between 2025-12-10 13:36 CET/CEST and 2025-12-10 19:31 CET/CEST, has been resolved.

Our engineering team will review the issue and implement additional measures to prevent similar incidents in the future.

If you continue to experience any problems, please open a ticket with our support team.

We apologize for any inconvenience caused.

Incident Summary
On 10 December 2025, starting at approximately 14:00 CET, a significant database performance issue impacted a small number of our dedicated customer database clusters. The root cause was identified as a recently deployed update to a database connection component, which inadvertently triggered an excessive number of status queries directed at these specific clusters. The incident was resolved at 15:35 CET, and the situation has remained stable under monitoring since.
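
For illustration only, the sketch below shows how a change in connection pool or driver settings can translate into a steady stream of status queries against a database cluster. This report does not name the component or its configuration; the example assumes a HikariCP-style pool in Java, and the property values (test query, keepalive interval, pool size) are hypothetical.

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class ChattyPoolExample {
    public static void main(String[] args) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://db-cluster.example.internal:5432/app"); // hypothetical cluster
        config.setUsername("app");
        config.setPassword(System.getenv("DB_PASSWORD"));

        // An explicit test query replaces the driver's lightweight Connection.isValid()
        // check, so every validation becomes a full round-trip status query.
        config.setConnectionTestQuery("SELECT 1");
        // With a keepalive, every idle connection is re-validated on this interval;
        // multiplied by the pool size this produces constant background traffic.
        config.setKeepaliveTime(30_000);   // 30 seconds, hypothetical
        config.setMaximumPoolSize(50);     // 50 connections issuing periodic status queries

        try (HikariDataSource dataSource = new HikariDataSource(config)) {
            // Application queries would run here; the validation traffic above is
            // generated in the background regardless of application load.
            System.out.println("Pool started with " + config.getMaximumPoolSize() + " connections");
        }
    }
}
```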

Impact on Users
Services upstream of the affected databases saw higher latency for a significant share of their requests; as a result, their latency and error rates increased, and overall performance for all other customers was slightly degraded between 13:45 and 15:35 CET. These services include:

Perfect Gym Web Application

Access Control System

Partner Integrations & Developer Portal (Open API)

Member Platform

Our Response
14:09 CET: An alert was triggered due to a “High request error rate” for several services.

14:11 CET: The on-call team started the investigation; initial findings were logs pointing to difficulties establishing database connections.

14:15 CET: It was concluded that there was a general issue with database performance for the affected clusters.

14:16 CET: All available resources, consisting of experienced operations engineers, were mobilized to investigate the issue.

14:40 CET: A recent database connection component update was identified as the possible root cause, leading to excessive database traffic.

14:45 CET: Work on a possible fix was started.

14:50 CET: Failovers were performed for the affected database clusters to temporarily free up depleted connections.

15:05 CET: A rollback of the faulty update was merged and deployment started.

15:15 CET: A service processing heavy scheduled operations was temporarily shut down to further decrease pressure on the databases.

15:35 CET: The deployment finished, the fix was applied, and the database situation eased.

Resolution
The incident was resolved by rolling back the recently deployed database connection component update, which stopped the excessive database status queries and stabilized the system. While the rollback was being deployed, failovers were performed on the affected database clusters and heavy internal scheduled operations were stopped to mitigate the immediate impact of depleted connections and high database pressure. The latter did not have an impact on user experience.
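
The report does not describe how the heavy scheduled operations were paused; as a purely illustrative sketch in plain Java, the snippet below shows one common pattern, a runtime flag that lets operators skip a scheduled batch job to shed database load without redeploying the service. The flag, interval, and job are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class PausableHeavyJob {
    // Flipped by operators (e.g. via an admin endpoint or configuration reload)
    // to shed database load during an incident without a redeploy.
    static final AtomicBoolean PAUSED = new AtomicBoolean(false);

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            if (PAUSED.get()) {
                System.out.println("Heavy scheduled job skipped: database pressure mitigation active");
                return; // no database work is performed while paused
            }
            runHeavyDatabaseJob();
        }, 0, 15, TimeUnit.MINUTES); // hypothetical 15-minute schedule
    }

    static void runHeavyDatabaseJob() {
        // Placeholder for the batch workload that would otherwise hit the databases.
        System.out.println("Running heavy scheduled database job");
    }
}
```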

Lessons Learned
The incident underscores the need to define and implement safeguards against depleted database connections and to investigate the specific technical explanation for the observed behavior under these unique database cluster configurations. While extensive testing was performed before the release, the incident highlights a gap in test coverage for rare, client-specific cluster configurations. This gap will be addressed as part of the remediation steps, and further action will be taken to limit the impact of similar errors in the future.
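
As an example of the kind of safeguard under consideration (illustrative only; the concrete remediation items are still being defined), the sketch below shows pool settings that make connection depletion fail fast and surface early rather than letting requests queue up. It again assumes a HikariCP-style pool; all values are hypothetical.

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class DefensivePoolSettings {
    public static HikariDataSource buildPool(String jdbcUrl, String user, String password) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(jdbcUrl);
        config.setUsername(user);
        config.setPassword(password);

        config.setMaximumPoolSize(20);            // hard cap so one service cannot exhaust the cluster
        config.setConnectionTimeout(3_000);       // fail fast (3 s) instead of queueing callers indefinitely
        config.setMaxLifetime(1_800_000);         // recycle connections every 30 minutes
        config.setLeakDetectionThreshold(60_000); // log connections held longer than 60 s

        return new HikariDataSource(config);
    }
}
```

Settings like these would typically be paired with alerting on pool saturation so that connection depletion is detected before it cascades into upstream latency.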

December 15, 2025 · 12:20 CET
Resolved

The implemented countermeasures fixed the issue; the situation has remained stable, and the incident is now resolved.

December 10, 2025 · 20:31 CET
Monitoring

The fix was successfully applied and we are monitoring the situation.

December 10, 2025 · 15:35 CET
Update

The deployment with the fix is still in progress.

December 10, 2025 · 15:28 CET
Update

We have identified the issue as being related to high database pressure caused by a recent jdbc-driver update.

The team is working on it and a deployment with a fix is on its way.

December 10, 2025 · 15:07 CET
Investigating

The team is investigating a major outage across multiple systems.

December 10, 2025 · 14:38 CET
Investigating

We are currently experiencing performance degradation and intermittent availability across several systems. Some features may be slow or unavailable. Our team is investigating the root cause.

December 10, 2025 · 14:36 CET
