Incident report: Oct 22, 2021 – 23:22 CET

Overview

A faulty storage unit caused a service outage on some of our hosts. Some hosts became unresponsive and were restarted.

Evaluation

Severity: High

Data risk: none

Timeline

Oct 22, 2021 23:22:00 CET: Monitoring reported several virtual machines as unresponsive.

Oct 22, 2021 23:23:00 CET: All reported machines were located on a single physical host. We began investigating a possible networking issue.

Oct 22, 2021 23:24:00 CET: A networking check confirmed that there was no networking problem. A field technician was sent to check the host.

Oct 22, 2021 23:36:00 CET: A faulty storage unit led to a storage failure. Some cloud servers were unresponsive and were rebooted.