This is going to be a bit of a head scratcher.
I have two HP Gen 10 Plus servers, Jaguar and Panther, both fitted with SSDs. Each has a general OS SSD in bay 0, plus 8TB Samsung QVO drives in bays 1, 2 and 3 forming a RAID 5 style ZFS pool (raidz1). (There is still a spinning-rust drive in bay 2 of server 2, but I don't think that has any bearing on what's happening.) What happened on Jaguar also happened on Panther, and the results were the same.
Static hostname: jaguar
Icon name: computer-desktop
Chassis: desktop 🖥️
Machine ID: 103129ff23a64aa1b7987000ae53604b
Boot ID: a248f3747b0a432d80465dd383c104b0
Operating System: Debian GNU/Linux 12 (bookworm)
Kernel: Linux 6.1.0-21-amd64
Architecture: x86-64
Hardware Vendor: HPE
Hardware Model: ProLiant MicroServer Gen10 Plus
Firmware Version: U48
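For completeness, each data pool is a plain three-disk raidz1, i.e. something created along these lines (the device names below are placeholders for illustration, not necessarily the exact ones used at creation time):
Code:
# placeholder device names, for illustration only
zpool create jaguar raidz1 sda sdc sdd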
I rebooted the servers and both of them came up with the first drive of the pool marked as failed.
This seemed to be more than a coincidence to me.
The system saw the drives.
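By "the system saw the drives" I mean basic checks along these lines; I'm not pasting the exact session, and the device name is just whatever the kernel had assigned at the time:
Code:
# quick visibility / health checks on the affected drive
lsblk -o NAME,SIZE,MODEL,SERIAL
smartctl -H /dev/sda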
A 'zpool online' of the missing device failed with:
Code:
        NAME                      STATE     READ WRITE CKSUM
        jaguar                    DEGRADED     0     0     0
          raidz1-0                DEGRADED     0     0     0
            261934978400867995    UNAVAIL      0     0     0  was /dev/sda1
            sdc                   ONLINE       0     0     0
            sdd                   ONLINE       0     0     0

errors: No known data errors
root@jaguar:/home/mich# zpool online jaguar /dev/sda1
cannot online /dev/sda1: cannot relabel '/dev/sda1': unable to read disk capacity
I tried various options, but then I ended up doing an export and import and that cleared it:
Code:
root@jaguar:~# zpool export jaguar
root@jaguar:~# zpool import jaguar
root@jaguar:~# zpool status
  pool: jaguar
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 04:23:03 with 0 errors on Sun Aug 11 04:47:04 2024
config:

        NAME                                           STATE     READ WRITE CKSUM
        jaguar                                         ONLINE       0     0     0
          raidz1-0                                     ONLINE       0     0     0
            ata-Samsung_SSD_870_QVO_8TB_S5SSNF0W506592B  ONLINE     0     0     1
            sdc                                        ONLINE       0     0     0
            sdd                                        ONLINE       0     0     0

errors: No known data errors
root@jaguar:~# zpool clear jaguar
My support with HP ran out about six months ago, so I have no support from them and can't get any further firmware; I think my last firmware patches to the servers were early this year. So I'm not sure what I'm dealing with.
The fact that it happened to both servers in exactly the same way makes me believe that this is not a drive failure per se; the drives were bought months apart.
I am dealing with something hardware, firmware or OS related, but I can't figure out which, particularly as it happened to both systems at the same time.
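In case it helps narrow that down, the sort of thing I can still dig through is the kernel log from the boot where the drive went UNAVAIL, plus the drive's own SMART logs; roughly (the by-id path is taken from the zpool status output above):
Code:
# kernel messages from the previous boot, filtered for the affected disk
journalctl -k -b -1 | grep -iE 'ata|sda'
# full SMART output, including error and self-test logs, for the drive that dropped
smartctl -x /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0W506592B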
For safety, next time I reboot the servers I'll be exporting the pools beforehand and then importing them again afterwards... but there is the obvious question of why this happened in the first place, and I'm scratching my head.
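Roughly what I have in mind for the next reboot is below; the -d /dev/disk/by-id part is just an idea for keeping the vdevs on stable names rather than sdX, not something I've confirmed makes a difference here:
Code:
# before the reboot
zpool export jaguar
systemctl reboot
# after the reboot, import using the stable by-id names
zpool import -d /dev/disk/by-id jaguar
zpool status jaguar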
Grateful for any thoughts please.
Posted by msknight — 2024-08-11 14:48