Introduction to ZFS, Part 4
The ZFS special continues, this time with hardware failures:
To simulate a hardware failure, I pulled a disk out of the JBOD while the system was running:
Apr 10 10:38:02 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:02 newponit SYNCHRONIZE CACHE command failed (5)
Apr 10 10:38:04 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:04 newponit disk not responding to selection
Apr 10 10:38:06 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:06 newponit disk not responding to selection
Apr 10 10:38:06 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:06 newponit SYNCHRONIZE CACHE command failed (5)
Apr 10 10:38:08 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:08 newponit disk not responding to selection
Apr 10 10:38:08 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:08 newponit SYNCHRONIZE CACHE command failed (5)
Apr 10 10:38:10 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:10 newponit disk not responding to selection
Apr 10 10:38:13 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:13 newponit disk not responding to selection
Apr 10 10:38:15 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:15 newponit disk not responding to selection
Apr 10 10:38:17 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:17 newponit disk not responding to selection
Apr 10 10:38:18 newponit scsi: WARNING: /pci@1f,4000/scsi@5/sd@a,0 (sd52):
Apr 10 10:38:18 newponit disk not responding to selection

SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Tue Apr 10 10:38:38 CEST 2007
PLATFORM: SUNW,Ultra-60, CSN: -, HOSTNAME: newponit
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 2ed00edd-5d3f-e4bc-b09b-fc9c5dd449cf
DESC: A ZFS device failed. Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
AUTO-RESPONSE: No automated response will occur.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.
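If you want to pick up such a fault report in a monitoring script, the header fields can be pulled apart with a small parser. The following is only a sketch: the field layout ("KEY: value" pairs, separated by ", " within a line) is inferred from the single report shown above, not from any format specification, and the names `FMA_REPORT` and `parse_fma_report` are made up for this example.

```python
import re

# Sample copied from the fault report above (the line layout is an assumption).
FMA_REPORT = """\
SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Tue Apr 10 10:38:38 CEST 2007
PLATFORM: SUNW,Ultra-60, CSN: -, HOSTNAME: newponit
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 2ed00edd-5d3f-e4bc-b09b-fc9c5dd449cf
REC-ACTION: Run 'zpool status -x' and replace the bad device.
"""

# An upper-case key, then ": ", then a value that runs until the next
# ", KEY: " on the same line or until the end of the line.
FIELD = re.compile(r"([A-Z][A-Z-]*): (.*?)(?=, [A-Z][A-Z-]*: |$)", re.MULTILINE)


def parse_fma_report(text):
    """Return the report fields as a dict (hypothetical helper)."""
    return {key: value.strip() for key, value in FIELD.findall(text)}


if __name__ == "__main__":
    report = parse_fma_report(FMA_REPORT)
    print(report["SUNW-MSG-ID"], report["SEVERITY"], report["HOSTNAME"])
```

Splitting on ", KEY: " rather than on every comma keeps values such as `SUNW,Ultra-60` in one piece.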
On parallel SCSI systems it can take a few minutes until a timeout occurs and the disk is reported as failed:
root@newponit # zpool status
  pool: pool0
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool0        DEGRADED     0     0     0
          mirror     ONLINE       0     0     0
            c1t0d0   ONLINE       0     0     0
            c5t8d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c5t9d0   ONLINE       0     0     0
          mirror     DEGRADED     0     0     0
            c1t2d0   ONLINE       0     0     0
            c5t10d0  UNAVAIL    268 1.24K     0  cannot open
          mirror     ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c5t11d0  ONLINE       0     0     0

errors: No known data errors
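With four mirrors the broken disk is easy to spot by eye, but in larger pools it helps to filter the device table. Here is a minimal sketch that lists every row of the `zpool status` config table whose state is not ONLINE. It works on captured text only and does not call `zpool` itself; the column spacing in `STATUS` is my reconstruction of the output above, and `unhealthy_devices` is a hypothetical helper name.

```python
# Device table from the 'zpool status' output above (spacing reconstructed).
STATUS = """\
config:

        NAME         STATE     READ WRITE CKSUM
        pool0        DEGRADED     0     0     0
          mirror     ONLINE       0     0     0
            c1t0d0   ONLINE       0     0     0
            c5t8d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c5t9d0   ONLINE       0     0     0
          mirror     DEGRADED     0     0     0
            c1t2d0   ONLINE       0     0     0
            c5t10d0  UNAVAIL    268 1.24K     0  cannot open
          mirror     ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c5t11d0  ONLINE       0     0     0

errors: No known data errors
"""


def unhealthy_devices(status_text):
    """Return (name, state) for each config row that is not ONLINE."""
    rows = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_table:
            # The device table starts right after the NAME/STATE header.
            in_table = fields[:2] == ["NAME", "STATE"]
            continue
        if len(fields) < 2:
            break  # a blank line ends the table
        name, state = fields[0], fields[1]
        if state != "ONLINE":
            rows.append((name, state))
    return rows


if __name__ == "__main__":
    for name, state in unhealthy_devices(STATUS):
        print(name, state)
```

The pool and the affected mirror show up as DEGRADED alongside the UNAVAIL disk, which mirrors how the state propagates upwards in the output above.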
Once the new disk has been inserted into the JBOD, all that remains is to activate it:
root@newponit # zpool replace pool0 c5t10d0
Because ZFS knows exactly which data has to be resynced, bringing the disk back into the pool is very fast. In less than 5 minutes the disk was back in service:
root@newponit # zpool status
  pool: pool0
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 77.35% done, 0h0m to go
config:

        NAME               STATE     READ WRITE CKSUM
        pool0              DEGRADED     0     0     0
          mirror           ONLINE       0     0     0
            c1t0d0         ONLINE       0     0     0
            c5t8d0         ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t1d0         ONLINE       0     0     0
            c5t9d0         ONLINE       0     0     0
          mirror           DEGRADED     0     0     0
            c1t2d0         ONLINE       0     0     0
            replacing      DEGRADED     0     0     0
              c5t10d0s0/o  UNAVAIL    268 1.24K     0  cannot open
              c5t10d0      ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t3d0         ONLINE       0     0     0
            c5t11d0        ONLINE       0     0     0

errors: No known data errors

root@newponit # zpool status
  pool: pool0
 state: ONLINE
 scrub: resilver completed with 0 errors on Tue Apr 10 10:57:34 2007
config:

        NAME         STATE     READ WRITE CKSUM
        pool0        ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t0d0   ONLINE       0     0     0
            c5t8d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c5t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c5t10d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c5t11d0  ONLINE       0     0     0

errors: No known data errors
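The two `scrub:` lines and the EVENT-TIME of the fault report are enough to put numbers on the whole incident. The sketch below reads the resilver progress and the completion time from those lines and computes how long the pool was degraded. Assumptions: the line formats are exactly those of the 2007 Solaris output shown here (other ZFS versions print this differently), both timestamps are local time on the same host, and the helper names are made up.

```python
import re
from datetime import datetime

FAULT_TIME = "Tue Apr 10 10:38:38 CEST 2007"  # EVENT-TIME of the fault report
SCRUB_LINES = [
    "scrub: resilver in progress, 77.35% done, 0h0m to go",
    "scrub: resilver completed with 0 errors on Tue Apr 10 10:57:34 2007",
]

STAMP = "%a %b %d %H:%M:%S %Y"


def resilver_state(scrub_line):
    """Return ('in progress', percent), ('completed', datetime) or ('other', None)."""
    match = re.search(r"resilver in progress, ([\d.]+)% done", scrub_line)
    if match:
        return "in progress", float(match.group(1))
    match = re.search(r"resilver completed with \d+ errors on (.+)$", scrub_line)
    if match:
        return "completed", datetime.strptime(match.group(1).strip(), STAMP)
    return "other", None


def parse_event_time(text):
    """Parse an EVENT-TIME value, dropping the time zone token (here CEST)."""
    parts = text.split()
    return datetime.strptime(" ".join(parts[:4] + parts[5:]), STAMP)


if __name__ == "__main__":
    print(resilver_state(SCRUB_LINES[0]))
    _, finished = resilver_state(SCRUB_LINES[1])
    print("degraded for", finished - parse_event_time(FAULT_TIME))
```

For the timestamps above this gives just under 19 minutes between the fault diagnosis and the completed resilver. That span includes swapping the disk and running the replace command, so it is an upper bound and not the duration of the resilver itself.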
Comments
You named a machine newponit?
Yes, that is what the MX at the company is called.