@hiroyuki-sato
Last active August 29, 2015 14:11
megacli memo

Reading SMART data behind MegaCLI

First, get the device IDs with MegaCLI.

# /opt/MegaRAID/MegaCli/MegaCli64 -LDPDInfo -a0 | grep 'Device Id' 
Device Id: 8
Device Id: 9
Device Id: 11
Device Id: 14
Device Id: 15
Device Id: 13
Device Id: 16
Device Id: 17
Device Id: 18

Then run smartctl.

# smartctl -a -d megaraid,18 /dev/sdb

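Here 18 is one of the Device Id values printed by the MegaCLI command above. The same form works for any other ID, for example: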
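The two steps above (list the device IDs, then pass each one to smartctl) can be chained. The following is only a sketch: the function names are made up for this memo, and /dev/sda in the usage line is assumed to be a block device behind the controller. The second function only prints the commands, so they can be reviewed before running them.

```shell
#!/bin/sh
# Sketch: turn MegaCli "Device Id: N" lines into smartctl commands.

# Read MegaCli output on stdin and print one numeric device ID per line.
extract_device_ids() {
    sed -n 's/^[[:space:]]*Device Id:[[:space:]]*\([0-9][0-9]*\).*$/\1/p'
}

# Read device IDs on stdin and print (not run) one smartctl command per ID.
# $1: a block device that sits behind the MegaRAID controller (assumption).
print_smartctl_commands() {
    dev=$1
    while read -r id; do
        printf 'smartctl -a -d megaraid,%s %s\n' "$id" "$dev"
    done
}
```

Possible usage: /opt/MegaRAID/MegaCli/MegaCli64 -LDPDInfo -a0 | extract_device_ids | print_smartctl_commands /dev/sda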

# smartctl -a -d megaraid,8 /dev/sda

Example output

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-123.9.3.el7.scst.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda [megaraid_disk_08] [SAT]: Device open changed type from 'megaraid,8' to 'sat+megaraid,8'
=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 840 PRO Series
Serial Number:    S12SNEAD307504A
LU WWN Device Id: 5 002538 550276c2b
Firmware Version: DXM04B0Q
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Dec 15 10:31:42 2014 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(53956) seconds.
Offline data collection
capabilities: 			 (0x53) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  35) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       4103
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       130
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       1
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   075   045   000    Old_age   Always       -       25
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       66
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1238017862

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
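The attribute table in the output above is easy to pick apart with awk, since RAW_VALUE is the 10th whitespace-separated column. The helpers below are a sketch with made-up names. The second one assumes the usual SMART convention that an attribute has failed when its normalized VALUE is at or below THRESH, and it skips attributes whose THRESH is 000.

```shell
#!/bin/sh
# Sketch: read "smartctl -a" output on stdin and inspect the attribute table.

# Print the RAW_VALUE column of the attribute named $1.
smart_raw_value() {
    awk -v name="$1" \
        '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && $2 == name { print $10 }'
}

# Print the names of attributes whose VALUE <= THRESH.
# Rows with THRESH 000 are skipped (assumption: they carry no threshold).
smart_failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && NF >= 10 &&
         ($6 + 0) > 0 && ($4 + 0) <= ($6 + 0) { print $2 }'
}
```

Possible usage: smartctl -a -d megaraid,8 /dev/sda | smart_raw_value Power_On_Hours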

Monitoring with smartd

  • RHEL 7: /etc/smartmontools/smartd.conf
  • RHEL 6: /etc/smartd.conf

By default DEVICESCAN is enabled, as shown below. As the comment in the file explains, smartd ignores every line after the DEVICESCAN line, so comment that line out.

# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices.  DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found.  Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q

After the change

#DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
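The same edit can be scripted. This is a minimal sketch: the function name is made up, and it works as a filter so the original file is never modified in place. Only lines that start with DEVICESCAN are touched, so the explanatory comment lines stay as they are.

```shell
#!/bin/sh
# Sketch: read smartd.conf on stdin, print it with the active
# DEVICESCAN line commented out. Lines already starting with '#'
# do not match and pass through unchanged.
comment_out_devicescan() {
    sed 's/^DEVICESCAN/#&/'
}
```

Possible usage: comment_out_devicescan < /etc/smartmontools/smartd.conf > smartd.conf.new (the output file name is only an example), then review the result before copying it into place.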

The final configuration looks like this. With the comments stripped, only the following lines are needed.

DEFAULT -m xxxx@xxx.com
/dev/sda -a -d megaraid,8
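The configuration above monitors only device ID 8. To cover every disk on the controller, one line per device ID could be generated instead. The sketch below assumes the same options as the memo (-a and -d megaraid,N) and uses a made-up function name; the mail address and block device are passed in as arguments.

```shell
#!/bin/sh
# Sketch: read device IDs on stdin and print smartd.conf lines.
# $1: mail address for the DEFAULT line
# $2: block device behind the MegaRAID controller (assumption)
print_smartd_conf() {
    printf 'DEFAULT -m %s\n' "$1"
    while read -r id; do
        printf '%s -a -d megaraid,%s\n' "$2" "$id"
    done
}
```

Possible usage: printf '%s\n' 8 9 11 | print_smartd_conf xxxx@xxx.com /dev/sda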

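Writing the DEFAULT line as follows sends a test mail, which is a way to check mail delivery. smartd must be restarted for the change to take effect.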

DEFAULT -m xxxx@xxx.com -M test
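How smartd is restarted depends on the release. The helper below is a sketch that only prints the command for a given RHEL major version. It assumes the service is named smartd, managed by systemd on RHEL 7 and by an init script on RHEL 6; the function name is made up.

```shell
#!/bin/sh
# Sketch: print the command that restarts smartd.
# $1: RHEL major version (6 or 7). The service name "smartd" is assumed.
smartd_restart_command() {
    case $1 in
        6) echo 'service smartd restart' ;;
        7) echo 'systemctl restart smartd' ;;
        *) echo "unsupported RHEL major version: $1" >&2
           return 1 ;;
    esac
}
```

After the restart the test mail should arrive at the address given to -m. Remove -M test again once delivery is confirmed, otherwise a test mail is sent on every smartd start.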

If smartctl fails as shown below, prefix the device type with sat+.

# smartctl -a -d megaraid,8 /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-431.el6.scst.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

/dev/sda [megaraid_disk_08] [SAT]: Device open changed type from 'megaraid' to 'sat'
Smartctl open device: /dev/sda [megaraid_disk_08] [SAT] failed: SATA device detected,
MegaRAID SAT layer is reportedly buggy, use '-d sat+megaraid,N' to try anyhow

# smartctl -a -d sat+megaraid,8 /dev/sdb
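The choice between megaraid,N and sat+megaraid,N can be made from the smartctl output itself. The sketch below looks for the hint text shown in the failure above. It deliberately matches the literal string with N, because a successful run also mentions sat+megaraid with the real ID. The function name is made up.

```shell
#!/bin/sh
# Sketch: read smartctl output on stdin and print the -d argument to use.
# $1: device ID from MegaCli.
pick_megaraid_type() {
    # The failure message suggests: use '-d sat+megaraid,N' to try anyhow
    if grep -q -F "use '-d sat+megaraid,N'"; then
        printf 'sat+megaraid,%s\n' "$1"
    else
        printf 'megaraid,%s\n' "$1"
    fi
}
```

Possible usage: smartctl -a -d megaraid,8 /dev/sda 2>&1 | pick_megaraid_type 8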