Commands |
Description |
pureblade list |
To list the blades in FB |
purehw list --spec |
To show the blade serial number |
hal-show -e /local | jq '.family, .id, .asy, .desc' |
To show blade type and model |
hal-eeprom -e /local/eeprom/id -r | jq . |
To show blade serial number and Node ID |
fbdiag wait-helper -v -n1.3 |
To verify process status on the blade |
rpc.py blades_available | jq . | grep -E "ir.*8001|cluster_id" |
To verify blade in cluster geometry |
exec.py -n$chassisNum.$bladeNum "ir_version | grep build" |
To verify the blade's Purity build version |
exec.py -nx -- sudo netstat -anot | grep <CLIENT_IP:PORT> |
To check for client connections to the blade |
sudo supervisorctl status nfsd |
To verify NFS is running on the blade |
fbdiag nfs-health-check --mgmt-vip ch1-fb1 | grep -c "running on server" |
To check whether any authorities are running on the blade (prints a count) |
nfs_control.py -n$chassisNum.$bladeNum stop -v |
Stop NFS on the blade |
nfs_control.py -n$chassisNum.$bladeNum start -v |
Start NFS on the blade |
purehw setattr --identify on CH$chassisNum.FB$bladeNum |
Turn on locator LED on blade |
purehw setattr --identify off CH$chassisNum.FB$bladeNum |
Turn off locator LED on blade |
puregrep -E "is_evacuating changed to true|start evacuation" .fb/nfs.log |
To check when evacuation started on the blade (run from FUSE) |
puregrep -E "Calling blade removal RPC|blades observed geom change" [af]m?/platform.log |
To check when evacuation completed on the blade (run from FUSE) |
date; fbupgrade power-ctl -v -nc.b cycle |
Power-cycle the blade |
hal-show -e /local/temperature --headers name endpoint value units |
To check blade temperature |
lsblk |
To list the blade's block devices, sizes, and mount points |
sudo smartctl -i /dev/sda |
To check blade SSD information |
sudo dmidecode -t 17 |
To check blade DIMM information |
date; exec.py -n1-2,4-11,13,14 -- 'time=$(date +%H:%M:); zgrep "$time. counter.S3.*allocated_aus" /logs/nfs.log' | awk '{a=a+$12;b=b+$13}END{print a,b}' |
To check whether evacuation is progressing; run twice, 10 seconds apart, and compare the totals |
|
To evacuate the blade (there are several ways; here we stop NFS) |
hal-slot -l |
To check whether the blade in a slot is powered on or off |
fbdiag nfs-health-check --mgmt-vip ch1-fb1 | grep 'booting for' |
To check whether authorities have booted |
exec.py -na -- 'zgrep -a " [AEK] " /logs/nfs.log | tail' |
To check for any AEK errors on the blades |
exec.py -nx.x 'sudo rsync -auv ch1-fb1:/ssd/nfs_conf.json /ssd/nfs_conf.json' |
To copy tunables from one blade to another |
zgrep "flash_read_uncorrectable, all Vt retry options were unsuccessful" nfs.log | perl -n -e '/(\[U\d+\]).*<(SM=\d+ BNK=\d+ CE=\d+ LUN=\d+ BLK=\d+)/ && print $1 . " " . $2 . "\n"' | sort | uniq | perl -n -e '/(\[U\d+\]).*(SM=\d+ BNK=\d+ CE=\d+ LUN=\d+) BLK=(\d+)/ && print $1 . " " . $2 . " PLANE=" . $3%4 . "\n"' | sort | uniq -c |
To check for bad blocks and bad planes |
/opt/ir/devcat_lookup.py device_health | grep -A10 "sketchy_block_ages_sec" |
To check bad block rebuild progress (run on the blade); the rebuild is done when the total count reaches 0 |
tgrep -a "Blade nand type detected" chfb/platform.log* | uniq -f4 |
To get blade type information from FUSE logs |
fb dump hdiag --key puresmb.status |
To check SMB type configured from FUSE |
fb info smb |
To check SMB type configured from FUSE |
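
The evacuation-progress row above says to run the allocated_aus sum twice, 10 seconds apart, and compare the totals. Below is a minimal sketch of that compare step. `compare_snapshots` is a hypothetical helper name, not part of any FlashBlade tooling, and the sketch only assumes you can wrap the table's `exec.py ... | awk ...` pipeline in a command that prints the two totals. It reports only whether the totals changed, not in which direction.

```shell
#!/bin/sh
# compare_snapshots INTERVAL CMD [ARGS...]
# Hypothetical helper (not a FlashBlade tool): runs CMD twice, INTERVAL
# seconds apart, prints both outputs, and reports whether they differ.
# Returns 0 if the two snapshots differ, 1 if they are identical.
compare_snapshots() {
    interval=$1
    shift
    first=$("$@")       # first snapshot
    sleep "$interval"
    second=$("$@")      # second snapshot
    printf 'first:  %s\nsecond: %s\n' "$first" "$second"
    if [ "$first" = "$second" ]; then
        echo "unchanged"
        return 1
    fi
    echo "changed"
}

# Example wiring (assumption: snapshot is a function you define yourself
# around the evacuation-progress pipeline from the table):
#   snapshot() { <exec.py pipeline from the table>; }
#   compare_snapshots 10 snapshot
```

An "unchanged" result after one 10-second interval is only a hint; repeating the check a few times gives a clearer picture before concluding the evacuation is stuck.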