Arista CVP 2019.X Cheat Sheet (DRAFT)

CloudVision Portal and Telemetry

This is a draft cheat sheet. It is a work in progress and is not finished yet.

CLI "show" Commands

cvpi status <component>/all [-v=3]
-shows running, disabled, and failed components; -v=3 adds verbosity.
cvpi resources [-v=3]
-shows memory, storage, disk throughput (>20MBps min for healthy disk, >40MBps recommended), CPUs, and NTP sync (mandatory for multi-node, at least ntpd UP for single-node)
cvpi deps <component> <start/stop>
-gives the dependencies the component needs in order to start/stop
cvpi debug
-collects logs for all components for troubleshooting; collect on the primary node.
cvpi logs <component>
-shows where the logs for a particular component are located, e.g. 'cvpi logs aeris' shows /cvpi/apps/aeris/logs/. Also good for finding which node a component and its logs reside on, e.g. 'cvpi logs turbine-rate-intf-counters' shows that it resides on the tertiary node, along with its path.
cvpi info <co­mpo­nen­t>
-great command to learn about a component; includes actions that can be taken, ports used, config, logging, etc.
cvpi status all -v=3 | grep disabled
-to see which processes are disabled
history
-shows list of all commands run
cvpi version
-shows version of CVP
cvpi env or cat /etc/cvpi/env
-shows environment variables and whether they are correctly set
cvpi check all
-checks that everything is set up correctly; confirms nodes are talking to each other and have the same configs/env/etc.
dmesg -T
-shows kernel message buffer for checking disk/storage issues
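The disk throughput thresholds quoted for 'cvpi resources' above (>20MBps minimum, >40MBps recommended) can be turned into a quick check. This is a minimal sketch: classify_disk_throughput is a hypothetical helper, not part of cvpi, and it expects the MB/s figure to be passed in by hand rather than parsed from cvpi output.

```shell
# classify_disk_throughput MBPS
# Hypothetical helper (not a cvpi command). Compares a throughput value in
# MB/s against the thresholds from the cheat sheet: >20 minimum, >40 recommended.
classify_disk_throughput() {
    local mbps="$1"
    # awk is used so fractional values such as 20.5 compare correctly
    awk -v t="$mbps" 'BEGIN {
        if (t > 40)      print "recommended"
        else if (t > 20) print "minimum"
        else             print "below-minimum"
    }'
}
```

Example: classify_disk_throughput 35 prints "minimum", meaning the disk is healthy but below the recommended rate.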

CLI "Config" Commands

cvpi start/stop <component>/all
-starts/stops all available/specified components
cvpi -v=3 start/stop <component>/all
-starts/stops all available/specified components with verbosity (gives detail on failures if subcomponents fail to start)
cvpi start/stop cvpi
-starts/stops cvpi stack
cvpi reset all
-resets the CVP app to its initial state by deleting all HBase and Hadoop data
cvpi reset aeris
-deletes all Telemetry data; can be used for expedited upgrades from 2018.2.X to 2019.1.X
cvpReInstall
-case-sensitive; in the event of an install failure, execute on primary node to set all 3 nodes back to default.
cvpi config <component>/all
-configures the components
cvpi backup cvp
-new backup procedure in 2018.2.X and on
cvpi restore cvp cvp..tgz cvp.eosimages..tgz
-new restore procedure in 2018.2.X and on; can't restore across major releases due to data formatting changes (e.g. can't restore from 2018.X to 2019.X)
cvpi enable cvpi
-enables components of CVP to be automatically restarted if they stop
cvpi init
-gets rid of corrupted data folders; recreates directory structure; repairs any damage by removing whole directories
hdfs dfsadmin -safemode get
-checks whether hadoop/hbase is in safe mode
hdfs dfsadmin -safemode leave
-tries to get primary/secondary to leave safe mode; then try to start it again
hbase hbck
-checks for inconsistencies/corruptions; prints OK or gives Errors; run several times as some inconsistencies are transient
hbase hbck -repair
-repairs inconsistencies; run 5-10 times if necessary
/cvpi/zookeeper/bin/zkServer.sh start/stop
-use if seeing zookeeper issues; zookeeper won't be stopped via 'cvpi stop all'
systemctl stop cvpi-watchdog.timer
-in a cluster, stop the watchdog timer on all three nodes when stopping zookeeper; otherwise it will spawn a new zookeeper process.
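The advice above to rerun the hbck check or repair several times can be scripted as a bounded retry loop. A minimal sketch, assuming the command prints a line containing "OK" once things are consistent (the exact output string is an assumption); run_until_ok is a hypothetical helper, and the command to run is passed in as arguments.

```shell
# run_until_ok MAX PATTERN CMD [ARGS...]
# Hypothetical helper: runs CMD up to MAX times and stops as soon as its
# output contains PATTERN. Returns 0 on a match, 1 if no attempt matched.
run_until_ok() {
    local max="$1" pattern="$2" attempt=1 out
    shift 2
    while [ "$attempt" -le "$max" ]; do
        # Capture the full output so the command is never cut short by a closed pipe
        out=$("$@" 2>&1)
        if printf '%s\n' "$out" | grep -q "$pattern"; then
            echo "matched on attempt $attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "no match after $max attempts"
    return 1
}
```

Usage would look like run_until_ok 10 "OK" followed by the hbck repair command from the list above, so the loop gives up after 10 tries instead of running forever.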

MINIMUM Requirements

                   Lab (<25 devices)    Production (<=500 devices)
CPUs:              16 cores             16 cores
RAM:               16GB                 22GB
Disk:              125GB                1 TB
Disk Throughput:   20MB/s               40++MB/s
More might be needed based on feature sets in use. For example:

For CloudVision Wifi:
+4 CPU
+8 GB RAM
+100GB Disk storage
+10 charisma

For Elasticsearch (MAC/IP search feature):
+4 CPU

Also for Production, 16 cores could be 8 CPUs x 2 cores or 16 CPUs x 1 core.
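The minimums and per-feature additions above can be added up for a given deployment. A minimal sketch: required_resources is a hypothetical helper, the profile and add-on names (lab, production, wifi, elasticsearch) are labels chosen for this example, and 1 TB is assumed to mean 1000 GB.

```shell
# required_resources PROFILE [ADDON...]
# Hypothetical helper. PROFILE is lab or production; ADDON is wifi or
# elasticsearch. Prints the summed minimum cores, RAM, and disk.
required_resources() {
    local profile="$1" cores ram disk addon
    shift
    case "$profile" in
        lab)        cores=16; ram=16; disk=125 ;;
        production) cores=16; ram=22; disk=1000 ;;  # assumption: 1 TB = 1000 GB
        *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac
    for addon in "$@"; do
        case "$addon" in
            wifi)          cores=$((cores + 4)); ram=$((ram + 8)); disk=$((disk + 100)) ;;
            elasticsearch) cores=$((cores + 4)) ;;
            *) echo "unknown add-on: $addon" >&2; return 1 ;;
        esac
    done
    echo "$cores cores, $ram GB RAM, $disk GB disk"
}
```

Example: required_resources production wifi elasticsearch prints "24 cores, 30 GB RAM, 1100 GB disk".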

Where are the debug files?

Device/Interface Scale (multi-node cluster)

As customers close in on these numbers, expect give and take with additional beta features, latency, etc. as resources reach capacity.

Where is it?

From root ==> su cvp ==> /cvpi
all scripts, packages, config files, logs
Logs
/cvpi/logs; /cvpi/hbase/logs; /cvpi/hadoop/logs; /cvpi/tomcat/logs
Shortcut to logs
Also just run $ cvpi logs <component> which shows path to logs.
Config Files
/cvpi/conf/components/; /cvpi/apps/turbine/configs/; /cvpi/apps/aeris/conf/; /cvpi/apps/cvp/conf/; /cvpi/apps/geiger/conf/; /cvpi/apps/wifimanager/conf
Backups
/data/cvpbackup/ on the primary; backups are run nightly at 2am UTC by default; check via crontab -l as root user; 5 backups stored
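To confirm that the nightly backups are landing in the backup directory, listing the newest files is usually enough. A minimal sketch: newest_files is a hypothetical helper, and it makes no assumption about backup file names; it simply sorts whatever is in the directory by modification time.

```shell
# newest_files DIR [COUNT]
# Hypothetical helper: lists the COUNT most recently modified entries in DIR,
# newest first. COUNT defaults to 5, matching the number of backups kept.
newest_files() {
    local dir="$1" count="${2:-5}"
    ls -1t "$dir" | head -n "$count"
}
```

On the primary node this would be run as newest_files /data/cvpbackup/ and the timestamps compared against the 2am UTC schedule.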

Minimum Configuration on EOS Device

Confirm the daemon is correctly installed.
!
daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=10.81.110.104:9910 -cvcompression=gzip -ingestauth=key,cvp -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -ingestvrf=default -taillogs
   no shutdown
!
ntpd needs to be enabled for single node; NTP sync essential for multi-node.
!
ntp server 10.81.111.240 prefer iburst
ntp server 10.81.111.241 iburst
!
Turn up api for http for EAPI to work; turn up unix-socket so TerminAttr can talk to ConfigAgent (nginx method).
management api http-commands
   protocol http
   protocol unix-socket
   no shutdown
!
TerminAttr has 2 mechanisms to talk to ConfigAgent:
Default VRF - via unix socket directly, no additional config required
Non-default VRF - cannot talk directly (ConfigAgent only listens in the Default VRF) so the connection has to go via nginx; protocol unix-socket required under management api http-commands.
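When the same TerminAttr stanza has to be prepared for many devices, it can be generated from a template. A minimal sketch that reuses the flags from the example above unchanged and only substitutes the CVP address and the VRF; terminattr_stanza is a hypothetical helper, and the address in the usage line is a placeholder.

```shell
# terminattr_stanza CVP_ADDR [VRF]
# Hypothetical helper: prints the daemon TerminAttr config from this cheat
# sheet with the CVP address and ingest VRF filled in. VRF defaults to "default".
terminattr_stanza() {
    local cvp_addr="$1" vrf="${2:-default}"
    cat <<EOF
daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=${cvp_addr}:9910 -cvcompression=gzip -ingestauth=key,cvp -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -ingestvrf=${vrf} -taillogs
   no shutdown
EOF
}
```

Example: terminattr_stanza 192.0.2.10 mgmt prints the stanza for a device that reaches CVP through a VRF named mgmt. For a non-default VRF, protocol unix-socket is still needed under management api http-commands, as noted above.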

Enabling LANZ on EOS CLI

queue-monitor length
!
! TerminAttr runs in the default VRF, so queue-monitor streaming has to be in the default VRF as well.
queue-monitor streaming
   no shutdown
!
Can confirm in bash via curl localhost:6060/rest/LANZ/congestion