
cassandra 2.0.X Cheat Sheet (DRAFT) by

This is a draft cheat sheet. It is a work in progress and is not finished yet.

nodetool

bin/nodetool status
get the status of nodes; UN = up and normal
bin/nodetool info -h 127.0.0.1
detailed information about the node 127.0.0.1
bin/nodetool ring
get the ring information (the token range owned by each node)

system

bin/cassandra
start cassandra; with -f it runs in the foreground
$ ps aux | grep cass
get the cassandra pid
$ kill <pid>
stop the cassandra service
conf/cassandra.yaml
configuration file
conf/log4j-server.properties
sets where the log file is written and the maximum log file size

cql - crud

source 'filename.cql'
run a file with cql commands
insert into
insert into <table> (xxx, xxx) values ('xxx', 'xxx')
insert a row into a table
sstableloader tool
bulk load sstables into a cluster
select * from <table>
select xxx, xxx from <table>
copy from
import .csv file
copy to
export .csv file
copy <table> (xxx, xxx) from 'file path' with header = true and delimiter = '|'
copy csv file example; note: if an existing record has the same primary key as a row in the file, the record is simply replaced (an upsert)
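A minimal sketch of the upsert behavior, assuming a hypothetical activity table and activity.csv file:

```sql
-- hypothetical table and file names; COPY FROM silently overwrites any
-- existing row that shares a primary key with a row in the file (upsert)
copy activity (home_id, datetime, event) from 'activity.csv'
    with header = true and delimiter = '|';
```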
bin/cassandra-cli
start cli (thrift)
use <dbname>
cli command, use a keyspace
list <tablename>
cli command, list how the table is stored

cassandra-cli

bin/cassandra-cli
start the cassandra-cli tool
use <keyspace>
go into the keyspace
list <table>
show how the table is stored
bin/nodetool flush home_security
flush the memtable to disk
bin/sstable2json /var/lib/cassandra/data/home_security/activity/home_security_activity-jb-1-Data.db
view the sstable contents; note: use the Data.db file

data modeling

no join
there are no joins in cassandra; a query should work against a single table
select * from <table> where <partition key> = 'xxx' and <clustering column> = 'xxx'
where
the where clause must include the partition key
secondary index
an index on a column other than the partition key and clustering columns; for each secondary index, cassandra creates a hidden table on each node in the cluster; it does not improve query speed
create a table for each query
this is the preferred way to improve query speed
create index <index_name> on <table> (code_used)
create a secondary index
composite partition key
a partition key made of more than one column
create table <tablename> (xxx xxx, ..., primary key ((xxx, xxx), xxx))
create a table with a composite partition key
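As a concrete sketch (hypothetical table and column names), a composite partition key doubles the parentheses around its columns:

```sql
-- (home_id, day) together form the composite partition key;
-- datetime is a clustering column
create table activity_by_day (
    home_id text,
    day text,
    datetime timestamp,
    event text,
    primary key ((home_id, day), datetime)
);
```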
 

cql

datastax.com/documentation/cql
cql documentation
bin/cqlsh
start the cql shell
describe cluster
show cluster information
help <command>
get help for a command
exit
exit
describe keyspaces
list all the databases
describe keyspace <dbname>
details about the database
create keyspace <dbname> with replication = {'class':'NetworkTopologyStrategy', 'dc1':3, 'dc2':2}
create a database across multiple data centers
create keyspace <dbname> with replication = {'class':'SimpleStrategy', 'replication_factor':1}
create a database in one cluster
drop keyspace <dbname>
delete a database
create table <tablename> (home_id text, datetime timestamp, event text, code_used text, primary key (home_id, datetime)) with clustering order by (datetime desc)
create a table
drop table <tablename>
delete a table
use <dbname>
use a keyspace
ascii, bigint, blob, boolean, counter, decimal, double, float, inet, int, list, map, set, text, timestamp, uuid, timeuuid, varchar, varint
cql data types
primary key
a way to uniquely identify a record in a table
partition key
the first part of the primary key; it determines which node stores the record (old name: row key); the partitioner hashes the partition key
create table <tablename> (...) with clustering order by (datetime desc)
defines the ordering of a table; the default is ascending; descending order takes longer to write, since each record is inserted at the start of its partition, but it improves read performance for recent data; the order cannot be changed later with "alter table"
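The distinction between the primary key, the partition key, and the clustering order can be sketched in one hypothetical table:

```sql
-- primary key = (home_id, datetime); partition key = home_id;
-- datetime is the clustering column, stored newest-first
create table events (
    home_id text,
    datetime timestamp,
    event text,
    primary key (home_id, datetime)
) with clustering order by (datetime desc);
```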
 

applic­ations

planetcassandra.org/client-drivers-tool
cassandra drivers
Cluster cluster = Cluster.builder().addContactPoints("127.0.0.1", "127.0.0.2").build();
build a cluster with the java driver; it is better if more than one contact point exists

update data

update <table> set xxx='xxx', xxx='xxx' where xxx='xxx'
update a record
update location using ttl 100 set XXX=XXX, XXX=XXX where XXX=XXX and XXX=XXX
update with a time to live
delete
delete a value in a column, or a row or rows
delete <column> from <table> where ...
delete the column value where ...
delete from <table> where ...
delete a row where ...
truncate
delete all of the rows in a table
drop
delete a table or a keyspace
drop table <table>
drop keyspace <keyspace>

tombstone

gc_grace_seconds
the minimum time a deleted record (tombstone) is kept; the default is 864000 seconds (10 days)
compaction
reclaims the disk space used by deleted data
bin/nodetool compact
run compaction manually; it usually runs automatically
TTL
Time To Live, a way to specify an expiration date for data that is being inserted
insert into location (xxx, xxx) values ('xxx', 'xxx') using ttl 30
the inserted data will live for 30 seconds
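To see how long a value inserted with a TTL has left, cql's ttl() function can be queried (hypothetical column and key names):

```sql
-- returns the remaining seconds before the event value expires
select ttl(event) from location where home_id = 'H01';
```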
sstable2json <sstable>
in the records, "d" marks a deleted cell (shown after the TTL has passed) and "e" marks an expiring cell (shown before the TTL has passed)