Show Menu
Cheatography

CS107 Computer Organization and Systems Cheat Sheet (DRAFT) by

Cheat sheet for CS107 Computer Organization and Systems.

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Unix

Unix: a set of standards and tools commonly used in software develo­pment.
The comman­d-line is a text-based interface (i.e., terminal interface) to navigate a computer, instead of a Graphical User Interface (GUI).

Unix Commands

cd – change direct­ories (..)
ls – list directory contents (-l, -a: hidden files)
mkdir – make directory
emacs – open text editor
rm – remove file or folder (-rf)
rmdir - remove empty dir
man – view manual pages
tree cs107 -F (show files and direct­ories in tree)
pwd - output absolute path to current location
cp source dest - copy (-r to copy directory)
mv - move (rename)
cat file1 (file2 file3) print file(s one after another)
grep "­bin­ky(.*)­" program.c - search text in files (. any char, * zero or more repeats of left char, ^ beginning of line, $ end of line)
find assign1 -name "­*.c­" - search the assign1 folder for all .c files
diff hello.c hello2.c - find the diff of two files
./hello > output­Fil­e.txt - save output to file
>> - append the output to an existing file
diff file1.c file2.c | grep "­#in­clu­de" | wc -l - pipe, find # of diff lines that contain #include for two files
./addT­woN­umbers < twoNum­ber­s.txt - read user input from file
 

Bits and Bytes

Two's Complement: binary digits inverted, plus 1
Overflow: Exceed max val-->­ove­rflow back to smallest; below min val-->­ove­rflow back to largest
SCHAR_MIN (-128), UCHAR_MAX (255), SHRT_MIN, INT_MAX (21474­83647), UINT_MAX, ULONG_MAX
Casting: Replicate bit, interp­reted differ­ently (int v = -1; unsigned int uv = v;/ (unsigned int) v/ -12345U)
C will implicitly cast the signed argument to unsigned when comparing
Max is 0 followed by all 1s, min is 1 followed by all 0s in signed
Expanding bit repres­ent­ation: zero (unsigned) / sign extension (signed); promote to larger type for comparison
Truncating bit repres­ent­ation: discard more signif­icant bits
bitwise operators: &, |, ~, ^, <<, >>
^ with 1 to flip, with 0 to let a bit go through
^ flip isolated bits, ~ flip all bits
num & (num - 1): clears the lowest 1
Right shift fills with sign bit (signed, arithmetic right shift); fills with 0s (unsigned, logical right shift)
long num = 1L << 32;, CHAR_BIT = 8
int sign = value >> (sizeo­f(int) * CHAR_BIT - 1); return (value ^ sign) - sign;

Characters and C Strings

char: single character / "­gly­ph" ('\\', '\n', 'A' (65)), repres­ented as int (ASCII), lowercase 32 more than upper
isalph­a(ch) (alpha­bet), islower, isupper, isspace (space, \t, \n...), isdigit, toupper, tolower (return char, not modify existing)
C Strings: array of chars with '\0', null-t­erm­inating character, pass char* as param (add. of 1st char), str == &s­tr[0]
int foo(char *str) == int foo(char str[]), str-po­inter (char** argv == char* argv[], double pointer)

Pointers and Arrays

Pointer: A variable that stores a memory address
Memory: A big array of bytes; each byte unique numeric index (generally written in hex)
*: declar­ati­on-­poi­nter, operat­ion­-de­ref­ere­nce­/value at address
Pass value as param, C passes a copy of the value; take add (ptr) as a param, go to add when need val
char* could also ptr to single char
create strings as char[], pass them around as char *
Avoid &str when str is char[]! str/&­str[0]
&arr does nothing on arrays, but &ptr on pointers gets its address
sizeof­(arr) gets the size of an array in bytes, but sizeof­(ptr) is always 8
An array variable refers to an entire block of memory. Cannot reassign an existing array to be equal to a new array.
Pass an array as param, C makes copy of add. of 1st element and pass a ptr to function (No sizeof with param!!)

Stack Memory and Heap Memory

The stack is the place where all local variables and parameters live for each function. Goes downwards when func called and shrinks upwards when func finished
The heap is a part of memory below the stack. Only goes away when free. Grows upward. Dynamic memory during program runtime.
Allocate with malloc­/re­all­oc/­str­dup­/calloc, e.g. int *arr = malloc­(si­zeo­f(i­nt)­*len)); assert(arr != NULL); free(arr);
int *scores = calloc­(n_­elem, sizeof­(int)); (zeros out memory); char* str = strdup­("He­llo­"); malloc + strcpy
CANNOT free part of previous alloc, MUST free add received in alloc
A memory leak is when you do not free memory you previously allocated.
char *str = strdup­("He­llo­"); str = reallo­c(str, new_len + 1); (Must be ptrs returned by malloc, etc.), automatic free of prev smaller one

Generics

void*: any pointer, No derefe­ren­cin­g/P­ointer Arithmetic (cast to char* to do pointer arithm­etic)
memcpy is a function that copies a specified amount of bytes at one address to another address (returns dest).
memmove handles overla­pping memory figures (returns dest)
Function pointers: [return type] (*[nam­e])­([p­ara­met­ers]) ("ca­llb­ack­" function, function writer vs function caller)
qsort: sort arr of any type; bsearch: binary search to search for a key in arr any type; lfind: linear search to search for key (return NULL not found); lsearch: linear search, add key if not found

GDB

GDB: p/x num (hex), p/d num (digit), p/t num (binary), p/c num (char), p/u (unsigned decimal); p nums[1]@2 (start at nums[1] print 2)
gdb myprogram; b main; r 82 (run with arts); n, s, continue (next,step into, continue); info (args, locals)
ctrl-c + backtrace - display the current call stack, meaning what functions are currently executing.

Optimi­zationn

Optimi­zation: task of making program faster­/more efficient with space or time
gcc -O0 (mostly literal transl­ation), O2 (enable nearly all reasonable optimi­zat­ions), O3 (more aggres­sive, trade size for speed), Os (optimize for size), -Ofast (disregard standards compli­ance)
Target: static instru­ction count, dynamic, cycle count/­exe­cution time
Constant Folding pre-ca­lcu­lates constants at compil­e-time where possible.
Common Sub-Ex­pre­ssion Elimin­ation prevents the recalc­ulation of the same thing many times by doing it once and saving the result.
Dead code elimin­ation removes code that doesn't serve a purpose (empty for loop, if/else same operation)
Strength reduction changes divide to multiply, multiply to add/shift, and mod to AND to avoid using instru­ctions that cost many cycles (multiply and divide)
Code motion moves code outside of a loop if possible.
Tail recursion is an example of where GCC can identify recursive patterns that can be more effici­ently implem­ented iterat­ively.
Loop unrolling: Do n loop iterat­ions' worth of work per actual loop iteration, so we save ourselves from doing the loop overhead (test and jump) every time, and instead incur overhead only every n-th time.

Heap Allocator

A heap allocator is a suite of functions that cooper­atively fulfill requests for dynami­cally allocated memory.
When initia­lized, a heap allocator tracks the base addr and size of a large contiguous block of memory: heap.
Throug­hput: # requests completed per unit time (minim­izing avg time to satisfy a request) vs Utiliz­ation: how effici­ently we make use of the limited heap memory to satisfy requests.
Utiliz­ation: largest addr used as low as possible
Internal Fragme­nta­tion: allocated block larger than what's needed, external fragme­nta­tion; no single block large enough to satisfy allocation request, even though enough aggregate free memory available
Implicit free list allocator: 8 byte (or larger) header, by storing header info, implicitly mainta­ining a list of free blocks (malloc linear in total number of blocks)
Explicit free list allocator: stores ptrs to next and previous free block inside each free block's payload (look just the free blocks on linked list for malloc, linear in # free blocks, update linked list when free), throughput increase, costs: design and internal fragme­ntation

Assembly: Control Flow & Function Call

%rip stores addr of next instru­ction to execute (%rip += size of bytes of curr instru­ction)
direct jump: jum Label, indirect jump: jmp *%rax (jump to instru­ction at addr in %rax)
Condition code regs store info about most recent arithm­eti­c/l­ogical operation (lea NOT update; logical like xor set CF & OF to 0; shift set CF to last bit shifted out and OF to 0; inc and dec set OF and ZF, leave CF unchanged)
CF: unsigned overflow, OF: two's-­com­plement overfl­ow/­und­erflow
test and cmp just set condition codes (not store result)
static instru­ction count: # of written instru­ctions; dynamic instru­ction count: # of executed instru­ctions when program is run
%rsp: stores addr of "­top­" of stack, must point to same place before func called and after returned
push: R[%rsp­]<-­R[%rsp] - 8; pop+8
call: push next value of %rip onto stack, set %rip point to beginning of specified function's instru­ctions
ret: pops instru­ction addr from stack and stores it in %rip
stored %rip: return address, addr of instru­ction where execution would have continued had flow not been interr­upted by function call
nop: no-op, do nothing (make functions align); mov %ebx,%ebx, zeros out top 32 bits; xor %ebx,%ebx, set to 0, optmizes for perfor­mances & code size
Suppose %rcx stores arr[1] addr, to get arr[0] value: p *((int­*)$­rcx-1)

Assembly: Arithmetic and Logic

Machine code 1s and 0s, human-­rea­dable form assembly (GCC compiler)
Sequential instru­ctions sequential in memory
Instru­ction operation name "­opc­ode­" (mov, add, etc.), "­ope­ran­ds" (argum­ents, max 2)
$[number] constant value, "­imm­edi­ate­"; %[name] register
Register: fast read/write memory slot right on CPU that can hold variable values (not in memory, 64-bit space inside processor, total 16)
mov: $ only src, % both, memory location at least one (copy value at addr)
Indire­ct(): derefe­ren­cing, (%rbx) copy value at addr stored in %rbx
%rip: addr of next instru­ction to execute
%rsp: addr of current top of stack
movl writing to reg also set high order 4 bytes to 0
movabsq 64-bit immediate, movq only 32-bit. 64-bit imm src, only reg as dest
movz, movs, smaller src larger dst, src: memory­/reg, dest: reg
cltq: sign-e­xtend %eax to %rax
parent­heses require regs in par. be 64-bit
mov copies data at addr, lea copies value of src (addr) itself (only lea not derefe­ren­cing)
inc D D<-D + 1, dec D D <- D-1
shift k, D, k only %cl (w bits data, looks at lower-­order log2(w) bits of %cl to know how much to shift) or imm
imul: two operands, multiplies and truncates to fit in the second; one operand, multiplies by %rax, higher­-order 64 bits in %rdx, lower in %rax
idivq: divide 128-bit by 64-bit, higher­-order 64 bit of dividend stored in %rdx, lower order %rax, only list divisor as operand (quotient %rax, remainder %rdx, cqto sign-e­xtends 64-bit dividend)
 

C Program Example

#define CONSTANT 0x8
int main(int argc, char *argv[]) {
    char *prefix = "CS";
    int number = 107;
    // %s (string), %d (integer), %f (double)
    printf("You are in %s%d\n", prefix, number);
    return 0;
}

Assignment 0

/* Unix
ls samples/server_files/home/ >> home_dir.txt
diff samples/server_files/users.list home_dir.txt
grep "sudo" samples/server_files/home/mattv/.bash_history  */
int main(int argc, char *argv[]) {
    int nlevels = DEFAULT_LEVELS;
    if (argc > 1) {
        nlevels = atoi(argv[1]);
        if (nlevels < 0 || nlevels > 8) {
// void error(int status, int errnum, const char *format, ...);
            error(1, 0, "out of range");   
        }
    }
    print_triangle(nlevels);
    return 0;
}

Assignment 1

long signed_max(int bitwidth) {
    return ~signed_min(bitwidth);
}
long signed_min(int bitwidth) {
    return -1L << (bitwidth - 1);
}
long sat_add(long operand1, long operand2, int bitwidth) {
    if (!((operand1 >> (bitwidth - 1)) & 1L) && 
    !((operand2 >> (bitwidth - 1)) & 1L) && 
    (((operand1 + operand2) >> (bitwidth - 1)) & 1L)) {
        return signed_max(bitwidth);
    }
    if (((operand1 >> (bitwidth - 1)) & 1L) && 
    ((operand2 >> (bitwidth - 1)) & 1L) && 
    !(((operand1 + operand2) >> (bitwidth - 1)) & 1L)) {
        return signed_min(bitwidth);
    }
    return operand1 + operand2;
}
int to_utf8(unsigned short code_point, unsigned char utf8_bytes[]) {
    if (code_point <= 0x7f) {
        utf8_bytes[0] = code_point;
        return 1;
    } else if (code_point <= 0x7ff) {
        utf8_bytes[0] = 0xc0;  // represents 11000000.
        utf8_bytes[1] = 0x80;  // represents 10000000.
        utf8_bytes[0] |= (code_point & 0x7c0) >> 6;  // 0x7c0 provides the bit mask 11100000.
        utf8_bytes[1] |= code_point & 0x3f;  // 0x3f provides the bit mask 00111111.

        return 2;
    } else {
        utf8_bytes[0] = 0xe0;  // represents 11100000.
        utf8_bytes[1] = 0x80;  // represents 10000000.
        utf8_bytes[2] = 0x80;  // represents 10000000.
        utf8_bytes[0] |= (code_point & 0xf000) >> 12;  // 0xf000 provides the bit mask 1111000000000000.
        utf8_bytes[1] |= (code_point & 0xfc0) >> 6;  // 0xfc0 provides the bit mask 0000111111000000.
        utf8_bytes[2] |= code_point & 0x3f;  // 0x3f provides the bit mask 0000000000111111.
        return 3;
    }
}
#define BIT_MASK_3 7L
unsigned long advance(unsigned long cur_gen, unsigned char ruleset) {
    unsigned long next_gen = 0;
    unsigned long neighborhood = 0;
    neighborhood = (cur_gen << 1) & BIT_MASK_3;
    next_gen |= (ruleset >> neighborhood) & 1L; 

    for (int i = 0; i <= sizeof(long) * CHAR_BIT - 2; ++i) {
        neighborhood = (cur_gen >> i) & BIT_MASK_3;
        next_gen |= ((ruleset >> neighborhood) & 1L) << (i + 1); 
    }
    return next_gen;
}

void draw_generation(unsigned long gen) {
    for (int i = sizeof(long) * CHAR_BIT - 1; i >= 0; --i) {
        if ((gen >> i) & 1L) {
            printf(LIVE_STR);
        } else {
            printf(EMPTY_STR);
        }
    }
    printf("\n");
}

Assignment 2

const char *get_env_value(const char envp[], const char key) {
    int lenKey = strlen(key);
    for (int i = 0; envp[i] != NULL; ++i) {
        char* match = strstr(envp[i], key);
        if (match == envp[i] && *(match + lenKey) == '=') {
            return match + lenKey + 1;
        }
    }
    return NULL;
}

bool scan_token(const char **p_input, 
const char *delimiters, char buf[], size_t buflen) {
    const char begin = p_input;
    begin += strspn(begin, delimiters);
    const char* end = begin + strcspn(begin, delimiters);
    
    int maxlen = 0;
    if (end - begin <= buflen - 1) {
        maxlen = end - begin;
    } else {
        maxlen = buflen - 1;
    }
    if (maxlen <= 0) {
        *p_input = begin;
        return false;
    }
    strncpy(buf, begin, maxlen);
    buf[maxlen] = '\0';
    *p_input = begin + maxlen;
    return true;
}
int main(int argc, char argv[], const char envp[]) {
    const char *searchpath = get_env_value(envp, "MYPATH");
    if (searchpath == NULL) {
        searchpath = get_env_value(envp, "PATH");
    }
    if (argc == 1) {
        char dir[PATH_MAX];
        const char *remaining = searchpath;

        printf("Directories in search path:\n");
        while (scan_token(&remaining, ":", dir, sizeof(dir))) {
            printf("%s\n", dir);
        }
    } else {
        for (size_t i = 1; i < argc; ++i) {
            const char *executable = argv[i];
            char dir[PATH_MAX];
            const char *remaining = searchpath;
            while (scan_token(&remaining, ":", dir, sizeof(dir))) {
                strcat(dir, "/");
                strcat(dir, executable);
                if (access(dir, R_OK | X_OK) == 0) {  
                    printf("%s\n", dir);
                    break;
                }
            }        
        }
    }   
    return 0;
}

Assignment 3

char *read_line(FILE *file_pointer) {
    char* buffer = malloc(MINIMUM_SIZE);
    assert(buffer != NULL);
    size_t curSize = MINIMUM_SIZE;
    char* curPtr = fgets(buffer, curSize, file_pointer);
    if (curPtr == NULL) {
        free(buffer);
        return NULL;
    }
    size_t strLen = strlen(buffer);
    while (*(buffer + strLen - 1) != '\n') {
        curSize *= 2;
        buffer = realloc(buffer, curSize);
        assert(buffer != NULL);
        curPtr = buffer + strLen;
        char* newPtr = fgets(curPtr, curSize - strLen, file_pointer); 
        if (newPtr == NULL) {
            *curPtr = '\0'; 
            break;
        } else {
            curPtr = newPtr;
        }
        strLen += strlen(curPtr);
    }
    if (*(buffer + strLen - 1) == '\n') {
        *(buffer + strLen - 1) = '\0'; 
    }
    return buffer;
}
void print_last_n(FILE *file_pointer, int n) {
    char* lines[n];
    char* line = NULL;
    int idx = 0;
    size_t cnt_read = 0;
    while ((line = read_line(file_pointer)) != NULL) {
        lines[idx] = line;
        idx = (idx + 1) % n;
        ++cnt_read;
    }
    if (cnt_read < n) {
        idx = 0;
    } else {
        cnt_read = n;
    }
    line = lines[idx];
    size_t cnt_print = 0;
    while (cnt_print < cnt_read) {
        printf("%s\n", line);
        free(line);
        idx = (idx + 1) % n;
        line = lines[idx];
        ++cnt_print;
    }
}
struct Element {
    char* str;
    int cnt;
};

void print_uniq_lines(FILE *file_pointer) {
    size_t curSize = ESTIMATE;
    struct Element arr = malloc(sizeof(struct Element)  curSize);
    assert(arr != NULL);
    size_t cntElement = 0;
    char* line = NULL;
    while ((line = read_line(file_pointer)) != NULL) {
        bool found = false;
        for (size_t i = 0; i < cntElement; ++i) {
            if (strcmp(line, arr[i].str) == 0) {
                ++arr[i].cnt;
                found = true;
                free(line); 
                break;
            }
        }
        if (!found) {
            arr[cntElement].str = line;
            arr[cntElement].cnt = 1;
            ++cntElement;
            if (cntElement == curSize) {
                curSize += ESTIMATE;
                arr = realloc(arr, sizeof(struct Element) * curSize);
                assert(arr != NULL);
            }
        }
    }
    for (size_t i = 0; i < cntElement; ++i) {
        printf("%7d %s\n", arr[i].cnt, arr[i].str);
        free(arr[i].str);
    }
    free(arr);
}