
What is an algorithm?

An algorithm is a method or a process followed to solve a problem

What properties must an algorithm have?

-correctness


-composed of concrete unambiguous steps


-the number of steps must be finite


-must terminate

What is a data structure?

A data structure is a particular way of storing and organising data in a computer so that it can be used efficiently

Random Access Machine

1) Memory consists of an infinite array


2) Instructions executed sequentially one at a time


3) All instructions take unit time. Running time is the number of instructions executed

Arrays

A sequence of elements A₁, A₂, ..., Aₙ is called an array


They're in consecutive memory cells, but we (usually) don't care where exactly.


All array elements are of the same type, e.g., integer

Linked lists

- a list is made up of nodes


- each node stores an element (a piece of data) plus a pointer or "link" to another node


- the first node is called the head


- the last node, called the tail, points to null


- the nodes may be scattered all over the memory
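
A minimal Python sketch of these ideas (the class and method names are my own illustrative choices, not from the source):

```python
class Node:
    def __init__(self, element):
        self.element = element  # the piece of data stored in this node
        self.next = None        # link to the next node; None marks the tail

class SinglyLinkedList:
    def __init__(self):
        self.head = None        # first node of the list

    def prepend(self, element):
        # The new node becomes the head; its link points at the old head.
        node = Node(element)
        node.next = self.head
        self.head = node

    def traverse(self):
        # Follow the links from the head until we fall off the tail.
        current = self.head
        while current is not None:
            yield current.element
            current = current.next
```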

Doubly linked lists

A node in a doubly linked list stores two references:


- a next link, which points to the next node in the list


- a prev link, which points to the previous node in the list


To simplify, we add two dummy or sentinel nodes at the ends of the doubly linked list:


- the header has a valid next reference but a null prev reference


- the trailer has a valid prev reference but a null next reference


A doubly linked list needs to store references to the two sentinel nodes and a size counter keeping track of the number of nodes in the list (not counting sentinels)


An empty list would have the two sentinel nodes pointing to each other
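
The sentinel arrangement can be sketched as follows (a minimal Python illustration; the names are assumptions):

```python
class DNode:
    def __init__(self, element=None):
        self.element = element
        self.prev = None  # link to the previous node
        self.next = None  # link to the next node

class DoublyLinkedList:
    def __init__(self):
        # An empty list: the two sentinels point to each other.
        self.header = DNode()   # valid next reference, null prev
        self.trailer = DNode()  # valid prev reference, null next
        self.header.next = self.trailer
        self.trailer.prev = self.header
        self.size = 0           # node count, not counting sentinels

    def insert_between(self, element, pred, succ):
        # Splice a new node between two existing (possibly sentinel) nodes.
        node = DNode(element)
        node.prev, node.next = pred, succ
        pred.next = node
        succ.prev = node
        self.size += 1
```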

Circularly linked lists

- A circularly linked list has the same kind of nodes as a singly linked list. That is, each node has a next pointer and a reference to an element


- The circularly linked list has no beginning or end. You can think of it as a singly linked list, where the last node's next pointer, instead of being NULL, points back to the first node.


- Instead of references to the head and the tail, we mark a node of the circularly linked list as the cursor. The cursor is the starting node when we traverse the list.

Stacks

- A stack is a collection of objects that are inserted and removed according to the last-in-first-out principle


- Objects can be inserted into a stack at any time, but only the most recently inserted object (the last) can be removed at any time

Stack methods

A stack supports the following methods:


- push(e) : insert element e at the top of the stack


- pop: remove and return the top element of the stack; an error occurs if the stack is empty


And possibly also:


- size : return the number of elements in the stack


- isEmpty : return a Boolean indicating if the stack is empty


- top : return the top element in the stack, without removing it; an error occurs if the stack is empty

Stacks : implementation using arrays

- the array based stack implementation is time efficient. The time taken by all methods does not depend on the size of the stack.


- However, the fixed size N of the array can be a serious limitation:


1) If the size of the stack is much less than the size of the array, we waste memory


2) If the size of the stack exceeds the size of the array, the implementation will generate an exception


- the array-based implementation of the stack has fixed capacity
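
A sketch of this fixed-capacity, array-based stack in Python (a fixed-length list stands in for the array; the exception messages are placeholders for FullStackException-style errors):

```python
class ArrayStack:
    def __init__(self, capacity):
        self.data = [None] * capacity  # fixed-size array of capacity N
        self.t = -1                    # index of the top element; -1 means empty

    def push(self, e):
        if self.t + 1 == len(self.data):
            raise Exception("stack is full")  # the fixed-capacity limitation
        self.t += 1
        self.data[self.t] = e

    def pop(self):
        if self.is_empty():
            raise Exception("stack is empty")
        e = self.data[self.t]
        self.data[self.t] = None
        self.t -= 1
        return e

    def top(self):
        if self.is_empty():
            raise Exception("stack is empty")
        return self.data[self.t]

    def size(self):
        return self.t + 1

    def is_empty(self):
        return self.t == -1
```

Every method performs a constant number of operations, independent of the stack's current size.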

Queues

- A queue is a collection of objects that are inserted and removed according to the first-in-first-out (FIFO) principle


- Element access and deletion are restricted to the first element in the sequence, which is called the front of the queue


- Element insertion is restricted to the end of the sequence, which is called the rear of the queue

Queues: Methods

A queue supports the following methods:


- enqueue(e) : insert element e at the rear of the queue


- dequeue : remove and return from the queue the element at the front; an error occurs if the queue is empty


And possibly also:


- size : Return the number of elements in the queue


- isEmpty : Return a Boolean indicating if the queue is empty


- front : Return the front element of the queue, without removing it; an error occurs if the queue is empty

Queues : Implementation using arrays

- We use two variables f and r, which have the following meaning:


1) f is an index to the cell of Q storing the front of the queue, unless the queue is empty, in which case f = r


2) r is an index to the next available array cell in Q, that is, the cell after the rear of Q, if Q is not empty.


- initially we assign f = r = 0, indicating that the queue is empty


- After each enqueue operation we increment r. After each dequeue operation we increment f


- r is incremented after each enqueue operation and never decremented. After N enqueue operations we would get an array-out-of-bounds error


- To avoid this problem, we let r and f wrap around the end of Q, by using modulo N arithmetic on them.


- If the size of the queue is N, then f = r and the isEmpty method returns true, even though the queue is actually full, not empty.


- We avoid this problem by keeping the maximum number of elements that can be stored in the queue to N − 1. See the FullQueueException in the enqueue algorithm


- the array-based implementation of the queue is time efficient. All methods run in constant time


- Similarly to the array based implementation of the stack, the capacity of array based implementation of the queue is fixed
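
A sketch of the circular, array-based queue with the f and r indices and the modulo-N wrap-around (Python; the plain exception stands in for FullQueueException):

```python
class ArrayQueue:
    def __init__(self, N):
        self.Q = [None] * N  # array of capacity N; at most N-1 elements stored
        self.f = 0           # index of the front element
        self.r = 0           # index of the next available cell (after the rear)

    def is_empty(self):
        return self.f == self.r

    def size(self):
        return (self.r - self.f) % len(self.Q)

    def enqueue(self, e):
        if self.size() == len(self.Q) - 1:
            raise Exception("queue is full")   # cf. FullQueueException
        self.Q[self.r] = e
        self.r = (self.r + 1) % len(self.Q)    # wrap around via modulo N

    def dequeue(self):
        if self.is_empty():
            raise Exception("queue is empty")
        e = self.Q[self.f]
        self.Q[self.f] = None
        self.f = (self.f + 1) % len(self.Q)    # wrap around via modulo N
        return e
```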

Hash Tables

- A hash table consists of a bucket array and hash function


- A bucket array for a hash table is an array A of size N, where each cell of A is thought of as a bucket storing a collection of key-value pairs.


- The size N of the array is called the capacity of the hash table


- A hash function h is a function mapping each key k to an integer in the range [0, N-1], where N is the capacity of the hash table


- The main idea is to use h(k) as an index into the bucket array A. That is, we store the key-value pair (k, v) in the bucket A[h(k)]


- if there are two keys with the same hash value h(k), then two different entries will be mapped to the same bucket of A. In this case, we say that a collision has occurred

Hash functions

Ideal goal : scramble the keys uniformly. More practically:


- hash functions should be efficiently computable


- each table position equally likely for each key


Some compression functions:


- division: take integer mod N


- multiply-add-divide (MAD): y maps to (ay + b) mod N, where a and b are nonnegative integers


There are several ways to deal with collisions


Whichever method we choose to deal with them, a large number of collisions reduces the performance of the hash table


A good hash function minimises the collisions as much as possible.

Separate chaining

Each bucket A[i] stores a list holding the entries (k,v) such that h(k) = i


Separate chaining performance:


- cost is proportional to length of list


- average length = N/M (here N is the number of entries, M is the size of the array)


- Worst case : all keys hash to the same list
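
A minimal separate-chaining table in Python (each bucket is a plain list; Python's built-in hash with division compression stands in for h):

```python
class ChainHashTable:
    def __init__(self, N=11):
        self.buckets = [[] for _ in range(N)]  # bucket array of capacity N

    def _h(self, k):
        return hash(k) % len(self.buckets)     # division compression

    def put(self, k, v):
        bucket = self.buckets[self._h(k)]
        for i, (key, _) in enumerate(bucket):
            if key == k:           # key already present: replace its value
                bucket[i] = (k, v)
                return
        bucket.append((k, v))      # new key (possibly a collision): extend chain

    def get(self, k):
        bucket = self.buckets[self._h(k)]      # cost proportional to chain length
        for key, value in bucket:
            if key == k:
                return value
        raise KeyError(k)
```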

Linear probing

- In linear probing, if we try to insert an entry (k,v) into a bucket A[i] that is already occupied, where i = h(k), then we try next at A[(i+1) mod N]


- If A[(i + 1) mod N] is also occupied, then we try A[(i + 2) mod N], and so on, until we find an empty bucket that can accept the new entry


- The name linear probing comes from the fact that each access to a cell of the bucket array can be viewed as a probe, and the probes examine consecutive cells


Linear probing : Costs

- Insert and search cost depend on length of cluster


- average length of a cluster is N/M


- Worst case : all keys hash to same cluster
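
A sketch of insertion and search under linear probing (Python; it assumes the table always keeps at least one empty cell, so every probe sequence terminates):

```python
def lp_insert(A, k, v, h):
    """Insert (k, v) into bucket array A using linear probing."""
    N = len(A)
    i = h(k) % N
    while A[i] is not None and A[i][0] != k:
        i = (i + 1) % N          # bucket occupied: probe the next cell
    A[i] = (k, v)                # empty bucket found (or key overwritten)

def lp_search(A, k, h):
    N = len(A)
    i = h(k) % N
    while A[i] is not None:      # an empty cell ends the probe sequence
        if A[i][0] == k:
            return A[i][1]
        i = (i + 1) % N
    raise KeyError(k)
```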

Quadratic probing

- Quadratic probing iteratively tries the buckets A[(i + f(j)) mod N] for j = 0, 1, 2, ..., where f(j) = j^2, until finding an empty bucket


- Quadratic probing avoids clustering patterns that occur with linear probing. However, it creates its own kind of clustering, called secondary clustering.


- The quadratic probing strategy may not find an empty bucket in A even if one exists.

Double hashing

- In this approach, we choose a secondary hash function, h', and if h maps some key k to a bucket A[i], with i = h(k), that is already occupied, then we iteratively try the buckets A[(i + f(j)) mod N], for j = 0, 1, 2, ..., where f(j) = j · h'(k)


- A common choice of secondary hash function is h'(k) = q - (k mod q), for some prime number q < N


- The secondary hash function should never evaluate to zero

Open addressing schemes

- linear probing


- Quadratic probing


- Double hashing


Store at most one entry in each bucket

Comparison

- the open addressing schemes are more memory efficient compared to separate chaining


- regarding running times, in experimental and theoretical analyses, the separate chaining method is either competitive or faster than the other methods


- if memory space is not a major issue, the collision-handling method of choice is separate chaining.

Deletions

- Deletions from the hash table must not hinder future searches. If a bucket is simply left empty this will hinder future probes.


- but the bucket should not be left unusable


- to solve these problems we use tombstones : a marker that is left in a bucket after a deletion


- if a tombstone is encountered when searching along a probe sequence to find a key, we know to continue.


- If a tombstone is encountered during insertion, then we should continue probing (to avoid creating duplicates), but then the new record can be placed in the bucket where the tombstone was found


- the use of tombstones lengthens the average probe sequence distance


- two possible remedies:


1) local reorganization : after deleting a key, continue to follow the probe sequence of that key and move records into the vacated bucket (this will not work for all collision resolution policies)


2) Periodically rehash the table by reinserting all records into a new hash table. If you keep a record of which keys are accessed most often, these can be placed where they will be found most easily

Recursive Algorithms

- A recursive algorithm must have a base case


- A recursive Algorithm must change its state and move toward the base case


- A recursive algorithm must call itself, recursively
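
All three requirements are visible in even the smallest example, such as this illustrative factorial:

```python
def factorial(n):
    if n == 0:                    # base case
        return 1
    return n * factorial(n - 1)   # calls itself; n - 1 moves toward the base case
```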

A recursive technique : Backtracking

General idea : build up the solution one step at a time, backtracking when unable to continue
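
One way to make this concrete: generating all permutations of a sequence of distinct elements, extending a partial solution one element at a time and undoing the step when a branch is exhausted (an illustrative sketch, not from the source):

```python
def permutations(elements, partial=None):
    if partial is None:
        partial = []
    if len(partial) == len(elements):  # a complete solution: report it
        yield list(partial)
        return
    for e in elements:
        if e not in partial:           # assumes distinct elements
            partial.append(e)          # build up the solution one step
            yield from permutations(elements, partial)
            partial.pop()              # backtrack: undo the step
```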

Cool log trick

log_a(x) = log_b(x) / log_b(a)
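
For instance, on a calculator that only offers natural logarithms:

log_2(100) = ln(100) / ln(2) ≈ 4.605 / 0.693 ≈ 6.64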

Time complexity

The time complexity of an algorithm can be expressed in terms of the number of basic operations used by the algorithm when the input has a particular size

Examples of basic operations

- additions


- multiplications


- comparisons of two numbers


- swaps


- assignments of values to variables

Space complexity

The space complexity of an algorithm is expressed in terms of the memory required by the algorithm for an input of a particular size

Worst-case time complexity

The worst-case time complexity of an algorithm can be expressed in terms of the largest number of basic operations used by the algorithm when the input has a particular size.


Worst-case analysis tells us how many operations an algorithm requires to guarantee that it will produce a solution.


It is difficult to compute the exact number of operations.


We are more interested in upper bounds for worst-case analysis.


These bounds should give us the possibility to estimate growth of the number of operations when the input size increases.


It is important to estimate the number of operations when that input size is large.

Average-case analysis

Here we are interested in the average number of operations over all inputs of a given size. It is difficult to compute the exact number of operations.

Big-O notation

Let f(x) and g(x) be functions from the set of integers or the set of real numbers to the set of real numbers. We say that f(x) is O(g(x)) if there are constants C and k such that


|f(x)| <= C · |g(x)|


whenever x >= k.


The definition says that after a certain point, namely after k, the absolute value of f(x) is bounded from above by C times the absolute value of g(x).


The constants C and k in the definition are called witnesses to the relationship f(x) is O(g(x))


If there is one pair of witnesses to the relationship f(x) is O(g(x)), then there are infinitely many pairs of witnesses to that relationship

Theorem (the sum rule)

If f1(x) is O(g1(x)) and f2(x) is O(g2(x)), then f1(x) + f2(x) is O(max{|g1(x)|, |g2(x)|}).


Proof


Let k = max{k1, k2} and C = c1 + c2.


Then for x > k we have |f1(x) + f2(x)| <= |f1(x)| + |f2(x)| <= c1 · |g1(x)| + c2 · |g2(x)| <= C · max{|g1(x)|, |g2(x)|}

Theorem (the product rule)

If f1(x) is O(g1(x)) and f2(x) is O(g2(x)), then f1(x) · f2(x) is O(g1(x) · g2(x)).


Proof


Let k = max{k1, k2} and C = c1 · c2.


Then for x > k we have |f1(x) · f2(x)| = |f1(x)| · |f2(x)| <= c1 · |g1(x)| · c2 · |g2(x)| = C · |g1(x) · g2(x)|. This proves the theorem.

Big-Omega notation

We say that f(x) is Ω(g(x)) if there are positive constants C and k such that


|f(x)| >= C · |g(x)| whenever x > k. Note that this implies that f(x) is Ω(g(x)) if and only if g(x) is O(f(x))

Theta

We say that f(x) is Θ(g(x)) if f(x) is O(g(x)) and f(x) is Ω(g(x))


This is equivalent to saying that f(x) is O(g(x)) and g(x) is O(f(x))

Little-o

We say that f(x) is o(g(x)) when


lim_{x→∞} f(x)/g(x) = 0


This clearly shows that f(x) is o(g(x)) implies f(x) is O(g(x))

Sublinear

A function f(x) is called sublinear if f(x) is o(x),


i.e., if lim_{x→∞} f(x)/x = 0

Little omega

ω is to o what Ω is to O; that is, f(x) is ω(g(x)) when lim_{x→∞} f(x)/g(x) = ∞

Insertion Sort

We know:


- when j has a certain value, it inserts the j-th element into the already sorted sequence


- can be proved correct using invariant "after j-th iteration first j+1 elements are in order"


- running time between n − 1 and n(n − 1)/2; worst case O(n^2)
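
A sketch in Python, matching the invariant above (0-indexed, so after iteration j the first j + 1 elements are in order):

```python
def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]                    # insert the j-th element...
        i = j - 1
        while i >= 0 and A[i] > key:  # ...into the already sorted prefix
            A[i + 1] = A[i]           # shift larger elements one slot right
            i -= 1
        A[i + 1] = key
```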

Selection Sort

Invariant: after the i-th iteration, positions 1, ..., i contain the i smallest elements overall, in order


Not necessarily the first i elements

Bubble sort

The i-th iteration of the outer loop "bubbles" the i-th largest element to where it belongs

Merge sort

- if given a sequence of no more than one element then we're done


- otherwise (length > 1), split sequence into two shorter sequences of equal length, sort them recursively, and merge the two resulting sequences


Mergesort always requires approximately n log(n) steps to sort n numbers
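
A sketch of that recursion (Python; it returns a new list rather than sorting in place):

```python
def merge_sort(A):
    if len(A) <= 1:                # no more than one element: done
        return A
    mid = len(A) // 2              # split into two halves...
    left = merge_sort(A[:mid])     # ...sort them recursively...
    right = merge_sort(A[mid:])
    merged, i, j = [], 0, 0        # ...and merge the two sorted results
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```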

Quicksort

At the beginning of each recursive call, quicksort picks one element from the current sequence, the pivot.


Partitioning will be done w.r.t the pivot:


- each element smaller than the pivot goes into the left part


- each element bigger than the pivot goes into the right part


- parts may have very different sizes
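
A compact (not in-place) sketch of the pivot-and-partition idea; real implementations usually partition the array in place:

```python
def quicksort(A):
    if len(A) <= 1:
        return A
    pivot = A[0]                              # pick a pivot element
    low  = [x for x in A[1:] if x < pivot]    # smaller than the pivot: left part
    high = [x for x in A[1:] if x >= pivot]   # bigger (or equal): right part
    return quicksort(low) + [pivot] + quicksort(high)
```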


Ω(n log n) Theorem

For any comparison-based sorting algorithm A and any n ∈ ℕ large enough there exists an input of length n that requires A to perform Ω(n log n) comparisons.

Radix Sort

• have as many buckets as you’ve got different digits, that is, for base-10 you’ll have 10 of them


• repeatedly bucket-sort by given digit


• number of rounds will depend on values (the longer the base-10 representations, the more rounds), but number of buckets only depends on number of different digits


Running time: Θ(d · n), where d is the number of rounds
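
A least-significant-digit radix sort sketch for nonnegative integers (Python; base 10 gives the 10 buckets described above):

```python
def radix_sort(A, base=10):
    if not A:
        return A
    rounds, m = 1, max(A)
    while m >= base:            # number of rounds = number of digits of the maximum
        m //= base
        rounds += 1
    for d in range(rounds):
        buckets = [[] for _ in range(base)]           # one bucket per digit value
        for x in A:
            buckets[(x // base**d) % base].append(x)  # bucket-sort by d-th digit
        A = [x for bucket in buckets for x in bucket] # stable concatenation
    return A
```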

Binary Search

1 Peek right into the middle of the given array, at position p = ⌈n/2⌉


2 If A[p] = x then we’re lucky & done, and can return p


3 Otherwise, if x is greater than A[p], we may focus our search on stuff to the right of A[p] and may completely ignore anything to its left


4 Likewise, if x is smaller than A[p], we may focus our search on stuff to the left of A[p] and may completely ignore anything to its right. This is called binary search because each (unsuccessful) step at least halves the search space


For binary search we have the recurrence T(n) = T(n/2) + O(1), which solves to T(n) = O(log n)
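
An iterative sketch of those steps (Python, on a sorted array A):

```python
def binary_search(A, x):
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        p = (lo + hi) // 2    # peek into the middle of the current range
        if A[p] == x:
            return p          # lucky & done
        elif x > A[p]:
            lo = p + 1        # ignore everything to the left of A[p]
        else:
            hi = p - 1        # ignore everything to the right of A[p]
    return -1                 # x does not occur in A
```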

Quick Select

• recursive


• Partition() function for selecting pivot and partitioning into LOW and HIGH (those smaller/bigger than pivot)


• not two recursive calls as in sorting, but now only one: dive into the part (LOW or HIGH) where we know the i-th smallest element is to be found. We know that after partitioning, the pivot is in its correct position w.r.t. overall sortedness, so we can simply compare the pivot's position after partitioning with the sought index i


→ T(n) = T(n−1) + O(n), which means T(n) = Θ(n^2) (worst case)
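
A sketch of the single-recursion idea (Python; list comprehensions stand in for an in-place Partition() function):

```python
def quickselect(A, i):
    """Return the i-th smallest element of A (1-indexed); assumes 1 <= i <= len(A)."""
    pivot = A[0]
    low  = [x for x in A[1:] if x < pivot]    # LOW: smaller than pivot
    high = [x for x in A[1:] if x >= pivot]   # HIGH: bigger than (or equal to) pivot
    k = len(low) + 1                          # the pivot is the k-th smallest
    if i == k:
        return pivot
    elif i < k:
        return quickselect(low, i)            # dive into LOW only
    else:
        return quickselect(high, i - k)       # dive into HIGH only
```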

Median-of-Medians

0 If length(A) ≤ 5 then sort and return i-th smallest


1 Divide n elements into ⌊n/5⌋ groups of 5 elements each, plus at most one group containing the remaining n mod 5 < 5 elements


2 Find median of each of the ⌈n/5⌉ groups by sorting each one, and then picking median from sorted group elements


3 Call Select recursively on set of ⌈n/5⌉ medians found above, giving the median-of-medians, x


4 Partition entire input around x. Let k be # of elements on low side plus one (simply count after partitioning):


• x is k-th smallest element, and


• there are n−k elements on high side of partition


5 If i = k, return x. Otherwise use Select recursively to find the i-th smallest element of the low side if i < k, or the (i − k)-th smallest on the high side if i > k

Binary search tree

A binary search tree (BST) is a tree in which no node has more than two children (not necessarily exactly two). You must build and maintain the tree such that it’s true for every node v of the tree that:


• all elements in its left sub-tree are “smaller” than v


• all elements in its right sub-tree are “bigger” than v
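
A minimal BST with insertion and lookup maintaining that ordering property (Python; duplicate elements are simply ignored in this sketch):

```python
class BSTNode:
    def __init__(self, element):
        self.element = element
        self.left = None    # sub-tree of "smaller" elements
        self.right = None   # sub-tree of "bigger" elements

def bst_insert(root, element):
    if root is None:
        return BSTNode(element)
    if element < root.element:
        root.left = bst_insert(root.left, element)
    elif element > root.element:
        root.right = bst_insert(root.right, element)
    return root

def bst_lookup(root, element):
    if root is None:
        return None         # "not found"
    if element == root.element:
        return root
    if element < root.element:
        return bst_lookup(root.left, element)
    return bst_lookup(root.right, element)
```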

Tree traversals

• in-order (for binary trees): recurse into left sub-tree, print payload, recurse into right sub-tree


• pre-order: print payload, recurse into sub-trees


• post-order: recurse into sub-trees, print payload
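
The three orders, sketched against the BSTNode class above; on a BST, the in-order traversal prints the elements in sorted order:

```python
def in_order(v):
    if v is not None:
        in_order(v.left)     # recurse into left sub-tree
        print(v.element)     # print payload
        in_order(v.right)    # recurse into right sub-tree

def pre_order(v):
    if v is not None:
        print(v.element)     # payload first
        pre_order(v.left)
        pre_order(v.right)

def post_order(v):
    if v is not None:
        post_order(v.left)   # sub-trees first
        post_order(v.right)
        print(v.element)
```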

Deleting from a BST

We first do a lookup on the element that we wish to remove. If that returns “not found” then we’re done (element not in tree). Otherwise, three cases to be considered:


• If node is a leaf (no child) then simply remove it


• If node has one child (left or right) then remove and replace it with that one child – lift (the one) sub-tree up.


• If node has two children then replace its element with the smallest element in its right sub-tree (its in-order successor), and delete that successor node instead; the successor has at most one child, so one of the two cases above applies

Red-black trees

The idea is simple: a red-black tree is an ordinary BST with one extra bit of storage per node: its colour, red or black. Constraining how nodes can be coloured results in (approximate) balancedness of the tree

A BST is a red-black tree if:

1 every node is either red or black


2 the root is black


3 every leaf (Null) is black


4 red nodes have black children


5 for all nodes, all paths from node to descendant leaves contain same # black nodes


(A red-black tree with n internal nodes has height at most 2 log(n + 1))

Red-Black Tree worst case complexities

1 space: O(n)


2 lookup: O(log n)


3 insert: O(log n)


4 delete: O(log n)

Heaps

Heaps are trees as well, but typically assumed to be stored in a flat array


• each tree node corresponds to an element of the array


• the tree is complete except perhaps the lowest level, filled left-to-right


• heap property: for all nodes v in the tree, v.parent.data >= v.data (assuming we have a node class like for BSTs)


• This is a max-heap (for min-heaps, v.parent.data <= v.data)


1 the root is in A[1]


2 parent(i) = A[i/2] – integer division, rounds down


3 left(i) = A[2i]


4 right(i) = A[2i+1]

Heapify

1 starting at the root, identify largest of current node v and its children


2 suppose largest element is in w


3 if w ≠ v


· swap A[w] and A[v]


· recurse into w (it now contains what v contained previously)
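
A sketch of those steps for a max-heap stored 1-indexed (Python; cell A[0] is left unused so that the parent/child index arithmetic above works):

```python
def heapify(A, v, n):
    """Restore the heap property below index v; n is the heap size."""
    left, right = 2 * v, 2 * v + 1
    w = v                             # identify largest of v and its children
    if left <= n and A[left] > A[w]:
        w = left
    if right <= n and A[right] > A[w]:
        w = right
    if w != v:
        A[w], A[v] = A[v], A[w]       # swap A[w] and A[v]
        heapify(A, w, n)              # recurse into w
```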

Running time for all algorithms studied

SelectionSort O(n^2)


InsertionSort O(n^2)


BubbleSort O(n^2)


MergeSort O(n log n)


QuickSort O(n^2)


BucketSort O(n + K)


RadixSort O(n log K)


HeapSort O(n log n)

Basic properties of MSTs (minimum spanning trees):

· have |V| − 1 edges;


· have no cycles;


· might not be unique.

Kruskal’s Algorithm

1 Sort the edges by weight.


2 Let A = ∅.


3 Consider edges in increasing order of weight. For each edge e, add e to A unless this would create a cycle.


(Running time is O(E log V).)
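
A sketch of Kruskal's algorithm in Python, with a bare-bones union-find for the cycle test (no path compression, for brevity):

```python
def kruskal(vertices, edges):
    """edges is a list of (weight, u, v) triples; returns the MST edge set A."""
    parent = {v: v for v in vertices}

    def find(v):                      # representative of v's component
        while parent[v] != v:
            v = parent[v]
        return v

    A = []
    for w, u, v in sorted(edges):     # consider edges in increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: no cycle is created
            A.append((u, v, w))
            parent[ru] = rv           # merge the two components
    return A
```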

Prim’s Algorithm

1 Let U = {u} where u is some vertex chosen arbitrarily.


2 Let A = ∅.


3 Until U contains all vertices: find the least-weight edge e that joins a vertex v in U to a vertex w not in U, and add e to A and w to U.


(Running time is O(V log V + E).)
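
A sketch of Prim's algorithm using a binary heap for the least-weight-edge step (Python; adj maps each vertex to a list of (weight, neighbour) pairs). Note that a binary heap gives O(E log V); the O(V log V + E) bound quoted above requires a Fibonacci heap:

```python
import heapq

def prim(adj, u):
    U = {u}                                # vertices reached so far
    A = []                                 # edges of the MST
    frontier = [(w, u, v) for w, v in adj[u]]
    heapq.heapify(frontier)
    while frontier and len(U) < len(adj):
        w, x, y = heapq.heappop(frontier)  # least-weight edge leaving U
        if y in U:
            continue                       # both endpoints already in U: skip
        U.add(y)
        A.append((x, y, w))
        for w2, z in adj[y]:
            if z not in U:
                heapq.heappush(frontier, (w2, y, z))
    return A
```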

String Matching Problem

· Given a (long) string T (over some (finite) alphabet Σ),


· and a (short) pattern P over the same alphabet (or subset),


· find one or all occurrences of P in T, if any

Shifts

We say s is


· a valid shift if P occurs with shift s in T,


· and an invalid shift otherwise
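
These definitions translate directly into a naive matcher that tests every possible shift (an O(n · m) sketch, not one of the faster string-matching algorithms):

```python
def valid_shifts(T, P):
    """Return all valid shifts s such that P occurs with shift s in T."""
    n, m = len(T), len(P)
    return [s for s in range(n - m + 1) if T[s:s + m] == P]
```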

Robin Hood Hashing

This is a variation of linear probing. If, during probing with a new key, an existing key is found that is “closer to home” than the new key, then the existing key is displaced and replaced by the new key.

Robin Hood hashing theorems

(Assuming truly random hash functions) the expected worst-case cost of a lookup in a hash table with Robin Hood hashing is Ω(log n).


(Assuming truly random hash functions) the variance of the expected number of probes required in Robin Hood hashing is O(log log n).

Cuckoo Hashing

· With cuckoo hashing there are two tables, T1 and T2, and two corresponding hash functions h1 and h2.


· Each key k is stored in either T1[h1(k)] or T2[h2(k)].


· Lookup or deletion of a key can be done in time O(1), since only two locations need be checked.


· A key k is stored in T1[h1(k)] if that is empty... else it is stored in T2[h2(k)] if that is empty


· If both T1[h1(k)] and T2[h2(k)] are occupied, remove the key k’ that is in T1[h1(k)] and put k in T1[h1(k)],


· and move k’ to T2[h2(k’)]. If T2[h2(k’)] is occupied by another key k’’,


· then move that to T1[h1(k’’)] and so on... each time a key is displaced, move it to the other table.
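
A sketch of that displacement loop (Python; the bound on displacements is my own safeguard, since in practice an insertion that loops too long triggers a rehash with new hash functions):

```python
def cuckoo_insert(T1, T2, h1, h2, k, max_displacements=32):
    tables, hashes = (T1, T2), (h1, h2)
    side = 0                       # start with table T1
    for _ in range(max_displacements):
        T, h = tables[side], hashes[side]
        i = h(k) % len(T)
        if T[i] is None:
            T[i] = k               # empty cell found: done
            return True
        T[i], k = k, T[i]          # kick out the occupant and re-home it...
        side = 1 - side            # ...in the other table
    return False                   # too many displacements: a rehash is needed
```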

The Cuckoo Graph

· The cuckoo graph is a bipartite graph derived from a cuckoo hash table.


· The vertices are the buckets, and for each key k, there is an edge from T1[h1(k)] to T2[h2(k)] (so we can think of keys as being the edges of the graph).


· An insertion into the table can be described by a path through the cuckoo graph.

Cuckoo hashing fails

If any connected component of the cuckoo graph has more than one cycle.