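# ECE 250: DSA

## Heaps

A heap is a binary tree **stored in an array** in which all levels but the lowest are filled, and the lowest is filled from left to right. In a max-heap, it is guaranteed that the parent of index $i$ is greater than or equal to the element at index $i$. With 1-based indexing:

- the parent of index $i$ is stored at $\lfloor i/2\rfloor$
- the left child of index $i$ is stored at $2i$
- the right child of index $i$ is stored at $2i+1$

(Source: Wikimedia Commons)

The **heapify** operation takes a node whose two subtrees are already valid heaps and sifts it down until the subtree rooted at that node is a valid heap.

```rust
// 1-based indexing: a[0] is unused and the heap occupies a[1..=n].
fn heapify(a: &mut [i32], n: usize, i: usize) {
    let (left, right) = (2 * i, 2 * i + 1);
    let mut largest = i;
    if left <= n && a[left] > a[largest] {
        largest = left;
    }
    if right <= n && a[right] > a[largest] {
        largest = right;
    }
    if largest != i {
        a.swap(i, largest);
        heapify(a, n, largest); // the value moved down may still be out of place
    }
}
```

Repeatedly heapifying an array from middle to beginning converts it to a heap.

```rust
fn build_heap(a: &mut [i32], n: usize) {
    // nodes after n/2 are leaves, which are already valid heaps
    for i in (1..=n / 2).rev() {
        heapify(a, n, i);
    }
}
```

### Heapsort

Heapsort constructs a heap, then repeatedly swaps the maximum (the root) with the last element of the heap, shrinks the heap by one, and heapifies the root again. Each of the $n$ iterations costs $O(\log n)$, so the sort takes $O(n\log n)$ time.

```rust
fn heapsort(a: &mut [i32]) {
    let n = a.len().saturating_sub(1); // a[0] is unused
    build_heap(a, n);
    for i in (2..=n).rev() {
        a.swap(1, i); // move the current maximum into its final position
        heapify(a, i - 1, 1); // NOTE: the heap now only occupies a[1..=i-1]
    }
}
```

### Priority queues

A priority queue is a heap used so that the highest value can be removed, and a new value inserted, in $O(\log n)$ time.

```rust
fn pop(a: &mut [i32], n: &mut usize) -> Option<i32> {
    if *n == 0 {
        return None;
    }
    let biggest = a[1];
    a[1] = a[*n]; // move the last element to the root
    *n -= 1;
    heapify(a, *n, 1);
    Some(biggest)
}
```

```rust
fn parent(i: usize) -> usize {
    i / 2
}

// assumes a has room for one more element, i.e. a.len() > *n + 1
fn insert(a: &mut [i32], n: &mut usize, key: i32) {
    *n += 1;
    let mut i = *n;
    // shift smaller ancestors down until the key's slot is found
    while i > 1 && a[parent(i)] < key {
        a[i] = a[parent(i)];
        i = parent(i);
    }
    a[i] = key;
}
```

## Sorting algorithms

### Quicksort

Quicksort operates by selecting a **pivot** and rearranging the array around it so that everything to the left of the returned boundary is less than or equal to everything to its right. This rearrangement is what partitioning does.
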
```rust
// Hoare-style partition of a[left..=right] using a[left] as the pivot.
fn partition(a: &mut [i32], left: usize, right: usize) -> usize {
    let pivot = a[left];
    let mut i = left;
    let mut j = right;
    loop {
        while a[j] > pivot {
            j -= 1;
        }
        while a[i] < pivot {
            i += 1;
        }
        if i >= j {
            return j; // new bound!
        }
        a.swap(i, j);
        i += 1;
        j -= 1;
    }
}
```

Sorting calls partitioning with smaller and smaller bounds until the collection is sorted.

```rust
fn sort(a: &mut [i32], left: usize, right: usize) {
    if left < right {
        let pivot = partition(a, left, right);
        sort(a, left, pivot);
        sort(a, pivot + 1, right);
    }
}
```

- In the best case, if partitioning is even, the time complexity is $T(n)=2T(n/2)+\Theta(n)=\Theta(n\log n)$.
- In the worst case, if one side only has one element, which occurs if the list is already sorted, the time complexity is $T(n)=T(n-1)+\Theta(n)=\Theta(n^2)$.

### Counting sort

If items are, or are keyed by, integers in a known range of size $k$ (duplicates are allowed), counting sort counts the occurrences of each key, then moves each item to its correct position. Where $k$ is the size of the counter array, the time complexity is $O(n+k)$.

First, construct a count prefix sum array:

```rust
fn count(a: &[usize], k: usize) -> Vec<usize> {
    let mut counter = vec![0; k];
    for &key in a {
        counter[key] += 1;
    }
    // prefix sum: counter[key] becomes the number of items <= key
    for key in 1..k {
        counter[key] += counter[key - 1];
    }
    counter
}
```

Next, the prefix sum gives the position just past the last slot for each key, so each item is placed by decrementing its key's counter.

```rust
fn sort(a: &[usize]) -> Vec<usize> {
    let mut counter = count(a, 100); // assumes every key is below 100
    let mut sorted = vec![0; a.len()];
    // walk backwards so that equal keys keep their relative order
    for &key in a.iter().rev() {
        counter[key] -= 1;
        sorted[counter[key]] = key;
    }
    sorted
}
```

## Graphs

!!! definition
    - A **vertex** is a node.
    - The **degree** of a node is the number of edges connected to it.
    - A **connected graph** is one in which there exists a path from any node in the graph to any other node.
    - A **connected component** is a maximal subgraph such that there exists a path from any node in the subgraph to any other node in the subgraph.
    - A **tree** is a connected graph without cycles.

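
The definitions above can be checked mechanically on a small graph. The following is a minimal sketch, not part of the original notes: it assumes a simple undirected graph (no self-loops or parallel edges) stored as an adjacency list with vertices numbered from 0, and the function names `degrees`, `connected_components`, `is_connected` and `is_tree` are hypothetical. The tree check relies on the fact that a connected graph on $n$ vertices is acyclic exactly when it has $n-1$ edges.

```rust
/// Degree of each vertex: the number of edges connected to it.
fn degrees(adj: &[Vec<usize>]) -> Vec<usize> {
    adj.iter().map(|neighbours| neighbours.len()).collect()
}

/// Labels every vertex with the index of its connected component.
fn connected_components(adj: &[Vec<usize>]) -> Vec<usize> {
    let mut component = vec![usize::MAX; adj.len()]; // usize::MAX marks "unvisited"
    let mut count = 0;
    for start in 0..adj.len() {
        if component[start] != usize::MAX {
            continue;
        }
        // an iterative DFS from an unvisited vertex claims one whole component
        let mut stack = vec![start];
        component[start] = count;
        while let Some(u) = stack.pop() {
            for &v in &adj[u] {
                if component[v] == usize::MAX {
                    component[v] = count;
                    stack.push(v);
                }
            }
        }
        count += 1;
    }
    component
}

/// A graph is connected when every vertex lands in component 0.
fn is_connected(adj: &[Vec<usize>]) -> bool {
    connected_components(adj).iter().all(|&c| c == 0)
}

/// A tree is a connected graph with exactly n - 1 edges.
fn is_tree(adj: &[Vec<usize>]) -> bool {
    let edges = degrees(adj).iter().sum::<usize>() / 2; // each edge is counted twice
    !adj.is_empty() && is_connected(adj) && edges == adj.len() - 1
}
```

For the path 0 - 1 - 2 plus an isolated vertex 3, `degrees` gives `[1, 2, 1, 0]` and `connected_components` gives `[0, 0, 0, 1]`, so the graph is neither connected nor a tree; dropping vertex 3 leaves a tree.
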
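### Directed acyclic graphs

A directed graph is acyclic if and only if a depth-first search finds no **back edges**: edges from a node to one of its ancestors.

### Bellman-Ford

The Bellman-Ford algorithm finds single-source shortest paths, allows for negative edges, and detects negative cycles reachable from the source. It runs in $O(VE)$ time.

```rust
const INFINITY: i64 = i64::MAX;

// Assumes vertices are numbered 0..n and each edge is a (from, to, weight) triple.
// Returns None if a negative cycle is reachable from s.
fn bf(n: usize, edges: &[(usize, usize, i64)], s: usize) -> Option<Vec<i64>> {
    let mut distance = vec![INFINITY; n];
    distance[s] = 0;
    // a shortest path has at most n - 1 edges, so n - 1 rounds suffice
    for _ in 1..n {
        for &(u, v, w) in edges {
            if distance[u] != INFINITY && distance[v] > distance[u] + w {
                distance[v] = distance[u] + w;
            }
        }
    }
    // if anything can still be improved, there is a negative cycle
    for &(u, v, w) in edges {
        if distance[u] != INFINITY && distance[v] > distance[u] + w {
            return None;
        }
    }
    Some(distance)
}
```

### Topological sort

A topological sort orders the vertices of a DAG so that every edge points from an earlier vertex to a later one, and it can be computed with DFS. Relaxing the outgoing edges of each vertex in that order finds the shortest paths from a source in a DAG in $O(V+E)$ time.

```rust
// Assumes adj_list[v] lists (neighbour, weight) pairs and that order is a
// topological order of the vertices, e.g. the reverse of DFS finishing order.
fn shortest_path(adj_list: &[Vec<(usize, i64)>], order: &[usize], s: usize) -> Vec<i64> {
    let mut distance = vec![i64::MAX; adj_list.len()]; // i64::MAX stands in for infinity
    distance[s] = 0;
    for &v in order {
        if distance[v] == i64::MAX {
            continue; // v is not reachable from s
        }
        for &(adjacent, weight) in &adj_list[v] {
            if distance[adjacent] > distance[v] + weight {
                distance[adjacent] = distance[v] + weight;
            }
        }
    }
    distance
}
```

### Minimum spanning tree

!!! definition
    - A **cut** $(S, V-S)$ is a partition of vertices into disjoint sets $S$ and $V-S$.
    - An edge $(u,v)\in E$ **crosses the cut** $(S,V-S)$ if its endpoints are on different sides of the cut.
    - A cut **respects** a set of edges $A$ if and only if no edge in $A$ crosses the cut.
    - A **light edge** is a minimum-weight edge among all edges that cross the cut. There can be more than one light edge per cut.

A **spanning tree** of $G$ is a subgraph that is a tree and contains all of its vertices. An MST minimises the sum of all edge weights in the spanning tree.

To create an MST:

1. Start with an empty set of edges.
2. Repeatedly add an edge to the set, maintaining that the set is always a subset of an MST (only "safe edges" are added), until the set spans all vertices.

The **Prim-Jarnik algorithm** grows a tree one vertex at a time. $A$ is the already computed portion of $T$, and each vertex outside $A$ is keyed by the weight of its lightest edge into $A$, or infinity if there is no such edge.
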
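```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Assumes adj[u] lists (neighbour, weight) pairs of a connected undirected graph.
// r is the start vertex; returns each vertex's parent in the MST.
fn create_mst_prim(adj: &[Vec<(usize, u32)>], r: usize) -> Vec<Option<usize>> {
    // clean all vertices
    let mut min_weight = vec![u32::MAX; adj.len()];
    let mut parent = vec![None; adj.len()];
    let mut in_tree = vec![false; adj.len()];
    min_weight[r] = 0;
    // std's BinaryHeap is a max-heap with no modify-key operation, so wrap
    // entries in Reverse and push a fresh entry whenever a key decreases
    let mut q = BinaryHeap::from([Reverse((0u32, r))]); // priority queue
    while let Some(Reverse((_, u))) = q.pop() {
        if in_tree[u] {
            continue; // stale entry for a vertex that was already added
        }
        in_tree[u] = true;
        for &(v, weight) in &adj[u] {
            if !in_tree[v] && weight < min_weight[v] {
                min_weight[v] = weight;
                parent[v] = Some(u);
                q.push(Reverse((weight, v)));
            }
        }
    }
    parent
}
```

**Kruskal's algorithm** relies on edges instead: it considers them in increasing order of weight and keeps every edge that joins two different components.

```rust
use std::collections::HashSet;

// Assumes vertices are numbered 0..n and each edge is a (weight, from, dest) triple.
fn create_mst_kruskal(n: usize, mut edges: Vec<(u32, usize, usize)>) -> HashSet<(usize, usize)> {
    // disjoint sets: set[v] is the parent of v, and a root is its own parent
    fn find(set: &mut [usize], v: usize) -> usize {
        let p = set[v];
        if p != v {
            let root = find(set, p);
            set[v] = root; // path compression
        }
        set[v]
    }
    let mut a = HashSet::new();
    let mut set: Vec<usize> = (0..n).collect(); // every vertex starts in its own set
    edges.sort(); // tuples compare by their first field, the weight, first
    for (_, from, dest) in edges {
        let (x, y) = (find(&mut set, from), find(&mut set, dest));
        if x != y {
            a.insert((from, dest));
            set[x] = y; // union the two sets
        }
    }
    a
}
```

The time complexity is $O(E\log V)$, dominated by sorting the edges.

### All pairs shortest path

The result is a matrix, like an adjacency matrix, extended such that each entry $(i,j)$ represents the minimum distance from vertex $i$ to vertex $j$.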
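
One standard way to fill in such a matrix is the Floyd-Warshall algorithm. The notes do not name an algorithm here, so this choice is an assumption, and the sketch below is only an illustration. It assumes vertices numbered from 0, an input weight matrix with 0 on the diagonal and a sentinel where there is no edge, and no negative cycles. The names `INF` and `all_pairs_shortest_path` are hypothetical.

```rust
/// Stand-in for "no path"; halved so that adding two entries cannot overflow.
const INF: i64 = i64::MAX / 2;

/// Floyd-Warshall: takes an n x n weight matrix and returns the matrix of
/// minimum distances between every pair of vertices in O(n^3) time.
fn all_pairs_shortest_path(weights: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let n = weights.len();
    let mut dist = weights.to_vec();
    for k in 0..n {
        // after this round, dist[i][j] may use vertices 0..=k as intermediate stops
        for i in 0..n {
            if dist[i][k] >= INF {
                continue; // i cannot reach k, so k cannot help row i
            }
            for j in 0..n {
                if dist[k][j] < INF && dist[i][k] + dist[k][j] < dist[i][j] {
                    dist[i][j] = dist[i][k] + dist[k][j];
                }
            }
        }
    }
    dist
}
```

For a three-vertex graph with edges 0→1 (weight 3), 1→2 (weight -2) and 2→0 (weight 4), the first row of the result is `[0, 3, 1]`: the best route from vertex 0 to vertex 2 goes through vertex 1.
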