# ECE 250: DSA

## Solving recurrences

The **master method** solves recurrences of the form $T(n)=aT(n/b)+f(n)$, where $a\geq 1$ and $b>1$, by comparing $f(n)$ against $n^{\log_b a}$:

- If $f(n)=\Theta(n^{\log_b a})$, we have $T(n)=\Theta(n^{\log_b a}\log n)$
- If $f(n)=O(n^{\log_b a-\epsilon})$ for some $\epsilon>0$, we have $T(n)=\Theta(n^{\log_b a})$
- If $f(n)=\Omega(n^{\log_b a+\epsilon})$ for some $\epsilon>0$, and $af(n/b)\leq cf(n)$ for some $c<1$ and all sufficiently large $n$, we have $T(n)=\Theta(f(n))$

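As a rough illustration of how the three cases are chosen, the sketch below classifies a recurrence whose driving function is a plain polynomial, $f(n)=n^d$. The names `MasterCase` and `master_case` are made up for this example, and the floating-point tolerance is an arbitrary assumption.

```rust
/// Which case of the master method applies to T(n) = a*T(n/b) + n^d.
#[derive(Debug, PartialEq)]
enum MasterCase {
    /// f(n) grows slower than n^(log_b a): T(n) = Theta(n^(log_b a))
    LeavesDominate,
    /// f(n) matches n^(log_b a): T(n) = Theta(n^d * log n)
    Balanced,
    /// f(n) grows faster than n^(log_b a): T(n) = Theta(n^d)
    RootDominates,
}

/// Classifies the recurrence for a polynomial driving function f(n) = n^d.
/// For polynomial f the regularity condition a*f(n/b) <= c*f(n) holds
/// automatically in the third case, because a / b^d < 1 there.
fn master_case(a: f64, b: f64, d: f64) -> MasterCase {
    let critical = a.log(b); // log_b a
    let tolerance = 1e-9; // guards against floating-point rounding
    if (d - critical).abs() < tolerance {
        MasterCase::Balanced
    } else if d < critical {
        MasterCase::LeavesDominate
    } else {
        MasterCase::RootDominates
    }
}
```

For merge sort, $T(n)=2T(n/2)+n$, the call `master_case(2.0, 2.0, 1.0)` lands in the balanced case, giving $\Theta(n\log n)$.
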
## Heaps

A heap is a binary tree **stored in an array** in which all levels but the lowest are filled, and the lowest level is filled from left to right. In a max-heap, the parent of index $i$ is guaranteed to be greater than or equal to the element at index $i$.

With 1-based indexing:

- the parent of index $i$ is stored at $\lfloor i/2\rfloor$
- the left child of index $i$ is stored at $2i$
- the right child of index $i$ is stored at $2i+1$

In a 0-based array these shift to $\lfloor (i-1)/2\rfloor$, $2i+1$, and $2i+2$ respectively.

<img src="https://upload.wikimedia.org/wikipedia/commons/c/c4/Max-Heap-new.svg" width=600>(Source: Wikimedia Commons)</img>

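The index arithmetic can be sketched for a 0-based Rust slice, where each formula shifts by one relative to the 1-based version. The helper names `parent`, `left`, `right`, and `is_max_heap` are illustrative choices, not part of any library.

```rust
/// Index helpers for a heap stored in a 0-based slice.
fn parent(i: usize) -> usize {
    (i - 1) / 2 // only valid for i > 0; the root has no parent
}

fn left(i: usize) -> usize {
    2 * i + 1
}

fn right(i: usize) -> usize {
    2 * i + 2
}

/// Returns true if every non-root element is <= its parent.
fn is_max_heap(a: &[i32]) -> bool {
    (1..a.len()).all(|i| a[parent(i)] >= a[i])
}
```
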
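The **heapify** operation takes a node whose two subtrees are already valid heaps and sifts it down until the subtree rooted at that node is a valid heap.

```rust
// Sifts a[i] down until the subtree rooted at i is a max-heap.
// Uses 0-based indices; n is the number of elements that belong to the heap.
fn heapify(a: &mut [i32], n: usize, i: usize) {
    let (left, right) = (2 * i + 1, 2 * i + 2);
    let mut largest = i;

    if left < n && a[left] > a[largest] {
        largest = left;
    }
    if right < n && a[right] > a[largest] {
        largest = right;
    }
    if largest != i {
        a.swap(i, largest);
        heapify(a, n, largest);
    }
}
```
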
Repeatedly heapifying an array from the middle back to the beginning converts it to a heap. Everything past the middle is a leaf, which is already a valid one-element heap.

```rust
fn build_heap(a: &mut [i32]) {
    let n = a.len();
    // Indices n/2.. are leaves, so start at the last internal node.
    for i in (0..n / 2).rev() {
        heapify(a, n, i);
    }
}
```

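### Heapsort

Heapsort constructs a heap, then repeatedly swaps the maximum (the root) with the last element of the heap, shrinks the heap by one, and heapifies the new root. Each pass puts the next-largest value in its final position, so the array ends up sorted in ascending order in $O(n\log n)$ time.

```rust
fn heapsort(a: &mut [i32]) {
    build_heap(a);
    // a[..end] is the remaining heap; a[end..] holds the sorted tail.
    for end in (1..a.len()).rev() {
        a.swap(0, end); // move the current maximum to its final position
        heapify(a, end, 0); // NOTE: heapify only looks at the shrunken heap
    }
}
```
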
### Priority queues

A priority queue backed by a heap can remove its highest value in $O(\log n)$ time.

```rust
// Removes and returns the largest element; *n is the current heap size.
fn pop(a: &mut [i32], n: &mut usize) -> Option<i32> {
    if *n == 0 {
        return None; // nothing to remove
    }
    let biggest = a[0];

    a[0] = a[*n - 1]; // move the last element to the root
    *n -= 1;
    heapify(a, *n, 0);
    Some(biggest)
}
```

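```rust
// Inserts key into the heap; a must have room for one more element.
fn insert(a: &mut [i32], n: &mut usize, key: i32) {
    *n += 1;

    let mut i = *n - 1;
    // Shift smaller ancestors down until the slot for key is found.
    while i > 0 && a[(i - 1) / 2] < key {
        a[i] = a[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    a[i] = key;
}
```
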
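For comparison, Rust's standard library already provides a max-heap priority queue, `std::collections::BinaryHeap`. The sketch below pushes values and pops them back in priority order; the wrapper function `drain_by_priority` is a name made up for this example.

```rust
use std::collections::BinaryHeap;

/// Builds a max-priority queue from `items` and drains it,
/// returning the values from highest to lowest.
fn drain_by_priority(items: &[i32]) -> Vec<i32> {
    let mut queue = BinaryHeap::new();
    for &item in items {
        queue.push(item); // O(log n) insertion
    }

    let mut out = Vec::with_capacity(items.len());
    while let Some(top) = queue.pop() {
        // pop() removes the current maximum in O(log n)
        out.push(top);
    }
    out
}
```

Calling `drain_by_priority(&[3, 1, 4, 1, 5])` yields `[5, 4, 3, 1, 1]`.
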
## Sorting algorithms

### Quicksort

Quicksort operates by selecting a **pivot** value and **partitioning** the collection around it, so that everything to the left of the split point is less than or equal to everything to the right of it.

```rust
// Hoare partition with a[left_bound] as the pivot value; both bounds are inclusive.
// Returns j such that everything in left_bound..=j is <= everything after j.
fn partition(a: &mut [i32], left_bound: usize, right_bound: usize) -> usize {
    let pivot = a[left_bound];
    let mut i = left_bound;
    let mut j = right_bound;

    loop {
        while a[i] < pivot { i += 1; }
        while a[j] > pivot { j -= 1; }

        if i >= j { return j; } // new bound!
        a.swap(i, j);
        i += 1;
        j -= 1;
    }
}
```

Sorting calls partitioning with smaller and smaller bounds until the collection is sorted.

```rust
// left and right are inclusive bounds into a.
fn quicksort(a: &mut [i32], left: usize, right: usize) {
    if left < right {
        let pivot = partition(a, left, right);
        quicksort(a, left, pivot);
        quicksort(a, pivot + 1, right);
    }
}
```

- In the best case, if partitioning is even, the time complexity is $T(n)=2T(n/2)+\Theta(n)=\Theta(n\log n)$.
- In the worst case, one side of every partition has only one element, which occurs if the list is already sorted; the time complexity is then $\Theta(n^2)$.

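The worst-case bound can be checked by unrolling the recurrence. This sketch assumes each partitioning pass over $n$ elements costs exactly $cn$ for some constant $c$, which is a simplification for illustration:

$$
\begin{aligned}
T(n) &= T(n-1) + T(1) + cn \\
     &= T(n-2) + 2T(1) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= nT(1) + c\sum_{k=2}^{n} k \\
     &= \Theta(n) + c\left(\frac{n(n+1)}{2}-1\right) = \Theta(n^2)
\end{aligned}
$$
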
### Counting sort

If items are, or are linked to, integers in the range $0..k-1$ (duplicates are allowed), counting sort counts the occurrences of each number, then moves each item to its correct position. Where $n$ is the number of items and $k$ is the size of the counter array, the time complexity is $O(n+k)$.

First, construct a count prefix sum array:

```rust
fn count(a: &[usize], k: usize) -> Vec<usize> {
    let mut counter = vec![0; k];

    // Tally how many times each key occurs.
    for &key in a {
        counter[key] += 1;
    }

    // Prefix sum: counter[key] becomes the number of items <= key.
    for index in 1..k {
        counter[index] += counter[index - 1];
    }
    counter
}
```

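Next, each prefix sum gives the position just past the last slot for its key. Walking the input backwards and decrementing the count places every item in its correct position while keeping equal keys in their original order.

```rust
fn counting_sort(a: &[usize]) -> Vec<usize> {
    let mut counter = count(a, 100); // assumes every key is below 100
    let mut sorted = vec![0; a.len()];

    // Walk backwards so that equal keys keep their relative order.
    for &key in a.iter().rev() {
        counter[key] -= 1;
        sorted[counter[key]] = key;
    }
    sorted
}
```
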
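When items are only linked to a number rather than being numbers themselves, the same two passes apply to whole records. The following is a minimal sketch under the assumption that a caller-supplied closure extracts a key below `k` from each record; `counting_sort_by_key` is a hypothetical name.

```rust
/// Stable counting sort of arbitrary records.
/// `key` must return a value in 0..k for every item.
fn counting_sort_by_key<T: Clone>(items: &[T], k: usize, key: impl Fn(&T) -> usize) -> Vec<T> {
    // counter[x] ends up as the number of items whose key is <= x.
    let mut counter = vec![0usize; k];
    for item in items {
        counter[key(item)] += 1;
    }
    for x in 1..k {
        counter[x] += counter[x - 1];
    }

    // Fill the output from the back so equal keys keep their input order.
    let mut sorted: Vec<Option<T>> = vec![None; items.len()];
    for item in items.iter().rev() {
        let slot = &mut counter[key(item)];
        *slot -= 1;
        sorted[*slot] = Some(item.clone());
    }
    // Every slot has been filled exactly once, so flatten drops nothing.
    sorted.into_iter().flatten().collect()
}
```

Sorting `[(2, "c1"), (0, "a"), (2, "c2"), (1, "b")]` by the first field keeps `"c1"` ahead of `"c2"`, which is the stability property that radix sort relies on.
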
## Graphs

!!! definition
    - A **vertex** is a node.
    - The **degree** of a node is the number of edges connected to it.
    - A **connected graph** is such that there exists a path from any node in the graph to any other node.
    - A **connected component** is a maximal subgraph such that there exists a path from any node in the subgraph to any other node in the subgraph.
    - A **tree** is a connected graph without cycles.

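To make the definitions concrete, the sketch below counts the connected components of an undirected graph with an iterative depth-first search. It assumes the graph is given as an adjacency list over vertices `0..n`; the function name `count_components` is illustrative.

```rust
/// Counts the connected components of an undirected graph.
/// `adj[u]` lists the neighbours of vertex `u`.
fn count_components(adj: &[Vec<usize>]) -> usize {
    let mut visited = vec![false; adj.len()];
    let mut components = 0;

    for start in 0..adj.len() {
        if visited[start] {
            continue;
        }
        // Every unvisited vertex starts a new component; flood it with a DFS.
        components += 1;
        visited[start] = true;
        let mut stack = vec![start];
        while let Some(u) = stack.pop() {
            for &v in &adj[u] {
                if !visited[v] {
                    visited[v] = true;
                    stack.push(v);
                }
            }
        }
    }
    components
}
```

A graph is connected exactly when this count is 1.
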
### Directed acyclic graphs

A directed graph is acyclic (a DAG) if and only if a depth-first search finds no **back edges**: edges from a node to one of its ancestors in the DFS tree.

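One common way to look for back edges is a three-colour DFS, where a grey vertex is one that is still on the current search path. The sketch below assumes an adjacency-list representation; the names `Color`, `finds_back_edge`, and `is_dag` are chosen for this example.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Color {
    White, // not yet visited
    Grey,  // on the current DFS path
    Black, // fully explored
}

/// Returns true if the DFS from `u` finds a back edge.
fn finds_back_edge(adj: &[Vec<usize>], color: &mut [Color], u: usize) -> bool {
    color[u] = Color::Grey;
    for &v in &adj[u] {
        // An edge to a grey vertex points at an ancestor: a back edge.
        if color[v] == Color::Grey {
            return true;
        }
        if color[v] == Color::White && finds_back_edge(adj, color, v) {
            return true;
        }
    }
    color[u] = Color::Black;
    false
}

/// A directed graph is a DAG exactly when no DFS finds a back edge.
fn is_dag(adj: &[Vec<usize>]) -> bool {
    let mut color = vec![Color::White; adj.len()];
    for u in 0..adj.len() {
        if color[u] == Color::White && finds_back_edge(adj, &mut color, u) {
            return false;
        }
    }
    true
}
```

The recursion depth grows with the longest path, so a very deep graph would call for an explicit stack instead.
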
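### Bellman-Ford

The Bellman-Ford algorithm finds single-source shortest paths. It allows for negative edge weights and detects negative cycles reachable from the source, at a cost of $O(VE)$ time.

```rust
// Returns the shortest distances from s, or None if a negative cycle is reachable from s.
// edges holds (u, v, weight) triples over vertices 0..n.
fn bf(n: usize, edges: &[(usize, usize, i64)], s: usize) -> Option<Vec<i64>> {
    const INFINITY: i64 = i64::MAX;
    let mut distance = vec![INFINITY; n];

    distance[s] = 0;

    // Relax every edge |V| - 1 times.
    for _ in 1..n {
        for &(u, v, w) in edges {
            if distance[u] != INFINITY && distance[v] > distance[u] + w {
                distance[v] = distance[u] + w;
            }
        }
    }
    // If any edge can still be relaxed, a negative cycle is reachable from s.
    for &(u, v, w) in edges {
        if distance[u] != INFINITY && distance[v] > distance[u] + w {
            return None;
        }
    }
    Some(distance)
}
```
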
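### Topological sort

A topological sort orders the vertices of a DAG so that every edge goes from an earlier vertex to a later one; it can be computed simply by DFS. Relaxing the outgoing edges of each vertex in topological order then finds the shortest paths from a source in a DAG in $O(V+E)$ time.

```rust
// adj_list[u] holds (v, weight) pairs.
// top_sort is assumed to return the vertices in topological order.
fn shortest_path(adj_list: &[Vec<(usize, i64)>], s: usize) -> Vec<i64> {
    const INFINITY: i64 = i64::MAX;
    let nodes: Vec<usize> = top_sort(adj_list);
    let mut distance = vec![INFINITY; adj_list.len()];
    distance[s] = 0;

    for u in nodes {
        if distance[u] == INFINITY {
            continue; // u is not reachable from s
        }
        for &(v, weight) in &adj_list[u] {
            if distance[v] > distance[u] + weight {
                distance[v] = distance[u] + weight;
            }
        }
    }
    distance
}
```
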
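The ordering step itself can be sketched as a post-order DFS whose finishing order is reversed. This is one possible shape for a `top_sort` helper, assuming a weighted adjacency list of `(v, weight)` pairs and an acyclic input; the helper name `visit` is made up here.

```rust
/// Post-order DFS helper: pushes `u` after everything reachable from it.
fn visit(adj: &[Vec<(usize, i64)>], visited: &mut [bool], order: &mut Vec<usize>, u: usize) {
    visited[u] = true;
    for &(v, _weight) in &adj[u] {
        if !visited[v] {
            visit(adj, visited, order, v);
        }
    }
    order.push(u);
}

/// Returns the vertices of a DAG in topological order
/// (the reverse of the DFS finishing order).
/// The input is assumed to be acyclic; cycles are not detected here.
fn top_sort(adj: &[Vec<(usize, i64)>]) -> Vec<usize> {
    let mut visited = vec![false; adj.len()];
    let mut order = Vec::with_capacity(adj.len());
    for u in 0..adj.len() {
        if !visited[u] {
            visit(adj, &mut visited, &mut order, u);
        }
    }
    order.reverse();
    order
}
```

For the edges 0→1, 0→2, 2→1, 1→3, and 2→3, the only valid order is `[0, 2, 1, 3]`, which is what this sketch returns.
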
### Minimum spanning tree

!!! definition
    - A **cut** $(S, V-S)$ is a partition of vertices into disjoint sets $S$ and $V-S$.
    - An edge $(u,v)\in E$ **crosses the cut** $(S,V-S)$ if the endpoints are on different sides of the cut.
    - A cut **respects** a set of edges $A$ if and only if no edge in $A$ crosses the cut.
    - A **light edge** is a minimum-weight edge among all edges that cross the cut. There can be more than one light edge per cut.

A **spanning tree** of $G$ is a subgraph that is a tree and contains all of its vertices. An MST minimises the sum of the weights of all edges in the spanning tree.

To create an MST:

1. Starting from an empty set, repeatedly add edges of $G$, maintaining that the set is always a subset of an MST (only "safe edges" are added)

A light edge crossing a cut that respects the set is always safe to add.

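The **Prim-Jarnik algorithm** grows a tree one vertex at a time. $A$ is the already computed portion of the tree $T$; every vertex outside $A$ is keyed by the weight of its lightest edge into $A$, or infinity if there is no such edge.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// r is the start vertex; adj[u] holds (v, weight) pairs of an undirected graph.
// Returns each vertex's parent in the MST (None for r and unreachable vertices).
fn create_mst_prim(adj: &[Vec<(usize, u64)>], r: usize) -> Vec<Option<usize>> {
    // clean all vertices
    let mut min_weight = vec![u64::MAX; adj.len()];
    let mut parent = vec![None; adj.len()];
    let mut in_tree = vec![false; adj.len()];

    // BinaryHeap is a max-heap, so Reverse turns it into a min-priority queue.
    let mut q = BinaryHeap::new();
    min_weight[r] = 0;
    q.push(Reverse((0, r)));

    while let Some(Reverse((_, u))) = q.pop() {
        if in_tree[u] {
            continue; // stale entry left behind by an earlier key decrease
        }
        in_tree[u] = true;
        for &(v, weight) in &adj[u] {
            if !in_tree[v] && weight < min_weight[v] {
                min_weight[v] = weight;
                parent[v] = Some(u);
                // BinaryHeap has no "modify_key", so push a fresh entry instead.
                q.push(Reverse((weight, v)));
            }
        }
    }
    parent
}
```
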
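**Kruskal's algorithm** relies on edges instead: it considers the edges in order of increasing weight and adds each edge that joins two different components.

```rust
use std::collections::HashSet;

// Follows parent links to the representative of the set containing x.
fn find_set(parent: &mut [usize], x: usize) -> usize {
    let up = parent[x];
    if up != x {
        parent[x] = find_set(parent, up); // path compression
    }
    parent[x]
}

// edges holds (from, dest, weight) triples over vertices 0..n.
fn create_mst_kruskal(n: usize, mut edges: Vec<(usize, usize, u64)>) -> HashSet<(usize, usize)> {
    let mut a = HashSet::new();
    // Disjoint-set forest: every vertex starts in its own set.
    let mut parent: Vec<usize> = (0..n).collect();

    edges.sort_by_key(|&(_, _, weight)| weight);

    for (from, dest, _) in edges {
        let (x, y) = (find_set(&mut parent, from), find_set(&mut parent, dest));
        if x != y {
            a.insert((from, dest));
            parent[x] = y; // union the two sets
        }
    }
    a
}
```

The time complexity is $O(E\log V)$, dominated by sorting the edges.
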
### All pairs shortest path

The solution can be stored as an adjacency matrix extended such that each entry $(u,v)$ holds the minimum distance from vertex $u$ to vertex $v$.

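One standard way to fill in such a matrix is the Floyd-Warshall algorithm, which is named here as an illustration rather than as the method these notes prescribe. The sketch assumes there are no negative cycles and uses `None` for "no path known yet"; the function name `floyd_warshall` is chosen for this example.

```rust
/// All-pairs shortest paths by Floyd-Warshall.
/// On input, `dist[u][v]` is the weight of edge (u, v), `None` if there is
/// no such edge, and `Some(0)` on the diagonal.
fn floyd_warshall(mut dist: Vec<Vec<Option<i64>>>) -> Vec<Vec<Option<i64>>> {
    let n = dist.len();
    for k in 0..n {
        for u in 0..n {
            for v in 0..n {
                // Try routing u -> v through the intermediate vertex k.
                if let (Some(uk), Some(kv)) = (dist[u][k], dist[k][v]) {
                    let through_k = uk + kv;
                    if dist[u][v].map_or(true, |direct| through_k < direct) {
                        dist[u][v] = Some(through_k);
                    }
                }
            }
        }
    }
    dist
}
```

The three nested loops give $O(V^3)$ time. With edges 0→1 of weight 4, 1→2 of weight 1, and 0→2 of weight 7, the entry for $(0,2)$ drops from 7 to 5.
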