Data Structures and Algorithms Guide

This guide provides a structured overview of common data structures and algorithms, their performance characteristics, and primary use cases.

Algorithmic Analysis (Big O)
Linear Data Structures
Non-Linear Data Structures
Sorting Algorithms
Searching Algorithms
Graph Algorithms
Algorithmic Techniques

Algorithmic Analysis (Big O Notation)

Big O notation is used to describe the performance or complexity of an algorithm. It characterizes the worst-case scenario, focusing on how the execution time or space requirements grow as the input size (n) increases.

Notation	Name	Description & Example
`O(1)`	Constant	Always takes the same amount of time, regardless of input size. (e.g., Array lookup by index)
`O(log n)`	Logarithmic	Time increases logarithmically as input size grows. Efficient. (e.g., Binary Search)
`O(n)`	Linear	Time is directly proportional to the input size. (e.g., Iterating through a list)
`O(n log n)`	Log-Linear	A very common complexity for efficient sorting algorithms. (e.g., Merge Sort, Heap Sort)
`O(n^2)`	Quadratic	Time is proportional to the square of the input size. (e.g., Nested loops, Bubble Sort)
`O(2^n)`	Exponential	Time doubles with each addition to the input. Very slow. (e.g., Recursive Fibonacci)
`O(n!)`	Factorial	Extremely slow; grows factorially. (e.g., Traveling Salesman Problem via brute force)

Linear Data Structures

Structures that arrange data in a sequential manner.

Array

Description: A collection of items stored in a contiguous block of memory, accessible by an index.
Characteristics: Fixed size, elements are stored sequentially.
Performance:
- Access: O(1)
- Search: O(n)
- Insertion: O(n) (requires shifting elements)
- Deletion: O(n) (requires shifting elements)
Pros: Fast, predictable random access.
Cons: Inefficient insertion/deletion, fixed size.

Dynamic Array (e.g., `ArrayList` in Java, `list` in Python)

Description: An array that automatically resizes to accommodate new elements.
Characteristics: When the array becomes full, it creates a new, larger array (often double the size) and copies all elements over.
Performance:
- Access: O(1)
- Search: O(n)
- Insertion (at end): Amortized O(1). Worst-case O(n) if a resize is triggered.
- Deletion (at end): O(1)

Note: The O(n) copy operation during a resize is expensive, but because it happens infrequently, the average (amortized) cost of an append is still O(1).

Linked List

Description: A linear collection of nodes, where each node contains data and a pointer to the next node.
Characteristics: Sequential access only; the head is the entry point.
Performance:
- Access: O(n)
- Search: O(n)
- Insertion (at ends): O(1)
- Deletion (at ends): O(1)
Pros: Efficient insertions/deletions at ends, flexible size.
Cons: Slow lookups, requires more memory than an array due to pointers.

Stack

Description: A Last-In, First-Out (LIFO) data structure.
Core Operations: All O(1)
- push: Add an element to the top.
- pop: Remove and return the top element.
- peek: View the top element without removing it.
Use Cases: Function call stack, undo mechanisms, Depth-First Search (DFS).

Queue

Description: A First-In, First-Out (FIFO) data structure.
Core Operations: All O(1)
- enqueue: Add an element to the back.
- dequeue: Remove and return the front element.
- peek: View the front element without removing it.
Use Cases: Task scheduling, message processing, Breadth-First Search (BFS).

Non-Linear Data Structures

Structures where data is not arranged sequentially.

Hash Table (or Hash Map)

Description: A structure that maps keys to values for highly efficient lookups.
Implementation: Uses an array and a hashing function to compute an index for each key.
Collision Handling: When two keys hash to the same index, a strategy like chaining (using a linked list at the index) is used.
Performance:
- Access/Search/Insert/Delete (Average): O(1)
- Access/Search/Insert/Delete (Worst): O(n) (if all keys cause collisions)
Pros: Extremely fast lookups.
Cons: Unordered, not cache-friendly.

Tree

Description: A hierarchical data structure with a root node and child nodes connected by edges.
Terminology:
- Leaf: A node with no children.
- Depth: Length of the path from the root to a node.
- Height: Length of the longest path from the root to a leaf.

Common Tree Types

Binary Tree: Each node has at most two children (left and right).
- Full: Each node has exactly zero or two children.
- Complete: All levels are filled except possibly the last, which is filled from left to right. (Heaps are complete trees).
- Balanced: The heights of the left and right subtrees of any node differ by at most one.
Binary Search Tree (BST): A binary tree with a specific ordering: for any node, all values in its left subtree are smaller, and all values in its right subtree are larger.
- Performance: O(log n) for all operations if balanced, but O(n) if unbalanced (degenerates into a linked list).
Self-Balancing BSTs (AVL Tree, Red-Black Tree): These trees automatically perform rotations during insertions and deletions to maintain balance, guaranteeing O(log n) performance.
Trie (Prefix Tree): A specialized tree for storing strings, enabling fast prefix-based searches.

Heap

Description: A specialized complete binary tree where parent nodes have a specific relationship to their children.
Types:
- Min-Heap: The value of each parent node is less than or equal to the value of its children. The root is the minimum element.
- Max-Heap: The value of each parent node is greater than or equal to the value of its children. The root is the maximum element.
Performance:
- Insertion: O(log n)
- Deletion (of root): O(log n)
- Peek (at root): O(1)
Use Case: Implementing Priority Queues.

Graph

Description: A collection of nodes (vertices) connected by edges.
Representations:
- Adjacency List: A map from each vertex to a list of its neighbors. Most common.
- Adjacency Matrix: A V x V matrix where M[i][j] = 1 if an edge exists.
Types: Directed/Undirected, Weighted/Unweighted, Cyclic/Acyclic.

Sorting Algorithms

Algorithm	Time (Best)	Time (Avg)	Time (Worst)	Space	Stable	Notes
Bubble Sort	`O(n)`	`O(n^2)`	`O(n^2)`	`O(1)`	Yes	Simple but very inefficient.
Selection Sort	`O(n^2)`	`O(n^2)`	`O(n^2)`	`O(1)`	No	Simple, but inefficient.
Insertion Sort	`O(n)`	`O(n^2)`	`O(n^2)`	`O(1)`	Yes	Efficient for small or nearly sorted lists.
Merge Sort	`O(n log n)`	`O(n log n)`	`O(n log n)`	`O(n)`	Yes	Reliable, efficient, but requires extra space.
Heap Sort	`O(n log n)`	`O(n log n)`	`O(n log n)`	`O(1)`	No	In-place version of selection sort using a heap.
Quick Sort	`O(n log n)`	`O(n log n)`	`O(n^2)`	`O(log n)`	No	Very fast on average, but has a bad worst case.

Searching Algorithms

Linear Search

Description: Checks each element in a list sequentially until the target is found.
Performance: O(n)
Requirement: None.

Binary Search

Description: Finds an element by repeatedly dividing the search interval in half.
Performance: O(log n)
Requirement: The array must be sorted.

Graph Algorithms

Breadth-First Search (BFS)

Description: Explores all neighbors of a node at the present depth before moving on to nodes at the next depth level.
Data Structure: Uses a Queue.
Use Case: Finding the shortest path in an unweighted graph.

Depth-First Search (DFS)

Description: Explores as far as possible down one branch before backtracking.
Data Structure: Uses a Stack (often the implicit call stack via recursion).
Use Case: Detecting cycles, topological sorting, finding connected components.

Dijkstra’s Algorithm

Description: Finds the shortest path from a single source node to all other nodes in a weighted graph with non-negative edge weights.

Minimum Spanning Tree (MST)

Description: A subset of the edges of a connected, weighted graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.
Common Algorithms: Prim’s, Kruskal’s.

Topological Sort

Description: A linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every directed edge from vertex u to vertex v, u comes before v in the ordering.

Algorithmic Techniques

Dynamic Programming (DP)

Description: A method for solving complex problems by breaking them down into simpler, overlapping subproblems. It solves each subproblem only once and stores the results to avoid redundant computation.
Key Idea: Overlapping subproblems and optimal substructure.

Greedy Algorithms

Description: An approach that makes the locally optimal choice at each stage with the hope of finding a global optimum.

Memoization

Description: An optimization technique where the results of expensive function calls are cached and returned when the same inputs occur again. It is the top-down approach to implementing Dynamic Programming.

Bitwise Operations

Description: Operations that manipulate individual bits of a number. They are extremely fast and can be used for clever and efficient solutions to certain problems (e.g., checking for odd/even, swapping numbers, managing sets with bitmasks).