Skip to content

Data Structures and Algorithms Guide

This guide provides a structured overview of common data structures and algorithms, their performance characteristics, and primary use cases.

Table of Contents

  1. Algorithmic Analysis (Big O)
  2. Linear Data Structures
  3. Non-Linear Data Structures
  4. Sorting Algorithms
  5. Searching Algorithms
  6. Graph Algorithms
  7. Algorithmic Techniques

Algorithmic Analysis (Big O Notation)

Big O notation is used to describe the performance or complexity of an algorithm. It characterizes the worst-case scenario, focusing on how the execution time or space requirements grow as the input size (n) increases.

NotationNameDescription & Example
O(1)ConstantAlways takes the same amount of time, regardless of input size. (e.g., Array lookup by index)
O(log n)LogarithmicTime increases logarithmically as input size grows. Efficient. (e.g., Binary Search)
O(n)LinearTime is directly proportional to the input size. (e.g., Iterating through a list)
O(n log n)Log-LinearA very common complexity for efficient sorting algorithms. (e.g., Merge Sort, Heap Sort)
O(n^2)QuadraticTime is proportional to the square of the input size. (e.g., Nested loops, Bubble Sort)
O(2^n)ExponentialTime doubles with each addition to the input. Very slow. (e.g., Recursive Fibonacci)
O(n!)FactorialExtremely slow; grows factorially. (e.g., Traveling Salesman Problem via brute force)

Linear Data Structures

Structures that arrange data in a sequential manner.

Array

  • Description: A collection of items stored in a contiguous block of memory, accessible by an index.
  • Characteristics: Fixed size, elements are stored sequentially.
  • Performance:
    • Access: O(1)
    • Search: O(n)
    • Insertion: O(n) (requires shifting elements)
    • Deletion: O(n) (requires shifting elements)
  • Pros: Fast, predictable random access.
  • Cons: Inefficient insertion/deletion, fixed size.

Dynamic Array (e.g., ArrayList in Java, list in Python)

  • Description: An array that automatically resizes to accommodate new elements.
  • Characteristics: When the array becomes full, it creates a new, larger array (often double the size) and copies all elements over.
  • Performance:
    • Access: O(1)
    • Search: O(n)
    • Insertion (at end): Amortized O(1). Worst-case O(n) if a resize is triggered.
    • Deletion (at end): O(1)

Note: The O(n) copy operation during a resize is expensive, but because it happens infrequently, the average (amortized) cost of an append is still O(1).

Linked List

  • Description: A linear collection of nodes, where each node contains data and a pointer to the next node.
  • Characteristics: Sequential access only; the head is the entry point.
  • Performance:
    • Access: O(n)
    • Search: O(n)
    • Insertion (at ends): O(1)
    • Deletion (at ends): O(1)
  • Pros: Efficient insertions/deletions at ends, flexible size.
  • Cons: Slow lookups, requires more memory than an array due to pointers.

Stack

  • Description: A Last-In, First-Out (LIFO) data structure.
  • Core Operations: All O(1)
    • push: Add an element to the top.
    • pop: Remove and return the top element.
    • peek: View the top element without removing it.
  • Use Cases: Function call stack, undo mechanisms, Depth-First Search (DFS).

Queue

  • Description: A First-In, First-Out (FIFO) data structure.
  • Core Operations: All O(1)
    • enqueue: Add an element to the back.
    • dequeue: Remove and return the front element.
    • peek: View the front element without removing it.
  • Use Cases: Task scheduling, message processing, Breadth-First Search (BFS).

Non-Linear Data Structures

Structures where data is not arranged sequentially.

Hash Table (or Hash Map)

  • Description: A structure that maps keys to values for highly efficient lookups.
  • Implementation: Uses an array and a hashing function to compute an index for each key.
  • Collision Handling: When two keys hash to the same index, a strategy like chaining (using a linked list at the index) is used.
  • Performance:
    • Access/Search/Insert/Delete (Average): O(1)
    • Access/Search/Insert/Delete (Worst): O(n) (if all keys cause collisions)
  • Pros: Extremely fast lookups.
  • Cons: Unordered, not cache-friendly.

Tree

  • Description: A hierarchical data structure with a root node and child nodes connected by edges.
  • Terminology:
    • Leaf: A node with no children.
    • Depth: Length of the path from the root to a node.
    • Height: Length of the longest path from the root to a leaf.

Common Tree Types

  • Binary Tree: Each node has at most two children (left and right).
    • Full: Each node has exactly zero or two children.
    • Complete: All levels are filled except possibly the last, which is filled from left to right. (Heaps are complete trees).
    • Balanced: The heights of the left and right subtrees of any node differ by at most one.
  • Binary Search Tree (BST): A binary tree with a specific ordering: for any node, all values in its left subtree are smaller, and all values in its right subtree are larger.
    • Performance: O(log n) for all operations if balanced, but O(n) if unbalanced (degenerates into a linked list).
  • Self-Balancing BSTs (AVL Tree, Red-Black Tree): These trees automatically perform rotations during insertions and deletions to maintain balance, guaranteeing O(log n) performance.
  • Trie (Prefix Tree): A specialized tree for storing strings, enabling fast prefix-based searches.

Heap

  • Description: A specialized complete binary tree where parent nodes have a specific relationship to their children.
  • Types:
    • Min-Heap: The value of each parent node is less than or equal to the value of its children. The root is the minimum element.
    • Max-Heap: The value of each parent node is greater than or equal to the value of its children. The root is the maximum element.
  • Performance:
    • Insertion: O(log n)
    • Deletion (of root): O(log n)
    • Peek (at root): O(1)
  • Use Case: Implementing Priority Queues.

Graph

  • Description: A collection of nodes (vertices) connected by edges.
  • Representations:
    • Adjacency List: A map from each vertex to a list of its neighbors. Most common.
    • Adjacency Matrix: A V x V matrix where M[i][j] = 1 if an edge exists.
  • Types: Directed/Undirected, Weighted/Unweighted, Cyclic/Acyclic.

Sorting Algorithms

AlgorithmTime (Best)Time (Avg)Time (Worst)SpaceStableNotes
Bubble SortO(n)O(n^2)O(n^2)O(1)YesSimple but very inefficient.
Selection SortO(n^2)O(n^2)O(n^2)O(1)NoSimple, but inefficient.
Insertion SortO(n)O(n^2)O(n^2)O(1)YesEfficient for small or nearly sorted lists.
Merge SortO(n log n)O(n log n)O(n log n)O(n)YesReliable, efficient, but requires extra space.
Heap SortO(n log n)O(n log n)O(n log n)O(1)NoIn-place version of selection sort using a heap.
Quick SortO(n log n)O(n log n)O(n^2)O(log n)NoVery fast on average, but has a bad worst case.

Searching Algorithms

  • Description: Checks each element in a list sequentially until the target is found.
  • Performance: O(n)
  • Requirement: None.
  • Description: Finds an element by repeatedly dividing the search interval in half.
  • Performance: O(log n)
  • Requirement: The array must be sorted.

Graph Algorithms

Breadth-First Search (BFS)

  • Description: Explores all neighbors of a node at the present depth before moving on to nodes at the next depth level.
  • Data Structure: Uses a Queue.
  • Use Case: Finding the shortest path in an unweighted graph.

Depth-First Search (DFS)

  • Description: Explores as far as possible down one branch before backtracking.
  • Data Structure: Uses a Stack (often the implicit call stack via recursion).
  • Use Case: Detecting cycles, topological sorting, finding connected components.

Dijkstra’s Algorithm

  • Description: Finds the shortest path from a single source node to all other nodes in a weighted graph with non-negative edge weights.

Minimum Spanning Tree (MST)

  • Description: A subset of the edges of a connected, weighted graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.
  • Common Algorithms: Prim’s, Kruskal’s.

Topological Sort

  • Description: A linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every directed edge from vertex u to vertex v, u comes before v in the ordering.

Algorithmic Techniques

Dynamic Programming (DP)

  • Description: A method for solving complex problems by breaking them down into simpler, overlapping subproblems. It solves each subproblem only once and stores the results to avoid redundant computation.
  • Key Idea: Overlapping subproblems and optimal substructure.

Greedy Algorithms

  • Description: An approach that makes the locally optimal choice at each stage with the hope of finding a global optimum.

Memoization

  • Description: An optimization technique where the results of expensive function calls are cached and returned when the same inputs occur again. It is the top-down approach to implementing Dynamic Programming.

Bitwise Operations

  • Description: Operations that manipulate individual bits of a number. They are extremely fast and can be used for clever and efficient solutions to certain problems (e.g., checking for odd/even, swapping numbers, managing sets with bitmasks).