Graph Theory

Graph Representations

How you store a graph in memory determines what operations are fast, what is slow, and how much space you need. Choose wisely.

Introduction

A graph is an abstract concept: a set of nodes connected by edges. But computers do not understand abstract concepts. They understand memory addresses, arrays, and pointers. Before you can run any graph algorithm, you must first answer a fundamental question: how do I store this graph in memory?

The answer is not straightforward. Different representations make different operations efficient. Choosing the wrong one can turn an O(1) lookup into O(n), or balloon your memory usage from O(n) to O(n²). This is especially critical in machine learning, where graphs can have millions of nodes.

The Core Trade-off

There is no universally "best" representation. Adjacency matrices excel at checking if two nodes are connected. Adjacency lists are superior for iterating over neighbors. Edge lists are compact for sparse graphs. Understanding these trade-offs is essential for writing efficient graph algorithms.

Why Representation Matters

Consider a social network with 1 billion users. Most people have around 200 friends, not 1 billion. This is a sparse graph. Now consider two representation choices:

Adjacency Matrix

Space: 10⁹ × 10⁹ = 10¹⁸ entries

At even one byte per entry, that is an exabyte just for the structure. Impossible.

Adjacency List

Space: 10⁹ nodes × 200 friends = 2 × 10¹¹ entries

About 200 GB at one byte per entry. Large but feasible.

The difference is a factor of 5 million. This is why understanding graph representations is not academic trivia. It determines whether your algorithm runs at all.

Key Terminology

Dense Graph

Most pairs of nodes are connected. E ≈ V². Example: fully connected neural network layers.

Sparse Graph

Few edges relative to nodes. E ≪ V². Example: social networks, road networks.

Directed Graph

Edges have direction (A→B ≠ B→A). Example: Twitter follows, web links.

Weighted Graph

Edges have numerical values (costs, distances). Example: road distances, transition probabilities.

Interactive: Space Complexity

Explore how memory usage scales with different graph sizes and densities. This helps you understand when each representation makes sense.

Example configuration: 100 nodes at 10% density, giving about 1,000 edges (a sparse graph).

Representation      Structure   Relative Size
Edge List           O(E)        2.0K units
Adjacency List      O(V + E)    2.1K units
Adjacency Matrix    O(V²)       10.0K units
Incidence Matrix    O(VE)       100.0K units

Winner: Edge List. For this configuration, the edge list uses the least memory (2.0K units).

Interactive: Representation Explorer

Toggle between graph types and hover over nodes to see how the same graph looks in different representations. Click on matrix cells to highlight the corresponding edge.

Example graph: 5 nodes (0–4) connected by 6 undirected edges. The same topology is shown below in all four formats.

Edge List
(0, 1), (0, 2), (1, 3), (2, 4), (3, 4), (1, 2)

Adjacency List
0 → [1, 2]
1 → [0, 3, 2]
2 → [0, 4, 1]
3 → [1, 4]
4 → [2, 3]

Adjacency Matrix (rows and columns are nodes 0–4)
    0  1  2  3  4
0   0  1  1  0  0
1   1  0  1  1  0
2   1  1  0  0  1
3   0  1  0  0  1
4   0  0  1  1  0

Incidence Matrix (columns e0–e5 follow the edge list order above)
    e0  e1  e2  e3  e4  e5
0    1   1   0   0   0   0
1    1   0   1   0   0   1
2    0   1   0   1   0   1
3    0   0   1   0   1   0
4    0   0   0   1   1   0
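
If you prefer to see this in code, here is a minimal plain-Python sketch (no libraries; in practice you would likely use numpy or scipy for the matrices) that builds the same four representations from the edge list above:

```python
# Build all four representations of the 5-node, 6-edge example graph.
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 4), (1, 2)]
V, E = 5, len(edges)

# Edge list: the input itself is already the edge-list representation.

# Adjacency list: map each node to its neighbors (both directions, undirected).
adj = {v: [] for v in range(V)}
for u, w in edges:
    adj[u].append(w)
    adj[w].append(u)

# Adjacency matrix: V x V grid of 0/1 entries.
A = [[0] * V for _ in range(V)]
for u, w in edges:
    A[u][w] = A[w][u] = 1

# Incidence matrix: V x E grid, 1 where a node touches an edge.
B = [[0] * E for _ in range(V)]
for e, (u, w) in enumerate(edges):
    B[u][e] = B[w][e] = 1

print(adj[1])   # [0, 3, 2]
print(A[2])     # [1, 1, 0, 0, 1]
print(B[4])     # [0, 0, 0, 1, 1, 0]
```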

Adjacency Matrix

The adjacency matrix is the most intuitive representation. Given V nodes, create a V×V matrix A where:

A[i][j] = \begin{cases} 1 & \text{if edge exists from } i \text{ to } j \\ 0 & \text{otherwise} \end{cases}

For weighted graphs, replace 1 with the edge weight.
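
Here is a minimal numpy sketch of this construction; the toy graph and variable names are illustrative:

```python
import numpy as np

# Adjacency matrix for a toy undirected graph on 4 nodes.
edges = [(0, 1), (0, 2), (1, 3)]
V = 4

A = np.zeros((V, V), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1           # mirror the entry: an undirected graph gives a symmetric matrix

print(bool(A[0, 2]))       # O(1) edge check -> True
print(A[0].sum())          # degree of node 0 -> 2
```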

Properties

Undirected graphs: The matrix is symmetric. A[i][j] = A[j][i]. You could save half the space by storing only the upper triangle.

Self-loops: Represented by diagonal entries A[i][i].

Degree of node i: Sum of row i (out-degree) or column i (in-degree).

Complexity Analysis

Operation              Time     Notes
Check if edge exists   O(1)     Direct array lookup
Get all neighbors      O(V)     Must scan entire row
Add/remove edge        O(1)     Direct array update
Space                  O(V²)    Regardless of edge count

Matrix Powers: A Hidden Superpower

One remarkable property: Aᵏ[i][j] counts the number of paths of length k from node i to node j.

Example Application

If A² = A × A, then A²[0][3] tells you how many 2-hop paths exist from node 0 to node 3. This is used in PageRank, random walks, and Graph Neural Networks.
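
A quick way to check this numerically, assuming numpy and the same toy 4-node graph as the sketch above (edges 0-1, 0-2, 1-3):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]])

A2 = np.linalg.matrix_power(A, 2)
print(A2[0, 3])   # 1 -- exactly one 2-hop path from node 0 to node 3: 0 -> 1 -> 3
```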

Adjacency List

Instead of storing all possible edges, store only the edges that exist. For each node, maintain a list of its neighbors.

Structure: Map nodes to their neighbor lists

Node 0 → [1, 2] means node 0 connects to nodes 1 and 2
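
A minimal dict-of-lists sketch in Python, using the same toy graph as before (names are illustrative):

```python
from collections import defaultdict

edges = [(0, 1), (0, 2), (1, 3)]

adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)      # store both directions for an undirected graph

print(adj[0])             # [1, 2] -- iterating neighbors is O(degree)
print(3 in adj[1])        # True  -- edge check is O(degree); use sets of neighbors for O(1)
```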

Implementation Choices

Array of Arrays

Simple, cache-friendly. Fast iteration. Adding nodes is expensive if the array needs resizing.

Hash Map of Sets

O(1) edge existence check per neighbor list. Easy to add/remove nodes. More memory overhead.

Linked Lists

Classic textbook approach. O(1) edge insertion at head. Poor cache performance in practice.

CSR Format (Sparse)

Compressed Sparse Row. Two arrays: values + indices. Optimal for static graphs. Used in scipy.sparse.

Complexity Analysis

Operation              Time        Notes
Check if edge exists   O(degree)   Must scan neighbor list (O(1) if using a set)
Get all neighbors      O(degree)   Just return the list
Add edge               O(1)        Amortized for dynamic arrays
Space                  O(V + E)    Proportional to actual edges

When to Use

Adjacency lists are the default choice for most graph algorithms. BFS, DFS, Dijkstra, and most graph traversals spend most of their time iterating over neighbors, exactly what adjacency lists are optimized for. Use them for sparse graphs (social networks, road maps, dependency graphs).

Edge List

The simplest representation: just store all edges as a list of pairs (or triples for weighted graphs). No complex indexing, just raw edge data.

Structure: List of edge tuples

[(0, 1), (0, 2), (1, 3)] or [(src, dst, weight)] for weighted graphs
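
A short sketch of a weighted edge list in Python; the weights are made up, and the final sort is the preprocessing step Kruskal's algorithm (discussed below) would perform:

```python
# Weighted edge list as (src, dst, weight) triples.
edges = [(0, 1, 4.0), (0, 2, 1.5), (1, 3, 2.0)]

edges.append((2, 3, 0.5))          # adding an edge is O(1)

edges.sort(key=lambda e: e[2])     # sort by weight: O(E log E), as Kruskal's requires
print(edges[0])                    # (2, 3, 0.5) -- the lightest edge comes first
```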

When Edge Lists Shine

Kruskal's Algorithm

Requires sorting all edges by weight. Edge list is the perfect format.

Graph I/O

Most graph file formats (CSV, edge-list files) are naturally edge lists.

Streaming Graphs

When edges arrive one at a time, edge list is the natural format.

Memory-Constrained

Minimal overhead: just the edges themselves.

Operation              Time
Check if edge exists   O(E)
Get all neighbors      O(E)
Add edge               O(1)
Space                  O(E)

Incidence Matrix

Less common but important for certain applications. A V×E matrix where rows are nodes and columns are edges.

B[v][e] = \begin{cases} 1 & \text{if node } v \text{ is incident to edge } e \\ 0 & \text{otherwise} \end{cases}

For directed graphs: +1 for source, -1 for target.

Why Use It?

Hypergraphs: When edges can connect more than 2 nodes, incidence matrices generalize naturally.

Electrical circuits: Kirchhoff's laws are elegantly expressed via incidence matrices.

Graph Laplacian: with the signed (oriented) incidence matrix, L = B Bᵀ gives the Laplacian matrix, fundamental in spectral graph theory.
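
A small numpy sketch, assuming the signed (oriented) incidence matrix described above for a toy triangle graph, showing that B Bᵀ reproduces D − A:

```python
import numpy as np

# Oriented incidence matrix (orientation chosen arbitrarily for an undirected graph):
# +1 marks the edge's source row, -1 its target row.
edges = [(0, 1), (0, 2), (1, 2)]
V, E = 3, len(edges)

B = np.zeros((V, E))
for e, (u, v) in enumerate(edges):
    B[u, e], B[v, e] = 1.0, -1.0

L = B @ B.T
print(L)
# [[ 2. -1. -1.]
#  [-1.  2. -1.]
#  [-1. -1.  2.]]
# which is exactly D - A: each node has degree 2, minus the adjacency entries.
```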

Comparison & Trade-offs

Representation      Space     Edge Check   Neighbors   Best For
Adjacency Matrix    O(V²)     O(1)         O(V)        Dense graphs, matrix ops
Adjacency List      O(V+E)    O(deg)       O(deg)      Sparse graphs, traversals
Edge List           O(E)      O(E)         O(E)        Kruskal's, I/O, streaming
Incidence Matrix    O(VE)     O(E)         O(E)        Hypergraphs, Laplacian

Decision Guide

  • Graph is dense (E ≈ V²)? Use adjacency matrix.
  • Graph is sparse and you need fast traversals? Use adjacency list.
  • You need to frequently check if edge (u,v) exists in a sparse graph? Use hash set of edges or adjacency list with set.
  • You are doing matrix operations (GNNs, spectral methods)? Use sparse adjacency matrix (CSR format).

ML Applications

Graph Neural Networks (GNNs)

GNNs use adjacency matrices (often in sparse format) to propagate information between nodes. The core operation is:

H^{(l+1)} = \sigma(\tilde{A} H^{(l)} W^{(l)})

Where Ã is the normalized adjacency matrix. Efficient sparse matrix multiplication is critical for scaling to large graphs.
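
A rough sketch of one such propagation step using scipy.sparse; the normalization follows the common GCN recipe (add self-loops, symmetrically normalize by degree), and all sizes and values are illustrative:

```python
import numpy as np
import scipy.sparse as sp

V, d_in, d_out = 4, 8, 16
A = sp.random(V, V, density=0.5, format="csr")   # toy sparse adjacency matrix

A_loop = A + sp.eye(V)                           # add self-loops
deg = np.asarray(A_loop.sum(axis=1)).ravel()
D_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
A_tilde = D_inv_sqrt @ A_loop @ D_inv_sqrt       # symmetric normalization

H = np.random.randn(V, d_in)                     # node feature matrix H^(l)
W = np.random.randn(d_in, d_out)                 # learned weights W^(l)

H_next = np.maximum(A_tilde @ H @ W, 0)          # sigma taken to be ReLU here
print(H_next.shape)                              # (4, 16)
```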

PageRank & Random Walks

PageRank computes the stationary distribution of a random walk on the web graph. This involves repeated matrix-vector multiplication with the transition matrix (derived from the adjacency matrix).

Sparse representations are essential. The web graph has billions of nodes but each page links to only a few others.
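
A minimal power-iteration sketch with scipy.sparse on a made-up three-page graph; the damping factor 0.85 is the conventional choice:

```python
import numpy as np
import scipy.sparse as sp

edges = [(0, 1), (1, 2), (2, 0), (2, 1)]         # (from_page, to_page)
V = 3
rows, cols = zip(*edges)
A = sp.csr_matrix((np.ones(len(edges)), (rows, cols)), shape=(V, V))

out_deg = np.asarray(A.sum(axis=1)).ravel()
P = sp.diags(1.0 / out_deg) @ A                  # row-stochastic transition matrix

d = 0.85                                         # damping factor
r = np.full(V, 1.0 / V)                          # start from the uniform distribution
for _ in range(50):
    r = (1 - d) / V + d * (P.T @ r)              # one sparse matrix-vector product per step
print(r)                                         # stationary distribution (sums to 1)
```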

Recommendation Systems

User-item interactions form a bipartite graph. Collaborative filtering algorithms traverse this graph to find similar users or items.

The adjacency list representation enables efficient "users who bought X also bought Y" queries.

Knowledge Graphs

Entity-relation-entity triples (like "Einstein" → "born_in" → "Germany") are naturally stored as edge lists with edge types.

Models like TransE and RotatE learn embeddings by operating on these structured edge lists.

Social Network Analysis

Finding communities, influencers, and information cascades in social networks requires efficient graph traversal.

Adjacency lists are ideal for algorithms like Label Propagation and Louvain community detection.

Advanced Representations

Compressed Sparse Row (CSR)

The industry standard for sparse matrices. Uses three arrays:

  • data: Edge values in row-major order
  • indices: Column index for each value
  • indptr: Where each row starts in data

Space: O(E + V). Fast row slicing, slow column slicing. Use CSC for column operations.

COO (Coordinate Format)

Three arrays: row indices, column indices, values. Easy to construct incrementally, but slow for arithmetic.

Often used as an intermediate format before converting to CSR.
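
A short scipy.sparse sketch covering both formats: the graph is built in COO from illustrative row/column arrays, then converted to CSR, whose three arrays can be inspected directly:

```python
import numpy as np
import scipy.sparse as sp

rows = np.array([0, 0, 1, 2])     # source node of each edge
cols = np.array([1, 2, 3, 3])     # destination node of each edge
vals = np.ones(4)                 # edge weights (all 1 here)

A_coo = sp.coo_matrix((vals, (rows, cols)), shape=(4, 4))
A_csr = A_coo.tocsr()             # CSR is the format to use for arithmetic and row slicing

print(A_csr.indptr)               # [0 2 3 4 4] -- where each row's entries start
print(A_csr.indices)              # [1 2 3 3]   -- column index of each stored value
print(A_csr.data)                 # [1. 1. 1. 1.]
print(A_csr[1].toarray())         # fast row slice: [[0. 0. 0. 1.]]
```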

Block Sparse Matrices

When graphs have natural block structure (e.g., community structure), store blocks rather than individual entries.

Used in large-scale GNNs and distributed graph processing.