Introduction
A graph is an abstract concept: a set of nodes connected by edges. But computers do not understand abstract concepts. They understand memory addresses, arrays, and pointers. Before you can run any graph algorithm, you must first answer a fundamental question: how do I store this graph in memory?
The answer is not straightforward. Different representations make different operations efficient. Choosing the wrong one can turn an O(1) lookup into O(n), or balloon your memory usage from O(n) to O(n²). This is especially critical in machine learning, where graphs can have millions of nodes.
The Core Trade-off
There is no universally "best" representation. Adjacency matrices excel at checking if two nodes are connected. Adjacency lists are superior for iterating over neighbors. Edge lists are compact for sparse graphs. Understanding these trade-offs is essential for writing efficient graph algorithms.
Why Representation Matters
Consider a social network with 1 billion users. Most people have around 200 friends, not 1 billion. This is a sparse graph. Now consider two representation choices:
Adjacency Matrix
Space: 10⁹ × 10⁹ = 10¹⁸ entries
That is 1 exabyte just for the structure. Impossible.
Adjacency List
Space: 10⁹ nodes × 200 friends = 2 × 10¹¹ entries
About 200 GB. Large but feasible.
The difference is a factor of 5 million. This is why understanding graph representations is not academic trivia. It determines whether your algorithm runs at all.
Key Terminology
Dense Graph
Most pairs of nodes are connected. E ≈ V². Example: fully connected neural network layers.
Sparse Graph
Few edges relative to nodes. E ≪ V². Example: social networks, road networks.
Directed Graph
Edges have direction (A→B ≠ B→A). Example: Twitter follows, web links.
Weighted Graph
Edges have numerical values (costs, distances). Example: road distances, transition probabilities.
Interactive: Space Complexity
Explore how memory usage scales with different graph sizes and densities. This helps you understand when each representation makes sense.
Interactive: Representation Explorer
Toggle between graph types and hover over nodes to see how the same graph looks in different representations. Click on matrix cells to highlight the corresponding edge.
Adjacency Matrix
The adjacency matrix is the most intuitive representation. Given V nodes, create a V×V matrix A where A[i][j] = 1 if there is an edge from node i to node j, and 0 otherwise.
For weighted graphs, replace 1 with the edge weight.
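A minimal sketch in Python, assuming an undirected, unweighted graph given as a list of edge pairs (the node count and edges are illustrative):

```python
# Build a V x V adjacency matrix from an edge list (undirected, unweighted).
V = 4
edges = [(0, 1), (0, 2), (1, 3)]

A = [[0] * V for _ in range(V)]
for u, v in edges:
    A[u][v] = 1
    A[v][u] = 1  # mirror the entry because the graph is undirected

print(A[0][1])  # 1 -> edge exists, checked in O(1)
print(A[2][3])  # 0 -> no edge
```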
Properties
Undirected graphs: The matrix is symmetric. A[i][j] = A[j][i]. You could save half the space by storing only the upper triangle.
Self-loops: Represented by diagonal entries A[i][i].
Degree of node i: Sum of row i (out-degree) or column i (in-degree).
Complexity Analysis
| Operation | Time | Notes |
|---|---|---|
| Check if edge exists | O(1) | Direct array lookup |
| Get all neighbors of node | O(V) | Must scan entire row |
| Add/remove edge | O(1) | Direct array update |
| Space | O(V²) | Regardless of edge count |
Matrix Powers: A Hidden Superpower
One remarkable property: Aᵏ[i][j] counts the number of paths of length k from node i to node j.
Example Application
If A² = A × A, then A²[0][3] tells you how many 2-hop paths exist from node 0 to node 3. This is used in PageRank, random walks, and Graph Neural Networks.
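A quick NumPy check of this property on a small made-up graph (a 4-node cycle 0–1–3–2):

```python
import numpy as np

# Adjacency matrix of the 4-node cycle 0-1-3-2-0 (edges 0-1, 0-2, 1-3, 2-3).
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
])

A2 = A @ A          # A^2: entry [i][j] counts 2-hop paths from i to j
print(A2[0][3])     # 2 -> the two 2-hop paths 0-1-3 and 0-2-3

A3 = np.linalg.matrix_power(A, 3)   # same idea for 3-hop paths
print(A3[0][0])     # 0 -> a 4-cycle has no closed walks of odd length
```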
Adjacency List
Instead of storing all possible edges, store only the edges that exist. For each node, maintain a list of its neighbors.
Structure: Map nodes to their neighbor lists
Node 0 → [1, 2] means node 0 connects to nodes 1 and 2
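A minimal Python sketch of this structure using a dict of lists (the nodes and edges are the illustrative ones above):

```python
from collections import defaultdict

# Adjacency list: a dict mapping each node to a list of its neighbors.
adj = defaultdict(list)

def add_edge(u, v):
    """Insert an undirected edge by recording it in both neighbor lists."""
    adj[u].append(v)
    adj[v].append(u)

add_edge(0, 1)
add_edge(0, 2)
add_edge(1, 3)

print(adj[0])       # [1, 2] -> neighbors of node 0
print(3 in adj[1])  # True, but this membership test is O(degree), not O(1)
```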
Implementation Choices
Array of Arrays
Simple, cache-friendly. Fast iteration. Adding nodes is expensive if array needs resizing.
Hash Map of Sets
O(1) edge existence check per neighbor list. Easy to add/remove nodes. More memory overhead.
Linked Lists
Classic textbook approach. O(1) edge insertion at head. Poor cache performance in practice.
CSR Format (Sparse)
Compressed Sparse Row. A flat neighbor array plus a per-node offset array (plus a values array for weights). Optimal for static graphs. Used in scipy.sparse.
Complexity Analysis
| Operation | Time | Notes |
|---|---|---|
| Check if edge exists | O(degree) | Must scan neighbor list (O(1) if using set) |
| Get all neighbors | O(degree) | Just return the list |
| Add edge | O(1)* | *Amortized for dynamic arrays |
| Space | O(V + E) | Proportional to actual edges |
When to Use
Adjacency lists are the default choice for most graph algorithms. BFS, DFS, Dijkstra, and most graph traversals spend most of their time iterating over neighbors, exactly what adjacency lists are optimized for. Use them for sparse graphs (social networks, road maps, dependency graphs).
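As an example, here is a minimal BFS sketch over a dict-of-lists adjacency list; nearly all of its work happens in the neighbor loop (the graph and start node are illustrative):

```python
from collections import deque

def bfs(adj, start):
    """Return the set of nodes reachable from `start` using breadth-first search."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, []):   # O(degree) per node: the hot loop
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited

adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs(adj, 0))   # {0, 1, 2, 3}
```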
Edge List
The simplest representation: just store all edges as a list of pairs (or triples for weighted graphs). No complex indexing, just raw edge data.
Structure: List of edge tuples
[(0, 1), (0, 2), (1, 3)] or [(src, dst, weight)] for weighted graphs
When Edge Lists Shine
Kruskal's Algorithm
Requires sorting all edges by weight. Edge list is the perfect format.
Graph I/O
Most graph file formats (CSV, edge-list files) are naturally edge lists.
Streaming Graphs
When edges arrive one at a time, edge list is the natural format.
Memory-Constrained
Minimal overhead: just the edges themselves.
| Operation | Time |
|---|---|
| Check if edge exists | O(E) |
| Get all neighbors | O(E) |
| Add edge | O(1) |
| Space | O(E) |
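A short sketch of both patterns in Python: sorting edges by weight (the first step of Kruskal's algorithm) and converting an edge list to an adjacency list when traversal is needed (weights and labels are illustrative):

```python
from collections import defaultdict

# Weighted edge list: (src, dst, weight) triples.
edges = [(0, 1, 4.0), (0, 2, 1.5), (1, 3, 2.0), (2, 3, 3.5)]

# Kruskal-style preprocessing: sort all edges by weight, O(E log E).
edges_by_weight = sorted(edges, key=lambda e: e[2])
print(edges_by_weight[0])   # (0, 2, 1.5) -> lightest edge first

# Convert to an adjacency list when neighbor iteration is needed, O(E).
adj = defaultdict(list)
for u, v, w in edges:
    adj[u].append((v, w))
    adj[v].append((u, w))
print(adj[0])               # [(1, 4.0), (2, 1.5)]
```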
Incidence Matrix
Less common but important for certain applications. A V×E matrix where rows are nodes and columns are edges; entry [i][j] is 1 if node i is an endpoint of edge j, and 0 otherwise.
For directed graphs: +1 for source, -1 for target.
Why Use It?
Hypergraphs: When edges can connect more than 2 nodes, incidence matrices generalize naturally.
Electrical circuits: Kirchhoff's laws are elegantly expressed via incidence matrices.
Graph Laplacian: for the oriented incidence matrix B, the product B Bᵀ gives the Laplacian matrix, fundamental in spectral graph theory.
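A small NumPy check of that identity, using an oriented incidence matrix for an illustrative 3-node directed graph:

```python
import numpy as np

# Oriented incidence matrix B (V x E): +1 for the edge's source, -1 for its target.
# Illustrative edges: 0->1, 1->2, 0->2
edges = [(0, 1), (1, 2), (0, 2)]
V, E = 3, len(edges)

B = np.zeros((V, E))
for j, (src, dst) in enumerate(edges):
    B[src, j] = 1.0
    B[dst, j] = -1.0

L = B @ B.T   # graph Laplacian: degree matrix minus adjacency matrix
print(L)
# [[ 2. -1. -1.]
#  [-1.  2. -1.]
#  [-1. -1.  2.]]
```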
Comparison & Trade-offs
| Representation | Space | Edge Check | Neighbors | Best For |
|---|---|---|---|---|
| Adjacency Matrix | O(V²) | O(1) | O(V) | Dense graphs, matrix ops |
| Adjacency List | O(V+E) | O(deg) | O(deg) | Sparse graphs, traversals |
| Edge List | O(E) | O(E) | O(E) | Kruskal's, I/O, streaming |
| Incidence Matrix | O(VE) | O(E) | O(E) | Hypergraphs, Laplacian |
Decision Guide
- Graph is dense (E ≈ V²)? Use adjacency matrix.
- Graph is sparse and you need fast traversals? Use adjacency list.
- You need to frequently check if edge (u,v) exists in a sparse graph? Use hash set of edges or adjacency list with set.
- You are doing matrix operations (GNNs, spectral methods)? Use sparse adjacency matrix (CSR format).
ML Applications
Graph Neural Networks (GNNs)
GNNs use adjacency matrices (often in sparse format) to propagate information between nodes. The core operation is:
H′ = σ(Ã H W)
Where Ã is the normalized adjacency matrix, H the node feature matrix, and W a learned weight matrix. Efficient sparse matrix multiplication is critical for scaling to large graphs. Learn more about GNNs.
PageRank & Random Walks
PageRank computes the stationary distribution of a random walk on the web graph. This involves repeated matrix-vector multiplication with the transition matrix (derived from adjacency matrix).
Sparse representations are essential. The web graph has billions of nodes but each page links to only a few others.
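A minimal power-iteration sketch with a scipy.sparse transition matrix (the tiny link graph and the damping factor 0.85 are illustrative, and dangling pages are not handled):

```python
import numpy as np
import scipy.sparse as sp

def pagerank(adj, damping=0.85, iters=50):
    """Power iteration on the transition matrix derived from a sparse adjacency matrix."""
    n = adj.shape[0]
    out_deg = np.asarray(adj.sum(axis=1)).flatten()
    P = sp.diags(1.0 / out_deg) @ adj   # row-stochastic: P[i, j] = adj[i, j] / out_deg(i)
    T = P.T.tocsr()                     # column-stochastic transition matrix
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * (T @ rank)   # one sparse mat-vec per step
    return rank

# Illustrative 4-page link graph: 0->1, 0->2, 1->2, 2->3, 3->0 (no dangling pages).
rows = [0, 0, 1, 2, 3]
cols = [1, 2, 2, 3, 0]
adj = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(4, 4))
print(pagerank(adj))
```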
Recommendation Systems
User-item interactions form a bipartite graph. Collaborative filtering algorithms traverse this graph to find similar users or items.
The adjacency list representation enables efficient "users who bought X also bought Y" queries.
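A small sketch of that kind of query on a bipartite adjacency list (the users, items, and purchases are made up):

```python
from collections import Counter, defaultdict

# Bipartite graph stored as two adjacency lists (illustrative purchases).
user_items = defaultdict(set)
item_users = defaultdict(set)

purchases = [("alice", "book"), ("alice", "lamp"),
             ("bob", "book"), ("bob", "mug"), ("carol", "mug")]
for user, item in purchases:
    user_items[user].add(item)
    item_users[item].add(user)

def also_bought(item):
    """Items purchased by users who also purchased `item` (a 2-hop traversal)."""
    counts = Counter()
    for user in item_users[item]:          # hop 1: item -> users
        for other in user_items[user]:     # hop 2: users -> their other items
            if other != item:
                counts[other] += 1
    return counts.most_common()

print(also_bought("book"))   # e.g. [('lamp', 1), ('mug', 1)]
```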
Knowledge Graphs
Entity-relation-entity triples (like "Einstein" → "born_in" → "Germany") are naturally stored as edge lists with edge types.
Models like TransE and RotatE learn embeddings by operating on these structured edge lists.
Social Network Analysis
Finding communities, influencers, and information cascades in social networks requires efficient graph traversal.
Adjacency lists are ideal for algorithms like Label Propagation and Louvain community detection.
Advanced Representations
Compressed Sparse Row (CSR)
The industry standard for sparse matrices. Uses three arrays:
- data: edge values in row-major order
- indices: column index for each value
- indptr: where each row starts in data
Space: O(E + V). Fast row slicing, slow column slicing. Use CSC for column operations.
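For a concrete picture, here are the three arrays for a small illustrative 3-node weighted graph, built with scipy.sparse:

```python
import numpy as np
import scipy.sparse as sp

# CSR arrays for a 3-node directed graph: 0 -> 1 (2.0), 0 -> 2 (3.0), 2 -> 1 (1.0)
data    = np.array([2.0, 3.0, 1.0])   # edge weights in row-major order
indices = np.array([1, 2, 1])          # column (destination) of each weight
indptr  = np.array([0, 2, 2, 3])       # row i's edges live in data[indptr[i]:indptr[i+1]]

A = sp.csr_matrix((data, indices, indptr), shape=(3, 3))
print(A.toarray())
# [[0. 2. 3.]
#  [0. 0. 0.]
#  [0. 1. 0.]] -> row 2 has its single edge to column 1
print(A.indices[A.indptr[2]:A.indptr[3]])   # neighbors of node 2 -> [1]
```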
COO (Coordinate Format)
Three arrays: row indices, column indices, values. Easy to construct incrementally, but slow for arithmetic.
Often used as an intermediate format before converting to CSR.
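A typical pattern, sketched with scipy.sparse: accumulate edges in COO form, then convert to CSR before doing heavy arithmetic (the edges are illustrative):

```python
import scipy.sparse as sp

# Collect edges incrementally, COO-style: parallel row/col/value lists.
rows, cols, vals = [], [], []
for src, dst, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 3, 1.0)]:
    rows.append(src)
    cols.append(dst)
    vals.append(w)

A_coo = sp.coo_matrix((vals, (rows, cols)), shape=(4, 4))
A_csr = A_coo.tocsr()            # convert once, then reuse for fast row operations

print(A_csr.getrow(0).indices)   # [1 2] -> neighbors of node 0
```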
Block Sparse Matrices
When graphs have natural block structure (e.g., community structure), store blocks rather than individual entries.
Used in large-scale GNNs and distributed graph processing.