Spectral Graph Theory

How eigenvalues reveal hidden graph structure. The mathematical foundation behind clustering, community detection, and modern GNNs.

Introduction

Graphs are combinatorial objects: nodes and edges. Linear algebra deals with matrices and vector spaces. At first glance, these seem like separate worlds. Spectral graph theory is the bridge between them.

The core insight is simple yet profound: we can represent a graph as a matrix, then analyze that matrix using eigenvalues and eigenvectors. The "spectrum" of a graph (its set of eigenvalues) encodes structural information that is invisible to the naked eye: how clustered the graph is, whether it is connected, how information flows through it.

The Big Idea: Eigenvalues are to graphs what frequencies are to sound waves. A chord sounds different from a single note because it has multiple frequencies. Similarly, a clustered graph "sounds different" from a random graph because its eigenvalue spectrum has a different shape.

Why "Spectral"?

The term "spectral" comes from physics. When you shine white light through a prism, it splits into a spectrum of colors, each color corresponding to a different frequency. In linear algebra, the spectrum of a matrix is its set of eigenvalues.

Just as the color spectrum tells you about light's composition, a matrix's spectrum tells you about its fundamental properties. For graphs, this means the eigenvalue spectrum reveals:

  • Connectivity: How many connected components? Is the graph well-connected or fragile?
  • Clustering: Are there natural communities? How separable are they?
  • Expansion: How quickly does information spread? Are there bottlenecks?
  • Bipartiteness: Can nodes be split into two groups with edges only between groups?

The Graph Laplacian

While the adjacency matrix is the most obvious way to represent a graph, the Graph Laplacian is far more useful for spectral analysis. It is defined as:

L = D - A

where L is the Laplacian matrix (|V| × |V|), D is the degree matrix (diagonal), and A is the adjacency matrix.

Constructing the Laplacian

Step 1: Adjacency Matrix A

A[i][j] = 1 if edge (i, j) exists, 0 otherwise. For undirected graphs, A is symmetric.

Step 2: Degree Matrix D

D is diagonal: D[i][i] = degree of node i. All off-diagonal entries are 0.

Step 3: Laplacian L = D - A

Diagonal entries: L[i][i] = degree of node i. Off-diagonal: L[i][j] = -1 if edge exists, 0 otherwise.

Example

Graph: 0 -- 1 -- 2

Adjacency A:        Degree D:           Laplacian L = D - A:
[0, 1, 0]           [1, 0, 0]           [ 1, -1,  0]
[1, 0, 1]           [0, 2, 0]           [-1,  2, -1]
[0, 1, 0]           [0, 0, 1]           [ 0, -1,  1]

Notice: each row of L sums to zero. This is always true and is the key property that makes the Laplacian special.
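
A quick sanity check in NumPy (one of many ways to do this) reproduces the example above and confirms the row-sum property:

```python
import numpy as np

# Path graph 0 -- 1 -- 2 from the example above
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

D = np.diag(A.sum(axis=1))   # degree matrix: row sums of A on the diagonal
L = D - A                    # graph Laplacian

print(L)
# [[ 1. -1.  0.]
#  [-1.  2. -1.]
#  [ 0. -1.  1.]]

print(L.sum(axis=1))         # every row sums to zero
# [0. 0. 0.]
```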

Why "Laplacian"?

In continuous domains, the Laplacian operator $\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$ measures how a function differs from its neighborhood average.

The graph Laplacian is the discrete analog: it measures how a node's value differs from its neighbors' values. This connection enables us to apply continuous mathematical tools to discrete graph structures.
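
Applying L to a vector x of node values makes this concrete: entry i of Lx equals deg(i)·xᵢ minus the sum of the neighbors' values. A small sketch, reusing the path graph above:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

x = np.array([2.0, 5.0, 1.0])   # one value per node
print(L @ x)                    # [-3.  7. -4.]
# e.g. node 1: deg(1)*x[1] - (x[0] + x[2]) = 2*5 - (2 + 1) = 7
```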

Quadratic Form & Smoothness

The Laplacian's true power emerges through its quadratic form. For any vector x that assigns values to nodes:

x^T L x = \sum_{(i,j) \in E} (x_i - x_j)^2

Sum of squared differences across all edges

This measures smoothness over the graph:

Low Value = Smooth

Connected nodes have similar values. The signal varies slowly across the graph.

High Value = Rough

Connected nodes have very different values. The signal changes abruptly.

Why This Matters for ML

In semi-supervised learning, we add $x^T L x$ as a regularization term. This encourages predictions to be smooth over the graph: if two nodes are connected, they should have similar labels. GNNs implicitly optimize for graph smoothness.
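
As a sanity check, here is a minimal sketch (reusing the path graph from the construction example) showing that the matrix expression xᵀLx and the edge-by-edge sum give the same number:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

x = np.array([1.0, 0.0, 0.0])   # a "rough" signal: node 0 differs from its neighbor

quad = x @ L @ x                # quadratic form x^T L x

# Same quantity computed edge by edge: sum of (x_i - x_j)^2 over edges
edges = [(0, 1), (1, 2)]
edge_sum = sum((x[i] - x[j]) ** 2 for i, j in edges)

print(quad, edge_sum)           # both are 1.0
```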

Interactive: Graph Smoothness

Assign values to nodes and watch how the Laplacian quadratic form measures smoothness. Edge thickness indicates the magnitude of difference between connected nodes.

(Interactive widget: sliders set the signal values v₀, v₁, v₂ on the nodes, and the total roughness xᵀLx updates live. Edges glow when connected nodes have different values, visually representing each term (vᵢ - vⱼ)².)

Tip: Set all nodes to the same value (e.g., 1.0) and the roughness drops to 0. The constant vector is exactly the eigenvector with eigenvalue 0.

Properties of the Laplacian

1. Symmetric & Positive Semi-Definite

All eigenvalues are real and ≥ 0. This is crucial for many algorithms.

2. Rows Sum to Zero

For any node i, the row sum is Σⱼ L[i][j] = deg(i) - deg(i) = 0, since the off-diagonal entries of row i sum to -deg(i). This means 1 (the all-ones vector) is always an eigenvector with eigenvalue 0.

3. Quadratic Form

For any vector x: $x^T L x = \sum_{(i,j) \in E} (x_i - x_j)^2$. This measures how "smooth" x is over the graph. Neighboring nodes with similar values contribute less.

4. Number of Zero Eigenvalues = Connected Components

A graph with k connected components has exactly k eigenvalues equal to 0. This is how we can "count" components algebraically.
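
Property 4 is easy to verify numerically. A small sketch with two disjoint edges (so two components) shows exactly two zero eigenvalues:

```python
import numpy as np

# Two disjoint edges: {0, 1} and {2, 3} are separate components
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

eigvals = np.linalg.eigvalsh(L)              # eigvalsh: L is symmetric
print(np.round(eigvals, 6))                  # [0. 0. 2. 2.]
print(int(np.sum(np.isclose(eigvals, 0))))   # 2 zero eigenvalues -> 2 components
```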

Eigenvalues & Their Meaning

Let the eigenvalues of L be $0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$. Each eigenvalue tells us something important:

λ₁ = 0 (Always)

The smallest eigenvalue is always 0, with eigenvector 1. This represents the trivial partition where all nodes are in the same cluster.

λ₂ (Algebraic Connectivity / Fiedler Value)

The second smallest eigenvalue is arguably the most important. It measures how well-connected the graph is:

  • λ₂ = 0: the graph is disconnected (multiple components)
  • λ₂ small: the graph is "almost disconnected" (has bottlenecks)
  • λ₂ large: the graph is well-connected (hard to cut)

λₙ (Largest Eigenvalue)

Related to the maximum degree. For regular graphs (every node has the same degree d): λₙ ≤ 2d. A large λₙ indicates the presence of high-degree "hub" nodes.

Spectral Gap (λ₂)

Since λ₁ = 0, the spectral gap is just λ₂ itself. It determines how fast random walks on the graph mix: a large gap means fast mixing and good expansion properties. This is why expander graphs (used in cryptography and CS theory) are designed to maximize λ₂.

Interactive: Eigenvalue Spectrum

See how different graph structures produce characteristic eigenvalue patterns. Pay special attention to λ₂!

(Interactive widget: for the preview topology, two dense groups joined by a weak bridge, the sorted eigenvalue spectrum shows a single zero eigenvalue and a small Fiedler value λ₂ ≈ 0.08: one connected component, but only weakly connected.)

The Fiedler Vector

The eigenvector corresponding to λ₂ is called the Fiedler vector, named after Miroslav Fiedler, who proved its remarkable property: it naturally partitions the graph.

The Magic: If you sort nodes by their Fiedler vector values and cut anywhere, you get a reasonable graph partition. The optimal cut (minimizing edges between partitions) often occurs where the Fiedler vector crosses zero.

Why Does This Work?

Remember the quadratic form: $x^T L x = \sum_{(i,j) \in E} (x_i - x_j)^2$. The Fiedler vector minimizes this subject to being orthogonal to 1 and having unit norm.

In plain terms: the Fiedler vector assigns values to nodes such that connected nodes have similar values, while the overall assignment is as "spread out" as possible. This naturally groups densely connected clusters together.

Simple Partitioning Rule

  • Compute Fiedler vector v
  • Nodes with v[i] < 0 → Cluster 1
  • Nodes with v[i] ≥ 0 → Cluster 2
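
A minimal NumPy sketch of this rule (the graph here, two triangles joined by a single bridge edge, is just an illustrative choice):

```python
import numpy as np

# Two triangles {0, 1, 2} and {3, 4, 5} joined by one bridge edge (2, 3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
fiedler = eigvecs[:, 1]                 # eigenvector of the second-smallest eigenvalue

clusters = (fiedler >= 0).astype(int)   # sign-based split
print(np.round(fiedler, 3))
print(clusters)                         # separates {0, 1, 2} from {3, 4, 5}
```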

Interactive: Spectral Clustering

Step through the spectral clustering algorithm to see how matrix operations reveal graph structure.

(Interactive demo stepping through the pipeline: 1. Build a k-NN graph by connecting each point to its k nearest neighbors; 2. Compute the Laplacian L = D - A, where D is the degree matrix; 3. Find the k smallest eigenvectors of L; 4. Run k-means in eigenvector space.)

The Spectral Clustering Algorithm

Spectral clustering extends the Fiedler vector idea to k clusters:

Step 1: Build the Laplacian

Compute L = D - A (or use a normalized variant for better results).

Step 2: Compute Eigenvectors

Find the k smallest eigenvectors of L: v₁, v₂, ..., vₖ.

Step 3: Form the Embedding Matrix

Create an n × k matrix U where row i is [v₁[i], v₂[i], ..., vₖ[i]]. Each node is now a point in k-dimensional space.

Step 4: Run k-Means

Apply standard k-means clustering to the rows of U. The cluster labels transfer back to the original nodes.

Why k-Means Works Here: The spectral embedding "untangles" the graph structure. In the original graph, clusters may be interleaved in complex ways. In the spectral embedding, clusters become well-separated Gaussian-like blobs that k-means can easily identify.
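
Putting the four steps together, here is a compact sketch using NumPy and scikit-learn's KMeans. The toy data, the neighbor count, and the use of the unnormalized Laplacian are all illustrative choices; production code would typically use a normalized Laplacian or sklearn.cluster.SpectralClustering:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy point cloud: two well-separated 2-D blobs
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(3.0, 0.3, (20, 2))])
n, k = len(X), 2

# Step 1: symmetrized k-NN graph (5 neighbors per point)
dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
A = np.zeros((n, n))
for i in range(n):
    for j in np.argsort(dists[i])[1:6]:    # index 0 is the point itself
        A[i, j] = A[j, i] = 1.0

# Step 2: Laplacian
L = np.diag(A.sum(axis=1)) - A

# Step 3: the k smallest eigenvectors form the spectral embedding
_, eigvecs = np.linalg.eigh(L)
U = eigvecs[:, :k]                         # n x k embedding, one row per node

# Step 4: k-means in the embedded space
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)
print(labels)                              # the two blobs receive two different labels
```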

Cheeger's Inequality

Cheeger's inequality is the theoretical backbone of spectral clustering. It relates the algebraic connectivity λ₂ to the Cheeger constant h(G), which measures how "cuttable" the graph is.

Cheeger Constant

h(G) = \min_{S} \frac{|\text{edges between } S \text{ and } \bar{S}|}{\min(|S|, |\bar{S}|)}

The minimum over all ways to cut the graph into two pieces, normalized by the size of the smaller piece. Small h(G) means there is a "cheap" cut.

Cheeger's Inequality

\frac{\lambda_2}{2} \leq h(G) \leq \sqrt{2 \lambda_2}

In other words, λ₂ pins down the optimal cut quality from both sides. Finding h(G) exactly is NP-hard, but computing λ₂ takes only polynomial time!

Normalized Laplacians

The unnormalized Laplacian L = D - A has a bias: high-degree nodes dominate. For many applications, normalized variants work better:

Symmetric Normalized

L_{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}

Eigenvalues in [0, 2]. Used in most spectral clustering implementations.

Random Walk Normalized

L_{rw} = D^{-1} L = I - D^{-1} A

Connection to random walks on the graph. D⁻¹A is the transition matrix.

The normalized Laplacian treats each node more equally, regardless of degree. This is often what you want: a hub with 1000 connections should not dominate over a leaf node with 2 connections.
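
Both variants are one line each once you have A. A small sketch (it assumes no isolated nodes, so every degree is positive):

```python
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
I = np.eye(len(A))

D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # assumes every node has degree > 0
D_inv = np.diag(1.0 / deg)

L_sym = I - D_inv_sqrt @ A @ D_inv_sqrt     # symmetric normalized Laplacian
L_rw = I - D_inv @ A                        # random-walk normalized Laplacian

print(np.round(np.linalg.eigvalsh(L_sym), 3))   # eigenvalues lie in [0, 2]
```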

ML Applications

Graph Neural Networks (GNNs)

GNNs propagate information using the normalized adjacency matrix. The spectral perspective explains why this works: GNNs are essentially applying low-pass filters over the graph spectrum.

H^{(l+1)} = \sigma(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W)

The multiplication by the normalized adjacency smooths features over neighbors, attenuating the high-frequency (large-eigenvalue) components of the graph signal, exactly the components the Laplacian quadratic form penalizes.
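
A minimal NumPy sketch of one such propagation step, assuming the common GCN convention that Ã = A + I adds self-loops and D̃ is the degree matrix of Ã (a real model would use a GNN library and learn W by backpropagation):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One propagation step: relu(D~^{-1/2} A~ D~^{-1/2} H W)."""
    A_tilde = A + np.eye(len(A))                     # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt        # normalized adjacency
    return np.maximum(A_hat @ H @ W, 0.0)            # ReLU as the nonlinearity sigma

# Tiny example: the 3-node path graph, 2 input features, 4 hidden units
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 2))      # node feature matrix
W = rng.normal(size=(2, 4))      # weight matrix (learned in a real model)

print(gcn_layer(A, H, W).shape)  # (3, 4): new 4-dimensional features per node
```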

Community Detection

Social networks, biological networks, and citation graphs all have community structure. Spectral methods find these communities by looking at the "gaps" in the eigenvalue spectrum.

If there are k natural communities, you will typically see k eigenvalues close to 0, then a gap, then the rest.

Image Segmentation

Treat pixels as nodes, with edges weighted by similarity (color, texture). Spectral clustering groups visually coherent regions together.

The normalized cuts algorithm (Shi & Malik, 2000) is a classic spectral method for image segmentation.

Laplacian Eigenmaps

A graph-based dimensionality reduction technique in the same spirit as PCA, but nonlinear: embed high-dimensional data in a low-dimensional space such that nearby points in the original space remain nearby.

The embedding coordinates are simply the bottom eigenvectors of the graph Laplacian built from the data's nearest-neighbor graph.
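
A bare-bones sketch of the idea, using the unnormalized Laplacian and an unweighted k-NN graph on random data (the original formulation uses heat-kernel edge weights and the generalized eigenproblem Lv = λDv):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))             # 50 points in 10 dimensions

# Symmetrized k-NN graph (6 neighbors per point)
dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
A = np.zeros((len(X), len(X)))
for i in range(len(X)):
    for j in np.argsort(dists[i])[1:7]:   # skip the point itself
        A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A
_, eigvecs = np.linalg.eigh(L)

# 2-D embedding: skip the constant eigenvector (eigenvalue 0), take the next two
embedding = eigvecs[:, 1:3]
print(embedding.shape)                    # (50, 2)
```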