What Happens When a Magnet Attracts Metal?


When you witness a magnet pulling a piece of iron, you are seeing one of the most profound "glitches" in classical physics. To understand why that piece of metal moves, we must dismantle our classical intuition and rebuild it using the tools of Dirac, Pauli, and Heisenberg.

1. The Paradox: Why the Lorentz Force is "Lazy"

In classical electromagnetism, the force exerted on a charge q moving with velocity v in a magnetic field B is given by the Lorentz force law:

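$$\mathbf{F}_{\text{mag}} = q(\mathbf{v} \times \mathbf{B})$$

By definition, this force is always perpendicular to the velocity vector ($\mathbf{F}\cdot\mathbf{v} = 0$). Since power is defined as the rate of doing work ($P = \mathbf{F}\cdot\mathbf{v}$), the magnetic force performs zero work.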
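The zero-work statement can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `lorentz_force` and the random test vectors are illustrative choices, not part of the original text.

```python
import numpy as np

def lorentz_force(q, v, B):
    """Magnetic part of the Lorentz force, F = q (v x B)."""
    return q * np.cross(v, B)

rng = np.random.default_rng(0)
q = 1.0                      # charge in arbitrary units (assumption)
v = rng.normal(size=3)       # arbitrary velocity
B = rng.normal(size=3)       # arbitrary magnetic field

F = lorentz_force(q, v, B)
power = float(np.dot(F, v))  # P = F . v, zero up to rounding error
```

Whatever `v` and `B` are chosen, `power` vanishes to floating-point precision, because a cross product is orthogonal to both of its factors.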

Yet, when a magnet attracts a piece of metal, the metal clearly gains kinetic energy. If the magnetic force cannot do work, what is actually pulling the metal? The answer lies in the fact that magnetism is not merely a classical force, but a quantum relativistic correction.

2. The Dead End: The Bohr-van Leeuwen Theorem

Before we find the solution, we must prove that classical physics is incapable of producing magnetism. This is the Bohr-van Leeuwen Theorem.

Consider a system of N electrons. The magnetic moment μ is defined by the orbital motion of charges:

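$$\boldsymbol{\mu} = \frac{q}{2}\,\mathbf{r}\times\mathbf{v} = \frac{q}{2m}(\mathbf{r}\times m\mathbf{v}) = \frac{q}{2m}\mathbf{L}$$

Crucially, the total magnetization in the z-direction is a linear function of the generalized velocities $\dot{q}_i$:

$$M_z = \sum_{i=1}^{3N} a_i(q_1,\dots,q_{3N})\,\dot{q}_i$$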

In an electromagnetic field, the Hamiltonian H (total energy) is given by:

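$$H = \sum_{i=1}^{N}\frac{(\mathbf{p}_i - q\mathbf{A})^2}{2m} + qV(q_1,\dots,q_N)$$

Using Hamilton’s equations of motion, we have $\dot{q}_i = \partial H/\partial p_i$. The thermal average of the magnetization $\langle M_z\rangle$ is calculated via the canonical partition function:

$$\langle M_z\rangle = \frac{\int dq_1\cdots dq_{3N}\,dp_1\cdots dp_{3N}\; M_z\,e^{-H/k_BT}}{\int dq_1\cdots dq_{3N}\,dp_1\cdots dp_{3N}\; e^{-H/k_BT}}$$

Let us focus on the momentum integral for a single term $a_i\dot{q}_i$. Since $a_i$ depends only on coordinates, we can pull it out and integrate over the momentum $p_i$:

$$\int_{-\infty}^{+\infty} dp_i\,\dot{q}_i\,e^{-H/k_BT} = \int_{-\infty}^{+\infty} dp_i\,\frac{\partial H}{\partial p_i}\,e^{-H/k_BT}$$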

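This integral is mathematically equivalent to integrating a total derivative. By recognizing that $\frac{\partial H}{\partial p_i}e^{-\beta H} = -\frac{1}{\beta}\frac{\partial}{\partial p_i}\left(e^{-\beta H}\right)$ (where $\beta = 1/k_BT$), we have:

$$\int_{H(p_i=-\infty)}^{H(p_i=+\infty)} dH\,e^{-H/k_BT} = \left[-k_BT\,e^{-H/k_BT}\right]_{p_i=-\infty}^{p_i=+\infty}$$

Since the kinetic energy term $\frac{(\mathbf{p}-q\mathbf{A})^2}{2m}$ goes to infinity as the momentum $p \to \pm\infty$, the Boltzmann factor $e^{-H/k_BT}$ vanishes at the boundaries. Thus:

$$\left[-k_BT\,e^{-H/k_BT}\right]_{-\infty}^{+\infty} = 0 - 0 = 0$$

Result: $\langle M\rangle = 0$. The physics is merciless: the vector potential $\mathbf{A}$ shifts the momentum, but since we integrate over all possible momenta from $-\infty$ to $+\infty$, this shift contributes nothing to the total integral. Classical physics predicts that magnets cannot exist.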
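The cancellation can be illustrated with a one-dimensional toy version of the momentum integral. This is a minimal sketch, assuming NumPy, units with $m = k_BT = 1$, and a constant shift `qA`; the function name `mean_velocity` and the grid parameters are illustrative assumptions.

```python
import numpy as np

def mean_velocity(qA, m=1.0, kT=1.0, p_max=60.0, n=200001):
    """Boltzmann average of v = (p - qA)/m over all momenta p, in 1D."""
    p = np.linspace(-p_max, p_max, n)               # uniform momentum grid
    v = (p - qA) / m                                # velocity from Hamilton's equation
    w = np.exp(-(p - qA) ** 2 / (2.0 * m * kT))     # Boltzmann weight of the kinetic term
    return np.sum(v * w) / np.sum(w)                # ratio of Riemann sums (grid spacing cancels)

# The average velocity vanishes for every value of the vector-potential shift.
averages = [mean_velocity(qA) for qA in (0.0, 0.5, 3.0)]
```

Each entry of `averages` is zero to numerical precision: shifting the integration variable by `qA` merely relabels the momenta, which is the content of the Bohr-van Leeuwen argument.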

3. Beyond the Schrödinger Equation

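The Schrödinger equation is built upon the non-relativistic energy-momentum relation $E = p^2/2m$. If we attempt to use the relativistic relation $E^2 = c^2p^2 + m^2c^4$ by substituting the quantum operators $E \to i\hbar\,\partial/\partial t$ and $\mathbf{p} \to -i\hbar\nabla$, we obtain the Klein-Gordon Equation:

$$\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \nabla^2\psi + \left(\frac{mc}{\hbar}\right)^2\psi = 0$$

This equation can be written more compactly using the d'Alembertian operator, $\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$, which is the Minkowski spacetime generalization of the Laplacian operator:

$$\left(\Box + \left(\frac{mc}{\hbar}\right)^2\right)\psi = 0$$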

However, when first proposed, the Klein-Gordon equation was nearly abandoned due to two significant conceptual problems that arose from interpreting ψ as a single-particle wave function, analogous to its role in the Schrödinger equation.

On one hand, the relativistic energy-momentum relation is quadratic in energy, leading to two solutions: $E = \pm\sqrt{(pc)^2 + (mc^2)^2}$. The existence of negative energy states was deeply problematic. It implied that a particle could continuously radiate energy by transitioning to ever-lower negative energy levels, suggesting that no stable ground state existed for matter.

On the other hand, the wave function in quantum mechanics is used to construct a conserved probability current, $j^\mu = (\rho, \mathbf{j})$, where $\rho$ is the probability density. For the Schrödinger equation, the probability density $\rho = |\psi|^2$ is always non-negative. For the Klein-Gordon equation, the derived conserved density is:

$$\rho = \frac{i\hbar}{2mc^2}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right)$$

Because the equation is second-order in time, the values of $\psi$ and $\partial\psi/\partial t$ can be chosen independently at any given initial time. This allows for conditions where $\rho$ can be negative, which is nonsensical for a probability density. A particle cannot have a negative probability of being found somewhere.

These problems stem from the second-order nature of the time derivative. Dirac's goal was to find a new equation that was simultaneously first-order in time (like the Schrödinger equation) and relativistically covariant. The natural approach is to write a Hamiltonian linear in momentum with to-be-defined coefficients:

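$$\hat{H} = c\left(\alpha_x\hat{p}_x + \alpha_y\hat{p}_y + \alpha_z\hat{p}_z\right) + \beta mc^2$$

where $\hat{p}_k = -i\hbar\,\partial/\partial x_k$ are the momentum operators. The corresponding wave equation is:

$$i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi = \left(-i\hbar c\sum_{k=1}^{3}\alpha_k\frac{\partial}{\partial x_k} + \beta mc^2\right)\psi = \left(-i\hbar c\,\boldsymbol{\alpha}\cdot\nabla + \beta mc^2\right)\psi$$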

For this equation to be relativistically consistent, any free-particle solution to it must also satisfy the relativistic energy-momentum relation (i.e., the Klein-Gordon equation). This requires that when the Dirac operator is applied twice ("squaring" the equation), it yields the Klein-Gordon operator:

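$$\left(c\sum_i\alpha_ip_i + \beta mc^2\right)^2 = c^2\sum_i p_i^2 + m^2c^4$$

Expanding the left side (noting that the $p_i$ commute with each other, but the coefficient matrices $\alpha_i, \beta$ do not necessarily commute):

$$c^2\sum_i\alpha_i^2p_i^2 + c^2\sum_{i < j}\{\alpha_i,\alpha_j\}\,p_ip_j + mc^3\sum_i\{\alpha_i,\beta\}\,p_i + \beta^2m^2c^4$$

Comparing the coefficients on both sides, we derive the algebraic requirements for $\alpha_i$ and $\beta$, known as the Clifford Algebra:

  1. $\alpha_i^2 = I$
  2. $\beta^2 = I$
  3. $\{\alpha_i,\alpha_j\} = 0 \quad (i \neq j)$
  4. $\{\alpha_i,\beta\} = 0$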

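To make the equation manifestly covariant under Lorentz transformations, we define the Gamma matrices $\gamma^\mu$:

$$\gamma^0 = \beta,\quad \gamma^i = \beta\alpha_i \quad (i = 1, 2, 3)$$

Multiplying the wave equation by $\beta/c$ and substituting these definitions, we obtain the concise covariant form:

$$\left(i\hbar\gamma^\mu\partial_\mu - mc\right)\psi = 0,\quad \text{where } \partial_\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right)$$

Using the properties of $\alpha_i$ and $\beta$, we can derive the core properties of $\gamma^\mu$:

$$(\gamma^0)^2 = \beta^2 = I,\quad (\gamma^i)^2 = \beta\alpha_i\beta\alpha_i = -\beta^2\alpha_i^2 = -I,\quad \{\gamma^0,\gamma^i\} = \beta^2\alpha_i + \beta\alpha_i\beta = \alpha_i - \alpha_i = 0,\quad \{\gamma^i,\gamma^j\} = -\{\alpha_i,\alpha_j\} = 0 \ (i \neq j)$$

Combining these results, the Gamma matrices satisfy the famous Anti-commutation Relation:

$$\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}I$$

where $\eta^{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$ is the Minkowski metric.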

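The smallest irreducible representation satisfying this algebra involves 4×4 matrices. We can deduce this by elimination. Each of $\alpha_i$ and $\beta$ squares to $I$, so its eigenvalues are $\pm 1$, and the anticommutation relations force each matrix to be traceless (for example, $\mathrm{Tr}\,\alpha_i = \mathrm{Tr}(\beta^2\alpha_i) = -\mathrm{Tr}(\beta\alpha_i\beta) = -\mathrm{Tr}\,\alpha_i$), so the dimension must be even. In two dimensions there are only three mutually anticommuting matrices (the Pauli matrices), but we need four, so the smallest possible dimension is 4.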

In 4D, the most common choice is the Dirac Representation (Standard Representation), which uses the 2×2 identity matrix I and Pauli matrices σi:

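$$\gamma^0 = \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix},\quad \gamma^i = \beta\alpha_i = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}\begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix} = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}$$

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad \sigma_y = \begin{pmatrix} 0 & -i \\ +i & 0 \end{pmatrix},\quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$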

This is a profound result: the requirement for a linear, relativistic wave equation forces the wave function ψ to have four components, which we now understand as describing the electron (spin up/down) and its antiparticle, the positron (spin up/down). Dirac attempted to solve the negative probability problem but inadvertently unlocked the door to spin. This demonstrates that magnetism is not an added property of matter, but an inevitable consequence of spacetime symmetry.
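The algebra of this section can be verified directly in the Dirac representation. The following is a minimal sketch, assuming NumPy; the helper name `anticomm` and the variable names are illustrative choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z
]

# Dirac (standard) representation of beta and alpha_i
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

# gamma^0 = beta, gamma^i = beta alpha_i
gamma = [beta] + [beta @ a for a in alpha]
eta = np.diag([1.0, -1.0, -1.0, -1.0])             # Minkowski metric (+,-,-,-)

def anticomm(a, b):
    """Anticommutator {a, b} = ab + ba."""
    return a @ b + b @ a

# The four Clifford-algebra conditions on alpha_i and beta
alpha_beta_ok = (
    all(np.allclose(a @ a, np.eye(4)) for a in alpha)
    and np.allclose(beta @ beta, np.eye(4))
    and all(np.allclose(anticomm(alpha[i], alpha[j]), 0)
            for i in range(3) for j in range(3) if i != j)
    and all(np.allclose(anticomm(a, beta), 0) for a in alpha)
)

# {gamma^mu, gamma^nu} = 2 eta^{mu nu} I
clifford_ok = all(
    np.allclose(anticomm(gamma[m], gamma[n]), 2 * eta[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)
```

Both flags evaluate to `True`, confirming that this particular 4×4 choice satisfies every condition derived above.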

4. Coupling with Electromagnetic Field

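Now we consider the case with an electromagnetic potential $A^\mu = (\Phi/c, \mathbf{A})$. We couple it to the Dirac equation through minimal coupling, the substitution that keeps the equation invariant under local gauge transformations. Momentum substitution: replace the momentum $\mathbf{p}$ with the kinetic momentum, $\mathbf{p} \to \mathbf{p} - q\mathbf{A}$, where $\mathbf{A}$ is the magnetic vector potential. Energy substitution: replace the energy operator $i\hbar\,\partial_t$ according to $i\hbar\,\partial_t \to i\hbar\,\partial_t - q\Phi$, where $\Phi$ is the electric scalar potential. The equation becomes:

$$i\hbar\frac{\partial\psi}{\partial t} = \left[c\,\boldsymbol{\alpha}\cdot(\mathbf{p} - q\mathbf{A}) + \beta mc^2 + q\Phi\right]\psi$$

To handle 4D spacetime more intuitively, we usually write this in covariant form. Introducing the Gamma matrix definitions $\gamma^0 = \beta$, $\gamma^i = \beta\alpha_i$, and combining the 4-momentum operator $p_\mu = i\hbar\partial_\mu = \left(i\hbar\frac{1}{c}\frac{\partial}{\partial t}, i\hbar\nabla\right)$ with the 4-potential $A^\mu = (\Phi/c, \mathbf{A})$, i.e. $A_\mu = (\Phi/c, -\mathbf{A})$, the equation can be rearranged as:

$$\gamma^0\left(i\hbar\frac{1}{c}\frac{\partial}{\partial t} - \frac{q\Phi}{c}\right)\psi + \boldsymbol{\gamma}\cdot\left(i\hbar\nabla + q\mathbf{A}\right)\psi - mc\,\psi = 0$$

Using the covariant derivative $D_\mu = \partial_\mu + \frac{iq}{\hbar}A_\mu$, it can be written in the minimalist form:

$$\left(i\hbar\gamma^\mu D_\mu - mc\right)\psi = 0 \quad\text{or}\quad \gamma^\mu\left(p_\mu - qA_\mu\right)\psi = mc\,\psi$$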

We now wish to solve this four-component spinor equation. The simple block form of the gamma matrices allows us to group the four components into pairs. Furthermore, in the non-relativistic limit, the rest energy $mc^2$ is the dominant term. We write the wavefunction as a combination of a "large component" $\phi$ and a "small component" $\chi$ (both are 2-component spinors), explicitly extracting the time evolution of the rest energy:

$$\psi(\mathbf{r},t) = \begin{pmatrix}\phi(\mathbf{r},t) \\ \chi(\mathbf{r},t)\end{pmatrix} e^{-imc^2t/\hbar}$$

Applying the product rule to the time derivative on the left side:

$$i\hbar\frac{\partial\psi}{\partial t} = \left(i\hbar\frac{\partial}{\partial t}\begin{pmatrix}\phi \\ \chi\end{pmatrix} + mc^2\begin{pmatrix}\phi \\ \chi\end{pmatrix}\right)e^{-imc^2t/\hbar}$$

Substituting this into the Dirac equation and using the matrix form in the standard Dirac representation, we cancel the exponential term on both sides to obtain:

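$$i\hbar\frac{\partial}{\partial t}\begin{pmatrix}\phi \\ \chi\end{pmatrix} + mc^2\begin{pmatrix}\phi \\ \chi\end{pmatrix} = c\begin{pmatrix}0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0\end{pmatrix}\cdot\boldsymbol{\pi}\begin{pmatrix}\phi \\ \chi\end{pmatrix} + \begin{pmatrix}mc^2 & 0 \\ 0 & -mc^2\end{pmatrix}\begin{pmatrix}\phi \\ \chi\end{pmatrix} + q\Phi\begin{pmatrix}\phi \\ \chi\end{pmatrix}$$

where $\boldsymbol{\pi} = \mathbf{p} - q\mathbf{A}$ is the kinetic momentum. We decompose the above equation into two coupled equations:

$$\begin{aligned} i\hbar\frac{\partial\phi}{\partial t} + mc^2\phi &= c(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\chi + mc^2\phi + q\Phi\phi \\ i\hbar\frac{\partial\chi}{\partial t} + mc^2\chi &= c(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\phi - mc^2\chi + q\Phi\chi \end{aligned}$$

In the non-relativistic limit, the electron's kinetic and potential energies are much smaller than its rest energy, i.e., $|i\hbar\,\partial_t\chi| \ll |2mc^2\chi|$ and $|q\Phi\chi| \ll |2mc^2\chi|$. Thus, the equation for the small component approximates to:

$$2mc^2\chi \approx c(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\phi \implies \chi \approx \frac{\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2mc}\phi$$

This explains why $\chi$ is called the "small component": its magnitude is roughly $v/c$ times that of the large component. Substituting the expression for $\chi$ back into the equation for the large component $\phi$:

$$i\hbar\frac{\partial\phi}{\partial t} = \left[\frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{2m} + q\Phi\right]\phi$$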

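Here lies the key to the emergence of spin. Using the Pauli matrix identity $(\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol{\sigma}\cdot(\mathbf{A}\times\mathbf{B})$:

$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 = \boldsymbol{\pi}\cdot\boldsymbol{\pi} + i\boldsymbol{\sigma}\cdot(\boldsymbol{\pi}\times\boldsymbol{\pi})$$

We calculate the operator cross product $\boldsymbol{\pi}\times\boldsymbol{\pi}$ acting on a test function $f$:

$$(\boldsymbol{\pi}\times\boldsymbol{\pi})f = (\mathbf{p} - q\mathbf{A})\times(\mathbf{p} - q\mathbf{A})f = \left(\mathbf{p}\times\mathbf{p} - q\,\mathbf{p}\times\mathbf{A} - q\,\mathbf{A}\times\mathbf{p} + q^2\mathbf{A}\times\mathbf{A}\right)f$$

Since $\mathbf{p}\times\mathbf{p} = 0$ and $\mathbf{A}\times\mathbf{A} = 0$, the remaining terms are:

$$-q(\mathbf{p}\times\mathbf{A} + \mathbf{A}\times\mathbf{p})f = -q\left[(-i\hbar\nabla)\times(\mathbf{A}f) + \mathbf{A}\times(-i\hbar\nabla f)\right] = i\hbar q\left[(\nabla\times\mathbf{A})f - \mathbf{A}\times(\nabla f) + \mathbf{A}\times(\nabla f)\right] = i\hbar q(\nabla\times\mathbf{A})f = i\hbar q\,\mathbf{B}f$$

Therefore:

$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 = \boldsymbol{\pi}^2 + i\boldsymbol{\sigma}\cdot(i\hbar q\,\mathbf{B}) = \boldsymbol{\pi}^2 - \hbar q\,\boldsymbol{\sigma}\cdot\mathbf{B}$$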

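Finally, we have fully derived the Pauli Equation, which explicitly includes the Zeeman term $U = -\boldsymbol{\mu}\cdot\mathbf{B}$:

$$i\hbar\frac{\partial\phi}{\partial t} = \left[\frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi - \frac{q\hbar}{2m}\boldsymbol{\sigma}\cdot\mathbf{B}\right]\phi$$

Clearly, an extra term has appeared in the Hamiltonian:

$$H = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi - \frac{q\hbar}{2m}\boldsymbol{\sigma}\cdot\mathbf{B}$$

This is the origin of spin. Comparing the last term with $U = -\boldsymbol{\mu}\cdot\mathbf{B}$ identifies the intrinsic magnetic moment $\boldsymbol{\mu} = \frac{q\hbar}{2m}\boldsymbol{\sigma}$. If we hadn't introduced the $\boldsymbol{\sigma}$ matrices (i.e., the spin-0 Klein-Gordon case), the kinetic operator term would simply be $\boldsymbol{\pi}^2$. This term would not produce the extra linear coupling to the magnetic field $\mathbf{B}$, degenerating instead into the standard electromagnetic coupling Hamiltonian for a spinless particle: $H_{\text{Spin-0}} = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$. It is precisely because the relativistic covariance of the electron wavefunction requires a matrix structure that the non-commutativity of $\boldsymbol{\pi}$ is transformed into a magnetic energy term.

In simple terms: introducing the $\boldsymbol{\sigma}$ matrices mathematically adds an "internal axis of rotation" to the particle. Without it, the particle is a completely "isotropic" point in space. The non-commutativity itself does not come from $\boldsymbol{\sigma}$: even in a spin-0 world, the three components of momentum $\pi_x, \pi_y, \pi_z$ do not commute in a magnetic field, but their effect is "external" (changing the particle's trajectory). Introducing the $\boldsymbol{\sigma}$ matrices directly couples this non-commutativity (the magnetic field strength $\mathbf{B}$) to the internal dimensions of the wavefunction.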
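The Pauli identity used in this derivation, and the two-level splitting produced by the resulting $\boldsymbol{\sigma}\cdot\mathbf{B}$ term, can be checked numerically for ordinary (commuting) vectors. This is a minimal sketch, assuming NumPy and units in which $q\hbar/2m = 1$; the helper name `sdot` is an illustrative choice.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sdot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return np.einsum("i,ijk->jk", v, sigma)

rng = np.random.default_rng(1)
A = rng.normal(size=3)
B = rng.normal(size=3)

# (sigma.A)(sigma.B) = A.B + i sigma.(A x B)
lhs = sdot(A) @ sdot(B)
rhs = np.dot(A, B) * np.eye(2) + 1j * sdot(np.cross(A, B))
identity_holds = np.allclose(lhs, rhs)

# Zeeman term with q*hbar/(2m) set to 1 (assumption): H_Z = -sigma . B
zeeman_levels = np.linalg.eigvalsh(-sdot(B))   # ascending order
```

The two entries of `zeeman_levels` are $-|\mathbf{B}|$ and $+|\mathbf{B}|$: the spin term splits every level into two, with a gap proportional to the field strength.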

5. Universal Covering between SU(2) and SO(3)

We have already observed the "emergence" of spin from the dynamical level. However, as an intrinsic property, spin should theoretically not require an external electromagnetic field to manifest; fundamentally, its existence must be linked to symmetry and transformation. Why must the wavefunction be operated on by σ matrices? Why is it a two-component spinor? This requires us to delve into the kinematic underpinnings—specifically, the group-theoretical basis of spacetime symmetry. Even without considering electromagnetic fields, as a physical object, when we rotate the laboratory coordinate system by an angle θ, the electron's wavefunction ψ must inevitably change. It is precisely this "transformation rule under rotation" that defines spin. Before diving into the physics, let us briefly introduce the most important groups: SU(2) and SO(3). These are Lie groups and hold an extremely important place in physics.

The basic definition of SU(2) is:

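$$SU(2) \equiv \left\{U \mid U \in GL(2,\mathbb{C}),\ U^\dagger U = 1_{2\times 2},\ |U| = 1\right\} \equiv \left\{\begin{bmatrix} a & b \\ -b^* & a^* \end{bmatrix} \,\middle|\, a, b \in \mathbb{C},\ |a|^2 + |b|^2 = 1\right\} \equiv \left\{U(\mathbf{n},\omega) = e^{i\frac{\omega}{2}\mathbf{n}\cdot\boldsymbol{\sigma}} \,\middle|\, \omega \in [0, 2\pi],\ \mathbf{n} \text{ any 3D real unit vector}\right\}$$

If we use real parameters $x_i \in \mathbb{R}$ and let $a = x_4 + ix_3$, $b = x_2 + ix_1$ to describe it:

$$U = \begin{bmatrix} a & b \\ -b^* & a^* \end{bmatrix} = \begin{bmatrix} x_4 + ix_3 & x_2 + ix_1 \\ -x_2 + ix_1 & x_4 - ix_3 \end{bmatrix}$$

The constraint becomes $x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1$, indicating that SU(2) as a manifold is $S^3$, a 3-sphere (hypersphere). Its $T^2$-fibration is described as:

$$\begin{cases} x_1 = \sin\theta\cos\varphi \\ x_2 = \sin\theta\sin\varphi \end{cases},\quad \begin{cases} x_3 = \cos\theta\cos\chi \\ x_4 = \cos\theta\sin\chi \end{cases},\quad \text{where } \theta \in [0, \pi/2];\ \varphi, \chi \in [0, 2\pi]$$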

This can be described using two opposing conical surfaces for χ,φ and an axis θ. At θ=0 or θ=π/2, one parameter becomes degenerate. It can also be described as a "doughnut" (solid torus) that scales with θ, causing a parameter to degenerate at the endpoints. When θ=0, the doughnut is a circle with zero width; only χ is valid along the circle, while φ is invalid because the circle has no width. When θ=π/2, the doughnut becomes a sphere with no hole, so χ becomes invalid and only φ is effective.

The spherical coordinate system description is:

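For $\omega \in [0, 2\pi]$, $\theta \in [0, \pi]$, $\varphi \in [0, 2\pi]$ we have

$$\begin{cases} x_1 = \sin\frac{\omega}{2}\sin\theta\cos\varphi \\ x_2 = \sin\frac{\omega}{2}\sin\theta\sin\varphi \\ x_3 = \sin\frac{\omega}{2}\cos\theta \\ x_4 = \cos\frac{\omega}{2} \end{cases}$$

SU(2) can also be expressed by Pauli matrices:

$$U = \begin{bmatrix} x_4 + ix_3 & x_2 + ix_1 \\ -x_2 + ix_1 & x_4 - ix_3 \end{bmatrix} = x_4 1_{2\times 2} + ix_1\sigma_1 + ix_2\sigma_2 + ix_3\sigma_3.$$

Combining this further with spherical coordinates, we have:

$$U(\mathbf{n},\omega) = e^{i\frac{\omega}{2}\mathbf{n}\cdot\boldsymbol{\sigma}} = 1_{2\times 2}\cos\frac{\omega}{2} + i\,n_a\sigma_a\sin\frac{\omega}{2},\quad \mathbf{n} = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta);\ \omega \in [0, 2\pi],\ \theta \in [0, \pi],\ \varphi \in [0, 2\pi].$$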
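The equivalence of these parametrizations can be checked on a sample element. This is a minimal sketch, assuming NumPy; the function name `su2` and the sample angles are illustrative choices.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def su2(n, omega):
    """U(n, omega) = cos(omega/2) 1 + i sin(omega/2) n.sigma for a unit vector n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = np.einsum("i,ijk->jk", n, sigma)
    return np.cos(omega / 2) * np.eye(2) + 1j * np.sin(omega / 2) * n_sigma

theta, phi, omega = 0.7, 1.9, 2.3            # arbitrary sample angles
n = [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
U = su2(n, omega)

# Read off a = x4 + i x3 and b = x2 + i x1 from the first row of U
a, b = U[0, 0], U[0, 1]
x = np.array([b.imag, b.real, a.imag, a.real])   # (x1, x2, x3, x4)
```

The four real numbers in `x` lie on the unit 3-sphere, and `U` is unitary with unit determinant, as the definition requires.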

The basic definition of SO(3) is:

$$SO(3) \equiv \left\{R \mid R \in GL(3,\mathbb{R}),\ R^TR = 1_{3\times 3},\ |R| = 1\right\} \equiv \left\{R(\boldsymbol{\omega}) \mid \boldsymbol{\omega} = \omega\mathbf{n},\ \mathbf{n} = (\cos\varphi\sin\theta, \sin\varphi\sin\theta, \cos\theta),\ \omega \in [0, \pi],\ \theta \in [0, \pi],\ \varphi \in [0, 2\pi]\right\}.$$

SO(3) is more intuitive; it represents the rotation operations we can see and touch. As a manifold, SO(3) can be viewed as a solid ball of radius $\pi$ formed by the endpoints of $\boldsymbol{\omega}$ with antipodal identification on its surface. Its abstraction lies in this antipodal identification. Although its origin is the simple fact that rotating 180° counter-clockwise around a fixed axis is the same as rotating 180° clockwise, this antipodal identification results in the solid ball being a connected manifold but not a simply connected one (i.e., not every closed curve or loop in the space can be continuously shrunk to a point). The solid ball with antipodal identification has a name: Real Projective Space of 3 dimensions, denoted as $\mathbb{RP}^3$.

Mathematically, a representation of a group $G$ on a vector space $V$ is a homomorphism from the group $G$ to the general linear group $GL(V)$ (the group of all invertible transformations on $V$): $\forall g_1, g_2 \in G,\ D(g_1g_2) = D(g_1)D(g_2)$. For Lie groups, the map is also required to be continuous. Projective representations arise because, in quantum mechanics, physical states are described by rays in Hilbert space (i.e., $|\psi\rangle$ and $e^{i\alpha}|\psi\rangle$ represent the same physical state). Therefore, the group multiplication law only needs to hold up to a phase factor:

$$D(g_1)D(g_2) = \omega(g_1,g_2)\,D(g_1g_2)$$

where $\omega(g_1,g_2)$ is a complex number with modulus 1, called a multiplier (or phase factor) of the representation. Bargmann's Theorem (1954) provides a rigorous mathematical framework for projective representations: for a Lie group $G$ whose Lie algebra satisfies $H^2(\mathfrak{g},\mathbb{R}) = 0$ (including SO(3) and the Lorentz group), all continuous projective unitary representations can be "lifted" to ordinary unitary representations of its universal covering group $\tilde{G}$. For SO(3), where the "phase ambiguity" cannot be removed simply, we solve this by finding its Universal Covering Group, SU(2), and converting the projective representation of SO(3) into an ordinary representation of SU(2).

The concept of the universal covering group introduced here highlights the profound connection between SO(3) and SU(2). In topology, the universal covering space $\tilde{X}$ of a space $X$ is like an "upgraded version" of it, possessing two core features: simple connectedness (all loops in $\tilde{X}$ can be shrunk to a point; no topological holes) and local isomorphism (in local regions, $\tilde{X}$ looks exactly like $X$, but globally $\tilde{X}$ is often "larger" and covers $X$ in an $n{:}1$ manner).

Why do SU(2) matrices generate SO(3) rotations? Here is a classic construction method. We map a vector x=(x,y,z) in 3D space to a second-order traceless Hermitian matrix X:

$$X = x\sigma_1 + y\sigma_2 + z\sigma_3 = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix}$$

(where $\sigma_i$ are the Pauli matrices). Note that $\det(X) = -(x^2 + y^2 + z^2) = -|\mathbf{x}|^2$. Let a matrix $U$ in SU(2) act on $X$ via the following transformation: $X' = UXU^\dagger$. Since $U$ is unitary and has a determinant of 1, this transformation preserves the tracelessness and Hermiticity of $X$, and preserves the determinant: $\det(X') = \det(UXU^\dagger) = \det(X)$. This means $|\mathbf{x}'|^2 = |\mathbf{x}|^2$, i.e., the transformation preserves the length of the vector; since SU(2) is connected and contains the identity, the transformation is a proper 3D rotation. Observing the transformation formula $X' = UXU^\dagger$, if we replace $U$ with $-U$: $(-U)X(-U)^\dagger = (-1)^2UXU^\dagger = UXU^\dagger$, we find that $U$ and $-U$ produce exactly the same rotation. This is the algebraic root of the 2:1 covering: every rotation in SO(3) corresponds to two points in SU(2). This explains why rotating by $2\pi$ does not return to the identity (in SU(2), the path has only covered half of a closed loop and ends at $-I$), while rotating by $4\pi$ is required to return to the identity (completing a full loop in SU(2)). That is to say, SO(3) itself has a "hole" (its fundamental group is $\mathbb{Z}_2$), while SU(2) (i.e., $S^3$) is simply connected (its fundamental group is trivial) and has no topological holes.
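The construction above can be turned into an explicit map from SU(2) to SO(3). Expanding $X' = UXU^\dagger$ in the Pauli basis gives $x'_i = R_{ij}x_j$ with $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U\sigma_j U^\dagger)$. The following is a minimal sketch, assuming NumPy; the function names `su2` and `rotation_from_su2` and the sample element are illustrative choices.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def su2(n, omega):
    """U(n, omega) = cos(omega/2) 1 + i sin(omega/2) n.sigma for a unit vector n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = np.einsum("i,ijk->jk", n, sigma)
    return np.cos(omega / 2) * np.eye(2) + 1j * np.sin(omega / 2) * n_sigma

def rotation_from_su2(U):
    """R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger), induced by X -> U X U^dagger."""
    R = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            R[i, j] = 0.5 * np.trace(sigma[i] @ U @ sigma[j] @ U.conj().T).real
    return R

U = su2([1.0, 2.0, -0.5], 1.3)      # arbitrary sample element of SU(2)
R_plus = rotation_from_su2(U)
R_minus = rotation_from_su2(-U)     # the other preimage of the same rotation
```

`R_plus` is a proper orthogonal matrix and coincides with `R_minus`, which is the 2:1 covering in concrete form.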

The second condition of the universal cover, local consistency, leads to another major feature of SU(2) and SO(3): they are locally isomorphic near the identity element. This means that if you look only at "infinitesimal" rotations, or rotate only a tiny bit, the two groups are exactly the same. It is only when you rotate a large amount (e.g., $2\pi$) to explore the "full picture" of the group that you discover they are different (one returns to the start, the other goes to $-I$). Mathematically speaking, this is because they possess the exact same Lie Algebra, i.e., an isomorphism of the tangent spaces at the identity: $\mathfrak{su}(2) \cong \mathfrak{so}(3)$. From solving the Schrödinger equation, we know that the 3 generators of SO(3) are $J_x, J_y, J_z$ (corresponding to infinitesimal rotations about the x, y, z axes), and their commutation relations are (in units where $\hbar = 1$): $[J_i, J_j] = i\epsilon_{ijk}J_k$. This formula defines the essence of 3D rotation; the so-called generators form a basis for the Lie algebra.

For the SU(2) group, let us now solve for its generators. Assume an infinitesimal transformation:

$$U(\epsilon) = I - i\epsilon S$$

For this to belong to SU(2), there is a unitarity constraint: $(I + i\epsilon S^\dagger)(I - i\epsilon S) = I \implies I - i\epsilon(S - S^\dagger) + O(\epsilon^2) = I \implies S = S^\dagger$. Thus, $S$ must be a Hermitian matrix, so it can represent a physical observable. There is also the "special" constraint: using the formula $\det(e^A) = e^{\mathrm{Tr}(A)}$, we have $\det(U) = \det(e^{-i\epsilon S}) = e^{-i\epsilon\,\mathrm{Tr}(S)} = 1 \implies \mathrm{Tr}(S) = 0$. So $S$ must be a traceless matrix. The traceless Hermitian 2×2 matrices are spanned by exactly the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$, which form a complete basis for the Lie algebra $\mathfrak{su}(2)$. Thus, the generator $S$ must be proportional to $\boldsymbol{\sigma}$. Since $[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k$, we find this differs from $[J_i, J_j] = i\epsilon_{ijk}J_k$ only by a normalization factor of $\frac{1}{2}$ on the generators. This already demonstrates that the Lie algebras of SU(2) and SO(3) are isomorphic. If we take $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$, it becomes the standard $[S_i, S_j] = i\epsilon_{ijk}S_k$, exactly the same commutation relations as $\mathbf{J}$.

This gives us a self-consistent theory of angular momentum, where in the physical world, total angular momentum $\mathbf{L} = \mathbf{J} + \mathbf{S}$ exists. If spin $\mathbf{S}$ is to qualify as "angular momentum" and be additive with $\mathbf{J}$ to form a conserved quantity, $\mathbf{S}$ must follow the exact same algebraic rules as $\mathbf{J}$. Since orbital angular momentum $\mathbf{J} = \mathbf{r}\times\mathbf{p}$ is defined by spatial coordinates and its commutation relations are fixed (derived from the commutation of $x$ and $p$), we have no choice but to let $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$. Beyond theoretical consistency, real-world experimental results confirm this: when we perform the Stern-Gerlach experiment, measuring the deflection of electrons in a magnetic field, the measured spin components are $\pm\frac{1}{2}$ (in units of $\hbar$). The operator $\mathbf{S}$ representing this observable must therefore have eigenvalues of $\pm 1/2$, and $\frac{1}{2}\sigma_z$ has exactly these eigenvalues (since the eigenvalues of $\sigma_z$ are $\pm 1$).

Returning to representations: for Lie groups, a useful property of generators is that any finite transformation $D(\theta)$ can be obtained from the generator $\mathbf{J}$ via the exponential map. If $J$ is an element of the Lie algebra $\mathfrak{g}$, then elements of the group can be written as:

$$D(\theta) = \exp(-i\theta\,\mathbf{n}\cdot\mathbf{J})$$

Here $\mathbf{J}$ is the angular momentum operator (matrix) we discussed. Note that this expression involves a common "abuse of notation" or shorthand in physics. Strictly speaking, the exponential map $\exp$ maps abstract Lie algebra elements to abstract Lie group elements. However, the physical formula $D(\theta) = \exp(-i\theta\,\mathbf{n}\cdot\mathbf{J})$ is actually an operation within the representation space (matrix space). Since $\mathbf{J}$ here is the matrix representation of the generator, the result $D(\theta)$ is naturally the matrix representation of the group element.

Whether we are dealing with SO(3) or SU(2), their Lie algebras are isomorphic. This means they share the same set of generator commutation relations: $[J_i, J_j] = i\epsilon_{ijk}J_k$ (taking $\hbar = 1$). We want to find all finite-dimensional irreducible representations allowed by this set of algebraic rules. Define the ladder operators:

$$J_\pm = J_x \pm iJ_y$$

Introduce the eigenstates $|j,m\rangle$ of $J_z$ such that:

$$J_z|j,m\rangle = m|j,m\rangle,\quad J^2|j,m\rangle = \lambda|j,m\rangle$$

Calculating the commutator:

$$[J_z, J_\pm] = [J_z, J_x] \pm i[J_z, J_y] = iJ_y \pm i(-iJ_x) = \pm(J_x \pm iJ_y) = \pm J_\pm$$

This implies $J_\pm$ act as "ladders" for the eigenvalues:

$$J_z\left(J_\pm|j,m\rangle\right) = \left(J_\pm J_z + [J_z, J_\pm]\right)|j,m\rangle = (m \pm 1)\left(J_\pm|j,m\rangle\right)$$

If $m$ is an eigenvalue, then $m \pm 1$ are also eigenvalues, unless the ladder operator annihilates the state. However, since we are looking for finite-dimensional representations, the spectrum of eigenvalues must have an upper bound $m_{\max}$ and a lower bound $m_{\min}$.

$$J_+|j,m_{\max}\rangle = 0,\quad J_-|j,m_{\min}\rangle = 0$$

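Using the operator identity $J_-J_+ = J^2 - J_z^2 - J_z$ acting on the highest weight state $|j,m_{\max}\rangle$:

$$0 = \left(\lambda - m_{\max}^2 - m_{\max}\right)|m_{\max}\rangle \implies \lambda = m_{\max}(m_{\max} + 1)$$

For convenience, we label the maximum weight as $j$, i.e., $m_{\max} \equiv j$. So the eigenvalue of the Casimir operator is $j(j+1)$. Similarly, using $J_+J_- = J^2 - J_z^2 + J_z$ acting on the lowest weight state $|j,m_{\min}\rangle$:

$$0 = \left(j(j+1) - m_{\min}^2 + m_{\min}\right)|m_{\min}\rangle$$

Solving the equation $m_{\min}^2 - m_{\min} - j(j+1) = 0$, we get two solutions:

$$m_{\min} = -j \quad\text{or}\quad m_{\min} = j + 1$$

Since $m_{\min} \le m_{\max} = j$, we must have $m_{\min} = -j$. Climbing from $m_{\min} = -j$ to $m_{\max} = j$ by adding 1 each step, we must reach the top in an integer number of steps $k$:

$$m_{\max} - m_{\min} = j - (-j) = 2j = k \quad (k \in \mathbb{Z},\ k \ge 0) \implies j = \frac{k}{2}$$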

We arrive at the conclusion that, based solely on the Lie algebra structure, the allowed values for $j$ are $0, 1/2, 1, 3/2, 2, \dots$ However, the Lie algebra structure is only a local property. We must now test these results against the global group structure. The core criterion for this test is single-valuedness: if we transform a group element along a closed path back to the starting point (the identity), its representation matrix must also return to the identity matrix (for an ordinary representation).

SO(3) is the rotation group in 3D space. Rotating by $2\pi$ (360°) around any axis (say, the z-axis) restores physical space completely: $R_z(2\pi) = R_z(0) = 1$, which is the group identity. An ordinary representation $D$ of SO(3) must therefore satisfy $D(R_z(2\pi)) = D(1) = I$ (the identity matrix). Substituting the formula derived from the Lie algebra, where $J_z$ is diagonal in the z-basis with diagonal elements $m$: $D(2\pi) = \exp(-i2\pi J_z) = \mathrm{diag}(e^{-i2\pi m}, \dots)$. To make this matrix equal to the identity $I$, every diagonal element must be 1: $e^{-i2\pi m} = 1 \implies m \in \mathbb{Z}$. If $j$ is an integer ($0, 1, \dots$), then $m$ is an integer, and the condition is met. However, if $j$ is a half-integer ($1/2, 3/2, \dots$), then $m$ is a half-integer, and $e^{-i2\pi m} = -1 \neq 1$. Therefore, half-integer spins are strictly forbidden in ordinary representations of SO(3).

However, the geometric structure of SU(2) is different from SO(3). It is the Universal Covering Group of SO(3) (a 2:1 cover). In SU(2), the parameter $\theta = 2\pi$ corresponds not to the identity element, but to $U(2\pi) = -I \neq I$; only a rotation of $4\pi$ corresponds to the identity element. We find that the behavior of $D(2\pi)$ perfectly matches the behavior of the SU(2) group itself. We have effectively obtained an instance of Bargmann's Theorem on SO(3) and SU(2): the projective representations of the non-simply connected Lie group SO(3) are equivalent to the ordinary representations of its universal covering group SU(2). The final mapping relationship is:

| Spin $j$ | In Lie Algebra | In SU(2) | In SO(3) | Physical Particle |
|---|---|---|---|---|
| Integer ($0, 1, \dots$) | Exists | Ordinary Rep. (but not faithful, cannot distinguish $\pm I$) | Ordinary Rep. | Bosons (Photons, etc.) |
| Half-Integer ($1/2, \dots$) | Exists | Ordinary Rep. (Faithful Rep.) | Projective Rep. (Multi-valued) | Fermions (Electrons, etc.) |

"Spin" is able to "emerge" from this abstract mathematical structure because quantum mechanics defines "physical state" more leniently than classical mechanics, thereby releasing topological degrees of freedom that were masked by classical physics.

Now let us specifically calculate what the representations are for different j. We use the exponential map:

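$$D^{(j)}(\hat{n},\theta) = \sum_{k=0}^{\infty}\frac{(-i\theta)^k}{k!}\left(\hat{n}\cdot\mathbf{J}^{(j)}\right)^k$$

where $\hat{n}$ is the rotation axis unit vector and $\theta$ is the rotation angle.

Simplest case: $j = 0$ scalar representation. Dimension $d = 2(0) + 1 = 1$. The basis has only one state $|0,0\rangle$. Since $m$ can only be 0, the generator $J_z = [0]$. The ladder operators acting on the highest/lowest weight states are 0, so $J_+ = [0]$, $J_- = [0]$, and thus $J_x = 0$, $J_y = 0$, $J_z = 0$. The exponential map gives the representation $D(\theta) = e^{-i\theta\,\hat{n}\cdot 0} = 1$, also known as the trivial representation. This is a scalar; no matter how you rotate, the value is always multiplied by 1, remaining unchanged.

When $j = \frac{1}{2}$, the dimension is $2j + 1 = 2$. We already know the generator is $\mathbf{J} = \frac{1}{2}\boldsymbol{\sigma} \implies \hat{n}\cdot\mathbf{J} = \frac{1}{2}(\hat{n}\cdot\boldsymbol{\sigma})$. To calculate higher powers of $(\hat{n}\cdot\mathbf{J})$, recall the property of Pauli matrices: $(\hat{n}\cdot\boldsymbol{\sigma})^2 = I$ (Identity matrix). Thus, the power law for generators is:

$$(\hat{n}\cdot\mathbf{J})^2 = \left(\tfrac{1}{2}\hat{n}\cdot\boldsymbol{\sigma}\right)^2 = \tfrac{1}{4}(\hat{n}\cdot\boldsymbol{\sigma})^2 = \tfrac{1}{4}I,\quad (\hat{n}\cdot\mathbf{J})^3 = (\hat{n}\cdot\mathbf{J})^2(\hat{n}\cdot\mathbf{J}) = \tfrac{1}{4}(\hat{n}\cdot\mathbf{J})$$

The general term formula is:

$$(\hat{n}\cdot\mathbf{J})^{2k} = \left(\tfrac{1}{4}\right)^k I = \left(\tfrac{1}{2}\right)^{2k} I,\quad (\hat{n}\cdot\mathbf{J})^{2k+1} = \left(\tfrac{1}{2}\right)^{2k}(\hat{n}\cdot\mathbf{J}) = \left(\tfrac{1}{2}\right)^{2k+1}(\hat{n}\cdot\boldsymbol{\sigma})$$

Summing the series by splitting the exponential series into even and odd parts:

$$D^{(1/2)} = \sum_{k=0}^{\infty}\frac{(-i\theta)^k}{k!}(\hat{n}\cdot\mathbf{J})^k = \underbrace{\sum_{m=0}^{\infty}\frac{(-i\theta)^{2m}}{(2m)!}\left(\tfrac{1}{2}\right)^{2m} I}_{\text{Even terms}} + \underbrace{\sum_{m=0}^{\infty}\frac{(-i\theta)^{2m+1}}{(2m+1)!}\left(\tfrac{1}{2}\right)^{2m+1}(\hat{n}\cdot\boldsymbol{\sigma})}_{\text{Odd terms}}$$

Even term coefficient: $\sum_m\frac{(-1)^m}{(2m)!}\left(\frac{\theta}{2}\right)^{2m} = \cos\left(\frac{\theta}{2}\right)$. Odd term coefficient: $-i\sum_m\frac{(-1)^m}{(2m+1)!}\left(\frac{\theta}{2}\right)^{2m+1} = -i\sin\left(\frac{\theta}{2}\right)$. Finally, we get:

$$D^{(1/2)}(\hat{n},\theta) = \cos\left(\frac{\theta}{2}\right)I - i\sin\left(\frac{\theta}{2}\right)(\hat{n}\cdot\boldsymbol{\sigma})$$

This is the representation for $j = 1/2$. It maps rotation operations to 2×2 complex matrices. Checking $2\pi$: substitute $\theta = 2\pi$, so $\cos(\pi) = -1$, $\sin(\pi) = 0$. The result is $-I$.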
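The closed form can be compared against a direct truncation of the exponential series. This is a minimal sketch, assuming NumPy; the function names `series_rotation` and `spin_half_rotation`, the truncation order, and the sample axis are illustrative choices.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def series_rotation(K, theta, terms=60):
    """Truncated series sum_k (-i theta)^k / k! * K^k for a generator matrix K."""
    D = np.zeros_like(K)
    term = np.eye(K.shape[0], dtype=complex)    # k = 0 term
    for k in range(terms):
        D = D + term
        term = term @ (-1j * theta * K) / (k + 1)
    return D

def spin_half_rotation(n, theta):
    """Closed form cos(theta/2) I - i sin(theta/2) n.sigma."""
    n_sigma = np.einsum("i,ijk->jk", n, sigma)
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n_sigma

n = np.array([1.0, -2.0, 2.0]) / 3.0            # a unit vector
K = 0.5 * np.einsum("i,ijk->jk", n, sigma)      # n . J with J = sigma / 2

D_series = series_rotation(K, 1.1)
D_closed = spin_half_rotation(n, 1.1)
D_2pi = spin_half_rotation(n, 2 * np.pi)        # expected to be -I
D_4pi = spin_half_rotation(n, 4 * np.pi)        # expected to be +I
```

The series and the closed form agree, and the sign flip at $2\pi$ disappears only after a $4\pi$ rotation.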

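When $j = 1$, the dimension is $2j + 1 = 3$. We need 3×3 matrices. In the Cartesian basis commonly used in physics, the generators satisfy $(J_k)_{ab} = -i\epsilon_{kab}$. For example, the generator for rotation around the z-axis, $J_z$:

$$J_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

For an arbitrary axis $\hat{n}$, let matrix $K = \hat{n}\cdot\mathbf{J}$. Calculating powers of $J_z$ directly (other directions are similar):

$$J_z^2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \ (\text{Note: this is not } I),\quad J_z^3 = J_z^2J_z = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = J_z$$

We find a pattern: for $j = 1$ generators, they satisfy the characteristic equation $(\hat{n}\cdot\mathbf{J})^3 = (\hat{n}\cdot\mathbf{J})$. This means: odd terms ($k = 1, 3, 5, \dots$): $(\hat{n}\cdot\mathbf{J})^k = (\hat{n}\cdot\mathbf{J})$; even terms ($k = 2, 4, 6, \dots$): $(\hat{n}\cdot\mathbf{J})^k = (\hat{n}\cdot\mathbf{J})^2$; $k = 0$ term: $I$ (Identity matrix). Expanding the Taylor series again, but this time we must separate $I$ because $(\hat{n}\cdot\mathbf{J})^2 \neq I$.

$$D^{(1)} = I + \sum_{\text{odd } k}\frac{(-i\theta)^k}{k!}(\hat{n}\cdot\mathbf{J}) + \sum_{\text{even } k \ge 2}\frac{(-i\theta)^k}{k!}(\hat{n}\cdot\mathbf{J})^2$$

Odd term coefficient: $-i\left(\theta - \frac{\theta^3}{3!} + \dots\right) = -i\sin\theta$. Even term coefficient: $\left(-\frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \dots\right) = \cos\theta - 1$. Result:

$$D^{(1)}(\hat{n},\theta) = I - i\sin\theta\,(\hat{n}\cdot\mathbf{J}) + (\cos\theta - 1)(\hat{n}\cdot\mathbf{J})^2$$

This is the representation for $j = 1$ (Rodrigues' rotation formula in physics form). It maps rotation operations to 3×3 real matrices (although $\mathbf{J}$ contains $i$, $-i\mathbf{J}$ is a real matrix). Checking $2\pi$: substitute $\theta = 2\pi$, $\sin(2\pi) = 0$, $\cos(2\pi) = 1$, so $D^{(1)} = I - 0 + (1 - 1)(\dots) = I$.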
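The $j = 1$ formula can be checked against the familiar rotation matrix about the z-axis. This is a minimal sketch, assuming NumPy; the helper names `levi_civita` and `spin_one_rotation` are illustrative choices.

```python
import numpy as np

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# Cartesian-basis generators: (J_k)_ab = -i eps_kab
J = np.array([[[-1j * levi_civita(k, a, b) for b in range(3)]
               for a in range(3)] for k in range(3)])

def spin_one_rotation(n, theta):
    """D = I - i sin(theta) (n.J) + (cos(theta) - 1) (n.J)^2 for a unit vector n."""
    K = np.einsum("i,ijk->jk", n, J)
    return np.eye(3) - 1j * np.sin(theta) * K + (np.cos(theta) - 1) * (K @ K)

theta = 0.8
z_axis = np.array([0.0, 0.0, 1.0])
D_z = spin_one_rotation(z_axis, theta)

# Textbook rotation by theta about the z-axis
R_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])

D_2pi = spin_one_rotation(np.array([1.0, 2.0, 2.0]) / 3.0, 2 * np.pi)
```

`D_z` is real and equal to `R_z`, and a $2\pi$ rotation about any axis returns the identity, in contrast to the $j = 1/2$ case.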

Generally, we can establish generators and representations for all $j$. The construction of generators for all $j$ relies on three core matrix element formulas for angular momentum operators in quantum mechanics. As long as we have these three formulas, we can write out matrices for $j = 0, 3/2, 2$ or even $j = 100$. First, $J_z$ is a diagonal matrix:

$$\langle j,m'|J_z|j,m\rangle = m\,\delta_{m'm}$$

Then $J_+$ (raising operator) is a superdiagonal matrix:

$$\langle j,m+1|J_+|j,m\rangle = \sqrt{j(j+1) - m(m+1)}$$

$J_-$ (lowering operator) is a subdiagonal matrix: it is the Hermitian conjugate of $J_+$ (simply the transpose here, since the matrix elements are real). And $J_x$ and $J_y$ are composed of $J_\pm$:

$$J_x = \frac{1}{2}(J_+ + J_-),\quad J_y = \frac{1}{2i}(J_+ - J_-)$$

For example j=3/2: Spin-3/2 representation. This belongs to fermions, similar to electrons, but has 4 components. Commonly seen in Δ baryons or gravitinos in supergravity. Dimension: d=2(3/2)+1=4. Construct generator Jz (diagonal):

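$$J_z = \begin{pmatrix} 3/2 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 0 \\ 0 & 0 & -1/2 & 0 \\ 0 & 0 & 0 & -3/2 \end{pmatrix}$$

$J_+$ (raising operator coefficients) requires calculating $\sqrt{j(j+1) - m(m+1)}$, where $j = 3/2$, which is $\sqrt{3.75 - m(m+1)}$. $m = 1/2 \to 3/2$: $\sqrt{3.75 - 0.75} = \sqrt{3}$; $m = -1/2 \to 1/2$: $\sqrt{3.75 - (-0.25)} = \sqrt{4} = 2$; $m = -3/2 \to -1/2$: $\sqrt{3.75 - 0.75} = \sqrt{3}$. So:

$$J_+ = \begin{pmatrix} 0 & \sqrt{3} & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & \sqrt{3} \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Using $J_x = \frac{1}{2}(J_+ + J_+^\dagger)$:

$$J_x = \frac{1}{2}\begin{pmatrix} 0 & \sqrt{3} & 0 & 0 \\ \sqrt{3} & 0 & 2 & 0 \\ 0 & 2 & 0 & \sqrt{3} \\ 0 & 0 & \sqrt{3} & 0 \end{pmatrix}$$

Representation: exponentiating these generators gives a 4×4 unitary matrix. When rotating by $2\pi$, since the diagonal elements of $J_z$ are half-integers, it becomes $-I_{4\times 4}$. So $j = 3/2$ is also a faithful representation of SU(2) and a projective representation of SO(3).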
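The three matrix-element formulas translate directly into a generator builder for arbitrary $j$. This is a minimal sketch, assuming NumPy and the basis ordering $m = j, j-1, \dots, -j$; the function names `spin_matrices` and `rotation_z_2pi` are illustrative choices.

```python
import numpy as np

def spin_matrices(j):
    """Return (Jx, Jy, Jz) for spin j in the basis m = j, j-1, ..., -j."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                       # diagonal of Jz
    Jz = np.diag(m).astype(complex)
    # <j, m+1| J+ |j, m> = sqrt(j(j+1) - m(m+1)) fills the superdiagonal
    Jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Jm = Jp.conj().T                             # lowering operator
    Jx = 0.5 * (Jp + Jm)
    Jy = -0.5j * (Jp - Jm)                       # (J+ - J-) / (2i)
    return Jx, Jy, Jz

def rotation_z_2pi(j):
    """D(2 pi) = exp(-i 2 pi Jz), evaluated on the diagonal of Jz."""
    m = np.real(np.diag(spin_matrices(j)[2]))
    return np.diag(np.exp(-1j * 2 * np.pi * m))

Jx32, Jy32, Jz32 = spin_matrices(1.5)
s3 = np.sqrt(3.0)
Jx32_expected = 0.5 * np.array([
    [0, s3, 0, 0],
    [s3, 0, 2, 0],
    [0, 2, 0, s3],
    [0, 0, s3, 0],
])
```

For example, `rotation_z_2pi(1.5)` is $-I_{4\times 4}$ while `rotation_z_2pi(2)` is $+I_{5\times 5}$, reproducing the integer versus half-integer distinction for any $j$.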

Thus far, we have completely narrated the essence of spin from the perspective of symmetry and group theory. However, it seems we did not need relativistic corrections as in the previous chapter. Unlike the previous chapter where we started from the Dirac equation and directly derived the 4-component wavefunction, here we started from spatial rotation (SU(2)/SO(3)) and only derived the 2-component spinor (Pauli spinor) for j=1/2. Where did the other two components go? The reason lies again in relativistic effects; the current symmetry analysis only considered spatial rotations and did not consider Lorentz boosts. Only by introducing the Lorentz Group can we explain why the electron must be the direct sum of "left-handed" and "right-handed" SU(2) representations (2+2=4), thereby forming a perfect closed loop back to the structure of the Dirac equation.

6. Lorentz Group

To achieve complete spin, we must begin considering the true symmetry group: the Lorentz Group SO(1,3), which includes both rotations and boosts. When we attempt to find the "fundamental representation" of the Lorentz group, something extremely curious happens: the algebraic structure splits apart.

The definition of the Lorentz Group is:

$$O(1,3) \equiv \left\{\Lambda \mid \Lambda \in GL(4,\mathbb{R}),\ g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = g_{\rho\sigma}\right\},\quad \dim O(1,3) = 6,\quad g = \mathrm{diag}(1,-1,-1,-1)$$

Fundamentally, it is the group of linear transformations that preserve the metric of Minkowski spacetime. From the metric-preserving condition $\Lambda^Tg\Lambda = g \iff g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = g_{\rho\sigma}$, we can derive a constraint on the component $\Lambda^0{}_0$ by setting $\rho = \sigma = 0$:

$$1 = g_{\mu\nu}\Lambda^\mu{}_0\Lambda^\nu{}_0 = (\Lambda^0{}_0)^2 - \sum_i(\Lambda^i{}_0)^2 \implies (\Lambda^0{}_0)^2 = 1 + \sum_i(\Lambda^i{}_0)^2 \ge 1$$

This means that Lorentz transformations must have either $\Lambda^0{}_0 \ge 1$ or $\Lambda^0{}_0 \le -1$, so the group is already disconnected. This allows us to divide the Lorentz group into two manifolds: $O^+(1,3)$ and $O^-(1,3)$. The former is called the orthochronous Lorentz group. The latter does not contain the identity element and thus does not form a group; it is called the antichronous branch of the Lorentz group.

From the metric-preserving condition, we can also determine the value of the determinant:

$$|\Lambda^Tg\Lambda| = |g| \implies |\Lambda|^2|g| = |g| \implies |\Lambda|^2 = 1, \text{ i.e., } |\Lambda| = \pm 1$$

Transformations with $|\Lambda| = 1$ are denoted as $SO(1,3)$ and called the proper Lorentz group, while those with $|\Lambda| = -1$ are called the improper branch. Combining these two considerations, we can divide the Lorentz group $O(1,3)$ into four connected manifolds. However, in practice, we only need to study the proper orthochronous branch $SO^+(1,3)$. This is because the other three branches can be obtained by acting on $SO^+(1,3)$ with two specific Lorentz transformations: time reversal $T = T^{-1} = \mathrm{diag}(-1,1,1,1)$ and parity inversion $P = P^{-1} = \mathrm{diag}(1,-1,-1,-1)$. Moreover, real-world reference frame transformations are strictly orthochronous and proper.
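The two criteria (sign of the determinant and sign of $\Lambda^0{}_0$) can be applied mechanically to sort a given matrix into one of the four branches. This is a minimal sketch, assuming NumPy; the function names `boost_x` and `classify` and the sample velocity are illustrative choices.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])    # Minkowski metric (+,-,-,-)

def boost_x(beta):
    """Boost along x with velocity beta = v/c."""
    gamma = 1.0 / np.sqrt(1.0 - beta ** 2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

def classify(L):
    """Return (proper/improper, orthochronous/antichronous) for a Lorentz matrix."""
    if not np.allclose(L.T @ g @ L, g):
        raise ValueError("matrix does not preserve the Minkowski metric")
    det_label = "proper" if np.linalg.det(L) > 0 else "improper"
    time_label = "orthochronous" if L[0, 0] > 0 else "antichronous"
    return det_label, time_label

T = np.diag([-1.0, 1.0, 1.0, 1.0])      # time reversal
P = np.diag([1.0, -1.0, -1.0, -1.0])    # parity inversion
B = boost_x(0.6)

branches = {
    "boost": classify(B),
    "P": classify(P),
    "T": classify(T),
    "PT": classify(P @ T),
}
```

The four entries of `branches` land in four different components, and multiplying `B` by `P`, `T`, or `P @ T` moves it from the proper orthochronous branch into each of the other three.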

We focus on the proper orthochronous Lorentz group $SO^+(1,3)$. In this connected component, any transformation can be written as an exponential map from the identity. Just as SO(3) has 3 rotation generators, $SO^+(1,3)$ has a total of 6 degrees of freedom (3 rotations + 3 boosts), corresponding to 6 generators. Considering an infinitesimal transformation $\Lambda \approx I - i\epsilon X$, similar to the process above, we can write out two sets of generators: rotation generators $\mathbf{J} = (J_1, J_2, J_3)$, corresponding to spatial rotations (these are the familiar angular momentum operators); and boost generators $\mathbf{K} = (K_1, K_2, K_3)$, corresponding to velocity transformations along the x, y, z axes. These 6 generators satisfy the Lorentz Lie algebra $\mathfrak{so}(1,3)$ as follows:

Pure rotations are closed (SO(3) subalgebra):

$$[J_i, J_j] = i\epsilon_{ijk}J_k$$

Relation between rotations and boosts (boost operators themselves rotate like vectors):

$$[J_i, K_j] = i\epsilon_{ijk}K_k$$

Boosts are not closed among themselves: the composition of two boosts in different directions is not just a boost but also produces a rotation (the origin of Thomas precession). The negative sign here is a characteristic manifestation of the spacetime metric $g = \mathrm{diag}(1,-1,-1,-1)$, distinguishing it from the algebra of SO(4):

$$[K_i, K_j] = -i\epsilon_{ijk}J_k$$

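At this point, the algebraic structure still looks coupled ($\mathbf{J}$ and $\mathbf{K}$ are intertwined). To find irreducible representations, we introduce a non-unitary basis change (complexification). Define two new sets of operators $\mathbf{N}^+$ and $\mathbf{N}^-$:

$$\mathbf{N}^+ = \frac{1}{2}(\mathbf{J} + i\mathbf{K}),\quad \mathbf{N}^- = \frac{1}{2}(\mathbf{J} - i\mathbf{K})$$

Let's calculate the commutation relations for these new operators. First, look within $\mathbf{N}^+$:

$$\begin{aligned} [N_i^+, N_j^+] &= \frac{1}{4}[J_i + iK_i, J_j + iK_j] = \frac{1}{4}\left([J_i, J_j] + i[J_i, K_j] + i[K_i, J_j] - [K_i, K_j]\right) \\ &= \frac{1}{4}\left(i\epsilon_{ijk}J_k + i(i\epsilon_{ijk}K_k) + i(i\epsilon_{ijk}K_k) - (-i\epsilon_{ijk}J_k)\right) \\ &= \frac{1}{4}\left(2i\epsilon_{ijk}J_k - 2\epsilon_{ijk}K_k\right) = i\epsilon_{ijk}\frac{1}{2}(J_k + iK_k) = i\epsilon_{ijk}N_k^+ \end{aligned}$$

Similarly, we can verify that $[N_i^-, N_j^-] = i\epsilon_{ijk}N_k^-$. The most shocking result lies between $\mathbf{N}^+$ and $\mathbf{N}^-$:

$$[N_i^+, N_j^-] = \frac{1}{4}[J_i + iK_i, J_j - iK_j] = \dots = 0$$

This means: upon complexification, the Lie algebra of the Lorentz group splits into the direct sum of two mutually independent $\mathfrak{su}(2)$ algebras.

$$\mathfrak{so}(1,3)_{\mathbb{C}} \cong \mathfrak{su}(2)_L \oplus \mathfrak{su}(2)_R$$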

This is a massive simplification in group theory. Since we already know the representations of $\mathfrak{su}(2)$ inside out (labeled by spin $j$), the finite-dimensional irreducible representations of the Lorentz group can be uniquely labeled by a pair of half-integers or integers $(j_L, j_R)$. According to this decomposition, the most fundamental spinor representation is no longer unique; instead, there are two most basic choices (fundamental representations), corresponding to taking $j=1/2$ for one $\mathfrak{su}(2)$ and $j=0$ for the other. This introduces the concept of chirality.
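The commutation relations derived above can be checked numerically. The following is a minimal sketch in plain Python, assuming the 4-vector representation with signature (+,-,-,-) and the generator convention $(J_k)_{ab} = -i\epsilon_{kab}$, $(K_k)_{0k} = (K_k)_{k0} = i$; this explicit matrix form and all helper names (`rotation`, `boost`, `check_all`) are illustrative choices, not something fixed by the text.

```python
# Pure-Python sketch: check the Lorentz algebra in the 4-vector representation.
# Assumed convention (not from the text): signature (+,-,-,-),
# (J_k)_{ab} = -i*eps_{kab} on spatial indices, (K_k)_{0k} = (K_k)_{k0} = i.

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def zeros():
    return [[0j] * 4 for _ in range(4)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[r][c] + cb * b[r][c] for c in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return lin(1, matmul(a, b), -1, matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))

def rotation(k):
    m = zeros()
    for a in range(3):
        for b in range(3):
            m[a + 1][b + 1] = -1j * eps(k, a, b)
    return m

def boost(k):
    m = zeros()
    m[0][k + 1] = m[k + 1][0] = 1j
    return m

def eps_sum(i, j, ops, coeff):
    """Return coeff * sum_k eps_{ijk} * ops[k]."""
    out = zeros()
    for k in range(3):
        out = lin(1, out, coeff * eps(i, j, k), ops[k])
    return out

J = [rotation(k) for k in range(3)]
K = [boost(k) for k in range(3)]
N_plus = [lin(0.5, J[k], 0.5j, K[k]) for k in range(3)]    # (J + iK)/2
N_minus = [lin(0.5, J[k], -0.5j, K[k]) for k in range(3)]  # (J - iK)/2

def check_all():
    """True if every commutation relation quoted in the text holds."""
    for i in range(3):
        for j in range(3):
            checks = [
                close(comm(J[i], J[j]), eps_sum(i, j, J, 1j)),
                close(comm(J[i], K[j]), eps_sum(i, j, K, 1j)),
                close(comm(K[i], K[j]), eps_sum(i, j, J, -1j)),
                close(comm(N_plus[i], N_plus[j]), eps_sum(i, j, N_plus, 1j)),
                close(comm(N_minus[i], N_minus[j]), eps_sum(i, j, N_minus, 1j)),
                close(comm(N_plus[i], N_minus[j]), zeros()),
            ]
            if not all(checks):
                return False
    return True

print(check_all())
```

The last entry in `checks` is the decoupling statement: every component of $\mathbf{N}^+$ commutes with every component of $\mathbf{N}^-$.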

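Left-handed Weyl Spinor corresponds to the label $(1/2,0)$. It behaves as spin 1/2 under $\mathbf{N}^-$ and as a scalar under $\mathbf{N}^+$. This is a 2-component complex vector, denoted $\psi_L$. In this representation, $\mathbf{N}^-=\frac{1}{2}\boldsymbol{\sigma}$ and $\mathbf{N}^+=0$. Solving for the physical generators gives:

$$\mathbf{J} = \mathbf{N}^+ + \mathbf{N}^- = \frac{1}{2}\boldsymbol{\sigma}, \quad \mathbf{K} = -i(\mathbf{N}^+ - \mathbf{N}^-) = +\frac{i}{2}\boldsymbol{\sigma}$$

Right-handed Weyl Spinor corresponds to the label $(0,1/2)$. It behaves as a scalar under $\mathbf{N}^-$ and as spin 1/2 under $\mathbf{N}^+$. This is also a 2-component complex vector, denoted $\psi_R$. In this representation, $\mathbf{N}^-=0$ and $\mathbf{N}^+=\frac{1}{2}\boldsymbol{\sigma}$. The physical generators are:

$$\mathbf{J} = \frac{1}{2}\boldsymbol{\sigma}, \quad \mathbf{K} = -\frac{i}{2}\boldsymbol{\sigma}$$

Note the difference in sign for $\mathbf{K}$! This demonstrates that while $\psi_L$ and $\psi_R$ behave identically under spatial rotation ($\mathbf{J}$) (both are spin 1/2), their transformation properties under Lorentz boosts ($\mathbf{K}$) are diametrically opposite.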

Since $\psi_L$ and $\psi_R$ are both 2-component objects, why do we need 4 components? The reason lies in Parity ($P$). The parity transformation $P$ inverts spatial coordinates, $\mathbf{x}\to-\mathbf{x}$. $\mathbf{J}$ is an axial vector (like $\mathbf{r}\times\mathbf{p}$), so it remains invariant under $P$: $\mathbf{J}\to\mathbf{J}$. $\mathbf{K}$ is a polar vector (like $\mathbf{v}$), so it changes sign under $P$: $\mathbf{K}\to-\mathbf{K}$. Substituting this into the definition of $\mathbf{N}^\pm$, we find that the parity transformation swaps these two algebras:

$$P: \mathbf{N}^+ \leftrightarrow \mathbf{N}^-$$

This means parity transforms the left-handed representation $(1/2,0)$ into the right-handed representation $(0,1/2)$. If we want to describe a particle like an electron that has both spin and mass, and obeys parity conservation (under electromagnetic forces), we cannot simply pick one. We must take their direct sum. Thus, the Dirac spinor $\Psi$, as a representation of $SO^+(1,3)$ extended by the parity operator, is precisely the direct sum of these two fundamental representations:

$$\Psi = \begin{pmatrix}\psi_L \\ \psi_R\end{pmatrix} \in \left(\tfrac{1}{2},0\right) \oplus \left(0,\tfrac{1}{2}\right)$$

This explains why there are 4 components: two components come from the left-handed sector, and two components come from the right-handed sector; they are tightly coupled together by the mass term and parity transformation. The Pauli spinor we saw earlier with SU(2) is merely a silhouette of this relativistic object in the rest frame (or non-relativistic limit).

Incidentally, the Weyl spinors here are deeply connected to the $\gamma^5$ matrix mentioned earlier in the context of gamma matrices. In fact, the Hermitian operator $\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3$ is precisely the operator used to identify and define Weyl spinors. Without $\gamma^5$, we could not mathematically distinguish what is "Left-handed" and what is "Right-handed". In Dirac theory, $\gamma^5$ is called the Chirality Operator. It has a crucial algebraic property: it anti-commutes with all $\gamma^\mu$ ($\{\gamma^5,\gamma^\mu\}=0$). However, it commutes with the generators of Lorentz transformations $S^{\mu\nu}=\frac{i}{4}[\gamma^\mu,\gamma^\nu]$ ($[\gamma^5,S^{\mu\nu}]=0$), meaning the eigenvalue of $\gamma^5$ is a Lorentz-invariant label of the representation (and, for massless particles, a conserved quantity), so we can use it to classify spinors.

A Right-handed Weyl spinor is a state with a $\gamma^5$ eigenvalue of $+1$ ($\gamma^5\psi_R=+\psi_R$), and a Left-handed Weyl spinor is a state with a $\gamma^5$ eigenvalue of $-1$ ($\gamma^5\psi_L=-\psi_L$). Thus, what is physically referred to as "left-handedness" and "right-handedness" mathematically refers to whether the eigenvalue of $\gamma^5$ is $-1$ or $+1$. Since the Dirac spinor $\Psi$ is a mixture (direct sum) of left and right, how do we sift the left-handed and right-handed parts out of a mixed $\Psi$ individually? This requires using projection operators based on $\gamma^5$:

$$P_L = \frac{1-\gamma^5}{2}, \quad P_R = \frac{1+\gamma^5}{2}$$

These two operators have the standard properties of projection operators ($P^2=P$, $P_LP_R=0$, $P_L+P_R=1$). Their function is to kill off one handedness component and keep only the other:

$$P_L\Psi = \frac{1-\gamma^5}{2}(\psi_L+\psi_R) = \frac{1-(-1)}{2}\psi_L + \frac{1-1}{2}\psi_R = \psi_L, \quad P_R\Psi = \psi_R$$

In particle physics calculations (especially weak interactions), you will often see the factor $\frac{1-\gamma^5}{2}$; this is telling you: "This interaction only plays with left-handed spinors; right-handed spinors, please step aside." (This is precisely the mathematical expression of parity non-conservation.) To make this relationship immediately clear, we can choose a special form of the gamma matrices called the Weyl Representation (or Chiral Representation). Unlike the Dirac representation mentioned earlier, in the Weyl representation it is $\gamma^5$ that is diagonal, $\gamma^5=\mathrm{diag}(-1,-1,+1,+1)$, while the $\gamma^\mu$ themselves are block off-diagonal. The projectors $P_L$ and $P_R$ then simply pick out the upper and lower halves, so the Dirac spinor explicitly splits into upper and lower Weyl spinors $\Psi=(\psi_L,\psi_R)^T$. This is the most commonly used perspective in high-energy physics. In contrast, in low-energy condensed matter physics, we commonly use the Dirac (Standard) representation, where $\psi_L$ and $\psi_R$ are deeply mixed, better reflecting the non-relativistic approximation of "large components" and "small components".
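The projector properties quoted above can be verified with explicit matrices. The sketch below assumes one common textbook form of the Weyl basis, $\gamma^0=\begin{pmatrix}0&1\\1&0\end{pmatrix}$ and $\gamma^i=\begin{pmatrix}0&\sigma_i\\-\sigma_i&0\end{pmatrix}$; other sign conventions exist, and the helper names and sample spinor are purely illustrative.

```python
# Pure-Python sketch of the chirality projectors in an assumed Weyl basis:
# gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma_i], [-sigma_i, 0]].

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two square matrices."""
    n = len(a)
    return [[ca * a[r][c] + cb * b[r][c] for c in range(n)] for r in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(n) for c in range(n))

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-x for x in row] for row in a]

def apply(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

gamma = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

# gamma^5 = i * gamma^0 gamma^1 gamma^2 gamma^3
g5 = matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3]))
g5 = [[1j * x for x in row] for row in g5]

P_L = lin(0.5, I4, -0.5, g5)   # (1 - gamma^5) / 2
P_R = lin(0.5, I4, 0.5, g5)    # (1 + gamma^5) / 2

psi = [1 + 2j, 3, 4j, 5]       # arbitrary sample Dirac spinor (psi_L, psi_R)
psi_left = apply(P_L, psi)     # keeps only the upper two components
psi_right = apply(P_R, psi)    # keeps only the lower two components

anticommutes = all(
    close(lin(1, matmul(g5, g), 1, matmul(g, g5)), ZERO4) for g in gamma
)
print(anticommutes, psi_left, psi_right)
```

In this basis `P_L` and `P_R` come out as the diagonal matrices diag(1,1,0,0) and diag(0,0,1,1), which is why the Weyl representation makes the left/right split visible by inspection.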

7. Magnetic Gradient Force

We have completed a long mathematical journey. From the derivation of the Dirac equation to the representation of the Lorentz group, we have established that the electron must possess spin and is a 4-component relativistic object. Now, let us return to the original paradox: if the Lorentz force does no work, then who is doing the work? To answer this question, we need to link microscopic spin to macroscopic force.

By taking the non-relativistic approximation of the Dirac equation, we naturally obtained the Pauli equation. Let us revisit the "Zeeman term" that appeared seemingly out of nowhere:

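$$H_{\text{Zeeman}} = \frac{e\hbar}{2m}(\boldsymbol{\sigma}\cdot\mathbf{B})$$

In classical physics, the ratio between the magnetic moment $\boldsymbol{\mu}$ and the angular momentum $\mathbf{L}$ is called the gyromagnetic ratio. For the orbital angular momentum of an electron (charge $-e$, where $e>0$ is the elementary charge), this relationship is:

$$\boldsymbol{\mu}_L = -\frac{e}{2m}\mathbf{L}$$

If we attempt to use the same logic to define the "spin magnetic moment," we need to relate the spin operator $\mathbf{S}$ to the magnetic moment. Recalling the definition of the spin operator in Section 5:

$$\mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma} \implies \boldsymbol{\sigma} = \frac{2}{\hbar}\mathbf{S}$$

Substituting $\boldsymbol{\sigma}$ back into the Zeeman term $H_{\text{Zeeman}}$, we get:

$$H_{\text{Zeeman}} = \frac{e\hbar}{2m}\left(\frac{2}{\hbar}\mathbf{S}\right)\cdot\mathbf{B} = 2\,\frac{e}{2m}\,\mathbf{S}\cdot\mathbf{B}$$

We write the general form of magnetic potential energy as $U=-\boldsymbol{\mu}_S\cdot\mathbf{B}$. Comparing this with the equation above, we can read out the electron's spin magnetic moment $\boldsymbol{\mu}_S$:

$$\boldsymbol{\mu}_S = -2\,\frac{e}{2m}\,\mathbf{S}$$

If we write this in the general Landé g-factor form $\boldsymbol{\mu}=-g\frac{e}{2m}\mathbf{S}$, we immediately arrive at the conclusion: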

g=2

This g=2 is not a parameter fudged to fit experiments; it is a direct mathematical consequence of the spacetime symmetry of the Dirac equation. It implies: Electron spin generates a magnetic moment twice as efficiently as classical orbital motion.

With the magnetic moment μ, we can finally explain "who is doing the work." The classical Lorentz force FLorentz=q(v×B) indeed does no work. However, for an object with an intrinsic magnetic moment, its dynamics are governed by the potential energy U. According to Hamiltonian mechanics, force is the negative gradient of potential energy:

$$\mathbf{F} = -\nabla U = -\nabla(-\boldsymbol{\mu}\cdot\mathbf{B}) = \nabla(\boldsymbol{\mu}\cdot\mathbf{B})$$

Since the spin moment $\boldsymbol{\mu}$ is an intrinsic property and remains constant under spatial differentiation, and since $\nabla\times\mathbf{B}=0$ in the current-free region outside the magnet, we obtain:

$$\mathbf{F}_{\text{Gradient}} = (\boldsymbol{\mu}\cdot\nabla)\mathbf{B}$$

This is the force that does the work, known as the Gradient Force. If the magnetic field is uniform ($\nabla\mathbf{B}=0$), the force is zero, and there is only torque. However, the magnetic field produced by a real magnet is non-uniform (the field lines spread out), so $\nabla\mathbf{B}\neq0$. This is a conservative force that converts the potential energy of the magnetic field-moment coupling into the object's kinetic energy. Therefore, a magnet attracting iron is essentially the quantized spin magnetic moment being acted upon by the gradient force in a non-uniform magnetic field. The Lorentz force is responsible for deflection, while the gradient force is responsible for doing work.
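To make the gradient force concrete, here is a small numerical sketch. It assumes, purely for illustration, that the magnet is a point dipole of moment 1 A·m² at the origin pointing along z, whose on-axis field is $B_z(z)=\mu_0 m/(2\pi z^3)$, and that the test moment is one Bohr magneton aligned with the field at a distance of 1 cm. None of these numbers come from the text.

```python
import math

MU0 = 4e-7 * math.pi       # vacuum permeability in T*m/A (approximate value)
MU_B = 9.2740100783e-24    # Bohr magneton in J/T

def b_on_axis(z, m_source=1.0):
    """On-axis field (T) of a point dipole m_source (A*m^2) at the origin."""
    return MU0 * m_source / (2 * math.pi * z**3)

def gradient_force(mu, field, z, dz=1e-6):
    """F_z = mu * dB_z/dz by central difference, for a moment aligned with z."""
    return mu * (field(z + dz) - field(z - dz)) / (2 * dz)

z0 = 0.01  # assumed distance from the magnet: 1 cm

# Non-uniform dipole field: the force is non-zero and points toward the magnet.
f_numeric = gradient_force(MU_B, b_on_axis, z0)
f_analytic = -3 * MU0 * 1.0 * MU_B / (2 * math.pi * z0**4)

# Uniform field: no gradient, hence no net force (only a torque could act).
f_uniform = gradient_force(MU_B, lambda z: 0.5, z0)

print(f_numeric, f_analytic, f_uniform)
```

The negative sign of `f_numeric` means the aligned moment is pulled toward smaller z, that is, toward the magnet, while the uniform-field case gives exactly zero.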

So, there really are two types of magnetic forces. From the perspective of symmetry, they correspond to two completely different symmetry mechanisms, touching upon the edge of Quantum Field Theory. In QFT, interactions are prescribed by symmetry:

Since we are now adopting the perspective of Quantum Field Theory, we must point out that while Dirac's g=2 is glorious, it is not the ultimate truth. In the Dirac equation, we treat the electron as a classical field coupling to the electromagnetic field. But in full Quantum Electrodynamics (QED), the vacuum is not empty. As an electron propagates, it constantly emits and absorbs Virtual Photons, and even creates electron-positron pairs. This means the interaction Vertex between the electron and the magnetic field is no longer just a simple point (Tree level), but contains infinite Loop corrections. Julian Schwinger calculated the first-order correction (the one-loop diagram shown above) in 1948, giving the famous formula:

$$g = 2\left(1 + \frac{\alpha}{2\pi} + O(\alpha^2)\right)$$

where $\alpha\approx1/137$ is the fine-structure constant. The one-loop term alone shifts $g$ to about $2.00232$; with the higher-order corrections included, the theoretical value is $g\approx2.002319\ldots$, which is astonishingly consistent with experimental measurements (agreeing to roughly 12 significant digits). This tiny deviation (the Anomalous magnetic moment) not only confirms the relativistic origin of $g=2$ but also reveals the deep physics behind magnetism: when we feel the pull of a magnet, we are not only witnessing the geometric attributes of spacetime (spin) but also touching the seething ocean of virtual particles in the vacuum.
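As a quick arithmetic check of Schwinger's formula, the one-loop correction can be evaluated directly. The sketch below uses $\alpha^{-1}\approx137.035999$ as an assumed input value and keeps only the $\alpha/2\pi$ term, so it is not expected to reproduce the full QED prediction.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant (assumed input value)

def g_one_loop(alpha=ALPHA):
    """Dirac value g = 2 plus Schwinger's one-loop correction alpha/(2*pi)."""
    return 2 * (1 + alpha / (2 * math.pi))

anomaly = ALPHA / (2 * math.pi)   # a = (g - 2) / 2 at one loop
print(f"a (one loop) = {anomaly:.7f}")
print(f"g (one loop) = {g_one_loop():.7f}")
```

The one-loop anomaly is about 0.00116, so it already accounts for almost all of the measured deviation from 2; the remaining difference comes from the $O(\alpha^2)$ and higher terms.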

8. Heisenberg Model

We now know that every electron is a tiny magnetic needle (g=2). However, if you put a bunch of these tiny magnets together, thermal agitation at room temperature is sufficient to completely randomize their orientations, resulting in zero macroscopic magnetic moment (paramagnetism). To form ferromagnetism, there must be an extremely strong "coupling force" between spins that forces them to align. The classical magnetic dipole-dipole interaction is far too weak—only about one ten-thousandth of the thermal energy. The real power comes from the combination of the identical particle statistics (Spin-Statistics) we mentioned earlier and the Coulomb interaction. This is known as the Exchange Interaction.
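The claim that the classical dipole-dipole interaction is far too weak can be checked with an order-of-magnitude estimate. The sketch below assumes two moments of one Bohr magneton each, separated by 2.5 Å (a typical interatomic distance, chosen here only for illustration), and compares the energy scale $\mu_0\mu_B^2/(4\pi r^3)$ with $k_BT$ at 300 K.

```python
MU0_OVER_4PI = 1e-7          # mu_0 / (4*pi) in T*m/A
MU_B = 9.2740100783e-24      # Bohr magneton in J/T
K_B = 1.380649e-23           # Boltzmann constant in J/K

def dipolar_energy_scale(r):
    """Order-of-magnitude dipole-dipole energy (J) of two Bohr magnetons."""
    return MU0_OVER_4PI * MU_B**2 / r**3

def thermal_energy(temperature):
    """Thermal energy k_B * T in joules."""
    return K_B * temperature

r_assumed = 2.5e-10   # assumed spacing between the two moments, in metres
ratio = dipolar_energy_scale(r_assumed) / thermal_energy(300.0)
print(f"E_dipolar / k_B T = {ratio:.2e}")
```

With these assumed inputs the ratio comes out at the $10^{-4}$ level, in line with the "one ten-thousandth" figure quoted above.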

To demonstrate the essence of this, let us consider the simplest model: a two-electron system (such as a Helium atom or electrons on two adjacent Iron atoms). Assume there are two electrons, 1 and 2, and two spatial orbitals $\psi_a(\mathbf{r})$ and $\psi_b(\mathbf{r})$, where $\psi_a(\mathbf{r})$ is localized near atom A and $\psi_b(\mathbf{r})$ is localized near atom B. We assume these two orbitals are orthonormal: $\langle\psi_a|\psi_b\rangle=0$. Note that this orthogonality is a prerequisite assumption for the Heisenberg model we are about to derive. Even if they are not orthonormal, we can create two new orthonormal orbitals through a basis transformation. Assuming their overlap integral is small greatly simplifies the calculation of the two-electron system, but this does not mean there is no interaction between the two electrons, as the interaction involves exchange integrals which generally are not zero.

The total Hamiltonian of this system is $H=H_0+H_{\text{int}}$, where $H_0$ is the single-electron part (kinetic energy + nuclear potential energy), and $H_{\text{int}}$ is the Coulomb interaction between the two electrons: $H_{\text{int}}=\frac{e^2}{|\mathbf{r}_1-\mathbf{r}_2|}$. Note: there is absolutely no magnetic interaction term here, only pure electrostatic repulsion.

According to the Fermi statistics hypothesis for identical particles in quantum mechanics, the total wavefunction of the electrons Ψ(1,2) must change sign (be antisymmetric) under the action of the particle exchange operator P12:

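$$P_{12}\Psi(1,2) = -\Psi(1,2)$$

The total wavefunction consists of a spatial part $\phi(\mathbf{r}_1,\mathbf{r}_2)$ and a spin part $\chi(s_1,s_2)$: $\Psi=\phi\chi$. We first derive the spin part, and then obtain the spatial part from the overall antisymmetry. We have two spin-1/2 particles (e.g., two electrons), so there are a total of $2\times2=4$ possible product states (the uncoupled basis). We want to add their spins together to see what the total spin $\mathbf{S}_{\text{tot}}=\mathbf{S}_1+\mathbf{S}_2$ looks like, finding the common eigenstates $|S,M\rangle$ of the total spin operator $\hat{S}^2$ and its z-component $\hat{S}_z$. According to the angular momentum addition rules, the total spin $S$ formed by two spins of 1/2 can be: $S=1/2+1/2=1$ (Triplet, with 3 values $M=+1,0,-1$) and $S=1/2-1/2=0$ (Singlet, with 1 value $M=0$).

The Triplet corresponds to three components. The total magnetic quantum number $M$ is the sum of the magnetic quantum numbers of the two particles: $M=m_1+m_2$. To get $M=1$, the only possibility is that both electrons are spin-up: $1/2+1/2=1$. So, the first member of the Triplet is determined: $|1,1\rangle=|\uparrow\uparrow\rangle$. Applying the total lowering operator $\hat{S}^-=\hat{S}_1^-+\hat{S}_2^-$ to $|1,1\rangle$ then yields the intermediate state $|1,0\rangle$ ($M=0$). The general rule, in units where $\hbar=1$, is $\hat{J}^-|j,m\rangle=\sqrt{j(j+1)-m(m-1)}\,|j,m-1\rangle$. Acting on the coupled state on the left gives:

$$\hat{S}^-|1,1\rangle = \sqrt{1(1+1)-1(1-1)}\,|1,0\rangle = \sqrt{2}\,|1,0\rangle$$

Acting on the product state on the right gives:

$$(\hat{S}_1^-+\hat{S}_2^-)|\uparrow\uparrow\rangle = (\hat{S}_1^-|\uparrow\rangle_1)|\uparrow\rangle_2 + |\uparrow\rangle_1(\hat{S}_2^-|\uparrow\rangle_2) = |\downarrow\rangle_1|\uparrow\rangle_2 + |\uparrow\rangle_1|\downarrow\rangle_2 = |\downarrow\uparrow\rangle + |\uparrow\downarrow\rangle$$

Equating the two results, we obtain:

$$|1,0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right)$$

The lowest weight state ($M=-1$) is also simple; only both spins down can give $-1$: $|1,-1\rangle=|\downarrow\downarrow\rangle$. So the spin part of the Triplet is symmetric (does not change sign upon exchange), which means the spatial part must be antisymmetric.

The quantum numbers for the Singlet are $S=0,M=0$. It must be some linear combination of the product states $|\uparrow\downarrow\rangle$ and $|\downarrow\uparrow\rangle$ with $M=0$: $|0,0\rangle=a|\uparrow\downarrow\rangle+b|\downarrow\uparrow\rangle$. Since eigenstates with different quantum numbers must be orthogonal, the Singlet $|0,0\rangle$ must be orthogonal to $|1,0\rangle$ in the Triplet. Solving this gives:

$$|0,0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right)$$

The spin part of the Singlet is antisymmetric, so the corresponding spatial part must be symmetric. Summarizing: the Singlet ($S=0$) pairs an antisymmetric spin state with a symmetric spatial wavefunction, and we denote its energy by $E_S$; the Triplet ($S=1$) pairs a symmetric spin state with an antisymmetric spatial wavefunction, and we denote its energy by $E_T$.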

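Our goal is to find a mathematical expression $\hat{H}_{\text{eff}}$ containing only the spin operators $\mathbf{S}_i$ and $\mathbf{S}_j$ such that when it acts on the Singlet and Triplet states, it automatically yields the corresponding energies $E_S$ and $E_T$. To construct this Hamiltonian, the most natural building block is the dot product of the two spins $\mathbf{S}_i\cdot\mathbf{S}_j$. We need to calculate the eigenvalues of this operator for the Singlet and Triplet states. Define the total spin operator for the two-electron system: $\mathbf{S}_{\text{tot}}=\mathbf{S}_i+\mathbf{S}_j$. Squaring the total spin operator allows us to solve for the dot product term: $\mathbf{S}_i\cdot\mathbf{S}_j=\frac{1}{2}(\mathbf{S}_{\text{tot}}^2-\mathbf{S}_i^2-\mathbf{S}_j^2)$. Using the eigenvalue formula for the square of the angular momentum operator in quantum mechanics, $\hat{S}^2|s\rangle=s(s+1)|s\rangle$ (omitting $\hbar^2$ for brevity, or treating spin as dimensionless), with $s=1/2$ for each electron:

$$\mathbf{S}_i\cdot\mathbf{S}_j = \frac{1}{2}\left[S(S+1)-\frac{3}{4}-\frac{3}{4}\right] = \begin{cases} -\dfrac{3}{4} & \text{Singlet } (S=0) \\[4pt] +\dfrac{1}{4} & \text{Triplet } (S=1) \end{cases}$$

We assume the effective Hamiltonian $\hat{H}_{ij}$ has the following linear form (this is the most general rotationally symmetric form): $\hat{H}_{ij}=C_0+C_1(\mathbf{S}_i\cdot\mathbf{S}_j)$, where $C_0$ is just a constant energy shift independent of spin configuration, which can be discarded (or the energy zero point redefined) when studying phase transitions and spin dynamics. Substituting the data for Singlet and Triplet, we have $E_S=C_1\left(-\frac{3}{4}\right)$, $E_T=C_1\left(\frac{1}{4}\right)$. Defining the constant $J\equiv E_S-E_T$, we find $C_1=-J$ and obtain the two-particle Hamiltonian:

$$\hat{H}_{ij} = -J(\mathbf{S}_i\cdot\mathbf{S}_j)$$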
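The Singlet and Triplet eigenvalues of $\mathbf{S}_i\cdot\mathbf{S}_j$ used in this derivation can be reproduced with explicit matrices. The sketch below builds $\mathbf{S}_1\cdot\mathbf{S}_2=\frac{1}{4}\sum_k\sigma_k\otimes\sigma_k$ in the product basis ordered as (↑↑, ↑↓, ↓↑, ↓↓), with $\hbar=1$; the basis ordering, helper names and the sample value `J_EXCHANGE = 1.0` are illustrative assumptions.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (row = 2*i + k, col = 2*j + l)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def spin_dot_spin():
    """S1.S2 = (1/4) * sum_k sigma_k (x) sigma_k, with hbar = 1."""
    total = [[0j] * 4 for _ in range(4)]
    for s in SIGMA:
        term = kron(s, s)
        total = [[total[r][c] + 0.25 * term[r][c] for c in range(4)]
                 for r in range(4)]
    return total

def expectation(state, op):
    """<state| op |state> for a normalized 4-component state."""
    value = sum(state[r].conjugate() * op[r][c] * state[c]
                for r in range(4) for c in range(4))
    return value.real

R = 1 / math.sqrt(2)
singlet = [0j, R + 0j, -R + 0j, 0j]    # (|ud> - |du>) / sqrt(2)
triplet_0 = [0j, R + 0j, R + 0j, 0j]   # (|ud> + |du>) / sqrt(2)
triplet_up = [1 + 0j, 0j, 0j, 0j]      # |uu>

S1S2 = spin_dot_spin()
dot_singlet = expectation(singlet, S1S2)
dot_triplet = expectation(triplet_0, S1S2)

J_EXCHANGE = 1.0                       # illustrative value of J = E_S - E_T
gap = (-J_EXCHANGE * dot_singlet) - (-J_EXCHANGE * dot_triplet)
print(dot_singlet, dot_triplet, gap)
```

The last line confirms that $\hat{H}_{ij}=-J\,\mathbf{S}_i\cdot\mathbf{S}_j$ reproduces the energy difference $E_S-E_T=J$ between the two spin configurations.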

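Now we generalize this two-particle interaction to the entire lattice, assuming each electron $i$ only interacts with its nearest neighbors. We sum over all atoms $i$ in the lattice and over the neighbors $j$ of each atom. Every bond is then counted twice (once as $(i,j)$ and once as $(j,i)$), so we include a factor of $1/2$ to correct for double counting:

$$H_{\text{exchange}} = -\frac{J}{2}\sum_{\langle i,j\rangle}\mathbf{S}_i\cdot\mathbf{S}_j$$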

Finally, we must consider the interaction of each electron spin with an external uniform magnetic field B. This is a single-body interaction and does not involve neighbors. Recalling our conclusion from the Dirac equation, the electron has a spin magnetic moment:

$$\boldsymbol{\mu}_S = -g\frac{e}{2m}\mathbf{S}$$

To make the formula more concise and universal, physicists define a natural combination of constants called the Bohr Magneton. The Bohr Magneton is the natural unit of magnetic moment in atomic physics. It is defined as:

$$\mu_B \equiv \frac{e\hbar}{2m_e}$$

This physical quantity contains three fundamental constants: the elementary charge $e$, the reduced Planck constant $\hbar$, and the electron mass $m_e$. It represents the magnitude of the orbital magnetic moment generated by a classical electron moving in the ground state orbit of a hydrogen atom (in the Bohr model). This is in SI units; in Gaussian units, it also includes the speed of light: $\mu_B\equiv\frac{e\hbar}{2m_ec}$. If we treat the spin operator $\mathbf{S}$ as a dimensionless operator (i.e., eigenvalues are $\pm1/2$ instead of $\pm\hbar/2$), then the true physical angular momentum is $\hbar\mathbf{S}$. Extracting this and combining it with the constants above:

$$\boldsymbol{\mu}_S = -g\frac{e}{2m_e}\left(\hbar\mathbf{S}_{\text{dimensionless}}\right) = -g\left(\frac{e\hbar}{2m_e}\right)\mathbf{S} = -g\mu_B\mathbf{S}$$

When an electron is in an external magnetic field $\mathbf{B}$, its potential energy (Zeeman Energy) is given by the classical electromagnetism formula $U=-\boldsymbol{\mu}\cdot\mathbf{B}$. Substituting the magnetic moment expression (and noting the sign of the potential energy):

$$U_{\text{Zeeman}} = -\boldsymbol{\mu}_S\cdot\mathbf{B} = -(-g\mu_B\mathbf{S})\cdot\mathbf{B} = +g\mu_B\,\mathbf{S}\cdot\mathbf{B}$$

Physical Convention on Signs: In condensed matter physics, we usually want the Hamiltonian to reflect energy minima. Electrons are negatively charged, so the magnetic moment $\boldsymbol{\mu}$ is anti-parallel to the spin $\mathbf{S}$. The lowest energy state is the one in which $\boldsymbol{\mu}$ is parallel to the magnetic field $\mathbf{B}$, which means the spin $\mathbf{S}$ is anti-parallel to $\mathbf{B}$. The standard derivation (keeping the electron's negative charge) therefore gives the Zeeman term as $+g\mu_B\,\mathbf{S}\cdot\mathbf{B}$, which is just $-\boldsymbol{\mu}\cdot\mathbf{B}$. To avoid carrying this sign around, the literature on the Heisenberg model often redefines $\mathbf{S}$ to point along the magnetic moment rather than the angular momentum (equivalently, it takes $g$ to be negative). With this phenomenological convention, the Zeeman term is written with a negative sign, indicating that spins tend to align along the field:

$$H_{\text{Zeeman}} = -g\mu_B\sum_i\mathbf{S}_i\cdot\mathbf{B}$$

(Energy is then minimized when $\mathbf{S}_i$ points in the same direction as $\mathbf{B}$. In phenomenological models, we only care about which direction the magnetic field tends to pull the spins.)

Now, we combine the two pieces of the puzzle: Internal Interaction (Exchange Energy generated by the Pauli principle and Coulomb force) and External Interaction (Coupling of magnetic moment with external field generated by relativistic quantum effects). Adding them together, we finally obtain the core Hamiltonian describing solid-state magnetism—the Heisenberg Model:

$$H = -\frac{J}{2}\sum_{\langle i,j\rangle}\mathbf{S}_i\cdot\mathbf{S}_j - g\mu_B\sum_i\mathbf{S}_i\cdot\mathbf{B}$$

This formula is the cornerstone of modern magnetism. The first term ($J$) explains why magnets have magnetism (spontaneous magnetization, ordered alignment of spins). The second term ($\mathbf{B}$) explains how magnets are controlled by the outside world (magnetization process, hysteresis loop). $\mu_B$ and $g$ link microscopic quantum constants ($\hbar,e,m_e$) with macroscopic observable magnetic fields.

9. Ising Model

We have completed the construction of the microscopic mechanism (Dirac Spin → Exchange Interaction → Heisenberg Hamiltonian). Now, we must move from the microscopic to the macroscopic. To do this, we need to handle the Heisenberg Model. However, solving the Heisenberg Model exactly in two or three dimensions is extremely difficult (because it contains non-commuting operators). Therefore, we need to introduce the Ising Model as an approximation and use Mean-Field Theory to demonstrate how symmetry is broken.

We first non-dimensionalize the Heisenberg model and analyze its structure. Assume the external magnetic field is along the z-direction: B=(0,0,B). We expand the dot product of the spin operators S into longitudinal (z) and transverse (x,y) components:

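$$\mathbf{S}_i\cdot\mathbf{S}_j = S_i^zS_j^z + \left(S_i^xS_j^x + S_i^yS_j^y\right)$$

To see the physical meaning of the transverse part more clearly, we introduce Ladder Operators:

$$S_i^+ = S_i^x + iS_i^y, \quad S_i^- = S_i^x - iS_i^y$$

Since $S_i^xS_j^x+S_i^yS_j^y=\frac{1}{2}(S_i^+S_j^-+S_i^-S_j^+)$, the Heisenberg model can be rewritten in two parts:

$$H = \underbrace{\left[-\frac{J}{2}\sum_{\langle i,j\rangle}S_i^zS_j^z - h\sum_iS_i^z\right]}_{\text{Ising Part}} + \underbrace{\left[-\frac{J}{4}\sum_{\langle i,j\rangle}\left(S_i^+S_j^- + S_i^-S_j^+\right)\right]}_{\text{Flip Part}}$$

(where $h=g\mu_BB$). These two parts have distinct physical meanings: the Ising Part is diagonal in the $S^z$ basis and only assigns an energy to each spin configuration, while the Flip Part raises one spin and lowers its neighbor, moving the system between configurations (quantum fluctuations).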

In many real magnetic materials, due to the symmetry of the crystal structure, there exists Magnetic Anisotropy. This means the energy of spins along certain directions (e.g., the z-axis, the easy axis) is lower than in the x,y plane. If the anisotropy is strong enough, or if we are only concerned with phase transition behavior in the classical limit, we can ignore the Flip Part (quantum fluctuations) and retain only the longitudinal term. This is the famous Ising Model.

At this point, we replace the operator $S_i^z$ with a classical variable $\sigma_i=\pm1$ (absorbing the coefficients into $J$):

$$H_{\text{Ising}} = -\frac{J}{2}\sum_{\langle i,j\rangle}\sigma_i\sigma_j - h\sum_i\sigma_i$$

This is a massive simplification: we turn a non-commuting quantum matrix problem into a classical statistical combinatorial problem. The 2D Heisenberg model has not been solved exactly to this day, though it can be studied using computational methods. The Ising model is considerably more tractable. The 1D Ising model is easy to solve and shows no phase transition at any finite temperature (Ising, 1924). The 2D case, however, is very difficult. Lars Onsager published the exact zero-field solution in 1944, giving the critical temperature and showing that the model is ferromagnetic at low temperatures and paramagnetic at high temperatures; this was the first exactly solved model exhibiting a ferromagnetic phase transition. Onsager later announced the formula for the spontaneous magnetization without publishing its derivation, and it was C.N. Yang who supplied a derivation in 1952, which was still famously difficult. To understand the physical picture intuitively, we will adopt the Mean-Field Approximation here.
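Before turning to mean-field theory, it may help to see how simple the classical Ising energy is to evaluate for a given configuration. The sketch below assumes a 2D square lattice with periodic boundaries and uses the same convention as the Hamiltonian above (sum over ordered neighbor pairs with a factor $1/2$); the function name `ising_energy` and the 4×4 example lattice are illustrative.

```python
def ising_energy(spins, J=1.0, h=0.0):
    """Energy of a 2D Ising configuration on a periodic square lattice.

    Implements H = -(J/2) * sum over ordered neighbor pairs of s_i * s_j
                   - h * sum over sites of s_i,
    where spins is a list of rows with entries +1 or -1.
    """
    n_rows, n_cols = len(spins), len(spins[0])
    pair_sum = 0
    field_sum = 0
    for r in range(n_rows):
        for c in range(n_cols):
            s = spins[r][c]
            neighbors = (
                spins[(r + 1) % n_rows][c] + spins[(r - 1) % n_rows][c]
                + spins[r][(c + 1) % n_cols] + spins[r][(c - 1) % n_cols]
            )
            pair_sum += s * neighbors   # each bond is visited twice overall
            field_sum += s
    return -0.5 * J * pair_sum - h * field_sum

all_up = [[1] * 4 for _ in range(4)]
one_flipped = [row[:] for row in all_up]
one_flipped[1][2] = -1

e_ground = ising_energy(all_up)
e_flipped = ising_energy(one_flipped)
print(e_ground, e_flipped, e_flipped - e_ground)
```

Flipping a single spin in the fully aligned state breaks its four bonds and raises the energy by $2Jz=8J$ on the square lattice ($z=4$), which is the energy cost that thermal fluctuations have to pay.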

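The problem we face is many-body coupling: the state of $\sigma_i$ depends on $\sigma_j$, and $\sigma_j$ depends on $\sigma_k$, and so on. This chain reaction makes the partition function difficult to calculate. The Mean-Field idea is: when we focus on atom $i$, we don't care whether neighbor $j$ is flipping between $+1$ and $-1$; we only care about the average influence of the neighbors. We write $\sigma_j$ as an average value plus a fluctuation: $\sigma_j=\langle\sigma\rangle+\delta\sigma_j$. Ignoring the second-order fluctuation term, $\delta\sigma_i\delta\sigma_j\approx0$, and dropping a spin-independent constant, the Hamiltonian can be linearized into a single-body form:

$$H_{\text{MFA}} = -\sum_i\sigma_i\underbrace{\left(J\sum_{j\in\text{neigh}}\langle\sigma\rangle + h\right)}_{h_{\text{eff}}}$$

This defines the Effective Molecular Field $h_{\text{eff}}$:

$$h_{\text{eff}} = Jz\langle\sigma\rangle + h$$

where $z$ is the coordination number (number of neighbors for each atom). Now, the problem becomes the statistical distribution of a single spin in an "external field" $h_{\text{eff}}$. According to the Boltzmann distribution, the probability of this spin being up is proportional to $e^{+\beta h_{\text{eff}}}$, and the probability of being down is proportional to $e^{-\beta h_{\text{eff}}}$ (where $\beta=1/k_BT$). Thus, the thermodynamic average of this spin $\langle\sigma_i\rangle$ is:

$$\langle\sigma_i\rangle = \frac{(+1)e^{\beta h_{\text{eff}}} + (-1)e^{-\beta h_{\text{eff}}}}{e^{\beta h_{\text{eff}}} + e^{-\beta h_{\text{eff}}}} = \tanh(\beta h_{\text{eff}})$$

Since the lattice is uniform, $\langle\sigma_i\rangle$ must equal the average value of the field source itself, $m=\langle\sigma\rangle$. Substituting $h_{\text{eff}}$, we obtain the famous Self-consistent Equation:

$$m = \tanh\left(\frac{Jzm+h}{k_BT}\right)$$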

Let us consider the most critical case: no external magnetic field ($h=0$). The equation simplifies to:

$$m = \tanh\left(\frac{T_c}{T}m\right)$$

where we package the constants into the definition of the Curie Temperature $T_c=Jz/k_B$. This is a transcendental equation, and we can analyze the behavior of the solution graphically, by finding the intersections of $y=m$ and $y=\tanh\left(\frac{T_c}{T}m\right)$. The slope of the tanh curve at the origin is $T_c/T$: for $T>T_c$ it is less than 1 and the only intersection is $m=0$ (paramagnet); for $T<T_c$ it exceeds 1 and two additional solutions $m=\pm m_0\neq0$ appear (ferromagnet).

This is Spontaneous Symmetry Breaking: The Hamiltonian has a flip symmetry $\sigma\to-\sigma$ when $h=0$. However, when the temperature drops below $T_c$, nature is forced to choose between "all up" and "all down." This choice is not imposed by an external force but is a collective decision made spontaneously by the system to lower its energy (driven by the exchange interaction $J$). This is precisely the statistical mechanical essence behind the macroscopic phenomenon of a magnet attracting iron.
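The self-consistent equation above is easy to solve numerically by fixed-point iteration. The sketch below works in reduced units, $t=T/T_c$ and $b=h/(k_BT_c)$, so that the equation reads $m=\tanh((m+b)/t)$; the function name, starting value and tolerances are illustrative choices, and convergence becomes slow very close to $t=1$.

```python
import math

def solve_magnetization(t, b=0.0, m0=1.0, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for m = tanh((m + b) / t).

    t  : reduced temperature T / T_c
    b  : reduced field h / (k_B * T_c)
    m0 : starting guess; its sign selects the branch below T_c when b = 0
    """
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh((m + b) / t)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

for t in (0.2, 0.5, 0.9, 1.1, 1.5):
    m_up = solve_magnetization(t, m0=1.0)
    m_down = solve_magnetization(t, m0=-1.0)
    print(f"T/Tc = {t:.1f}:  m = {m_up:+.4f} or {m_down:+.4f}")
```

Below $T_c$ the two starting guesses converge to $+m_0$ and $-m_0$, the two symmetry-broken states; above $T_c$ both collapse to $m=0$.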

10. Magnetic Domains

Based on the Ising model and Mean-Field Theory, we concluded that when the temperature is below the Curie temperature Tc, electron spins spontaneously align, producing a massive macroscopic magnetization M. However, this immediately leads to a new paradox: if you go to a hardware store and buy an iron nail (room temperature is obviously far below Iron's Curie temperature of 1043K), it is not magnetic. It does not pick up other objects. This is because we ignored one final energy competition.

Our previous Hamiltonian only considered Exchange Energy and Zeeman Energy. However, at the macroscopic scale, there is also the classical Magnetostatic Energy. If all $10^{23}$ atoms in a piece of iron were aligned upwards, this magnet would establish a huge magnetic field in the surrounding space. The magnetic field contains energy density $B^2/(2\mu_0)$. Spreading out all those magnetic field lines costs a tremendous amount of energy. To reduce this Magnetostatic Energy, the material spontaneously splits into many tiny regions called Magnetic Domains. Although inside a domain, the exchange interaction keeps spins aligned (satisfying microscopic ferromagnetism), overall, the vector sum of the magnetic moments of the various domains is zero ($\sum_i\mathbf{M}_i=0$). There are no external magnetic field lines, thereby drastically reducing the magnetostatic energy.

The boundary between magnetic domains is called a Domain Wall. Inside the domain wall, spins do not flip abruptly but rotate gradually. This is another game of energy trade-offs: Exchange Energy wants spins to be parallel and resists their turning (preferring the wall to be as wide as possible); Magnetic Anisotropy wants spins to align along the easy axis and resists them pointing in intermediate directions (preferring the wall to be as narrow as possible). The balance between the two determines the thickness of the domain wall (typically hundreds of atomic layers). The formation of magnetic domains is not derived from "first principles" like spin, but belongs to the realm of Micromagnetics, involving energy minimization in continuum field theory, which we will not expand upon here.

Now, we can finally fully describe the macroscopic process of "a magnet attracting iron":

Conclusion: The Deep Symmetry of the Universe

When you play with two magnets in your hands, feeling the repulsion and attraction between them, you are feeling more than just a force. You are touching the essence of quantum mechanics and the secrets of cosmic evolution with your own hands. Let us review this journey and see how we rebuilt our physical intuition:

  1. The Classical Collapse: We found that the Lorentz force does no work, and classical statistical physics forbids magnetism (Bohr-van Leeuwen Theorem).
  2. The Relativistic Correction: The Dirac equation revealed that the electron must be a 4-component spinor carrying an intrinsic magnetic moment with g=2. Magnetism is the residue of relativistic effects in the low-speed world.
  3. The Power of Quantum Statistics: The Pauli Exclusion Principle combined with Coulomb repulsion creates an equivalent "Exchange Interaction," forcing spins to align parallel.
  4. Symmetry Breaking: The Ising model taught us that when the temperature drops, the system, in order to survive (lower its energy), is forced to break the spin-flip symmetry (rotational symmetry, in the full Heisenberg model) and choose a direction.

Finally, it is worth mentioning that the Spontaneous Symmetry Breaking (SSB) we saw in the Ising model has significance far beyond solid-state physics. It is a core paradigm for understanding the universe in modern physics. At the beginning of the Big Bang (extremely high temperature), physical laws possessed extremely high symmetry. All fundamental particles were massless, just like the iron block at high temperatures has no magnetism (paramagnetic phase). As the universe cooled, when the temperature dropped below a certain critical value, the Higgs Field filling the universe underwent a phase transition. Just like electron spins suddenly choosing to point in one direction, the Higgs field acquired a non-zero Vacuum Expectation Value in empty space.

So, the next time you see a magnet pick up a paperclip, realize this: you are witnessing a miniature moment of cosmic creation. The mechanism that gives the nail its magnetism is the very same mechanism that gives the quarks and electrons in your body their mass, allowing this universe to exist.

Magnetic force does no work; it is the geometry of spacetime doing the work. The attraction of a magnet is the dance of a quantum ghost in the macroscopic world.