In this post I will introduce the Ising model from a graph theoretic point of view, without resorting to a particular graph structure on which the system lives. I will explore more specific systems in a future post.

Consider an undirected graph $G=\langle V,E \rangle$ where $V$ is a set of vertices, and $E$ is a set of edges. A state of the graph, $\sigma_G$, is a mapping of the vertices, $\sigma: V\rightarrow \{-1,1\}: v_i \mapsto \sigma_i$, and the edges, $\sigma: E\rightarrow \{-1,1\}: e_{ij}\mapsto \sigma_{ij} = \sigma_i\sigma_j$, where i and j are the vertices that bound the edge. This definition makes edges that bound vertices of the same state have a state of +1, and edges that bound vertices of opposite states have a state of -1.

Since each vertex can be labeled in one of two ways, and the state of the edges is fully determined by the state of the vertices, there are a total of $2^{|V|}$ possible states for the graph, where $|\cdot|$ is the cardinality operator. We bundle these together and write $\Sigma_G = \{\sigma_G\}$.

We want to compare different states to one another, and the easiest way to do this is to define a Hamiltonian. A Hamiltonian is a function that maps states to real numbers, which we call energies. It consists of a piece that acts on the edges, and a piece that acts on the vertices:

where the f’s are coupling functions. In the simplest case we choose $f_0(x)=hx$, and $f_1(x) = -Jx$. In the physics parlance, h is referred to as the magnetic field, and J the coupling. Plugging in the definitions for the state of vertices and edges, the form of the Hamiltonian that we will be examining reads:

where the second sum on the first line is over unique pairs of vertices, and on the second line over the neighbors of the ith vertex. The factor of a half comes from overcounting each vertex twice in the complete expression. The reason for the negative sign is that it ranks states that are more homogeneous lower, so we can say that homogeneity decreases energy.

We can make the notion of homogeneity more precise by introducing an order parameter, called the magnetization, describing the average vertex state:

Note that this is a number between 1 and -1, inclusive. There are many states that have the same order parameter. So many, in fact, that’s it’s best to talk about the logarithm of the number, which turns out to be (for very large graphs):

Here I used Stirling’s approximation for the factorials in the binomial coefficient. This function is peaked close to 0, especially for large graphs. The majority of the states have little or no magnetization; in fact if we expand around 0 magnetization we see that the states are distributed as a Gaussian:

Here I have included the quartic term in the Gaussian, which can be neglected. To see this we plot the distribution below for graphs with 10,100,and 1000 vertices. We see how quickly the distribution becomes sharply peaked at vanishing magnetization:

What does this mean?

We imagine that the system has some sort of dynamics: that it can transition from one state to another state. Under no other constraints, the system will soon find itself in a state with little or no magnetization.

For something more interesting to occur, the dynamics of how the system transitions from one state to another must be constrained by more than simply the available volume of phase space. Here is where we finally get to the physics of the Ising model. The constraint we need to incorporate is that the energy of the system is constant, so that we can say what the probability of each state is: $p(\{\sigma\})$. We resort to a MaxEnt analysis: the distribution of states under the constraint of constant energy is the one that maximizes the Shannon entropy of the system:

Two Lagrange multipliers have been introduced to enforce the normalization and energy constraints on the system. The MaxEnt method produces the belief distribution that an epistemic agent should have about the system knowing nothing more than the energy constraint. Any deviation from this distribution implies that the EA has more information at their disposal. After a little bit of calculus, and a little redefinition of the Lagrange multipliers, one can find:

This is just the standard Maxwell Boltzmann distribution, and Z is called the partition function (normalization constraint).

I do want to discuss a bit the assumptions that went into this derivation. The actual energy of our system is the Hamiltonian, but since an epistemic agent cannot know the actual state of the system, it must resort to averaging over all possible states to define the energy. If the actual system fluctuates a bit, this means that the actual energy is a time average of the energies the system fluctuates through. The statement that this temporal average is equivalent to the average over the ensemble of all possible states is known as the ergodic hypothesis. This is only possible because we assume the energy is constant (in the temporal average), which means that the system is in equilibrium. If the system was out of equilibrium, the ergodic hypothesis would no longer hold, and the epistemic agent would not have the belief distribution about the system derived above. Out of equilibrium dynamics are much more difficult to analyze, and we will not do so here.

The partition function turns out to be the most important quantity to analyze. Unfortunately it also happens to be notoriously difficult to calculate. Recalling the definition of the mean field, we can rewrite it as:

This functional form is a sum over all possible magnetizations, followed by a sum over all states that have that magnetization. One thing we can instantly see from this is that the magnetic field, h, is acting as a bias. It suppresses belief in states that do not have a magnetization that is the same sign as the magnetic field. This is the reason why these quantities have the names they do: the magnetic field magnetizes the system.

With this machinery at hand we can now introduce dynamics. We initialize our system in any state we wish and then allow for transitions to new states: we let the system evolve in time. We use the probability of a graph state to find the probability that a single vertex transitions by simply holding all other vertices constant:

At each timestep we pick some fraction of the vertices and calculate the probability that they change their state. With that probability we allow those vertices to either change or stay the same.

I’ve been running a lot of these types of simulations on scale free networks. Many networks in reality, including the Internet and Social Networks, have been shown to be scale free. I will write an upcoming post on how to actually create these types of networks, amongst others, but for this post I wanted to focus on the Ising model on graphs. Heres a simulation of all that I’ve been describing: