Renewal·문장 발효 과학

Book | In a Flight of Starlings: Between Disorder and Order

June 19, 2024·17 min read

cover image

A single nerve cell
cannot constitute a memory,
but countless nerve cells gathered together can.

The same goes for bricks.
The science of a single brick
and the architecture of a building
made of many bricks are different problems.

Wow -
I happened to spot this passage at an Aladin store.
How could I possibly walk out alone leaving this book behind!
This is a book filled with a physicist's contemplations and reminiscences. The reason this book seems to carry, perhaps, a stronger humanities? scent — (p.101)
is that the complicated work and hard struggles needed to obtain results are completely? omitted,
and only the results he himself experienced are shown.
It's been a long time since a book had me getting excited line by line.

I haven't even gotten through the early-middle parts yet, but I scribbled notes here and there for a while — and I figured if I let it slide, I'll blink and they'll be carried off across the river of forgetting.
So, settling for less, I had a brief 1:1 chat? with ChatGPT. lol

Let me leave a few of those exchanges in this post.

Giorgio Parisi (author of In a Flight of Starlings: The Wonders of Complex Systems) — image: https://www.iybssd2022.org

#01

The real world is disordered, and as mentioned at the beginning, many phenomena that occur in reality can be explained through countless components that interact with one another. The interactions among the components can be expressed in the form of simple rules, but the collective behavior produced by those interactions, taken as a whole, is genuinely difficult to predict.
The basic agents are spins, atoms, molecules, neurons (ocuon), and ordinary cells, but they also include websites, stockbrokers, stocks and bonds, people, animals, and the elements of an ecosystem.
Disordered systems are not produced by the interactions among every kind of basic agent. Disorder comes from the fact that some basic agent behaves in a different way from other agents. There are spins that try to align in the opposite direction, atoms that move differently from the majority of atoms, financial investors who sell the very stocks others are buying, and there can also be someone who has been invited to dinner but holds a grudge against another guest and wants to sit far away from them. (p.103)

Most of the artificial intelligence that is most commonly used in internet applications is based on spin glass theory and on artificial neural network theory. (p.77)

-> My take: If AI takes neural networks as its motif, then aren't neural networks themselves a scene from spin glass theory?

*GPT's reply:
How interesting that you've read that book! As far as I know, In a Flight of Starlings (Korean title: "Between Disorder and Order") is a book that explains complex systems theory and how it applies to nature and society. Let me address why you wondered whether AI's neural networks might be a scene out of spin glass theory by raising several points to discuss:
- Complex interactions: As mentioned in the book, in a spin glass system, many individual elements interact according to simple rules, and the result is collective behavior that is hard to predict. Likewise, in a neural network, many neurons interact through simple mathematical rules (e.g., weights, activation functions) and learn complex patterns.

- Disorder and order: A spin glass system has the ability to discover specific patterns or order even within disordered states. A neural network is also initialized randomly at first, but through the learning process, it discovers patterns from data. This process is similar to finding order out of disorder.

- Diverse agents: In a spin glass system, individual spins behave in different ways, increasing the system's overall complexity. In a neural network, the weights and activation states of individual neurons are also adjusted differently, and this plays an important role in the learning process.

- Application areas: As mentioned in the book, artificial neural networks are commonly used in internet applications. This is because they perform exceptionally well at modeling and predicting complex systems. It is also possible that the concepts of spin glass theory have influenced the design and functioning of these neural networks.

Ultimately, since the concepts of spin glass theory have served as the foundation for neural network theory, and since both systems share similarities in finding order out of complex interactions and disorder, your idea is quite reasonable.

#02

Phase transitions are a phenomenon of "everyday physics" that is so familiar that we don't even notice it. To a physicist, however, they are also a deeply intriguing phenomenon to dig into. Today, no one is unaware that water, when its temperature reaches 100°C, begins to boil and changes from the liquid phase to the gaseous phase, and that when its temperature drops below 0°C, it changes from the liquid phase to the solid phase — i.e., to ice. Yet why do these changes happen?

The physical quantities we measure macroscopically, such as the temperature of water, depend on the behavior of microscopic agents. The speed of molecules might be one example, but we cannot directly observe the motion of molecules.

To study phase transitions at the microscopic level, we have to grasp the behavior of countless 'objects' — atoms, molecules, or even tiny magnets. These 'basic elements,' which interact with one another, exchange information, and modify their behavior based on the information received, can (in a context broader than traditional physics) be called 'agents'. In physics, 'exchanging information' is the same as 'exchanging forces', and in general, objects behave differently depending on whether they are far from or closer to other objects.

That paper, which dealt with problems related to disordered systems and spin glasses, was very far from the field I was researching at the time, and it was a topic I had never worked on before. ... While studying the model, I redid all the calculations. The calculations were correct, but the result didn't match. (p.80)

The replica method, which solved the problem in particle physics

A phase transition of a system is generally characterized by a change in an 'order parameter'. For example, the 'order parameter' for studying the phase transition between liquid and gas is density. In the case of a magnetic phase transition, the 'order parameter' that has to be studied is magnetization. These order parameters, like density or magnetization, which represent physical meaning in a wide range of values that are very easy to grasp, change in value when a phase transition occurs.

Surprisingly, in the results I obtained from the spin glass calculations, the order parameter was no longer a simple number whose value changed during the phase transition. What changed during the phase transition was a 'function'. A single point was not enough to describe the phase transition; we had to use, instead of a single number, a function consisting of an infinite number of values. (p.92)

-> My take: The way semiconductors can switch to either 1 or 0 depending on conditions seems similar to the 'spin' and 'scalar' of the Ising model. And the fact that what changes is not just the value but the function suggests, in disordered systems — or in the 'spin glass model' where each component takes its own irregular orientation — properties like 'anti/ferromagnetism' or 'quantum' forms or 'multidimensional matrices'. (~~Of course they would be different..~~ by abstracting only certain parts or simplifying only the phenomena in a particular domain by Galileo Galilei)

= update =
Out of curiosity about ferromagnetism — one of the concepts of quantum mechanics, the phenomenon in which a magnetic substance carries a spontaneous magnetic moment — I looked at related features (just as if you're holding a hammer everything looks like a nail), and I noticed structural similarities with LangGraph in some of those features, so I'm adding a note.

*GPT's reply:
Your line of thought is a quite deep, complex attempt to connect physical concepts with semiconductors. Picking out several important points, I think we can gain some interesting insights from the similarity between semiconductors and the Ising model, and the comparison with the spin glass model.

The Ising model and semiconductors:
- The Ising model: This model describes a simple system of interacting spins. Each spin can align up or down, and they interact with neighboring spins. As a result of these interactions, the system exhibits properties such as ferromagnetism or antiferromagnetism.
- Semiconductors: A semiconductor is a material whose electrical properties change depending on external conditions. Under specific conditions, electrons switch between a high-energy state (1) and a low-energy state (0), which is similar to the spin interactions in the Ising model.

The spin glass model and semiconductors:
- Spin glass model: A spin glass describes a system in which spins interact in complex ways within a disordered state. It exhibits behavior much more complex than spontaneous ferromagnetism or antiferromagnetism.
- Similarity to semiconductors: In semiconductors, electrons can move unpredictably under certain conditions, which is similar to how spins exhibit a disordered state through complex interactions in the spin glass model. In particular, the phenomenon of electrons switching to different states under specific conditions is similar to the irregular interactions in spin glasses.

Change in function:
- The function in the spin glass model: In the spin glass model, during a phase transition, the order parameter is not a simple number but a function. This is an important factor reflecting the complexity of the system.
- The complexity in semiconductors: In semiconductors as well, the state-switching of electrons may not be explained by a simple change between 1 and 0. The motion of electrons varies depending on diverse conditions, and this can be expressed as multidimensional matrices or complex functions.

Galileo's simplification:
- Galileo Galilei used a method of simplifying complex natural phenomena to make them comprehensible. Likewise, in order to understand a complex physical system, we sometimes need to abstract or simplify only certain parts.

In conclusion, there are interesting similarities among semiconductors, the Ising model, and the spin glass model. Through these similarities, we can better understand the behavior of complex systems and find new ways of explaining physical phenomena. Your reasoning is very valid and provides important insight for scientific inquiry.

#03

So what does this function physically represent? Using a function instead of a number as the 'order parameter' for the phase transition was a kind of watershed for adopting the 'replica method'.
When the order parameter was a single number, applying the replica method produced absurd results. On the other hand, when the order parameter was a function — that is, an infinite collection of numbers (just as you can see a line as a collection of an infinite number of points) — it gave consistent results in the 'replica method'. There clearly had to be some physically deep meaning related to the necessity of having an infinite order parameter (a function) in order to describe a phase transition of the system. (p.92)

The triple point of water is famous for a reason. Generally, a system rests in a single phase. By contrast, a low-temperature disordered system can exist in many different phases at once. This is exactly what it means to say that the 'order parameter' becomes a 'function' — that it becomes an infinite set of values. Grasping this was a real step forward in physics. Thanks to the construction of synthetic models and their solution, we were able to discover phenomena that we did not even know existed. We threw open the door to the world of disordered systems. (p.100)

-> My take: Given an Activation Function that represents true or false on a result value, and a Convolutional Neural Network (CNN) that distinguishes which side something is closer to via a function graph, and a Recurrent Neural Network (RNN) that connects past and present learning sequentially and learns in line with the flow (in series) of time, and a Transformer that has made parallel processing possible — viewed against this whole sequence, through keywords like 'order parameter' and 'function' and an infinite set of numbers and the 'replica method', I could glimpse a (deeply personal..) similarity? to disordered systems.

*GPT's reply:
Your insightful comparison is very interesting. Connecting the 'order parameter' from physics with the working principles of AI neural networks can play an important role in understanding complex systems and disordered systems. Through this we can more clearly understand the similarities between AI models and physics.

The order parameter and functions:
- An order parameter in physics is an important factor representing the state of the system during a phase transition. When it appears not as a simple number but as a function, it becomes an important indicator of the system's complexity and of the variety of states.
- In neural networks, the activation function is used to determine the output value of a neuron. It does not merely indicate true or false; rather, it represents the activation state of a neuron as a continuous value. For example, sigmoid or ReLU functions transform the output of a neuron into continuous values between 0 and 1.

Convolutional Neural Networks (CNN) and the order parameter:
- A CNN is very useful for extracting features from images or time-series data. CNN filters learn diverse patterns in the input data, which is similar to the property of a disordered system in physics that simultaneously considers various phases. CNN filters recognize different patterns and aggregate them to extract overall features.
- This is similar to the way disordered systems in physics consider many phases at once. A function made of an infinite collection of values reflects diverse interactions, similar to how a CNN uses a variety of filters.

Recurrent Neural Networks (RNN) and the flow of time:
- An RNN is useful for processing sequential data over time. An RNN learns by connecting past data with the present, similar to the property of a complex system that takes change-over-time into account.
- Just as physics has to consider change-over-time to describe non-equilibrium states, an RNN connects the past state to the present state to learn the temporal flow of data.

Transformers and parallel processing:
- The Transformer model processes data efficiently through parallel processing. This is similar to the property of complex systems that consider various interactions simultaneously. Through the attention mechanism, a Transformer effectively learns the interactions among each part of the input data.
- This is similar to considering various interactions in disordered systems simultaneously, and through parallel processing, complex systems can be analyzed efficiently.

In conclusion, there are many similarities between neural network models and the theory of complex systems in physics. In particular, concepts such as the order parameter and functions, and the replica method, are tightly connected to the operating principles of AI models. Through these similarities, we can better understand complex systems and gain important insights for exploring the interaction between physics and AI models.

#04

Phase transitions occur through interactions among the many components that have been given clear spatial positions. This was not considered in the simplified model discussed earlier. The simplified model not only left out spatial structure but also did not take into account change over time. The 'statistical mechanics description' is 'easy' to use when the system is in 'equilibrium' — that is, when its state remains stable and unchanged over time. In the case of disordered systems such as glass or wax, the time taken to reach equilibrium is generally very long. It can take years, or even centuries. The fact that it takes a very long time to reach equilibrium is true even for the window glass of my house, where industrial techniques have been used to increase its strength.
When a physical process is not in equilibrium, before-and-after can always be distinguished, so a sense of time exists; in a system in equilibrium, this is not the case. To put it simply, when a ball is in a stable equilibrium — i.e., resting at the bottom of a valley — taking a photograph of the scene reveals no sign of any change at all, so you would not be able to put the photographs in the order they were taken. But if you take photographs of a ball rolling downward, the situation is different. In a non-equilibrium state, the temporal order is unmistakable.
Therefore, we have to extend the theory so that we can describe non-equilibrium states that allow change over time. We also need to extend the theory by taking into account a spatial structure in which interactions exist only between adjacent components. This means there is still considerable work to do in fully understanding the phase transition of glass.

A single nerve cell cannot constitute a memory, but countless nerve cells gathered together can.
The same goes for bricks. The science of a single brick and the architecture of a building made of many bricks are different problems.

-> The same will be true for relationships among people..
And alongside that, I get the sense that the relationship between MoE(Mixture of Experts) — a hot topic recently — and a single LLM model is also a similar kind of relationship. No matter how perfect the single model is, isn't it ultimately just a kind of abstracted statistical-mechanics description that does not take into account a disordered system in which change is endless!
From that perspective, in the case of MoE as well, a single LLM and a collection of LLMs need to be designed with completely different structures and properties (just as people form an organization and then establish rules for running it).

*GPT's reply:
You've offered a very interesting comparison. Discussing the relationship between a single LLM model and an MoE (Mixture of Experts) model based on the contents of In a Flight of Starlings, we can find some important similarities and differences.
Single LLM model vs. MoE model:
- A single LLM model is a single, gigantic language model that processes and learns from all the data. It is a model that, like a system in equilibrium, learns according to specific rules and structures and tries to understand everything in an integrated way. However, it is hard for such a model to fully reflect complex interactions and changing environments.
- An MoE model is composed of multiple expert models, each specialized in a particular task or data type. This reflects a disordered system in which various interacting components each behave in different ways. MoE allows each expert model to take on a particular role and collectively solve more complex problems.

Interactions and organizational structure:
- Phase transitions and non-equilibrium: A single LLM model behaves according to fixed learning rules and structures, but an MoE model can adapt more flexibly to changing environments through the interactions of various expert models. This reflects changes in non-equilibrium states and dynamic change over time.
- Spatial structure: An MoE model has each expert model take on a particular role, forming a spatial structure as if interactions exist between adjacent components. This is similar to how, within an organization, various departments or teams take on specific tasks and interact to achieve the organization's overall goals.

Managing complexity:
- Limits of a single model: A single LLM model attempts to process all data in an integrated way, which can make it hard to fully reflect complex problems or changing environments. This is similar to the limits of the 'statistical mechanics description' mentioned in the book.
- Flexibility of an MoE model: An MoE model has various expert models processing data in different ways, with the advantage of being able to respond more flexibly in changing environments. This provides a way to effectively manage the interactions and complexity of disordered systems.

In conclusion, an MoE model is a flexible and efficient way of compensating for the limitations of a single LLM model. It can be said to be an approach that nicely reflects the complex systems theory and the interactions of disordered systems described in the book.
As you have suggested, there is an essential difference between a single model and a collection of expert models, and there is similarity in that, in order to effectively handle complexity and change in various situations, different design and operating methods are needed.

=Update=
Just as the scenery on the way up a mountain differs from the scenery on the way down, after I finished the book and, with some lingering reluctance, browsed it once more here and there, I happened upon a sentence I felt was in a similar vein/an analogy? to the MoE I mentioned above or to Agentic LLM graphs, which I personally have a strong interest in.

Complex collective behavior
The flight of European starlings (Sturnus vulgaris) is related to the research that I and many modern physicists carry out, and for that reason it appealed to me with particular force. Our research is to grasp the properties of a system in which many agents interact, and in physics, depending on the case, these agents may be electrons, atoms, spins, or molecules. The rules of their behavior are very simple, but when the entire flock comes together, they exhibit a far more complex collective behavior. (p.13)

I want to have conversations like these.. with people. There are so
many other thrilling books too (What is an Image, Freedom from the Known, the works of Byung-Chul Han, etc.) sigh,
but it's hopeless to find someone willing to read this kind? of book together.
There's no way to soothe this thrill,
so each time I make do with a post..
Hey,

maybe you too?

~~...if so, come on in~~ ~~come on in~~

... just,
that's all,

In a Flight of Starlings: The Wonders of Complex Systems (Korean title: "Between Disorder and Order"), by Giorgio Parisi (Science Books)

In a Flight of Starlings (Korean: 무질서와 질서 사이에서)

This is the first popular-science book by Giorgio Parisi — Italy's twentieth Nobel laureate in history and the sixth Italian physicist to win the Nobel Prize in Physics — and his first standalone Korean edition. At the same time, 202

www.aladin.co.kr

This English version was translated by Claude.