An overview of different types of models and their relation to how we understand the world. The first in a series of writings describing the role of models and representations and how they relate to reasoning.
“Model” has become a catch-all term for many different aspects of understanding. There are mental models, toy models, computer models, animal models, and world models. A term so widely used might be expected to lose some meaning, but the breadth in application of this simple word also matches the complexities of life that models seek to distill. One reason that the “idea of a model” takes on many different meanings is the multitude of things that we attempt to model. Models reflect our attempt to capture an understanding of the world.
George E. P. Box once quipped that all models are wrong, but some are useful. For more than a decade, I have been haunted by this simple aphorism. What, if anything, does it mean to be a model? Under what conditions are models wrong or right? More importantly, how exactly can a model be useful? This series of writings is a short attempt to sketch out some answers to these questions. My motivation is my belief that models are fundamental to reasoning. Moreover, I expect that understanding or even creating some form of general intelligence will depend on model-building in a fundamental way. Hence, evaluating claims of general intelligence should benefit from a deeper understanding of what models are. As with all writing, nothing here should be seen as exhaustive.
To better understand what something is, one can start by considering how it is used. Models are ubiquitous in science, engineering, and technology. I take it as given that models relate to understanding, possibly even as a subset. However, a closer neighbour in the plane of abstract terms would be representations. Let’s consider three examples to see how models are used: animal models, mathematical models, and computer models.
Animal models are the closest to the world. Biologists working on frogs, mice, fish, or monkeys all see their specific subject of interest as a model in which they can perform experiments, test hypotheses, and gain new insights into the secrets of life. The animal models studied are specific species such as Xenopus laevis (frog), Drosophila melanogaster (fruit fly), or Mus musculus (house mouse). While many other animals can be and are used, the total number of these animal models is a tiny subset of all the species in the world. There are even cases, as with the axolotl (Ambystoma mexicanum), where there are more animals existing in labs than in the wild. What then is the reasoning behind restricting biological investigations to these specific species? These animals are studied as representations of aspects of biology more widely. They act as models for the world, true in generality. Mechanisms identified in one animal model will likely be reproduced in many others and then taken as true without the need to test in every other species in the world. Animal models simplify our search for general natural principles.
Whereas biology is messy (multi-scale, stochastic, bottom-up), many other sciences are more amenable to models. Mathematical models, for example, are more abstract, composed of equations that also aim to capture generalisable principles. A mathematical model can be used to explain a mechanism observed in an animal model, providing supporting evidence for a hypothesis. However, the strength of support is limited given that a mathematical model will be, more likely than not, a toy model. The question of when a mathematical model ceases to be a toy model is difficult to answer. Many policy decisions in society are made with the support of a mathematical model, in areas ranging from epidemiology to economics. To some extent, these models are also supported by large amounts of data and, hence, tightly coupled to the world. However, both in policy and biology, the models used rely on simplifying assumptions that distill the influences of many independent components into a set of symbolic equations. This simplified world view is why the models are often seen as “toys”, developed to explore the mathematical properties of this idealised world. It also allows for a deeper understanding of fundamental processes, as the mathematical descriptions that match observations are built on and further developed.
Experiments in mathematics are performed internally. Yet an idealised thrown stone will fall very near to where a model predicts, whether in Manchester, Moscow, or on the Moon. Mathematical models have been extremely successful because many aspects of the world are amenable to simplified symbolic representations and, as a result, these representations can be parsed, shared, and re-applied. The development and proliferation of mathematical models as a codex for the world is at the heart of the scientific method and our development as a species. These models have provided an intelligible abstraction of our world through a highly efficient encoding schema. The success in mathematics has led some to expect that our understanding relies on some form of symbolic representation, a position that I will refer to as symbolism.
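As a small illustration of how portable such a symbolic model is, the thrown stone can be sketched in a few lines. This is only a toy under stated assumptions: the stone is launched from ground level, there is no air resistance, and the surface gravity values are approximate textbook figures. The function name landing_distance and the launch parameters are hypothetical, chosen for this example.

```python
import math

# Approximate surface gravity in m/s^2 (textbook values, assumed here).
G_EARTH = 9.81
G_MOON = 1.62


def landing_distance(speed, angle_deg, g):
    """Horizontal range of an idealised projectile.

    Assumes launch from ground level and no air resistance,
    so the range is v^2 * sin(2 * theta) / g.
    """
    angle = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * angle) / g


# The same equation is re-applied everywhere; only one parameter changes.
for place, g in [("Manchester", G_EARTH), ("Moscow", G_EARTH), ("Moon", G_MOON)]:
    distance = landing_distance(10.0, 45.0, g)
    print(f"{place}: {distance:.1f} m")
```

The point of the sketch is that nothing about the model is specific to a place: one equation, shared and re-applied, with gravity as the only thing that differs between Manchester and the Moon.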
Animal models abstract the complexities of all species into a subset of animals that are seen as representative of biological systems. Mathematical models abstract one step further, mapping real-world processes onto an idealised world containing many simplifying assumptions (the world of the spherical cow). I claim that computer models sit in between animal and mathematical models, insofar as they are neither naturally occurring nor purely abstract representations. They are typically more data-driven, focussing on statistical patterns and correlations, and are possibly broader as a category of models. For example, some aspect of an animal can be represented in a simulation (a subset of computer models), which combines animal models with mathematical models. Simulations are often used to further support a hypothesis, especially where the process is viewed as bottom-up, emergent, or complex. Statistical models, which may follow a simplified mathematical model of the population, may only be implementable on a computer. I raise these examples to stress that these terms are more likely fuzzy and context-specific than demarcating clear boundaries between modelling types.
There may, however, be a clearer distinction between different models in terms of their function. A simulation is used to reproduce some observed phenomenon. Hence, simulations are a subset of generative models, those models that seek to reproduce the underlying generative process that produces the data space. A statistical model, instead, may be used to segment or cluster a population. Such models aim to find boundaries within the data space, and are known as discriminative models. A model that tells the difference between a cat and a dog is discriminative whereas one that reproduces an image of a dog is generative.
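The distinction can be made concrete with a deliberately tiny sketch. Everything here is an assumption made for illustration: the “data space” is a single number (body weight in kg), the cat and dog weights are synthetic draws from made-up Gaussians, the discriminative model is a one-threshold rule chosen to minimise errors on the data, and the generative model is a Gaussian fitted to the dog weights. The names classify and generate_dog are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic one-dimensional data space: body weights in kg (made-up numbers).
cats = [rng.gauss(4.0, 0.5) for _ in range(200)]
dogs = [rng.gauss(20.0, 4.0) for _ in range(200)]
labelled = [(w, "cat") for w in cats] + [(w, "dog") for w in dogs]


def errors(threshold):
    """Count points misclassified by the rule 'dog if weight > threshold'."""
    return sum((w > threshold) != (label == "dog") for w, label in labelled)


# Discriminative view: learn only a boundary within the data space.
boundary = min((w for w, _ in labelled), key=errors)


def classify(weight):
    return "dog" if weight > boundary else "cat"


# Generative view: estimate the process that produces dog weights,
# then draw new samples from that estimate.
dog_mu = statistics.mean(dogs)
dog_sigma = statistics.stdev(dogs)


def generate_dog():
    return rng.gauss(dog_mu, dog_sigma)


print(classify(4.2), classify(18.0))
print(round(generate_dog(), 1))
```

The classifier knows nothing about what a dog weighs beyond which side of the boundary it falls on, whereas the generative model can produce plausible new dog weights but was never asked to separate anything.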
The accuracy and performance for both discriminative and generative models have been massively improved by the advent of large data volumes, accelerated computing, and artificial neural networks. Neural network models take available data and construct statistical associations spread across the weights of the networks. They have been transformative in vision tasks, natural language processing, and many aspects of modern life. More recently, their use as generative models has heightened our expectation of the reasoning ability of a computer model through commercial models such as OpenAI’s GPT series or applications of the transformer and diffusion models in general.
Neural network models are almost entirely data-driven. They do not depend on domain-specific closed-form equations or simplifying assumptions like mathematical models. They are also not typically used to identify generalisable principles about the world as with animal models, though there are certainly exceptions. The representations formed using a neural network model do not as easily provide the same understanding of the world as the other models described, which is why these models are commonly referred to as “black boxes”. However, they have proven to be extremely proficient at exploiting patterns in the data and, more recently, reproducing the underlying generative process. OpenAI’s Sora model reproduced aspects of fluid flow as well as many physics-based simulators do. Examples like this show that data-driven models are at least as effective as, if not more effective than, those that depend on domain-specific assumptions. Hence, it may be that our understanding is more associative, relying on an internal process of data analysis and distillation, and that this may be more fundamental than our ability to symbolically reason. I will refer to this position as numericism, to distinguish it from symbolism above. Under this view, one could even see symbolic reasoning as an explanatory tool applied on top of data-driven representations, i.e. that numericism underpins symbolism in some unidentified way.
The reason I draw the distinction between different sub-classes of models and the view of data-driven representations (numericism) and symbolic representations (symbolism) is to highlight that both the features and function of models are highly variable. Plato’s allegory of the cave is, in a sense, an allegory of models and representations. The shadows across the cave wall are our models of the world, simplified representations of the inaccessible fire behind us. The shadows are passively experienced but the models I have described are actively constructed, whether in the different generalised components studied in an animal model or the learned weights of a neural network model. In all cases, the models form explanations about the underpinning processes though the interpretability of these explanations can vary. Finally, as I will go on to describe, I believe that reasoning is the discovery and/or construction of these models.
Insofar as models seek to encode the world, there will always be some information loss. Anything else would be a reproduction, rather than a representation. Yet the evaluation of a model depends on its intended purpose. A model to help us understand will be as useful as it is intelligible (symbolism). On the other hand, a model used to predict need only be accurate (numericism). In the next piece of writing, I aim to further explore the limitations of models and the degree to which these purposes are mutually exclusive. I will also introduce the concept of world models and the criteria for a model to be representative of the world. Understanding the limitations will, I hope, help to highlight the process of model discovery and, alongside it, the role of reasoning in how we form these models, especially in relation to world models.