Modeling biology is a runway process

We are all accustomed to making models. We do it every day when we speak, when we meet someone for the first time, or when we look for a parking spot. Without models the world would be one gray area. A model is nothing more than a construct used to evaluate a complex environment. It can break something continuous into arbitrary yet digestible bits or give something intangible a concrete existence. Human language does both.

Scientists take this a step further to systematically interrogate unresolved problems. Biologists use model organisms to study the inner workings of life. These organisms are chosen for desirable qualities that help draft a biological roadmap, which in turn makes scientific travel easier. Chemists depict reactions between molecular species with arrows and diagrams, none of which can accurately describe minute atomic interactions. A good model is not one that describes every component of a system; it is one that describes a system well enough to make practical predictions and to form testable hypotheses.

Despite the infinite complexity of biology, modeling its components is a well-defined process when done right. We’ll focus on models of the mathematical variety for now. Conceptual and physical models serve their purposes and can be used alongside mathematical models, but the formalism of mathematics, which describes a process without ambiguity, is powerful.

(1) A model starts with a problem to describe. Ideally this problem has biological significance and a context that can be defined. For example, modeling the effect of dopaminergic drugs on neuronal impulses requires background knowledge of action potentials, binding affinities of the drugs to dopamine receptors, and other characteristics of the synaptic and cellular environments that affect the interaction between drug and cell.

(2) The problem has to be bounded. There may be factors that lead to inter-individual variation in the neuronal response to drugs. However, if this variation is extraneous to the problem, leave it be. 

(3) Define lift-off. There must be an endpoint to achieve that has little to do with the form of the model and everything to do with what the model must do to illuminate the problem. Modelers sometimes make the mistake of choosing the type of model before these three steps are complete. We are all limited by the set of modeling approaches we know well, but letting the needs of the model guide the problem, rather than the reverse, limits the power of the results.

(4) Illuminate the runway. Most models are built from subunits that are more easily characterized than the entire process being described. Resolve all of the smaller relationships, and the path ahead is clearer. For an analytical model, this includes deriving functions to be incorporated in a larger framework. For a statistical model, this includes evaluating the variation each variable describes and the covariation between them.
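As a minimal sketch of building from subunits, the dopaminergic drug example can be split into two smaller relationships that are resolved separately and then composed. Everything here is an illustrative assumption rather than a published model: the one-site equilibrium binding form for occupancy, the linear mapping from occupancy to firing rate, and the function and parameter names (`receptor_occupancy`, `firing_rate`, `drug_response`, `kd`, `baseline_hz`, `max_fold_change`).

```python
def receptor_occupancy(concentration, kd):
    """Subunit 1: fraction of receptors bound at equilibrium.

    Assumes simple one-site binding, occupancy = C / (C + Kd).
    """
    if concentration < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and kd must be > 0")
    return concentration / (concentration + kd)


def firing_rate(occupancy, baseline_hz, max_fold_change):
    """Subunit 2: hypothetical linear mapping from occupancy to firing rate.

    At zero occupancy the rate is baseline_hz; at full occupancy it is
    baseline_hz * max_fold_change.
    """
    return baseline_hz * (1.0 + (max_fold_change - 1.0) * occupancy)


def drug_response(concentration, kd, baseline_hz, max_fold_change):
    """Larger framework: compose the two subunits into one prediction."""
    occupancy = receptor_occupancy(concentration, kd)
    return firing_rate(occupancy, baseline_hz, max_fold_change)
```

Because each subunit is a separate function, either one could be swapped for a better-characterized relationship without touching the other.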

(5) Proceed slowly with increasing speed. Now that the path is readied, think carefully about obstacles in the way, and stop and restart as necessary. Once speed is built, however, continue to completion.

(6) Monitor the flight-worthiness of the model. At this stage, the model needs to be tested. With data in hand, the model should fit the data better than alternatives. Without data in hand, the model should exhibit explainable behaviors under a gauntlet of test conditions. If things don’t check out, the model should be grounded for repair or retired in favor of another.
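One common way to check whether a model fits data better than an alternative is to compare an information criterion that rewards fit and penalizes extra parameters. The sketch below compares a constant model against a straight line using the least-squares form of the Akaike information criterion (AIC), assuming Gaussian errors. The data points are made up for illustration, and the helper names (`fit_constant`, `fit_line`, `aic`) are hypothetical.

```python
import math


def fit_constant(xs, ys):
    """Alternative model: predict the mean of y regardless of x."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate model: ordinary least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def aic(predict, xs, ys, n_params):
    """AIC for least squares with Gaussian errors: n*ln(RSS/n) + 2k.

    n_params counts fitted coefficients only; the shared error-variance
    parameter would add the same constant to every model being compared.
    Requires a nonzero residual sum of squares.
    """
    n = len(xs)
    rss = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    return n * math.log(rss / n) + 2 * n_params


# Made-up observations, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]

aic_constant = aic(fit_constant(xs, ys), xs, ys, n_params=1)
aic_line = aic(fit_line(xs, ys), xs, ys, n_params=2)

# The model with the lower AIC is preferred.
best = "line" if aic_line < aic_constant else "constant"
print(f"constant: {aic_constant:.2f}, line: {aic_line:.2f}, preferred: {best}")
```

A lower AIC only ranks the candidates against each other; it does not show that the preferred model is adequate in an absolute sense.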

Sometimes crafting a model and evaluating alternatives is the final goal of a modeler. Especially with statistical models, identifying the variables that are most important to an outcome is the goal. This exercise can be taken a step further. With an appropriate model, one or more hypotheses can be tested by changing model conditions and observing the outcomes. This is one very important utility of models: interpolating and making projections about scenarios that are of interest but lack data. In time, data will demonstrate weaknesses of the model for certain conditions, and the model will need to be refined or replaced in favor of a better one. This lifecycle of a model is its strength. The next generation can be used to investigate further and generate new hypotheses that can be tested empirically. In many ways the scientific endeavor is modeled on the biological one, and through its adaptation it persists.
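Changing model conditions and observing the outcomes can be as simple as running one model over a set of named scenarios. The sketch below reuses the dopaminergic drug example with a hypothetical one-site binding model. The functional form, the default parameter values, and the scenario names are all assumptions made for illustration.

```python
def predicted_firing_rate(concentration, kd, baseline_hz=5.0, max_fold_change=2.0):
    """Hypothetical model: one-site receptor occupancy scaled to a firing rate."""
    occupancy = concentration / (concentration + kd)
    return baseline_hz * (1.0 + (max_fold_change - 1.0) * occupancy)


# Each scenario changes one condition relative to the reference case.
scenarios = {
    "reference drug": {"concentration": 10.0, "kd": 10.0},
    "higher-affinity drug": {"concentration": 10.0, "kd": 1.0},
    "reference drug, tenfold dose": {"concentration": 100.0, "kd": 10.0},
}

projections = {
    name: predicted_firing_rate(**conditions)
    for name, conditions in scenarios.items()
}

for name, rate in projections.items():
    print(f"{name}: {rate:.2f} Hz")
```

In this toy model the prediction depends only on the ratio of concentration to `kd`, so a tenfold gain in affinity and a tenfold dose give the same projection. That is the kind of testable hypothesis such an exercise can surface before any data exist.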

Richard Grewelle is a PhD student in the De Leo lab who studies ecological and evolutionary underpinnings of wildlife disease systems, focusing on the marine environment. Marine diseases present significant challenges not only to biologists; they may devastate fragile ecosystems that support fisheries or provide ecological services. However, they remain poorly studied compared to terrestrial diseases. Richard builds models to understand and predict the impact of marine diseases and is particularly interested in applying these techniques to the conservation of marine species. You can find out more about his research and the De Leo lab at: https://deleolab.stanford.edu/research