November 16, 2010
When you’re trying to use models to probe the behavior of a complex biological system, there usually comes a point where you have to “fit parameters”. This happens because the model is trying to build up a macroscopic picture from underlying features that may be impossible to measure. For example, in the case of tumor growth, your model might use local nutrient density as a parameter that affects the growth rate of individual cells in the tumor, and therefore the growth of the tumor overall. But nutrient density might be impossible to measure directly, so you would have to use experimental data on something that’s easier to measure (e.g. how rapidly tumors grow) to deduce how nutrient density changes across the tumor. This might then allow you to make a prediction about what would happen in a different set of circumstances. A good deal of work has gone into figuring out how to estimate model parameters from experimental data, because it’s difficult: you may have to computationally explore a huge space to test which parameter values best fit your data, and you may find that your experimental data can’t distinguish among several different sets of parameters that each fit the data quite well. A recent paper (Fernández Slezak et al. (2010) When the Optimal Is Not the Best: Parameter Estimation in Complex Biological Models. PLoS ONE 5: e13283. doi:10.1371/journal.pone.0013283) highlights a disturbing problem of parameter estimation: the parameters you find by searching for the optimal fit between model and experiment may not be biologically meaningful.
You might find this statement self-evident, and I’ll admit I didn’t fall off my chair either. But bear with me, because this is a more interesting study than you may think. The authors start with a model built by others that describes how solid tumors grow when they don’t yet have a blood supply. The model recognizes that solid tumors are composed of a mixture of live and dead cells, and treats the nutrients released by dead cells as potential fuel for the live cells. The question in the original model was how far a tumor can get in this avascular mode, and what factors lead to growth or remission. Fernández Slezak et al. aren’t interested in this question, though: they’re using the model as a test case to explore how easy it is to find parameters that make the model match the experimental data. This particular model has six free parameters (which is more than 4, and fewer than 30); this is manageable, though large. I’ll mention two of the parameters, since they become important later: β, which is the amount of nutrient a cell consumes while undergoing mitosis; and C(c), the concentration of nutrient that maximizes the mitosis rate. Many people would have had to settle for a rather sparse sampling of parameter space for a model of this size, but because some of the authors work at IBM they had access to remarkable computational resources (several months of Blue Gene’s compute time).
As their test data set, Fernández Slezak et al. used synthetic growth curves extrapolated from experimental measurements made by others, and they set out to find parameters that fit this “experimental” data by brute-force computation. Essentially, this means guessing values for each of the six parameters, putting those values into the model, simulating the growth of a tumor, and comparing the predicted growth curve to (in this case synthetic) experimental reality. The optimal parameter set is then the one that minimizes the “cost function”: the sum of the differences between prediction and reality across all time points. (For the aficionados, they used four different methods to minimize the cost function: Levenberg-Marquardt, parallel tempering, MIGRAD and downhill simplex. Of these, Levenberg-Marquardt did best at finding cost function minima and downhill simplex did worst.)
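The loop just described — guess parameters, simulate, score the prediction against the data — can be sketched in miniature. Everything below is a hypothetical stand-in: a two-parameter logistic curve instead of the real six-parameter tumor model, a coarse grid instead of Blue Gene, and a sum-of-squared-differences cost for convenience. The point is only the shape of a brute-force fit:

```python
import math

def growth_curve(t, r, K, V0=1.0):
    """Toy logistic growth curve standing in for the tumor model
    (hypothetical; the real model has six free parameters)."""
    return K / (1.0 + (K / V0 - 1.0) * math.exp(-r * t))

# Synthetic "experimental" data, generated with known r=0.5, K=100
times = [0, 2, 4, 6, 8, 10, 12]
data = [growth_curve(t, 0.5, 100.0) for t in times]

def cost(params):
    """Sum of squared differences between prediction and 'reality'."""
    r, K = params
    return sum((growth_curve(t, r, K) - d) ** 2 for t, d in zip(times, data))

# Brute-force scan of a coarse grid of candidate parameter values
best = min(
    ((r / 100.0, K) for r in range(10, 101, 5) for K in range(50, 151, 10)),
    key=cost,
)
print(best)  # -> (0.5, 100): the generating parameters are on the grid and win
```

Here the grid happens to contain the true parameters, so the scan recovers them exactly; the paper’s point is about what happens when the landscape is less forgiving.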
With a relatively dense sampling of parameter space, the authors were able to see just how rugged this cost function landscape is. Parameter values that are right next to each other can be dramatically different in how well they fit the data. There are lots of local minima, and no completely clear winners: each minimization method found several “best fits” that were essentially equal in terms of the cost function, each of which described the experimental data very well. But the parameter values for these “best fits” were not close together.
This is where β and C(c) come in. These particular parameters have been measured independently in the literature, although the authors scrupulously ignored this in their parameter-fitting exercise. So now it’s possible to compare the “best fits” from the parameter-fitting to actual experimental ranges for two of the six parameters. And it turns out that only a small subset of the values found as “best fits” fall within the ranges set by the experimental values, and (embarrassingly, perhaps) the very best “best fit” is not one of them.
What does this mean for the future of parameter-fitting? Fernández Slezak et al. remind us that fitting existing data is not the point: the point is to gain insight into how real biology may behave in circumstances that have not been, or cannot be, directly tested. This is not too different from the goal in machine learning, where you want to use a training set to teach you something about how to approach new data. But in machine learning it’s well known that an algorithm that shows good performance on the training set can show very poor predictive ability; this happens when the model is “overfitted” to the data.
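The overfitting trap has a minimal, deterministic illustration (an invented toy, not anything from the paper): a “model” that simply memorizes its training data scores perfectly on that data and badly on new points, while a simpler model that is close to the underlying truth does the reverse.

```python
# Deterministic toy: the underlying truth is y = 2x; training data carry +/-1 noise
train = [(x, 2 * x + (1 if x % 2 == 0 else -1)) for x in range(10)]
# Held-out points fall midway between training x's, exactly on the true line
test = [(x + 0.5, 2 * x + 1.0) for x in range(10)]

def memorizer(x):
    """Overfitted 'model': parrot the y of the nearest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def line(x):
    """Simple model matching the underlying truth."""
    return 2 * x

def mse(model, pts):
    return sum((model(x) - y) ** 2 for x, y in pts) / len(pts)

print(mse(memorizer, train), mse(line, train))  # 0.0 vs 1.0 on training data
print(mse(memorizer, test), mse(line, test))    # 2.0 vs 0.0 on held-out data
```

The memorizer “wins” on the training set by fitting the noise, which is exactly what betrays it on new data.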
How are we going to avoid the problem of overfitting in biological models? The authors propose using a subset of the available experimental data for the first round of parameter-fitting, and reserving the rest of the data to test the model. This is similar to the cross-validation techniques used in statistical learning, or the R(free) test used by structural biologists. But this approach probably won’t work very well if we simply pick the “best fit” from the first round of parameter-fitting and ask whether it fits the new data; instead, we’ll need a way to define a set of “reasonable fits”. Rather than minimizing the cost function, we may only need to get it below a certain level. Then the question becomes: which set of parameters is “reasonable” for all the data sets available?
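That hold-out idea — fit on part of the data, keep every parameter set whose cost falls below a tolerance rather than the single optimum, then see which of those “reasonable fits” survive the reserved data — might look like this in miniature. The two-parameter logistic model, the train/test split, and the tolerance are all hypothetical choices for the sketch:

```python
import math

def growth_curve(t, r, K, V0=1.0):
    """Toy logistic stand-in for the tumor model (hypothetical)."""
    return K / (1.0 + (K / V0 - 1.0) * math.exp(-r * t))

train_t = [0, 1, 2, 3]   # early time points: used for fitting
test_t = [8, 10, 12]     # reserved late points: used for validation
train_d = [growth_curve(t, 0.5, 100.0) for t in train_t]
test_d = [growth_curve(t, 0.5, 100.0) for t in test_t]

def cost(r, K, ts, ds):
    return sum((growth_curve(t, r, K) - d) ** 2 for t, d in zip(ts, ds))

grid = [(r / 100.0, float(K)) for r in range(40, 61, 5)
        for K in range(60, 141, 20)]
tol = 1.0  # arbitrary "good enough" threshold on the cost function

# "Reasonable fits": every grid point below tolerance on the training data.
# Early growth barely depends on K, so many (r, K) pairs pass this test.
reasonable = [(r, K) for r, K in grid if cost(r, K, train_t, train_d) < tol]
# Survivors: reasonable fits that stay below tolerance on the reserved data,
# where the late-time behavior finally pins K down.
survivors = [(r, K) for r, K in reasonable if cost(r, K, test_t, test_d) < tol]
print(len(reasonable), "reasonable on training data;",
      len(survivors), "still reasonable on held-out data")
```

The held-out points prune the degenerate fits that the training data alone could not distinguish — which is the behavior we’d want from a “reasonable fits” criterion.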
Since the model itself is undoubtedly not a perfect representation of reality, it may be pointless to look for a perfect fit between model and experiment. As a wise person once (almost) said, it’s important not to let the perfect be the enemy of the good.
Fernández Slezak D, Suárez C, Cecchi GA, Marshall G & Stolovitzky G (2010). When the optimal is not the best: parameter estimation in complex biological models. PLoS ONE 5(10): e13283. PMID: 21049094