Resumen:
BACKGROUND: Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. METHODS: We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-a-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. RESULTS: Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (rho = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. CONCLUSIONS: Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.