Terrestrial biosphere models are indispensable tools for analyzing the biosphere-atmosphere exchange of carbon and water. Evaluating these models against site-level observations tests our current understanding of biospheric responses to meteorological variables. Here we propose a novel model-data comparison strategy that accounts for the fact that CO2 and H2O exchanges fluctuate on a wide range of timescales. Decomposing simulated and observed time series into subsignals allows us to quantify model performance as a function of frequency and to localize model-data disagreement in time. This approach is illustrated using site-level predictions from two models of different complexity, Organizing Carbon and Hydrology in Dynamic Ecosystems (ORCHIDEE) and Lund-Potsdam-Jena (LPJ), at four eddy covariance towers in different climates. Frequency-dependent errors reveal substantial model-data disagreement in seasonal-annual and high-frequency net CO2 fluxes. By localizing these errors in time, we can trace them back, for example, to overestimations of seasonal-annual periodicities of ecosystem respiration during spring green-up and autumn in both models. At the same frequencies, systematic misrepresentations of CO2 uptake severely affect the performance of LPJ, which is a consequence of its parsimonious representation of phenology. ORCHIDEE shows pronounced model-data disagreements in the high-frequency fluctuations of evapotranspiration across the four sites. We highlight the advantages that our methodology offers for rigorous model evaluation compared to classical model evaluation approaches. We propose that ongoing model development will benefit from considering model-data (dis)agreements in the time-frequency domain. Copyright 2010 by the American Geophysical Union.
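The idea of scoring a model separately on each timescale, as described in the abstract above, can be illustrated with a minimal sketch. It is not the decomposition method used in the study: it swaps in a plain centered moving-average split, and every function name (`moving_average`, `decompose`, `band_errors`), the 31-day window, and the synthetic "observed" and "simulated" series are assumptions made purely for illustration.

```python
import math

def moving_average(x, window):
    """Centered moving average; the window shrinks near the series edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def decompose(x, windows):
    """Split x into len(windows) + 1 subsignals, fastest first, that sum to x."""
    bands, residual = [], list(x)
    for w in windows:  # windows: increasing lengths, in time steps
        smooth = moving_average(residual, w)
        bands.append([r - s for r, s in zip(residual, smooth)])
        residual = smooth
    bands.append(residual)  # slowest component
    return bands

def rmse(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

def band_errors(obs, sim, windows):
    """RMSE between observed and simulated subsignals, one value per band."""
    return [rmse(o, s) for o, s in zip(decompose(obs, windows), decompose(sim, windows))]

# Synthetic daily series (an assumption, not tower data): annual cycle plus 8-day cycle.
days = range(730)
obs = [math.sin(2 * math.pi * t / 365) + 0.3 * math.sin(2 * math.pi * t / 8) for t in days]
# The hypothetical "model" overestimates only the annual amplitude.
sim = [1.5 * math.sin(2 * math.pi * t / 365) + 0.3 * math.sin(2 * math.pi * t / 8) for t in days]

errors = band_errors(obs, sim, windows=[31])
print("fast-band RMSE:", errors[0])
print("slow-band RMSE:", errors[1])

# Localizing the disagreement in time: pointwise error of the slow subsignal.
slow_obs, slow_sim = decompose(obs, [31])[1], decompose(sim, [31])[1]
slow_error = [s - o for s, o in zip(slow_sim, slow_obs)]
worst_day = max(days, key=lambda t: abs(slow_error[t]))
print("largest slow-band error at day", worst_day)
```

In this toy setting the error concentrates in the slow band, and the pointwise slow-band error shows when in the series it is largest. A single whole-series RMSE would report the mismatch without attributing it to a timescale or a period.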