The Earth’s land surface and the atmosphere are strongly interlinked through the exchange of energy and matter. This coupled behaviour causes various land-atmosphere feedbacks, and an insufficient understanding of these feedbacks contributes to uncertain global climate model projections. For example, a crucial role of the land surface in exacerbating summer heat waves in midlatitude regions has been identified empirically for high-impact heat waves, but individual climate models differ widely in their respective representation of land-atmosphere coupling. Here, we compile an ensemble of 54 combinations of observations-based temperature (T) and evapotranspiration (ET) benchmarking datasets and investigate coincidences of T anomalies with ET anomalies as a proxy for land-atmosphere interactions during periods of anomalously warm temperatures. First, we demonstrate that a large fraction of state-of-the-art climate models from the Coupled Model Intercomparison Project (CMIP5) archive produces systematically too frequent coincidences of high T anomalies with negative ET anomalies in midlatitude regions during the warm season and in several tropical regions year-round. These coincidences (high T, low ET) are closely related to the representation of temperature variability and extremes across the multi-model ensemble. Second, we derive a land-coupling constraint based on the spread of the T-ET datasets and consequently retain only a subset of CMIP5 models that produce a land-coupling behaviour that is compatible with these benchmark estimates. The constrained multi-model simulations exhibit more realistic temperature extremes of reduced magnitude in present climate in regions where models show substantial spread in T-ET coupling, i.e. biases in the model ensemble are consistently reduced.
The multi-model simulations for the coming decades also display decreased absolute temperature extremes in the constrained ensemble. On the other hand, the differences between projected and present-day climate extremes are affected to a lesser extent by the applied constraint, i.e. projected changes are reduced locally by around 0.5 to 1 °C, but this remains a local effect in regions that are highly sensitive to land-atmosphere coupling. In summary, our approach offers a physically consistent, diagnostic-based avenue to evaluate multi-model ensembles and subsequently reduce model biases in simulated and projected extreme temperatures. © Author(s) 2017.
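The two steps summarised above (a coincidence diagnostic, followed by a benchmark-based model selection) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 90th-percentile threshold for "high T" and the use of the benchmark minimum and maximum as the acceptance range are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def coincidence_rate(t_anom, et_anom, t_quantile=0.9):
    """Fraction of 'hot' time steps that coincide with a negative ET anomaly.

    'Hot' is defined here (as an assumption) as a T anomaly above the
    given quantile of the T anomaly series itself.
    """
    t_anom = np.asarray(t_anom, dtype=float)
    et_anom = np.asarray(et_anom, dtype=float)
    if t_anom.shape != et_anom.shape:
        raise ValueError("T and ET anomaly series must have the same shape")

    hot = t_anom > np.quantile(t_anom, t_quantile)
    if not hot.any():
        return float("nan")
    # Share of hot time steps with below-normal evapotranspiration.
    return float(np.mean(et_anom[hot] < 0.0))


def constrain_models(model_rates, benchmark_rates):
    """Keep models whose coincidence rate lies within the benchmark spread.

    The spread is taken (as an assumption) to be the min-max range of the
    rates obtained from the observations-based T-ET dataset combinations.
    """
    lower, upper = min(benchmark_rates), max(benchmark_rates)
    return [name for name, rate in model_rates.items() if lower <= rate <= upper]


if __name__ == "__main__":
    # Hypothetical anomaly series for a single grid cell.
    rng = np.random.default_rng(0)
    t = rng.normal(size=500)
    et = -0.5 * t + rng.normal(size=500)  # ET tends to drop when T is high
    print(coincidence_rate(t, et))
```

In practice the diagnostic would be evaluated per grid cell and season for each of the benchmark combinations and each model, and the retained subset would then be used to recompute the ensemble statistics of temperature extremes.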