To assess performance and improve predictions, land surface models are routinely calibrated against measurements of either latent heat or sensible heat fluxes. Generally, little regard is given to the multi-output nature of these models, resulting in a model evaluation that is inherently biased towards the calibration variable. In this paper, an assessment strategy that accounts for multiple outputs is explored, and the use of alternative sources of information to assess performance is examined. The benefits of such a multi-objective calibration framework are illustrated through comparison with traditional single-objective calibration. Results indicate that combining different observation data streams for calibration helps produce a more robust process model and provides improved surface flux predictions. Further, the utility of using correlated, if not commensurate, sources of data is demonstrated through analysis of a time series of surface temperature measurements.