In this research, we propose a facial expression recognition system with a layered encoding cascade optimization model. Since generating an effective facial representation is a vital step to the success of facial emotion recognition, a modified Local Gabor Binary Pattern operator is first employed to derive a refined initial face representation and we then propose two evolutionary algorithms for feature optimization including (i) direct similarity and (ii) Pareto-based feature selection, under the layered cascade model. The direct similarity feature selection considers characteristics within the same emotion category that give the minimum within-class variation while the Pareto-based feature optimization focuses on features that best represent each expression category and at the same time provide the most distinctions to other expressions. Both a neural network and an ensemble classifier with weighted majority vote are implemented for the recognition of seven expressions based on the selected optimized features. The ensemble model also automatically updates itself with the most recent concepts in the data. Evaluated with the Cohn-Kanade database, our system achieves the best accuracies when the ensemble classifier is applied, and outperforms other research reported in the literature with 96.8% for direct similarity based optimization and 97.4% for the Pareto-based feature selection. Cross-database evaluation with frontal images from the MMI database has also been conducted to further prove system efficiency where it achieves 97.5% for Pareto-based approach and 90.7% for direct similarity-based feature selection and outperforms related research for MMI. When evaluated with 90 degrees side-view images extracted from the videos of the MMI database, the system achieves superior performances with >80% accuracies for both optimization algorithms. Experiments with other weighting and meta-learning combination methods for the construction of ensembles are also explored with our proposed ensemble showing great adpativity to new test data stream for cross-database evaluation. In future work, we aim to incorporate other filtering techniques and evolutionary algorithms into the optimization models to further enhance the recognition performance.