Spin torque oscillators (STOs) often exhibit multiple modes, leading to complex behavior. One example is mode hopping between different eigenmodes of a magnetic tunnel junction (MTJ) STO. This mode hopping depends strongly on the current and on the angle between the magnetizations of the free and fixed layers; away from the anti-parallel configuration, it can be the dominant decoherence process. Another example is the linewidth of a nanocontact STO, which can be a complex, non-monotonic function of temperature in regions where two or more modes are excited. These phenomena require a generalization of the single-mode nonlinear STO theory to include mode coupling. We derive equations describing the slow time evolution of the coupled system and show that they describe a dynamically driven system, similar to other systems that exhibit mode hopping in the presence of thermal fluctuations. In our description, mode coupling also introduces additional coupling between power and phase fluctuations, which in certain limited cases can lead to longer relaxation times for power fluctuations and consequently to larger linewidths through the nonlinear frequency shift.
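
The link between a longer power relaxation time and a larger linewidth can be made explicit with the standard single-mode nonlinear auto-oscillator result, shown here only as a sketch. The symbols are assumptions not defined in the text above: $\Delta\omega_0$ is the linewidth of the equivalent linear oscillator, $N = d\omega/dp$ is the nonlinear frequency shift coefficient, $p_0$ is the stationary power, and $\Gamma_p$ is the relaxation rate of power fluctuations. The coupled-mode expressions derived in this work may differ in form.

```latex
% Single-mode nonlinear oscillator linewidth (assumed notation, low-noise limit).
% \nu is the dimensionless nonlinear amplification factor.
\begin{align}
  \Delta\omega &= \Delta\omega_0 \left( 1 + \nu^2 \right), \\
  \nu &= \frac{N p_0}{\Gamma_p}, \qquad N = \frac{d\omega}{dp}.
\end{align}
% A longer power relaxation time \tau_p = 1/\Gamma_p means a smaller \Gamma_p,
% hence a larger \nu and a larger linewidth \Delta\omega.
```

In this picture, any mechanism that reduces $\Gamma_p$, such as the mode coupling discussed above, increases $\nu$ and therefore broadens the line for fixed $N$ and $p_0$.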