We present a novel framework for measuring the body motion of multiple individuals in a group or crowd using a vision-based tracking algorithm, to enable studies of human-induced vibrations of civil engineering structures such as floors and grandstands. To overcome difficulties typical of this scenario, such as illumination changes and object deformation, we adopt an online ensemble learning algorithm that adapts to the non-stationary environment. Combined with portable, easily installed hardware, the system can capture the characteristics of the displacements or accelerations of multiple individuals in groups of various sizes in real-world settings. To demonstrate the efficacy of the proposed system, measured displacements and calculated accelerations are compared with simultaneous measurements obtained by two widely used motion tracking systems. Extensive experiments show that the proposed system achieves performance equivalent to that of popular wireless inertial sensors and a marker-based optical system, but without the limitations commonly associated with such traditional systems. The comparative experiments can also guide the application of the proposed system.