This paper studies heterogeneous networks (HetNets) in which multiple wireless users must each associate with a base station (BS) so as to maximize the network utility. To guarantee fairness among users and enhance network capacity, users should be actively associated with BS tiers that carry a lighter load, rather than simply with the BS offering the maximum signal-to-interference-plus-noise ratio (SINR). We therefore formulate a joint user association and bandwidth allocation problem, which is a mixed binary integer program and is NP-hard. Traditional methods struggle to solve this problem because of its high computational complexity and its sensitivity to changes in channel parameters. We propose an online deep reinforcement learning (DRL) based algorithm for HetNets, in which multiple parallel deep neural networks (DNNs) generate user association solutions. A shared memory structure stores the best association scheme, and the stored data are then used as the training data set for all the parallel DNNs. Numerical results show that the proposed algorithm achieves a significant gain in utility-function value over the max-SINR user association scheme. It also outperforms the greedy algorithm: with 5 DNNs, it achieves a performance gain of up to 5% over the greedy algorithm.
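As a rough illustration of the idea of generating several association schemes in parallel, keeping the best one, and storing it in a shared memory, the following Python sketch may help. It is not the paper's implementation and rests on several assumptions: each DNN is replaced by a random perturbation of the max-SINR association, the utility is assumed to be a sum of log-rates with each BS splitting its bandwidth equally among its users, the training step is omitted, and all function names and the rate matrix are hypothetical.

```python
import math
import random
from collections import deque


def utility(assoc, rate):
    """Assumed utility: sum of log-rates under equal bandwidth sharing.

    assoc[u] is the BS index chosen by user u; rate[u][b] is the
    (hypothetical) full-bandwidth rate of user u at BS b.
    """
    load = {}
    for b in assoc:
        load[b] = load.get(b, 0) + 1
    return sum(math.log(rate[u][b] / load[b]) for u, b in enumerate(assoc))


def max_sinr(rate):
    """Baseline: every user picks the BS with the highest rate."""
    return [max(range(len(r)), key=r.__getitem__) for r in rate]


def propose(rate, rng, flip_prob):
    """Stand-in for one DNN: randomly perturb the baseline association."""
    n_bs = len(rate[0])
    return [rng.randrange(n_bs) if rng.random() < flip_prob else b
            for b in max_sinr(rate)]


def best_of_k(rate, k, rng, memory):
    """Draw k candidate schemes, keep the best, and store it in memory.

    The baseline is included among the candidates (a choice made for this
    sketch only) so that the selected scheme is never worse than max-SINR.
    """
    candidates = [max_sinr(rate)] + [propose(rate, rng, 0.3) for _ in range(k)]
    best = max(candidates, key=lambda a: utility(a, rate))
    memory.append((rate, best))  # would later serve as a training sample
    return best


# Toy scenario: 4 users, 2 BSs; BS 0 offers the higher rate to everyone.
rate = [[10.0, 6.0] for _ in range(4)]
memory = deque(maxlen=128)  # shared memory of (channel, best scheme) pairs
best = best_of_k(rate, k=5, rng=random.Random(0), memory=memory)
print("max-SINR utility:", utility(max_sinr(rate), rate))
print("selected scheme:", best, "utility:", utility(best, rate))
```

In this toy scenario the max-SINR rule sends all four users to BS 0, whereas a scheme that moves some users to the lightly loaded BS 1 gives each user a larger bandwidth share and can therefore attain a higher log-utility, which is the load-balancing effect the abstract refers to.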