The whole Q-learning process
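As a rough illustration of that process, here is a minimal sketch of standard tabular Q-learning with epsilon-greedy exploration. The environment is an assumption made up for this example, not something taken from the text: a 1-D chain of five states where the agent moves left or right and receives a reward of 1 on reaching the last state. The function name `q_learning` and all hyperparameter values are likewise hypothetical.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1. Action 0 moves left (bounded at state 0),
    action 1 moves right. Reaching the last state yields reward 1 and
    ends the episode. All of this is an illustrative assumption.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Q-table: one row per state, one column per action, initialised to 0.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, with random tie-breaking.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Environment step (hypothetical chain dynamics).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update:
            #   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            # The goal row is never updated, so its value stays 0 (terminal).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning()
    for s, (left, right) in enumerate(table):
        print(f"state {s}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

In this toy setting, the learned table should prefer moving right in every non-terminal state, since that is the only way to reach the reward. Swapping in a real environment would mean replacing the two "environment step" lines with calls to that environment.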