Abstract: Trajectory deviation arises during the movement and obstacle avoidance of a manipulator and should be corrected by an appropriate control algorithm so that the actual trajectory stays close to the ideal trajectory. A path planning and obstacle avoidance scheme based on an improved Q-learning algorithm is proposed. The set of state vectors and the set of actions available in each state are constructed. A BP neural network is used to improve the continuous approximation ability of the model, and the Q-function value is updated at each iteration. During path planning, reasonable obstacle avoidance and minimum trajectory deviation are achieved by following the principle of minimizing the joint rotation angle and the spatial movement distance of the links. Simulation results show that the proposed control algorithm converges quickly, plans better paths than the traditional planning scheme, and incurs the lowest migration cost.
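
For orientation, the iterative update of the Q-function value mentioned above can be illustrated with the standard textbook Q-learning rule. This is a sketch of the generic update only, not necessarily the improved form used in this work. The symbols are assumed notation that does not appear in the abstract: \alpha is a learning rate, \gamma a discount factor, r_{t+1} the reward received after taking action a_t in state s_t, and s_{t+1} the resulting state.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

When a neural network stands in for the table, the bracketed temporal-difference term would typically serve as the training error for the network output Q(s_t, a_t); how the BP network is trained in the proposed scheme is not specified here.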