Abstract
Most current studies on reinforcement-learning-based decision-making and control for autonomous vehicles are conducted in simulated environments. Training and testing in these studies are carried out under rule-based microscopic traffic flow, with little consideration of transferring the trained policies to real or near-real environments to test their performance. This may lead to a degradation in performance when the trained model is tested in more realistic traffic scenes. In this study, we propose a method to randomize the driving style and behavior of surrounding vehicles by randomizing certain parameters of the car-following model and the lane-changing model of rule-based microscopic traffic flow in SUMO. We trained policies with deep reinforcement learning algorithms under the domain-randomized rule-based microscopic traffic flow in freeway and merging scenes, and then tested them separately in rule-based microscopic traffic flow and high-fidelity microscopic traffic flow. Results indicate that the policy trained under domain-randomized traffic flow achieves a significantly better success rate and cumulative reward than the models trained under other microscopic traffic flows.
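The randomization idea can be illustrated with a minimal sketch that samples car-following and lane-changing attributes for SUMO vehicle types. The attribute names (`accel`, `decel`, `tau`, `minGap`, `sigma`, `lcAssertive`, `lcCooperative`, `lcSpeedGain`) are standard SUMO `vType` attributes, but the choice of parameters, the sampling ranges, the uniform distribution, and the helper names `sample_vtype` and `build_vtypes` are assumptions made for illustration and are not taken from the study.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical sampling ranges (assumptions, not values from the study).
PARAM_RANGES = {
    "accel": (1.5, 3.5),          # car-following: max acceleration [m/s^2]
    "decel": (3.0, 6.0),          # car-following: comfortable deceleration [m/s^2]
    "tau": (0.8, 2.0),            # car-following: desired time headway [s]
    "minGap": (1.5, 3.5),         # car-following: standstill gap [m]
    "sigma": (0.0, 0.8),          # car-following: driver imperfection
    "lcAssertive": (0.5, 2.0),    # lane-changing: willingness to accept small gaps
    "lcCooperative": (0.0, 1.0),  # lane-changing: willingness to cooperate
    "lcSpeedGain": (0.5, 2.0),    # lane-changing: eagerness to change for speed
}


def sample_vtype(vtype_id, rng):
    """Return one <vType> element with uniformly sampled parameters."""
    attrs = {
        "id": vtype_id,
        "carFollowModel": "Krauss",
        "laneChangeModel": "LC2013",
    }
    for name, (low, high) in PARAM_RANGES.items():
        attrs[name] = f"{rng.uniform(low, high):.2f}"
    return ET.Element("vType", attrs)


def build_vtypes(n, seed=None):
    """Build an <additional> element holding n randomized vehicle types."""
    rng = random.Random(seed)
    root = ET.Element("additional")
    for i in range(n):
        root.append(sample_vtype(f"randomized_{i}", rng))
    return root


if __name__ == "__main__":
    # Print the XML; it could be saved and loaded by SUMO as an additional file.
    print(ET.tostring(build_vtypes(3, seed=0), encoding="unicode"))
```

Under this sketch, drawing a fresh set of vehicle types for each training episode would expose the policy to a different mix of driving styles every time, which is the intent of domain randomization.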