In June 2017, Xu Yadong obtained his Ph.D in the School of Computer Science and Engineering at Nanyang Technological University. His PhD topic focuses on novel algorithms to accelerate the execution of microscopic traffic simulation using high performance computers. This is particularly useful in the studies where real-time decision making is critical, or a large number of simulation runs are required.
The original contributions of the thesis are the new methods that make use of the characteristics of agent models to efficiently conduct load-balancing and synchronisation in parallel agent-based road traffic simulations. Workload balance and synchronisation are two critical factors that influence the execution time and the maximum speed-up of a parallel simulation. These topics have appeared in our previous blog and news entries. Here, we give a summary of how those topics are related and function together.
In his work, it is identified that the existing partitioning algorithms do not fully minimise the communication overhead between LPs. As for synchronisation protocols, most of the existing traffic simulators apply global barriers to synchronise LPs. Some existing works use conservative protocols to allow LPs to progress asynchronously. However, the performance may not be better than that of using global barrier, since in general agent models in traffic simulation only allow small lookahead.
Based on the analysis, a graph partitioning algorithm was developed, aiming to reduce the synchronisation overhead of the simulation while maintaining workload balance. In addition to balancing workload and minimizing communication volume, it also aims to minimise the number of neighbouring partitions of each partition. Minimising the number of neighbouring partitions can reduce synchronisation messages among LPs during the simulation. Less synchronisation messages means less communication overhead to the simulation.
To further reduce the synchronisation overhead, two methods of improving the lookahead were developed. The first method takes advantage of the intrinsic uncertainties in traffic simulation to relax synchronisation. Relaxation of synchronisation is accomplished either by simply skipping synchronisation operations, or by reducing the resolution of agent models at the boundaries of partitions. However, it is important to ensure that simulation results are statistically unaltered by the relaxation. To maximise relaxation, the behaviour of agents needs to be analysed. The second method reduces synchronisation operations by conducting computation replication to reduce data dependencies between LPs. Some data previously sent via synchronisation messages are replaced by redundant computation. Extended layers of partitions are defined according to the behaviour of agents in order to determine the amount of redundant computation required for a certain lookahead. However, there is a trade-off between the benefit of reduced synchronisation operations and the overhead of computation replication. Depending on the distribution of agents on the road network, the optimal lookahead for achieving the best speed-up of the simulation may change during the simulation. Hence, the suitable lookahead is dynamically adjusted during the simulation to ensure that the benefit of reduced synchronisation operations is greater than the overhead of computation replication.
All the proposed methods were evaluated in SEMSim traffic, an agent-based discrete-event traffic simulator (currently CityMoS) in a distributed-memory environment using real-world road network and trip data.
Results have shown that firstly the proposed partitioning algorithm produces less synchronisation overhead than stripe spatial partitioning and the popular METIS partitioning algorithm. Secondly, the two methods of improving lookahead of LPs can effectively reduce synchronisation overhead and improve the overall speed-up of the parallel simulation. For relaxing synchronisation, applying a suitable amount of relaxation does not alter simulation results statistically.
The thesis can be found here.