MATLAB Simulation

This section demonstrates and validates the proposed distributed multi-agent area coverage control reinforcement learning (MAACC-RL) algorithm via numerical simulations. The approximate/adaptive dynamic programming (ADP) scheme underlying MAACC-RL is implemented with a recursive least-squares (RLS) update.
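The RLS update used by the ADP critic can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature vector `phi`, the regression target, and the forgetting factor `lam` are assumptions.

```python
import numpy as np

def rls_update(W, P, phi, target, lam=1.0):
    """One recursive least-squares update of the critic weights W.

    phi    -- feature (regressor) vector for the current state
    target -- scalar regression target (e.g., a TD target)
    lam    -- forgetting factor in (0, 1]
    """
    Pphi = P @ phi
    k = Pphi / (lam + phi @ Pphi)        # RLS gain vector
    e = target - phi @ W                 # a priori prediction error
    W = W + k * e                        # weight correction
    P = (P - np.outer(k, Pphi)) / lam    # covariance update
    return W, P
```

With noiseless targets and `lam = 1`, the weights converge to the least-squares solution as samples accumulate.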
To demonstrate the effectiveness of the proposed MAACC algorithm, a MATLAB simulation is conducted on a group of five agents placed in a 2D convex workspace Ω ⊂ R², with boundary vertices at (1.0, 0.05), (2.2, 0.05), (3.0, 0.5), (3.0, 2.4), (2.5, 3.0), (1.2, 3.0), (0.05, 2.40), and (0.05, 0.4) m. The agents' initial positions are (0.20, 2.20), (0.80, 1.78), (0.70, 1.35), (0.50, 0.93), and (0.30, 0.50) m. The sampling time in all simulations is 1 s, and each simulation runs for 180 s. A moving target inside the workspace induces a time-varying risk density.
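The workspace geometry and initial placement above can be checked with a short script. This Python sketch is for illustration only; the point-in-polygon test assumes the vertices are listed counter-clockwise, as they are here.

```python
import numpy as np

# Boundary vertices of the convex workspace (m), counter-clockwise
VERTS = np.array([(1.0, 0.05), (2.2, 0.05), (3.0, 0.5), (3.0, 2.4),
                  (2.5, 3.0), (1.2, 3.0), (0.05, 2.40), (0.05, 0.4)])

# Initial agent positions (m)
AGENTS = np.array([(0.20, 2.20), (0.80, 1.78), (0.70, 1.35),
                   (0.50, 0.93), (0.30, 0.50)])

def in_convex(p, verts):
    """True iff point p lies inside the CCW convex polygon verts."""
    n = len(verts)
    for i in range(n):
        ax, ay = verts[i]
        bx, by = verts[(i + 1) % n]
        # z-component of (b - a) x (p - a); negative means p is
        # to the right of the edge, i.e., outside a CCW polygon
        if (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax) < 0:
            return False
    return True
```

All five initial positions lie inside the workspace under this test.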
The RL agents' initial estimated states (2D positions in this case) are distributed randomly, with mean positions e[0] = (e_1[0], …, e_5[0])^T ∈ R^10 at (0.20, 2.20), (0.80, 1.78), (0.70, 1.35), (0.50, 0.93), and (0.30, 0.50) m. Three targets exist in the environment.
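The paper's expression for the time-varying risk density is not reproduced above. Purely as an assumption, a common modeling choice is a Gaussian bump whose center tracks the moving target; a sketch (with a hypothetical circular target path and width `sigma`) is:

```python
import numpy as np

def risk_density(q, t, sigma=0.4):
    """Hypothetical time-varying risk density at point q and time t.

    A Gaussian bump centered on a target that traces a circle inside
    the workspace; the paper's actual expression is not shown here.
    """
    c = np.array([1.5 + 0.8 * np.cos(0.05 * t),
                  1.5 + 0.8 * np.sin(0.05 * t)])   # assumed target path
    d2 = np.sum((np.asarray(q, dtype=float) - c) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

The density peaks at the target's current position and decays smoothly with distance, which is what drives the agents to re-configure as the target moves.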
After 30 s, the search mission commences. The host computer transmits the updated probability map to the service vehicles every 5 s. The final planar configuration and the trajectories of all service vehicles at time steps 40 s, 90 s, and 180 s are shown in Figure 6-3.1-1-c. The distribution density function based on the most recent probability maps is also shown in the figures; the color intensity at each point is proportional to the value of the distribution density function there. The figure shows that the configuration of the service vehicles in the environment is optimal with respect to the time-varying density.
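An "optimal configuration with respect to the density" corresponds to minimizing a density-weighted coverage cost. A discrete Lloyd-style iteration is a standard coverage-control baseline for this objective; the sketch below is an illustration of that baseline, not the paper's learned RL controller, and the grid discretization and gain are assumptions.

```python
import numpy as np

def lloyd_step(agents, grid, density, gain=0.5):
    """One discrete Lloyd iteration for density-weighted coverage.

    Assign each grid point to its nearest agent (a discrete Voronoi
    partition), then move every agent toward the density-weighted
    centroid of its cell.
    """
    d = np.linalg.norm(grid[:, None, :] - agents[None, :, :], axis=2)
    owner = d.argmin(axis=1)             # nearest agent per grid point
    new = agents.copy()
    for i in range(len(agents)):
        m = owner == i
        w = density[m]
        if w.sum() > 0:
            centroid = (grid[m] * w[:, None]).sum(axis=0) / w.sum()
            new[i] = agents[i] + gain * (centroid - agents[i])
    return new
```

Each step is guaranteed not to increase the coverage cost Σ_q ρ(q) · min_i ‖q − p_i‖², which is why the final configuration settles at a density-dependent optimum.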
