Improving coordination of unmanned vehicles

Game theory that incorporates physical constraints, time-delay feedback, and asymmetric information structures guides autonomous vehicles.
01 August 2014
Dan Shen, Genshe Chen, Haibin Ling, Khanh Pham and Erik Blasch

Unmanned ground vehicles (UGVs) have a variety of uses, including clearing land mines. However, their effectiveness may be limited by the restricted field of view obtained by on-board cameras and sensors. In adversarial or other challenging environments, it can be difficult for autonomous controllers to have sufficient knowledge to decide the best navigation strategy. One solution is to use an unmanned aerial vehicle (UAV) flying above the UGV to oversee a wider area. However, this requires the UAV and UGV to make coordinated autonomous decisions.

A mathematical tool for analyzing such situations, particularly adversarial ones, is the pursuit-evasion (PE) game.1–3 Such games are applied in areas as varied as geometry and graphs,4,5 sensor management,6,7 collision avoidance,8,9 and high-level information fusion.10 However, PE games are mostly implemented and tested by numerical simulations, where real-life physical constraints, time-delay feedback, and computational feasibility are not fully considered. Most multiplayer PE games also assume that all players are equal, unlike the case of a UAV coordinating UGVs. To tackle these real-world limitations of PE game solutions, we designed a framework for this setup, as indicated in Figure 1. We built a PE game with the physical operating boundaries of the UAV/UGVs, considered command delays, and tested whether the optimal solutions were feasible.
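As a rough illustration of what these constraints mean in practice, the following sketch (ours, not the testbed implementation) simulates a discrete-time pursuit step with a rectangular operating boundary and a fixed command delay; the arena size, speed limit, delay, and class names are all illustrative assumptions.

```python
from collections import deque

import numpy as np

ARENA_MIN, ARENA_MAX = 0.0, 10.0   # assumed rectangular operating area (m)
MAX_SPEED = 0.5                    # assumed per-step speed limit (m/step)
COMMAND_DELAY = 3                  # assumed command delay, in control steps

class DelayedRobot:
    """A UGV whose commands take effect COMMAND_DELAY steps after being issued."""
    def __init__(self, position):
        self.position = np.asarray(position, dtype=float)
        # Pre-fill the pipeline with 'hold' commands so early steps are defined.
        self.pending = deque([np.zeros(2)] * COMMAND_DELAY)

    def command(self, velocity):
        # Enforce the speed limit before the command enters the delay pipeline.
        velocity = np.asarray(velocity, dtype=float)
        speed = np.linalg.norm(velocity)
        if speed > MAX_SPEED:
            velocity = velocity * (MAX_SPEED / speed)
        self.pending.append(velocity)

    def step(self):
        # Apply the oldest pending command and clip to the operating boundary.
        velocity = self.pending.popleft()
        self.position = np.clip(self.position + velocity, ARENA_MIN, ARENA_MAX)
        return self.position

# Usage: a pursuer chases the evader's currently observed position; because of
# the command delay, its motion lags the evader's maneuvers.
pursuer, evader = DelayedRobot([1.0, 1.0]), DelayedRobot([8.0, 5.0])
for _ in range(50):
    pursuer.command(evader.position - pursuer.position)  # naive pursuit guidance
    evader.command([0.3, 0.1])                           # scripted evasion
    pursuer.step()
    evader.step()
```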


Figure 1. A hardware-in-loop pursuit-evasion (PE) game framework. AR: Augmented reality.

In our framework, robots (used as UGVs) and a drone (used as a UAV) are connected to a computer via a wireless local area network (WLAN). Figure 2 shows the hardware connection diagram. The pursuers are represented by a computer (the pursuer agent), which also hosts the three-player PE game. The pursuer version of the game takes as inputs the states of the pursuers (measured by the drone agent), the tracked evader states (from the pursuers' local cameras), and the learned evader cost function (that is, an evaluation of how best to move to catch the evader). Similarly, an evader agent (the computational solver on a computer that sends commands to the evader robot) uses the states assessed by the drone agent, the pursuer states generated by a tracking module from the local camera, and the pursuers' intents (modeled by the pursuers' cost functions) obtained from online learning schemes. A UAV agent (depicted as ‘controller' in Figure 1) coordinates the drone movements and the visual entity tracking algorithms. In the drone controller, measurement and communication delays are taken into account based on the drone dynamics model and the supported drone commands.
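To make this information structure concrete, the sketch below spells out the inputs each side's game solver consumes. The type and field names are illustrative assumptions on our part, not the actual interfaces of the framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, float, float]              # (x, y, heading), for illustration
CostFunction = Callable[[State, State], float]  # cost of an (own, opponent) state pair

@dataclass
class PursuerAgentInputs:
    pursuer_states: List[State]        # measured by the drone agent (global view)
    tracked_evader_state: State        # from the pursuers' local cameras
    learned_evader_cost: CostFunction  # evader intent, learned online

@dataclass
class EvaderAgentInputs:
    evader_state: State                        # assessed via the drone agent
    tracked_pursuer_states: List[State]        # from the tracking module (local camera)
    learned_pursuer_costs: List[CostFunction]  # pursuer intents, learned online
```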


Figure 2. Hardware connections.

To obtain the UGV robot states (location, movement, and intent), we designed and implemented visual tracking algorithms over image sequences from the drone camera (global view) and the robot cameras (local views). We designed several markers for the robots and selected the one that achieved the best balance of position accuracy and tracking robustness for use in the PE game-theoretic robot control schemes. The target robots are detected after background modeling, and the robot orientation is estimated from local gradient patterns. Since the UGVs move to perform pursuit-evasion missions, we designed a proportional-integral-derivative (PID) controller to guide the UAV to follow the evader UGV. Because of delays in the measurement channels (camera and communication delays), the controller receives out-of-date information. We therefore implemented measurement-delay compensation based on the history of robot movements and drone commands. Figure 3 shows the improved performance that results from the compensation. In parts (a) and (c) we have plotted the true, estimated, and measured positions along the x- and y-axes, respectively. Measured positions are the (x, y) coordinates obtained from the images. Delays cause these measured values to be out of date, so we adjust them for the known delay and the previous movements to create the estimated positions. In parts (b) and (d) we have plotted the difference between the estimated position and the true position, with and without delay compensation, for the x- and y-axes, respectively. Results with compensation (blue line) show smaller position errors.
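The sketch below illustrates these two ingredients under simplifying assumptions of our own: a generic PID loop acting on the position error, and a compensation step that rolls the stale camera fix forward using the recent motion history (a constant-velocity assumption with invented gains and delay). It is not the testbed controller itself.

```python
from collections import deque

import numpy as np

class PID:
    """Plain PID loop acting on a 2D position error (illustrative gains)."""
    def __init__(self, kp=0.8, ki=0.05, kd=0.2, dt=0.1):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = np.zeros(2)
        self.prev_error = np.zeros(2)

    def update(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def compensate_delay(stale_position, motion_history, delay_steps, dt=0.1):
    """Roll a measurement that is delay_steps old forward in time, assuming the
    recent per-step velocities in motion_history were (approximately) executed."""
    recent = list(motion_history)[-delay_steps:]
    return stale_position + sum((np.asarray(v) * dt for v in recent), np.zeros(2))

# Usage: the drone's velocity command tracks the delay-compensated evader fix.
pid = PID()
drone_position = np.array([0.0, 0.0])
stale_evader_fix = np.array([2.0, 1.0])                # (x, y) extracted from the image
evader_motion_history = deque([np.array([0.3, 0.0])] * 10, maxlen=50)
current_estimate = compensate_delay(stale_evader_fix, evader_motion_history,
                                    delay_steps=5)
velocity_command = pid.update(current_estimate - drone_position)
```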


Figure 3. Plots against time of the true (truth), measured (meas.) and estimated (est.) (a) x-coordinate and (c) y-coordinate. Plots comparing position results for the (b) x-coordinate and (d) y-coordinate before and after delay compensation (comp.) show that the compensation improves the position (x,y) accuracies of robots (evaders) from drone images with delays. dx, dy: The difference (est. – truth) between the estimated x- or y-coordinate and the true value, respectively.

Our testbed includes, as a core component, a derived three-player PE game tested with real-world systems. In the PE game model, there are two UGV pursuers and one UGV evader. There are two versions of the game model: one is hosted by the pursuers, named the PPEG (pursuer PE game), and the other is maintained by the evader, named the EPEG (evader PE game). The PPEG is used to calculate the controls of the pursuer robots based on the pursuers' states, the tracked evader states, and the learned evader intent (the evader's cost function). The EPEG helps the evader obtain its control from the evader's states, the tracked pursuers' states, and the adaptively learned pursuers' intents (the pursuers' cost functions). Action-curve-based solutions have been developed to compute the mixed Nash equilibria, which assign probabilities to each action under the assumption that each player knows the equilibrium strategies of the other players. Each game model includes the states of the corresponding player, the resolved possible strategies, and goals.
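The action-curve solver itself is not reproduced here. As a hedged illustration of what a mixed Nash equilibrium means, the sketch below runs fictitious play on an invented two-player zero-sum abstraction of the pursuit-evasion decision; at equilibrium both players randomize over their actions.

```python
import numpy as np

# Invented payoff matrix: rows are pursuer actions, columns are evader actions,
# entries are the pursuer's payoff ("guess the corridor" the evader will take).
payoff = np.array([[ 1.0, -1.0],
                   [-1.0,  1.0]])

def fictitious_play(payoff, iters=20000):
    """Approximate a mixed Nash equilibrium of a zero-sum matrix game by letting
    both players repeatedly best-respond to the opponent's empirical mix."""
    rows, cols = payoff.shape
    row_counts, col_counts = np.ones(rows), np.ones(cols)
    for _ in range(iters):
        row_br = np.argmax(payoff @ (col_counts / col_counts.sum()))  # pursuer maximizes
        col_br = np.argmin((row_counts / row_counts.sum()) @ payoff)  # evader minimizes
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

pursuer_mix, evader_mix = fictitious_play(payoff)
print(pursuer_mix, evader_mix)   # both converge toward [0.5, 0.5] in this game
```

In zero-sum games the empirical action frequencies produced by fictitious play converge to a mixed Nash equilibrium, which is why both mixes approach (0.5, 0.5) for this symmetric payoff matrix.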

To perform system integration and visualization, we designed a graphical user interface (GUI)-based scenario manager, as shown in Figure 4. The hardware demonstrator combined video-based entity tracking algorithms; three-player game modeling with different information structures; learning algorithms for robot dynamics; action-curve-based (mixed) Nash solutions of nonlinear PE games; sensor and communication delay modeling and compensation; the PID controller for the UAV drone; and the GUI-based scenario manager visualizer. Using the demonstrator, we tested various scenarios, different cooperation strategies, and diverse boundary conditions.


Figure 4. Our hardware demonstrator with graphical user interface and scenario. It monitors tracking algorithms, game models, learning algorithms, physical constraint compensation, and the unmanned aerial vehicle. IFT: Intelligent Fusion Technology Inc.

In summary, we have developed a hardware testbed for autonomous networked robots (UGVs) with the help of a flying drone (UAV) to validate PE game-theoretical solutions. This has applications for clearing mines or searching for a crashed plane, where a UAV providing aerial coverage could coordinate multiple UGVs looking for mines faster than sending them out with distributed (that is, noncoordinated) coverage plans. For a plane crash, a patrol plane monitoring search ships would validate their positions, rather than just having the ships use global positioning system coordinates.

Our testbed integrated robot dynamic models, entity-tracking algorithms, sensor fusion methods, and a PE game demonstration for three robots (two slower pursuers and one faster evader). Based on the robot dynamics model and the measured UGV states, we designed a three-player discrete-time game model with a limited action space and limited look-ahead time horizons, with robot controls based on Nash solutions from game theory. We obtained promising results from the hardware-in-loop simulations of real-time robot PE game-theoretic methods. In the next step, we will expand our testbed to include imperfect wireless communication due to adversarial jamming and interference.
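As a simplified illustration of a limited action space combined with a limited look-ahead horizon, the sketch below enumerates short move sequences over a small discrete action set and picks a worst-case (minimax) first move for the pursuer. This is a security strategy rather than the mixed Nash solution used in the game models above, and the action set, horizon, and speed ratio are illustrative assumptions.

```python
import itertools

import numpy as np

ACTIONS = [np.array(a, dtype=float)
           for a in [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]]  # discrete moves
HORIZON = 2          # look-ahead steps
EVADER_SPEED = 1.5   # the evader is assumed faster, as in the testbed scenario

def distance_after(p, e, p_moves, e_moves):
    # Propagate both players through a short sequence of moves.
    for pm, em in zip(p_moves, e_moves):
        p = p + pm
        e = e + EVADER_SPEED * em
    return np.linalg.norm(p - e)

def pursuer_move(p, e):
    best_first, best_value = None, np.inf
    for p_seq in itertools.product(ACTIONS, repeat=HORIZON):
        # Worst case over the evader's move sequences (it maximizes the distance).
        worst = max(distance_after(p, e, p_seq, e_seq)
                    for e_seq in itertools.product(ACTIONS, repeat=HORIZON))
        if worst < best_value:
            best_first, best_value = p_seq[0], worst
    return best_first

print(pursuer_move(np.array([0.0, 0.0]), np.array([3.0, 2.0])))
```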

This material is based on research sponsored by the Air Force Research Laboratory under agreement number FA9453-12-C-0228. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory or the US Government.


Dan Shen, Genshe Chen
Intelligent Fusion Technology Inc. (IFT)
Germantown, MD

Dan Shen received his MS and PhD in electrical and computer engineering from Ohio State University in 2003 and 2006, respectively. He then worked as a research scientist at Intelligent Automation Inc. (MD) and as a project manager at DCM Research Resources LLC (MD). He is currently a principal scientist at IFT, where his interests include game theory and its applications, optimal control, and adaptive control.

Genshe Chen received BS and MS degrees in electrical engineering and a PhD in aerospace engineering, all from Northwestern Polytechnical University, Xian, China. He has undertaken postdoctoral research at Beihang University (China), Wright State University, the Technical University of Braunschweig (Germany), the Flight Division of the National Aerospace Laboratory of Japan, and Ohio State University. In addition, he is CTO of IFT and has worked for over 20 years on electronic warfare, secure communication, target tracking, guidance, navigation and control of aerospace vehicles, decision making under uncertainty, space communication, and situation awareness.

Haibin Ling
Department of Computer and Information Sciences
Temple University
Philadelphia, PA

Haibin Ling is an associate professor at Temple University. He received BS and MS degrees from Peking University, China, and a PhD in computer science from the University of Maryland. He has worked at Microsoft Research Asia, the University of California, Los Angeles, and Siemens Corporate Research. His research interests include computer vision, medical image analysis, human-computer interaction, and machine learning.

Khanh Pham
Space Vehicles Directorate
Air Force Research Laboratory (AFRL)
Albuquerque, NM

Khanh Pham is a senior member of SPIE as well as of IEEE. He is an associate fellow of the American Institute of Aeronautics and Astronautics (AIAA) and has been nominated for many AFRL Achievement awards.

Erik Blasch
United States Air Force
Rome, NY

Erik Blasch received his BS from the Massachusetts Institute of Technology, and master's degrees in mechanical engineering, industrial engineering, and health science from Georgia Tech. From Wright State University he has obtained an MBA, MSEE, MS in economics, and a PhD. Currently he is a principal scientist at the AFRL Information Directorate, leading programs in information fusion. He is an SPIE Fellow, associate fellow of the AIAA, and a senior member of IEEE.


References:
1. R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control, and Optimization, Wiley, New York, 1965.
2. Y. C. Ho, A. E. Bryson Jr., S. Baron, Differential games and optimal pursuit-evasion strategies, IEEE Trans. Auto. Cont. AC-10(4), 1965.
3. T. Basar, G. J. Olsder, Dynamic Noncooperative Game Theory, Society for Industrial and Applied Mathematics, 1998.
4. L. Guibas, J. C. Latombe, S. LaValle, D. Lin, R. Motwani, A visibility-based pursuit-evasion problem, Int'l J. Comput. Geom. Appl. 4(2), p. 74-123, 1985.
5. F. R. K. Chung, J. E. Cohen, R. L. Graham, Pursuit-evasion games on graphs, J. Graph Theory 12(2), p. 159-167, 1988.
6. M. Wei, G. Chen, E. Blasch, H. Chen, J. B. Cruz Jr., Game theoretic multiple mobile sensor management under adversarial environments, Int'l Conf. Info. Fusion, 2008.
7. D. Shen, G. Chen, E. Blasch, K. Pham, C. Yang, I. Kadar, Game theoretic sensor management for target tracking, Proc. SPIE 7697, p. 76970C, 2010. doi:10.1117/12.850870
8. V. Isler, D. Sun, S. Sastry, Roadmap based pursuit-evasion and collision avoidance, Proc. Robot. Sci. Syst., p. 257-264, 2005.
9. D. Shen, K. Pham, E. Blasch, H. Chen, G. Chen, Pursuit-evasion orbital game for satellite interception and collision avoidance, Proc. SPIE 8044, p. 80440B, 2011. doi:10.1117/12.882903
10. E. Blasch, E. Bosse, D. A. Lambert, High-Level Information Fusion Management and Systems Design, Artech House, Norwood, MA, 2012.