Role-based Bayesian decision framework for autonomous unmanned systems

2024-01-17 09:33

PANG Weijian, MA Xinyi, LIANG Xueming, LIU Xiaogang, and DONG Erwa

1. Beijing Aeronautical Engineering Research Center, Beijing 100076, China; 2. Academy of Military Sciences, Beijing 100091, China; 3. Air Forces Command College of PLA, Beijing 100097, China

Abstract: While performing a task, autonomous unmanned systems face changing scenes, which requires real-time decision-making under dynamically changing conditions. Therefore, taking the coordinated region control operation of unmanned systems as an example, this paper combines knowledge representation with probabilistic decision-making and proposes a role-based Bayesian decision model for autonomous unmanned systems that integrates scene cognition and individual preferences. Firstly, according to utility value decision theory, a role-based utility value decision model is proposed to realize task coordination according to the preference of the role assigned to each individual. Then, a multi-entity Bayesian network is introduced for situation assessment, by which the scenes related to the operation and their uncertainty are semantically described, so that the unmanned systems can conduct situation awareness across a set of uncertain scenes. Finally, the effectiveness of the proposed method is verified in a virtual task scenario. This research has important reference value for realizing scene cognition, improving cooperative decision-making ability under dynamic scenes, and achieving swarm level autonomy of unmanned systems.

Keywords: autonomous unmanned systems, multi-entity Bayesian network (MEBN), situation awareness, decision modeling.

1.Introduction

While executing a mission, a swarm of unmanned systems may face changes in the environmental situation, which requires the ability to plan in dynamic scenarios. However, existing dynamic decision models are designed for specific tasks or scenarios [1-4]. When the scenes, mission goals, environment, or other task-related factors change, the original model is no longer applicable. Decision-making for swarm unmanned systems therefore faces the challenge of situation change and scene transformation.

Existing decision models can be categorized into data-based and knowledge-based models. Data-based models usually treat the task as a Markov process. Such a model is built on uncertainty theory and adapts well to dynamic scenarios. However, its construction requires a large amount of accumulated data, and its accuracy and computational efficiency are relatively low. Knowledge-based models, in contrast, make full use of existing experience, their decision-making process can be explained, and they are efficient provided that the scenarios are well described semantically. Their disadvantage is poor adaptability to scene changes.

As shown in [5], task ontologies are widely used to semantically describe domain knowledge related to tasks. In [5], the relationship between a given situation and a given action is deterministic. However, under uncertainty, several actions may be available in each situation, and both the estimate of the situation and the choice of action are probabilistic. Therefore, a probabilistic decision modeling method is needed that characterizes the uncertainty well and fits the semantic description of the scenario.

Furthermore, when the individuals in a swarm of unmanned systems execute a cooperative task, they tend to cooperate by assigning a role to each individual. Thus, the decision-making problem of swarm unmanned systems is characterized by role preference.

Therefore, this paper transforms the dynamic decision-making problem into a probabilistic inference problem and proposes a role-based Bayesian decision method based on the multi-entity Bayesian network (MEBN). The research focuses on: (i) establishing a knowledge base for situation awareness under multiple scenarios; (ii) proposing a probabilistic inference model for decision-making that considers individual roles in a cooperative swarm; (iii) integrating the above decision-making and situation awareness into a hybrid inference framework that supports probabilistic inference.

The rest of the paper is organized as follows. Section 2 introduces related works and the proposed framework. Section 3 introduces an operation concept scenario for heterogeneous collaborative swarm unmanned systems. Section 4 focuses on the proposed role-based Bayesian decision method. Section 5 details the multi-scene knowledge base and the construction of the whole framework. Section 6 presents an experiment that verifies the proposed method. Finally, Section 7 summarizes this paper.

2.Related works and proposed framework

2.1 Related works

Human beings have complex thinking processes in contingency planning. Rule-based first-order reasoning can therefore hardly meet the needs of multi-scenario decision-making. Signed directed graph (SDG) [6], dynamic Bayesian network (DBN), dynamic influence diagram (DID) [7], genetic fuzzy tree (GFT) [8], and MEBN [9] are traditional methods for decision-making. All of them describe the complex causality or correlation between decision factors in the form of graphs, which supports complex reasoning processes and gives them the potential to be applied to multi-scene decision-making. The characteristics of these models are compared in Table 1.

Table 1 Comparison of methods for decision-making

In this paper, MEBN is chosen as the basic method for modeling domain knowledge related to decision-making. Proposed by Laskey [9] in 2008, MEBN combines the expressiveness of first-order logic with the advantage of the Bayesian network in describing uncertainty. Compared with other methods, its main advantages are as follows.

(i) The MEBN network used for inference is generated in real time according to the current evidence, unlike the classical Bayesian network, which has a fixed structure. This makes MEBN more suitable for the decision-making of unmanned systems in a dynamic environment.

(ii) MEBN’s Bayesian networks are generated in real time by integrating the relevant MEBN fragments (MFrags) in a bottom-up way according to the actual situation, so that the knowledge modeling process can concentrate on local characteristics and avoid handling one particularly complex model.

(iii) It is applicable to the Web Ontology Language (OWL) [10] and can be integrated with an existing knowledge engine [11]. Ontology inference, rule-based inference, and probabilistic inference can be integrated into a comprehensive knowledge engine, which can effectively support the decision-making of collaborative swarm unmanned systems in a dynamic environment.

MEBN theory uses an extended probabilistic ontology to organize planning variables into loosely coupled MFrags, and then expresses the influence of different decision variables on the results through local probability parameters in an explicit and formal way [12]. MEBN has been widely used in the field of situation awareness since it was proposed. Probabilistic ontologies for net-centric operational systems (PROGON) [13] is a probabilistic ontology for distributed operational architecture. It aims to describe complex and uncertain environments and to provide a means of semantic interoperation for command and control. In addition, the MEBN method also provides a good foundation for model learning. Literature [14] constructed a hybrid (discrete and continuous) MEBN learning algorithm, which can learn an MEBN model in the presence of both discrete and continuous variables. Not only the local probability parameters but also the structure of the model can be learned [15-20].

A system of logical knowledge descriptions and probabilistic reasoning rules composed of multiple MFrags is called an MTheory. The fragmented MFrag structure provides great convenience for knowledge reuse. From an MTheory, a Bayesian network model can be generated in real time according to the currently acquired evidence and the situation requiring reasoning; this model is called a situation-specific Bayesian network (SSBN).
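
The idea of assembling an SSBN from fragments can be illustrated with a deliberately simplified Python sketch. It is not the MEBN construction algorithm: entity instantiation, context constraints, and evidence on descendant nodes are ignored, and the `MFrag` class, the `build_ssbn` function, and the dependency structure in `MTHEORY` are hypothetical (only the variable names are borrowed from Section 5.3). Starting from a query variable, the sketch follows parent links and stops at observed variables, so that fragments unrelated to the query never enter the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MFrag:
    """Simplified fragment: one resident variable and its parent variables."""
    resident: str
    parents: tuple = ()


# Hypothetical MTheory; the dependencies are assumptions for illustration.
MTHEORY = [
    MFrag("uavAttacked"),
    MFrag("ugvAttacked"),
    MFrag("Attacked", ("uavAttacked", "ugvAttacked")),
    MFrag("alert", ("Attacked",)),
    MFrag("hasRole"),
    MFrag("Decision", ("alert", "hasRole")),
    MFrag("visibility"),
    MFrag("environment", ("visibility",)),  # unrelated to the query below
]


def build_ssbn(mfrags, query, evidence):
    """Collect the nodes and edges needed to answer `query`.

    Parents are followed upwards from the query; expansion stops at
    variables that are observed (listed in `evidence`).
    """
    by_resident = {m.resident: m for m in mfrags}
    nodes, edges = set(), set()
    stack = [query]
    while stack:
        name = stack.pop()
        if name in nodes:
            continue
        nodes.add(name)
        if name in evidence or name not in by_resident:
            continue
        for parent in by_resident[name].parents:
            edges.add((parent, name))
            stack.append(parent)
    return nodes, edges


nodes, edges = build_ssbn(MTHEORY, "Decision", evidence={"hasRole"})
print(sorted(nodes))
```

With the role observed, the resulting network contains the attack and alert variables but not the environment fragment, which reflects the bottom-up, query-driven character of SSBN generation.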

By adding decision nodes and utility value nodes, MEBN uses the principle of maximum utility value to realize decision-making under uncertain conditions; the resulting model is called a multi-entity decision graph (MEDG) [21].

2.2 Proposed framework

The proposed framework for autonomous unmanned systems is shown in Fig.1. In the framework, the operational scene is described with a semantic-centered method, which lays the foundation for knowledge processing. The MFrags are also organized under the concepts predefined in the ontology, which brings expert knowledge into the construction of the knowledge fragments. Utility values are set in advance according to the role assigned to the vehicle. The following sections detail the realization of the framework.

Fig.1 Diagram of role-based Bayesian decision framework for autonomous unmanned system

3.Region control operation concept for heterogeneous collaborative swarm unmanned systems

The proposed framework is verified on a scenario in which a heterogeneous cooperative combat swarm composed of unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) executes a region control operation. Assume that there may be enemy fortifications and UGVs in the area, and that the red side is equipped with UAVs and UGVs. The scene is shown in Fig.2, and details of the scenario are described as follows.

Fig.2 Swarm unmanned system cooperative region control operation scenario

3.1 Mission

UAVs and UGVs work together to detect and strike enemy targets in the area.

3.2 Task assignment of UAV/UGV

All UAVs carry out area reconnaissance operations and are equipped with visible and infrared reconnaissance payloads.

UGVs routinely visit the potential enemy region and strike, or assign other UGVs to strike, the targets that are found. UGVs are armed with two types of ammunition. According to operational guidelines, piercing ammunition is usually used to attack enemy UGVs and blasting ammunition is used to attack enemy fortifications.

3.3 Enemy targets

Enemy UGVs have the same performance as the red side’s UGVs.

Fortifications are built by the enemy to hold territory.

3.4 Scenario

The UAVs are loaded with visible light, infrared, and other reconnaissance payloads. The UGVs are equipped with piercing ammunition and blasting ammunition. UAVs and UGVs conduct air and ground reconnaissance missions, respectively. When a UAV finds enemy fortifications or UGVs, it may assign the target to a suitable UGV. The UGV then decides its next action, such as attacking the target, hiding, or escaping, according to the situation.

4.Role-based utility value decision method

In this paper, the role that an individual performs in the swarm is described by utility values. By assigning different utility values, vehicles that play different roles show different behavioral preferences.

By reconfiguring the MEDG with the utility value decision model proposed below, preference decisions can be realized. Table 2 describes the symbols related to the model.

Table 2 Definition of related symbols

Moreover, the dynamics of the environment are described as “time slices” by defining a relationship concept called “pretimeOf(·)” that describes the chronological order of events. This enables temporal inference. For example, when the speed of the target at the current moment is higher than its speed at the previous moment, the target can be judged to be accelerating.
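
The acceleration example can be sketched in Python as follows. This is only an illustration: the dictionary-based time slices, the `pretime_of` and `speed_trend` functions, and the tolerance `tol` are hypothetical, while the three labels follow the speed categories used for UGV decision factors in Section 5.4.

```python
def pretime_of(earlier, later):
    """True when time slice `earlier` immediately precedes `later`."""
    return later["t"] - earlier["t"] == 1


def speed_trend(prev, curr, tol=0.1):
    """Classify the target's speed change between two consecutive slices."""
    if not pretime_of(prev, curr):
        raise ValueError("time slices are not consecutive")
    delta = curr["speed"] - prev["speed"]
    if delta > tol:
        return "acceleration"
    if delta < -tol:
        return "deceleration"
    return "smooth"


slice_t0 = {"t": 0, "speed": 5.0}
slice_t1 = {"t": 1, "speed": 6.5}
print(speed_trend(slice_t0, slice_t1))
```

The tolerance keeps small measurement fluctuations from being reported as a change of motion state; its value would have to be chosen for the sensor at hand.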

The utility value decision model consists of four parts: the rule model, the probabilistic model, the role model, and the utility model. They are detailed as follows:

Rule model: The rule model describes the mapping between situation elements and decision options as a set of rules $R=(r_1,r_2,r_3,\cdots,r_L)$. Each rule, which describes the mapping at time $t$ between the input of the decision node and its output, is defined as:

where $\omega_{Pa(D_t)}$ is the value range of the parent node of the decision node $D_t$, and $\omega_{D_t}$ is an output of the decision node.
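
As an illustration of such a rule set, the following Python sketch stores each rule as an entry that maps one combination of parent values to one decision output. The choice of parents (alert level and role) and the table contents are assumptions made for this example, loosely following the role tendencies described in Section 6.2.2; `PARENT_STATES`, `RULES`, `validate_rules`, and `apply_rule` are hypothetical names.

```python
from itertools import product

# Assumed parents of the decision node and their value ranges.
PARENT_STATES = {"alert": ("HIGH", "LOW"), "role": ("guarder", "follower")}
DECISION_OPTIONS = ("search", "march", "hide")

# One rule per parent combination: (alert, role) -> decision option.
RULES = {
    ("HIGH", "guarder"): "search",
    ("LOW", "guarder"): "march",
    ("HIGH", "follower"): "hide",
    ("LOW", "follower"): "march",
}


def validate_rules(rules):
    """Check that every parent combination maps to a known option."""
    for combo in product(*PARENT_STATES.values()):
        if rules.get(combo) not in DECISION_OPTIONS:
            raise ValueError(f"missing or invalid rule for {combo}")


def apply_rule(alert, role):
    """Return the decision output for the given parent values."""
    return RULES[(alert, role)]


validate_rules(RULES)
print(apply_rule("HIGH", "follower"))
```

A deterministic table like this corresponds to the rule model only; the probabilistic and utility models replace the single output with a distribution and an expected utility.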

Probabilistic model: The probabilistic model describes the local probability distribution of the output of a decision node under different conditions. The probabilistic model is represented as

where $p_j$ is the local probability distribution obtained from experience, and it represents the probability of choosing the $j$th option.

Role model: The role model describes how the role that an individual plays in the swarm affects its decision result when it performs a cooperative task. This mechanism is mainly achieved by giving different roles different utility values:

where $u_c$ is the utility value of role $c$.

Utility model: The utility model is used to calculate the expected utility value under a certain decision option. The expected utility value can be calculated by the following formula:

where $U_j$ represents the utility value of the $j$th decision option, and $P_j(X_t,O_t,D_t)$ is the probability of choosing the $j$th option under the current state $X_t$, observation value $O_t$, and decision action $D_t$.
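
A minimal Python sketch of a role-based expected-utility decision is given below. It assumes the standard form in which the expected utility of an option is the probability-weighted sum of its utilities over the uncertain threat state, which may differ in detail from the model above. The `UTILITY` table is hypothetical (its values merely lie in the range [-10, 10] used in Section 6), as are the function names.

```python
# Hypothetical utilities: role -> threat level -> decision option -> utility.
UTILITY = {
    "guarder": {
        "HIGH": {"search": 8, "march": 2, "hide": -4},
        "LOW": {"search": 1, "march": 7, "hide": -6},
    },
    "follower": {
        "HIGH": {"search": -5, "march": 1, "hide": 8},
        "LOW": {"search": -3, "march": 7, "hide": 2},
    },
}


def expected_utilities(role, p_threat):
    """Expected utility of each option for `role` given P(threat level)."""
    table = UTILITY[role]
    options = table["HIGH"].keys()
    return {
        option: sum(p * table[level][option] for level, p in p_threat.items())
        for option in options
    }


def decide(role, p_threat):
    """Choose the option with maximum expected utility."""
    eu = expected_utilities(role, p_threat)
    return max(eu, key=eu.get)


high_threat = {"HIGH": 0.8, "LOW": 0.2}
for role in ("guarder", "follower"):
    print(role, decide(role, high_threat))
```

Because the same situation estimate is combined with a different utility table for each role, two vehicles facing identical evidence can still reach different decisions, which is the intended effect of the role model.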

5.Multi-scene knowledge base for decision-making in a dynamic scenario

5.1 Problem analysis

The decision problem of swarm unmanned systems has an obvious hierarchy of three levels: swarm level decision, coordination level decision, and individual level decision. At the swarm level, it is necessary to evaluate the overall situation of the swarm and decide whether the current mission should continue. At the coordination level, the corresponding unmanned vehicles should be coordinated in time to deal with a certain situation. At the individual level, an action should be decided according to the situation the individual faces, under the constraints of swarm task goals or collaborative instructions.

In order to illustrate the principle of the model, the following decision scenes faced by the swarm are modeled and verified.

(i) Overall situation assessment and decision at swarm level. The comprehensive situation assessment is mainly carried out at the swarm level to evaluate the current mission capability of the swarm and generate the swarm behavior for the next stage.

(ii) Role-based decision at coordination level. Several unmanned vehicles work cooperatively with preassigned roles. The decision model should generate behaviors for these vehicles according to the situation they confront and their preassigned roles.

(iii) Individual situation assessment and decision-making at the individual level. In this scene, a single vehicle generates its behavior according to the situation it confronts.

5.2 Swarm level situation assessment and decision-making

Swarm level decision-making outputs the behaviors of the entire swarm, mainly considering the following aspects of the situation.

(i) Confrontation capability: this factor captures whether the overall balance between the two sides has changed significantly, for example when the swarm lacks the key combat forces needed to complete the mission. If enemy armored vehicles are found and the piercing ammunition of the swarm is exhausted, the “withdraw” decision option should be produced. Situation events that affect this situation component include the remaining amount of blasting ammunition and of piercing ammunition.

(ii) Swarm state: the swarm state is divided into two components: environment state and swarm serviceability. The environment state describes the influence of environmental factors such as electronic interference and visibility.

(iii) Sustainability: this factor measures the sustained mission ability of UAVs and UGVs.

(iv) Enemy threat: this factor evaluates the influence of enemy threats in the current situation, such as the number of enemy fortifications and the number of enemy UGVs.

Moreover, two options are considered at the swarm level: withdraw or continue executing the task. Fig.3 shows the influencing factors of the swarm situation assessment. Fig.4 shows the knowledge base established according to the above analysis.

Fig.3 Key factors of real time dynamic planning for swarm unmanned systems

Fig.4 MFrags for swarm level situation assessment and decision model
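
The confrontation capability example in item (i) can be written as a simple deterministic check, shown in the Python sketch below. The sketch is a simplification of the probabilistic MFrags: it covers only the ammunition-related events, the function names are hypothetical, and the fortification case is an assumption added by symmetry with the ammunition guidelines in Section 3.2.

```python
def confrontation_capability(enemy_ugvs, enemy_forts, piercing, blasting):
    """Return "LOW" when a detected target type can no longer be engaged."""
    # Piercing ammunition is the designated means against enemy UGVs.
    if enemy_ugvs > 0 and piercing == 0:
        return "LOW"
    # Assumption by symmetry: blasting ammunition against fortifications.
    if enemy_forts > 0 and blasting == 0:
        return "LOW"
    return "HIGH"


def swarm_decision(enemy_ugvs, enemy_forts, piercing, blasting):
    """Choose between the two swarm level options."""
    capability = confrontation_capability(
        enemy_ugvs, enemy_forts, piercing, blasting
    )
    return "withdraw" if capability == "LOW" else "continue"


print(swarm_decision(enemy_ugvs=2, enemy_forts=0, piercing=0, blasting=6))
```

In the full model this factor is one of several uncertain inputs, so it shifts the probability of withdrawing rather than forcing the decision.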

5.3 Cooperation level role-based decision-making

When the swarm is marching in formation and is suddenly attacked by the enemy or enters a dangerous area such as an urban area, the vehicles should be able to carry out cooperative tactical actions according to their assigned roles. Thus, the role-based decision model proposed in Section 4 is integrated into MEBN, as shown in Fig.5. Therein, uavAttacked(uav,t) means that the UAV is attacked at time t, ugvAttacked(ugv,t) means that the UGV is attacked at time t, Attacked(swarm,t) generates the probability of the whole swarm being attacked, alert(swarm,t) generates the alert level of the swarm, and hasRole(ugv,t) gives the UGV’s role at time t. Decision(ugv,t) defines the probability of each decision option, and finally Utility(ugv,t) calculates the utility value of each decision option. The decision factors considered in the model are: the UAV is attacked, the UGV is attacked, the swarm enters an urban area, and the swarm marches on the open road. The decision options are search, march, and hide.

Fig.5 Role-based cooperative task planning
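
How uncertain attack evidence propagates to a swarm level probability can be sketched by enumerating the states of the two parent nodes. In the Python sketch below, the entries 0.6 and 0.8 are the local probabilities reported in Section 6.2.2, read here as the probability of the higher threat state and assumed to parameterize Attacked(swarm,t); the remaining two entries and the function name are hypothetical.

```python
# P(Attacked = TRUE | uavAttacked, ugvAttacked)
P_ATTACKED = {
    (True, True): 0.95,    # hypothetical
    (True, False): 0.6,    # reported in Section 6.2.2
    (False, True): 0.8,    # reported in Section 6.2.2
    (False, False): 0.05,  # hypothetical
}


def p_swarm_attacked(p_uav, p_ugv):
    """Marginal P(Attacked = TRUE), assuming independent parent nodes."""
    total = 0.0
    for uav in (True, False):
        for ugv in (True, False):
            weight = (p_uav if uav else 1.0 - p_uav) * (
                p_ugv if ugv else 1.0 - p_ugv
            )
            total += weight * P_ATTACKED[(uav, ugv)]
    return total


print(p_swarm_attacked(p_uav=1.0, p_ugv=0.0))
print(p_swarm_attacked(p_uav=0.0, p_ugv=1.0))
```

Because an attack on a UAV alone contributes less than an attack on a UGV alone, the resulting threat estimate, and with it the margin between competing decision options, is smaller in the first case.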

5.4 Individual level situation awareness and decision-making

Individual level decision-making is a hybrid problem of knowledge reasoning and algebraic solving, which considers the strategy for engaging the target when the individual is fighting independently or participating in a cooperative operation. This strategy affects its motion planning results. For example, when two vehicles cooperate in the chase-intercept pattern, the chaser’s tracking point is behind the target, and the interceptor’s tracking point is in front of the target. This cooperative strategy can reduce the possibility that the target escapes.
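
The geometry of the chase-intercept pattern can be illustrated with a short Python sketch. It assumes a planar scene, a known target heading, and a fixed offset distance along that heading; the function name and all parameters are hypothetical and are not taken from the motion planner used in the paper.

```python
import math


def tracking_points(target_xy, heading_rad, offset):
    """Return (chaser_point, interceptor_point) for a target.

    The chaser aims at a point `offset` behind the target along its
    heading; the interceptor aims at a point `offset` in front of it.
    """
    x, y = target_xy
    ux, uy = math.cos(heading_rad), math.sin(heading_rad)
    chaser = (x - offset * ux, y - offset * uy)
    interceptor = (x + offset * ux, y + offset * uy)
    return chaser, interceptor


chaser, interceptor = tracking_points((100.0, 50.0), math.pi / 2, 20.0)
print(chaser, interceptor)
```

The knowledge reasoning part of the decision selects which pattern and which role apply, while a computation of this kind supplies the algebraic part.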

The UAVs are responsible for reconnaissance and target allocation. Their situation awareness mainly evaluates their own safety condition, environmental state, payload state, and so on. Decision factors and outputs are as follows.

(i) Decision factors: UAV state (endurance, health state), environmental condition (whether there are aerial obstacles, wind speed), mission condition (mission payload, communication condition).

(ii) Decision output: obstacle avoidance, return, reconnaissance, task assignment.

Similarly, the decision factors and outputs of situation assessment and decision for UGVs are as follows.

(i) Decision factors: motion parameters of the target, such as the speed (acceleration, deceleration, smooth), the direction of movement (approach, away), and the distance (far, medium, near); the category of the target (UGV or fortification); the state of the natural environment, such as the weather and the presence or absence of shelter.

(ii) Decision output: for collaborative interception situations, decision outputs are front-tracking, back-tracking, and line-of-sight tracking; for the decision of whether or how to attack the target, the decision outputs are attack (piercing ammunition, blasting ammunition) and hide.

The decision model of UAVs and UGVs is shown in Fig.6.

Fig.6 Decision model of UGV/UAV

6.Experimental verification and analysis

6.1 Experimental design

The experiment is based on the region control operation concept proposed in Section 3, and the following scenarios are designed to verify the effectiveness of the decision model:

(i) Swarm level situational awareness and decision-making.

(ii) Role-based collaborative situation awareness and decision-making when the swarm is ambushed during marching.

The task plot setting is shown in Table 3 and Table 4. In these tables, the situation is represented symbolically. The values of the situation factors are also symbols, such as TRUE, FALSE, HIGH, and LOW. The uncertainty of the environmental situation is described through the probability values of the symbolically represented situation factors.

Table 3 Events for swarm level decision

Table 4 Events for cooperation level role-based decision-making

6.2 Result analysis

6.2.1 Swarm level situational awareness and decision-making

In the plot summary shown in Table 3, T0-T5 refer to the situation information at the six extracted decision moments. In the table, the three situation components of environment, swarm serviceability rate, and sustainability are initially set to HIGH, and the subsequent states are inferred from the observed events and the state at the previous moment. Fig.7 shows the events that occur at each moment.

Fig.7 Events setting of situation awareness experiment

Fig.8 shows the inference results of the environmental situation component at times T0-T5. Fig.8(a) shows the SSBN generated at T0, in which the inferred environmental state is GOOD. Fig.8(b) shows the probability curves of the environmental situation component at each time.

At T0, the probability of the initial environmental situation being GOOD is 1. There is no clear observation at T1-T3, which leads to a decrease in this probability. At T4, visibility is observed to be GOOD, and the probability of the environmental situation being GOOD rises to 0.9. At T5, interference is observed and the visibility evidence is missing, so the probability of the environmental situation being GOOD decreases to 0.4. The simulation results are consistent with expectations, which indicates that the environmental situational awareness function works normally.
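
The qualitative behavior described above (decay without observations, recovery after favorable evidence, decline after unfavorable evidence) is typical of recursive Bayesian filtering over time slices. The Python sketch below illustrates it for a two-state environment variable. All transition and likelihood parameters, as well as the function and event names, are hypothetical; the sketch reproduces the trend but is not intended to reproduce the values in Fig.8.

```python
# Hypothetical likelihoods: event -> (P(event | GOOD), P(event | BAD)).
LIKELIHOOD = {
    "visibility_good": (0.9, 0.3),
    "interference": (0.2, 0.8),
}


def predict(p_good, stay=0.9, recover=0.3):
    """Propagate P(GOOD) one time slice forward without evidence."""
    return p_good * stay + (1.0 - p_good) * recover


def update(p_good, lik_good, lik_bad):
    """Bayes update of P(GOOD) with the likelihoods of one observed event."""
    numerator = p_good * lik_good
    return numerator / (numerator + (1.0 - p_good) * lik_bad)


def filter_environment(events, p0=1.0):
    """Return P(GOOD) at T0 and after each subsequent time slice."""
    p = p0
    history = [p]
    for event in events:
        p = predict(p)
        if event is not None:
            p = update(p, *LIKELIHOOD[event])
        history.append(p)
    return history


# T1-T3: no observation; T4: good visibility; T5: interference.
events = [None, None, None, "visibility_good", "interference"]
print([round(p, 3) for p in filter_environment(events)])
```

In the MEBN setting, the prediction step corresponds to the dependency between consecutive time slices and the update step to entering evidence into the SSBN generated for the current slice.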

Fig.9(a)-Fig.9(d) show the state curves of the main situation components of the swarm level decision at each time, and Fig.9(e) shows the output curves of the model. The curves show that at T3, although the endurance remains at a high level, the availability, swarm state, and confrontation ability are at a low level. In particular, the significant decline in confrontation ability and the increase in threat degree make the swarm more inclined to withdraw from the battle.

Fig.9 Probability curves of each situation element in group contingency planning model

At T5, a slight change in the swarm state and a slight increase in the threat lead to a more obvious strategic tendency to withdraw from the battle. The decision output results of the SSBN at T3 and T5 are shown in Fig.10. As a result, the probability of withdrawing from the battle increased from 0.6654 to 0.71.

Fig.10 Decision output at T3 and T5

6.2.2 Situation awareness and decision-making at the collaborative level

Fig.11 shows the events in the collaborative situational awareness scene at each time. Because visibility is good on the open road in the battlefield, the enemy is unlikely to set an ambush in such an open place, whereas in the urban scenario the enemy can hide easily and the possibility of the swarm being attacked is higher. Therefore, this paper assumes that the threat when unmanned vehicles enter urban areas is significantly higher than when they march on open roads.

Fig.11 Defined events of collaborative situation awareness

Fig.12 shows the computing resource consumption during decision-making. The inference time at each moment is at the millisecond level, and the resource consumption can meet real-time decision-making requirements under normal circumstances. Table 5 lists the utility values under the influencing factors and policy options. The utility values lie in the range [-10, 10]. As shown in the table, through the defined distribution of utility values, the guarder tends to search in the high-threat situation and to keep moving in the low-threat situation, whereas the follower tends to hide in the high-threat situation and to keep moving in the low-threat situation.

Fig.12 Computing resource consumption of decision-making

Table 5 Utility values for different roles in cooperative decision problem

The decision outputs of this simulation experiment at each time are shown in Table 6, where the result shown in bold is the decision made at that time. As can be seen from the table, the decision results conform to the predefined role tendencies. In particular, consider the utility values of the follower at time T3, which is the scene where the UAV is attacked by the enemy on the open road. At this time, although the follower makes a correct decision, there is little difference between the utility values of march and hide. This is related to the local probability values that set the influence of UAVs being attacked and UGVs being attacked on the threat degree. According to the setting, the contribution of an attacked UAV to the threat degree is lower than that of an attacked UGV: the values are {UAV = TRUE, UGV = FALSE} = {0.6, 0.4} and {UAV = FALSE, UGV = TRUE} = {0.8, 0.2}. This scene represents an unexpected attack in a low-threat situation, in which some ambiguity in the decision results is acceptable.

The experimental results show that the multi-scene decision model proposed in this paper can effectively support the decision-making of swarm unmanned systems at the swarm, cooperative, and individual levels. The role-based decision framework proposed in this paper is applicable to multiple scenarios. Compared with [22], the framework has advantages in role-based cooperative decision-making and adapts better to dynamic and changing scenes. Moreover, the framework integrates experience knowledge with description knowledge, which provides a feasible way of using prior experience knowledge.

Table 6 Collaborative decision results

7.Conclusions

This paper focuses on the multi-scene decision-making problem of unmanned systems. Against the background of a cooperative region control task, a role-based Bayesian decision framework for swarm unmanned systems is proposed. The proposed framework integrates description logic, experience knowledge, and preference decision theory, and its advantages are as follows. Firstly, the framework is constructed along a semantic-centered roadmap, which makes it possible to extend it to new scenarios. Secondly, it integrates experience knowledge as the local structure of MFrags, which provides an easy way to use experience knowledge. Thirdly, it integrates preference decision theory, which addresses the coordination of cooperation within a swarm of autonomous unmanned systems. The research in this paper has important reference value for improving the real-time decision-making ability of unmanned systems in multiple dynamic scenes and for improving their adaptability to complex battlefield environments.