Adjusting Behaviours at Run Time

I have been looking into ways I could improve my AI and my approach to my Final Project (dissertation equivalent), and I have found a few methods and tips I could implement to make the AI feel less "rigid" to the player and more "believable" by reacting to situations differently depending on various factors. For example, if the player fires at the AI, how should it react? Should it take cover, return fire or flee? Should the response always be the same and hand-coded in? I have found two methods in Steve Rabin's Game AI Pro 2 that look at ways to make the AI less predictable, building upon the illusion of intelligence.

Dual-Utility Reasoning
When an agent has multiple decisions available, it will usually pick the first valid option, especially in a BT, which works through its children in order of priority (unless a random composite node is used). However, a utility-based AI is able to make better decisions and "will base its decision on an evaluation of the relative appropriateness and/or importance of each option instead of picking at random or simply taking the first valid option" (Dill 2015, p23). Each behaviour has a weighting value that is assigned at run time based on the current situation, which helps build a responsive, dynamic behaviour and gives agents a chance of reacting to different situations in the most appropriate way.

There are two common uses of utility: absolute utility and relative utility.  Absolute utility picks the behaviour that has the highest weighting for the current situation and performs that task.  Although this might be the best reaction to the situation, it becomes predictable and gives the player a chance to "coerce" the agent into situations that benefit the player.  The other option is relative utility, which uses the weighting as a probability of selection, an approach referred to as weight-based random.  To work out which behaviour is selected, the weights of all valid options are added up, a number between 0 and the total weight is generated, and the option whose slice of that range contains the number is chosen.  The code would look somewhat like this:
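A minimal sketch of weight-based random selection in Python; the option names and weights here are illustrative, not taken from the book:

```python
import random

# Hypothetical (name, weight) pairs; in practice the weights would be
# recomputed each time from the current situation.
options = [("TakeCover", 5.0), ("ReturnFire", 3.0), ("Flee", 1.0)]

def weight_based_random(options):
    """Pick an option with probability proportional to its weight."""
    total = sum(weight for _, weight in options)
    roll = random.uniform(0.0, total)   # number between 0 and the total weight
    for name, weight in options:
        roll -= weight
        if roll <= 0.0:
            return name
    return options[-1][0]               # guard against floating-point drift
```

With these weights, "TakeCover" is picked roughly five times out of nine, "Flee" only one time in nine, but every valid option has some chance.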


Whichever action is picked is then performed, giving a sense of unpredictability and variation; however, "there is always some chance that an option with very low utility will be selected" (Dill 2015, p24).
An alternative to these options is dual-utility reasoning, which takes the strengths of both previously mentioned utilities and combines them.  Instead of assigning each task only a weighting and either selecting the highest value (as absolute utility does) or picking randomly among all valid tasks (as relative utility does), dual-utility gives each task both a rank and a weight.  Tasks are prioritised and categorised into the various ranks; this is where absolute utility is used, as any task not considered part of the highest rank is discarded.  We then switch to relative utility and pick a task from within that rank.  This helps ensure that the best possible action is selected while maintaining a sense of unpredictability.
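The two-stage selection can be sketched in Python as follows; the ranks, weights and names are made up for illustration:

```python
import random

def dual_utility_select(options):
    """options: list of (rank, weight, name) tuples, where a higher rank
    means a higher-priority category.  Keep only the highest rank, then
    do a weight-based random pick within it."""
    best_rank = max(rank for rank, _, _ in options)
    finalists = [(w, name) for rank, w, name in options if rank == best_rank]
    total = sum(w for w, _ in finalists)
    roll = random.uniform(0.0, total)
    for w, name in finalists:
        roll -= w
        if roll <= 0.0:
            return name
    return finalists[-1][1]
```

Note how a low-rank option such as an idle behaviour can never be selected while higher-rank combat options are valid, which removes the "very low utility" problem of plain weight-based random.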

Hinted-Execution Behaviour Trees (HeBTs)
Another way to control how the AI reacts to situations in the game world is to convert an existing BT into a Hinted-Execution Behaviour Tree (HeBT), which is built upon the existing code base and allows "developers to dynamically modify the priorities of a BT based on some high-level logic" (Barriales 2015, p69).

HeBTs are able to reorder their branches at run time to produce different results in different situations.  To do this they imitate a chain-of-command (CoC) structure, where the higher-level trees instruct the lower-level behaviours what to do.  The "commands" sent down the chain are called hints, and they are used to reorder the priority of the nodes within the BT, changing the flow of control.

There are not many additions needed to convert a BT into a HeBT.  For example, a new type of selector node needs to be created.  As explained in previous posts, selector nodes store a list of their children and iterate through them in order of addition (left -> right) until a child returns success, returning failure only if all children fail.  The new selector node needs to be "assigned a unique identifier, which is assigned at creation time" (Barriales 2015, p75).  Hints are then passed through these selector nodes, instructing them how their children should be reordered with a new priority.  A hint is either positive, negative or neutral.  As the names suggest, positive hints tell the BT that the marked behaviours are desirable, while negatively hinted behaviours are undesirable in the current situation.  Neutral hints mean no action is required and reset the node back to its original state.

The new selectors require four lists instead of the standard single list.  The original list still works in the same way, storing the children in their original order of priority; the other three lists hold the children carrying each kind of hint, for example one list holds all of the positively hinted children, and so on.  Crucially, "these extra lists are still stored using the original order, so if two or more nodes are hinted, AIs will know which action is more important according to their original behaviour" (Barriales 2015, p75).
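A sketch of such a hinted selector in Python, assuming children are simple callables returning True (success) or False (failure); the class and method names are my own, not from the chapter:

```python
POSITIVE, NEUTRAL, NEGATIVE = 1, 0, -1

class HintedSelector:
    def __init__(self, selector_id, children):
        self.selector_id = selector_id     # unique id assigned at creation time
        self.original = list(children)     # original priority order, kept intact
        self.hints = {child: NEUTRAL for child in children}

    def receive_hint(self, child, hint):
        # A NEUTRAL hint simply restores the child to its original position.
        self.hints[child] = hint

    def _ordered_children(self):
        # The three extra "lists" each keep the original order, so ties
        # between hinted nodes fall back to the original priority.
        positive = [c for c in self.original if self.hints[c] == POSITIVE]
        neutral  = [c for c in self.original if self.hints[c] == NEUTRAL]
        negative = [c for c in self.original if self.hints[c] == NEGATIVE]
        return positive + neutral + negative

    def tick(self):
        # Standard selector semantics over the reordered children:
        # succeed on the first child that succeeds, fail if all fail.
        return any(child() for child in self._ordered_children())
```

With no hints applied, this behaves exactly like a normal selector over the original child order, which is what lets an existing BT be converted with so few changes.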

A HeBT is structured differently from a BT: Barriales (2015, p78) suggests that rather than the AI being controlled by a single tree, there are multiple layers of BTs managed by a behaviour controller.  The higher-level trees have a new node type that replaces the action/leaf node, called a hinter, which sends hints down to the layer of the HeBT directly below it.  All the new trees and the base BT are owned and controlled by the behaviour controller, which works like a stack: any newly created tree is pushed to the top.  When a new tree is created and added to the stack, it is told which tree is immediately below it, as shown in the following picture:
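A minimal sketch of the layered structure in Python, assuming each layer exposes a tick() method and holds a reference to the layer below it; all names here are my own, not from the chapter:

```python
class BehaviourController:
    """Owns the base BT plus any higher-level hinting trees, as a stack."""

    def __init__(self, base_tree):
        self.stack = [base_tree]        # the base BT sits at the bottom

    def push_tree(self, tree):
        tree.lower = self.stack[-1]     # new tree is told what is directly below it
        self.stack.append(tree)

    def tick(self):
        # Execute from the top of the stack down: the higher layers run
        # their hinter nodes first, so by the time the base BT ticks it
        # has received every hint and reordered its selectors.
        for tree in reversed(self.stack):
            tree.tick()
```

In a full implementation a hinter leaf in a higher-level tree would, during its tick, push a hint to the matching selector in the tree below via that `lower` reference.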

(Barriales 2015, p79)

HeBTs are executed from the top layer of the stack down to the original BT.  By the time execution reaches the base BT, the nodes have received all the hints and been reordered accordingly.  HeBTs are very powerful and expand upon existing BTs without needing to change much of the original code, which is best avoided once behaviours have been thoroughly tested.  They can be used in many different situations, "such as rapid prototyping, dynamic behaviour[sic] adaption, or group behaviours" (Barriales 2015, p86), allowing changes in the BT to happen dynamically without adjusting much of the behaviour already written.

Conclusion
These are both very good approaches that I believe would help improve my AI and bring me closer to achieving believable AI.  However, with the amount of time remaining, I do not believe I will be able to fully learn and implement these features.  If there is time left and all my milestones are hit, I would like to attempt both methods as a stretch goal.

Reference List
[1] Barriales, S. (2015) "Building a Risk-Free Environment to Enhance Prototyping" in Game AI Pro 2, edited by Steve Rabin. Boca Raton, FL: CRC Press, pp. 69-87.
[2] Dill, K. (2011) "A Game AI Approach to Autonomous Control of Virtual Characters", Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC), pp. 1-11.
[3] Dill, K. (2015) "Dual-Utility Reasoning" in Game AI Pro 2, edited by Steve Rabin. Boca Raton, FL: CRC Press, pp. 23-26.
[4] Jadon, S., Singhal, A. and Dawn, S. (2014) "Military Simulator - A Case Study of Behaviour Tree and Unity based architecture", International Journal of Computer Applications, vol. 88, no. 5, pp. 26-29.