proposal: Artificial intelligence systems #176
hexops/mach#176
After looking into game AI systems I stumbled upon GOAP, a planning system used in games like F.E.A.R., Tomb Raider (2013), and more. It is generic enough to be part of a game engine framework, in that the core planner can be implemented without prior knowledge of the game. I think this would be a nice addition to mach, given that Zig comptime or the vtable pattern can provide the implementation/specialization of the system where needed.
The GOAP tl;dr as I understand it is:
Since I've just found it, I have no code to show other than what I've found in presentations and papers.
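To make the GOAP tl;dr concrete, here is a minimal sketch of the core idea (not mach code; the action set and all names are made up for illustration): world state is a set of facts, each action has preconditions, effects, and a cost, and the planner searches from the current state to any state satisfying the goal.

```python
import heapq
from itertools import count

# Hypothetical action set: (name, preconditions, adds, removes, cost).
ACTIONS = [
    ("get_axe",   {"axe_available"}, {"has_axe"},  set(), 2),
    ("chop_tree", {"has_axe"},       {"has_wood"}, set(), 4),
    ("gather",    set(),             {"has_wood"}, set(), 8),
]

def plan(state, goal):
    """Uniform-cost search from `state` to any state satisfying `goal`."""
    tie = count()  # tiebreaker so heapq never compares frozensets
    frontier = [(0, next(tie), frozenset(state), [])]
    seen = set()
    while frontier:
        cost, _, current, steps = heapq.heappop(frontier)
        if goal <= current:          # every goal fact holds
            return steps, cost
        if current in seen:
            continue
        seen.add(current)
        for name, pre, adds, removes, c in ACTIONS:
            if pre <= current:       # action is applicable
                nxt = (current - removes) | adds
                heapq.heappush(frontier, (cost + c, next(tie), nxt, steps + [name]))
    return None, float("inf")

steps, cost = plan({"axe_available"}, {"has_wood"})
# get_axe + chop_tree (cost 6) beats gather (cost 8)
```

Note this is what makes the planner game-agnostic: the engine only ever sees opaque facts and costs, while the game supplies the action set.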
I'd like to have something here, I think, but I'm not sure what exactly yet.
You should check out some of Dave Mark's work, and this thread in which he notes some scalability issues with GOAP.
Also this talk: https://www.gdcvault.com/play/1021848/Building-a-Better-Centaur-AI
When/if this is worked on, the ability of this module to be integrated into a GUI should be carefully considered, same as the ECS. On that note, it would be quite important to weigh how much integration with the ECS module there should be: should it act as an extension to the ECS, or be its own separate set of logic modules that just happen to operate well in tandem with it?
Whichever approach is chosen, it shouldn't be too hard to execute, since Zig lazily evaluates declarations: if one doesn't want to use the AI module, they would simply have to avoid the functions in the ECS module that refer to it. One could also argue that this could lead to overly tight coupling, though, so maybe you want to make it a build option.
Lots to consider.
A few more resources
There's a new room in the Matrix chat about this where Levy and I are discussing quite a bit: https://app.element.io/#/room/#mach-ai:matrix.org
For the GUI portion of IAUS, at least, it would make sense to have a graph of the strength of the activations of the actions over time (the same for components when zoomed in on an action), allowing the designer to tweak behaviour and see how it affects the choices the character would have made over a short sequence. This pairs with a recording of a single session, where the designer can observe behaviour and decide how the character should behave by setting constraints on which actions must be chosen where, as a guide when tweaking utility constraints. The same can be done for exploring outcomes by replaying the session with the new behaviour. One could also explore multiple behaviour variants in parallel by running simulations side by side to see how they diverge (either with different cameras, or a 2D merged view with a different colour for each variant).
Think of it as a video editor where it's only possible to tell characters what they should think about before each improv play.
After experimenting a bit with different systems, a plain Dual Utility system seems the most effective for both performance and quality. It's capable of replacing state machines pretty much anywhere, and there's no need for behaviour trees at all, nor for planners, as they really don't scale well performance-wise.
The system I have at the moment consists of a list of categories used to group actions that make sense together (e.g. combat actions go in the combat category, without dance/emote, as you wouldn't want your character to start dancing at random at 0.1 hp), where each category contains a list of actions. Each category is chosen by the same utility evaluation used for actions, so they share the same type of sensors. Picking an action is thus two evaluations: first pick a category, then pick an action within it.
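The category-then-action pick above can be sketched as the usual dual-utility selection (this is an illustrative sketch, not the mach implementation; ranks, weights, and option names are made up): every option carries a rank and a weight, the highest rank with at least one positive weight wins, and a weighted-random pick is made among its options. Running the same routine once over categories and once over the chosen category's actions gives the two-stage pick.

```python
import random

def dual_utility_pick(options, rng=random):
    """options: list of (name, rank, weight). Returns a name, or None."""
    # Stage 1: highest rank that has any weight > 0.
    best_rank = None
    for _, rank, weight in options:
        if weight > 0 and (best_rank is None or rank > best_rank):
            best_rank = rank
    if best_rank is None:
        return None  # nothing is worth doing
    # Stage 2: weighted-random pick within the winning rank.
    pool = [(n, w) for n, r, w in options if r == best_rank and w > 0]
    roll = rng.uniform(0, sum(w for _, w in pool))
    for name, w in pool:
        roll -= w
        if roll <= 0:
            return name
    return pool[-1][0]

# Combat outranks idle, so "dance" is never picked mid-fight even
# though its weight is the highest:
chosen = dual_utility_pick([
    ("attack", 2, 0.8),
    ("flee",   2, 0.2),
    ("dance",  1, 1.0),
])
```

The rank is what prevents the dancing-at-0.1hp problem by construction: low-rank options aren't merely unlikely, they're unreachable while a higher rank has anything viable.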
To improve performance at runtime and ease editing, I've defined both a text-based format for quickly editing character behaviour (where actions and sensors are defined Zig-side) and an in-memory format which can be mapped into memory and used as-is:
Image.Header
    u32
    u8
    u8
    u16
    u32
    u16
    u16
    u16
    u16
    u16
    u16
    u16
    u16
    [32]u8  (ed25519, identifying the author)
    [32]u8  (Blake3, file integrity)
Image.Data
    [number_of_categories]Category
        u16
        u16  (consideration)
        u16
        u16
    [number_of_actions]Action
        enum(u16)  (one tag per action)
        u16  (consideration)
        u16
    [number_of_considerations]Consideration
        Sensor
            enum(u8)   (linear, bezier, logit, logistic, normal, poly, etc...)
            enum(u4)   (sensor group, e.g: my, target, game, imap, ...)
            u4 = 0
            enum(u16)  (name of the sensor within the given sensor group)
        f32  (minimum bound when clamping)
        f32  (maximum bound when clamping)
        [number_of_parameters]f32
    [action_symbol_table_size]u8   (optional)
    [sensor_symbol_table_size]u8   (optional)
    [name_pool_size]u8
    [comment_pool_size]u8

Evaluation follows traversal order, and each step guarantees the cache is filled with useful data while processing. Each function "knows" how many parameters to consume from the parameter array, and processing is always sequential for the evaluation of a single action's/category's considerations, so there's no need to store further indices.

Debugging character behaviour relies initially on rendering graphs for each of the response curves (sensors). Plotting the estimated utility over time for categories, actions, and considerations allows further insight into how a character thinks, such as why they chose a particular path over the expected one. Given a replay of events, it's then possible for a non-programmer to tweak response curves and see the change in behaviour live on the graph of estimated utility for each category/action/consideration, as long as they have some concept of what utility is.

The main advantage I've found is the ability for characters to multi-task when given a set of atomic actions. Perhaps picking up an apple from the table to eat is of lower importance until the lost cat that entered the cafe has been dealt with. Such task switching is often painful to configure (more so when it involves more than one distraction), but with dual utility it's handled by default. If eating the apple isn't valuable after dealing with all the distractions, the character will just pick something else to do; otherwise they resume eating unless distracted again.
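As a sketch of how a single consideration might be evaluated (the curve names match the enum above, but the helper names, sensor keys, and parameters here are my own, not mach's): read a raw sensor value, clamp it into the [min, max] bounds, normalise to [0, 1], run it through a response curve, and multiply the results per action so that any zero-valued consideration vetoes the whole action.

```python
import math

def clamp01(x, lo, hi):
    """Clamp a raw sensor value into [lo, hi], normalised to [0, 1]."""
    if hi == lo:
        return 0.0
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)

# Two of the response-curve families from the format; each "knows" how
# many parameters it consumes.
CURVES = {
    "linear":   lambda x, m, b: m * x + b,
    "logistic": lambda x, k, mid: 1.0 / (1.0 + math.exp(-k * (x - mid))),
}

def evaluate(considerations, sensors):
    """considerations: (sensor, lo, hi, curve, params). Returns [0, 1]."""
    weight = 1.0
    for sensor, lo, hi, curve, params in considerations:
        x = clamp01(sensors[sensor], lo, hi)
        weight *= min(max(CURVES[curve](x, *params), 0.0), 1.0)
        if weight == 0.0:
            break  # short-circuit: one zero consideration kills the action
    return weight

# Hypothetical "flee" consideration: low health (10/100) yields a high
# weight (~0.88) with a downward logistic curve.
w = evaluate(
    [("my.health", 0.0, 100.0, "logistic", (-10.0, 0.3))],
    {"my.health": 10.0},
)
```

The multiplicative combination is also what makes the live-graph debugging useful: each factor in the product maps one-to-one to a curve a designer can drag.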
I've also found that the same system is suitable for selecting which track to play, from ambient sound to combat, based on the current situation (even selecting between variations/tones), rather than setting up areas/NPCs with triggers.
TL;DR - Dual Utility is all you need
Note: I cannot use discord (they don't like VPNs) so can only discuss this here or on matrix.
Note: The format presented above is a snapshot of it at the time of writing and might not reflect its current state.
The order in which categories, actions, and sensors appear has no effect on behaviour, so one can collect information on the frequency of activations at runtime and later sort the image, placing the categories/actions/sensors that are likely to score high at the top. This reduces the number of sensor queries (it increases the chance of a short-circuit while taking execution cost into account; effectively PGO).
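The reordering idea reduces to something like this (a sketch; the counters and names are hypothetical): record at runtime how often each consideration short-circuits its action to zero, then write the image back with the most decisive ones first, so later evaluations bail out as early as possible.

```python
def reorder(considerations, zero_counts):
    """Place the considerations that short-circuit most often first.

    considerations: list of names as stored in the image.
    zero_counts: runtime tally of how often each one zeroed an action.
    """
    return sorted(considerations, key=lambda c: -zero_counts.get(c, 0))

# "threat" kills the action most often, so evaluating it first skips
# the other sensor queries most of the time.
order = reorder(
    ["hunger", "threat", "fatigue"],
    {"threat": 40, "hunger": 3, "fatigue": 11},
)
```

A fuller version would weight the tally by each sensor's query cost, as the comment about execution cost suggests, but the sort-by-profile shape stays the same.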
Another topic of research: constructing a vector database for objects, so that characters can calculate the similarity of objects using TS-SS/cosine similarity. This may be useful for utility calculation and for "discovering" actions that can be applied to objects similar to known ones.
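The cosine half of that is a one-liner over object feature vectors (the feature axes below are invented for illustration): two objects whose vectors point the same way can be treated as interchangeable action targets.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Made-up feature axes: [sittable, flat_surface, sharp].
chair = [1.0, 0.9, 0.0]
stool = [1.0, 0.8, 0.0]
sword = [0.0, 0.0, 1.0]
# chair/stool are near-identical as sit-targets; chair/sword are orthogonal,
# so a "sit" action discovered for chairs transfers to stools but not swords.
```

TS-SS additionally accounts for vector magnitude, which matters once features encode quantities (e.g. size) rather than just traits.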
Edit:
In addition to GOAP, µKanren and its extensions may be of interest. It's suitable for long-term, high-level goals where the search may span multiple update frames.