WIP: mach<->ECS integration, ECS type safety, high-level application API proposal #349

Closed
emidoots wants to merge 12 commits from sg/ecs-type-safety-1 into main
emidoots commented 2022-06-12 21:15:43 +00:00 (Migrated from github.com)

Context

I've been thinking non-stop about, and exploring, a lot of overlapping questions:

  • What does integrating mach/ecs into the higher-level mach/src/ API look like?
  • What should a 'typical' Mach application look like? We had this discussion (https://github.com/hexops/mach/issues/309), but the ECS integration side was largely "we'll figure it out".
  • How do we handle type-safety in our ECS? At runtime? At comptime?
  • How do we handle namespacing in our ECS? e.g. if renderer and physics2d modules both want to provide a "location" component of a different type (vec3 vs vec2) - how do we handle that? Is there a convention of prefixing component names like "renderer.vec3", or do we do something else?
  • And more: What does order-of-execution for ECS systems look like? How can ECS modules talk to each other to signal events? Where do we store globals/singletons? Where do end-user applications store their own state? How do we handle serialization?

Painting the whole picture (or at least 60% of it)

The thing that hit me like an overwhelming bag of bricks is that there are just so many different ways we can tackle these problems. Like, literally, hundreds if not thousands of combinations. And each choice has far-reaching implications for other parts of the system. If we don't consider them in total, all combined together, we're not seeing the whole picture we're painting.

Two good examples of this:

  1. In https://github.com/hexops/mach/issues/309#issuecomment-1140479760 we agreed upon a high-level application API, but in doing so we didn't realize at the time that our choice of API would actually prevent anyone (not just us) from passing a comptime parameter to their code through the ECS without type-erasure. There's no place in the type system of that example where one could pass a comptime parameter to, say, their own ECS module. We only noticed this because it later came to light that we couldn't pass comptime parameters to ecs.Entities(T) if we were to try and tackle type-safety that way.
  2. If we tackle type-safety in the ECS as runtime type-safety, where you can add whatever component type at any point in time, then we introduce a serialization problem: we would know, say, the type ID of a component, but we wouldn't have a way to, say, call a method on that type to serialize it. We could require components to register a fn serialize(component: *anyopaque, writer: anytype) !void function - but do we want that?

This PR

This is an exploratory PR: if we're happy with how this looks/feels generally, then I'll work on sending separate smaller/cleaner PRs to land this same idea into main. It's not 100% thought out, and so it wouldn't be merged directly.

This PR takes several ideas from everyone I've talked to: Levy, MasterQ32, Ayush, Ali, Zargio, and more into account.

A standard high-level Mach application

Now looks like this:

const std = @import("std");
const mach = @import("mach");
const gpu = mach.gpu;
const ecs = mach.ecs;

const renderer = @import("renderer.zig");
const physics2d = @import("physics2d.zig");

const modules = ecs.Modules(.{
    .mach = mach.module,
    .renderer = renderer.module,
    .physics2d = physics2d.module,
});

pub const App = mach.App(modules, init);

pub fn init(engine: *ecs.World(modules)) !void {
    const core = engine.get(.mach, .core);
    try core.setOptions(.{ .title = "Hello, ECS!" });

    const device = engine.get(.mach, .device);
    _ = device; // use the GPU device ...

    const player = try engine.entities.new();
    try engine.entities.setComponent(player, .renderer, .location, .{.x = 0, .y = 0, .z = 0});
    try engine.entities.setComponent(player, .physics2d, .location, .{.x = 0, .y = 0});
    _ = player;
}

Where physics2d.module looks like this:

pub const module = Module(.physics2d);

pub fn Module(namespace: anytype) type {
    _ = namespace;
    return ecs.Module(.{
        .components = .{
            .location = Vec2,
            .rotation = Vec2,
            .velocity = Vec2,
        },
        // ...systems/state/etc...
    });
}

pub const Vec2 = struct { x: f32, y: f32 };

First impressions

The first thing you'll notice is that:

  1. We declare modules up-front, at compile-time. You'll have to write out all of the modules you intend to use in your application, and each module has to write out all of the ECS components it intends to use.
  2. We've brought back the semi-magical pub const App = mach.App(modules, init);, more on this below.
  3. Components live in a namespace, API calls like .setComponent(player, .renderer, .location, .{.x = 0, .y = 0, .z = 0}) find a component called .location in the .renderer module and set that component on the entity.
  4. The ECS is fully type-safe: based on the .renderer, .location parameters it knows the exact type the component must be at comptime. Similarly, .getComponent(player, .renderer, .location) returns the concrete type.
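To make point 4 concrete, here is a minimal sketch of how a namespace/component pair can be resolved to a concrete type at comptime. It is not the PR's implementation: the `components` table, `ComponentType`, and `checked` are hypothetical names used only for illustration, and `Vec3` is an assumed renderer type.

```zig
const std = @import("std");

pub const Vec2 = struct { x: f32, y: f32 };
pub const Vec3 = struct { x: f32, y: f32, z: f32 };

// Hypothetical table: namespace -> component name -> component type.
// The real PR derives this from `ecs.Modules(...)`; the shape here is assumed.
const components = .{
    .renderer = .{ .location = Vec3 },
    .physics2d = .{ .location = Vec2, .velocity = Vec2 },
};

// Looks up the component type for a (namespace, name) pair at comptime.
// An unknown namespace or name fails to compile because the field is missing.
fn ComponentType(comptime namespace: anytype, comptime name: anytype) type {
    const module_components = @field(components, @tagName(namespace));
    return @field(module_components, @tagName(name));
}

// Stand-in for a setter: the value parameter's type depends on the two
// comptime parameters, so passing the wrong component type is a compile error.
fn checked(
    comptime namespace: anytype,
    comptime name: anytype,
    value: ComponentType(namespace, name),
) ComponentType(namespace, name) {
    return value;
}
```

With this shape, `checked(.renderer, .location, some_vec2)` would be rejected by the compiler because `.renderer`/`.location` resolves to `Vec3`.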

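I highly encourage reading the examples/ecs-app source (https://github.com/hexops/mach/pull/349/files#diff-d09c4186fcdaeeefd52c5637846f6a0b033c391c0d954d737c0c3cb1d4a6e0ea) as it explains in greater detail some other concepts like namespacing of modules.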

The type-safety/explicitness vs. flexibility tradeoff

There is a very real tradeoff here around type-safety: requiring modules to be declared up-front, requiring components you might use to be enumerated by modules, etc. can be tedious and adds some real overhead.

For example, modules must live in a namespace. And if two modules have the same namespace, we need to let the consumer of them rename them. This is fine, since it's just a comptime parameter namespace: anytype, but it means all of your module code that intends to work with the ECS needs to have that namespace: anytype parameter so it can pass it to .setComponent, .getComponent, etc.:

pub const module = Module(.physics2d); // The default namespace for this module

// Lets you rename the module.
pub fn Module(namespace: anytype) type {
    //
    _ = namespace;
    return ecs.Module(.{
        .components = .{
            .location = Vec2,
            .rotation = Vec2,
            .velocity = Vec2,
        },
        // A system function which iterates entities with physics here would need to use e.g.
        // `.getComponent(entity, namespace, .velocity)` and not
        // `.getComponent(entity, .physics2d, .velocity)` so the user can rename it.
        // ...systems/state/etc...
    });
}

pub const Vec2 = struct { x: f32, y: f32 };

All of this adds a real level of mental overhead for the programmer. It will be stuff to learn, and will make writing a Mach application out-of-the-box harder. Is it worth it?

Initially, my impression was no. I actually set out to create this PR as evidence and proof that it is not worth it for when anyone complains in the future. But, after implementing and reflecting on all aspects it actually seems quite nice.

It's possible I change my mind again later, but it seems fair to go with the most restrictive approach first (you have to define all modules/components globally), try and push that as far as we can and make it as nice an API as we can, and go to a less restrictive one later if we truly need to.

Unimplemented aspects

Not implemented in this PR are a few important things:

Initialization

Module initialization is something I haven't implemented yet. This would be easy to handle, though, because we have all the type information. For example, we could call an init function defined in the module and require it return a valid type to us.
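A rough sketch of what that could look like, assuming each module exposes a `State` type and a `pub fn init`. Both names, the `physics2d` stand-in, and `initModule` are hypothetical and not part of this PR:

```zig
const std = @import("std");

// Hypothetical module with some state and an init function.
pub const physics2d = struct {
    pub const State = struct { gravity: f32 };

    pub fn init() State {
        return .{ .gravity = -9.8 };
    }
};

// Calls the module's init function. A missing `init` is reported with a clear
// compile error, and an `init` that returns anything other than `M.State`
// fails to compile at the `return`.
fn initModule(comptime M: type) M.State {
    if (!@hasDecl(M, "init")) {
        @compileError(@typeName(M) ++ " must declare `pub fn init`");
    }
    return M.init();
}
```

Because the set of modules is known at comptime, the engine could run a check like this for every module before the application's own `init` is called.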

Systems

  • How ECS systems get defined in a module (it would just be a function basically, so pretty much a solved problem)
  • System state: the idea here is that when an ECS system function is called, it gets a pointer to its own state struct where it can store information over time if it desires. This is a get-out-of-jail-free card, where if you don't care about ECS much and just want local variables you can shove them in here.
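As a sketch of the system-state idea, a system could be a plain function that receives a pointer to its own state struct. `PhysicsState`, its field, and `physicsSystem` are hypothetical names for illustration; the PR does not define how systems are declared.

```zig
// Hypothetical per-system state; the field is illustrative only.
pub const PhysicsState = struct {
    frames_simulated: u32 = 0,
};

// A system is just a function. The engine would pass it a pointer to its own
// state, which persists between calls.
pub fn physicsSystem(state: *PhysicsState) void {
    state.frames_simulated += 1;
}
```

Anything that would otherwise be a global or a local variable carried across frames can live in the state struct.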

Order of system execution, multi-threading, communication

None of these are solved problems, but I have some ideas I plan to explore. Because this PR has us declare modules globally at comptime, it keeps the widest range of possibilities open here. What I will explore is mapping the Elm data model (which I recently learned was used in the PSVR Dreams game, apparently?!) into ECS a bit.

Serialization

We have full type information with this approach, and so we could serialize ECS components as we see fit using that information. How exactly we do that is not settled, but I am leaning towards a custom binary format (https://twitter.com/slimsag/status/1532049174643953666). The important part here is how this interacts with the future editor, which one can imagine talking to your game over a socket as it runs. In this scenario, the editor and the game need to speak the same serialization format in order for the editor to instruct the game to e.g. change a component's value.
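To illustrate why full type information makes this tractable, here is a minimal sketch that walks a component's fields at comptime and copies each field's bytes into a buffer. This is not the proposed binary format: `serializeFields` is a hypothetical helper, it writes native-endian bytes with no header or versioning, and it assumes the caller provides a large enough buffer.

```zig
const std = @import("std");

pub const Vec2 = struct { x: f32, y: f32 };

// Copies the raw bytes of each field of `value` into `out`, in declaration
// order, and returns the number of bytes written. The field list comes from
// the type itself, so no per-component serialize function is registered.
fn serializeFields(value: anytype, out: []u8) usize {
    var written: usize = 0;
    inline for (std.meta.fields(@TypeOf(value))) |field| {
        const bytes = std.mem.asBytes(&@field(value, field.name));
        for (bytes) |byte| {
            out[written] = byte;
            written += 1;
        }
    }
    return written;
}
```

An editor speaking the same format could decode a component by walking the same field list in the same order.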

Other learning: high-level vs low-level apps

This might at first appear to be a thing I am going back on after our discussion in https://github.com/hexops/mach/issues/309#issuecomment-1140479760:

pub const App = mach.App(modules, init);

It's important to note that this is just "mach uses the low-level API to provide the high-level API". The only change from our agreed upon API earlier is that there is no special 'high level' API:

  • I think this makes a ton of sense. In fact, the more I thought about it, the weirder the alternative seemed: Mach would have a special-cased "high level" API, but if you don't want to use that (our ECS) then you would have no choice but to use the "uglier" low-level API and require all of your users make use of that too (for example, if someone uses the low-level mach API to create a UI application framework of their own on top of WebGPU).
  • By doing this, we're just saying "Mach's high-level API is just a struct which implements that low-level API, you provide an init function that interacts with the ECS only and we provide the update, deinit, etc. low-level functions."

Additionally, there is no "I forgot pub" issue either like before, because we can always assert that App is defined - which we were already going to require in low-level applications.

In short, I think it'll be a "eh that's a little weird, but OK" by anyone initially looking at it. But "with this you get android/ios/webassembly" seems a reasonable answer to a one-line oddity.

Thoughts?

I'd love to hear them. Overall I'm convinced this is the right path forward, at least as an idea to try out for now, but if others object heavily I might reconsider.

  • By selecting this checkbox, I agree to license my contributions to this project under the license(s) described in the LICENSE file, and I have the right to do so or have received permission to do so by an employer or client I am producing work for whom has this right.
emidoots commented 2022-06-13 05:01:30 +00:00 (Migrated from github.com)

Follow-up thoughts:

  • I posted a Twitter thread soliciting feedback: https://twitter.com/slimsag/status/1536189712821456897

  • We can eliminate the namespace renaming logic (the namespace: anytype parameter passed to Module functions in e.g. the physics2d module) if we establish naming conventions (maybe enforce them even), e.g. <prefix>_<module name> for third-party modules, <module name> for Mach standard modules.

  • Modules would still need the set of modules passed to them, so they can operate on ecs.World(modules), unless we use @import("root").App.modules magic to access it - which may be worth it? We could eliminate all comptime parameters to modules then.

emidoots commented 2022-06-15 16:26:11 +00:00 (Migrated from github.com)

Takeaways from Twitter feedback:

  • "Why do I need to pass mach.module to ecs.Modules?" (2 people)
  • ECS OOM operations should just panic, remove try (Srekel: https://twitter.com/Srekel/status/1536204001691287552?s=20&t=iH3wm-odGL9WFNqj-2q45w)
  • Add .module(.physics2d) wrapper API? Discussed in Matrix and would alleviate this question (https://twitter.com/lilhxnna/status/1536190668867084289?s=20&t=G3WWEydKXwOn_mj_lyFUWQ)

Writing better examples/documentation in the future:

  • Don't use .{} struct prefix or _ = foo in examples/tutorials material (2-3 people not coming from a Zig background don't understand it easily)
  • Needs documentation: the purpose of Modules
  • Make the tradeoffs of high vs. low-level Mach apps clear (https://twitter.com/croloris/status/1536684980880953344)
  • Explain generic functions (1 person, https://twitter.com/williamsharkey/status/1536801389594255363)
  • Paint a clear picture of how Mach ECS type safety works (1 person, https://twitter.com/devgerred/status/1536199972277870593?s=20&t=G3WWEydKXwOn_mj_lyFUWQ)
tauoverpi (Migrated from github.com) reviewed 2022-06-23 08:47:55 +00:00
tauoverpi (Migrated from github.com) commented 2022-06-23 08:47:54 +00:00
            comptime namespace_name: @Type(.EnumLiteral),
            comptime component_name: @Type(.EnumLiteral),

Using a more constrained type here would better convey intent. You could also use all_components to derive the two types which constrains the interface further.
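One way to read this suggestion, sketched under assumptions: derive an enum of namespaces and, per namespace, an enum of component names from the component table, then use those as the parameter types. The shape of `all_components` below is assumed for illustration (the PR's actual definition may differ), and `Namespace`, `ComponentName`, and `ComponentType` are hypothetical names.

```zig
const std = @import("std");

pub const Vec2 = struct { x: f32, y: f32 };
pub const Vec3 = struct { x: f32, y: f32, z: f32 };

// Assumed shape: namespace -> component name -> component type.
const all_components = .{
    .renderer = .{ .location = Vec3 },
    .physics2d = .{ .location = Vec2, .velocity = Vec2 },
};

// Enum with one tag per namespace: .renderer, .physics2d.
pub const Namespace = std.meta.FieldEnum(@TypeOf(all_components));

// Enum with one tag per component declared in the given namespace.
pub fn ComponentName(comptime namespace: Namespace) type {
    return std.meta.FieldEnum(@TypeOf(@field(all_components, @tagName(namespace))));
}

// The second parameter's type depends on the first, so only components that
// exist in the chosen namespace are accepted.
pub fn ComponentType(
    comptime namespace: Namespace,
    comptime name: ComponentName(namespace),
) type {
    return @field(@field(all_components, @tagName(namespace)), @tagName(name));
}
```

Compared with `@Type(.EnumLiteral)`, a misspelled namespace or component name fails at the call site when the enum literal is coerced, rather than inside the lookup.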

emidoots commented 2022-07-05 05:25:49 +00:00 (Migrated from github.com)
Merged via:

  • https://github.com/hexops/mach/pull/381
  • https://github.com/hexops/mach/pull/385
  • https://github.com/hexops/mach/pull/386
  • https://github.com/hexops/mach/pull/387

Pull request closed
