sysaudio: read/write callback design goal #1099

Open
opened 2023-11-04 22:29:42 +00:00 by emidoots · 2 comments
emidoots commented 2023-11-04 22:29:42 +00:00 (Migrated from github.com)

@alichraghi I think we should work towards this API design:

const Recorder/Player = struct {
    /// The number of channels
    ///
    /// This field is initialized after a call to (TODO: device create function) and matches the
    /// number of audio channels reported by the underlying device, but it may not match the number
    /// of channels you requested at creation time if the device did not support that number of
    /// channels.
    channels: u8,

    /// The format of each audio sample
    ///
    /// This field is initialized after a call to (TODO: device create function) and matches the
    /// format reported by the underlying device, but it may not match the format you requested at
    /// creation time if the device did not support that format. 
    format: Format,

    /// Whether the channels' samples are interleaved (`ABABAB`) or planar (`AAABBB`) in memory.
    ///
    /// This field is initialized after a call to (TODO: device create function) and always matches
    /// your requested preference.
    ///
    /// Most native platforms support interleaved audio, but browsers/WebAudio only support planar
    /// audio. If the platform API does not support your preference, sysaudio will automatically
    /// perform conversion for you. This both prevents you from needing to do any conversion
    /// yourself, and also enables sysaudio to handle it per-platform to reduce any unnecessary
    /// conversions.
    interleaved: bool,
};
fn readCallback(ctx: Context, raw_audio: []const u8, recorder: sysaudio.Recorder) void {
    _ = ctx;
    const num_samples = raw_audio.len / recorder.format.size();
    const num_samples_per_channel = num_samples / recorder.channels;
    _ = num_samples_per_channel; // not used in this example

    // NOTE: sysaudio should expose a clear buffer size that can be used here, 16*1024 should not be
    // hard-coded like this:
    //
    // Also, what guarantees can we make about `raw_audio`? e.g. can we say
    // it has a static length per-platform, or static length for the lifetime of a device? Something
    // like that would be ideal, whatever guarantee we can make.
    var samples: [16 * 1024]f32 = undefined;

    // Convert raw_audio from the device's native format to f32 samples.
    // (The exact sysaudio.convert signature is TBD in this proposal; it is assumed here to
    // read from raw_audio and write converted samples into the destination slice.)
    sysaudio.convert(f32, samples[0..num_samples], raw_audio);

    // Write f32 samples to disk
    //
    // Note: this is just an example, things like file I/O should not be performed in a callback
    // as any stall here can result in losing samples from a recorder, failing to write enough
    // samples to a player. In a real application you should e.g. do this work in a separate thread
    // and utilize e.g. ring buffers.
    _ = file.write(std.mem.sliceAsBytes(samples[0..num_samples])) catch {};
}
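
To illustrate the note about real-time safety above, here is a hedged sketch of the same read callback that pushes converted samples into a ring buffer for a worker thread to write to disk, instead of doing file I/O in the callback. `RingBuffer` and `tryPush` are hypothetical names, and `sysaudio.convert` uses the assumed signature from the example above; none of this is an existing API.

```
// Sketch only: assumes the proposed sysaudio.convert API, plus a hypothetical
// single-producer/single-consumer ring buffer type owned by the application.
const Ring = RingBuffer(f32, 64 * 1024); // hypothetical SPSC ring

fn readCallbackRing(ctx: *Ring, raw_audio: []const u8, recorder: sysaudio.Recorder) void {
    const num_samples = raw_audio.len / recorder.format.size();
    var samples: [16 * 1024]f32 = undefined;
    sysaudio.convert(f32, samples[0..num_samples], raw_audio);

    // Non-blocking push: if the consumer thread falls behind, drop samples
    // rather than stall the audio callback.
    _ = ctx.tryPush(samples[0..num_samples]);
}
```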
-fn writeCallback(_: ?*anyopaque, output: []u8) void {
+fn writeCallback(ctx: Context, raw_audio_out: []u8, player: sysaudio.Player) void {
    // replace player.write() with sysaudio.convert()
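    // For symmetry, a sketch of what the full write-side callback could look like
    // under this proposal. The f32 -> device-format direction of sysaudio.convert
    // and the generateAudio helper are assumptions, not an existing API:
    //
    //   fn writeCallback(ctx: Context, raw_audio_out: []u8, player: sysaudio.Player) void {
    //       _ = ctx;
    //       const num_samples = raw_audio_out.len / player.format.size();
    //
    //       // Produce f32 samples (synth, mixer output, ...), then convert them
    //       // into the device's native format directly into raw_audio_out.
    //       var samples: [16 * 1024]f32 = undefined;
    //       generateAudio(samples[0..num_samples]); // hypothetical user function
    //       sysaudio.convert(player.format, raw_audio_out, samples[0..num_samples]);
    //   }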

Notes:

  • _: ?*anyopaque parameter is replaced by a typed generic context parameter. The user can decide this type, and ctx: void would be a valid choice. They would need to pass this type into the player create API or similar.
  • input: []const u8 is replaced by raw_audio: []const u8 to hint that it is raw audio in the device's native format, whatever that may be.
  • recorder.read is replaced by sysaudio.convert to make it super clear that the function is converting samples for you.
  • recorder: sysaudio.Recorder is now a parameter to readCallback, and player to writeCallback.
    • This gives the callback access to recorder.channels, recorder.format.size(), etc.
  • Use num_samples instead of frames; "frames" has a specific meaning in audio processing. 1 sample == 1 sample, but 1 frame == multiple samples (one for each channel). Don't confuse the two.
  • The user should be able to request interleaved or planar format when creating a device, and sysaudio should do that conversion internally per-backend as needed.
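
To make the interleaved (`ABABAB`) vs planar (`AAABBB`) distinction concrete, here is a small self-contained sketch of the kind of conversion sysaudio would perform internally per-backend. This is illustrative only, not actual sysaudio code:

```
const std = @import("std");

/// Convert interleaved samples (ABABAB...) to planar (AAA...BBB...).
fn interleavedToPlanar(comptime T: type, dst: []T, src: []const T, channels: usize) void {
    const frames = src.len / channels; // 1 frame == one sample per channel
    for (0..frames) |f| {
        for (0..channels) |c| {
            dst[c * frames + f] = src[f * channels + c];
        }
    }
}

test "stereo interleaved to planar" {
    const src = [_]f32{ 1, 10, 2, 20, 3, 30 }; // L/R pairs
    var dst: [6]f32 = undefined;
    interleavedToPlanar(f32, &dst, &src, 2);
    try std.testing.expectEqualSlices(f32, &[_]f32{ 1, 2, 3, 10, 20, 30 }, &dst);
}
```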
ulzu commented 2023-11-05 19:59:40 +00:00 (Migrated from github.com)

Do you mean locking in a specific type to the callback function? Why not have a gen function so the user could choose whatever context type she wants, whether a recorder or something else? (Add a flag to generate a function signature with a Player and we have a choice between all variants)

Your proposal would lock out the library from use in audio dev.

alichraghi commented 2023-11-05 21:03:01 +00:00 (Migrated from github.com)

@plaukiu ctx: Context is a generic type specified at createPlayer/createRecorder. Here is an example:

const MyContext = struct {
   data: [4]u8 = undefined,
};

fn main() !void {
    var ctx: MyContext = .{};
    var player = try sysaudio.createPlayer(*MyContext, &ctx, .{ .writeFn = writeCallback });
}

fn writeCallback(ctx: *MyContext, raw_audio_out: []u8, player: sysaudio.Player) void {
   // do something with ctx.data
}