Version: Next

`src/sydra/storage/object_store.zig`

Purpose

Implements a content-addressed object store (Git-inspired):

Objects are addressed by a BLAKE3 hash of (type, payload).
Loose objects are stored under objects/<prefix>/<hex>.
Packed objects are stored under objects/packs/*.pack with objects/packs/*.idx fanout indexes.

Public types

`pub const ObjectType = enum(u8)`

blob = 1
tree = 2
commit = 3
ref = 4

`pub const ObjectId`

hash: [32]u8
pub fn toHex(self) [64]u8 – lower-hex encoding

ObjectId.toHex (from src/sydra/storage/object_store.zig)
pub const ObjectId = struct {
    hash: [32]u8,

    pub fn toHex(self: ObjectId) [64]u8 {
        var out: [64]u8 = undefined;
        _ = std.fmt.bufPrint(&out, "{s}", .{std.fmt.fmtSliceHexLower(self.hash[0..])}) catch unreachable;
        return out;
    }
};
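
To illustrate how the hex form maps onto the loose-object layout objects/<first_byte_hex>/<object_id_hex>, here is a minimal sketch that formats that path into a caller-supplied buffer. `loosePath` is a hypothetical helper, not a function from object_store.zig, and it uses `std.fmt.bytesToHex` instead of the `bufPrint` call in the excerpt above.

```zig
const std = @import("std");

/// Hypothetical helper: formats "objects/<prefix>/<hex>" into `buf`.
/// Assumes <prefix> is the first hash byte written as two lower-hex characters.
fn loosePath(buf: []u8, hash: [32]u8) ![]const u8 {
    const hex = std.fmt.bytesToHex(hash, .lower); // [64]u8
    return std.fmt.bufPrint(buf, "objects/{s}/{s}", .{ hex[0..2], &hex });
}

test "loose path uses the first byte as the directory name" {
    var hash = [_]u8{0} ** 32;
    hash[0] = 0xab;

    var buf: [80]u8 = undefined;
    const path = try loosePath(&buf, hash);

    // "objects/" (8) + prefix (2) + "/" (1) + hex id (64) = 75 bytes.
    try std.testing.expectEqual(@as(usize, 75), path.len);
    try std.testing.expect(std.mem.startsWith(u8, path, "objects/ab/ab00"));
}
```

Splitting on the first byte keeps any single directory from accumulating every loose object, which is the same idea Git uses for its loose-object directories.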

`pub const LoadedObject`

id: ObjectId
obj_type: ObjectType
payload: []u8 – slice referencing the read buffer returned by get

`pub const ObjectStore`

Fields:

allocator: std.mem.Allocator
root: std.fs.Dir
fsync: cfg.FsyncPolicy

Public API

`pub fn init(allocator, path: []const u8, fsync) !ObjectStore`

Ensures path/ exists.
Creates objects/, objects/packs/, objects/info/, and refs/ under path/.
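
The directory setup can be pictured with the sketch below. `ensureLayout` is a hypothetical stand-in written for this page; whether `init` itself uses `makePath` or another call is an assumption.

```zig
const std = @import("std");

/// Hypothetical stand-in for the directory setup performed by init.
/// Creates the four subdirectories listed above under `root`.
fn ensureLayout(root: std.fs.Dir) !void {
    const subdirs = [_][]const u8{ "objects", "objects/packs", "objects/info", "refs" };
    for (subdirs) |sub| try root.makePath(sub);
}

test "layout is created and a second call is harmless" {
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();

    try ensureLayout(tmp.dir);
    try ensureLayout(tmp.dir); // makePath accepts directories that already exist

    // Opening a nested directory confirms that it was created.
    var packs = try tmp.dir.openDir("objects/packs", .{});
    packs.close();
}
```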

`pub fn put(self, obj_type: ObjectType, payload: []const u8) !ObjectId`

Hashes (obj_type, payload) via BLAKE3.
Stores objects under objects/<first_byte_hex>/<object_id_hex>.
Prefixes the payload with a 5-byte header, which caps a single payload at 2^32 - 1 bytes:
- [type:u8][payload_len:u32 little-endian]
If the object already exists, returns the id without rewriting.

put() header + payload write (excerpt)
var header = [_]u8{ @intFromEnum(obj_type), 0, 0, 0, 0 };
const payload_len: u32 = @intCast(payload.len);
std.mem.writeInt(u32, header[1..5], payload_len, .little);

try file.writeAll(&header);
try file.writeAll(payload);
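
A worked example makes the byte layout concrete. The sketch below wraps the header construction from the excerpt in a hypothetical `encodeHeader` function and checks the bytes for a 300-byte blob. It assumes a Zig release where `std.mem.writeInt` takes `.little`, as in the excerpt.

```zig
const std = @import("std");

const ObjectType = enum(u8) { blob = 1, tree = 2, commit = 3, ref = 4 };

/// Hypothetical helper: builds [type:u8][payload_len:u32 little-endian].
fn encodeHeader(obj_type: ObjectType, payload_len: u32) [5]u8 {
    var header = [_]u8{ @intFromEnum(obj_type), 0, 0, 0, 0 };
    std.mem.writeInt(u32, header[1..5], payload_len, .little);
    return header;
}

test "header bytes for a 300-byte blob" {
    const header = encodeHeader(.blob, 300);
    // 300 = 0x0000012C, so the little-endian length bytes are 2C 01 00 00.
    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 0x2C, 0x01, 0x00, 0x00 }, &header);
}
```

Because the length field is a `u32`, the `@intCast` in the excerpt trips a safety check in safety-checked builds when `payload.len` exceeds 4294967295.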

`pub fn get(self, allocator, id: ObjectId) !LoadedObject`

Resolves loose objects first, then packed objects through any matching .idx file in objects/packs/.
Validates the header and payload length.
For packed objects, validates the record id and recomputes the object hash from (type, payload).
Returns a LoadedObject whose payload is a slice into the buffer that get allocated for the read.

Callers must free loaded.payload using the same allocator passed to get.

get() header validation (excerpt)
const stat = try file.stat();
if (stat.size < 5) return error.CorruptObject;

var buffer = try allocator.alloc(u8, stat.size);
errdefer allocator.free(buffer);
// readAll returns the number of bytes read; Zig requires the value to be used or discarded.
_ = try file.readAll(buffer);

const obj_type = std.meta.intToEnum(ObjectType, buffer[0]) catch return error.UnknownObjectType;
const payload_len = std.mem.readInt(u32, buffer[1..5], .little);
if (payload_len > buffer[5..].len) return error.CorruptObject;

const payload = buffer[5 .. 5 + payload_len];

`pub fn writePack(self, allocator, ids: []const ObjectId) !PackWriteResult`

Writes a whole-object pack for the provided ids.
Sorts object ids lexicographically by BLAKE3 hash.
Emits:
- objects/packs/pack-<timestamp>.pack
- objects/packs/pack-<timestamp>.idx
The .idx file contains:
- a 256-entry cumulative fanout table by first hash byte
- sorted object ids
- 64-bit record offsets into the .pack
- the packed file size
- a BLAKE3 checksum of the pack file
- a trailing BLAKE3 checksum of the index contents
After the new pack lands, older pack files are pruned and redundant loose copies for the packed ids are removed.
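
The sort and fanout steps can be sketched in memory as follows. This is an illustration under stated assumptions, not the on-disk format: it assumes the Git-style convention that fanout entry `b` holds the number of ids whose first byte is less than or equal to `b`, it uses `u32` entries, and `Id`, `buildFanout`, `find`, and `idWith` are hypothetical names.

```zig
const std = @import("std");

const Id = [32]u8;

fn lessThan(_: void, a: Id, b: Id) bool {
    return std.mem.order(u8, &a, &b) == .lt;
}

/// Assumed convention: fanout[b] = number of ids whose first byte is <= b.
fn buildFanout(sorted: []const Id) [256]u32 {
    var fanout = [_]u32{0} ** 256;
    for (sorted) |id| fanout[id[0]] += 1; // per-byte counts
    var running: u32 = 0;
    for (&fanout) |*slot| { // convert counts into cumulative totals
        running += slot.*;
        slot.* = running;
    }
    return fanout;
}

/// Binary search restricted to the bucket selected by the first byte.
fn find(sorted: []const Id, fanout: [256]u32, target: Id) ?usize {
    const first = target[0];
    var lo: usize = if (first == 0) 0 else fanout[first - 1];
    var hi: usize = fanout[first];
    while (lo < hi) {
        const mid = lo + (hi - lo) / 2;
        switch (std.mem.order(u8, &sorted[mid], &target)) {
            .eq => return mid,
            .lt => lo = mid + 1,
            .gt => hi = mid,
        }
    }
    return null;
}

/// Test helper: an id that is all zeros except the first and last byte.
fn idWith(first: u8, last: u8) Id {
    var id = [_]u8{0} ** 32;
    id[0] = first;
    id[31] = last;
    return id;
}

test "fanout narrows the search to one bucket" {
    var ids = [_]Id{ idWith(0x7f, 2), idWith(0x00, 9), idWith(0x7f, 1), idWith(0xff, 0) };
    std.mem.sort(Id, &ids, {}, lessThan);
    const fanout = buildFanout(&ids);

    try std.testing.expectEqual(@as(u32, 3), fanout[0x7f]);
    try std.testing.expectEqual(@as(?usize, 2), find(&ids, fanout, idWith(0x7f, 2)));
    try std.testing.expect(find(&ids, fanout, idWith(0x10, 0)) == null);
}
```

With a cumulative table, the last entry equals the total object count and each lookup touches only the ids that share the target's first byte.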

ObjectId hashing (excerpt)
fn hash(obj_type: ObjectType, payload: []const u8) ObjectId {
    var hasher = std.crypto.hash.blake3.Blake3.init(.{});
    hasher.update(&[_]u8{@intFromEnum(obj_type)});
    hasher.update(payload);
    var out: [32]u8 = undefined;
    hasher.final(out[0..]);
    return .{ .hash = out };
}
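
Because the type byte is hashed ahead of the payload, the same bytes stored under two types receive different ids, and a reader can detect corruption by rehashing. The sketch below demonstrates both points. `hashObject` and `verify` are hypothetical names, and the code refers to the hasher as `std.crypto.hash.Blake3`, which is assumed to name the same hasher as the excerpt.

```zig
const std = @import("std");

const ObjectType = enum(u8) { blob = 1, tree = 2, commit = 3, ref = 4 };

/// Hashes the type byte followed by the payload, mirroring the excerpt above.
fn hashObject(obj_type: ObjectType, payload: []const u8) [32]u8 {
    var hasher = std.crypto.hash.Blake3.init(.{});
    hasher.update(&[_]u8{@intFromEnum(obj_type)});
    hasher.update(payload);
    var out: [32]u8 = undefined;
    hasher.final(&out);
    return out;
}

/// Hypothetical check: recompute the id and compare it with the expected one.
fn verify(expected: [32]u8, obj_type: ObjectType, payload: []const u8) !void {
    const actual = hashObject(obj_type, payload);
    if (!std.mem.eql(u8, &expected, &actual)) return error.CorruptObject;
}

test "type byte separates ids and rehashing detects changes" {
    const blob_id = hashObject(.blob, "hello");
    const tree_id = hashObject(.tree, "hello");
    try std.testing.expect(!std.mem.eql(u8, &blob_id, &tree_id));

    try verify(blob_id, .blob, "hello");
    try std.testing.expectError(error.CorruptObject, verify(blob_id, .blob, "hellp"));
}
```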

Tests

test "object store write/read round-trip"
test "object store can read packed objects after loose copies are pruned"
test "object store rejects corrupt pack indexes"
