huskies/server/src/crdt_sync/mod.rs
//! CRDT sync: WebSocket-based replication of pipeline state between huskies nodes.
//!
//! # Protocol
//!
//! ## Version negotiation
//!
//! After the auth handshake, both sides send their first sync message:
//!
//! - **v2 peers** send a `clock` frame: `{"type":"clock","clock":{ <node_id_hex>: <max_count>, ... }}`
//!   containing a vector clock that maps each author's hex Ed25519 pubkey to the
//!   count of ops received from that author. Upon receiving the peer's clock,
//!   each side computes the delta via [`crdt_state::ops_since`] and sends only
//!   the missing ops as a `bulk` frame.
//!
//! - **v1 (legacy) peers** send a `bulk` frame directly (full op dump).
//!   A v2 peer that receives a `bulk` first (instead of a `clock`) falls back to
//!   the full-dump path: it applies the incoming bulk and responds with its own
//!   full bulk. This preserves backward compatibility; no code change is needed
//!   on the v1 side.
//!
//! ## Text frames
//!
//! A JSON object with a `"type"` field:
//!
//! - `{"type":"clock","clock":{...}}`: vector clock (v2 protocol).
//! - `{"type":"bulk","ops":[...]}`: ops dump (full or delta).
//! - `{"type":"ready"}`: signals that the bulk-delta phase is complete and the
//!   sender is ready for real-time op streaming. Locally generated ops are
//!   buffered until the peer's `ready` is received, then flushed in order.
//!
//! ## Binary frames (real-time op broadcast)
//!
//! Individual `SignedOp`s encoded via [`crate::crdt_wire`] (versioned JSON
//! envelope: `{"v":1,"op":{...}}`). Each locally applied op is immediately
//! broadcast as a binary frame to all connected peers.
//!
//! Both the server endpoint and the rendezvous client use the same protocol,
//! so the connection is fully symmetric.
//!
//! ## Backpressure
//!
//! Each connected peer has its own [`tokio::sync::broadcast`] receiver. If a
//! slow peer falls behind far enough that the channel overwrites messages it
//! has not yet read (reported to the receiver as a `Lagged` error), the
//! connection is dropped with a warning log. The peer can reconnect and
//! receive a fresh bulk state dump to catch up.
// ── Cross-cutting constants ─────────────────────────────────────────
// ── Auth configuration ──────────────────────────────────────────────
/// Default timeout for the auth handshake (seconds).
pub(super) const AUTH_TIMEOUT_SECS: u64 = 10;
// ── Keepalive configuration ─────────────────────────────────────────
/// Interval (seconds) between WebSocket Ping frames sent by each side.
pub const PING_INTERVAL_SECS: u64 = 30;
/// Seconds without a Pong response before the connection is dropped.
pub const PONG_TIMEOUT_SECS: u64 = 60;
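// Illustrative sketch, not part of the original module: one way the ping
// interval and pong timeout above could drive a keepalive tick. The names
// `KeepaliveAction` and `keepalive_action` are hypothetical; the real loop
// presumably lives in the `client` / `server` sub-modules and may differ.
// The interval and timeout are passed in as parameters so the sketch stands
// alone.

/// What a keepalive tick should do next (hypothetical helper type).
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum KeepaliveAction {
    /// Nothing to do yet.
    Idle,
    /// The ping interval has elapsed: send another WebSocket Ping.
    SendPing,
    /// No Pong arrived within the timeout: drop the connection.
    Disconnect,
}

/// Decide the next keepalive step from the time elapsed since the last Ping
/// was sent and since the last Pong was received. The timeout check comes
/// first so that a dead peer is dropped rather than pinged again.
#[allow(dead_code)]
fn keepalive_action(
    since_last_ping: std::time::Duration,
    since_last_pong: std::time::Duration,
    ping_interval_secs: u64,
    pong_timeout_secs: u64,
) -> KeepaliveAction {
    if since_last_pong.as_secs() >= pong_timeout_secs {
        KeepaliveAction::Disconnect
    } else if since_last_ping.as_secs() >= ping_interval_secs {
        KeepaliveAction::SendPing
    } else {
        KeepaliveAction::Idle
    }
}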
// ── Sub-modules ─────────────────────────────────────────────────────
mod auth;
mod client;
mod dispatch;
mod handshake;
mod server;
mod wire;
// ── Public API re-exports ───────────────────────────────────────────
pub use auth::{add_join_token, init_token_auth, init_trusted_keys};
pub(crate) use client::connect_and_sync;
pub use client::{RENDEZVOUS_ERROR_THRESHOLD, spawn_rendezvous_client};
pub use server::crdt_sync_handler;
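// Illustrative sketch, not part of the original module: the module docs say
// that locally generated ops are buffered until the peer's `ready` frame is
// received and then flushed in order. `ReadyGate` is a hypothetical name for
// a minimal structure with that behaviour; the actual implementation in the
// sub-modules may be organised differently.

/// Holds back outgoing ops until the peer has signalled `ready`.
#[allow(dead_code)]
#[derive(Debug)]
struct ReadyGate<T> {
    /// Set once the peer's `ready` frame has been seen.
    peer_ready: bool,
    /// Ops generated locally while the peer was not yet ready, oldest first.
    pending: std::collections::VecDeque<T>,
}

#[allow(dead_code)]
impl<T> ReadyGate<T> {
    /// Start in the "peer not ready" state with an empty buffer.
    fn new() -> Self {
        Self {
            peer_ready: false,
            pending: std::collections::VecDeque::new(),
        }
    }

    /// Offer a locally generated op. Returns `Some(op)` if it may be sent
    /// right away, or `None` if it was buffered for later.
    fn push(&mut self, op: T) -> Option<T> {
        if self.peer_ready {
            Some(op)
        } else {
            self.pending.push_back(op);
            None
        }
    }

    /// Record the peer's `ready` frame and return the buffered ops in the
    /// order they were generated, leaving the buffer empty.
    fn on_peer_ready(&mut self) -> Vec<T> {
        self.peer_ready = true;
        self.pending.drain(..).collect()
    }
}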
// Test-only re-export used by `crdt_snapshot` tests.
#[cfg(test)]
pub(crate) use wire::SyncMessagePublic;
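// Illustrative sketch, not part of the original module: how a v2 peer might
// compute the delta to send after receiving the other side's vector clock.
// The real computation is done by `crdt_state::ops_since`, whose signature is
// not shown in this file; `SketchOp` and `delta_for_peer` are hypothetical
// stand-ins. The sketch assumes each author's ops carry contiguous, 1-based
// sequence numbers, so a clock entry equals the highest sequence number the
// peer has seen from that author.

/// Minimal stand-in for a signed op: author (hex pubkey) plus sequence number.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct SketchOp {
    author: String,
    seq: u64,
}

/// Return the local ops the peer is missing according to its vector clock.
/// Authors absent from the clock count as "nothing received yet" (0).
#[allow(dead_code)]
fn delta_for_peer(
    local_ops: &[SketchOp],
    peer_clock: &std::collections::HashMap<String, u64>,
) -> Vec<SketchOp> {
    local_ops
        .iter()
        .filter(|op| {
            let seen = peer_clock.get(&op.author).copied().unwrap_or(0);
            op.seq > seen
        })
        .cloned()
        .collect()
}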