//! CRDT sync: WebSocket-based replication of pipeline state between huskies nodes.
//!
//! # Protocol
//!
//! ## Version negotiation
//!
//! After the auth handshake, both sides send their first sync message:
//!
//! - **v2 peers** send a `clock` frame:
//!   `{"type":"clock","clock":{ <node_id_hex>: <max_count>, ... }}`.
//!   The frame carries a vector clock that maps each author's hex Ed25519
//!   pubkey to the count of ops received from that author. Upon receiving the
//!   peer's clock, each side computes the delta via [`crdt_state::ops_since`]
//!   and sends only the missing ops as a `bulk` frame.
//!
//! - **v1 (legacy) peers** send a `bulk` frame directly (full op dump).
//!   A v2 peer that receives a `bulk` first (instead of a `clock`) falls back
//!   to the full-dump path: it applies the incoming bulk and responds with its
//!   own full bulk. This preserves backward compatibility, so no code change
//!   is needed on the v1 side.
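//!
//! The delta step can be pictured with a minimal, self-contained sketch. It is
//! an illustration only and not the body of [`crdt_state::ops_since`]: the
//! `Op` struct, its `seq` field (assumed here to be a 1-based per-author
//! counter), and `delta_for_peer` are hypothetical names introduced for this
//! example.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical stand-in for a signed op: author key plus a per-author
//! /// sequence number (assumed to start at 1).
//! #[derive(Debug, Clone, PartialEq)]
//! struct Op {
//!     author: String,
//!     seq: u64,
//! }
//!
//! /// Returns the local ops the peer has not seen yet, given the peer's
//! /// vector clock. An author missing from the clock counts as 0 ops seen.
//! fn delta_for_peer(local_ops: &[Op], peer_clock: &HashMap<String, u64>) -> Vec<Op> {
//!     local_ops
//!         .iter()
//!         .filter(|op| op.seq > peer_clock.get(&op.author).copied().unwrap_or(0))
//!         .cloned()
//!         .collect()
//! }
//! ```
//!
//! Under these assumptions, a node holding ops `aa#1`, `aa#2`, and `bb#1` that
//! receives the clock `{"aa": 1}` would send `aa#2` and `bb#1`; an empty clock
//! would select every op, which matches the full-dump case.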
//!
//! ## Text frames
//! Each text frame is a JSON object with a `"type"` field:
//! - `{"type":"clock","clock":{...}}` — Vector clock (v2 protocol).
//! - `{"type":"bulk","ops":[...]}` — Ops dump (full or delta).
//! - `{"type":"ready"}` — Signals that the bulk-delta phase is complete and the
//!   sender is ready for real-time op streaming. Locally-generated ops are
//!   buffered until the peer's `ready` is received, then flushed in order.
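//!
//! The buffering rule for `ready` can be sketched as a small gate. This is an
//! illustration under assumptions, not the code in this module: `ReadyGate`,
//! `submit`, and `on_peer_ready` are hypothetical names.
//!
//! ```rust
//! use std::collections::VecDeque;
//!
//! /// Hypothetical gate that holds locally-generated ops until the peer has
//! /// signalled `ready`.
//! struct ReadyGate<T> {
//!     peer_ready: bool,
//!     pending: VecDeque<T>,
//! }
//!
//! impl<T> ReadyGate<T> {
//!     fn new() -> Self {
//!         Self { peer_ready: false, pending: VecDeque::new() }
//!     }
//!
//!     /// Offers a local op. Returns it for immediate sending once the peer
//!     /// is ready; otherwise queues it and returns `None`.
//!     fn submit(&mut self, op: T) -> Option<T> {
//!         if self.peer_ready {
//!             Some(op)
//!         } else {
//!             self.pending.push_back(op);
//!             None
//!         }
//!     }
//!
//!     /// Call when the peer's `ready` frame arrives. Returns the buffered
//!     /// ops in the order they were submitted.
//!     fn on_peer_ready(&mut self) -> Vec<T> {
//!         self.peer_ready = true;
//!         self.pending.drain(..).collect()
//!     }
//! }
//! ```
//!
//! A `VecDeque` is used here only to make the first-in, first-out flush order
//! explicit; any ordered buffer would serve the same purpose.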
//!
//! ## Binary frames (real-time op broadcast)
//! Each binary frame carries one `SignedOp` encoded via [`crate::crdt_wire`]
//! (versioned JSON envelope: `{"v":1,"op":{...}}`). Each locally-applied op is
//! immediately broadcast as a binary frame to all connected peers.
//!
//! Both the server endpoint and the rendezvous client use the same protocol,
//! making the connection fully symmetric.
//!
//! ## Backpressure
//! Each connected peer has its own [`tokio::sync::broadcast`] receiver. If a
//! slow peer falls far enough behind that the channel overwrites messages it
//! has not yet read (reported as a `Lagged` error), the connection is dropped
//! with a warning log. The peer can reconnect and receive a fresh bulk state
//! dump to catch up.

// ── Cross-cutting constants ─────────────────────────────────────────

// ── Auth configuration ──────────────────────────────────────────────

/// Default timeout for the auth handshake (seconds).
pub(super) const AUTH_TIMEOUT_SECS: u64 = 10;

// ── Keepalive configuration ─────────────────────────────────────────

/// Interval (seconds) between WebSocket Ping frames sent by each side.
pub const PING_INTERVAL_SECS: u64 = 30;

/// Seconds without a Pong response before the connection is dropped.
pub const PONG_TIMEOUT_SECS: u64 = 60;

// ── Sub-modules ─────────────────────────────────────────────────────

mod auth;
mod client;
mod dispatch;
mod handshake;
mod server;
mod wire;

// ── Public API re-exports ───────────────────────────────────────────

pub use auth::{add_join_token, init_token_auth, init_trusted_keys};
pub(crate) use client::connect_and_sync;
pub use client::{RENDEZVOUS_ERROR_THRESHOLD, spawn_rendezvous_client};
pub use server::crdt_sync_handler;

// Test-only re-export used by `crdt_snapshot` tests.
#[cfg(test)]
pub(crate) use wire::SyncMessagePublic;