fix(1001): stop create_* from half-writing onto tombstoned IDs

Root cause: db::next_item_number scanned the visible CRDT index and the
content store but not the tombstone set, so it would hand out a numeric
ID whose CRDT entry had been tombstoned. crdt_state::write_item then
silently no-op'd the insert (tombstone-match guard) while the content
store and SQLite shadow happily accepted the row, producing a
split-brain half-write that was invisible to every CRDT-driven read
path and couldn't be cleaned up by delete_story / purge_story.
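
The failure mode described above can be reproduced in a small in-memory model. This is only a sketch: `ToyStores`, `next_item_number_buggy`, `create_buggy` and `half_write_demo` are hypothetical names, numeric `u64` keys stand in for the string story IDs, and plain maps stand in for the CRDT index, tombstone set, and content store / SQLite shadow.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the stores named in the commit message.
#[derive(Default)]
struct ToyStores {
    crdt_items: BTreeMap<u64, String>, // visible CRDT index
    tombstones: BTreeSet<u64>,         // IDs evicted from the CRDT
    content: BTreeMap<u64, String>,    // content store / SQLite shadow
}

impl ToyStores {
    /// Pre-fix allocator: scans visible items and content, never tombstones.
    fn next_item_number_buggy(&self) -> u64 {
        let max_crdt = self.crdt_items.keys().next_back().copied().unwrap_or(0);
        let max_content = self.content.keys().next_back().copied().unwrap_or(0);
        max_crdt.max(max_content) + 1
    }

    /// Tombstone-match guard: a write to an evicted ID is dropped without a trace.
    fn write_item(&mut self, id: u64, title: &str) {
        if self.tombstones.contains(&id) {
            return;
        }
        self.crdt_items.insert(id, title.to_string());
    }

    /// Pre-fix create path: both writes are issued, neither is verified.
    fn create_buggy(&mut self, title: &str) -> u64 {
        let id = self.next_item_number_buggy();
        self.write_item(id, title);
        self.content.insert(id, title.to_string());
        id
    }
}

/// Item 2 was created and later purged, leaving only its tombstone behind.
/// Returns (allocated id, present in CRDT index, present in content store).
fn half_write_demo() -> (u64, bool, bool) {
    let mut s = ToyStores::default();
    s.crdt_items.insert(1, "first".to_string());
    s.content.insert(1, "first".to_string());
    s.tombstones.insert(2);
    let id = s.create_buggy("new story");
    (id, s.crdt_items.contains_key(&id), s.content.contains_key(&id))
}
```

In this model the allocator reuses slot 2, the guarded write is dropped, and the content row is the only thing left behind, which is the half-write shape the message describes.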

This change closes the loop:

- crdt_state::read::{is_tombstoned, tombstoned_ids} expose the
  tombstone set so callers outside crdt_state can consult it.
- db::next_item_number now scans tombstoned_ids() too. The allocator
  skips past tombstoned numeric IDs instead of treating their slots as
  free.
- write_item logs a WARN when it rejects a write for a tombstoned ID
  (was silent). The warn is a tripwire: if the allocator ever lets one
  slip through again we'll see it in the log.
- create_item_in_backlog adds two defence-in-depth checks:
    (a) before any write, reject if the allocator returned a
        tombstoned ID;
    (b) after the writes, call read_item to confirm the CRDT entry
        materialised. If not, roll back the content-store + shadow-DB
        rows via db::delete_item and return Err.
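
The allocator skip and the two create-path checks listed above might look roughly like this in a simplified in-memory model. It is a sketch under assumptions, not the code in this commit: `ToyStores` and `write_and_verify` are hypothetical, `u64` keys replace the string story IDs, `eprintln!` stands in for the project's warn macro, and removing the content row stands in for db::delete_item.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model of the CRDT index, tombstone set, and content store.
#[derive(Default)]
struct ToyStores {
    crdt_items: BTreeMap<u64, String>,
    tombstones: BTreeSet<u64>,
    content: BTreeMap<u64, String>,
}

impl ToyStores {
    /// Allocator that also scans tombstones, so evicted slots are never reused.
    fn next_item_number(&self) -> u64 {
        let max_of = |k: Option<&u64>| k.copied().unwrap_or(0);
        let highest = max_of(self.crdt_items.keys().next_back())
            .max(max_of(self.content.keys().next_back()))
            .max(max_of(self.tombstones.iter().next_back()));
        highest + 1
    }

    /// Guarded CRDT write: rejects tombstoned IDs, now with a visible warning.
    fn write_item(&mut self, id: u64, title: &str) {
        if self.tombstones.contains(&id) {
            eprintln!("WARN write_item rejected for tombstoned id {id}");
            return;
        }
        self.crdt_items.insert(id, title.to_string());
    }

    /// Check (b): issue both writes, read the CRDT entry back, roll back on a miss.
    fn write_and_verify(&mut self, id: u64, title: &str) -> Result<(), String> {
        self.write_item(id, title);
        self.content.insert(id, title.to_string());
        if !self.crdt_items.contains_key(&id) {
            self.content.remove(&id); // rollback of the content/shadow row
            return Err(format!("CRDT entry for {id} did not materialise"));
        }
        Ok(())
    }

    /// Create path with check (a) before any write and check (b) after.
    fn create_item_in_backlog(&mut self, title: &str) -> Result<u64, String> {
        let id = self.next_item_number();
        if self.tombstones.contains(&id) {
            return Err(format!("allocator returned tombstoned id {id}"));
        }
        self.write_and_verify(id, title)?;
        Ok(id)
    }
}
```

With item 1 live and item 2 tombstoned, this model allocates 3 rather than 2. Calling `write_and_verify` directly on the tombstoned ID exercises the rollback path and leaves no content row behind.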

Regression tests cover the allocator skip, the is_tombstoned accessor,
and the create_item_in_backlog rollback path.

Out of scope for this commit:

- Recovery of the already-half-written items currently in the running
  pipeline (989, 1000, 1001). That is Stage 2/3 of the plan, handled
  separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -42,7 +42,8 @@ pub use presence::{
 };
 pub use read::{
     CrdtItemDump, CrdtStateDump, check_archived_deps_crdt, check_unmet_deps_crdt,
-    dep_is_archived_crdt, dep_is_done_crdt, dump_crdt_state, evict_item, read_all_items, read_item,
+    dep_is_archived_crdt, dep_is_done_crdt, dump_crdt_state, evict_item, is_tombstoned,
+    read_all_items, read_item, tombstoned_ids,
 };
 pub use state::{init, subscribe};
 pub use types::{
@@ -205,6 +205,37 @@ pub fn read_item(story_id: &str) -> Option<PipelineItemView> {
     extract_item_view(&state.crdt.doc.items[idx])
 }
 
+/// Return `true` if `story_id` has been tombstoned via `evict_item`.
+///
+/// Tombstoned IDs are invisible to `read_item` / `read_all_items` but `write_item`
+/// silently rejects writes to them. Callers that allocate fresh IDs (notably
+/// `db::next_item_number`) must consult this to avoid handing out a tombstoned ID,
+/// which would cause a silent half-write split-brain (bug 1001).
+pub fn is_tombstoned(story_id: &str) -> bool {
+    let Some(state_mutex) = get_crdt() else {
+        return false;
+    };
+    let Ok(state) = state_mutex.lock() else {
+        return false;
+    };
+    state.tombstones.contains(story_id)
+}
+
+/// Snapshot of all tombstoned story IDs.
+///
+/// Used by `db::next_item_number` to compute the highest tombstoned numeric ID
+/// so the allocator can skip past it. Returns an empty vec if the CRDT layer
+/// is not initialised.
+pub fn tombstoned_ids() -> Vec<String> {
+    let Some(state_mutex) = get_crdt() else {
+        return Vec::new();
+    };
+    let Ok(state) = state_mutex.lock() else {
+        return Vec::new();
+    };
+    state.tombstones.iter().cloned().collect()
+}
+
 /// Mark a story as deleted in the in-memory CRDT and persist a tombstone op.
 ///
/// This is the eviction primitive for story 521 — it lets external callers
@@ -244,7 +244,18 @@ pub fn write_item(
     // Reject any write (insert or update) for a tombstoned story_id.
     // This prevents a concurrent or late-arriving write from resurrecting
     // a story that was permanently deleted via evict_item.
+    //
+    // Historically this was a silent return, which caused split-brain
+    // half-writes when an upstream allocator (e.g. db::next_item_number)
+    // handed out a tombstoned numeric ID (bug 1001). We log a WARN here
+    // so the failure mode is visible; the structural fix is that callers
+    // who allocate fresh IDs must consult `is_tombstoned` first AND
+    // verify the write via `read_item` afterwards.
     if state.tombstones.contains(story_id) {
+        crate::slog_warn!(
+            "[crdt_state] write_item rejected for tombstoned story_id '{story_id}' \
+             (stage='{stage_str}'); caller should have skipped this ID"
+        );
         return;
     }
 