From 824989644935453b7484223504a826fcd092b3c6 Mon Sep 17 00:00:00 2001
From: dave
Date: Wed, 25 Mar 2026 13:58:06 +0000
Subject: [PATCH] storkit: create 392_refactor_extract_shared_transport_utilities_from_matrix_module_into_chat_submodule

---
 ...i_panics_on_multi_byte_utf_8_characters.md | 27 -------------------
 ...i_panics_on_multi_byte_utf_8_characters.md | 27 -------------------
 ..._from_matrix_module_into_chat_submodule.md | 26 ++++++++++++++++++
 3 files changed, 26 insertions(+), 54 deletions(-)
 delete mode 100644 .storkit/work/1_backlog/391_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
 delete mode 100644 .storkit/work/1_backlog/392_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
 create mode 100644 .storkit/work/1_backlog/392_refactor_extract_shared_transport_utilities_from_matrix_module_into_chat_submodule.md

diff --git a/.storkit/work/1_backlog/391_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md b/.storkit/work/1_backlog/391_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
deleted file mode 100644
index e5217d4..0000000
--- a/.storkit/work/1_backlog/391_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
+++ /dev/null
@@ -1,27 +0,0 @@
----
-name: "strip_prefix_ci panics on multi-byte UTF-8 characters"
----
-
-# Bug 391: strip_prefix_ci panics on multi-byte UTF-8 characters
-
-## Description
-
-strip_prefix_ci in commands/mod.rs slices text by byte offset using prefix.len(), which panics when the slice boundary falls inside a multi-byte UTF-8 character (e.g. right single quote U+2019, emojis). The function assumes ASCII-safe byte boundaries but real WhatsApp/Matrix messages contain Unicode.
-
-## How to Reproduce
-
-1. Send a message to the bot containing a smart quote or emoji within the first N bytes (where N = bot name length)\n2. e.g. "For now let\u2019s just deal with it" where the bot name prefix check slices at byte 12, inside the 3-byte \u2019 character
-
-## Actual Result
-
-Thread panics: "byte index 12 is not a char boundary; it is inside \u2018\u2019\u2019 (bytes 11..14)"
-
-## Expected Result
-
-The function should safely handle multi-byte UTF-8 without panicking. If the slice boundary isn't a char boundary, the prefix doesn't match — return None.
-
-## Acceptance Criteria
-
-- [ ] strip_prefix_ci does not panic on messages containing multi-byte UTF-8 characters (smart quotes, emojis, CJK, etc.)
-- [ ] Use text.get(..prefix.len()) or text.is_char_boundary() instead of direct indexing
-- [ ] Add test cases for messages with emojis and smart quotes
diff --git a/.storkit/work/1_backlog/392_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md b/.storkit/work/1_backlog/392_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
deleted file mode 100644
index 1072e95..0000000
--- a/.storkit/work/1_backlog/392_bug_strip_prefix_ci_panics_on_multi_byte_utf_8_characters.md
+++ /dev/null
@@ -1,27 +0,0 @@
----
-name: "strip_prefix_ci panics on multi-byte UTF-8 characters"
----
-
-# Bug 392: strip_prefix_ci panics on multi-byte UTF-8 characters
-
-## Description
-
-strip_prefix_ci in matrix/commands/mod.rs panics when slicing text at a byte offset that falls inside a multi-byte UTF-8 character (e.g. right single quote ' is 3 bytes). This affects all transports (WhatsApp, Slack, Matrix) since they all share try_handle_command → strip_bot_mention → strip_prefix_ci. The panic occurs at line 234: text[..prefix.len()] when prefix.len() is not a char boundary in the input text.
-
-## How to Reproduce
-
-1. Send a message to the WhatsApp bot containing multi-byte UTF-8 characters (e.g. right single quotes or emojis) where the bot name prefix length lands inside a multi-byte character\n2. Example: "For now let's just deal with it" where the ' (right single quote, bytes 11..14) gets sliced at byte 12
-
-## Actual Result
-
-Thread panics: byte index 12 is not a char boundary; it is inside ''' (bytes 11..14)
-
-## Expected Result
-
-The function should handle multi-byte UTF-8 characters gracefully without panicking. If the prefix length doesn't fall on a char boundary, the text doesn't match the prefix — return None.
-
-## Acceptance Criteria
-
-- [ ] strip_prefix_ci checks text.is_char_boundary(prefix.len()) before slicing and returns None if not on a boundary
-- [ ] Messages containing multi-byte UTF-8 characters (smart quotes, emojis, CJK, etc.) do not panic
-- [ ] All transports (WhatsApp, Slack, Matrix) are covered since they share the same code path
diff --git a/.storkit/work/1_backlog/392_refactor_extract_shared_transport_utilities_from_matrix_module_into_chat_submodule.md b/.storkit/work/1_backlog/392_refactor_extract_shared_transport_utilities_from_matrix_module_into_chat_submodule.md
new file mode 100644
index 0000000..488a8ac
--- /dev/null
+++ b/.storkit/work/1_backlog/392_refactor_extract_shared_transport_utilities_from_matrix_module_into_chat_submodule.md
@@ -0,0 +1,26 @@
+---
+name: "Extract shared transport utilities from matrix module into chat submodule"
+---
+
+# Refactor 392: Extract shared transport utilities from matrix module into chat submodule
+
+## Current State
+
+- TBD
+
+## Desired State
+
+Several functions currently living in the matrix transport module are used by all transports (WhatsApp, Slack, Matrix). These should be pulled up into a shared location under the chat module. Candidates include: strip_prefix_ci, strip_bot_mention, try_handle_command, drain_complete_paragraphs, markdown_to_whatsapp (pattern could generalize), chunk_for_whatsapp, and the command dispatch infrastructure. A chat::util or chat::text submodule would be a natural home for string utilities like strip_prefix_ci. The command dispatch (try_handle_command, CommandDispatch, BotCommand registry) could live in chat::commands.
+
+## Acceptance Criteria
+
+- [ ] Shared string utilities (strip_prefix_ci, strip_bot_mention, drain_complete_paragraphs) moved to a chat::util or chat::text submodule
+- [ ] Command dispatch infrastructure (try_handle_command, CommandDispatch, BotCommand, command registry) moved to chat::commands
+- [ ] Per-transport formatting functions (markdown_to_whatsapp, markdown_to_slack) remain in their respective transport modules
+- [ ] All transports import from the new shared location instead of reaching into matrix::
+- [ ] No functional changes — purely structural refactor
+- [ ] All existing tests pass and move with their code
+
+## Out of Scope
+
+- TBD
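
Reviewer note on the strip_prefix_ci bug reports removed by this patch: the boundary-safe behaviour they ask for could look roughly like the sketch below. This is only an illustration, not the project's code. The signature and return type are assumptions, since the real function in matrix/commands/mod.rs is not shown here, and the bot name "storkit-bot:" is a hypothetical 12-byte prefix chosen so that the slice would land at byte 12, inside the 3-byte U+2019 character.

```rust
/// Case-insensitive (ASCII) prefix strip that never panics on multi-byte input.
/// Hypothetical signature: the real `strip_prefix_ci` may differ.
fn strip_prefix_ci<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    // `str::get` returns None instead of panicking when the end of the range
    // is past the end of `text` or falls inside a multi-byte character.
    let head = text.get(..prefix.len())?;
    if head.eq_ignore_ascii_case(prefix) {
        // Safe to index: `get` already proved prefix.len() is a char boundary.
        Some(&text[prefix.len()..])
    } else {
        None
    }
}

/// The message from the bug reports, checked against a hypothetical
/// 12-byte bot name. Returns true when the call yields None without panicking.
fn smart_quote_case_is_none() -> bool {
    let msg = "For now let\u{2019}s just deal with it";
    strip_prefix_ci(msg, "storkit-bot:").is_none()
}
```

Using `text.get(..prefix.len())` covers both failure modes named in the reports (a prefix longer than the message and a boundary inside a character) with a single check, which is why it is used here rather than a separate `is_char_boundary` test.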