Commit Graph

8 Commits

Author SHA1 Message Date
9b8a8dd272 Remove Comment variant and old comment-mode parser — CST handles all
Delete from sx_types.ml:
- Comment of string variant (no longer needed)

Delete from sx_parser.ml:
- _preserve_comments mutable ref
- collect_comment_node function
- comment-mode branches in read_value, read_list
- ~comments parameter from parse_all and parse_file
- skip_whitespace and read_comment (only used by old comment mode)

Delete from mcp_tree.ml:
- has_interior_comments function
- Comment handling in pretty_print_value
- pretty_print_file function (replaced by CST write-back)
- ~comments parameter from local parse_file

Migrate sx_pretty_print, sx_write_file, sx_doc_gen to CST path.
Net: -69 lines. 24/24 CST round-trips and 2583/2583 evaluator tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 18:19:19 +00:00
5390df7b0b CST parser: lossless concrete syntax tree for .sx files
New sx_cst.ml: CstAtom, CstList, CstDict node types with leading/trailing
trivia (whitespace + comments). Two projections:
- cst_to_source/cst_file_to_source: exact source reconstruction
- cst_to_ast: strip trivia → Sx_types.value for evaluation

New parse_all_cst/parse_file_cst in sx_parser.ml: parallel CST parser
alongside existing AST parser. Reuses read_string, read_symbol, try_number.
Trivia collected via collect_trivia (replaces skip_whitespace_and_comments).

Round-trip invariant: cst_file_to_source(parse_all_cst(src)) = src
Verified on 13 synthetic tests + 7 real codebase files (101KB evaluator,
parser, primitives, render, tree-tools, engine, io).

CST→AST equivalence: cst_to_ast matches parse_all output.
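
The two projections described above can be sketched as follows. This is a simplified illustration, not the contents of sx_cst.ml: CstDict is omitted, the record field names (leading, token, close_trivia, trailing) and the trivia and ast types are assumptions, and ast stands in for Sx_types.value.

```ocaml
(* Hypothetical, simplified CST. Field names are assumptions. *)
type trivia =
  | Whitespace of string
  | LineComment of string  (* comment text without its newline *)

type node =
  | CstAtom of { leading : trivia list; token : string; trailing : trivia list }
  | CstList of { leading : trivia list; children : node list;
                 close_trivia : trivia list; trailing : trivia list }

let trivia_to_source ts =
  String.concat "" (List.map (function Whitespace s | LineComment s -> s) ts)

(* Projection 1: exact source reconstruction, trivia included. *)
let rec cst_to_source = function
  | CstAtom { leading; token; trailing } ->
    trivia_to_source leading ^ token ^ trivia_to_source trailing
  | CstList { leading; children; close_trivia; trailing } ->
    trivia_to_source leading ^ "("
    ^ String.concat "" (List.map cst_to_source children)
    ^ trivia_to_source close_trivia ^ ")"
    ^ trivia_to_source trailing

(* Stand-in for Sx_types.value. *)
type ast = Atom of string | Lst of ast list

(* Projection 2: drop all trivia, keep only structure. *)
let rec cst_to_ast = function
  | CstAtom { token; _ } -> Atom token
  | CstList { children; _ } -> Lst (List.map cst_to_ast children)

(* A hand-built CST for the source ";; greet\n(hello world)\n". *)
let example =
  CstList {
    leading = [LineComment ";; greet"; Whitespace "\n"];
    children = [
      CstAtom { leading = []; token = "hello"; trailing = [] };
      CstAtom { leading = [Whitespace " "]; token = "world"; trailing = [] };
    ];
    close_trivia = [];
    trailing = [Whitespace "\n"];
  }
```

Because every byte of the input lives either in a token or in a trivia list, concatenating them in order reproduces the source, which is what makes the round-trip invariant checkable by plain string equality.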

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 18:07:35 +00:00
38556af423 Interior comments, fragment comments, get_siblings + doc_gen comment support
Parser: read_value/read_list now capture Comment nodes inside lists
when ~comments:true. Module-level _preserve_comments ref threads the
flag through the recursive descent without changing signatures.

Pretty printer: has_interior_comments (recursive) forces multi-line
when any nested list contains comments. Comment nodes inside lists
emit as indented comment lines.

Edit tools: separate_comments strips interior comments recursively
via strip_interior_comments before passing to tree-tools (paths stay
correct). extract_fragment_comments parses new source with comments,
attaches leading comments to the target position in the comment map.

sx_get_siblings: injects comments for top-level siblings.

sx_doc_gen: parses with comments, tracks preceding Comment node,
includes cleaned comment text in generated component documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:00:56 +00:00
2e329f273a Preserve ;; comments through MCP tree edit round-trips
Parser gains Comment(string) AST variant and ~comments:true mode that
captures top-level ;; lines instead of discarding them. All MCP edit
tools (replace_node, insert_child, delete_node, wrap_node, rename_symbol,
replace_by_pattern, insert_near, rename_across, pretty_print, write_file)
now preserve comments: separate before tree-tools operate (so index paths
stay correct), re-interleave after editing, emit in pretty_print_file.

Default parse path (evaluator, runtime, compiler) is unchanged — comments
are still stripped unless explicitly requested.
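
A minimal sketch of the separate/re-interleave idea, assuming a flat top-level node list. The node type, the function names separate and reinterleave, and the (index, text) comment map are all hypothetical. The real tools operate on Sx_types.value.

```ocaml
(* Hypothetical top-level node: either a ;; comment or an expression. *)
type node = Comment of string | Expr of string

(* Pull comments out, remembering the index of the expression each one
   precedes, so that tree-tools see only expressions and index paths
   are unaffected by comments. *)
let separate (nodes : node list) : string list * (int * string) list =
  let exprs, comments, _ =
    List.fold_left
      (fun (es, cs, i) n ->
         match n with
         | Comment c -> (es, (i, c) :: cs, i)
         | Expr e -> (e :: es, cs, i + 1))
      ([], [], 0) nodes
  in
  (List.rev exprs, List.rev comments)

(* Put each comment back in front of the expression at its recorded
   index. Comments recorded at index = length are trailing comments. *)
let reinterleave (exprs : string list) (comments : (int * string) list)
  : node list =
  let at i =
    List.filter_map
      (fun (j, c) -> if i = j then Some (Comment c) else None)
      comments
  in
  let body = List.concat (List.mapi (fun i e -> at i @ [Expr e]) exprs) in
  body @ at (List.length exprs)

let nodes =
  [Comment ";; a"; Expr "(x)"; Comment ";; b"; Expr "(y)"; Comment ";; c"]
```

This sketch does not adjust the comment map when an edit inserts or deletes expressions. A real implementation would need to shift the recorded indices accordingly.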

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:35:44 +00:00
ebbdec8f4c Fix orchestration.sx parse error, add parser line/col diagnostics
The parser was reporting "Unexpected char: )" with no position info.
Added line number, column, and byte position to all parse errors.

Root cause: bind-sse-swap had one extra close paren. Naive paren
counting missed it because a "(" appears inside a string literal on
line 1074 (starts-with? trimmed "("). Parse-aware counting, which skips
strings and comments, correctly identified the imbalance.
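
The difference between the two counting strategies can be sketched as below. The function names are hypothetical, and the sketch assumes double-quoted strings with backslash escapes and comments that run from ; to end of line. It is not the diagnostic code added in this commit.

```ocaml
(* Naive count: every paren counts, even inside strings and comments. *)
let naive_balance (src : string) : int =
  let depth = ref 0 in
  String.iter
    (fun c ->
       if c = '(' then incr depth
       else if c = ')' then decr depth)
    src;
  !depth

(* Parse-aware count: parens inside "..." strings and ; comments are
   skipped. Returns 0 when balanced, negative on extra close parens. *)
let paren_balance (src : string) : int =
  let n = String.length src in
  let rec code i depth =
    if i >= n then depth
    else match src.[i] with
      | '(' -> code (i + 1) (depth + 1)
      | ')' -> code (i + 1) (depth - 1)
      | '"' -> str (i + 1) depth
      | ';' -> comment (i + 1) depth
      | _ -> code (i + 1) depth
  and str i depth =
    if i >= n then depth
    else match src.[i] with
      | '\\' -> str (i + 2) depth  (* skip the escaped character *)
      | '"' -> code (i + 1) depth
      | _ -> str (i + 1) depth
  and comment i depth =
    if i >= n then depth
    else if src.[i] = '\n' then code (i + 1) depth
    else comment (i + 1) depth
  in
  code 0 0

(* One extra close paren, masked by a "(" inside a string literal. *)
let broken = "(starts-with? trimmed \"(\"))"
```

On the broken input the naive count comes out balanced, because the "(" in the string cancels the extra ")", while the parse-aware count reports -1.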

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:29:28 +00:00
0c7567925e Align OCaml parser with spec/parser.sx character classification
Replace permissive is_symbol_char (negative check — everything not a
delimiter) with spec-compliant is_ident_start/is_ident_char (positive
check matching the exact character sets documented in parser.sx).

Changes:
- ident-start: remove extra chars (|, %, ^, $) not in spec
- ident-char: add comma (,) per spec
- Comma (,) now handled as dedicated unquote case in match, not in
  the catch-all fallback — matches spec dispatch order
- Remove ~@ splice-unquote alias (spec only defines ,@)
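
To illustrate the shift from a negative to a positive check, here is a sketch with placeholder character sets. The delimiter set and the punctuation accepted by is_ident_start are assumptions for illustration only and are not copied from spec/parser.sx. Only the treatment of |, and of the comma, follows the change list above.

```ocaml
(* Old style: negative check. Anything that is not a delimiter counts.
   The delimiter set is an assumption for illustration. *)
let is_delimiter c =
  match c with
  | ' ' | '\t' | '\n' | '\r'
  | '(' | ')' | '[' | ']' | '{' | '}' | '"' | ';' -> true
  | _ -> false

let is_symbol_char c = not (is_delimiter c)

(* New style: positive check against an explicit set.
   PLACEHOLDER set, not the one documented in parser.sx. *)
let is_ident_start c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' -> true
  | _ -> String.contains "_-+*/<>=!?&" c

(* Continuation characters: start characters, digits, and the comma. *)
let is_ident_char c =
  is_ident_start c
  || (match c with '0' .. '9' | ',' -> true | _ -> false)
```

With a positive check, a character such as | is rejected simply because it is not listed, whereas the negative check accepted it by default.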

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 10:13:03 +00:00
313f7d6be1 OCaml bootstrapper Phase 2: HTML renderer, SX server, Python bridge
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 23:28:48 +00:00
818e5d53f0 OCaml bootstrapper: transpiler compiles full CEK evaluator (61/61 tests)
SX-to-OCaml transpiler (transpiler.sx) generates sx_ref.ml (~90KB, ~135
mutually recursive functions) from the spec evaluator. Foundation tests
all pass: parser, primitives, env operations, type system.

Key design decisions:
- Env variant added to value type for CEK state dict storage
- Continuation carries optional data dict for captured frames
- Dynamic var tracking distinguishes OCaml fn calls from SX value dispatch
- Single let rec...and block for forward references between all defines
- Unused ref pre-declarations eliminated via let-bound name detection
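
The single let rec...and design can be illustrated with a toy evaluator. This is a hand-written sketch with a hypothetical value type and function names, not output of transpiler.sx. It shows why one block is enough for forward references between defines.

```ocaml
(* Toy value type, standing in for the real one. *)
type value = Num of int | Sym of string | Lst of value list

(* eval calls eval_sum before eval_sum is defined. Inside one
   let rec ... and block this needs no forward declaration or ref. *)
let rec eval (v : value) : int =
  match v with
  | Num n -> n
  | Sym _ -> 0
  | Lst (Sym "+" :: args) -> eval_sum args
  | Lst _ -> 0

and eval_sum (args : value list) : int =
  List.fold_left (fun acc a -> acc + eval a) 0 args

let result = eval (Lst [Sym "+"; Num 1; Lst [Sym "+"; Num 2; Num 3]])
```

Emitting every define into one such block means the generator does not have to order definitions by dependency.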

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 20:51:59 +00:00