HS: pick regex + indices (+13 tests)

Implements cluster 19 — pick command extensions for hs-upstream-pick suite
(11/24 → 24/24, +13):

- Parser:
  - pick items/item EXPR to EXPR supports `start` and `end` keywords
  - pick match / pick matches accept `| <flag>` syntax after regex
  - pick item N without `to` still works (single-item slice)
- Runtime:
  - hs-pick-items / hs-pick-first / hs-pick-last now handle strings
    (not just lists) via slice
  - hs-pick-items resolves `start`/`end` sentinel strings and negative
    indices (len + N) at runtime
  - hs-pick-matches added (wraps regex-find-all, each match as a list)
  - hs-pick-regex-pattern handles (list pat flags) form; the `i` flag
    makes the pattern case-insensitive by replacing each alphabetic
    character with a two-case class such as [aA] (Re.Pcre has no
    inline-flag support)
- Generator:
  - extract_hs_expr now decodes JS string escape sequences (\" -> ",
    \\ -> \) instead of stripping all backslashes, then re-escapes for
    SX. Preserves regex escapes (\d, \s), CSS escapes, and lambda `\`
    syntax for String.raw template literals while still producing
    correct output for regular JS strings.
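
The two runtime behaviours described above (sentinel/negative index resolution and the `i` flag expansion) can be sketched in Python. This is an illustration only: the real helpers live in the HS runtime, not in Python, and the names `resolve_index`, `pick_items`, and `expand_case_insensitive` are hypothetical. The sketch assumes an exclusive end index, clamps out-of-range negative indices to 0, and copies backslash escapes and existing `[...]` classes through unchanged; none of these details is confirmed by the commit.

```python
import re


def resolve_index(idx, length):
    """Map 'start'/'end' sentinels and negative indices to an offset."""
    if idx == "start":
        return 0
    if idx == "end":
        return length
    if idx < 0:
        # Negative index N counts from the end: len + N (clamped, by assumption).
        return max(0, length + idx)
    return idx


def pick_items(seq, start, end):
    """Slice a list or a string; the end index is assumed to be exclusive."""
    n = len(seq)
    return seq[resolve_index(start, n):resolve_index(end, n)]


def expand_case_insensitive(pattern):
    """Rewrite ASCII letters as two-case classes, e.g. a -> [aA]."""
    out = []
    i = 0
    in_class = False
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            # Copy escapes such as \d verbatim so they keep their meaning.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if c == "[":
            in_class = True
        elif c == "]":
            in_class = False
        if c.isascii() and c.isalpha() and not in_class:
            out.append("[" + c.lower() + c.upper() + "]")
        else:
            out.append(c)
        i += 1
    return "".join(out)


# Strings and lists go through the same slicing path.
assert pick_items([10, 20, 30, 40], "start", 2) == [10, 20]
assert pick_items("hello", -3, "end") == "llo"
assert re.fullmatch(expand_case_insensitive(r"hello \d+"), "HeLLo 42")
```

Expanding letters into classes avoids relying on an inline flag such as `(?i)`, at the cost of a longer pattern; letters inside an existing class are left case-sensitive in this sketch.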

Smoke (0-195): 170/195 unchanged (no regressions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:08:23 +00:00
parent b45a69b7a4
commit 4be90bf21f
2 changed files with 43 additions and 8 deletions


@@ -1768,14 +1768,49 @@ def _js_window_expr_to_sx(expr):
    return None


def _decode_js_escapes(s):
    """Decode JS string escape sequences.

    - \\" -> " (escaped quote)
    - \\' -> '
    - \\` -> `
    - \\\\ -> \\ (escaped backslash)
    - \\n, \\t are not decoded here; literal newlines and tabs are already
      normalized to spaces by the caller
    - Other \\X sequences (e.g. \\d for regex) are preserved literally,
      matching String.raw semantics for unknown escapes.
    """
    out = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == '\\' and i + 1 < len(s):
            nxt = s[i + 1]
            if nxt in ('"', "'", '`'):
                out.append(nxt)
                i += 2
                continue
            if nxt == '\\':
                out.append('\\')
                i += 2
                continue
            # Unknown escape (regex \d, CSS \:, lambda \): keep the backslash
            # here; the following character is appended on the next iteration.
            out.append(c)
            i += 1
            continue
        out.append(c)
        i += 1
    return ''.join(out)


def extract_hs_expr(raw):
    """Clean a HS expression extracted from run() call."""
    # Remove surrounding whitespace and newlines
    expr = raw.strip().replace('\n', ' ').replace('\t', ' ')
    # Collapse multiple spaces
    expr = re.sub(r'\s+', ' ', expr)
    # Escape backslashes (preserve regex escapes like \d, CSS escapes, lambda \)
    # then escape quotes for SX string.
    # Decode JS-level escape sequences while preserving regex/CSS/lambda
    # backslashes. \" -> ", \\ -> \, \d -> \d (unchanged).
    expr = _decode_js_escapes(expr)
    # Re-escape for SX string literal: backslashes, then quotes.
    expr = expr.replace('\\', '\\\\').replace('"', '\\"')
    return expr
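
To make the decode-then-re-escape behaviour concrete, here is a compact, self-contained sketch with a few worked inputs. It is not the code from this commit: `decode_js_escapes` and `to_sx_string_body` are hypothetical names, and the single regex substitution is assumed to match the character loop above (drop the backslash before a quote, backtick, or backslash; leave every other backslash alone).

```python
import re


def decode_js_escapes(s):
    # Drop the backslash before ", ', ` or another backslash; any other
    # backslash (regex \d, CSS \:, lambda \) is left in place.
    return re.sub(r'\\(["\'`\\])', r'\1', s)


def to_sx_string_body(s):
    # Decode JS-level escapes, then re-escape backslashes and double quotes
    # so the result can sit inside an SX string literal.
    decoded = decode_js_escapes(s)
    return decoded.replace('\\', '\\\\').replace('"', '\\"')


# A regular JS string writes the regex escape as \\d; String.raw writes \d.
# Both decode to \d and are therefore emitted to SX as \\d.
assert to_sx_string_body(r'\\d+') == r'\\d+'
assert to_sx_string_body(r'\d+') == r'\\d+'

# An escaped JS quote decodes to a bare quote, then is re-escaped for SX.
assert decode_js_escapes(r'say \"hi\"') == 'say "hi"'
assert to_sx_string_body(r'say \"hi\"') == r'say \"hi\"'
```

Both spellings of a regex escape converge on the same SX output because decoding happens before re-escaping; stripping every backslash up front would have reduced `\d` to a plain `d`.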