Refactor CodeProcessor to use a safe state machine and tokenize-based stripping, add Jinja {# #} support, and introduce unit tests with Makefile targets

- Added LanguageSpec dataclass and mapping for extensions
- Implemented a state machine for C/C++/JS comment stripping (comment markers inside string literals are left untouched)
- Improved Python comment/docstring removal using tokenize
- Added regex-based stripping for hash (#) and Jinja {# #} comments
- Added Makefile with test and install targets
- Added unit test suite under tests/unit covering Python, C-style, hash, and Jinja cases
- Added compress/decompress roundtrip test
- Added directory handler tests
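
For illustration, the state-machine approach to C-style comment stripping can be sketched roughly as follows. This is a minimal sketch, not the code in this commit: the function name `strip_c_comments` and the state names are hypothetical, and JS template literals and regex literals are not handled.

```python
def strip_c_comments(source: str) -> str:
    """Remove // and /* */ comments while leaving string and char literals intact.

    Hypothetical helper for illustration; not the CodeProcessor implementation.
    """
    out = []
    state = "code"  # one of: code, line_comment, block_comment, string, char
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == "/" and nxt == "/":
                state = "line_comment"
                i += 2
                continue
            if ch == "/" and nxt == "*":
                state = "block_comment"
                i += 2
                continue
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
            out.append(ch)
        elif state == "line_comment":
            # Drop everything up to the newline, but keep the newline itself.
            if ch == "\n":
                out.append(ch)
                state = "code"
        elif state == "block_comment":
            if ch == "*" and nxt == "/":
                state = "code"
                i += 2
                continue
        else:  # inside a string or char literal
            out.append(ch)
            if ch == "\\" and nxt:
                # Copy the escaped character verbatim so \" does not end the literal.
                out.append(nxt)
                i += 2
                continue
            quote = '"' if state == "string" else "'"
            if ch == quote:
                state = "code"
        i += 1
    return "".join(out)
```

Because the scanner tracks whether it is inside a literal, a `//` or `/*` that appears in a string is copied through unchanged, which is the case a plain regex tends to get wrong.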

See: https://chatgpt.com/share/68e0250f-40d4-800f-911d-2b4700246574
2025-10-03 21:34:02 +02:00
parent c5938cf482
commit b55576beb2
6 changed files with 484 additions and 46 deletions

__init__.py Normal file