Refactor web health checker & domain expectations (filter-based)

- Move all domain→expected-status mapping to filter `web_health_expectations`.
- Require explicit app selection via non-empty `group_names`; only those apps are included.
- Add `www_enabled` flag (wired via `WWW_REDIRECT_ENABLED`) to generate/force www.* → 301.
- Support `redirect_maps` to include manual redirects (sources forced to 301), independent of app selection.
- Aliases always 301; canonicals use per-key override or `server.status_codes.default`, else [200,302,301].
- Remove legacy fallbacks (`server.status_codes.home` / `landingpage`).
- Wire filter output into systemd ExecStart script as JSON expectations.
- Normalize various templates to use `to_json` and apply minor spacing fixes.
- Update app configs (e.g., YOURLS default=301; Confluence default=302; Bluesky web=405; MediaWiki/Confluence canonical/aliases).
- Constructor now uses `WWW_REDIRECT_ENABLED` for domain generation.
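
The resolution rules listed above can be sketched roughly as follows. This is only an illustration under assumptions, not the filter from this commit: the function name `expected_codes`, its parameters, and the input shapes (canonical domains as a key→domain dict, aliases and redirect sources as flat lists) are hypothetical, and the real `web_health_expectations` filter works on the full app configuration.

```python
from typing import Dict, List, Optional

FALLBACK_CODES = [200, 302, 301]  # fallback named in the commit message
REDIRECT_CODE = 301               # aliases, www.* and redirect sources


def expected_codes(
    canonical: Dict[str, str],
    aliases: List[str],
    status_codes: Dict[str, int],
    www_enabled: bool = False,
    redirect_sources: Optional[List[str]] = None,
) -> Dict[str, List[int]]:
    """Hypothetical helper; input shapes are assumptions, not the filter's API."""
    result: Dict[str, List[int]] = {}

    # Canonical domains: per-key override, then "default", then the fallback.
    for key, domain in canonical.items():
        code = status_codes.get(key, status_codes.get("default"))
        result[domain] = [int(code)] if code is not None else list(FALLBACK_CODES)

    # Aliases always redirect.
    for domain in aliases:
        result[domain] = [REDIRECT_CODE]

    # Assumed reading of "generate/force": add www.<domain> for every known
    # domain and force any existing www.* entry to 301.
    if www_enabled:
        for domain in list(result):
            target = domain if domain.startswith("www.") else f"www.{domain}"
            result[target] = [REDIRECT_CODE]

    # Manual redirect sources are included regardless of app selection.
    for source in redirect_sources or []:
        result[source] = [REDIRECT_CODE]

    return result
```

With `status_codes={"api": 405, "default": 302}`, a canonical domain keyed `api` would expect `[405]`, any other canonical `[302]`, and every alias, generated `www.*` name, and redirect source `[301]`.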

Tests:
- Add comprehensive unit tests for filter: selection by group, keyed/default codes, aliases, www handling, redirect_maps, input sanitization.
- Add unit tests for the standalone checker script (JSON parsing, OK/mismatch counting, sanitization).

See conversation: https://chatgpt.com/share/68c2b93e-de58-800f-8c16-ea05755ba776
2025-09-11 13:58:16 +02:00
parent 6418a462ec
commit cbfb096cdb
35 changed files with 717 additions and 106 deletions

@@ -0,0 +1,72 @@
#!/usr/bin/env python3
"""
Ultra-thin checker: consume a JSON mapping of {domain: [expected_status_codes]}
and verify HTTP HEAD responses. All mapping logic is done in the filter
`web_health_expectations`.
"""
import argparse
import json
import sys
from typing import Dict, List

import requests


def parse_args(argv=None):
    p = argparse.ArgumentParser(description="Web health checker (expects precomputed domain→codes mapping).")
    p.add_argument("--web-protocol", default="https", choices=["http", "https"], help="Protocol to use")
    p.add_argument("--expectations", required=True, help="JSON STRING: {\"domain\": [codes], ...}")
    return p.parse_args(argv)


def _parse_json_mapping(name: str, value: str) -> Dict[str, List[int]]:
    try:
        obj = json.loads(value)
    except json.JSONDecodeError as e:
        raise SystemExit(f"--{name} must be a valid JSON string: {e}")
    if not isinstance(obj, dict):
        raise SystemExit(f"--{name} must be a JSON object (mapping)")
    # sanitize list-of-ints shape
    clean = {}
    for k, v in obj.items():
        if isinstance(v, list):
            try:
                clean[k] = [int(x) for x in v]
            except Exception:
                clean[k] = []
        else:
            clean[k] = []
    return clean


def main(argv=None) -> int:
    args = parse_args(argv)
    expectations = _parse_json_mapping("expectations", args.expectations)
    errors = 0
    for domain in sorted(expectations.keys()):
        expected = expectations[domain] or []
        url = f"{args.web_protocol}://{domain}"
        try:
            r = requests.head(url, allow_redirects=False, timeout=10)
            if expected and r.status_code in expected:
                print(f"{domain}: OK")
            elif not expected:
                # If somehow empty list slipped through, treat as failure to be explicit
                print(f"{domain}: ERROR: No expectations provided. Got {r.status_code}.")
                errors += 1
            else:
                print(f"{domain}: ERROR: Expected {expected}. Got {r.status_code}.")
                errors += 1
        except requests.RequestException as e:
            print(f"{domain}: error due to {e}")
            errors += 1
    if errors:
        print(f"Warning: {errors} domains responded with an unexpected https status code.")
    return errors


if __name__ == "__main__":
    sys.exit(main())
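
One way to exercise the OK/mismatch counting without network access is to inject the HEAD call. The sketch below is not part of this commit: `count_mismatches`, `fake_head`, and `exit_status` are hypothetical names, and the loop only mirrors the branching in `main()` under the assumption that a failed request and an empty expectation list both count as errors, as they do above.

```python
from typing import Callable, Dict, List


def count_mismatches(
    expectations: Dict[str, List[int]],
    head: Callable[[str], int],
    protocol: str = "https",
) -> int:
    """Hypothetical variant of the loop in main(); `head` maps a URL to a status code."""
    errors = 0
    for domain in sorted(expectations):
        expected = expectations[domain] or []
        try:
            status = head(f"{protocol}://{domain}")
        except Exception:
            # Stand-in for requests.RequestException in the real script.
            errors += 1
            continue
        # An empty expectation list is treated as a failure, like in main().
        if not expected or status not in expected:
            errors += 1
    return errors


def exit_status(errors: int) -> int:
    """Optional clamp (an assumption, not in the commit): POSIX keeps only the
    low 8 bits of an exit status, so 256 errors would otherwise read as 0."""
    return min(errors, 255)


def fake_head(url: str) -> int:
    """Canned responses standing in for real HEAD requests."""
    responses = {
        "https://ok.example": 200,
        "https://moved.example": 301,
        "https://empty.example": 200,
    }
    if url not in responses:
        raise ConnectionError(url)
    return responses[url]


errors = count_mismatches(
    {
        "ok.example": [200],     # matches
        "moved.example": [200],  # mismatch: responds 301
        "empty.example": [],     # no expectations
        "down.example": [200],   # request fails
    },
    fake_head,
)
print(errors)  # 3
```

Passing the HEAD call in as a parameter keeps the test free of mocking libraries; the script in this commit calls `requests.head` directly, so its own unit tests would need to patch that call instead.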