The GitHub Actions DinD environment failed to start inner containers due to
cgroup v2 namespace isolation problems ('cannot enter cgroupv2 ... invalid state').
To resolve this, all docker run calls inside the CI workflow were updated
to include --cgroupns=host, ensuring the inner dockerd inherits the host
cgroup namespace instead of being sandboxed.
This aligns the CI runtime with the expectations of runc and prevents OCI-level
container creation failures.
Details and troubleshooting steps documented here:
https://chatgpt.com/share/6930e285-9604-800f-aad8-7a81c928548c
Includes:
- Rewrite of test-deploy workflow to use isolated inner dockerd with privileged mode.
- Switch logging drivers to 'json-file' when IS_CONTAINER=true for compatibility with non-systemd CI runners.
- Adjust Dockerfile to install docker CLI and simplify package setup.
- Improve inventory creation and deploy steps for CI stability.
- Fully compatible with Ansible 2.20 variable handling.
Conversation reference:
https://chatgpt.com/share/6930e285-9604-800f-aad8-7a81c928548c
- Introduce global IS_CONTAINER flag based on ansible_virtualization facts
- Skip systemd-based handlers and tasks when running inside containers
- Extend EXCLUDED_ROLES list in GitHub Actions test-deploy workflow
- Ensure docker.sock is mounted for all CI deploy stages
- Improve sys-svc-docker by suppressing service restarts inside containers
- Add meta: flush_handlers to properly trigger delayed docker restarts
- Update sys-service handlers with container guards
- Update sys-timer tasks to avoid systemctl inside CI containers
- Enhance drv-non-free role with Manjaro detection and mhwd fallback warning
- Skip swapfile generation in containers
- Minor service template fixes and cleanup in proxy.conf.j2
Details and discussion: https://chatgpt.com/share/6930a4ca-56f4-800f-9b3d-4791f040a03b
- Updated CLI argument parsing to use --exclude instead of --ignore.
- Adjusted help texts, comments, and error messages accordingly.
- Updated role filtering logic and references (include → exclude).
- Added new unit tests for parse_roles_list(), filter_inventory_by_include(), and filter_inventory_by_ignore().
- Improved wording and consistency in docstrings.
This change is part of the refactoring required for the Ansible 2.18 → 2.20 upgrade, ensuring naming clarity and avoiding confusion with Python's 'ignore' semantics.
Conversation reference: https://chatgpt.com/share/69307ef2-1fb4-800f-a2ec-d56020019269
- Replace legacy docker_container-based MariaDB deployment with docker-compose based workflow
- Add custom Dockerfile and docker-compose templates for MariaDB
- Split MariaDB command into separate arguments to avoid entrypoint parsing errors
- Introduce MARIADB_CUSTOM_IMAGE and MARIADB_EXPOSE_LOCAL variables
- Add docker_compose_flush_handlers to ensure correct handler execution on first run
- Replace utils/once/finalize.yml with utils/once/flag.yml for new run-once semantics
- Align variable naming with Infinito.Nexus UPPERCASE conventions
- Fix PostgreSQL custom image variable name (POSTGRES_CUSTOM_IMAGE_NAME → POSTGRES_CUSTOM_IMAGE)
- Remove obsolete flush_handlers var injection in svc-db-postgres/tasks/main.yml
- General cleanup after migration from Ansible 2.18 → 2.20
Conversation reference:
https://chatgpt.com/share/69306c81-9934-800f-b317-f53a8f246a73
This change sets ansible_python_interpreter to /usr/bin/python3 when including
01_core.yml. It avoids permission issues when Ansible runs module-based tasks
as the non-privileged AUR builder user, since the virtualenv Python binary is
not executable for that user.
Context and discussion:
https://chatgpt.com/share/6930230d-d7e0-800f-a5dc-67d7f75020e5
- Renamed test-cli.yml to test-code.yml and updated job name.
- Extended timeout for test-deploy workflow from 30 to 240 minutes.
- Skipped Ansible timezone configuration inside Docker/Podman/containerd to avoid write errors in CI.
- Added --skip-tests to the initial deploy step for improved CI stability.
Origin: https://chatgpt.com/share/69301c58-6628-800f-9e3a-f026c01b6e17
- Implement ensure_become_password() to handle explicit, generated, and existing become passwords
- Integrate VaultHandler for encrypted ansible_become_password storage
- Add CLI parameter --become-password to inventory creation workflow
- Ensure backwards compatibility: existing passwords remain untouched unless explicitly overridden
- Add unit test verifying non-overwrite behaviour when no password is provided
- Part of migration and refactoring for Ansible 2.20 upgrade
Reference: https://chatgpt.com/share/69301a6d-e920-800f-b19c-e5ca7c3bdd24
This commit updates multiple roles to ensure compatibility with Ansible 2.20.
Several include paths and task-loading mechanisms required adjustments,
as Ansible 2.20 applies stricter evaluation rules for complex Jinja expressions
and no longer resolves certain relative include paths the way Ansible 2.18 did.
Key changes:
- Replaced legacy once_finalize.yml and once_flag.yml with the new structure
under tasks/utils/once/finalize.yml and tasks/utils/once/flag.yml.
- Updated all include_tasks statements to use 'path_join' with playbook_dir,
ensuring deterministic and absolute file resolution across roles.
- Fixed all network helper includes by converting direct relative paths such as
'roles/docker-compose/tasks/utils/network.yml' to proper Jinja-evaluated paths.
- Normalized MATOMO_* variable names for consistency with the updated variable
scope behavior in Ansible 2.20.
- Removed deprecated patterns that were implicitly supported in Ansible 2.18
but break under the more strict variable and path resolution model in 2.20.
These changes are part of the full migration step required to ensure the
infinito-nexus roles remain stable, deterministic, and forward-compatible with
Ansible 2.20.
Details of the discussion and reasoning can be found in this conversation:
https://chatgpt.com/share/69300a8d-24d4-800f-bec0-e895a695618a
The deploy wrapper previously used subprocess.run(..., capture_output=True),
which buffered all Ansible output until the playbook finished.
This made the CLI appear stuck at 'Launching Ansible Playbook…'.
Switching to subprocess.run(cmd) restores live streaming of Ansible output.
Details: https://chatgpt.com/share/693008b4-b7b0-800f-bd35-5a307a76fc59
This change introduces a WHITELISTED_HANDLERS mechanism, allowing specific
handlers to be intentionally skipped due to conditional 'False' evaluations
without causing test failures. Improves flexibility while keeping the
architectural policy enforced for all other handlers.
Reference: https://chatgpt.com/share/692f6841-c19c-800f-8d6c-aa1ef48dcf7e
Why:
- Ansible 2.20+ deprecates INJECT_FACTS_AS_VARS and direct usage of top-level ansible_* facts.
- This change updates all affected roles and vars files to the new supported syntax.
- Ensures compatibility with upcoming Ansible 2.24 removal of implicit fact injection.
Conversation reference:
https://chatgpt.com/share/692f639b-1380-800f-9f18-732f7108e9e2
- Correct grouping of reachability check
- Replace incorrect boolean cast for mailu_token with length check
- Load Mailu routines only when host is unreachable or token is missing
Details: https://chatgpt.com/share/692f1e58-0d6c-800f-9699-e9a26f1e8db9
This change updates test-cli.yml to use --network=host for docker build
and docker run steps. This significantly reduces intermittent Arch mirror
timeouts observed in local and CI environments.
Reference:
https://chatgpt.com/share/692f1bd7-f144-800f-b2ac-900d78a69e9d