chore: rename repository state file to STATE.md

2026-04-26 02:35:38 +02:00
parent c138d9201e
commit 76101e77c9
4 changed files with 139 additions and 166 deletions
@@ -5,10 +5,10 @@ alwaysApply: true
 Apply consistent coding standards, preserve existing project conventions, and avoid introducing breaking changes in all files unless explicitly requested.
 Also read and consider the repository state file when it exists:
- `state.md`
+- `STATE.md`
 When this file is present:
 - use it as evolving project context,
 - keep recommendations aligned with it,
 - update suggestions to remain consistent with documented architecture, constraints, and review notes,
- update `state.md` as part of the same task whenever changes materially affect repository behavior, architecture, constraints, or review notes.
+- update `STATE.md` as part of the same task whenever changes materially affect repository behavior, architecture, constraints, or review notes.
@@ -0,0 +1,83 @@
 # Repository state
 ## Project summary
 - Name: `netupgrade`
 - Language: Bash
 - Entry point: `bin/netupgrade`
 - Purpose: interactive CLI to run maintenance and upgrade actions on multiple remote hosts over SSH
 - Config model: Bash configuration files sourced from `~/.config/netupgrade/*.cfg`
 ## Repository structure
 - `bin/netupgrade`: main executable script containing CLI parsing, node selection, remote execution, and logging
 - `config/netupgrade/*.cfg`: sample configuration files defining host groups and action sequences
 - `README.md`: installation and usage documentation
 - `TODO.md`: prioritized next actions and follow-up work
 - `docs/`: project documentation
 ## Current behavior
 - Loads a Bash configuration file
 - Expects a `NODES` array with entries formatted as `host;display-name;action1;action2;...`
 - Displays an interactive multi-select checklist using `whiptail`
 - Executes selected actions sequentially over SSH
 - Uses `root@host` by default, or `SSH_USER@host` when configured
 - Writes execution logs to `~/netupgrade.log`
 - Opens the log with `$EDITOR` when it is set, otherwise with the first available of `nano`, `vi`, or `less`; if none is available, prints the log path and continues to the optional log cleanup prompt
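
The log viewer fallback described in the last item can be sketched as follows. This is a minimal illustration, not the script's actual code: `pick_log_viewer` is a hypothetical helper name, and the sketch assumes `$EDITOR` holds a bare command name without arguments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first usable log viewer, or fail if none exists.
pick_log_viewer() {
  local candidate
  # An unset or empty $EDITOR is skipped by the -n check below.
  for candidate in "${EDITOR:-}" nano vi less; do
    if [ -n "$candidate" ] && command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

log_file="$HOME/netupgrade.log"
if viewer=$(pick_log_viewer); then
  echo "would open $log_file with $viewer"
else
  # No viewer found: report the path and carry on instead of aborting.
  echo "No log viewer found; log is at $log_file"
fi
```

An `$EDITOR` value that carries arguments (for example `code -w`) would not pass the `command -v` check in this sketch and would fall through to the next candidate.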
 ## Supported actions
 - `apt`
 - `yum`
 - `pkg`
 - `pacman`
 - `apk`
 - `reboot`
 - `cmd:<remote command>`
 - `docker-stacks:<directory>`
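
As an illustration of how these action strings could be told apart, here is a minimal sketch. `describe_action` is a hypothetical function that only classifies an action; it does not reproduce the dispatch logic in `bin/netupgrade` and runs nothing remotely.

```shell
#!/usr/bin/env bash
# Hypothetical classifier: split an action string into a kind and an argument.
describe_action() {
  local action=$1
  case "$action" in
    apt|yum|pkg|pacman|apk|reboot)
      printf 'kind=%s arg=\n' "$action" ;;
    cmd:*)
      # Strip only the leading "cmd:" so later colons stay in the command.
      printf 'kind=cmd arg=%s\n' "${action#cmd:}" ;;
    docker-stacks:*)
      printf 'kind=docker-stacks arg=%s\n' "${action#docker-stacks:}" ;;
    *)
      # Unknown actions report an error status to the caller.
      printf 'unknown action: %s\n' "$action" >&2
      return 1 ;;
  esac
}

describe_action 'apt'
describe_action 'cmd:df -h / | tail -n 1'
describe_action 'docker-stacks:/srv/stacks'
describe_action 'bogus' || echo "rejected"
```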
 ## Architecture and constraints
 - The project is intentionally lightweight and shell-based
 - The main execution flow still lives mostly in a single Bash script
 - Configuration is code-driven rather than declarative, because config files are sourced as shell files
 - Config files are trusted inputs: sourcing them executes whatever shell code they contain
 - Remote operations are sequential, not parallel
 - Logging is file-based and tightly coupled to command execution
 - Backward compatibility for existing config files should be preserved where possible
 - Changes to SSH command construction require extra care because quoting regressions are easy to introduce
 ## Notable implementation details
 - SSH execution is centralized through a `runSSH` helper
 - `SSH_USER` is configurable and defaults to `root`
 - `NODES` parsing was hardened to preserve spaces in action values such as `cmd:...`
 - `cmd:<...>` is executed through a remote shell to support shell operators, but remains intentionally powerful and unsafe
 - `reboot` uses a detached remote command to reduce false failures caused by expected SSH disconnects
 - `docker-stacks` runs through a remote shell script sent over SSH stdin to improve path handling and quoting
 - Startup dependency checks cover required local commands, including `mv` for log-summary rewriting
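
The `NODES` parsing and `SSH_USER` default noted above can be sketched like this. The sample entry, the `parse_node` helper, and the `fields` array are illustrative assumptions; only the `host;display-name;action...` format, the `;` split, and the `root` default come from this file.

```shell
#!/usr/bin/env bash
# Hypothetical sample entry in the documented host;display-name;action... format.
NODES=('192.0.2.10;web-01;apt;cmd:df -h / | tail -n 1;reboot')

# Hypothetical helper: split one entry on ';' into the global array "fields".
parse_node() {
  IFS=';' read -r -a fields <<< "$1"
}

parse_node "${NODES[0]}"
host=${fields[0]}
name=${fields[1]}
actions=("${fields[@]:2}")

# SSH_USER defaults to root when the config does not set it.
target="${SSH_USER:-root}@${host}"
echo "$name -> $target"
for action in "${actions[@]}"; do
  echo "  action: $action"
done
```

Because `IFS` is set to `;` only for the `read` call, spaces inside `cmd:...` values survive intact. A `;` inside a command would still split the entry, so this sketch does not cover that case.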
 ## Sensitive areas for future changes
 - SSH command construction and shell quoting behavior
 - `cmd:<...>` execution semantics
 - `docker-stacks:<...>` path handling and remote shell behavior
 - Package-manager cleanup commands and confirmation flags
 - Error propagation consistency across action types
 - Documentation and runtime behavior alignment
 - Sample configuration accuracy to avoid operator mistakes
 ## Recent meaningful changes
 - README and CLI help were aligned with current behavior, including the meaning of `-f`
 - Startup dependency checks were added and updated
 - Log viewer fallback was made more flexible
 - `NODES` parsing was hardened to preserve spaces in action values
 - SSH execution was centralized and made configurable through `SSH_USER`
 - Remote command construction was improved for `pacman`, `docker-stacks`, and `reboot`
 - Log summary generation was rewritten to avoid `sed -i` interpolation and now replaces the log atomically
 - `apk` and `apt` cleanup behavior was corrected to better match real package-manager semantics
 - The sample configuration in `config/netupgrade/hypervisor-01.cfg` now keeps the alternate `docker-01` reboot example commented out rather than active by default
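
The atomic log-summary rewrite mentioned above follows a common pattern: write the summary plus the existing log to a temporary file, then rename it over the original. The sketch below is an assumption about the general technique, with a hypothetical `prepend_summary` helper, not a copy of the script's code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: put a summary line at the top of a log file.
prepend_summary() {
  local log_file=$1 summary=$2 tmp_file
  # Create the temp file next to the log so mv is a same-filesystem rename.
  tmp_file=$(mktemp "${log_file}.XXXXXX") || return 1
  if { printf '%s\n' "$summary"; cat -- "$log_file"; } > "$tmp_file"; then
    mv -- "$tmp_file" "$log_file"
  else
    rm -f -- "$tmp_file"
    return 1
  fi
}

demo_log=$(mktemp)
printf 'web-01: apt ok\nweb-02: yum failed\n' > "$demo_log"
prepend_summary "$demo_log" 'Summary: 1 ok, 1 failed'
cat "$demo_log"
rm -f -- "$demo_log"
```

Since the summary is passed to `printf` as data rather than spliced into a `sed` expression, characters such as `&`, `/`, and `\` in the summary need no escaping.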
 ## Change guidance
 - `STATE.md` should stay focused on current state, architecture, and constraints
 - `TODO.md` should hold prioritized follow-up work and next implementation steps
 - Prefer small, reviewable hardening changes over broad rewrites
 - Keep the tool simple and admin-friendly
 - Treat documentation consistency as part of functional correctness
 - Be cautious with remote command construction, because quoting changes can easily introduce regressions
 - Avoid introducing a global `set -euo pipefail` baseline in one step without first documenting and validating the intended failure semantics
 - Focus future testing first on parsing, command construction, and result reporting
@@ -0,0 +1,54 @@
 # TODO
 ## High priority
 ### 1. Define an explicit execution and error-handling policy
 - Document what is considered fatal versus best-effort
 - Decide whether a host's action sequence should continue after a failure
 - Standardize exit codes
 - Align actual behavior, CLI help, and the README
 ### 2. Harden the sensitive areas of SSH command construction
 - Review the quoting areas that are still fragile
 - Prioritize `cmd:<...>` and `docker-stacks:<...>`
 - Document manual test cases for commands with spaces, pipes, redirections, `&&`, `||`, and quoted arguments
 - Reduce the risk of regressions during upcoming hardening work
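
One possible starting point for such manual test cases is a local round-trip check: quote each command, let a shell re-parse it once, and compare the result with the original. This sketch uses Bash's `printf '%q'` as an assumed quoting strategy and `eval` as a local stand-in for the remote shell. It does not reflect how `bin/netupgrade` builds its SSH commands.

```shell
#!/usr/bin/env bash
# Hypothetical manual test cases covering spaces, pipes, redirections,
# &&, ||, and quoted arguments.
cases=(
  'echo hello world'
  'echo a | tr a b'
  'echo x > /dev/null'
  'true && echo ok'
  'false || echo fallback'
  "echo 'quoted arg'"
)

failures=0
for original in "${cases[@]}"; do
  # %q escapes the string so one extra round of shell parsing restores it.
  quoted=$(printf '%q' "$original")
  # Local stand-in for the remote shell re-parsing the command line.
  restored=$(eval "printf '%s' $quoted")
  if [ "$restored" = "$original" ]; then
    echo "ok:   $original"
  else
    echo "FAIL: $original"
    failures=$((failures + 1))
  fi
done
echo "failures: $failures"
```

`%q` is a Bash feature, and for strings containing control characters it can emit `$'...'` quoting that a plain POSIX `sh` on the remote side would not understand, so results should be confirmed against the real remote shells.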
 ## Medium priority
 ### 3. Improve the semi-automated / non-interactive mode
 - Add real batch-friendly options, for example:
  - `--all`
  - `--no-log-view`
  - `--keep-log`
  - `--log-path <path>`
 - Clarify the difference between interactive preselection (`-f`) and truly non-interactive execution
 ### 4. Add a minimum of validation and shell quality tooling
 - Add ShellCheck guidance to the project
 - Consider `shfmt` if it stays lightweight
 - Add simple checks around sourced configuration files
 - Stay incremental, without abruptly imposing a global strict mode
 ### 5. Gradually reduce the complexity of the main script
 - Extract helpers when doing so reduces duplication without breaking behavior
 - Simplify the branches related to package managers
 - Keep changes small and reviewable
 ## Low priority
 ### 6. Consider a safer configuration model if the project grows
 - Bash sourcing remains acceptable as long as the model is a deliberate, accepted choice
 - If the project expands, eventually consider a safer declarative format
 ### 7. Introduce targeted tests
 - Prioritize parsing tests
 - Add command-construction tests
 - Cover known regressions around SSH quoting
 - Start small rather than immediately aiming for a large end-to-end suite
 ## Working guidelines
 - Preserve compatibility with existing configuration files as much as possible
 - Prefer small, logical commits
 - Treat documentation/behavior consistency as a functional concern
 - Be especially careful with any change to quoting or remote commands
@@ -1,164 +0,0 @@
 # Repository analysis
 ## Project overview
 - Project name: `netupgrade`
 - Primary language: Bash
 - Main entrypoint: `bin/netupgrade`
 - Project type: interactive CLI administration tool
 - Main purpose: orchestrate remote upgrade and maintenance actions on multiple hosts over SSH
 - Configuration model: Bash-based configuration files sourced from `~/.config/netupgrade/*.cfg`
 ## Repository structure
 - `bin/netupgrade`: main executable script containing CLI parsing, node selection, remote execution, and logging
 - `config/netupgrade/*.cfg`: sample configuration files defining host groups and action sequences
 - `README.md`: installation and usage documentation
 - `docs/`: project documentation
 ## Functional behavior
 The tool:
 1. Loads a Bash configuration file
 2. Expects a `NODES` array populated with entries formatted like:
   `host;display-name;action1;action2;...`
 3. Displays an interactive multi-select checklist using `whiptail`
 4. Executes the selected actions on each selected host through SSH, using `root@host` by default or `SSH_USER@host` when configured
 5. Writes execution logs to `~/netupgrade.log`
 6. Opens the log with `$EDITOR` when available, otherwise `nano`, `vi`, or `less`; if none is available, it prints the log path, then optionally removes the log file
 Supported action types currently include:
 - `apt`
 - `yum`
 - `pkg`
 - `pacman`
 - `apk`
 - `reboot`
 - `cmd:<remote command>`
 - `docker-stacks:<directory>`
 ## Architecture notes
 - The project is intentionally lightweight and script-based
 - Configuration is code-driven rather than declarative, since config files are sourced as shell files
 - The entire execution flow currently lives in a single Bash script
 - Remote operations are performed sequentially, not in parallel
 - Logging is file-based and coupled directly to command execution
 ## Strengths
 - Very small and easy to deploy
 - Clear practical purpose for system administration workflows
 - Flexible host/action configuration model
 - Supports several Linux/BSD package managers
 - Suitable for use from a bastion host or admin workstation
 ## Main issues identified
 ### 1. Documentation accuracy problems
 - `README.md` and CLI help were updated to better match current behavior
 - The previous typo in the configuration path (`netuprade`) has been fixed
 - The unsupported `-b` option was removed from the displayed help
 - The configuration format and supported actions are now documented in more detail
 ### 2. Shell robustness concerns
 - Config files are sourced directly, which is flexible but implies arbitrary code execution
 - The script still has some quoting-sensitive areas and does not use a stricter shell safety baseline
 - The `NODES` parsing was hardened to split on `;` with `IFS`/`read -r -a`, which now preserves spaces in action values such as `cmd:...`
 ### 3. Remote execution correctness and safety
 - SSH execution now goes through a dedicated `runSSH` helper and the SSH user is configurable via `SSH_USER`, defaulting to `root`
 - `cmd:<...>` intentionally allows arbitrary remote command execution and is now executed through a remote shell, which improves support for shell operators but remains a powerful unsafe feature
 - The `pacman` orphan-removal command was corrected so orphan detection happens on the remote host instead of locally
 - The `docker-stacks` remote loop was rewritten to pass the stack root as an argument to a remote shell script, improving quoting and path handling
 ### 4. UX and dependency issues
 - Required runtime dependencies are now checked at startup (`ssh`, `whiptail`, `cat`, `tee`, `rm`, `touch`)
 - Log viewing no longer depends strictly on `nano`; the script now falls back to `$EDITOR`, then `nano`, `vi`, or `less`
 - If no supported log viewer is available, execution continues and the log path is shown
 - The workflow is highly interactive and not well suited for automation
 - `rm -i` introduces an extra prompt even when the rest of the flow is meant to be streamlined
 ### 5. Error handling limitations
 - Error propagation is inconsistent depending on the action type
 - Cleanup commands often do not affect the final failure state
 - The script continues through action sequences without a documented policy
 - The advertised “break on error” behavior does not exist yet
 ### 6. Maintainability limitations
 - Most logic is concentrated in one script
 - There is duplication in package-manager handling
 - No tests or validation tooling are present in the repository
 - Some wording, typos, and naming inconsistencies reduce clarity
 ## Recommended direction
 ### Short term
 - Tackle the next hardening work as small, reviewable commits instead of one broad patch
 - Define and document an explicit error-handling policy: what is fatal, what is best-effort, and whether host action sequences continue after a failure
 - Review the remaining quoting-sensitive areas, especially around remote shell command construction for `cmd:<...>` and `docker-stacks:<...>`
 - Align `README.md`, CLI help, and runtime behavior on dependencies and interaction details, especially the actual meaning of `-f`, current log-file behavior, and whether `sed` is still required
 - Review sample configuration files for naming or targeting inconsistencies that could lead to operator mistakes
 ### Medium term
 - Make SSH user, log path, and editor configurable
 - Improve non-interactive usage options with explicit batch-friendly flags such as a true `--all`, `--no-log-view`, `--keep-log`, and custom log-path support
 - Standardize error handling and exit codes with a documented policy for best-effort cleanup steps versus fatal failures
 - Add lightweight validation or warnings around sourced config files, for example around trust expectations or overly permissive file permissions
 - Consider adopting a clearer shell option baseline such as an explicit global `pipefail` policy
 ### Long term
 - Refactor the script into smaller functions with less duplication
 - Add shell linting guidance and automation (for example ShellCheck, optionally shfmt)
 - Consider a safer declarative configuration format if the project grows
 - Add test coverage for parsing, command construction, and non-regression around SSH quoting behavior
 ## Recent changes
 - `README.md` was expanded to document installation, requirements, usage, configuration format, and supported actions
 - `bin/netupgrade` help output was aligned with actual CLI behavior and now documents `--help`
 - A startup dependency check was added before loading configuration or opening the interactive selector
 - Log viewer selection was made more flexible: `$EDITOR` is preferred, then `nano`, `vi`, or `less`
 - `nano` is no longer a strict runtime dependency
 - The unsupported `-b` option remains unimplemented and is no longer shown in help output
 - `NODES` parsing was hardened to preserve spaces in action values by splitting on `;` with `IFS` and `read -r -a`
 - SSH calls were centralized through a `runSSH` helper and `SSH_USER` is now configurable, defaulting to `root`
 - The `pacman` orphan cleanup now runs entirely on the remote host instead of evaluating orphan detection locally
 - The `pacman` orphan cleanup remote command now avoids nested `bash -lc` argument-passing issues by selecting between two simple remote `sh -c` commands, one with `--noconfirm` and one without
 - The `docker-stacks` action uses a remote shell script sent over SSH stdin, with the stack directory exported as a remote environment assignment before `bash -s`, to keep path handling working after recent SSH command-construction changes
 - Unknown actions and reboot SSH failures now propagate error status more consistently
 - The `reboot` action now triggers a detached remote reboot command (`reboot || /sbin/reboot || shutdown -r now` under `nohup`) so an expected SSH disconnect during restart is less likely to be reported as a failure
 - A focused code review identified the next recommended work items and suggested splitting them into separate commits rather than combining them in one larger hardening change
 - `whiptail` checklist defaults are now passed explicitly as `ON`/`OFF`, and selected items are parsed through a dedicated helper instead of relying on raw shell word splitting
 - The CLI help and README now clarify that `-f` preselects all nodes in the interactive checklist
 - Log summary generation no longer uses `sed -i` interpolation; the script now writes a temporary file with the summary header plus the existing log content and replaces the original log atomically
 - The `apk` action no longer passes `-y` to `apk upgrade`, because current Alpine `apk` does not accept that option there; `-y` remains a best-effort flag for other supported package managers
 - The `apt` action now uses `apt-get autoremove --purge` and no longer runs `apt-get purge` without arguments, which makes the cleanup step more meaningful and avoids a misleading command in the log
 - The `pacman` action was further hardened by simplifying orphan cleanup command construction, reducing quoting-related regressions while still skipping removal when no orphan packages are present
 - `README.md` no longer lists `sed` as a required dependency and now better reflects the local utilities actually used by the script
 - Startup dependency checks now include `mv`, which is required by the log-summary rewrite
 - The sample configuration in `config/netupgrade/hypervisor-01.cfg` now comments out the alternate `docker-01` reboot step so the two-step example remains visible without being active by default
 ## Change guidance
 - Preserve backward compatibility for existing config files where possible
 - Prefer incremental hardening over a full rewrite
 - Keep the tool simple and admin-friendly
 - Split behavioral fixes into small logical commits when possible, for example: selection handling, log generation, package-manager cleanup semantics, error-policy changes, and non-interactive CLI improvements
 - Be cautious with changes to remote command construction, as quoting changes can introduce regressions
 - Treat documentation and behavior alignment as part of functional quality, not as a separate cleanup task
 - Avoid introducing a global `set -euo pipefail` baseline in one step without first documenting and testing the expected failure semantics
 ## Suggested review focus for future changes
 - Correctness of package-manager cleanup commands and confirmation flags
 - Correctness of remote command execution
 - Safe quoting and shell expansion behavior
 - Compatibility of config format with existing user setups
 - Error-handling policy consistency across action types
 - Package-manager command correctness and cleanup-step behavior
 - Usability in both interactive and semi-automated contexts
 - Documentation consistency with actual runtime behavior and dependencies
 - Review of sample config accuracy to avoid mislabelled hosts or risky operator confusion
 ## Additional recommendations from latest review
 - Highest priority should go to defining an explicit execution and failure policy, because it currently affects operator trust more than missing features do
 - The next highest priority should be protecting against regressions in SSH command construction by documenting manual test cases for commands with spaces, pipes, redirections, `&&`, `||`, and quoted arguments
 - A small CLI usability pass would have strong value: `-f` currently only preselects nodes in `whiptail`, so a true non-interactive selection mode would improve automation without changing the overall project model
 - The dependency list was realigned: `README.md` no longer mentions `sed`, and the script dependency check now includes `mv` for the log-summary rewrite
 - The sample configuration set was clarified to avoid an active duplicate reboot step for `docker-01`; the alternate `cmd:reboot` example remains commented out for illustration
 - The sample configuration set should be reviewed for consistency; for example, duplicate or mismatched display names attached to different IPs increase the risk of accidental operations on the wrong host
 - Shell quality improvements should favor linting, targeted helpers, and incremental refactors before any broad strict-mode changes
 - Future testing should focus first on parser behavior, command construction, and result reporting rather than trying to build a large end-to-end framework immediately