1
0
Files
netupgrade/state.md
T

11 KiB

Repository analysis

Project overview

  • Project name: netupgrade
  • Primary language: Bash
  • Main entrypoint: bin/netupgrade
  • Project type: interactive CLI administration tool
  • Main purpose: orchestrate remote upgrade and maintenance actions on multiple hosts over SSH
  • Configuration model: Bash-based configuration files sourced from ~/.config/netupgrade/*.cfg

Repository structure

  • bin/netupgrade: main executable script containing CLI parsing, node selection, remote execution, and logging
  • config/netupgrade/*.cfg: sample configuration files defining host groups and action sequences
  • README.md: installation and usage documentation
  • docs/: project documentation

Functional behavior

The tool:

  1. Loads a Bash configuration file
  2. Expects a NODES array populated with entries formatted like: host;display-name;action1;action2;...
  3. Displays an interactive multi-select checklist using whiptail
  4. Executes the selected actions on each selected host through SSH, using root@host by default or SSH_USER@host when configured
  5. Writes execution logs to ~/netupgrade.log
  6. Opens the log with $EDITOR when available, otherwise nano, vi, or less; if none is available, it prints the log path, then optionally removes the log file

Supported action types currently include:

  • apt
  • yum
  • pkg
  • pacman
  • apk
  • reboot
  • cmd:<remote command>
  • docker-stacks:<directory>

Architecture notes

  • The project is intentionally lightweight and script-based
  • Configuration is code-driven rather than declarative, since config files are sourced as shell files
  • The entire execution flow currently lives in a single Bash script
  • Remote operations are performed sequentially, not in parallel
  • Logging is file-based and coupled directly to command execution

Strengths

  • Very small and easy to deploy
  • Clear practical purpose for system administration workflows
  • Flexible host/action configuration model
  • Supports several Linux/BSD package managers
  • Suitable for use from a bastion host or admin workstation

Main issues identified

1. Documentation accuracy problems

  • README.md and CLI help were updated to better match current behavior
  • The previous typo in the configuration path (netuprade) has been fixed
  • The unsupported -b option was removed from the displayed help
  • The configuration format and supported actions are now documented in more detail

2. Shell robustness concerns

  • Config files are sourced directly, which is flexible but implies arbitrary code execution
  • The script still has some quoting-sensitive areas and does not use a stricter shell safety baseline
  • The NODES parsing was hardened to split on ; with IFS/read -r -a, which now preserves spaces in action values such as cmd:...

3. Remote execution correctness and safety

  • SSH execution now goes through a dedicated runSSH helper and the SSH user is configurable via SSH_USER, defaulting to root
  • cmd:<...> intentionally allows arbitrary remote command execution and is now executed through a remote shell, which improves support for shell operators but remains a powerful unsafe feature
  • The pacman orphan-removal command was corrected so orphan detection happens on the remote host instead of locally
  • The docker-stacks remote loop was rewritten to pass the stack root as an argument to a remote shell script, improving quoting and path handling

4. UX and dependency issues

  • Required runtime dependencies are now checked at startup (ssh, whiptail, cat, tee, rm, touch)
  • Log viewing no longer depends strictly on nano; the script now falls back to $EDITOR, then nano, vi, or less
  • If no supported log viewer is available, execution continues and the log path is shown
  • The workflow is highly interactive and not well suited for automation
  • rm -i introduces an extra prompt even when the rest of the flow is meant to be streamlined

5. Error handling limitations

  • Error propagation is inconsistent depending on the action type
  • Cleanup commands often do not affect the final failure state
  • The script continues through action sequences without a documented policy
  • The advertised “break on error” behavior does not exist yet

6. Maintainability limitations

  • Most logic is concentrated in one script
  • There is duplication in package-manager handling
  • No tests or validation tooling are present in the repository
  • Some wording, typos, and naming inconsistencies reduce clarity

Short term

  • Tackle the next hardening work as small, reviewable commits instead of one broad patch
  • Define and document an explicit error-handling policy: what is fatal, what is best-effort, and whether host action sequences continue after a failure
  • Review the remaining quoting-sensitive areas, especially around remote shell command construction for cmd:<...> and docker-stacks:<...>
  • Align README.md, CLI help, and runtime behavior on dependencies and interaction details, especially the actual meaning of -f, current log-file behavior, and whether sed is still required
  • Review sample configuration files for naming or targeting inconsistencies that could lead to operator mistakes

Medium term

  • Make SSH user, log path, and editor configurable
  • Improve non-interactive usage options with explicit batch-friendly flags such as a true --all, --no-log-view, --keep-log, and custom log-path support
  • Standardize error handling and exit codes with a documented policy for best-effort cleanup steps versus fatal failures
  • Add lightweight validation or warnings around sourced config files, for example around trust expectations or overly permissive file permissions
  • Consider adopting a clearer shell option baseline such as an explicit global pipefail policy

Long term

  • Refactor the script into smaller functions with less duplication
  • Add shell linting guidance and automation (for example ShellCheck, optionally shfmt)
  • Consider a safer declarative configuration format if the project grows
  • Add test coverage for parsing, command construction, and non-regression around SSH quoting behavior

Recent changes

  • README.md was expanded to document installation, requirements, usage, configuration format, and supported actions
  • bin/netupgrade help output was aligned with actual CLI behavior and now documents --help
  • A startup dependency check was added before loading configuration or opening the interactive selector
  • Log viewer selection was made more flexible: $EDITOR is preferred, then nano, vi, or less
  • nano is no longer a strict runtime dependency
  • The unsupported -b option remains unimplemented and is no longer shown in help output
  • NODES parsing was hardened to preserve spaces in action values by splitting on ; with IFS and read -r -a
  • SSH calls were centralized through a runSSH helper and SSH_USER is now configurable, defaulting to root
  • The pacman orphan cleanup now runs entirely on the remote host instead of evaluating orphan detection locally
  • The docker-stacks action was rewritten to use a remote shell script with the stack directory passed as an argument
  • Unknown actions and reboot SSH failures now propagate error status more consistently
  • A focused code review identified the next recommended work items and suggested splitting them into separate commits rather than combining them in one larger hardening change
  • whiptail checklist defaults are now passed explicitly as ON/OFF, and selected items are parsed through a dedicated helper instead of relying on raw shell word splitting
  • The CLI help and README now clarify that -f preselects all nodes in the interactive checklist
  • Log summary generation no longer uses sed -i interpolation; the script now writes a temporary file with the summary header plus the existing log content and replaces the original log atomically
  • The apk action now applies -y to apk upgrade when the CLI -y flag is set, making its behavior consistent with the documented intent for supported package managers
  • The apt action now uses apt-get autoremove --purge and no longer runs apt-get purge without arguments, which makes the cleanup step more meaningful and avoids a misleading command in the log

Change guidance

  • Preserve backward compatibility for existing config files where possible
  • Prefer incremental hardening over a full rewrite
  • Keep the tool simple and admin-friendly
  • Split behavioral fixes into small logical commits when possible, for example: selection handling, log generation, package-manager cleanup semantics, error-policy changes, and non-interactive CLI improvements
  • Be cautious with changes to remote command construction, as quoting changes can introduce regressions
  • Treat documentation and behavior alignment as part of functional quality, not as a separate cleanup task
  • Avoid introducing a global set -euo pipefail baseline in one step without first documenting and testing the expected failure semantics

Suggested review focus for future changes

  • Correctness of package-manager cleanup commands and confirmation flags
  • Correctness of remote command execution
  • Safe quoting and shell expansion behavior
  • Compatibility of config format with existing user setups
  • Error-handling policy consistency across action types
  • Package-manager command correctness and cleanup-step behavior
  • Usability in both interactive and semi-automated contexts
  • Documentation consistency with actual runtime behavior and dependencies
  • Review of sample config accuracy to avoid mislabelled hosts or risky operator confusion

Additional recommendations from latest review

  • Highest priority should go to defining an explicit execution and failure policy, because it currently affects operator trust more than missing features do
  • The next highest priority should be protecting against regressions in SSH command construction by documenting manual test cases for commands with spaces, pipes, redirections, &&, ||, and quoted arguments
  • A small CLI usability pass would have strong value: -f currently only preselects nodes in whiptail, so a true non-interactive selection mode would improve automation without changing the overall project model
  • The dependency list should be rechecked: README.md still mentions sed, while the current implementation no longer appears to require it after the log-summary rewrite
  • The sample configuration set should be reviewed for consistency; for example, duplicate or mismatched display names attached to different IPs increase the risk of accidental operations on the wrong host
  • Shell quality improvements should favor linting, targeted helpers, and incremental refactors before any broad strict-mode changes
  • Future testing should focus first on parser behavior, command construction, and result reporting rather than trying to build a large end-to-end framework immediately