12 KiB
12 KiB
Repository analysis
Project overview
- Project name:
netupgrade - Primary language: Bash
- Main entrypoint:
bin/netupgrade - Project type: interactive CLI administration tool
- Main purpose: orchestrate remote upgrade and maintenance actions on multiple hosts over SSH
- Configuration model: Bash-based configuration files sourced from
~/.config/netupgrade/*.cfg
Repository structure
bin/netupgrade: main executable script containing CLI parsing, node selection, remote execution, and loggingconfig/netupgrade/*.cfg: sample configuration files defining host groups and action sequencesREADME.md: installation and usage documentationdocs/: project documentation
Functional behavior
The tool:
- Loads a Bash configuration file
- Expects a
NODESarray populated with entries formatted like:host;display-name;action1;action2;... - Displays an interactive multi-select checklist using
whiptail - Executes the selected actions on each selected host through SSH, using
root@hostby default orSSH_USER@hostwhen configured - Writes execution logs to
~/netupgrade.log - Opens the log with
$EDITORwhen available, otherwisenano,vi, orless; if none is available, it prints the log path, then optionally removes the log file
Supported action types currently include:
aptyumpkgpacmanapkrebootcmd:<remote command>docker-stacks:<directory>
Architecture notes
- The project is intentionally lightweight and script-based
- Configuration is code-driven rather than declarative, since config files are sourced as shell files
- The entire execution flow currently lives in a single Bash script
- Remote operations are performed sequentially, not in parallel
- Logging is file-based and coupled directly to command execution
Strengths
- Very small and easy to deploy
- Clear practical purpose for system administration workflows
- Flexible host/action configuration model
- Supports several Linux/BSD package managers
- Suitable for use from a bastion host or admin workstation
Main issues identified
1. Documentation accuracy problems
README.mdand CLI help were updated to better match current behavior- The previous typo in the configuration path (
netuprade) has been fixed - The unsupported
-boption was removed from the displayed help - The configuration format and supported actions are now documented in more detail
2. Shell robustness concerns
- Config files are sourced directly, which is flexible but implies arbitrary code execution
- The script still has some quoting-sensitive areas and does not use a stricter shell safety baseline
- The
NODESparsing was hardened to split on;withIFS/read -r -a, which now preserves spaces in action values such ascmd:...
3. Remote execution correctness and safety
- SSH execution now goes through a dedicated
runSSHhelper and the SSH user is configurable viaSSH_USER, defaulting toroot cmd:<...>intentionally allows arbitrary remote command execution and is now executed through a remote shell, which improves support for shell operators but remains a powerful unsafe feature- The
pacmanorphan-removal command was corrected so orphan detection happens on the remote host instead of locally - The
docker-stacksremote loop was rewritten to pass the stack root as an argument to a remote shell script, improving quoting and path handling
4. UX and dependency issues
- Required runtime dependencies are now checked at startup (
ssh,whiptail,cat,tee,rm,touch) - Log viewing no longer depends strictly on
nano; the script now falls back to$EDITOR, thennano,vi, orless - If no supported log viewer is available, execution continues and the log path is shown
- The workflow is highly interactive and not well suited for automation
rm -iintroduces an extra prompt even when the rest of the flow is meant to be streamlined
5. Error handling limitations
- Error propagation is inconsistent depending on the action type
- Cleanup commands often do not affect the final failure state
- The script continues through action sequences without a documented policy
- The advertised “break on error” behavior does not exist yet
6. Maintainability limitations
- Most logic is concentrated in one script
- There is duplication in package-manager handling
- No tests or validation tooling are present in the repository
- Some wording, typos, and naming inconsistencies reduce clarity
Recommended direction
Short term
- Tackle the next hardening work as small, reviewable commits instead of one broad patch
- Define and document an explicit error-handling policy: what is fatal, what is best-effort, and whether host action sequences continue after a failure
- Review the remaining quoting-sensitive areas, especially around remote shell command construction for
cmd:<...>anddocker-stacks:<...> - Align
README.md, CLI help, and runtime behavior on dependencies and interaction details, especially the actual meaning of-f, current log-file behavior, and whethersedis still required - Review sample configuration files for naming or targeting inconsistencies that could lead to operator mistakes
Medium term
- Make SSH user, log path, and editor configurable
- Improve non-interactive usage options with explicit batch-friendly flags such as a true
--all,--no-log-view,--keep-log, and custom log-path support - Standardize error handling and exit codes with a documented policy for best-effort cleanup steps versus fatal failures
- Add lightweight validation or warnings around sourced config files, for example around trust expectations or overly permissive file permissions
- Consider adopting a clearer shell option baseline such as an explicit global
pipefailpolicy
Long term
- Refactor the script into smaller functions with less duplication
- Add shell linting guidance and automation (for example ShellCheck, optionally shfmt)
- Consider a safer declarative configuration format if the project grows
- Add test coverage for parsing, command construction, and non-regression around SSH quoting behavior
Recent changes
README.mdwas expanded to document installation, requirements, usage, configuration format, and supported actionsbin/netupgradehelp output was aligned with actual CLI behavior and now documents--help- A startup dependency check was added before loading configuration or opening the interactive selector
- Log viewer selection was made more flexible:
$EDITORis preferred, thennano,vi, orless nanois no longer a strict runtime dependency- The unsupported
-boption remains unimplemented and is no longer shown in help output NODESparsing was hardened to preserve spaces in action values by splitting on;withIFSandread -r -a- SSH calls were centralized through a
runSSHhelper andSSH_USERis now configurable, defaulting toroot - The
pacmanorphan cleanup now runs entirely on the remote host instead of evaluating orphan detection locally - The
pacmanorphan cleanup remote command now avoids nestedbash -lcargument-passing issues by selecting between two simple remotesh -ccommands, one with--noconfirmand one without - The
docker-stacksaction uses a remote shell script sent over SSH stdin, with the stack directory exported as a remote environment assignment beforebash -s, to keep path handling working after recent SSH command-construction changes - Unknown actions and reboot SSH failures now propagate error status more consistently
- A focused code review identified the next recommended work items and suggested splitting them into separate commits rather than combining them in one larger hardening change
whiptailchecklist defaults are now passed explicitly asON/OFF, and selected items are parsed through a dedicated helper instead of relying on raw shell word splitting- The CLI help and README now clarify that
-fpreselects all nodes in the interactive checklist - Log summary generation no longer uses
sed -iinterpolation; the script now writes a temporary file with the summary header plus the existing log content and replaces the original log atomically - The
apkaction no longer passes-ytoapk upgrade, because current Alpineapkdoes not accept that option there;-yremains a best-effort flag for other supported package managers - The
aptaction now usesapt-get autoremove --purgeand no longer runsapt-get purgewithout arguments, which makes the cleanup step more meaningful and avoids a misleading command in the log - The
pacmanaction was further hardened by simplifying orphan cleanup command construction, reducing quoting-related regressions while still skipping removal when no orphan packages are present README.mdno longer listssedas a required dependency and now better reflects the local utilities actually used by the script- Startup dependency checks now include
mv, which is required by the log-summary rewrite - The sample configuration in
config/netupgrade/hypervisor-01.cfgnow comments out the alternatedocker-01reboot step so the two-step example remains visible without being active by default
Change guidance
- Preserve backward compatibility for existing config files where possible
- Prefer incremental hardening over a full rewrite
- Keep the tool simple and admin-friendly
- Split behavioral fixes into small logical commits when possible, for example: selection handling, log generation, package-manager cleanup semantics, error-policy changes, and non-interactive CLI improvements
- Be cautious with changes to remote command construction, as quoting changes can introduce regressions
- Treat documentation and behavior alignment as part of functional quality, not as a separate cleanup task
- Avoid introducing a global
set -euo pipefailbaseline in one step without first documenting and testing the expected failure semantics
Suggested review focus for future changes
- Correctness of package-manager cleanup commands and confirmation flags
- Correctness of remote command execution
- Safe quoting and shell expansion behavior
- Compatibility of config format with existing user setups
- Error-handling policy consistency across action types
- Package-manager command correctness and cleanup-step behavior
- Usability in both interactive and semi-automated contexts
- Documentation consistency with actual runtime behavior and dependencies
- Review of sample config accuracy to avoid mislabelled hosts or risky operator confusion
Additional recommendations from latest review
- Highest priority should go to defining an explicit execution and failure policy, because it currently affects operator trust more than missing features do
- The next highest priority should be protecting against regressions in SSH command construction by documenting manual test cases for commands with spaces, pipes, redirections,
&&,||, and quoted arguments - A small CLI usability pass would have strong value:
-fcurrently only preselects nodes inwhiptail, so a true non-interactive selection mode would improve automation without changing the overall project model - The dependency list was realigned:
README.mdno longer mentionssed, and the script dependency check now includesmvfor the log-summary rewrite - The sample configuration set was clarified to avoid an active duplicate reboot step for
docker-01; the alternatecmd:rebootexample remains commented out for illustration - The sample configuration set should be reviewed for consistency; for example, duplicate or mismatched display names attached to different IPs increase the risk of accidental operations on the wrong host
- Shell quality improvements should favor linting, targeted helpers, and incremental refactors before any broad strict-mode changes
- Future testing should focus first on parser behavior, command construction, and result reporting rather than trying to build a large end-to-end framework immediately