m/netupgrade

Fork 0

Files

T

matmoul afea447887 docs: update roadmap with review findings and next priorities

2026-04-26 00:45:03 +02:00

11 KiB

Raw Blame History

Repository analysis

Project overview

Project name: netupgrade
Primary language: Bash
Main entrypoint: bin/netupgrade
Project type: interactive CLI administration tool
Main purpose: orchestrate remote upgrade and maintenance actions on multiple hosts over SSH
Configuration model: Bash-based configuration files sourced from ~/.config/netupgrade/*.cfg

Repository structure

bin/netupgrade: main executable script containing CLI parsing, node selection, remote execution, and logging
config/netupgrade/*.cfg: sample configuration files defining host groups and action sequences
README.md: installation and usage documentation
docs/: project documentation

Functional behavior

The tool:

Loads a Bash configuration file
Expects a NODES array populated with entries formatted like: host;display-name;action1;action2;...
Displays an interactive multi-select checklist using whiptail
Executes the selected actions on each selected host through SSH, using root@host by default or SSH_USER@host when configured
Writes execution logs to ~/netupgrade.log
Opens the log with $EDITOR when available, otherwise nano, vi, or less; if none is available, it prints the log path, then optionally removes the log file

Supported action types currently include:

apt
yum
pkg
pacman
apk
reboot
cmd:<remote command>
docker-stacks:<directory>

Architecture notes

The project is intentionally lightweight and script-based
Configuration is code-driven rather than declarative, since config files are sourced as shell files
The entire execution flow currently lives in a single Bash script
Remote operations are performed sequentially, not in parallel
Logging is file-based and coupled directly to command execution

Strengths

Very small and easy to deploy
Clear practical purpose for system administration workflows
Flexible host/action configuration model
Supports several Linux/BSD package managers
Suitable for use from a bastion host or admin workstation

Main issues identified

1. Documentation accuracy problems

README.md and CLI help were updated to better match current behavior
The previous typo in the configuration path (netuprade) has been fixed
The unsupported -b option was removed from the displayed help
The configuration format and supported actions are now documented in more detail

2. Shell robustness concerns

Config files are sourced directly, which is flexible but implies arbitrary code execution
The script still has some quoting-sensitive areas and does not use a stricter shell safety baseline
The NODES parsing was hardened to split on ; with IFS/read -r -a, which now preserves spaces in action values such as cmd:...

3. Remote execution correctness and safety

SSH execution now goes through a dedicated runSSH helper and the SSH user is configurable via SSH_USER, defaulting to root
cmd:<...> intentionally allows arbitrary remote command execution and is now executed through a remote shell, which improves support for shell operators but remains a powerful unsafe feature
The pacman orphan-removal command was corrected so orphan detection happens on the remote host instead of locally
The docker-stacks remote loop was rewritten to pass the stack root as an argument to a remote shell script, improving quoting and path handling

4. UX and dependency issues

Required runtime dependencies are now checked at startup (ssh, whiptail, cat, tee, rm, touch)
Log viewing no longer depends strictly on nano; the script now falls back to $EDITOR, then nano, vi, or less
If no supported log viewer is available, execution continues and the log path is shown
The workflow is highly interactive and not well suited for automation
rm -i introduces an extra prompt even when the rest of the flow is meant to be streamlined

5. Error handling limitations

Error propagation is inconsistent depending on the action type
Cleanup commands often do not affect the final failure state
The script continues through action sequences without a documented policy
The advertised “break on error” behavior does not exist yet

6. Maintainability limitations

Most logic is concentrated in one script
There is duplication in package-manager handling
No tests or validation tooling are present in the repository
Some wording, typos, and naming inconsistencies reduce clarity

Recommended direction

Short term

Tackle the next hardening work as small, reviewable commits instead of one broad patch
Define and document an explicit error-handling policy: what is fatal, what is best-effort, and whether host action sequences continue after a failure
Review the remaining quoting-sensitive areas, especially around remote shell command construction for cmd:<...> and docker-stacks:<...>
Align README.md, CLI help, and runtime behavior on dependencies and interaction details, especially the actual meaning of -f, current log-file behavior, and whether sed is still required
Review sample configuration files for naming or targeting inconsistencies that could lead to operator mistakes

Medium term

Make SSH user, log path, and editor configurable
Improve non-interactive usage options with explicit batch-friendly flags such as a true --all, --no-log-view, --keep-log, and custom log-path support
Standardize error handling and exit codes with a documented policy for best-effort cleanup steps versus fatal failures
Add lightweight validation or warnings around sourced config files, for example around trust expectations or overly permissive file permissions
Consider adopting a clearer shell option baseline such as an explicit global pipefail policy

Long term

Refactor the script into smaller functions with less duplication
Add shell linting guidance and automation (for example ShellCheck, optionally shfmt)
Consider a safer declarative configuration format if the project grows
Add test coverage for parsing, command construction, and non-regression around SSH quoting behavior

Recent changes

README.md was expanded to document installation, requirements, usage, configuration format, and supported actions
bin/netupgrade help output was aligned with actual CLI behavior and now documents --help
A startup dependency check was added before loading configuration or opening the interactive selector
Log viewer selection was made more flexible: $EDITOR is preferred, then nano, vi, or less
nano is no longer a strict runtime dependency
The unsupported -b option remains unimplemented and is no longer shown in help output
NODES parsing was hardened to preserve spaces in action values by splitting on ; with IFS and read -r -a
SSH calls were centralized through a runSSH helper and SSH_USER is now configurable, defaulting to root
The pacman orphan cleanup now runs entirely on the remote host instead of evaluating orphan detection locally
The docker-stacks action was rewritten to use a remote shell script with the stack directory passed as an argument
Unknown actions and reboot SSH failures now propagate error status more consistently
A focused code review identified the next recommended work items and suggested splitting them into separate commits rather than combining them in one larger hardening change
whiptail checklist defaults are now passed explicitly as ON/OFF, and selected items are parsed through a dedicated helper instead of relying on raw shell word splitting
The CLI help and README now clarify that -f preselects all nodes in the interactive checklist
Log summary generation no longer uses sed -i interpolation; the script now writes a temporary file with the summary header plus the existing log content and replaces the original log atomically
The apk action now applies -y to apk upgrade when the CLI -y flag is set, making its behavior consistent with the documented intent for supported package managers
The apt action now uses apt-get autoremove --purge and no longer runs apt-get purge without arguments, which makes the cleanup step more meaningful and avoids a misleading command in the log

Change guidance

Preserve backward compatibility for existing config files where possible
Prefer incremental hardening over a full rewrite
Keep the tool simple and admin-friendly
Split behavioral fixes into small logical commits when possible, for example: selection handling, log generation, package-manager cleanup semantics, error-policy changes, and non-interactive CLI improvements
Be cautious with changes to remote command construction, as quoting changes can introduce regressions
Treat documentation and behavior alignment as part of functional quality, not as a separate cleanup task
Avoid introducing a global set -euo pipefail baseline in one step without first documenting and testing the expected failure semantics

Suggested review focus for future changes

Correctness of package-manager cleanup commands and confirmation flags
Correctness of remote command execution
Safe quoting and shell expansion behavior
Compatibility of config format with existing user setups
Error-handling policy consistency across action types
Package-manager command correctness and cleanup-step behavior
Usability in both interactive and semi-automated contexts
Documentation consistency with actual runtime behavior and dependencies
Review of sample config accuracy to avoid mislabelled hosts or risky operator confusion

Additional recommendations from latest review

Highest priority should go to defining an explicit execution and failure policy, because it currently affects operator trust more than missing features do
The next highest priority should be protecting against regressions in SSH command construction by documenting manual test cases for commands with spaces, pipes, redirections, &&, ||, and quoted arguments
A small CLI usability pass would have strong value: -f currently only preselects nodes in whiptail, so a true non-interactive selection mode would improve automation without changing the overall project model
The dependency list should be rechecked: README.md still mentions sed, while the current implementation no longer appears to require it after the log-summary rewrite
The sample configuration set should be reviewed for consistency; for example, duplicate or mismatched display names attached to different IPs increase the risk of accidental operations on the wrong host
Shell quality improvements should favor linting, targeted helpers, and incremental refactors before any broad strict-mode changes
Future testing should focus first on parser behavior, command construction, and result reporting rather than trying to build a large end-to-end framework immediately

11 KiB Raw Blame History

Repository analysis

Project overview

Repository structure

Functional behavior

Architecture notes

Strengths

Main issues identified

1. Documentation accuracy problems

2. Shell robustness concerns

3. Remote execution correctness and safety

4. UX and dependency issues

5. Error handling limitations

6. Maintainability limitations

Recommended direction

Short term

Medium term

Long term

Recent changes

Change guidance

Suggested review focus for future changes

Additional recommendations from latest review

11 KiB

Raw Blame History