Automating Daily Kubernetes Troubleshooting with Nested Scripts
Kubernetes troubleshooting can be time-consuming, especially when the same issues keep recurring. Instead of manually running the same kubectl commands every day, you can automate the process with nested scripts: one master script that calls a set of smaller, single-purpose check scripts. This post walks you through setting up that workflow.
Why Automate Kubernetes Troubleshooting?
- Saves Time: Automating frequent checks helps reduce repetitive tasks.
- Ensures Consistency: Standardized scripts ensure that every troubleshooting step is performed correctly.
- Reduces Human Error: Automating log collection and resource monitoring minimizes missed issues.
- Faster Issue Resolution: Automated scripts provide instant insights into cluster health.
Setting Up the Automation
1. Create a Master Script (troubleshoot.sh)
This script serves as the entry point and executes all necessary checks.
#!/bin/bash
echo "Starting Kubernetes Troubleshooting..."
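# Resolve the directory this script lives in, so the nested scripts are
# found even when cron starts it from a different working directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Load environment variables if needed (cron starts with a minimal environment)
source ~/.bashrc
# Run nested scripts
"$SCRIPT_DIR/check_pods.sh"
"$SCRIPT_DIR/check_logs.sh"
"$SCRIPT_DIR/check_resources.sh"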
echo "Troubleshooting completed!"
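The master script above stops reporting nothing special when one of the nested checks fails. As a minimal sketch, assuming you want every check to run and a summary at the end, the calls could be wrapped in a small helper. The function name `run_checks` is hypothetical and not part of the original scripts.

```shell
#!/bin/bash
# Sketch: run each check script in turn and keep going when one fails.
# run_checks is a hypothetical helper name used for illustration only.
run_checks() {
  local failed=0 script
  for script in "$@"; do
    echo "--- Running $script ---"
    if ! "$script"; then
      echo "WARNING: $script exited with a non-zero status" >&2
      failed=$((failed + 1))
    fi
  done
  echo "Checks failed: $failed of $#"
  # A non-zero return tells cron or a calling script that something went wrong.
  [ "$failed" -eq 0 ]
}
```

In troubleshoot.sh this could be called with the paths of the three check scripts, for example `run_checks ./check_pods.sh ./check_logs.sh ./check_resources.sh`.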
2. Checking Pods (check_pods.sh)
This script lists pods in error states and fetches relevant logs.
#!/bin/bash
echo "Checking for pods in error state..."
kubectl get pods --all-namespaces | grep -E 'CrashLoopBackOff|Error|Evicted'
echo "Fetching details for problematic pods..."
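# Emit "namespace name" pairs so each pod is looked up in its own namespace;
# Succeeded pods are skipped because completed jobs are not failures
kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' |
while read -r ns pod; do
  echo "=== Logs for $pod in namespace $ns ==="
  kubectl logs -n "$ns" "$pod" --tail=50
done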
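One caveat with selecting on status.phase: a crash-looping pod typically still reports phase Running, so the loop can skip exactly the pods the grep line shows. A minimal sketch of an alternative, assuming the default column layout of `kubectl get pods --all-namespaces`, filters on the STATUS column instead. The helper name `list_unhealthy_pods` is hypothetical.

```shell
#!/bin/bash
# Sketch: read `kubectl get pods --all-namespaces --no-headers` output on stdin
# and print "namespace pod status" for pods whose STATUS column looks unhealthy.
# list_unhealthy_pods is a hypothetical helper name.
list_unhealthy_pods() {
  # Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE
  awk '$4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Possible usage inside check_pods.sh:
#   kubectl get pods --all-namespaces --no-headers | list_unhealthy_pods |
#   while read -r ns pod status; do
#     echo "=== $pod ($status) in namespace $ns ==="
#     kubectl logs -n "$ns" "$pod" --tail=50
#   done
```

This treats anything other than Running or Completed as worth a look, which is a deliberately broad assumption; pods that are Running but not ready would need an extra check on the READY column.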
3. Checking Logs (check_logs.sh)
This script gathers logs for failing pods in a specific namespace.
#!/bin/bash
NAMESPACE="default" # Change this to your target namespace
echo "Fetching logs for failing pods in namespace $NAMESPACE..."
# Skip Succeeded pods so completed jobs are not reported as failing
for pod in $(kubectl get pods -n "$NAMESPACE" --field-selector=status.phase!=Running,status.phase!=Succeeded -o jsonpath='{.items[*].metadata.name}'); do
  echo "Logs for pod: $pod"
  kubectl logs -n "$NAMESPACE" "$pod" --tail=100
  echo "-----------------------------------"
done
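When a container has crashed and restarted, the most useful output is often from the instance that died rather than the one currently starting. A minimal sketch, assuming that is what you want, tries `kubectl logs --previous` first and falls back to the current logs. The helper name `fetch_pod_logs` and its default of 100 lines are illustrative choices.

```shell
#!/bin/bash
# Sketch: prefer logs from the previous (crashed) container instance and fall
# back to the current one. fetch_pod_logs is a hypothetical helper name.
fetch_pod_logs() {
  local ns="$1" pod="$2" lines="${3:-100}"
  # --previous only succeeds when an earlier container instance exists.
  kubectl logs -n "$ns" "$pod" --tail="$lines" --previous 2>/dev/null ||
    kubectl logs -n "$ns" "$pod" --tail="$lines"
}
```

Inside the loop this would replace the direct kubectl logs call, for example `fetch_pod_logs "$NAMESPACE" "$pod"`. Setting `NAMESPACE="${1:-default}"` at the top of check_logs.sh would also let you pass the namespace as an argument instead of editing the file.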
4. Checking Resource Usage (check_resources.sh)
Monitor CPU and memory usage across nodes and pods. Note that kubectl top only works when the Metrics Server is installed in the cluster.
#!/bin/bash
echo "Checking resource usage..."
kubectl top nodes
kubectl top pods --all-namespaces
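The raw kubectl top output still has to be read by a person. As a rough sketch, assuming the usual `kubectl top nodes` column layout, the output can be filtered so that only nodes at or above a usage threshold are printed. The helper name `flag_busy_nodes` and the default threshold of 80 percent are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: read `kubectl top nodes --no-headers` output on stdin and print nodes
# whose CPU% or MEMORY% is at or above a threshold (default 80, an assumption).
# flag_busy_nodes is a hypothetical helper name.
flag_busy_nodes() {
  local threshold="${1:-80}"
  # Assumed columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
  awk -v t="$threshold" '{
    cpu = $3; mem = $5
    sub(/%/, "", cpu); sub(/%/, "", mem)
    # Skip rows without numeric metrics (for example "<unknown>").
    if (cpu !~ /^[0-9]+$/ || mem !~ /^[0-9]+$/) next
    if (cpu + 0 >= t || mem + 0 >= t) print $1, "cpu=" cpu "%", "mem=" mem "%"
  }'
}
```

In check_resources.sh it could be used as `kubectl top nodes --no-headers | flag_busy_nodes 80`, so the daily log highlights only the nodes that need attention.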
Making Scripts Executable
Before running the scripts, grant execution permission:
chmod +x troubleshoot.sh check_pods.sh check_logs.sh check_resources.sh
Automating with Cron Jobs
To schedule the troubleshooting script to run daily, add a cron job:
crontab -e
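Add the following line to execute the script every day at 8 AM. Replace /path/to with the script's actual location, and make sure the user that owns the crontab can write to the log file: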
0 8 * * * /path/to/troubleshoot.sh >> /var/log/k8s_troubleshoot.log 2>&1
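Appending to a single file means the log grows without bound and runs from different days blur together. One possible variation, sketched here under the assumption that a per-day file is acceptable, moves the redirection into a small wrapper. The function name `run_with_dated_log` and the file-name pattern are hypothetical.

```shell
#!/bin/bash
# Sketch: run a command and append its output to a per-day log file.
# run_with_dated_log and the file-name pattern are hypothetical choices.
run_with_dated_log() {
  local log_dir="$1"; shift
  local log_file="$log_dir/k8s_troubleshoot_$(date +%F).log"
  mkdir -p "$log_dir" || return 1
  {
    echo "=== Run started at $(date '+%F %T') ==="
    "$@"
  } >> "$log_file" 2>&1
}
```

Keeping the date formatting inside a script rather than in the crontab line also avoids having to escape the % character, which cron treats specially.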
Conclusion
By leveraging nested scripts for Kubernetes troubleshooting, you can:
- Reduce the manual effort required for daily checks.
- Ensure consistent monitoring of cluster health.
- Detect and resolve issues faster.
This approach not only enhances efficiency but also improves overall reliability in managing Kubernetes clusters. 🚀 Happy Automating!