Python Scripting for Sysadmin Task Automation
Learn to automate repetitive sysadmin tasks with Python scripting, covering file management, backups, and network monitoring.
As a sysadmin, you probably spend more time than you’d like repeating the same tasks: checking disk space, restarting services, or parsing logs. Python scripting turns those manual chores into automated, reliable workflows. It’s not about replacing your shell scripts—it’s about adding logic, error handling, and readability where Bash starts to creak. Here’s how I use Python daily to keep my servers in line.
Automating Log Parsing and Alerts
Logs are the sysadmin’s bread and butter, but grepping through gigabytes of data gets old fast. Python’s re module lets you extract exactly what matters. For example, to find all 5xx errors in an Nginx access log and count them per IP:
import re
from collections import Counter
log_pattern = r'(\d+\.\d+\.\d+\.\d+).*" \d{3} (5\d{2}) '
ip_counter = Counter()
with open('/var/log/nginx/access.log', 'r') as f:
for line in f:
match = re.search(log_pattern, line)
if match:
ip_counter[match.group(1)] += 1
for ip, count in ip_counter.most_common(10):
print(f"{ip}: {count} errors")
Pipe this into a cron job, and you’ll get a daily summary without touching the log files manually. Add smtplib to send an email alert when a threshold is exceeded.
Monitoring Disk Usage with Thresholds
df -h is fine, but it doesn’t tell you when to act. A Python script using shutil can check disk usage and trigger actions only when needed:
import shutil
import smtplib
from email.message import EmailMessage
threshold = 85 # percent
usage = shutil.disk_usage('/')
percent_used = (usage.used / usage.total) * 100
if percent_used > threshold:
msg = EmailMessage()
msg.set_content(f"Warning: Root disk at {percent_used:.1f}%")
msg['Subject'] = 'Disk Alert'
msg['To'] = 'admin@example.com'
msg['From'] = 'monitor@example.com'
with smtplib.SMTP('localhost') as server:
server.send_message(msg)
This is a drop-in replacement for a Nagios check or a cron-based df parser. You can extend it to clean old logs or archive files before the disk fills up.
Bulk User Management Across Servers
Managing users on multiple boxes with SSH and useradd is error-prone. Python’s paramiko library lets you script it cleanly:
import paramiko
servers = ['web01', 'web02', 'db01']
new_user = 'jdoe'
public_key = 'ssh-rsa AAA...'
for host in servers:
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, username='root', key_filename='/home/admin/.ssh/id_rsa')
commands = [
f'useradd -m -s /bin/bash {new_user}',
f'mkdir -p /home/{new_user}/.ssh',
f'echo "{public_key}" > /home/{new_user}/.ssh/authorized_keys',
f'chown -R {new_user}:{new_user} /home/{new_user}/.ssh',
f'chmod 700 /home/{new_user}/.ssh',
f'chmod 600 /home/{new_user}/.ssh/authorized_keys'
]
for cmd in commands:
stdin, stdout, stderr = client.exec_command(cmd)
err = stderr.read().decode()
if err and 'already exists' not in err:
print(f"Error on {host}: {err}")
client.close()
Wrap this in a function, and you can add or remove users across your fleet in seconds. Always use key-based auth and limit the command scope.
Scheduling with Cron and Error Handling
Python scripts are useless if they fail silently. Always wrap your logic in try/except blocks and log to a file:
import logging
logging.basicConfig(
filename='/var/log/python_automation.log',
level=logging.INFO,
format='%(asctime)s %(message)s'
)
try:
# Your automation logic here
logging.info('Task completed successfully')
except Exception as e:
logging.error(f'Task failed: {e}')
Set up a cron job to run your script daily: 0 3 * * * /usr/bin/python3 /opt/scripts/disk_check.py. The log file will tell you exactly what happened, so you don’t wake up to a silent failure.
Conclusion
Python scripting turns repetitive sysadmin tasks into set-and-forget solutions. Start with small wins—log parsing, disk checks, user management—and build up. Your future self (and your weekend) will thank you.