Compare commits
6 commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | f2551e68a7 | | |
| | 435c481720 | | |
| | e7275ac1de | | |
| | ce133c03ee | | |
| | 49eea13117 | | |
| | 1eb23fd0d8 | | |
4 changed files with 309 additions and 128 deletions
146
README.md
@ -1,79 +1,105 @@
|
|||
# LainMonitor
|
||||
|
||||
LainMonitor is a Telegram bot designed to monitor your system by providing real-time information about the system's status, services, and disk usage. It can also check connectivity to a specific Tailscale IP address.
|
||||
LainMonitor is a Telegram bot designed to provide real‑time monitoring of both the local system and remote network clients (OPNsense firewalls and generic SSH hosts). It aggregates key metrics via SSH and REST APIs, and delivers concise reports through Telegram commands.
|
||||
|
||||
## Features
|
||||
- Retrieve system hostname, uptime, and status of essential services such as:
|
||||
- Zerotier
|
||||
- Prosody
|
||||
- PostgreSQL
|
||||
- Tailscale
|
||||
- Check disk usage
|
||||
- Ping a Tailscale IP to verify connectivity
|
||||
- Use via Telegram commands like `/status`, `/ping`, and `/help`
|
||||
|
||||
## Dependencies
|
||||
- [Telebot](https://github.com/eternnoir/pyTelegramBotAPI) - A Python library for Telegram bot API.
|
||||
* **Local System Monitoring**

  * Hostname and overall online/offline status
  * Uptime (human‑readable)
  * Load averages (1, 5, 15 minute)
  * Memory usage (via `free -h`)
  * Disk usage (via `df -h`)

* **Remote Client Monitoring**

  * **OPNsense Firewalls** (multiple hosts with per‑host trust‑on‑first‑use SSL)

    * System health status
    * Uptime
    * Memory and disk statistics
    * Load averages
  * **Generic SSH Hosts**

    * Hostname, uptime, load, memory, and disk via SSH

* **Security & Resilience**

  * Trust‑on‑first‑use SSL: automatically fetches and caches firewall certificates
  * Concurrency: parallel polling of remote hosts with `ThreadPoolExecutor`
  * Error handling: per‑task exceptions are logged and do not interrupt overall data gathering
  * Access control: only whitelisted Telegram chat IDs can invoke commands
  * Automatic bot restart on failure, retrying after a fixed 5‑second delay
|
||||
|
||||
## Commands
|
||||
|
||||
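* `/status`: Shows an inline menu of the configured hosts; selecting a host (or `All`) returns its status, uptime, load averages, memory, and disk usage
* `/ping <IP>`: Pings the given IP address and reports whether it is reachable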
|
||||
|
||||
## Installation
|
||||
1. Clone this repository:
|
||||
```bash
|
||||
git clone https://git.lainlounge.xyz/hornet/lainmonitor.git
|
||||
cd lainmonitor
|
||||
```
|
||||
2. Install the required Python library:
|
||||
```bash
|
||||
pip install pyTelegramBotAPI
|
||||
```
|
||||
3. Replace the placeholder in the code with your Telegram bot token:
|
||||
```python
|
||||
TOKEN = 'PLACE_YOUR_TOKEN_HERE'
|
||||
```
|
||||
|
||||
4. Set up permissions for the bot to check system services (run as a user with `sudo` access).
|
||||
1. **Clone repository**

   ```bash
   git clone https://git.lainlounge.xyz/hornet/lainmonitor.git
   cd lainmonitor
   ```
2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```
3. **Configure**

   * Copy `config.py.example` to `config.py`
   * Populate `config.TOKEN` with your Telegram bot token
   * Add your Telegram chat IDs to `config.ALLOWED_CHATS`
   * Define each host under `config.HOSTS` with correct credentials and API settings
4. **Prepare SSL directory** (optional; the bot creates `certs/` automatically at runtime):

   ```bash
   mkdir certs
   ```
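Each entry in `config.HOSTS` needs different keys depending on its `type`, so a small check at startup can catch typos early. The following is only a sketch: `REQUIRED_KEYS` and `validate_hosts` are hypothetical names that do not exist in LainMonitor, and the key lists are inferred from the sample `config.py` added in this change.

```python
# Hypothetical helper (not part of LainMonitor): sanity-check the HOSTS mapping.
# The required keys per type are an assumption based on the sample config.py.
REQUIRED_KEYS = {
    'opnsense': {'api_url', 'api_key', 'api_secret'},
    'generic': {'ssh_user', 'ssh_pass'},
}


def validate_hosts(hosts):
    """Return a list of human-readable problems; an empty list means the config looks fine."""
    problems = []
    for ip, cfg in hosts.items():
        host_type = cfg.get('type')
        if host_type not in REQUIRED_KEYS:
            problems.append(f"{ip}: unknown type {host_type!r}")
            continue
        # Report every required key that the entry does not define.
        missing = REQUIRED_KEYS[host_type] - set(cfg)
        if missing:
            problems.append(f"{ip}: missing {', '.join(sorted(missing))}")
    return problems
```

Calling `validate_hosts(config.HOSTS)` before starting the bot and logging any returned messages would surface a misconfigured host before the first `/status` request.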
|
||||
|
||||
## Usage
|
||||
|
||||
### Running Directly
|
||||
You can run the bot directly using Python:
|
||||
* **Run directly**
|
||||
|
||||
```bash
|
||||
python3 lainmonitor.py
|
||||
```
|
||||
```bash
|
||||
python3 lainmonitor.py
|
||||
```
|
||||
|
||||
### Running as a Service
|
||||
To run LainMonitor as a service, follow these steps:
|
||||
1. Create a systemd service file:
|
||||
```bash
|
||||
sudo nano /etc/systemd/system/lainmonitor.service
|
||||
```
|
||||
2. Add the following configuration:
|
||||
```ini
|
||||
[Unit]
|
||||
Description=LainMonitor Telegram Bot
|
||||
After=network.target
|
||||
* **Run as a service** (systemd)
|
||||
|
||||
[Service]
|
||||
ExecStart=/usr/bin/python3 /path/to/lainmonitor.py
|
||||
Restart=on-failure
|
||||
```ini
|
||||
[Unit]
|
||||
Description=LainMonitor Telegram Bot
|
||||
After=network.target
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
3. Enable and start the service:
|
||||
```bash
|
||||
sudo systemctl enable lainmonitor
|
||||
sudo systemctl start lainmonitor
|
||||
```
|
||||
[Service]
|
||||
ExecStart=/usr/bin/python3 /path/to/lainmonitor.py
|
||||
Restart=on-failure
|
||||
|
||||
## Telegram Bot Commands
|
||||
- `/start`: Initialize the bot and receive a welcome message.
|
||||
- `/help`: Display available commands.
|
||||
- `/status`: Get the system hostname, status, uptime, and the status of monitored services.
|
||||
- `/ping`: Ping a Tailscale IP and return the connectivity status.
|
||||
- `/reboot`: (Work in progress) Placeholder for a reboot command.
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo systemctl enable lainmonitor
|
||||
sudo systemctl start lainmonitor
|
||||
```
|
||||
|
||||
## Dependencies
|
||||
|
||||
* `pyTelegramBotAPI` (Telebot) — Telegram Bot API client
|
||||
* `paramiko` — SSH connectivity
|
||||
* `requests` — HTTP/REST API client
|
||||
|
||||
## Author
|
||||
Created by **hornetmaidan**
|
||||
|
||||
Feel free to contribute or suggest features!
|
||||
**h@x**
|
||||
|
||||
## Original Script written by:
|
||||
**hornetmaidan**
|
||||
|
||||
Contributions and feedback are welcome! :-)
|
||||
|
|
|
|||
40
config.py
Normal file
@ -0,0 +1,40 @@
|
|||
# Configuration for Lainmonitor

# Telegram bot token
TOKEN = 'PLACE_YOUR_TOKEN_HERE'

# Allowed Telegram chat IDs (whitelist)
ALLOWED_CHATS = [123456789, 987654321]

# Per-host configuration
HOSTS = {
    '10.0.0.1': {
        'type': 'opnsense',
        'api_url': 'https://10.0.0.1/api',
        'api_key': 'OPN_KEY_1',
        'api_secret': 'OPN_SECRET_1'
    },
    '10.128.0.1': {
        'type': 'opnsense',
        'api_url': 'https://10.128.0.1/api',
        'api_key': 'OPN_KEY_2',
        'api_secret': 'OPN_SECRET_2'
    },
    '10.144.0.1': {
        'type': 'opnsense',
        'api_url': 'https://10.144.0.1/api',
        'api_key': 'OPN_KEY_3',
        'api_secret': 'OPN_SECRET_3'
    },
    '10.130.1.1': {
        'type': 'opnsense',
        'api_url': 'https://10.130.1.1/api',
        'api_key': 'OPN_KEY_4',
        'api_secret': 'OPN_SECRET_4'
    },
    '10.177.0.100': {
        'type': 'generic',
        'ssh_user': 'SSH_USER_100',
        'ssh_pass': 'SSH_PASS_100'
    }
}
|
||||
249
lainmonitor.py
@ -1,75 +1,188 @@
|
|||
#description: telegram bot for monitoring the system
|
||||
#dependencies: telebot
|
||||
#usage: python3 lainmonitor.py | or run it as a service
|
||||
#author: hornetmaidan
|
||||
#!/usr/bin/env python3

# --------------------------------------------------------------------------
# Description: A Telegram bot for monitoring critical infrastructure services
# Dependencies: telebot, paramiko, requests
# Usage: python3 lainmonitor.py | or run it as a service
# Author: h@x
# Version: 2.1.0
# --------------------------------------------------------------------------
|
||||
|
||||
import subprocess
|
||||
import telebot
|
||||
#define the variables
|
||||
status, hostname, uptime, zerotier, prosody, postgres, tailscale, disk, ping = 'unknown', 'unknown', 'unknown', 'unknown', 'unknown', 'unknown', 'unknown', 'unknown', 'unknown'
|
||||
#telegram bot token
|
||||
TOKEN = 'PLACE_YOUR_TOKEN_HERE'
|
||||
import paramiko
|
||||
import requests
|
||||
import time
|
||||
import socket
|
||||
import logging
|
||||
import ssl
|
||||
import os
|
||||
from concurrent.futures import ThreadPoolExecutor, as_completed
|
||||
from telebot import types
|
||||
import config
|
||||
|
||||
#bot init
|
||||
bot = telebot.TeleBot(TOKEN)
|
||||
# Configure logging
|
||||
tlogging_format = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
|
||||
logging.basicConfig(level=logging.INFO, format=tlogging_format)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
#get system info
|
||||
def getinfo():
|
||||
global status, hostname, uptime, zerotier, prosody, postgres, tailscale, disk
|
||||
hostname = subprocess.check_output(['hostname']).decode().strip()
|
||||
uptime = subprocess.check_output(['uptime', '-p']).decode().strip()
|
||||
#systemd-only services
|
||||
zerotier = subprocess.Popen("sudo systemctl status zerotier-one | grep 'Active'", shell=True, stdout=subprocess.PIPE).stdout.read().decode().strip()
|
||||
prosody = subprocess.Popen("sudo systemctl status prosody | grep 'Active'", shell=True, stdout=subprocess.PIPE).stdout.read().decode().strip()
|
||||
postgres = subprocess.Popen("sudo systemctl status postgresql | grep 'Active'", shell=True, stdout=subprocess.PIPE).stdout.read().decode().strip()
|
||||
tailscale = subprocess.Popen("sudo systemctl status tailscaled | grep 'Active'", shell=True, stdout=subprocess.PIPE).stdout.read().decode().strip()
|
||||
disk = subprocess.check_output(['df', '-h']).decode().strip()
|
||||
if hostname == 'unknown':
|
||||
status = 'offline'
|
||||
# Ensure certificate directory exists
|
||||
CERT_DIR = os.path.join(os.path.dirname(__file__), 'certs')
|
||||
if not os.path.isdir(CERT_DIR):
|
||||
os.makedirs(CERT_DIR, exist_ok=True)
|
||||
|
||||
bot = telebot.TeleBot(config.TOKEN)
|
||||
ALLOWED_CHATS = set(config.ALLOWED_CHATS)
|
||||
|
||||
# Utility for command execution with timeout
def run_cmd(cmd, timeout=5):
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return result.stdout.strip()
    except subprocess.TimeoutExpired as e:
        logger.warning(f"Command {cmd} timed out: {e}")
        return 'timeout'
    except OSError as e:
        logger.error(f"OS error running {cmd}: {e}")
        return 'error'
|
||||
|
||||
# Local system info
def get_local_info():
    hostname = run_cmd(['hostname'])
    uptime = run_cmd(['uptime', '-p'])
    load_line = run_cmd(['uptime'])
    load_avg = load_line.split('load average:')[-1].strip() if 'load average:' in load_line else 'unknown'
    memory = run_cmd(['free', '-h'])
    disk = run_cmd(['df', '-h'])
    status = 'online' if hostname not in ('', 'error', 'timeout') else 'offline'
    return {
        'hostname': hostname,
        'uptime': uptime,
        'load_avg': load_avg,
        'memory': memory,
        'disk': disk,
        'status': status,
    }
|
||||
|
||||
# Fetch and store SSL certificate once
def fetch_certificate(host, port):
    cert_path = os.path.join(CERT_DIR, f"{host}.pem")
    if os.path.isfile(cert_path):
        return cert_path
    try:
        cert = ssl.get_server_certificate((host, port))
        with open(cert_path, 'w') as f:
            f.write(cert)
        logger.info(f"Saved certificate for {host} to {cert_path}")
        return cert_path
    except Exception as e:
        logger.error(f"Failed to fetch certificate for {host}: {e}")
        # Returning True makes requests fall back to its default CA verification
        return True
|
||||
|
||||
# SSH-based info gathering
def get_ssh_info(ip, cfg):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(ip, username=cfg['ssh_user'], password=cfg['ssh_pass'], timeout=5)
        info = {}
        cmds = {'hostname': 'hostname', 'uptime': 'uptime -p', 'load_avg': 'uptime', 'memory': 'free -h', 'disk': 'df -h'}
        for key, cmd in cmds.items():
            try:
                stdin, stdout, stderr = client.exec_command(cmd, timeout=5)
                out = stdout.read().decode().strip()
                if key == 'load_avg' and 'load average:' in out:
                    out = out.split('load average:')[-1].strip()
                info[key] = out
            except (socket.timeout, paramiko.SSHException) as e:
                logger.error(f"SSH command {cmd} on {ip} failed: {e}")
                info[key] = 'error'
        info['status'] = 'online'
    except (paramiko.AuthenticationException, paramiko.SSHException, socket.timeout) as e:
        logger.error(f"SSH connection to {ip} failed: {e}")
        info = {'status': 'unreachable'}
    finally:
        try:
            client.close()
        except Exception as e:
            logger.warning(f"Error closing SSH to {ip}: {e}")
    return ip, info
|
||||
|
||||
# OPNsense API-based info gathering
def get_opnsense_info(ip, cfg):
    url = cfg['api_url']
    netloc = url.split('//')[1].split('/')[0]
    host = netloc.split(':')[0]
    port = int(netloc.split(':')[1]) if ':' in netloc else 443
    verify = fetch_certificate(host, port)
    try:
        resp = requests.get(f"{url}/core/get/health", auth=(cfg['api_key'], cfg['api_secret']), verify=verify, timeout=5)
        resp.raise_for_status()
        data = resp.json().get('health', {})
        return ip, {
            'status': data.get('health', 'unknown'),
            'uptime': data.get('uptime', 'unknown'),
            'memory': f"{data.get('mem_used', '?')}MB/{data.get('mem_total', '?')}MB",
            'load_avg': data.get('load_avg', 'unknown'),
            'disk': f"{data.get('disk_used', '?')}%/{data.get('disk_total', '?')}%",
        }
    except requests.RequestException as e:
        logger.error(f"OPNsense API call for {ip} failed: {e}")
        return ip, {'status': 'unreachable'}
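The host and port are pulled out of `api_url` by hand with chained `split` calls. As an alternative, the standard library's `urllib.parse.urlsplit` can do the same job and also copes with bracketed IPv6 literals. This is a sketch only; `split_host_port` is a hypothetical helper name, not something this change defines.

```python
from urllib.parse import urlsplit


def split_host_port(url, default_port=443):
    """Return (host, port) for an API URL, using default_port when none is given."""
    parts = urlsplit(url)
    # parts.port is None when the URL carries no explicit port.
    return parts.hostname, parts.port or default_port
```

With such a helper, `get_opnsense_info` could call `host, port = split_host_port(cfg['api_url'])` before `fetch_certificate(host, port)`.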
|
||||
|
||||
# Gather info for given host or all hosts
def gather_host(ip=None):
    if ip and ip in config.HOSTS:
        cfg = config.HOSTS[ip]
        return [get_ssh_info(ip, cfg) if cfg['type'] == 'generic' else get_opnsense_info(ip, cfg)]
    # all hosts
    return gather_clients()
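`gather_clients()` is called here, but its definition does not appear in this diff, even though `ThreadPoolExecutor` and `as_completed` are imported. The sketch below shows one way such a function could poll all hosts in parallel. It is an assumption, not the author's implementation: to stay runnable on its own it takes the host mapping and a `fetchers` mapping (host type to a callable returning `(ip, info)`) as explicit, hypothetical parameters, whereas the call above passes no arguments.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def gather_clients(hosts, fetchers, max_workers=8):
    """Poll every host in parallel and return a dict of {ip: info}.

    hosts    -- mapping shaped like config.HOSTS
    fetchers -- mapping from host type to a callable(ip, cfg) -> (ip, info)
    """
    results = {}
    if not hosts:
        return results
    with ThreadPoolExecutor(max_workers=min(max_workers, len(hosts))) as pool:
        # Remember which future belongs to which host so failures can be attributed.
        futures = {
            pool.submit(fetchers[cfg['type']], ip, cfg): ip
            for ip, cfg in hosts.items()
        }
        for future in as_completed(futures):
            ip = futures[future]
            try:
                _, info = future.result()
            except Exception as exc:
                # A failing host is recorded and does not abort the other polls.
                info = {'status': 'error', 'detail': str(exc)}
            results[ip] = info
    return results
```

In the bot this might be invoked as `gather_clients(config.HOSTS, {'generic': get_ssh_info, 'opnsense': get_opnsense_info})`. Returning a dict keeps it compatible with callers that iterate over `.items()` or wrap the result in `dict(...)`.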
|
||||
|
||||
# Ping utility
def ping_ip(ip):
    res = run_cmd(['ping', '-c', '1', ip], timeout=3)
    if '1 packets transmitted, 1 received' in res or '1 packets transmitted, 1 packets received' in res:
        return 'reachable'
    if res in ('timeout', 'error'):
        return res
    return 'unreachable'
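`ping_ip` passes its argument to `ping` as a list element, so no shell is involved, but a value such as `-f` would still be parsed by `ping` as an option. One possible guard, sketched here with the standard `ipaddress` module, is to accept only literal IPv4 or IPv6 addresses before running the command. `is_valid_ip` is a hypothetical helper name and is not part of this change.

```python
import ipaddress


def is_valid_ip(text):
    """Return True only if text is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True
```

A caller could reply with the usage message whenever `is_valid_ip(ip)` is false and call `ping_ip(ip)` otherwise. Note that this also rejects hostnames, which may or may not be desirable.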
|
||||
|
||||
# Access control decorator
def restricted(func):
    def wrapper(msg, *args, **kwargs):
        # Messages expose .chat directly; callback queries carry it on .message
        chat = getattr(msg, 'chat', None) or msg.message.chat
        if chat.id not in ALLOWED_CHATS:
            bot.send_message(chat.id, 'Unauthorized access')
            return
        return func(msg, *args, **kwargs)
    return wrapper
|
||||
|
||||
# /status: show menu of available hosts
@bot.message_handler(commands=['status'])
@restricted
def handle_status(msg):
    keyboard = types.InlineKeyboardMarkup()
    for ip in config.HOSTS.keys():
        keyboard.add(types.InlineKeyboardButton(ip, callback_data=f'status:{ip}'))
    keyboard.add(types.InlineKeyboardButton('All', callback_data='status:all'))
    bot.send_message(msg.chat.id, 'Select host for status:', reply_markup=keyboard)
|
||||
|
||||
# Callback handler for inline menu
|
||||
@bot.callback_query_handler(func=lambda c: c.data.startswith('status:'))
|
||||
@restricted
|
||||
def callback_status(call):
|
||||
_, key = call.data.split(':', 1)
|
||||
if key == 'all':
|
||||
entries = dict(gather_clients())
|
||||
else:
|
||||
status = 'online'
|
||||
return hostname, uptime, zerotier, prosody, postgres, tailscale, disk
|
||||
entries = dict(gather_host(key))
|
||||
lines = []
|
||||
for ip, info in entries.items():
|
||||
lines.append(f"{ip}: {info.get('status','unknown')}")
|
||||
if info.get('status')=='online':
|
||||
for field in ('uptime','load_avg','memory','disk'):
|
||||
lines.append(f" {field}: {info.get(field,'-')}")
|
||||
bot.send_message(call.message.chat.id, '\n'.join(lines))
|
||||
|
||||
#ping tailscale (change the IP address to the one you want or add more)
|
||||
def check_tailscale():
|
||||
global ping
|
||||
ping = subprocess.Popen("ping TAILSCALE_IP -c 1 | grep '1 packets'", shell=True, stdout=subprocess.PIPE).stdout.read().decode().strip()
|
||||
if '1 received' in ping:
|
||||
ping = 'connected'
|
||||
else:
|
||||
ping = 'unreachable'
|
||||
return ping
|
||||
# /ping <IP>
@bot.message_handler(func=lambda m: m.text and m.text.startswith('/ping'))
@restricted
def handle_ping(msg):
    parts = msg.text.split()
    if len(parts) != 2:
        bot.reply_to(msg, 'Usage: /ping <IP>')
        return
    ip = parts[1]
    status = ping_ip(ip)
    bot.reply_to(msg, f"Ping {ip}: {status}")
|
||||
|
||||
#debug handler
|
||||
def check():
|
||||
global status, hostname, uptime, zerotier, prosody, postgres, tailscale, disk
|
||||
getinfo()
|
||||
print('system status:', status)
|
||||
print('hostname:', hostname)
|
||||
print('uptime:', uptime)
|
||||
print('zerotier:', zerotier)
|
||||
print('prosody:', prosody)
|
||||
print('postgres:', postgres)
|
||||
print('tailscale:', tailscale)
|
||||
print('disk:', disk)
|
||||
return status, hostname, uptime, zerotier, prosody, postgres, tailscale, disk
|
||||
|
||||
#message handling
|
||||
@bot.message_handler(commands=['start', 'help', 'status', 'reboot', 'ping'])
|
||||
def handle(message):
|
||||
if message.text == '/start':
|
||||
bot.reply_to(message, 'lainmonitor v1.0 --- standing by...')
|
||||
elif message.text == '/help':
|
||||
bot.reply_to(message, 'commands: /start, /help, /status, /reboot, /ping')
|
||||
elif message.text == '/status':
|
||||
check()
|
||||
status_message = f'hostname: {hostname}\nsystem status: {status}\nuptime: {uptime}\nzerotier: {zerotier}\nprosody: {prosody}\npostgres: {postgres}\ntailscale: {tailscale}'
|
||||
bot.reply_to(message, status_message)
|
||||
bot.reply_to(message, f'filesystem info for {hostname}: \n\n{disk}')
|
||||
elif message.text == '/reboot':
|
||||
bot.reply_to(message, 'work in progress...')
|
||||
elif message.text == '/ping':
|
||||
check_tailscale()
|
||||
bot.reply_to(message, f'ping status: {ping}')
|
||||
|
||||
#polling
|
||||
bot.polling()
|
||||
# Run polling with retry
while True:
    try:
        bot.polling()
    except Exception as e:
        logger.error(f"Polling error: {e}")
        time.sleep(5)
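The retry loop above waits a fixed 5 seconds after every failure. If a growing delay is preferred, an exponential backoff wrapper along the following lines could be used. This is a sketch under assumptions: `run_with_backoff` and its parameters are hypothetical names, and unlike the loop above it returns once the task finishes without an exception.

```python
import time


def run_with_backoff(task, base=1.0, cap=60.0, max_attempts=None, sleep=time.sleep):
    """Call task() until it returns normally, doubling the pause after each failure.

    The pause starts at `base` seconds and never exceeds `cap`. When
    `max_attempts` is set, the last exception is re-raised after that many tries.
    """
    delay = base
    attempt = 0
    while True:
        attempt += 1
        try:
            return task()
        except Exception:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, cap)
```

For the bot, `run_with_backoff(bot.polling, base=5, cap=300)` would be one way to apply it. The `sleep` parameter exists only so the waiting can be replaced in tests.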
|
||||
|
|
|
|||
|
|
@ -1 +1,3 @@
|
|||
telebot
|
||||
paramiko
|
||||
requests
|
||||