Compare commits

6 commits
main ... main

Author SHA1 Message Date
hax
f2551e68a7 Add: Paramiko and requests added as requirement
Paramiko and requests have been added as installation requirements

Signed-off-by: hax <hax@lainlounge.org>
2025-07-22 09:56:15 +00:00
hax
435c481720 Add: /ping and /status separated
- /ping can now be used individually to check against any IP address.
- /status now brings up an inline keyboard where you can request either a general status or a per-machine status

Signed-off-by: hax <hax@lainlounge.org>
2025-07-22 09:53:21 +00:00
hax
e7275ac1de README.md updated
Signed-off-by: hax <hax@lainlounge.org>
2025-07-22 09:15:06 +00:00
h@x
ce133c03ee Merge: 'Rewritten from scratch' from Refactor2.0 into main
Reviewed-on: hax/lainmonitor#1
2025-07-22 09:12:38 +00:00
hax
49eea13117 Add config.py
Outsourced hardcoded credentials into a single config file.

Signed-off-by: hax <hax@lainlounge.org>
2025-07-22 09:10:44 +00:00
hax
1eb23fd0d8 Rewritten from scratch
- Removed Prosody, Tailscale, Zerotier, Postgresql checks
- Add checks for OPNSense and Proxmox via SSH
- Add SSL verification for trusted clients

Signed-off-by: hax <hax@lainlounge.org>
2025-07-22 09:08:38 +00:00
7 changed files with 291 additions and 257 deletions

@ -1,2 +0,0 @@
AUTHORIZED_USER_ID_1
AUTHORIZED_USER_ID_2

1
.env
@ -1 +0,0 @@
YOUR_TOKEN_HERE

2
.gitignore vendored
@ -1,2 +0,0 @@
test.py
venv/

130
README.md
@ -1,71 +1,77 @@
# lainmonitor # LainMonitor
LainMonitor is a Telegram bot designed to monitor your system, providing real-time updates on the system's status, essential services, and disk usage. It can also verify connectivity to a specific Tailscale IP address. LainMonitor is a Telegram bot designed to provide real-time monitoring of both the local system and remote network clients (OPNsense firewalls and generic SSH hosts). It aggregates key metrics via SSH and REST APIs, and delivers concise reports through Telegram commands.
Current version: v1.2
### Key Features: ## Features
Retrieve system information: * **Local System Monitoring**
Hostname
Uptime
Status of critical services:
Zerotier
Prosody
PostgreSQL
Tailscale
nginx
Check disk usage
Ping a Tailscale IP for connectivity verification
Restart critical services
Reboot the host
Accessible via Telegram commands
### Prerequisites: * Hostname and overall online/offline status
* Uptime (human-readable)
* Load averages (1, 5, 15 minute)
* Memory usage (via `free -h`)
* Disk usage (via `df -h`)
Python 3 * **Remote Client Monitoring**
Telebot — Python library for interacting with the Telegram bot API.
### Installation Guide: * **OPNsense Firewalls** (multiple hosts with per-host trust-on-first-use SSL)
Clone the repository: * System health status
* Uptime
* Memory and disk statistics
* Load averages
* **Generic SSH Hosts**
* Hostname, uptime, load, memory, and disk via SSH
* **Security & Resilience**
* Trust-on-first-use SSL: automatically fetches and caches firewall certificates
* Concurrency: parallel polling of remote hosts with `ThreadPoolExecutor`
* Error handling: per-task exceptions are logged and do not interrupt overall data gathering
* Access control: only whitelisted Telegram chat IDs can invoke commands
* Automatic bot restart on failure with backoff retry loop
## Commands
* `/status` or `/ping` — Returns a combined report of local and remote metrics
## Installation
1. **Clone repository**
```bash
git clone https://git.lainlounge.xyz/hornet/lainmonitor.git git clone https://git.lainlounge.xyz/hornet/lainmonitor.git
cd lainmonitor cd lainmonitor
```
2. **Install dependencies**
RECOMMENDED: Create a virtual environment for python with: ```bash
``` pip install -r requirements.txt
python3 -m venv venv ```
source venv/bin/activate 3. **Configure**
```
Install dependencies:
``` * Copy `config.py.example` to `config.py`
pip3 install -r requirements.txt * Populate `config.TOKEN` with your Telegram bot token
``` * Add your Telegram chat IDs to `config.ALLOWED_CHATS`
* Define each host under `config.HOSTS` with correct credentials and API settings
4. **Prepare SSL directory** (created automatically at runtime):
Configure your bot token: Open the .env file and replace the placeholder with your Telegram bot token. ```bash
mkdir certs
```
Configure authorized users: Open the .authorized_users file and replace the placeholders with Telegram user ID(s). ## Usage
Set up service access: Ensure the bot can check system services by running it with sudo or appropriate permissions. * **Run directly**
### Usage:
#### Running the Bot Manually:
You can run LainMonitor directly from the command line:
```bash
python3 lainmonitor.py python3 lainmonitor.py
```
#### Running as a Systemd Service: * **Run as a service** (systemd)
To run the bot as a systemd service, follow these steps:
Create a service file:
sudo nano /etc/systemd/system/lainmonitor.service
Add the following configuration:
```ini
[Unit] [Unit]
Description=LainMonitor Telegram Bot Description=LainMonitor Telegram Bot
After=network.target After=network.target
@ -76,24 +82,24 @@ Add the following configuration:
[Install] [Install]
WantedBy=multi-user.target WantedBy=multi-user.target
```
Enable and start the service: ```bash
sudo systemctl enable lainmonitor sudo systemctl enable lainmonitor
sudo systemctl start lainmonitor sudo systemctl start lainmonitor
```
### Available Commands: ## Dependencies
/start — Initialize the bot and receive a welcome message. * `pyTelegramBotAPI` (Telebot) — Telegram Bot API client
/help — Display a list of available commands. * `paramiko` — SSH connectivity
/status — Retrieve system hostname, uptime, and status of monitored services. * `requests` — HTTP/REST API client
/ping — Ping a Tailscale IP and return connectivity status.
/restart hostname- Restart a specific service on a specified machine.
/reboot hostname — Placeholder for a system reboot command.
### Contributions: ## Author
Created by hornetmaidan. **h@x**
With Contributions from h@x.
Any new features and suggestions are welcome! ## Original Script written by:
**hornetmaidan**
Contributions and feedback are welcome! :-)

40
config.py Normal file
@ -0,0 +1,40 @@
# Configuration for Lainmonitor
# Telegram bot token
TOKEN = 'PLACE_YOUR_TOKEN_HERE'
# Allowed Telegram chat IDs (whitelist)
ALLOWED_CHATS = [123456789, 987654321]
# Per-host configuration
HOSTS = {
'10.0.0.1': {
'type': 'opnsense',
'api_url': 'https://10.0.0.1/api',
'api_key': 'OPN_KEY_1',
'api_secret': 'OPN_SECRET_1'
},
'10.128.0.1': {
'type': 'opnsense',
'api_url': 'https://10.128.0.1/api',
'api_key': 'OPN_KEY_2',
'api_secret': 'OPN_SECRET_2'
},
'10.144.0.1': {
'type': 'opnsense',
'api_url': 'https://10.144.0.1/api',
'api_key': 'OPN_KEY_3',
'api_secret': 'OPN_SECRET_3'
},
'10.130.1.1': {
'type': 'opnsense',
'api_url': 'https://10.130.1.1/api',
'api_key': 'OPN_KEY_4',
'api_secret': 'OPN_SECRET_4'
},
'10.177.0.100': {
'type': 'generic',
'ssh_user': 'SSH_USER_100',
'ssh_pass': 'SSH_PASS_100'
}
}
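
Review note on `config.py`: each `HOSTS` entry needs a different set of keys depending on its `type` (`api_url`, `api_key`, `api_secret` for `opnsense`; `ssh_user`, `ssh_pass` for `generic`). A minimal sketch of a startup check follows. The helper name `validate_hosts` and the `REQUIRED_KEYS` table are assumptions for illustration and are not part of the commits shown here.

```python
# Hypothetical helper, not part of this change set.
# Required keys per host type, taken from the config.py entries above.
REQUIRED_KEYS = {
    'opnsense': ('api_url', 'api_key', 'api_secret'),
    'generic': ('ssh_user', 'ssh_pass'),
}


def validate_hosts(hosts):
    """Return a list of human-readable problems found in a HOSTS mapping."""
    problems = []
    for ip, cfg in hosts.items():
        host_type = cfg.get('type')
        if host_type not in REQUIRED_KEYS:
            problems.append(f"{ip}: unknown type {host_type!r}")
            continue
        for key in REQUIRED_KEYS[host_type]:
            if not cfg.get(key):  # missing or empty value
                problems.append(f"{ip}: missing {key}")
    return problems
```

Calling `validate_hosts(config.HOSTS)` before the bot starts polling and logging any returned problems would surface a misconfigured host early, instead of as a `KeyError` inside a handler.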

@ -1,197 +1,188 @@
# --/usr/bin/env python3 -- # #!/usr/bin/env python3
# description: telegram bot for monitoring the system
# dependencies: telebot # --------------------------------------------------------------------------
# usage: python3 lainmonitor.py | or run it as a service # Description: A Telegram bot for monitoring critical infrastructure services
# author: hornetmaidan # Dependencies: telebot
# contributors: h@x # Usage: python3 lainmonitor.py | or run it as a service
# version: 1.2 # Author: h@x
import os # Version: 2.1.0
# --------------------------------------------------------------------------
import subprocess import subprocess
import threading
import queue
from time import sleep
import telebot import telebot
import paramiko
import requests
import time
import socket
import logging import logging
import ssl
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from telebot import types
import config
# Setup logging # Configure logging
logging.basicConfig(filename='lainmonitor.log', level=logging.INFO, tlogging_format = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
format='%(asctime)s - %(levelname)s - %(message)s') logging.basicConfig(level=logging.INFO, format=tlogging_format)
logger = logging.getLogger(__name__)
# Load environment variables and config files securely # Ensure certificate directory exists
script_dir = os.path.dirname(os.path.realpath(__file__)) CERT_DIR = os.path.join(os.path.dirname(__file__), 'certs')
env_path = os.path.join(script_dir, '.env') if not os.path.isdir(CERT_DIR):
auth_users_path = os.path.join(script_dir, '.authorized_users') os.makedirs(CERT_DIR, exist_ok=True)
# Load the token bot = telebot.TeleBot(config.TOKEN)
try: ALLOWED_CHATS = set(config.ALLOWED_CHATS)
with open(env_path, 'r') as f:
token = f.read().strip()
except FileNotFoundError:
logging.error('Token file not found. Exiting...')
exit(1)
# Load the authorized users # Utility for command execution with timeout
try: def run_cmd(cmd, timeout=5):
authorized_users = [str(line.strip()) for line in open(auth_users_path, 'r').readlines()]
except FileNotFoundError:
logging.error('Authorized users file not found. Exiting...')
exit(1)
# Initialize the bot
bot = telebot.TeleBot(token)
# Define status variables
status, hostname, uptime = 'unknown', 'unknown', 'unknown'
zerotier, prosody, postgres, tailscale, nginx, disk = ['unknown'] * 6
nodes, hostnames, threads = [], [], []
reach_queue = queue.Queue()
# Get basic system info
def get_system_info():
global hostname, uptime, zerotier, prosody, postgres, tailscale, nginx, disk
try: try:
hostname = subprocess.check_output(['hostname']).decode().strip() result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
uptime = subprocess.check_output(['uptime', '-p']).decode().strip() return result.stdout.strip()
except subprocess.TimeoutExpired as e:
logger.warning(f"Command {cmd} timed out: {e}")
return 'timeout'
except OSError as e:
logger.error(f"OS error running {cmd}: {e}")
return 'error'
services = ['zerotier-one', 'prosody', 'postgresql', 'tailscaled', 'nginx'] # Local system info
status_results = [] def get_local_info():
for service in services: hostname = run_cmd(['hostname'])
status_results.append(get_service_status(service)) uptime = run_cmd(['uptime', '-p'])
zerotier, prosody, postgres, tailscale, nginx = status_results load_line = run_cmd(['uptime'])
load_avg = load_line.split('load average:')[-1].strip() if 'load average:' in load_line else 'unknown'
memory = run_cmd(['free', '-h'])
disk = run_cmd(['df', '-h'])
status = 'online' if hostname not in ('', 'error', 'timeout') else 'offline'
return {'hostname': hostname, 'uptime': uptime, 'load_avg': load_avg, 'memory': memory, 'disk': disk, 'status': status}
disk = subprocess.check_output(['df', '-h']).decode().strip() # Fetch and store SSL certificate once
except subprocess.CalledProcessError as e: def fetch_certificate(host, port):
logging.error(f"Error fetching system info: {e}") cert_path = os.path.join(CERT_DIR, f"{host}.pem")
status = 'offline' if os.path.isfile(cert_path):
return cert_path
try:
cert = ssl.get_server_certificate((host, port))
with open(cert_path, 'w') as f:
f.write(cert)
logger.info(f"Saved certificate for {host} to {cert_path}")
return cert_path
except Exception as e:
logger.error(f"Failed to fetch certificate for {host}: {e}")
return True
# SSH-based info gathering
def get_ssh_info(ip, cfg):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(ip, username=cfg['ssh_user'], password=cfg['ssh_pass'], timeout=5)
        info = {}
        cmds = {'hostname': 'hostname', 'uptime': 'uptime -p', 'load_avg': 'uptime', 'memory': 'free -h', 'disk': 'df -h'}
        for key, cmd in cmds.items():
            try:
                stdin, stdout, stderr = client.exec_command(cmd, timeout=5)
                out = stdout.read().decode().strip()
                if key == 'load_avg' and 'load average:' in out:
                    out = out.split('load average:')[-1].strip()
                info[key] = out
            except (socket.timeout, paramiko.SSHException) as e:
                logger.error(f"SSH command {cmd} on {ip} failed: {e}")
                info[key] = 'error'
        info['status'] = 'online'
    except (paramiko.AuthenticationException, paramiko.SSHException, socket.timeout) as e:
        logger.error(f"SSH connection to {ip} failed: {e}")
        info = {'status': 'unreachable'}
    finally:
        try:
            client.close()
        except Exception as e:
            logger.warning(f"Error closing SSH to {ip}: {e}")
    return ip, info
# OPNsense API-based info gathering
def get_opnsense_info(ip, cfg):
    url = cfg['api_url']
    host = url.split('//')[1].split('/')[0].split(':')[0]
    port = int(url.split('//')[1].split('/')[0].split(':')[1]) if ':' in url.split('//')[1].split('/')[0] else 443
    verify = fetch_certificate(host, port)
    try:
        resp = requests.get(f"{url}/core/get/health", auth=(cfg['api_key'], cfg['api_secret']), verify=verify, timeout=5)
        resp.raise_for_status()
        data = resp.json().get('health', {})
        return ip, {
            'status': data.get('health','unknown'),
            'uptime': data.get('uptime','unknown'),
            'memory': f"{data.get('mem_used','?')}MB/{data.get('mem_total','?')}MB",
            'load_avg': data.get('load_avg','unknown'),
            'disk': f"{data.get('disk_used','?')}%/{data.get('disk_total','?')}%",
        }
    except requests.RequestException as e:
        logger.error(f"OPNsense API call for {ip} failed: {e}")
        return ip, {'status': 'unreachable'}
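
Review note on `get_opnsense_info`: the host and port are pulled out of `api_url` with chained `split()` calls, which would misparse a bracketed IPv6 address. A minimal sketch of the same extraction using the standard library's `urllib.parse.urlsplit` is shown below. The helper name `split_host_port` is an assumption for illustration and does not exist in the diff.

```python
from urllib.parse import urlsplit


def split_host_port(api_url, default_port=443):
    """Extract (host, port) from a URL such as 'https://10.0.0.1:8443/api'.

    Hypothetical helper; falls back to default_port when the URL has no
    explicit port.
    """
    parts = urlsplit(api_url)
    return parts.hostname, parts.port or default_port
```

With this helper, the two parsing lines could become `host, port = split_host_port(cfg['api_url'])`.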
# Gather info for given host or all hosts
def gather_host(ip=None):
    if ip and ip in config.HOSTS:
        cfg = config.HOSTS[ip]
        return [get_ssh_info(ip, cfg) if cfg['type']=='generic' else get_opnsense_info(ip, cfg)]
    # all hosts
    return gather_clients()
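
Review note: `gather_clients()` is called here and in `callback_status`, but its definition does not appear in the diff as rendered on this page. The README mentions parallel polling with `ThreadPoolExecutor`, and `callback_status` calls `.items()` on the result, so a plausible shape is a function that returns a `{ip: info}` dict. The sketch below is an assumption, not the author's code. In particular, the explicit `hosts` and `fetch` parameters are added here so the example is self-contained, whereas the diff calls the function with no arguments.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def gather_clients(hosts, fetch, max_workers=8):
    """Poll every host in parallel and return {ip: info}.

    `fetch(ip, cfg)` must return an (ip, info) tuple, the same shape that
    get_ssh_info and get_opnsense_info return in the diff.
    """
    results = {}
    if not hosts:
        return results
    workers = min(max_workers, len(hosts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its host so failures can be attributed.
        futures = {pool.submit(fetch, ip, cfg): ip for ip, cfg in hosts.items()}
        for future in as_completed(futures):
            ip = futures[future]
            try:
                _, info = future.result()
            except Exception as exc:
                # One failing host must not abort the remaining polls.
                info = {'status': 'error', 'detail': str(exc)}
            results[ip] = info
    return results
```

In the bot, `fetch` would be a small dispatcher that picks `get_ssh_info` or `get_opnsense_info` based on `cfg['type']`, and `hosts` would be `config.HOSTS`.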
# Ping utility
def ping_ip(ip):
    res = run_cmd(['ping', '-c', '1', ip], timeout=3)
    if '1 packets transmitted, 1 received' in res or '1 packets transmitted, 1 packets received' in res:
        return 'reachable'
    if res in ('timeout', 'error'):
        return res
    return 'unreachable'
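# Access control decorator
def restricted(func):
    def wrapper(update, *args, **kwargs):
        # Messages carry .chat directly; callback queries carry it on .message
        msg = update.message if isinstance(update, types.CallbackQuery) else update
        if msg.chat.id not in ALLOWED_CHATS:
            bot.reply_to(msg, 'Unauthorized access')
            return
        return func(update, *args, **kwargs)
    return wrapper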
# /status: show menu of available hosts
@bot.message_handler(commands=['status'])
@restricted
def handle_status(msg):
    keyboard = types.InlineKeyboardMarkup()
    for ip in config.HOSTS.keys():
        keyboard.add(types.InlineKeyboardButton(ip, callback_data=f'status:{ip}'))
    keyboard.add(types.InlineKeyboardButton('All', callback_data='status:all'))
    bot.send_message(msg.chat.id, 'Select host for status:', reply_markup=keyboard)
# Callback handler for inline menu
@bot.callback_query_handler(func=lambda c: c.data.startswith('status:'))
@restricted
def callback_status(call):
    _, key = call.data.split(':', 1)
    if key == 'all':
        entries = gather_clients()
else: else:
status = 'online' entries = dict(gather_host(key))
    lines = []
    for ip, info in entries.items():
        lines.append(f"{ip}: {info.get('status','unknown')}")
        if info.get('status')=='online':
            for field in ('uptime','load_avg','memory','disk'):
                lines.append(f" {field}: {info.get(field,'-')}")
    bot.send_message(call.message.chat.id, '\n'.join(lines))
# Helper function to get service status # /ping <IP>
def get_service_status(service): @bot.message_handler(func=lambda m: m.text and m.text.startswith('/ping'))
@restricted
def handle_ping(msg):
    parts = msg.text.split()
    if len(parts) != 2:
        bot.reply_to(msg, 'Usage: /ping <IP>')
        return
    ip = parts[1]
    status = ping_ip(ip)
    bot.reply_to(msg, f"Ping {ip}: {status}")
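
Review note on `handle_ping`: the argument after `/ping` is passed to the `ping` binary unchanged, so a value such as `-f` would be read as an option rather than an address. A minimal sketch of a check with the standard library's `ipaddress` module follows. The helper name `is_valid_ip` is an assumption for illustration and is not part of the diff.

```python
import ipaddress


def is_valid_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True
```

The handler could then reply with the usage message when `is_valid_ip(parts[1])` is false, before calling `ping_ip`. Note that this sketch would also reject hostnames, which may or may not be the intended behaviour.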
# Run polling with retry
while True:
try: try:
subprocess.run(['sudo', 'systemctl', 'is-active', '--quiet', service], check=True) bot.polling()
return f'{service} is active' except Exception as e:
except subprocess.CalledProcessError: logger.error(f"Polling error: {e}")
return f'{service} is inactive/not present' time.sleep(5)
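
Review note on the polling loop: the README describes a "backoff retry loop", while the new loop sleeps a fixed 5 seconds after every failure. If a growing delay is wanted, one possible shape is sketched below. The function `run_with_backoff` and all of its parameters are assumptions for illustration. Unlike the `while True` loop in the diff, this sketch returns once `task()` completes without raising.

```python
import time


def run_with_backoff(task, base_delay=1.0, max_delay=60.0,
                     max_attempts=None, sleep=time.sleep):
    """Call task() until it returns, doubling the wait after each failure.

    Hypothetical helper. `sleep` is injectable so the delays can be
    observed without actually waiting.
    """
    delay = base_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return task()
        except Exception:
            if max_attempts is not None and attempt >= max_attempts:
                raise  # give up and surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the wait
```

A call such as `run_with_backoff(bot.polling, base_delay=5)` would start from the same 5-second wait used in the diff and back off from there.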
# Function to ping a Tailscale node
def ping_node(node, hostname):
try:
ping = subprocess.run(['ping', '-c', '1', node], stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
reach_queue.put(f'{node}/{hostname} is reachable')
except subprocess.CalledProcessError:
reach_queue.put(f'{node}/{hostname} is unreachable')
# Check Tailscale nodes
def check_tailscale_nodes():
global nodes, hostnames, threads
try:
nodes_output = subprocess.check_output("tailscale status | grep '100'", shell=True).decode().strip()
nodes = [line.split()[0] for line in nodes_output.split('\n') if line]
hostnames = [line.split()[1] for line in nodes_output.split('\n') if line]
for node, hostname in zip(nodes, hostnames):
thread = threading.Thread(target=ping_node, args=(node, hostname))
threads.append(thread)
thread.start()
for thread in threads:
thread.join()
reach = []
while not reach_queue.empty():
reach.append(reach_queue.get())
return reach
except subprocess.CalledProcessError as e:
logging.error(f"Error checking Tailscale status: {e}")
return ['Error checking Tailscale status']
# Function to restart a service
def restart_service(service):
logging.info(f'Restarting {service}...')
try:
subprocess.run(['sudo', 'systemctl', 'restart', service], check=True)
sleep(3)
service_status = get_service_status(service)
status_message = f'{service} restarted! Status: {service_status}'
logging.info(status_message)
return status_message
except subprocess.CalledProcessError as e:
logging.error(f"Error restarting {service}: {e}")
return f'Error restarting {service}'
# Restart services menu
def restart_menu():
keyboard = [
[telebot.types.InlineKeyboardButton('zerotier-one', callback_data='zerotier-one')],
[telebot.types.InlineKeyboardButton('prosody', callback_data='prosody')],
[telebot.types.InlineKeyboardButton('postgresql', callback_data='postgresql')],
[telebot.types.InlineKeyboardButton('tailscaled', callback_data='tailscaled')],
[telebot.types.InlineKeyboardButton('nginx', callback_data='nginx')],
[telebot.types.InlineKeyboardButton('cancel', callback_data='cancel')]
]
reply_markup = telebot.types.InlineKeyboardMarkup(keyboard)
return reply_markup
# Callback query handler for service restart
@bot.callback_query_handler(func=lambda call: True)
def callback_query(call):
service = call.data
if service != 'cancel':
status_message = restart_service(service)
bot.send_message(call.message.chat.id, status_message)
else:
bot.edit_message_reply_markup(call.message.chat.id, call.message.message_id, reply_markup=None)
bot.send_message(call.message.chat.id, 'Canceled')
# Reboot system function
def reboot():
logging.info('Rebooting system...')
subprocess.run(['sudo', 'reboot'], check=True)
# Populate the variables on first start
get_system_info()
# Message handlers
@bot.message_handler(commands=['start', 'help', 'status', 'restart', 'reboot', 'ping'])
def handle(message):
user_id = str(message.from_user.id)
if user_id not in authorized_users:
bot.reply_to(message, 'You are not authorized for this action')
else:
if message.text == '/start':
bot.reply_to(message, 'lainmonitor v1.2 --- standing by...')
elif message.text == '/help':
bot.reply_to(message, 'commands: /start, /help, /status, /restart, /reboot, /ping')
elif message.text == '/status':
get_system_info()
status_message = (
f'hostname: {hostname}\n'
f'system status: {status}\n'
f'uptime: {uptime}\n'
f'zerotier: {zerotier}\n'
f'prosody: {prosody}\n'
f'postgres: {postgres}\n'
f'tailscale: {tailscale}\n'
f'nginx: {nginx}'
)
bot.reply_to(message, status_message)
bot.reply_to(message, f'Filesystem info for {hostname}:\n\n{disk}')
elif message.text == f'/restart {hostname}':
bot.send_message(message.chat.id, 'Select a service to restart:', reply_markup=restart_menu())
elif message.text == f'/reboot {hostname}':
bot.reply_to(message, f'Rebooting {hostname}...')
reboot()
elif message.text == '/ping':
reach = check_tailscale_nodes()
bot.reply_to(message, f'Ping status:\n\n{"\n".join(reach)}')
else:
pass
# Polling with timeout and error handling
try:
bot.polling(none_stop=True, timeout=60, long_polling_timeout=60)
except Exception as e:
logging.error(f'Polling error: {e}')

@ -1 +1,3 @@
telebot telebot
paramiko
requests