Add: Paramiko and requests added as requirement

Paramiko and requests have been added as installation requirements. Signed-off-by: hax <hax@lainlounge.org>
Add: /ping and /status separated
7 changed files with 291 additions and 257 deletions
--- a/.authorized_users
+++ b/.authorized_users
@@ -1,2 +0,0 @@
-AUTHORIZED_USER_ID_1
-AUTHORIZED_USER_ID_2
--- a/.env
+++ b/.env
@@ -1 +0,0 @@
-YOUR_TOKEN_HERE
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +0,0 @@
-test.py
-venv/
--- a/README.md
+++ b/README.md
@@ -1,99 +1,105 @@
-# lainmonitor
+# LainMonitor

-LainMonitor is a Telegram bot designed to monitor your system, providing real-time updates on the system’s status, essential services, and disk usage. It can also verify connectivity to a specific Tailscale IP address.
-Current version: v1.2
+LainMonitor is a Telegram bot designed to provide real‑time monitoring of both the local system and remote network clients (OPNsense firewalls and generic SSH hosts). It aggregates key metrics via SSH and REST APIs, and delivers concise reports through Telegram commands.

-### Key Features:
+## Features

-    Retrieve system information:
-        Hostname
-        Uptime
-        Status of critical services:
-            Zerotier
-            Prosody
-            PostgreSQL
-            Tailscale
-            nginx
-    Check disk usage
-    Ping a Tailscale IP for connectivity verification
-    Restart critical services
-    Reboot the host
-    Accessible via Telegram commands
+* **Local System Monitoring**

-### Prerequisites:
+  * Hostname and overall online/offline status
+  * Uptime (human‑readable)
+  * Load averages (1, 5, 15 minute)
+  * Memory usage (via `free -h`)
+  * Disk usage (via `df -h`)

-    Python 3
-    Telebot — Python library for interacting with the Telegram bot API.
+* **Remote Client Monitoring**

-### Installation Guide:
+  * **OPNsense Firewalls** (multiple hosts with per‑host trust‑on‑first‑use SSL)

-Clone the repository:
+    * System health status
+    * Uptime
+    * Memory and disk statistics
+    * Load averages
+  * **Generic SSH Hosts**

-    git clone https://git.lainlounge.xyz/hornet/lainmonitor.git
-    cd lainmonitor
+    * Hostname, uptime, load, memory, and disk via SSH

-RECOMMENDED: Create a virtual environment for python with:
-```
-python3 -m venv venv
-source venv/bin/activate
-```
-Install dependencies:
+* **Security & Resilience**

-```
-   pip3 install -r requirements.txt
-```
+  * Trust‑on‑first‑use SSL: automatically fetches and caches firewall certificates
+  * Concurrency: parallel polling of remote hosts with `ThreadPoolExecutor`
+  * Error handling: per‑task exceptions are logged and do not interrupt overall data gathering
+  * Access control: only whitelisted Telegram chat IDs can invoke commands
+  * Automatic polling restart on failure after a fixed 5-second delay

-Configure your bot token: Open the .env file and replace the placeholder with your Telegram bot token.
+## Commands

-Configure authorized users: Open the .authorized_users file and replace the placeholders with Telegram user ID(s).
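+* `/status`: opens an inline keyboard for choosing a single host or all hosts, then reports the metrics for that selection
+* `/ping <IP>`: pings the given IP address once and reports whether it is reachable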

-Set up service access: Ensure the bot can check system services by running it with sudo or appropriate permissions.
+## Installation

-### Usage:
-#### Running the Bot Manually:
+1. **Clone repository**

-You can run LainMonitor directly from the command line:
+   ```bash
+   git clone https://git.lainlounge.xyz/hornet/lainmonitor.git
+   cd lainmonitor
+   ```
+2. **Install dependencies**

-    python3 lainmonitor.py
+   ```bash
+   pip install -r requirements.txt
+   ```
+3. **Configure**

-#### Running as a Systemd Service:
+   * Open `config.py`, which is committed with placeholder values
+   * Populate `config.TOKEN` with your Telegram bot token
+   * Add your Telegram chat IDs to `config.ALLOWED_CHATS`
+   * Define each host under `config.HOSTS` with correct credentials and API settings
+4. **Prepare SSL directory** (optional; `lainmonitor.py` creates `certs/` next to the script at startup):

-To run the bot as a systemd service, follow these steps:
+   ```bash
+   mkdir certs
+   ```

-Create a service file:
+## Usage

-    sudo nano /etc/systemd/system/lainmonitor.service
+* **Run directly**

-Add the following configuration:
+  ```bash
+  python3 lainmonitor.py
+  ```

-    [Unit]
-    Description=LainMonitor Telegram Bot
-    After=network.target
+* **Run as a service** (systemd)

-    [Service]
-    ExecStart=/usr/bin/python3 /path/to/lainmonitor.py
-    Restart=on-failure
+  ```ini
+  [Unit]
+  Description=LainMonitor Telegram Bot
+  After=network.target

-    [Install]
-    WantedBy=multi-user.target
+  [Service]
+  ExecStart=/usr/bin/python3 /path/to/lainmonitor.py
+  Restart=on-failure

-Enable and start the service:
+  [Install]
+  WantedBy=multi-user.target
+  ```

-    sudo systemctl enable lainmonitor
-    sudo systemctl start lainmonitor
+  ```bash
+  sudo systemctl enable lainmonitor
+  sudo systemctl start lainmonitor
+  ```

-### Available Commands:
+## Dependencies

-    /start — Initialize the bot and receive a welcome message.
-    /help — Display a list of available commands.
-    /status — Retrieve system hostname, uptime, and status of monitored services.
-    /ping — Ping a Tailscale IP and return connectivity status.
-    /restart hostname- Restart a specific service on a specified machine.
-    /reboot hostname —  Placeholder for a system reboot command.
+* `pyTelegramBotAPI` (Telebot) — Telegram Bot API client
+* `paramiko` — SSH connectivity
+* `requests` — HTTP/REST API client

-### Contributions:
+## Author

-Created by hornetmaidan. 
-With Contributions from h@x. 
+**h@x**

-Any new features and suggestions are welcome!
+## Original script by
+**hornetmaidan**
+
+Contributions and feedback are welcome! :-)
--- a/config.py
+++ b/config.py
@@ -0,0 +1,40 @@
+# Configuration for Lainmonitor
+
+# Telegram bot token
+TOKEN = 'PLACE_YOUR_TOKEN_HERE'
+
+# Allowed Telegram chat IDs (whitelist)
+ALLOWED_CHATS = [123456789, 987654321]
+
+# Per-host configuration
+HOSTS = {
+    '10.0.0.1': {
+        'type': 'opnsense',
+        'api_url': 'https://10.0.0.1/api',
+        'api_key': 'OPN_KEY_1',
+        'api_secret': 'OPN_SECRET_1'
+    },
+    '10.128.0.1': {
+        'type': 'opnsense',
+        'api_url': 'https://10.128.0.1/api',
+        'api_key': 'OPN_KEY_2',
+        'api_secret': 'OPN_SECRET_2'
+    },
+    '10.144.0.1': {
+        'type': 'opnsense',
+        'api_url': 'https://10.144.0.1/api',
+        'api_key': 'OPN_KEY_3',
+        'api_secret': 'OPN_SECRET_3'
+    },
+    '10.130.1.1': {
+        'type': 'opnsense',
+        'api_url': 'https://10.130.1.1/api',
+        'api_key': 'OPN_KEY_4',
+        'api_secret': 'OPN_SECRET_4'
+    },
+    '10.177.0.100': {
+        'type': 'generic',
+        'ssh_user': 'SSH_USER_100',
+        'ssh_pass': 'SSH_PASS_100'
+    }
+}
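Review note: each `HOSTS` entry above needs a different set of keys depending on its `type`, and a missing key would only surface as a `KeyError` while polling (for example `cfg['ssh_user']` in `get_ssh_info`). A minimal sketch of a startup check follows. `validate_hosts` and `REQUIRED_KEYS` are hypothetical names that are not part of this change set, and the required keys are inferred from the example entries.

```python
# Hypothetical startup check for config.HOSTS; all names here are illustrative.
REQUIRED_KEYS = {
    'opnsense': ('api_url', 'api_key', 'api_secret'),
    'generic': ('ssh_user', 'ssh_pass'),
}

def validate_hosts(hosts):
    """Return a list of human-readable problems; an empty list means no problems were found."""
    problems = []
    for ip, cfg in hosts.items():
        host_type = cfg.get('type')
        if host_type not in REQUIRED_KEYS:
            problems.append(f"{ip}: unknown type {host_type!r}")
            continue
        for key in REQUIRED_KEYS[host_type]:
            # Treat an absent key and an empty value the same way.
            if not cfg.get(key):
                problems.append(f"{ip}: missing {key!r}")
    return problems

if __name__ == '__main__':
    sample = {
        '10.0.0.1': {'type': 'opnsense', 'api_url': 'https://10.0.0.1/api',
                     'api_key': 'OPN_KEY_1', 'api_secret': 'OPN_SECRET_1'},
        '10.177.0.100': {'type': 'generic', 'ssh_user': 'SSH_USER_100'},
    }
    for problem in validate_hosts(sample):
        print(problem)
```

Calling something like this once before the polling loop would turn a misconfigured host into a clear startup message instead of a failure in the middle of a `/status` request.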
--- a/lainmonitor.py
+++ b/lainmonitor.py
@@ -1,197 +1,188 @@
-# --/usr/bin/env python3 -- #
-# description: telegram bot for monitoring the system
-# dependencies: telebot
-# usage: python3 lainmonitor.py | or run it as a service
-# author: hornetmaidan
-# contributors: h@x
-# version: 1.2
-import os
+#!/usr/bin/env python3
+
+# --------------------------------------------------------------------------
+# Description: A Telegram bot for monitoring critical infrastructure services
+# Dependencies: telebot, paramiko, requests
+# Usage: python3 lainmonitor.py | or run it as a service
+# Author: h@x
+# Version: 2.1.0
+# --------------------------------------------------------------------------
+
 import subprocess
-import threading
-import queue
-from time import sleep
 import telebot
+import paramiko
+import requests
+import time
+import socket
 import logging
+import ssl
+import os
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from telebot import types
+import config

-# Setup logging
-logging.basicConfig(filename='lainmonitor.log', level=logging.INFO, 
-                    format='%(asctime)s - %(levelname)s - %(message)s')
+# Configure logging
+tlogging_format = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
+logging.basicConfig(level=logging.INFO, format=tlogging_format)
+logger = logging.getLogger(__name__)

-# Load environment variables and config files securely
-script_dir = os.path.dirname(os.path.realpath(__file__))
-env_path = os.path.join(script_dir, '.env')
-auth_users_path = os.path.join(script_dir, '.authorized_users')
+# Ensure certificate directory exists
+CERT_DIR = os.path.join(os.path.dirname(__file__), 'certs')
+if not os.path.isdir(CERT_DIR):
+    os.makedirs(CERT_DIR, exist_ok=True)

-# Load the token
-try:
-    with open(env_path, 'r') as f:
-        token = f.read().strip()
-except FileNotFoundError:
-    logging.error('Token file not found. Exiting...')
-    exit(1)
+bot = telebot.TeleBot(config.TOKEN)
+ALLOWED_CHATS = set(config.ALLOWED_CHATS)

-# Load the authorized users
-try:
-    authorized_users = [str(line.strip()) for line in open(auth_users_path, 'r').readlines()]
-except FileNotFoundError:
-    logging.error('Authorized users file not found. Exiting...')
-    exit(1)
-
-# Initialize the bot
-bot = telebot.TeleBot(token)
-
-# Define status variables
-status, hostname, uptime = 'unknown', 'unknown', 'unknown'
-zerotier, prosody, postgres, tailscale, nginx, disk = ['unknown'] * 6
-nodes, hostnames, threads = [], [], []
-reach_queue = queue.Queue()
-
-# Get basic system info
-def get_system_info():
-    global hostname, uptime, zerotier, prosody, postgres, tailscale, nginx, disk
+# Utility for command execution with timeout
+def run_cmd(cmd, timeout=5):
    try:
-        hostname = subprocess.check_output(['hostname']).decode().strip()
-        uptime = subprocess.check_output(['uptime', '-p']).decode().strip()
+        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
+        return result.stdout.strip()
+    except subprocess.TimeoutExpired as e:
+        logger.warning(f"Command {cmd} timed out: {e}")
+        return 'timeout'
+    except OSError as e:
+        logger.error(f"OS error running {cmd}: {e}")
+        return 'error'

-        services = ['zerotier-one', 'prosody', 'postgresql', 'tailscaled', 'nginx']
-        status_results = []
-        for service in services:
-            status_results.append(get_service_status(service))
-        zerotier, prosody, postgres, tailscale, nginx = status_results
+# Local system info
+def get_local_info():
+    hostname = run_cmd(['hostname'])
+    uptime = run_cmd(['uptime', '-p'])
+    load_line = run_cmd(['uptime'])
+    load_avg = load_line.split('load average:')[-1].strip() if 'load average:' in load_line else 'unknown'
+    memory = run_cmd(['free', '-h'])
+    disk = run_cmd(['df', '-h'])
+    status = 'online' if hostname not in ('', 'error', 'timeout') else 'offline'
+    return {'hostname': hostname, 'uptime': uptime, 'load_avg': load_avg, 'memory': memory, 'disk': disk, 'status': status}

-        disk = subprocess.check_output(['df', '-h']).decode().strip()
-    except subprocess.CalledProcessError as e:
-        logging.error(f"Error fetching system info: {e}")
-        status = 'offline'
+# Fetch and store SSL certificate once
+def fetch_certificate(host, port):
+    cert_path = os.path.join(CERT_DIR, f"{host}.pem")
+    if os.path.isfile(cert_path):
+        return cert_path
+    try:
+        cert = ssl.get_server_certificate((host, port))
+        with open(cert_path, 'w') as f:
+            f.write(cert)
+        logger.info(f"Saved certificate for {host} to {cert_path}")
+        return cert_path
+    except Exception as e:
+        logger.error(f"Failed to fetch certificate for {host}: {e}")
+        return True  # fall back to verification against the default CA bundle
+
+# SSH-based info gathering
+def get_ssh_info(ip, cfg):
+    client = paramiko.SSHClient()
+    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
+    try:
+        client.connect(ip, username=cfg['ssh_user'], password=cfg['ssh_pass'], timeout=5)
+        info = {}
+        cmds = {'hostname': 'hostname', 'uptime': 'uptime -p', 'load_avg': 'uptime', 'memory': 'free -h', 'disk': 'df -h'}
+        for key, cmd in cmds.items():
+            try:
+                stdin, stdout, stderr = client.exec_command(cmd, timeout=5)
+                out = stdout.read().decode().strip()
+                if key == 'load_avg' and 'load average:' in out:
+                    out = out.split('load average:')[-1].strip()
+                info[key] = out
+            except (socket.timeout, paramiko.SSHException) as e:
+                logger.error(f"SSH command {cmd} on {ip} failed: {e}")
+                info[key] = 'error'
+        info['status'] = 'online'
+    except (paramiko.SSHException, OSError) as e:  # OSError also covers socket.timeout and refused connections
+        logger.error(f"SSH connection to {ip} failed: {e}")
+        info = {'status': 'unreachable'}
+    finally:
+        try: client.close()
+        except Exception as e: logger.warning(f"Error closing SSH to {ip}: {e}")
+    return ip, info
+
+# OPNsense API-based info gathering
+def get_opnsense_info(ip, cfg):
+    url = cfg['api_url']
+    host, _, port_str = url.split('//', 1)[1].split('/', 1)[0].partition(':')
+    port = int(port_str) if port_str else 443
+    verify = fetch_certificate(host, port)
+    try:
+        resp = requests.get(f"{url}/core/get/health", auth=(cfg['api_key'], cfg['api_secret']), verify=verify, timeout=5)
+        resp.raise_for_status()
+        data = resp.json().get('health', {})
+        return ip, {'status': data.get('health','unknown'), 'uptime': data.get('uptime','unknown'), 'memory': f"{data.get('mem_used','?')}MB/{data.get('mem_total','?')}MB", 'load_avg': data.get('load_avg','unknown'), 'disk': f"{data.get('disk_used','?')}%/{data.get('disk_total','?')}%"}
+    except requests.RequestException as e:
+        logger.error(f"OPNsense API call for {ip} failed: {e}")
+        return ip, {'status': 'unreachable'}
+
+# Gather info for given host or all hosts
+def gather_host(ip=None):
+    if ip and ip in config.HOSTS:
+        cfg = config.HOSTS[ip]
+        return [get_ssh_info(ip, cfg) if cfg['type']=='generic' else get_opnsense_info(ip, cfg)]
+    # all hosts
+    return gather_clients()
+
+# Ping utility
+def ping_ip(ip):
+    res = run_cmd(['ping', '-c', '1', ip], timeout=3)
+    if '1 packets transmitted, 1 received' in res or '1 packets transmitted, 1 packets received' in res:
+        return 'reachable'
+    if res in ('timeout', 'error'):
+        return res
+    return 'unreachable'
+
+# Access control decorator
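+def restricted(func):
+    def wrapper(msg, *args, **kwargs):
+        chat_id = (msg.message if isinstance(msg, types.CallbackQuery) else msg).chat.id  # a CallbackQuery keeps its chat on .message
+        if chat_id not in ALLOWED_CHATS:
+            return bot.send_message(chat_id, 'Unauthorized access')
+        return func(msg, *args, **kwargs)
+    return wrapper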
+
+# /status: show menu of available hosts
+@bot.message_handler(commands=['status'])
+@restricted
+def handle_status(msg):
+    keyboard = types.InlineKeyboardMarkup()
+    for ip in config.HOSTS.keys():
+        keyboard.add(types.InlineKeyboardButton(ip, callback_data=f'status:{ip}'))
+    keyboard.add(types.InlineKeyboardButton('All', callback_data='status:all'))
+    bot.send_message(msg.chat.id, 'Select host for status:', reply_markup=keyboard)
+
+# Callback handler for inline menu
+@bot.callback_query_handler(func=lambda c: c.data.startswith('status:'))
+@restricted
+def callback_status(call):
+    _, key = call.data.split(':', 1)
+    if key == 'all':
+        entries = gather_clients()
    else:
-        status = 'online'
+        entries = dict(gather_host(key))
+    lines = []
+    for ip, info in entries.items():
+        lines.append(f"{ip}: {info.get('status','unknown')}")
+        if info.get('status')=='online':
+            for field in ('uptime','load_avg','memory','disk'):
+                lines.append(f"  {field}: {info.get(field,'-')}")
+    bot.send_message(call.message.chat.id, '\n'.join(lines))

-# Helper function to get service status
-def get_service_status(service):
+# /ping <IP>
+@bot.message_handler(commands=['ping'])
+@restricted
+def handle_ping(msg):
+    parts = msg.text.split()
+    if len(parts) != 2:
+        bot.reply_to(msg, 'Usage: /ping <IP>')
+        return
+    ip = parts[1]
+    status = ping_ip(ip)
+    bot.reply_to(msg, f"Ping {ip}: {status}")
+
+# Run polling with retry
+while True:
    try:
-        subprocess.run(['sudo', 'systemctl', 'is-active', '--quiet', service], check=True)
-        return f'{service} is active'
-    except subprocess.CalledProcessError:
-        return f'{service} is inactive/not present'
-
-# Function to ping a Tailscale node
-def ping_node(node, hostname):
-    try:
-        ping = subprocess.run(['ping', '-c', '1', node], stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
-        reach_queue.put(f'{node}/{hostname} is reachable')
-    except subprocess.CalledProcessError:
-        reach_queue.put(f'{node}/{hostname} is unreachable')
-
-# Check Tailscale nodes
-def check_tailscale_nodes():
-    global nodes, hostnames, threads
-    try:
-        nodes_output = subprocess.check_output("tailscale status | grep '100'", shell=True).decode().strip()
-        nodes = [line.split()[0] for line in nodes_output.split('\n') if line]
-        hostnames = [line.split()[1] for line in nodes_output.split('\n') if line]
-
-        for node, hostname in zip(nodes, hostnames):
-            thread = threading.Thread(target=ping_node, args=(node, hostname))
-            threads.append(thread)
-            thread.start()
-
-        for thread in threads:
-            thread.join()
-
-        reach = []
-        while not reach_queue.empty():
-            reach.append(reach_queue.get())
-
-        return reach
-    except subprocess.CalledProcessError as e:
-        logging.error(f"Error checking Tailscale status: {e}")
-        return ['Error checking Tailscale status']
-
-# Function to restart a service
-def restart_service(service):
-    logging.info(f'Restarting {service}...')
-    try:
-        subprocess.run(['sudo', 'systemctl', 'restart', service], check=True)
-        sleep(3)
-        service_status = get_service_status(service)
-        status_message = f'{service} restarted! Status: {service_status}'
-        logging.info(status_message)
-        return status_message
-    except subprocess.CalledProcessError as e:
-        logging.error(f"Error restarting {service}: {e}")
-        return f'Error restarting {service}'
-
-# Restart services menu
-def restart_menu():
-    keyboard = [
-        [telebot.types.InlineKeyboardButton('zerotier-one', callback_data='zerotier-one')],
-        [telebot.types.InlineKeyboardButton('prosody', callback_data='prosody')],
-        [telebot.types.InlineKeyboardButton('postgresql', callback_data='postgresql')],
-        [telebot.types.InlineKeyboardButton('tailscaled', callback_data='tailscaled')],
-        [telebot.types.InlineKeyboardButton('nginx', callback_data='nginx')],
-        [telebot.types.InlineKeyboardButton('cancel', callback_data='cancel')]
-    ]
-    reply_markup = telebot.types.InlineKeyboardMarkup(keyboard)
-    return reply_markup
-
-# Callback query handler for service restart
-@bot.callback_query_handler(func=lambda call: True)
-def callback_query(call):
-    service = call.data
-    if service != 'cancel':
-        status_message = restart_service(service)
-        bot.send_message(call.message.chat.id, status_message)
-    else:
-        bot.edit_message_reply_markup(call.message.chat.id, call.message.message_id, reply_markup=None)
-        bot.send_message(call.message.chat.id, 'Canceled')
-
-# Reboot system function
-def reboot():
-    logging.info('Rebooting system...')
-    subprocess.run(['sudo', 'reboot'], check=True)
-
-# Populate teh variables on first start
-get_system_info()
-
-# Message handlers
-@bot.message_handler(commands=['start', 'help', 'status', 'restart', 'reboot', 'ping'])
-def handle(message):
-    user_id = str(message.from_user.id)
-    if user_id not in authorized_users:
-        bot.reply_to(message, 'You are not authorized for this action')
-    else:
-        if message.text == '/start':
-            bot.reply_to(message, 'lainmonitor v1.2 --- standing by...')
-        elif message.text == '/help':
-            bot.reply_to(message, 'commands: /start, /help, /status, /restart, /reboot, /ping')
-            bot.reply_to(message, 'commands: /start, /help, /status, /restart, /reboot, /ping')
-        elif message.text == '/status':
-            get_system_info()
-            status_message = (
-                f'hostname: {hostname}\n'
-                f'system status: {status}\n'
-                f'uptime: {uptime}\n'
-                f'zerotier: {zerotier}\n'
-                f'prosody: {prosody}\n'
-                f'postgres: {postgres}\n'
-                f'tailscale: {tailscale}\n'
-                f'nginx: {nginx}'
-            )
-            bot.reply_to(message, status_message)
-            bot.reply_to(message, f'Filesystem info for {hostname}:\n\n{disk}')
-        elif message.text == f'/restart {hostname}':
-            bot.send_message(message.chat.id, 'Select a service to restart:', reply_markup=restart_menu())
-        elif message.text == f'/reboot {hostname}':
-            bot.reply_to(message, f'Rebooting {hostname}...')
-            reboot()
-        elif message.text == '/ping':
-            reach = check_tailscale_nodes()
-            bot.reply_to(message, f'Ping status:\n\n{"\n".join(reach)}')
-        else:
-            pass
-# Polling with timeout and error handling
-try:
-    bot.polling(none_stop=True, timeout=60, long_polling_timeout=60)
-except Exception as e:
-    logging.error(f'Polling error: {e}')
-
+        bot.polling()
+    except Exception as e:
+        logger.error(f"Polling error: {e}")
+        time.sleep(5)
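Review note: `gather_host` and `callback_status` above call `gather_clients()`, but no definition of it appears in this diff, although `ThreadPoolExecutor` and `as_completed` are imported. Below is a minimal sketch of what such a function might look like, assuming it should return a dict mapping each IP to its info dict (which is how `callback_status` consumes the result). The `hosts` and `fetchers` parameters are assumptions made so the example runs on its own; in the bot they would presumably be `config.HOSTS` and the two `get_*_info` functions.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger(__name__)

def gather_clients(hosts, fetchers, max_workers=8):
    """Poll every host in parallel and return {ip: info}.

    hosts:    mapping of ip -> per-host config dict with a 'type' key
    fetchers: mapping of type -> callable(ip, cfg) returning (ip, info)
    """
    results = {}
    if not hosts:
        return results
    with ThreadPoolExecutor(max_workers=min(max_workers, len(hosts))) as pool:
        # Remember which future belongs to which IP so failures can be attributed.
        futures = {
            pool.submit(fetchers[cfg['type']], ip, cfg): ip
            for ip, cfg in hosts.items()
        }
        for future in as_completed(futures):
            ip = futures[future]
            try:
                _, info = future.result()
            except Exception as e:  # one failing host must not abort the others
                logger.error(f"Gathering info for {ip} failed: {e}")
                info = {'status': 'error'}
            results[ip] = info
    return results

if __name__ == '__main__':
    # Stand-in fetchers so the sketch runs without SSH or HTTP access.
    def fake_ssh(ip, cfg):
        return ip, {'status': 'online', 'uptime': 'up 1 hour'}

    def fake_api(ip, cfg):
        raise RuntimeError('simulated API failure')

    demo_hosts = {'10.177.0.100': {'type': 'generic'}, '10.0.0.1': {'type': 'opnsense'}}
    print(gather_clients(demo_hosts, {'generic': fake_ssh, 'opnsense': fake_api}))
```

Catching the exception per future matches the README's statement that per-task errors are logged without interrupting the overall run. Unlike `gather_host`, this sketch does not treat every non-`generic` type as OPNsense; an unknown type would raise a `KeyError` when the task is submitted.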
--- a/requirements.txt
+++ b/requirements.txt
@@ -1 +1,3 @@
 telebot
+paramiko
+requests
Author	SHA1	Message	Date
hax	f2551e68a7	Add: Paramiko and requests added as requirement Paramiko and requests have been added as installation requirements. Signed-off-by: hax <hax@lainlounge.org>	2025-07-22 09:56:15 +00:00
hax	435c481720	Add: /ping and /status separated - /ping can now be used individually to check against any IP address. - /status will bring up inline keyboard, where you can select either a general status request or per machine Signed-off-by: hax <hax@lainlounge.org>	2025-07-22 09:53:21 +00:00
hax	e7275ac1de	README.md updated Signed-off-by: hax <hax@lainlounge.org>	2025-07-22 09:15:06 +00:00
h@x	ce133c03ee	Merge: Rewritten from scratch' from Refactor2.0 into main Reviewed-on: hax/lainmonitor#1	2025-07-22 09:12:38 +00:00
hax	49eea13117	Add config.py Outsourced hardcoded credentials into a single config file. Signed-off-by: hax <hax@lainlounge.org>	2025-07-22 09:10:44 +00:00
hax	1eb23fd0d8	Rewritten from scratch - Removed Prosody, Tailscale, Zerotier, Postgresql checks - Add checks for OPNSense and Proxmox via SSH - Add SSL verification for trusted clients Signed-off-by: hax <hax@lainlounge.org>	2025-07-22 09:08:38 +00:00
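
Review note on commit 435c481720: `/ping` now forwards whatever follows the command to `ping` as an argument. Because `run_cmd` receives a list rather than a shell string, there is no shell injection, but a value beginning with `-` would still be read by `ping` as an option. A minimal sketch of validating the argument with the standard-library `ipaddress` module before pinging follows. `parse_ping_target` is a hypothetical helper name, and rejecting hostnames is an assumption about the intended behavior.

```python
import ipaddress

def parse_ping_target(text):
    """Return the IP address from a '/ping <IP>' message as a string, or None if invalid."""
    parts = text.split()
    if len(parts) != 2:
        return None
    try:
        # Accepts IPv4 and IPv6 literals; raises ValueError for anything else.
        return str(ipaddress.ip_address(parts[1]))
    except ValueError:
        return None

if __name__ == '__main__':
    for message in ('/ping 10.0.0.1', '/ping -f', '/ping', '/ping example.org'):
        print(repr(message), '->', parse_ping_target(message))
```

In `handle_ping`, a `None` result could trigger the existing usage reply, so only well-formed addresses ever reach `ping_ip`.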