Forum teuk.org

🌿 Multi-module audit and improvements (May 2026)

in Mediabot Β· started by TeuK Β· 1w ago

TeuK Β· 1w ago

Mediabot v3 β€” Multi-module audit and improvements (May 2026)

Context

Full audit of snapshot 20260502. Seven modules touched, sixteen items fixed or improved. The range spans security, memory management, runtime configuration, performance, and new Partyline commands. One item (A4) was identified but deliberately deferred due to a use Switch source filter interaction β€” detailed below.


Bug fixes

B1 β€” alarm() removed from External.pm Chromium subprocess

_fetch_url_chromium_dumpdom() was using alarm()/SIGALRM to enforce a timeout on the headless Chromium process. As with the earlier _reverse_dns_timeout fix, SIGALRM is unsafe inside an IO::Async event loop: it can interrupt epoll_wait() mid-flight and corrupt in-progress socket reads across all active connections.

The alarm is replaced by a non-blocking IO::Select polling loop. Every 100ms the loop checks whether the subprocess stdout is readable. If the wall-clock deadline passes, the subprocess is killed and the function returns undef β€” same semantics, no signal hazard.

B2 β€” Auth.pm session hashes grew without bound

maybe_autologin() was writing entries into $self->{sessions} and $self->{logged_in} on every successful autologin, but nothing ever removed them. On a busy server running for weeks, these hashes would accumulate one entry per IRC connection event, constituting a slow memory leak.

Two new methods are added:

  • logout($nick_or_uid) β€” removes the session by nick (case-insensitive) or numeric uid. Should be called from on_message_QUIT, on_message_KICK, and on_message_PART handlers.
  • session_count() β€” returns the current number of in-memory sessions, useful for diagnostics and tests.

B3 β€” Conf.pm was writing to STDERR instead of the logger

When MEDIABOT_DEBUG_CONF=1 was set, Conf->get() called warn to report missing keys. warn writes to STDERR β€” invisible in the bot log, uncategorized, and uncontrollable.

Conf now accepts an optional logger argument and a set_logger() method. If a logger is attached, missing-key warnings go through $logger->log(4, ...) at DEBUG4. If no logger is attached (e.g. during early init before the logger is constructed), it falls back to warn as before.

B4 β€” Partyline.pm .eval pending state leaked on session close

When a user typed .eval <code> and then disconnected before confirming, the _eval_pending_$id key was left in $self indefinitely. _close_session() now deletes it.


Improvements

A1 β€” Log.pm log rotation

The log file handle was opened once at startup and never rotated. On a bot running for weeks with DEBUG2 enabled, the log file could reach several gigabytes.

Three new capabilities:

  • Rotation by size: _maybe_rotate() is called before every write to the log file. When the file exceeds maxsize (default 50 MB), it is renamed to .log.1, previous rotated files are shifted up, and a fresh file is opened. Up to max_files (default 5) rotated files are kept.
  • reopen_logfile(): reopens the file handle β€” useful when called after an external logrotate with copytruncate, or on SIGHUP.
  • set_level($n): changes the active debug level at runtime. Used by the .rehash improvement below.

A3 β€” Metrics.pm Prometheus render cache

render_prometheus() was rebuilding the full metrics text on every HTTP GET /metrics, including sorting all metric names and iterating all label combinations. Prometheus typically scrapes every 15 seconds. On a bot with many labeled metrics, this is wasted CPU.

The rendered output is now cached for 5 seconds. Subsequent scrapes within the window return the cached string directly. The cache is invalidated automatically when the next render runs after expiry.

A4 β€” active_ban_for_mask in on_message_JOIN β€” deferred

ChannelBan::active_ban_for_mask() was identified as dead code: defined in the module but never called anywhere. The correct integration point is on_message_JOIN in mediabot.pl, where the bot should check incoming joins against active DB bans and enforce them with MODE +b / KICK.

The implementation was written and tested, but deploying it revealed a hard incompatibility: mediabot.pl uses use Switch (Damian Conway’s deprecated source-filter module). Adding any code before the switch/case block shifts line positions in the file, which confuses the source filter and produces cascading parse errors across the entire WHOIS handling section.

This item is deferred until the switch/case blocks in mediabot.pl are migrated to plain if/elsif chains. That migration is a standalone task and will unlock this improvement as a trivial follow-on.

A5 β€” Auth.pm BCrypt support

_password_matches() supported MySQL PASSWORD() hashes (*HASH) and double-SHA1 hex, with a plaintext fallback. BCrypt ($2b$, $2y$, $2a$) was not handled β€” passwords set by modern tools (PHPMyAdmin, Python scripts, PHP password_hash) would silently fall through to the plaintext comparison and fail.

BCrypt is now attempted via Crypt::Bcrypt when the stored hash starts with a $2 prefix. If Crypt::Bcrypt is not installed, the function returns (0, 'bcrypt_not_available') rather than falling through to plaintext β€” an explicit refusal is safer than a silent wrong comparison.

To install: cpan Crypt::Bcrypt or apt install libcrypt-bcrypt-perl.

A6 β€” External.pm weather cache eviction

_weather_cache in displayWeather_ctx() was a hash that grew indefinitely. TTL logic (3 min fresh, 15 min stale) controlled when cached values were used, but old entries for cities never requested again stayed in memory forever.

Lazy eviction is now triggered at write time: when the cache exceeds 200 entries, all entries older than 15 minutes are purged. This bounds memory usage without adding a timer.

A7 β€” Partyline.pm .eval confirmation expires after 30 seconds

The two-step .eval confirmation had no expiry. A confirmation request typed 10 minutes ago would still accept the second .eval as valid. The pending confirmation is now invalidated if more than 30 seconds have elapsed since the first invocation.

A8 β€” Partyline.pm .stat reduced from NΓ—2 queries to 2 + cache

For each channel, .stat was executing one query to find the owner and one to fetch chansets β€” N channels meant 2N synchronous DB queries in the event loop. On a bot managing 10 channels this was 20 queries per .stat call.

The command now runs two queries that fetch all owners and all chansets at once (one JOIN each), then matches them to channels in Perl. The result is cached for 60 seconds β€” repeated .stat calls within the window cost zero queries. The cache is invalidated by .unban.

A9 β€” Partyline.pm dead dcc_pass stage removed

The dcc_pass authentication stage in _handle_line had been unreachable since DCC CHAT sessions were unified with the standard nick/password flow. The dead branch (30+ lines) is removed. The log masking in _init_dcc_session already correctly checks for pass.

F1 β€” .unban #chan <mask|id> β€” remove bans from the Partyline

Counterpart to .bans. Accepts either a ban mask or a numeric ban ID from the CHANNEL_BAN table. Calls ChannelBan::mark_removed(), sends MODE -b to IRC if a mask (not an ID) was given, and logs the operation. Invalidates the .stat cache.

F2 β€” .topic #chan [text] β€” show or change channel topic

With no text argument, sends a TOPIC request to the server (the reply arrives on the console if .console is active). With text, sends TOPIC #chan :new topic. Master level required (already enforced by Partyline login).

F3 β€” .history β€” per-session command history

The last 10 commands typed in the current session are stored in $session->{history}. .history displays them numbered. The list is a sliding window β€” the oldest entry is dropped when the 11th command arrives. .history itself is not recorded to avoid noise.

F4 β€” debug_level reloaded by .rehash

rehash_runtime_state() now calls $self->{logger}->set_level(int($level)) after reloading the config, picking up the current main.MAIN_PROG_DEBUG value. Previously, changing the debug level in the config file had no effect until the bot was restarted.


Files changed

File Items
Mediabot/External.pm B1 A6
Mediabot/Auth.pm B2 A5
Mediabot/Conf.pm B3
Mediabot/Log.pm A1
Mediabot/Metrics.pm A3
Mediabot/Partyline.pm B4 A7 A8 A9 F1 F2 F3
Mediabot/Mediabot.pm F4
mediabot.pl A4 deferred β€” use Switch incompatibility

You must be logged in to reply.