🧙‍♂️ The Great Modularization: How We Tamed a 15,000-Line Spellbook

in Mediabot · started by TeuK · 1mo ago

Mediabot v3 — Development Journal, March 2026

A Monster in the Making

Every long-lived project has one: a file that started small and quietly grew into something terrifying. For Mediabot v3, that file was Mediabot.pm — a 15,530-line monolith containing everything: IRC dispatching, database commands, channel management, user authentication, radio streaming, external API calls, quote management, admin tools, and more.

Reading it felt like opening the Restricted Section at Hogwarts. Fixing a bug meant searching through thousands of lines. Adding a feature risked breaking six unrelated things. It was time to cast the Accio Modularization spell.

The Spell: Breaking the Monolith

The core idea was straightforward: identify logical domains, extract their functions into dedicated modules, and wire them back together through clean imports. What followed was weeks of careful surgery.

The 10 New Grimoires

Module	Responsibility	Lines
`Mediabot::Helpers`	Shared utilities: `botNotice`, `logBot`, `checkUserLevel`, DNS, flood control	2,747
`Mediabot::ChannelCommands`	Channel management: join/part, chanset, badwords, access lists	3,157
`Mediabot::DBCommands`	Custom commands, ignores, timers, responders, CRUD	2,286
`Mediabot::UserCommands`	User management: add/del, auth, levels, seen, greet	1,409
`Mediabot::Radio`	Icecast/Liquidsoap: play, queue, listeners, metadata	2,268
`Mediabot::External`	YouTube, TMDB, ChatGPT, weather, URL titles	1,120
`Mediabot::LoginCommands`	Authentication: login, autologin, hostmask matching	621
`Mediabot::AdminCommands`	Bot control: status, rehash, restart, jump, exec	466
`Mediabot::Hailo`	Hailo AI brain: learn, reply, spike responses	545
`Mediabot::Quotes`	Quote database: add, search, random, stats	457

Result: Mediabot.pm went from 15,530 lines down to 1,068. The core now contains only what it should: the constructor, IRC event loop integration, and the two main dispatchers (mbCommandPublic / mbCommandPrivate).

The Export Architecture

Every module that needs shared helpers simply declares:

use Mediabot::Helpers;

Mediabot::Helpers exports all shared functions via @EXPORT: botNotice, botPrivmsg, logBot, checkUserLevel, getIdChansetList, and 50+ more. Each module also declares its own use statements, because a use import only applies to the package that declares it; a module does not pick up the imports of whatever loaded it. We learned this the hard way through a cascade of "Bareword not allowed" errors.
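To make the import scoping concrete, here is a minimal single-file sketch. The package names (My::Helpers, My::Quotes, My::Radio) and the one-argument logBot are stand-ins invented for illustration, not the real Mediabot code, and the BEGIN blocks are only needed because everything lives in one file instead of separate .pm files.

```perl
use strict;
use warnings;

# Stand-in for Mediabot::Helpers (hypothetical, heavily simplified).
package My::Helpers {
    use Exporter 'import';
    our @EXPORT;
    BEGIN { @EXPORT = qw(logBot) }     # exported by default on `use`

    # Assumed one-argument signature, for illustration only.
    sub logBot { my ($msg) = @_; return "[bot] $msg" }
}

# Stand-in for a command module that declares its own import.
package My::Quotes {
    BEGIN { My::Helpers->import }      # in a separate file: use My::Helpers;

    sub add_quote { my ($text) = @_; return logBot("quote added: $text") }
}

# A module that forgot the import: logBot is simply not there.
package My::Radio {
    sub now_playing { return "n/a" }
}

package main;
print My::Quotes::add_quote("hello"), "\n";   # [bot] quote added: hello
print "Quotes sees logBot: ", (My::Quotes->can('logBot') ? "yes" : "no"), "\n";
print "Radio sees logBot: ",  (My::Radio->can('logBot')  ? "yes" : "no"), "\n";
```

The last two lines show the point of the paragraph above: the import lands in the package that asked for it and nowhere else.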

The DB Migration: Laying Better Foundations

Alongside the modularization, a full MariaDB schema migration was applied across all Mediabot instances.

What Changed

P1 — USER.hostmasks CSV column extracted into a proper USER_HOSTMASK table with foreign keys
P2 — 19 foreign keys added with appropriate CASCADE / SET NULL behaviors
P3 — Deprecated WEBLOG.password column removed
P4 — CHANNEL_LOG.publictext upgraded to TEXT
P5 — USER table encoding migrated to utf8mb4
P6 — 12 indexes added for query performance
P7 — USER.auth normalized to TINYINT(1)
P8 — All bigint(20) columns converted to BIGINT UNSIGNED
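As an illustration of step P1, the sketch below turns one CSV hostmasks value into one row per hostmask. It is only a sketch: the comma separator, the whitespace trimming, the de-duplication, and the column names id_user and hostmask are assumptions, and the real migration is done in SQL by install/db_migrate.sh.

```perl
use strict;
use warnings;

# Split a CSV hostmasks value into USER_HOSTMASK-style rows.
# Column names (id_user, hostmask) are hypothetical.
sub hostmask_rows {
    my ($id_user, $csv) = @_;
    my (@rows, %seen);
    for my $mask (split /,/, $csv // '') {
        $mask =~ s/^\s+|\s+$//g;                 # trim surrounding whitespace
        next if $mask eq '' or $seen{$mask}++;   # skip blanks and duplicates
        push @rows, { id_user => $id_user, hostmask => $mask };
    }
    return @rows;
}

my @rows = hostmask_rows(7, '*!*@a.example, *!*@b.example,,*!*@a.example');
print scalar(@rows), " rows\n";
print "$_->{id_user}: $_->{hostmask}\n" for @rows;
```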

The Migration Script

A production-safe install/db_migrate.sh was written to handle this automatically: it reads the configuration file, creates a full backup via mysqldump, applies each migration step idempotently, and validates foreign keys using a stored procedure compatible with MariaDB 10.x.

⚠️ This migration is required before deploying this version. Run install/db_migrate.sh on every instance before starting the bot.

Bugs Hunted and Slain

The `require_level` Phantom Crash

The most insidious bug: throughout the codebase, authenticated commands were written as:

my $user = $ctx->require_level("Administrator") or return;
# ... later ...
$sth->execute($user->id, ...);

The problem: require_level() returns 1 (a boolean) on success, not the user object. Calling ->id on the plain value 1 makes Perl die with "Can't locate object method "id" via package "1"", a fatal exception that kills the bot process and shows up as an EOF from client. Fixed by separating the authorization check from the user object retrieval:

return unless $ctx->require_level("Administrator");
my $user = $ctx->user;
return unless $user;

The buggy pattern was present in 5 different modules.
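The failure is easy to reproduce outside the bot. The sketch below uses two mock classes (My::User and My::Ctx, both invented for illustration) where require_level returns 1 as described above; the buggy call is wrapped in eval only so the script can print the error instead of dying.

```perl
use strict;
use warnings;

# Mock objects, hypothetical and much simpler than the real context class.
package My::User {
    sub new { my ($class, $id) = @_; return bless { id => $id }, $class }
    sub id  { return $_[0]{id} }
}
package My::Ctx {
    sub new           { return bless { user => My::User->new(42) }, $_[0] }
    sub require_level { return 1 }              # boolean, not a user object
    sub user          { return $_[0]{user} }
}

package main;
my $ctx = My::Ctx->new;

# Buggy pattern: $user holds the number 1, so ->id dies.
my $buggy_error = do {
    my $user = $ctx->require_level("Administrator");
    eval { $user->id; 1 } ? '' : $@;
};

# Fixed pattern: check authorization first, then fetch the user object.
sub admin_user_id {
    my ($c) = @_;
    return unless $c->require_level("Administrator");
    my $user = $c->user;
    return unless $user;
    return $user->id;
}

print "buggy: $buggy_error";
print "fixed: ", admin_user_id($ctx), "\n";
```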

The Blocking DNS Disaster

The resolve command used gethostbyname(), a blocking library call that freezes the entire Net::Async::IRC event loop while the lookup is in flight. The bot would hang, miss PINGs, and get disconnected by the server.

The fix uses open(my $pipe, '-|', ...) to spawn a child Perl process for the DNS lookup, sets the pipe to non-blocking mode with Fcntl, and collects the result 3 seconds later via IO::Async::Timer::Countdown — keeping the event loop free throughout.
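A rough sketch of that approach, using only core modules. It is not the bot's actual code: the function names start_lookup and collect_lookup are made up, and a bounded polling loop stands in for the IO::Async::Timer::Countdown so the example runs without IO::Async installed.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

# Spawn a child Perl process for the lookup; return a non-blocking read handle.
sub start_lookup {
    my ($host) = @_;
    my $child = 'use Socket; my $ip = inet_aton($ARGV[0]); '
              . 'print(($ip ? inet_ntoa($ip) : ""), "\n");';
    open(my $pipe, '-|', $^X, '-e', $child, $host) or die "cannot fork: $!";
    my $flags = fcntl($pipe, F_GETFL, 0) or die "fcntl F_GETFL: $!";
    fcntl($pipe, F_SETFL, $flags | O_NONBLOCK) or die "fcntl F_SETFL: $!";
    return $pipe;
}

# One non-blocking read attempt. Returns undef if nothing is available yet,
# the address on success, or an empty string if the name did not resolve.
sub collect_lookup {
    my ($pipe) = @_;
    my $n = sysread($pipe, my $buf, 4096);
    return unless $n;
    chomp $buf;
    return $buf;
}

my $pipe = start_lookup('127.0.0.1');
my $ip;
for (1 .. 100) {                          # stand-in for the bot's timer
    $ip = collect_lookup($pipe);
    last if defined $ip;
    select(undef, undef, undef, 0.05);    # the event loop would run here
}
close $pipe;                              # also reaps the child
print "resolved: ", $ip // 'timeout', "\n";
```

The parent never calls a resolver itself, so a slow DNS server can only delay the child, not the IRC connection.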

The `$self->{db}->dbh` Ghost

userLogin_ctx was calling $self->{db}->dbh to get a database handle. But $self->{db} is never initialized — the bot exposes $self->{dbh} directly. The eval {} wrapper was silently swallowing the crash and returning undef, causing every login attempt to fail with “Internal error (DB unavailable)” instead of authenticating. Fixed by using $self->{dbh} directly and replacing a nonexistent level_id_to_desc() method with a direct USER_LEVEL query.
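Why the crash stayed invisible can be shown in a few lines. This is a reconstruction of the shape of the bug, not the original userLogin_ctx code; the placeholder string standing in for the database handle and the get_dbh helper are hypothetical.

```perl
use strict;
use warnings;

# Like the bot, this object has {dbh} but no {db}.
my $self = { dbh => 'DBH-PLACEHOLDER' };

# Buggy shape: the method call on undef dies inside eval,
# the result is undef, and nobody looks at $@.
my $dbh = eval { $self->{db}->dbh };
my $swallowed = $@;

# Safer shape: use the key that exists and fail loudly if it is missing.
sub get_dbh {
    my ($obj) = @_;
    my $handle = $obj->{dbh} or die "no database handle configured\n";
    return $handle;
}

print "eval returned: ", (defined $dbh ? $dbh : 'undef'), "\n";
print "hidden error: $swallowed";
print "get_dbh: ", get_dbh($self), "\n";
```

Logging $@ whenever an eval fails would have pointed straight at the missing {db} key.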

The Live Test Suite: 13 Scrolls, 68 Assertions

A full live testing framework was built to validate bot behavior against a real IRC server and a real (test) database.

Architecture

t/test_live.pl is the runner. It:

Creates a fresh mediabot_test database from t/live/schema_test.sql
Generates a test.conf from a template with randomized bot/spy nicks
Spawns the bot as a subprocess
Connects a “spy” IRC client that observes and sends commands
Runs all .t test files in order, with die_last always executed last
Tears down cleanly — kills the bot, drops the DB

The 13 Test Scrolls

#	File	Coverage
01	`01_connect.t`	WHOIS identity, version string
02	`02_routing.t`	PRIVMSG vs NOTICE routing
03	`03_auth.t`	Login success/failure, whoami
04	`04_dispatch_public.t`	Public command dispatch
05	`05_dispatch_private.t`	Private command dispatch
06	`06_commands_auth.t`	Authenticated command responses
07	`07_channel_commands.t`	chaninfo, chanset, access, seen
08	`08_user_commands.t`	users, userinfo, whoami, greet
09	`09_quotes.t`	q add/search/random/view/del
10	`10_external_commands.t`	date, leet, colors, echo, status
11	`11_ignores_responders.t`	ignore/unignore, yomomma, timers
12	`12_db_commands.t`	addcmd/showcmd/delcmd/searchcmd
13	`13_die_last.t`	Clean bot shutdown via `!die`

New Runner Features

--from N / --to N — Run only a subset of tests (e.g. --from 10 to debug the last few)
die_last guarantee — Files matching die_last always run last, regardless of numeric prefix
Bot death detection — Before each test, kill(0, $bot_pid) checks the bot is still alive; remaining tests are skipped with a clear message if it has crashed
Automatic debug output — On timeout, the last lines received from the bot (and the bot’s own log tail) are printed automatically
Buffer drain helper — A $drain closure is passed to each test file to flush residual multi-line responses between commands
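The ordering and liveness features can be sketched as below. This is not the code from t/test_live.pl: plan_tests and bot_alive are invented names, and applying --from/--to to die_last files as well is an assumption about how the runner treats the range.

```perl
use strict;
use warnings;

# Pick and order test files: numeric prefix ascending, die_last files at the end.
sub plan_tests {
    my ($files, %opt) = @_;
    my $from = $opt{from} // 0;
    my $to   = $opt{to}   // 1e9;
    my $num  = sub { $_[0] =~ /^(\d+)/ ? $1 + 0 : 0 };

    my @picked = grep { my $n = $num->($_); $n >= $from && $n <= $to } @$files;
    my @last   = grep {  /die_last/ } @picked;
    my @rest   = grep { !/die_last/ } @picked;
    return ((sort { $num->($a) <=> $num->($b) } @rest), @last);
}

# Signal 0 sends nothing; it only reports whether the process can be signalled.
sub bot_alive {
    my ($pid) = @_;
    return kill(0, $pid) ? 1 : 0;
}

my @files = qw(13_die_last.t 02_routing.t 10_external_commands.t 01_connect.t);
print join(' ', plan_tests(\@files, from => 2)), "\n";
print "runner alive: ", bot_alive($$), "\n";
```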

Running the Suite

# Full suite
perl t/test_live.pl --verbose --server localhost --port 6667 --channel '#mbtest'

# From test 10 onwards
perl t/test_live.pl --verbose --server localhost --port 6667 --channel '#mbtest' --from 10

# A specific range
perl t/test_live.pl --verbose --server localhost --port 6667 --channel '#mbtest' --from 07 --to 09

What’s Still in the Room of Requirement

A few items are flagged for future work:

rehash crash — The command kills the bot under certain conditions; cause under investigation
resolve/whereis in tests — Excluded from the test suite pending a stable non-blocking implementation
Radio module — Not tested (Liquidsoap/Icecast not operational in the test environment)
ignores list verification — The “list after add” test is skipped; a DB scope mismatch is suspected

By the Numbers

Metric	Before	After
`Mediabot.pm` lines	15,530	1,068
Modules	1 monolith	10 focused modules
Foreign keys	0	19
Live test assertions	0	68
Known crash bugs fixed	0	3

“It is our choices, Harry, that show what we truly are, far more than our abilities.” — This commit chose modularity.
