Some updates bring a shiny new command.
Others descend into the dungeons, wake the sleeping child processes, question suspicious timers under Veritaserum, and politely ask Chromium why it is still holding the entire castle hostage. 🕸️
This Mediabot v3 hardening pass did not reinvent the bot. It did something more valuable:
It made the existing features faster, safer, more honest, and far harder to freeze.
The rule of the day was simple:
No download, DNS lookup, child process, timeout,
poll, Partyline command, or AI reply
may hold the whole IRC bot hostage.
And one sacred promise was kept from beginning to end:
No database schema change.
No new table.
No new column.
No migration.
No stored-data conversion.
So grab your wand, keep an eye on the process table, and mind the zombies. 👻
Mediabot::Auth used:
$level ||= 3;
In Perl, numeric zero is false.
That meant an authentication failure intentionally logged at level 0 could be silently promoted to DEBUG3 and disappear from normal production logs.
A critical error had effectively discovered an invisibility cloak. 🫥
The fix now applies the default only when the level is undefined:
$level = 3 unless defined $level;
Level-zero authentication failures remain visible exactly as intended.
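The difference is easy to check in isolation. The sketch below compares the two defaulting styles using two hypothetical helper subs; the names are made up for illustration and are not Mediabot's actual logging code.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not Mediabot's real code.

# Old behaviour: any false value, including 0, is replaced by 3.
sub level_with_or {
    my ($level) = @_;
    $level ||= 3;
    return $level;
}

# Fixed behaviour: only an undefined level falls back to 3.
sub level_with_defined {
    my ($level) = @_;
    $level = 3 unless defined $level;
    return $level;
}

printf "||=     : 0 -> %d, undef -> %d\n", level_with_or(0),      level_with_or(undef);
printf "defined : 0 -> %d, undef -> %d\n", level_with_defined(0), level_with_defined(undef);
# ||=     : 0 -> 3, undef -> 3
# defined : 0 -> 0, undef -> 3
```

On Perl 5.10 and later, the defined-or assignment `$level //= 3;` expresses the same rule in one operator.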
The Metrics logger translated symbolic ERROR messages to numeric level 1.
But in Mediabot, level 0 is the level that remains visible even with minimal debugging enabled.
A Prometheus listener bind failure or radio-status provider error could therefore vanish into the Restricted Section without leaving a note.
Severity mapping is now explicit and consistent:
INFO → 0
ERROR → 0
WARN → 1
DEBUG → 2
Alternate logger objects also receive the correct named method instead of a vague approximation.
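One way to keep such a mapping explicit is a single lookup table. This is only a sketch: the four values come from the mapping above, while the sub name and the fallback for unknown names are assumptions made here.

```perl
use strict;
use warnings;

# Values taken from the mapping above.
my %SEVERITY_TO_LEVEL = (
    INFO  => 0,
    ERROR => 0,
    WARN  => 1,
    DEBUG => 2,
);

# Hypothetical helper. Assumption: unknown names fall back to level 0,
# so that an unexpected severity is never hidden by accident.
sub severity_to_level {
    my ($name) = @_;
    my $key = uc($name // '');
    return exists $SEVERITY_TO_LEVEL{$key} ? $SEVERITY_TO_LEVEL{$key} : 0;
}

print "ERROR maps to level ", severity_to_level('ERROR'), "\n";   # ERROR maps to level 0
```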
When pollstop closed a poll, the winner could be announced as:
Winner: 0
Technically, that was the internal zero-based option index.
Humanly, it was about as useful as a Marauder’s Map with no names on it. 🗺️
Weighted polls had another inconsistency:
pollresult respected weights;
pollstop ignored them.
The same poll could therefore produce two different winners.
The poll system now announces the winning option by name, and pollstop applies weights just as pollresult does.
Example:
Pizza x3 : 2 votes → score 6
Sushi x1 : 3 votes → score 3
The winner is now Pizza.
Not option 0.
Not Sushi just because it had more voters.
Justice has returned to the Great Hall.
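The arithmetic in the example can be sketched as follows. The data layout and the sub name are hypothetical, since the real poll storage is not shown here, and letting the first option win a tie is an assumption.

```perl
use strict;
use warnings;

# Hypothetical option records: label, weight, and number of votes.
# Score = votes * weight; on a tie the earlier option wins (an assumption).
sub poll_winner {
    my (@options) = @_;
    return unless @options;

    my ($best, $best_score);
    for my $opt (@options) {
        my $score = $opt->{votes} * $opt->{weight};
        if (!defined $best_score || $score > $best_score) {
            ($best, $best_score) = ($opt, $score);
        }
    }
    # Return the human-readable label, never the internal index.
    return ($best->{label}, $best_score);
}

my ($label, $score) = poll_winner(
    { label => 'Pizza', weight => 3, votes => 2 },
    { label => 'Sushi', weight => 1, votes => 3 },
);
print "Winner: $label (score $score)\n";   # Winner: Pizza (score 6)
```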
The yt-dlp watcher decoded child status with:
my $exit = $? >> 8;
That works for a normal process exit.
But when a process is killed by TERM or KILL, the signal lives in the low bits of the raw wait status. Shifting by eight could produce exit code 0.
A real timeout could therefore end with this misleading message:
download finished, but no readable MP3 file was produced
The new logic distinguishes several outcomes, including a waitpid() failure.
Timeouts use the conventional exit code 124, and the proper timeout message finally reaches the user.
No more Polyjuice Potion for failed downloads. 🧪
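A minimal sketch of decoding the raw wait status is shown below. The sub name, the field names, the `$timed_out` flag and the 128-plus-signal convention for killed children are assumptions made for this example; only exit code 124 for timeouts comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical decoder for a raw wait status (the value of $? after waitpid).
sub decode_wait_status {
    my ($raw, $timed_out) = @_;

    # Perl sets $? to -1 when the wait itself failed.
    return { kind => 'waitpid_failed', exit => undef, signal => 0 } if $raw == -1;

    my $signal = $raw & 127;   # low 7 bits: terminating signal, 0 if none
    my $exit   = $raw >> 8;    # high byte: exit code of a normal exit

    return { kind => 'timeout',  exit => 124,           signal => $signal } if $timed_out;
    return { kind => 'signaled', exit => 128 + $signal, signal => $signal } if $signal;
    return { kind => 'exited',   exit => $exit,         signal => 0 };
}

# A child killed by TERM (signal 15) has raw status 15, and 15 >> 8 is 0,
# which is exactly how a timeout used to look like a clean exit.
my $st = decode_wait_status(15, 0);
print "kind=$st->{kind} signal=$st->{signal}\n";   # kind=signaled signal=15
```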
The Chromium fallback already limited how long it read stdout and stderr.
But after both pipes closed, it still performed:
waitpid($pid, 0);
Pipe EOF does not prove that the child process has exited.
A broken Chromium process could close both pipes, remain alive, and freeze Mediabot forever.
The new child-reaping sequence is bounded:
the child is polled with WNOHANG;
if it is still alive, it receives TERM;
if that is not enough, it receives KILL.
One cursed web page can no longer turn the whole bot into stone. 🪨
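A bounded reaping sequence of this kind can be sketched as a small state machine whose step never blocks. Everything here is an illustration under assumptions: the sub names, the phase names, the grace period and the final give-up step are invented for the example, and the short sleep in the demo merely stands in for ticks of the bot's event loop.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(time sleep);

# Hypothetical reaper state; phases: wait -> term -> kill -> done.
sub new_reaper {
    my ($pid, $grace) = @_;
    return { pid => $pid, grace => $grace, phase => 'wait', reaped => 0,
             deadline => time() + $grace, last_signal => undef };
}

# One non-blocking step, meant to be called from a periodic timer.
sub reap_step {
    my ($st) = @_;
    return 'done' if $st->{phase} eq 'done';

    my $r = waitpid($st->{pid}, WNOHANG);
    if ($r == $st->{pid} || $r == -1) {     # reaped now, or no such child
        @{$st}{qw(phase reaped)} = ('done', 1);
        return 'done';
    }

    if (time() >= $st->{deadline}) {
        if ($st->{phase} eq 'kill') {       # give up: never wait forever
            $st->{phase} = 'done';
            return 'done';
        }
        my $sig = $st->{phase} eq 'wait' ? 'TERM' : 'KILL';
        kill $sig, $st->{pid};
        $st->{last_signal} = $sig;
        $st->{phase}       = lc $sig;
        $st->{deadline}    = time() + $st->{grace};
    }
    return $st->{phase};
}

# Demo: a child that ignores TERM, so the sequence has to escalate to KILL.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) { $SIG{TERM} = 'IGNORE'; sleep(30); POSIX::_exit(0) }

my $reaper = new_reaper($pid, 0.2);
sleep(0.05) while reap_step($reaper) ne 'done';   # stands in for timer ticks
print "reaped=$reaper->{reaped}, last signal sent: $reaper->{last_signal}\n";
```

Because each call to `reap_step` returns immediately, the same step could be driven by whatever timer mechanism the surrounding event loop provides.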
.eval Loses Its Blocking Hourglass
The Owner-only .eval watchdog paused the main event loop with:
usleep(500_000);
For half a second, everything stopped.
Worse, an unconditional waitpid($pid, 0) could freeze the bot indefinitely if evaluated code closed its output and kept running.
The watchdog is now fully asynchronous:
TERM and KILL escalation use timers;
the child is reaped with WNOHANG.
The dangerous spell remains Owner-only, but its containment wards are much stronger. 🪄
radiodlcancel Now Means “Actually Cancelled”
The old cancellation path sent KILL, called:
waitpid($pid, WNOHANG);
once, then immediately announced success.
But WNOHANG may return zero while the child is still alive.
Mediabot could therefore announce a successful cancellation while leaving a yt-dlp process lingering.
Cancellation now has real states:
active
cancelling → cancel_phase=term
cancelling → cancel_phase=kill
Cleanup occurs only after confirmed child reaping.
Repeated cancel commands no longer spawn duplicate timer chains.
The process must leave the castle before the gates are declared closed. 🚪
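The state transitions above can be sketched without any real process. In this example the job record, the two subs, the `$send_signal` hook and the final `cancelled` state are hypothetical; only `active`, `cancelling` and the two `cancel_phase` values come from the text.

```perl
use strict;
use warnings;

# Start a cancellation. Returns true only for the first request, so the
# caller arms exactly one timer chain; repeated cancels are no-ops.
sub request_cancel {
    my ($job, $send_signal) = @_;
    return 0 if $job->{state} ne 'active';
    $job->{state}        = 'cancelling';
    $job->{cancel_phase} = 'term';
    $send_signal->('TERM');
    return 1;
}

# One timer tick. $child_reaped is true once waitpid(..., WNOHANG) has
# actually returned the child's pid.
sub cancel_tick {
    my ($job, $child_reaped, $send_signal) = @_;
    return $job->{state} unless $job->{state} eq 'cancelling';

    if ($child_reaped) {
        $job->{state} = 'cancelled';   # cleanup is safe only from here on
        delete $job->{cancel_phase};
    }
    elsif ($job->{cancel_phase} eq 'term') {
        $job->{cancel_phase} = 'kill';
        $send_signal->('KILL');
    }
    return $job->{state};
}

my @sent;
my $job  = { state => 'active' };
my $hook = sub { push @sent, $_[0] };

request_cancel($job, $hook);   # first cancel: sends TERM
request_cancel($job, $hook);   # second cancel: ignored
cancel_tick($job, 0, $hook);   # child still alive: escalate to KILL
cancel_tick($job, 1, $hook);   # child reaped: now really cancelled
print "state=$job->{state} signals=@sent\n";   # state=cancelled signals=TERM KILL
```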
resolve Finally Becomes Truly Asynchronous
Reverse DNS used:
gethostbyaddr(...)
directly inside the main Mediabot process.
A slow resolver could pause the entire bot.
Forward lookup already used a child process, but Mediabot still waited a fixed three seconds before reading the result.
Even a lookup completed in milliseconds had to perform its full dramatic entrance.
Forward and reverse lookup now share one asynchronous pipeline:
escalation from TERM to KILL is asynchronous;
the child is reaped with WNOHANG.
Fast DNS is now fast.
Slow DNS no longer drags the whole bot into the Forbidden Forest. 🌲
Long AI answers are split into IRC-sized chunks.
Previously, each chunk was separated by a blocking usleep().
With four chunks and the default pacing delay, Mediabot could pause for roughly three seconds.
The new system uses an asynchronous queue per IRC target.
The existing settings remain unchanged:
openai.SLEEP_US
anthropic.SLEEP_US
The AI still speaks politely.
It simply stops putting the entire castle to bed between sentences. 😴
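A per-target queue of this kind could look roughly like the sketch below. All names are hypothetical, the fixed-width split is a simplification of real IRC chunking, and the pacing timer itself is left to the event loop, which would presumably derive its delay from the existing SLEEP_US settings.

```perl
use strict;
use warnings;

my %queue_for;   # IRC target => array ref of pending chunks

# Simplified splitter: cuts every $max characters, ignoring word boundaries.
sub split_chunks {
    my ($text, $max) = @_;
    my @chunks;
    push @chunks, substr($text, 0, $max, '') while length $text;
    return @chunks;
}

# Queue a reply. Returns true if the target was idle, meaning the caller
# should arm one pacing timer for it.
sub enqueue_reply {
    my ($target, $text, $max) = @_;
    my $was_idle = !@{ $queue_for{$target} ||= [] };
    push @{ $queue_for{$target} }, split_chunks($text, $max);
    return $was_idle ? 1 : 0;
}

# Called from the pacing timer: sends one chunk and returns true if the
# timer should be re-armed because more chunks are waiting.
sub send_next {
    my ($target, $send) = @_;
    my $q = $queue_for{$target} or return 0;
    my $chunk = shift @$q;
    $send->($target, $chunk) if defined $chunk;
    delete $queue_for{$target} unless @$q;
    return exists $queue_for{$target} ? 1 : 0;
}

my @sent;
my $send = sub { push @sent, "$_[0]: $_[1]" };
enqueue_reply('#hogwarts', 'abcdefghij', 4);
1 while send_next('#hogwarts', $send);   # a real bot waits between ticks
print "$_\n" for @sent;
# #hogwarts: abcd
# #hogwarts: efgh
# #hogwarts: ij
```

Between two ticks the main loop stays free, so other targets and commands are served while a long answer drains.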
Every telnet or DCC Partyline connection performed reverse DNS in the main process.
A slow lookup could delay the entire bot before the user had even entered.
The new behavior: the connection is accepted immediately, and the reverse lookup no longer runs in the main process.
The DNS owl may arrive later.
The door remains open. 🦉
whereis Finally Knows Where to Send the Reply
The whereis command could return undef in several cases, for example when the country field was missing.
The visible result could become:
Country :
Private requests had another flaw: the reply was always sent to a channel, even when no channel existed.
The corrected command now guarantees:
N/A for controlled failure cases;
a reply target that actually exists, even for private requests.
No more empty owl.
No more answer thrown into a wall.
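Both guarantees fit in a few lines. In this sketch the sub names and the shape of the lookup result are hypothetical, and replying to the nick when no channel exists is an assumption about the intended behavior.

```perl
use strict;
use warnings;

# Return the country from a lookup result, or 'N/A' when the result is
# missing, malformed, or has an empty country field.
sub country_or_na {
    my ($info) = @_;
    my $country = ref($info) eq 'HASH' ? $info->{country} : undef;
    return (defined $country && length $country) ? $country : 'N/A';
}

# Pick the reply target: the channel if there is one, otherwise the nick
# (assumption: a private request is answered privately).
sub reply_target {
    my ($channel, $nick) = @_;
    return (defined $channel && length $channel) ? $channel : $nick;
}

printf "Country : %s\n", country_or_na({ city => 'Hogsmeade' });   # Country : N/A
printf "Reply to: %s\n", reply_target(undef, 'harry');             # Reply to: harry
```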
This entire series follows the same engineering principle.
Before:
sleep
usleep
waitpid(..., 0)
fixed waiting
state cleared too early
signal interpreted as success
duplicate timers
silent failure
After:
asynchronous timers
bounded deadlines
waitpid(..., WNOHANG)
explicit runtime state
idempotent cleanup
single final reply
accurate failure reporting
This pass is not spectacular because a new button appeared.
It is spectacular because the old buttons are now much harder to break. 🛡️
Each correction received focused regression coverage.
Across MB304 to MB314, 218 targeted assertions exercised these corrections.
The final full project suite remains the last gate before commit and push.
The pre-commit guard already proved useful in real conditions: it detected a local SQL artifact and blocked the commit before a schema-related file could be added accidentally. 🧯
Even the anti-mistake wards got their own practical exam.
Still none.
0 new tables
0 modified columns
0 migrations
0 data conversions
0 SQL files included in the commit
The database watched the entire adventure from a comfortable armchair and was never disturbed. ☕
Mediabot v3 is now faster, safer, more honest, and far harder to freeze.
It did not receive a new broom.
It received something better:
stronger wards, better brakes, and far fewer ghosts in the process table. 👻✨