🛡️🧙‍♂️ Mediabot v3: The Runtime Shield Charm — logs, timers, caches, metrics and tests hardened

in Mediabot · started by TeuK · yesterday

TeuK · yesterday

Some commits add a shiny new command.

This one does something better.

It walks through the castle at night, checks every door, tightens the hinges, sweeps the hidden corridors, and makes sure nobody left passwords written on the walls.

This is a runtime hardening pass for Mediabot v3.

No database schema change.

No migration.

No new table.

No new column.

Just a large, careful set of fixes around long-running behavior, security, caching, timers, logging, metrics, channel state, and regression tests.

This is the kind of commit that does not shout.

It makes the bot harder to break.

🧭 What this pass is about

The recent development branch already had a lot going on: social commands, achievements, metrics, radio hardening, Claude integration, schema drift checks, better tests, and multiple command dispatch improvements.

After that kind of sprint, the right move is not always to add more sparkle.

The right move is to ask:

What can go stale?
What can leak?
What can fire twice?
What keeps living after it should be gone?
What does the test suite claim but not really prove?

This pass answers those questions.

🔐 Passwords are no longer written into private-message logs

One of the most important fixes is security-related.

Outbound private messages to IRC services were logged too literally.

That matters because the bot can talk to services such as:

NickServ
X / CService

and send messages like:

identify <password>
identify <account> <password>
login <user> <password>
register <password> <email>
ghost <nick> <password>
recover <nick> <password>
release <nick> <password>
set password <password>

The previous behavior could expose credentials in logs.

That is not acceptable.

The redaction logic now masks service passwords before writing log lines, while keeping the original IRC message unchanged on the wire.

Examples:

identify secret              -> identify ****
identify teuk secret         -> identify teuk ****
id teuk secret               -> id teuk ****
login mybot secret           -> login mybot ****
register secret user@mail    -> register **** user@mail
set password secret          -> set password ****

The redaction was then centralized in a shared helper, so the protection is not limited to a single outbound path.

The shared redactor is now used for private outbound logging in:

botPrivmsg
botAction
botNotice

The important rule is simple:

send the real message to IRC
log only the redacted copy
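To make the rule concrete, here is a minimal sketch of what such a redactor could look like. The function name `redact_service_message` and the exact regexes are illustrative assumptions, not the helper that ships in Mediabot; only the command shapes come from the list above.

```perl
use strict;
use warnings;

# Hypothetical helper (name and regexes are assumptions for illustration).
# Returns a copy of an outbound service message that is safe to log.
sub redact_service_message {
    my ($msg) = @_;
    return $msg unless defined $msg;
    my $copy = $msg;    # never touch the caller's string: it still goes on the wire

    # set password <password>
    return $copy if $copy =~ s/^(\s*set\s+password\s+)\S+/$1****/i;

    # register <password> <email>: the password is the first argument
    return $copy if $copy =~ s/^(\s*register\s+)\S+/$1****/i;

    # identify|id|login [<account>] <password>: mask everything after the
    # optional account so unexpected extra words cannot leak either
    return $copy if $copy =~ s/^(\s*(?:identify|id|login)\s+(?:\S+\s+)?)\S+.*$/$1****/i;

    # ghost|recover|release <nick> <password>
    return $copy if $copy =~ s/^(\s*(?:ghost|recover|release)\s+\S+\s+)\S+/$1****/i;

    return $copy;       # anything else is logged unchanged
}
```

A caller would send `$msg` to IRC untouched and pass only `redact_service_message($msg)` to the logger.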

🧾 Log rotation now writes to the correct file

The log rotation code had a subtle filehandle ordering bug.

The old order was effectively:

my $fh = $self->{logfilehandle};
$self->_maybe_rotate();
print $fh $logline;

On Linux, an open filehandle still points to the old inode even after the file is renamed, and the handle copied into $fh before the rotation check is exactly that old one.

So the first log line after rotation could be written into the rotated file instead of the newly opened log file.

The fix is straightforward but important:

$self->_maybe_rotate();
my $fh = $self->{logfilehandle};
print $fh $logline;

That way, post-rotation writes go where they belong.
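For anyone who wants to see the inode behavior in isolation, here is a small standalone demo. It is not Mediabot code: the file names and the `slurp` helper are made up for the example, and it assumes a POSIX filesystem where renaming an open file is allowed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that is removed when the script exits.
my $dir     = tempdir(CLEANUP => 1);
my $live    = "$dir/bot.log";      # hypothetical live log name
my $rotated = "$dir/bot.log.1";    # hypothetical rotated name

# Handle opened before rotation, like the stale copy in the old code.
open(my $old_fh, '>>', $live) or die "open: $!";

# "Rotation": rename the live file, then open a fresh one at the same path.
rename($live, $rotated) or die "rename: $!";
open(my $new_fh, '>>', $live) or die "reopen: $!";

print {$old_fh} "written via stale handle\n";
print {$new_fh} "written via fresh handle\n";
close($_) for $old_fh, $new_fh;

# Small helper to read a whole file back.
sub slurp {
    my ($path) = @_;
    open(my $in, '<', $path) or die "read $path: $!";
    local $/;
    my $content = <$in>;
    return defined $content ? $content : '';
}

print "rotated file: ", slurp($rotated);
print "live file:    ", slurp($live);
```

The line printed through the stale handle ends up in the rotated file, which is the misplacement the reordering avoids.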

Small bug.

Classic long-running daemon problem.

Good catch.

📡 Metrics HTTP Content-Length is byte-correct with UTF-8

The metrics HTTP endpoint now computes Content-Length from the encoded byte length, not from Perl's character count.

That matters when exported labels contain UTF-8.

For ASCII, character length and byte length are the same.

For UTF-8, they are not.

Wrong Content-Length can confuse HTTP clients, scrapers, or proxies.

The regression tests now prove the difference with accented characters and emoji.

Mediabot is an IRC bot. Unicode happens. The HTTP response must count bytes.
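A minimal sketch of the byte-versus-character distinction, using the core Encode module. The function name `build_metrics_response` and the header set are assumptions for illustration; the real endpoint may build its response differently.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical response builder (name and headers are assumptions).
# Takes the body as a Perl character string and returns raw bytes for the socket.
sub build_metrics_response {
    my ($body_chars) = @_;

    # Encode first, then measure: length() on the encoded string counts bytes.
    my $body_bytes = encode_utf8($body_chars);

    my $header = "HTTP/1.1 200 OK\r\n"
               . "Content-Type: text/plain; charset=utf-8\r\n"
               . "Content-Length: " . length($body_bytes) . "\r\n"
               . "Connection: close\r\n"
               . "\r\n";

    return $header . $body_bytes;
}

# "caf\x{e9}" is 4 characters but 5 bytes once encoded as UTF-8.
my $label = "caf\x{e9}";
printf "chars=%d bytes=%d\n", length($label), length(encode_utf8($label));
```

Calling `length()` on the unencoded string would under-report the size for any body containing non-ASCII characters.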

⏱️ Nicklist timers have a cleaner lifecycle

The nicklist timer behavior was hardened around channel add, join, part, and purge.

The important lifecycle is now validated:

addchan -> timer exists
part    -> timer is stopped, channel remains registered
join    -> timer comes back
purge   -> timer is stopped and channel disappears

This is exactly the kind of behavior that matters in a bot that stays online for a long time.

Timers should not be ghosts.

They should exist only when the channel lifecycle says they should.

🚪 PART no longer logs out users too aggressively

A user leaving one channel is not the same thing as quitting IRC.

The old behavior could globally log out an authenticated user when they PARTed one shared channel, even if they were still present with the bot on other channels.

That is wrong.

The bot now checks whether the nick is still present on another shared channel before logging them out.

Correct behavior:

Bob leaves #chanA but is still on #chanB -> keep auth
Bob leaves the last shared channel        -> logout
Bob QUITs IRC                             -> logout

PART semantics and QUIT semantics are no longer confused.
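The shared-channel check can be sketched like this. It assumes a nick registry shaped as channel name => array of nicks; the function name `still_shares_channel` and that data shape are assumptions, not the actual Mediabot internals.

```perl
use strict;
use warnings;

# Hypothetical check (name and data shape are assumptions).
# $channel_nicks: hashref of channel name => arrayref of nicks currently seen there.
# Returns 1 if $nick is still visible on any channel other than the one just parted.
sub still_shares_channel {
    my ($channel_nicks, $nick, $parted_channel) = @_;

    for my $chan (keys %$channel_nicks) {
        # Skip the channel being left, whether or not the nick was already removed from it.
        next if lc($chan) eq lc($parted_channel);

        # IRC nicks are case-insensitive, so compare folded copies.
        return 1 if grep { lc($_) eq lc($nick) } @{ $channel_nicks->{$chan} || [] };
    }
    return 0;
}
```

On PART the bot would log the user out only when this returns 0; on QUIT it would log out unconditionally.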

🧹 Channel purge now really clears runtime state

Purging a channel already removed database rows.

But the runtime could still keep channel-scoped state in memory.

That is a classic stale-cache problem.

The purge cleanup now covers a wider set of caches, including:

channels registry
hChannelNicks
_badword_cache
_af_params
_chan_flood
_chan_flood_conf
_cmd_cooldown
_cmd_cooldown_conf
_chanset_cache
_uchan_level_cache
_quote_last_rand
_quote_bynick_last
_quotegame
_karma_log
_karma_brigade
_karma_cooldown
_duel_stats
_duel_cooldown
_duel_streak
_duel_last_result
_ignore_cache

That prevents a deleted channel from coming back with stale runtime baggage if it is recreated later.
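In spirit, the cleanup is a loop over channel-scoped caches. The sketch below assumes each cache is a hash on the bot object keyed directly by channel name, which is a simplification: the real caches may use composite keys or different shapes. The function name and the shortened cache list are illustrative only.

```perl
use strict;
use warnings;

# A few of the cache names from the list above (shortened for the example).
my @CHANNEL_SCOPED_CACHES = qw(_badword_cache _chan_flood _cmd_cooldown _chanset_cache);

# Hypothetical cleanup (name and cache layout are assumptions).
# Deletes every entry keyed by $channel and returns how many entries were removed.
sub purge_channel_runtime_state {
    my ($self, $channel) = @_;
    my $removed = 0;

    for my $name (@CHANNEL_SCOPED_CACHES) {
        my $cache = $self->{$name};
        next unless ref($cache) eq 'HASH';    # cache never populated: nothing to clear

        # Channel names are case-insensitive on IRC, so match on folded keys.
        for my $key (grep { lc($_) eq lc($channel) } keys %$cache) {
            delete $cache->{$key};
            $removed++;
        }
    }
    return $removed;
}
```

Keeping the cache names in one list means a new channel-scoped cache only needs one extra entry to be covered by purge.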

A purge should not leave footprints.

👻 Claude session ghosts are cleaned correctly

Claude state has different key conventions:

history         -> raw IRC nick
persona         -> lower-case nick
activity marker -> lower-case nick

Earlier fixes handled manual !ai forget.

This pass continues the cleanup work by ensuring QUIT/NICK behavior uses the right key conventions too.

When a user changes nick or genuinely quits, Claude runtime state is purged correctly:

history by raw nick
persona by lc(nick)
activity by lc(nick)

No more case-sensitive ghost sessions hiding in memory.
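The key conventions are easy to get wrong, so here is a tiny sketch of a purge that respects them. The `$state` layout (`history`, `persona`, `activity` sub-hashes) and the function name are assumptions for illustration; only the raw-versus-lower-case rule comes from the description above.

```perl
use strict;
use warnings;

# Hypothetical purge (name and state layout are assumptions).
# history is keyed by the raw IRC nick; persona and activity by the lower-cased nick.
sub purge_claude_state {
    my ($state, $nick) = @_;

    delete $state->{history}{$nick};         # raw nick, exactly as seen on IRC
    delete $state->{persona}{ lc($nick) };   # lower-case key
    delete $state->{activity}{ lc($nick) };  # lower-case key

    return;
}
```

Using `lc($nick)` for all three, or the raw nick for all three, would leave at least one entry behind for any nick containing upper-case letters.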

🔎 `seen` wildcard matching now escapes SQL LIKE properly

The seen command supports IRC-style wildcards:

seen teu*
seen te?k

Internally this maps to SQL LIKE.

The dangerous part is that SQL LIKE has its own magic characters:

%  any sequence
_  any single character

So user input containing literal _ or % must be escaped.

The new conversion is character-by-character:

*  -> %
?  -> _
!  -> !!
%  -> !%
_  -> !_

and the query uses:

LIKE ? ESCAPE '!'

This affects both IRC seen and Partyline .seen.

A wildcard should only be a wildcard when the user actually asked for one.
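The conversion table above translates almost directly into code. This is a sketch of the idea, not the function from the commit; the name `irc_wildcard_to_like` is an assumption.

```perl
use strict;
use warnings;

# Hypothetical converter (name is an assumption).
# Turns an IRC-style wildcard pattern into a SQL LIKE pattern for use with ESCAPE '!'.
sub irc_wildcard_to_like {
    my ($pattern) = @_;

    my %map = (
        '*' => '%',     # IRC "any sequence"   -> LIKE "any sequence"
        '?' => '_',     # IRC "any one char"   -> LIKE "any one char"
        '!' => '!!',    # the escape character itself must be escaped
        '%' => '!%',    # literal percent sign
        '_' => '!_',    # literal underscore
    );

    # Walk the input one character at a time so each character is translated exactly once.
    return join '', map { exists $map{$_} ? $map{$_} : $_ } split //, $pattern;
}

print irc_wildcard_to_like('teu*'),    "\n";    # teu%
print irc_wildcard_to_like('te?k'),    "\n";    # te_k
print irc_wildcard_to_like('my_nick'), "\n";    # my!_nick
```

The result is then passed as the bound parameter for `LIKE ? ESCAPE '!'`, never interpolated into the SQL string. Going character by character matters: chained global substitutions could re-process characters that an earlier substitution produced.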

🧠 Ignore checks now use a short runtime cache

isIgnored() runs on the hot path.

Doing repeated SQL queries for every message is wasteful when ignore rules rarely change.

The new cache keeps ignore hostmasks briefly in memory:

global ignores
channel-specific ignores

with a short TTL:

30 seconds

The cache stores masks, not match results.

Every message is still matched against its real prefix using the existing hostmask matcher.

The cache is explicitly invalidated when ignore rules change:

ignore
unignore
channel purge

So the behavior remains correct while removing unnecessary repeated SQL work.
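Here is a minimal sketch of a mask cache with a TTL and explicit invalidation. The function names, the `%ignore_cache` layout, and the loader callback (standing in for the SQL query) are all assumptions for illustration.

```perl
use strict;
use warnings;

my $IGNORE_TTL = 30;    # seconds, as described above
my %ignore_cache;       # scope ("global" or a channel name) => { masks => [...], expires => epoch }

# Hypothetical accessor (name and signature are assumptions).
# $loader is a coderef that fetches the masks for a scope, e.g. by running the SQL query.
# $now is optional and only exists to make the sketch easy to test.
sub cached_ignore_masks {
    my ($scope, $loader, $now) = @_;
    $now = time unless defined $now;

    my $entry = $ignore_cache{$scope};
    return $entry->{masks} if $entry && $entry->{expires} > $now;    # still fresh

    # Missing or expired: reload the masks (not match results) and restart the TTL.
    my $masks = $loader->($scope);
    $ignore_cache{$scope} = { masks => $masks, expires => $now + $IGNORE_TTL };
    return $masks;
}

# Called from ignore, unignore, and channel purge. With no argument, clears everything.
sub invalidate_ignore_cache {
    my ($scope) = @_;
    if (defined $scope) { delete $ignore_cache{$scope} }
    else                { %ignore_cache = () }
    return;
}
```

Each incoming message would still be matched against the returned masks with the existing hostmask matcher; only the database round trip is skipped while the entry is fresh.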

🎭 `botAction()` now uses the same badword cache as `botPrivmsg()`

botPrivmsg() already used _badword_cache.

botAction() still had an older direct SQL path.

That meant channel ACTIONs could still hit the database repeatedly for BADWORDS, and the path taken when no badword matched was less careful about finishing its statement handle.

botAction() now follows the same cached pattern:

per-channel _badword_cache
TTL 300 seconds
finish statement handle on success
finish defensively on SQL error
increment DB error metric on failure

Same behavior.

Cleaner runtime.

Less repeated SQL.

🧭 Outbound helpers now recognize all IRC channel prefixes

Several helper paths treated only targets starting with # as channels.

But IRC channel prefixes can include:

#
&
!
+

A shared helper now handles this consistently:

_is_irc_channel_target($target)

using:

defined($target) && $target =~ /^[#&!+]/

This makes outbound classification consistent in:

botPrivmsg
botAction
botNotice

That means &local, !safe, and +modeless are no longer treated like private targets by mistake.

🧪 Recent regression tests now run directly

A large commit needs tests that are easy to run.

Some recent tests still used an old harness-only pattern:

return sub {
    ...
};

That works only when loaded by the custom test harness.

If run directly:

perl t/cases/test.t

they fail with:

Can't return outside a subroutine

That is annoying and fragile.

The recent tests now support both modes:

project harness
direct CLI execution
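One way to support both modes is the classic `caller` check. This is a sketch of the pattern, not a copy of the real test files: the assumption that the harness calls the returned coderef with an assert callback is mine, and the two assertions are placeholders.

```perl
use strict;
use warnings;

# The test body lives in a named closure instead of a bare top-level "return sub { ... }".
# ASSUMPTION: the harness invokes the returned coderef with an assert callback.
my $case = sub {
    my ($assert) = @_;
    $assert->(lc('TeuK') eq 'teuk',   'nick is folded to lower case');
    $assert->('&local' =~ /^[#&!+]/, 'channel prefix is recognized');
};

# Harness mode: the file was loaded with do/require, so caller() is true
# and a top-level return is legal.
return $case if caller;

# Direct mode (perl t/cases/test.t): run the same closure with a tiny TAP-style reporter.
my ($run, $failed) = (0, 0);
$case->(sub {
    my ($ok, $name) = @_;
    $run++;
    $failed++ unless $ok;
    print(($ok ? 'ok' : 'not ok'), " $run - $name\n");
});
print "1..$run\n";
die "$failed assertion(s) failed\n" if $failed;
```

The `return` is only executed when something actually loaded the file, so direct execution no longer dies with "Can't return outside a subroutine".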

A broken assertion involving @words interpolation was also fixed.

Run directly, the recent test set now validates:

PART logout and purge caches
nicklist timers and Metrics UTF-8 length
log rotation and password redaction
botAction badword cache
channel target detection and badword handle cleanup

🧪 Validation snapshot

The recent tests now run cleanly in direct mode:

387_mb128_part_logout_and_purge_caches.t       29/29 OK
388_mb129_nicklist_timer_and_metrics_utf8.t    22/22 OK
389_mb130_log_rotation_and_password_redact.t   25/25 OK
393_mb134_botaction_badword_cache.t            12/12 OK
394_mb135_channel_target_and_badword_finish.t  11/11 OK

That is 99 direct assertions across the recent hardening set.

This is exactly what we want before a large commit.

🧰 Files touched in this hardening batch

The batch spans runtime code and tests, including:

mediabot.pl
Mediabot/Helpers.pm
Mediabot/ChannelCommands.pm
Mediabot/DBCommands.pm
Mediabot/Log.pm
Mediabot/Metrics.pm
t/cases/387_mb128_part_logout_and_purge_caches.t
t/cases/388_mb129_nicklist_timer_and_metrics_utf8.t
t/cases/389_mb130_log_rotation_and_password_redact.t
t/cases/390_mb131_ignore_cache.t
t/cases/391_mb132_privmsg_redact_and_ignore_private.t
t/cases/392_mb133_outbound_private_log_redaction.t
t/cases/393_mb134_botaction_badword_cache.t
t/cases/394_mb135_channel_target_and_badword_finish.t

No schema file.

No migration file.

No database seed change.

🧯 Security note

The code now prevents future logging of common IRC service passwords.

But old logs may already contain secrets from before this hardening pass.

Recommended operator actions after deploying this build:

rotate IRC service passwords if they were present in old logs
review / purge old log archives if needed
check log collector/indexer retention if logs were shipped elsewhere
restrict log file permissions

The code prevents future leaks.

It cannot erase old ones.

🕯️ Final note

This is a large maintenance commit.

Not glamorous.

Important.

It fixes credential logging.
It fixes log rotation ordering.
It fixes HTTP byte length.
It fixes timer lifecycle.
It fixes PART auth semantics.
It clears runtime ghosts after purge.
It makes SQL wildcard matching safer.
It caches hot-path ignore lookups.
It aligns action/privmsg badword checks.
It recognizes IRC channel targets properly.
It makes the recent tests runnable directly.

That is not a cosmetic change.

That is the bot becoming more mature.

A little less haunted.

A little less leaky.

A little more ready to run for a long time.

🛡️🧙‍♂️🧹
