Forum teuk.org

Command Output Polish, Safer Matching, and Runtime Hardening

in Mediabot · started by TeuK · 1w ago

TeuK · 1w ago

Mediabot v3 — Command Output Polish, Safer Matching, and Runtime Hardening

Context

This development pass focused on improving Mediabot v3 without changing its spirit.

The main goals were simple:

  • make long IRC outputs readable instead of truncated;
  • avoid accidental SQL wildcard behavior when users type % or _;
  • replace old LIKE lookups with exact matches where the value is an identifier;
  • remove large GROUP_CONCAT hostmask outputs;
  • harden a few long-running runtime paths;
  • add regression tests for every improvement.

This was a broad quality pass across public commands, channel tools, user/account tools, quote search, responders, Partyline, and scheduler controls.


Suggested commit message

🪄 Revelio Maxima: polish command output, protect LIKE lookups, and tidy runtime spells

Alternative shorter version:

🪄 Revelio Maxima: paginate outputs and harden Mediabot lookups

Main theme: no more unreadable one-line output

Several commands used to build one huge IRC line, then truncate it around 300–400 characters.

That meant useful information could be silently lost.

This pass replaced those old outputs with a consistent pattern:

short summary
paginated details by NOTICE
numbered pages
safe line length

The affected commands now behave much better on real IRC channels.


Commands now paginated

searchcmd

searchcmd now sends a short public summary and paginated details by notice.

It also uses MariaDB-safe LIKE escaping with:

ESCAPE '!'

This prevents % and _ from behaving as accidental SQL wildcards.

Files:

Mediabot/DBCommands.pm
t/cases/18_searchcmd_pagination.t

topcmd

topcmd now avoids dumping the top 20 commands into one long line.

It sends:

Top commands: X result(s), showing max 20
topcmd[01]: ...
topcmd[02]: ...

Files:

Mediabot/DBCommands.pm
t/cases/22_topcmd_pagination.t

popcmd

popcmd now paginates popular commands for a given handle.

It also escapes SQL LIKE patterns safely, so % and _ in the searched handle are treated literally.

Files:

Mediabot/DBCommands.pm
t/cases/24_popcmd_pagination.t

owncmd

owncmd now paginates command owner statistics instead of building one huge owner list.

It also limits the owner list to a sane maximum.

Files:

Mediabot/DBCommands.pm
t/cases/25_owncmd_pagination.t

countcmd and lastcmd

countcmd now paginates category counts.

lastcmd now paginates the most recent commands.

Files:

Mediabot/DBCommands.pm
t/cases/26_count_last_cmd_pagination.t

chanlist

chanlist now sends a clean summary and paginated channel list.

Instead of repeating a huge header on every output line, it now outputs:

Registered channels: X result(s)
chanlist[01]: #chan1(3) #chan2(8) ...

Files:

Mediabot/ChannelCommands.pm
t/cases/27_channel_list_pagination.t

nicklist

nicklist now uses numbered pages:

Users on #channel: X result(s)
nicklist[01]: nick1 nick2 nick3 ...

The old max-length chunking code was removed.

Files:

Mediabot/ChannelCommands.pm
t/cases/36_nicklist_pagination.t

whotalk

whotalk now keeps its useful “top talkers” feature but no longer truncates the result list.

It sends a short summary in-channel and the detailed list by notice.

The gentle flood warning remains intact.

Files:

Mediabot/Helpers.pm
t/cases/28_whotalk_pagination.t

checkhost, checknick, and checkhostchan

The host/nick investigation commands now paginate their result sets.

This avoids long single-line outputs when many hostmasks or nicks match.

Files:

Mediabot/Helpers.pm
Mediabot/ChannelCommands.pm
t/cases/29_checkhost_checknick_pagination.t
t/cases/30_checkhostchan_pagination.t

topsay

topsay now paginates repeated sayings instead of stopping early because the output line became too long.

It keeps:

  • the SQL LIMIT 30;
  • the skip-pattern filtering;
  • action formatting;
  • channel/private behavior.

Files:

Mediabot/UserCommands.pm
t/cases/37_topsay_pagination.t

cstat

cstat now paginates authenticated users instead of building one huge line.

Files:

Mediabot/UserCommands.pm
t/cases/35_cstat_pagination.t

Safer SQL LIKE handling

A recurring problem was not SQL injection — prepared statements were already used.

The issue was correctness.

In SQL LIKE, these characters are special:

%  any sequence of characters
_  any single character

So if a user searches for a literal % or _, the database may return far too many rows.

This pass standardized literal LIKE escaping with:

ESCAPE '!'

and the following escaping rules:

!  -> !!
%  -> !%
_  -> !_

Commands hardened with LIKE ESCAPE '!'

searchcmd

Fixed MariaDB quoting problems and accidental wildcard behavior.


quote search

mbQuoteSearch now uses MariaDB-safe LIKE escaping.

Multi-word quote search still works with AND logic.

Files:

Mediabot/Quotes.pm
t/cases/12_hotfix_after_claude.t
t/cases/33_quote_search_like_escape.t

checkhost and checkhostchan

Host searches now escape %, _, and ! before building the host suffix pattern.

Also fixed the checkhostchan summary so it matches its SQL limit.

Files:

Mediabot/Helpers.pm
Mediabot/ChannelCommands.pm
t/cases/39_checkhost_like_escape.t

qlog

qlog now escapes nick and public text searches safely.

It still keeps ordered multi-term matching:

word1 -> word2 -> word3

Files:

Mediabot/ChannelCommands.pm
t/cases/40_qlog_like_escape.t

topsay

topsay now escapes the nick filter safely.

Files:

Mediabot/UserCommands.pm
t/cases/41_topsay_like_escape.t

Exact matching where LIKE was the wrong tool

Some database fields are identifiers, not search patterns.

For those, LIKE was not appropriate.

Public command names

Public command lookups now use exact matches:

WHERE command = ?

instead of:

WHERE command LIKE ?

This affects:

  • duplicate checks;
  • command removal;
  • hold/unhold;
  • showcmd;
  • runtime command execution.

Files:

Mediabot/DBCommands.pm
t/cases/42_public_command_exact_match.t

Responders

Responders now use exact matching too.

That prevents responder names containing % or _ from matching unintended rows, and it avoids accidental broad deletes in delresponder.

Files:

Mediabot/DBCommands.pm
t/cases/43_responder_exact_match.t

Hostmask output cleanup

Several old commands used GROUP_CONCAT(hostmask...) and then dumped all hostmasks into one line.

That does not age well on a long-running bot.

This pass replaced those paths with row-by-row hostmask fetching and paginated output.


whoami

whoami now shows hostmasks as paginated notice lines:

Masks: X shown, max 20
whoami-masks[01]: mask1 | mask2

Files:

Mediabot/LoginCommands.pm
t/cases/31_whoami_hostmask_pagination.t

userinfo

userinfo now does the same:

Hostmasks: X shown, max 20
userinfo-masks[01]: mask1 | mask2

Files:

Mediabot/UserCommands.pm
t/cases/32_userinfo_hostmask_pagination.t

Partyline .match

Partyline .match no longer uses GROUP_CONCAT for user hostmasks.

It now displays hostmasks cleanly:

Hosts   : X shown, max 20
Hosts[01]: mask1 | mask2

Files:

Mediabot/Partyline.pm
t/cases/34_partyline_match_hostmask_pagination.t

Topic autologin hostmask lookup

The topic autologin path no longer fetches hostmasks as one comma-concatenated SQL string.

It now reads them row by row and keeps the same matching behavior.

Files:

Mediabot/ChannelCommands.pm
t/cases/38_topic_autologin_hostmask_fetch.t

Authentication and legacy auth cleanup

Legacy auth hash safety

The legacy pass and ident paths now protect make_password_hash() with eval.

If hashing fails, the bot logs the problem and returns cleanly instead of risking an exception through the IRC event loop.

Files:

Mediabot/LoginCommands.pm
t/cases/23_legacy_auth_hash_safety.t

Statement handle cleanup

checkAuthByUser() now finishes statement handles on more return paths.

This avoids subtle leaks in legacy authentication code on a long-running bot.

Files:

Mediabot/LoginCommands.pm
t/cases/23_legacy_auth_hash_safety.t

Partyline improvements

.schedule

Partyline now has runtime scheduler control:

.schedule list
.schedule status <name>
.schedule start <name>
.schedule stop <name>
.schedule restart <name>

It is restricted to Master/Owner level.

It also has safer runtime behavior:

  • uses task_info() when available;
  • falls back to all_info;
  • wraps scheduler actions in eval;
  • reports clean errors to the Partyline.

Files:

Mediabot/Partyline.pm
t/cases/21_partyline_schedule_control.t

.match

As described above, .match now paginates hostmasks and avoids large SQL concatenation.


Background maintenance visibility

Background purges now notify the console channel when they actually delete rows.

No rows deleted means no noise.

Files:

Mediabot/Mediabot.pm
t/cases/20_purge_console_notice.t

Help and usability

help

help now routes to level-filtered command display when appropriate instead of only pointing users at the wiki.

It still supports explicit documentation requests such as:

help wiki
help docs

Files:

Mediabot/Mediabot.pm
t/cases/19_help_routes_to_showcommands.t

Test coverage

This pass added or extended many regression tests under:

t/cases/

The tests cover:

  • pagination behavior;
  • MariaDB-safe LIKE escaping;
  • exact lookup behavior;
  • hostmask output cleanup;
  • Partyline scheduler control;
  • legacy auth robustness;
  • background purge visibility.

The test strategy was intentionally mostly static because many of these checks are about preventing old unsafe patterns from coming back.

Examples:

no GROUP_CONCAT(hostmask)
no command LIKE ? for exact command lookup
LIKE ? ESCAPE '!' required where user text is searched
paginated detail lines required for long outputs

Suggested validation

Syntax checks:

perl -I. -c Mediabot/DBCommands.pm
perl -I. -c Mediabot/ChannelCommands.pm
perl -I. -c Mediabot/Helpers.pm
perl -I. -c Mediabot/UserCommands.pm
perl -I. -c Mediabot/LoginCommands.pm
perl -I. -c Mediabot/Partyline.pm
perl -I. -c mediabot.pl

Full regression suite:

env LANG=C.UTF-8 LC_ALL=C.UTF-8 perl t/test_commands.pl --verbose

Useful live IRC checks:

m searchcmd check
m topcmd
m popcmd teuk
m owncmd
m countcmd
m lastcmd
m chanlist
m nicklist #teuk
m whotalk #teuk
m checkhost wanadoo.fr
m checknick Te[u]K
m checkhostchan #teuk wanadoo.fr
m whoami
m userinfo teuk
m cstat
m topsay #teuk Te[u]K
m qlog #teuk test
m q search test

Useful Partyline checks:

.match teuk
.schedule list
.schedule status channel_ban_expire

Suggested Git commands

Before committing, inspect the staged files carefully:

cd /home/mediabot/mediabot_v3

git status --short
git diff --stat

Then stage the intended files only:

git add   Mediabot/DBCommands.pm   Mediabot/ChannelCommands.pm   Mediabot/Helpers.pm   Mediabot/UserCommands.pm   Mediabot/LoginCommands.pm   Mediabot/Partyline.pm   Mediabot/Mediabot.pm   Mediabot/Quotes.pm   t/cases/

Check what will be committed:

git diff --cached --stat
git diff --cached --name-only

Commit:

git commit -m "🪄 Revelio Maxima: polish command output, protect LIKE lookups, and tidy runtime spells"

Push:

git push

Closing note

This was not a cosmetic pass.

It made the bot more usable in real IRC channels, safer with MariaDB pattern matching, cleaner for long-running operation, and much harder to regress thanks to dedicated tests.

A lot of small spells, but together they make Mediabot feel sharper, safer, and far more polished.

You must be logged in to reply.