Forum teuk.org

Exact Lookups and DB Safety Pass

in Mediabot · started by TeuK · 1w ago

TeuK · 1w ago

Exact Lookups and DB Safety Pass

Context

This pass continued the cleanup work on Mediabot v3 after the previous output-pagination and LIKE-escaping work.

The focus this time was more internal:

  • replacing accidental SQL LIKE lookups with exact = lookups where the value is an identifier;
  • fixing missing imports discovered during real IRC testing;
  • hardening database helpers so they do not call execute() on a failed prepare();
  • making error paths return safely and log clearly;
  • ensuring statement handles are finished properly;
  • adding regression tests so these fixes do not silently regress later.

This was mostly invisible polish from a user point of view, but it makes the bot safer and more reliable as a long-running IRC service.


Suggested commit message

🛡️ Protego Maxima: harden DB lookups, exact-match spells, and runtime safety

Alternative shorter version:

🛡️ Protego Maxima: harden Mediabot DB lookup paths

Main theme: identifiers should not use SQL LIKE

Several commands and helpers were still using SQL LIKE ? for values that are not search patterns.

Examples:

  • channel names;
  • user nicknames / handles;
  • user levels;
  • command categories;
  • timezones;
  • Hailo exclusion nicks;
  • TMDB language channel lookup;
  • responder names;
  • public command names.

For those values, SQL wildcard behavior is not wanted.

A value such as:

te_k

should not accidentally match:

teak
teuk
teXk

because _ has a special meaning in SQL LIKE.

This pass replaced those cases with exact = comparisons.


Channel-name exact matching

ChannelCommands.pm

Channel lookups now use exact matches instead of SQL LIKE.

Affected areas include:

  • antifloodset;
  • TMDB language channel lookup;
  • channel access-related paths.

Before:

WHERE CHANNEL.name LIKE ?

After:

WHERE CHANNEL.name = ?

Files:

Mediabot/ChannelCommands.pm
t/cases/44_channel_name_exact_match.t

TMDB language lookup fix

A real runtime bug appeared during IRC testing:

Undefined subroutine &Mediabot::External::getTMDBLangChannel

The TMDB command lived in Mediabot::External, but getTMDBLangChannel() is defined in Mediabot::ChannelCommands.

External.pm now imports it explicitly:

use Mediabot::ChannelCommands qw(getTMDBLangChannel);

The TMDB channel-language lookup also uses exact channel-name matching.

Files:

Mediabot/External.pm
Mediabot/ChannelCommands.pm
t/cases/45_tmdb_lang_lookup_import.t

Hailo exact matching

Mediabot/Hailo.pm still had two LIKE lookups that should have been exact:

  • Hailo ignored nick lookup;
  • Hailo channel ratio lookup.

Before:

SELECT 1 FROM HAILO_EXCLUSION_NICK WHERE nick LIKE ?
WHERE CHANNEL.name LIKE ?

After:

SELECT 1 FROM HAILO_EXCLUSION_NICK WHERE nick = ?
WHERE CHANNEL.name = ?

Files:

Mediabot/Hailo.pm
t/cases/46_hailo_exact_match.t

User lookup exact matching

Several user lookup paths still used SQL LIKE even though they expected one exact user.

Affected areas:

  • channel access lookup;
  • WHOIS/access callback in mediabot.pl;
  • Fortnite ID lookup in External.pm.

Before:

USER.nickname LIKE ?
nickname LIKE ?

After:

USER.nickname = ?
nickname = ?

Files:

Mediabot/ChannelCommands.pm
Mediabot/External.pm
mediabot.pl
t/cases/47_user_lookup_exact_match.t

Command category exact matching

getCommandCategory() previously used:

WHERE description LIKE ?

Command categories are logical identifiers, not wildcard searches.

It now uses:

WHERE description = ?

The function was also cleaned up so it:

  • checks prepare();
  • checks execute();
  • logs DB failures;
  • finishes the statement before returning.

Files:

Mediabot/DBCommands.pm
t/cases/48_command_category_exact_match.t

MP3 search LIKE escaping

The mp3 search is intentionally a search command, so it still uses LIKE.

But user input must be escaped so %, _, and ! are treated literally.

The search now uses:

LIKE ? ESCAPE '!'

with escaping:

! -> !!
% -> !%
_ -> !_

The multi-token behavior is preserved:

queen bohemian

still becomes an ordered search like:

%queen%bohemian%

Files:

Mediabot/Helpers.pm
t/cases/49_mp3_like_escape.t

Timezone exact matching

_tz_exists() now uses exact timezone lookup.

Before:

SELECT tz FROM TIMEZONE WHERE tz LIKE ?

After:

SELECT tz FROM TIMEZONE WHERE tz = ?

It also now:

  • rejects empty timezone input;
  • checks prepare();
  • checks execute();
  • logs SQL errors;
  • finishes the statement handle before returning.

Files:

Mediabot/Helpers.pm
t/cases/50_timezone_exact_match.t

checknick exact matching

checknick used to switch to LIKE if the input contained % or _.

That was misleading because checknick <nick> is not a wildcard search command.

It now always uses exact nick lookup:

WHERE nick = ?

Pagination from the previous pass remains intact.

Files:

Mediabot/Helpers.pm
t/cases/51_checknick_exact_match.t

User level exact matching

getLevel()

getLevel() in Helpers.pm now uses exact USER_LEVEL.description lookup.

Before:

WHERE description LIKE ?

After:

WHERE description = ?

It also now handles DB failures cleanly.

Files:

Mediabot/Helpers.pm
t/cases/52_user_level_exact_match.t

getIdUserLevel()

getIdUserLevel() in UserCommands.pm received the same treatment.

Before:

WHERE description LIKE ?

After:

WHERE description = ?

It now also checks prepare() and execute() safely.

Files:

Mediabot/UserCommands.pm
t/cases/53_user_id_level_exact_match.t

Authentication DB safety

Auth::_resolve_user()

Mediabot/Auth.pm now uses a dedicated internal helper:

_fetch_user_row()

This avoids raw prepare() / execute() calls scattered directly inside _resolve_user().

The new helper:

  • checks the database handle;
  • checks prepare();
  • checks execute();
  • logs failures;
  • finishes statements before returning.

_fetch_hostmasks()

The hostmask fetch path was also hardened:

  • checks prepare();
  • checks execute();
  • logs failures;
  • finishes statements cleanly;
  • returns an empty string safely on DB failure.

Files:

Mediabot/Auth.pm
t/cases/54_auth_resolve_user_db_safety.t

UserCommands lookup helper safety

Three helper functions in Mediabot/UserCommands.pm were hardened:

getLevelUser()
getUserLevelDesc()
getUserChannelLevel()

They now:

  • verify prepare();
  • verify execute();
  • log clear errors;
  • finish statement handles before returning;
  • keep exact lookup semantics.

Files:

Mediabot/UserCommands.pm
t/cases/55_usercommands_lookup_db_safety.t

Helpers level/channel lookup safety

Several helper functions in Mediabot/Helpers.pm were hardened:

checkUserLevel()
checkUserChannelLevel()
getIdUserChannelLevel()
getUserChannelLevelByName()

Notable changes:

  • checkUserLevel() now uses exact level description matching;
  • all four helpers check prepare();
  • all four helpers check execute();
  • all four helpers log prepare/execute failures;
  • statement handles are finished properly.

Files:

Mediabot/Helpers.pm
t/cases/56_helpers_level_lookup_db_safety.t

WHOIS user lookup DB safety

Two WHOIS-related helpers were hardened:

get_user_from_whois()
getNickInfoWhois()

These paths matter because they are used around IRC identity/authentication logic.

They now:

  • check main query prepare();
  • check main query execute();
  • finish statements on error;
  • handle secondary level lookups safely;
  • log prepare/execute errors explicitly.

Files:

Mediabot/Helpers.pm
t/cases/57_whois_user_lookup_db_safety.t

getIdUser() DB safety

getIdUser() now:

  • rejects empty handles;
  • uses exact nickname lookup;
  • checks prepare();
  • checks execute();
  • logs DB errors;
  • finishes statements before returning.

Files:

Mediabot/Helpers.pm
t/cases/58_getiduser_db_safety.t

Channel lookup DB safety

The following channel-related paths were hardened:

get_channel_by_name()
logBotAction()
getIdChannelSet()

get_channel_by_name()

Now checks prepare() and execute() before creating a Mediabot::Channel object.

logBotAction()

Now handles both lookup and insert failures safely:

  • channel lookup prepare/execute errors;
  • insert prepare/execute errors;
  • clean statement finishing.

getIdChannelSet()

Now handles missing parameters and DB failures more clearly.

Files:

Mediabot/ChannelCommands.pm
Mediabot/Helpers.pm
t/cases/59_channel_lookup_db_safety.t

Autologin DB safety

getUserAutologin() now handles DB errors properly.

Old behavior could attempt execute() on an undefined statement handle if prepare() failed.

New behavior:

  • rejects empty handle input;
  • checks prepare();
  • checks execute();
  • logs errors;
  • returns 0 safely on DB failure;
  • finishes statement handles.

Files:

Mediabot/LoginCommands.pm
t/cases/60_getuserautologin_db_safety.t

User count DB safety

userCount() now:

  • checks prepare();
  • checks execute();
  • logs DB errors;
  • returns 0 safely on failure;
  • finishes statement handles.

Files:

Mediabot/Helpers.pm
t/cases/61_usercount_db_safety.t

Join and logout DB safety

Two important UserCommands.pm runtime paths were hardened:

dbLogoutUsers()
userOnJoin()

dbLogoutUsers()

Now handles:

  • missing DB handle;
  • prepare failure;
  • execute failure;
  • statement cleanup;
  • explicit success/failure return.

userOnJoin()

This path runs on every JOIN and may apply:

  • automode;
  • greet;
  • channel notice.

It now handles:

  • user-channel query prepare/execute failure;
  • channel notice query prepare/execute failure;
  • statement cleanup;
  • exact channel notice lookup.

Files:

Mediabot/UserCommands.pm
t/cases/62_usercommands_onjoin_logout_db_safety.t

Startup timer DB safety

onStartTimers() in DBCommands.pm was hardened.

This is important because it runs during bot startup and restores DB-defined timers.

It now:

  • checks prepare();
  • checks execute();
  • logs DB failures;
  • clears startup timer hash safely on failure;
  • skips invalid timer durations;
  • preserves the existing “skip if IRC is not connected” behavior;
  • returns the number of timers started.

Files:

Mediabot/DBCommands.pm
t/cases/63_onstarttimers_db_safety.t

Regression tests

This pass added or updated tests from:

t/cases/44_channel_name_exact_match.t

through:

t/cases/63_onstarttimers_db_safety.t

The tests mostly enforce structural safety:

  • no accidental LIKE where exact matching is required;
  • required prepare() checks;
  • required execute() checks;
  • required statement cleanup;
  • expected SQL queries preserved;
  • runtime paths still using exact identifiers.

This gives a useful safety net against future regressions.


Suggested validation

Run syntax checks:

perl -I. -c Mediabot/DBCommands.pm
perl -I. -c Mediabot/ChannelCommands.pm
perl -I. -c Mediabot/Helpers.pm
perl -I. -c Mediabot/UserCommands.pm
perl -I. -c Mediabot/LoginCommands.pm
perl -I. -c Mediabot/Auth.pm
perl -I. -c Mediabot/Hailo.pm
perl -I. -c Mediabot/External.pm
perl -I. -c mediabot.pl

Run the full regression suite:

env LANG=C.UTF-8 LC_ALL=C.UTF-8 perl t/test_commands.pl --verbose

Useful live checks:

m tmdb matrix
m hailo_status
m userinfo teuk
m whoami
m cstat
m access #teuk teuk
m mp3 queen
m mp3 %
m date Europe/Paris
m checknick Te[u]K
m checknick %
m status
m qlog #teuk test

Startup checks:

tail -f /home/mediabot/mediabot_v3/mediabot.log

Look for:

Checking timers to set at startup
X active timer(s) set at startup

and no unexpected DB errors.


Suggested Git commands

Before committing:

cd /home/mediabot/mediabot_v3

git status --short
git diff --stat

Stage the intended files:

git add   Mediabot/Auth.pm   Mediabot/ChannelCommands.pm   Mediabot/DBCommands.pm   Mediabot/External.pm   Mediabot/Hailo.pm   Mediabot/Helpers.pm   Mediabot/LoginCommands.pm   Mediabot/UserCommands.pm   mediabot.pl   t/cases/

Check staged files:

git diff --cached --stat
git diff --cached --name-only

Commit:

git commit -m "🛡️ Protego Maxima: harden DB lookups, exact-match spells, and runtime safety"

Push:

git push

Closing note

This was a defensive engineering pass.

Most users will not see a dramatic visible change, and that is exactly the point: the bot should behave the same when everything is healthy, but fail more gracefully when the database or runtime environment misbehaves.

The code now has fewer accidental SQL wildcard paths, better DB error handling, cleaner statement lifecycle management, and a stronger regression suite.

Small protections everywhere. A stronger Mediabot overall.

You must be logged in to reply.