This session tackled one of the oldest forms of technical debt in the codebase: implicit SQL JOINs. Since Mediabot’s origins in 2007, many queries were written in the traditional comma-separated FROM A, B WHERE A.id = B.id style. Modern SQL has preferred explicit JOIN ... ON syntax for decades — it’s clearer, safer, and easier to audit. 27 occurrences across 9 files were converted in a single pass.
Two other improvements came along for the ride: database connection resilience via ensure_connected(), and a log level cleanup that stops flooding the console with debug noise.
Implicit JOINs are not wrong — they produce identical query plans in MariaDB. But they have real costs:
WHERE clause alongside filter conditions. It’s not immediately obvious which conditions define relationships and which filter rows.WHERE clause that already mixes join conditions and filters is a common source of bugs.Every conversion followed the same mechanical rule:
-- Before (implicit)
SELECT level FROM CHANNEL, USER_CHANNEL
WHERE CHANNEL.id_channel = USER_CHANNEL.id_channel
AND CHANNEL.name = ?
AND USER_CHANNEL.id_user = ?
-- After (explicit)
SELECT USER_CHANNEL.level
FROM CHANNEL
JOIN USER_CHANNEL ON USER_CHANNEL.id_channel = CHANNEL.id_channel
WHERE CHANNEL.name = ?
AND USER_CHANNEL.id_user = ?
The WHERE clause now contains only filter conditions. The relationship is declared in the JOIN.
The most conversions in a single file. Affected queries: BADWORDS lookup (called on every public message), USER_CHANNEL level lookups, IGNORES check.
The IGNORES query is called on every incoming IRC message to check if the sender is on the ignore list — making it one of the hottest paths in the bot. The explicit JOIN makes the intent immediately clear:
-- Before
SELECT * FROM IGNORES, CHANNEL
WHERE IGNORES.id_channel = CHANNEL.id_channel
AND CHANNEL.name LIKE ?
-- After
SELECT IGNORES.* FROM IGNORES
JOIN CHANNEL ON CHANNEL.id_channel = IGNORES.id_channel
WHERE CHANNEL.name LIKE ?
Every single query in Quotes.pm used the implicit style. All 7 were converted, including the date range queries (minDate/maxDate) and the random quote selector.
The most structurally varied set. Two of them were inside heredoc strings (Perl’s <<SQL ... SQL syntax), requiring careful multi-line replacement. The CHANNEL_LOG query also got its ambiguous column references qualified:
-- Before
FROM CHANNEL, CHANNEL_LOG
WHERE (event_type = 'public' OR event_type = 'action')
AND CHANNEL.id_channel = CHANNEL_LOG.id_channel
-- After
FROM CHANNEL
JOIN CHANNEL_LOG ON CHANNEL_LOG.id_channel = CHANNEL.id_channel
WHERE (CHANNEL_LOG.event_type = 'public' OR CHANNEL_LOG.event_type = 'action')
The RESPONDERS queries were the trickiest. They use a global scope mechanism: when id_channel = 0, a responder applies to all channels. The original query expressed this with an OR:
FROM RESPONDERS, CHANNEL
WHERE ((CHANNEL.id_channel = RESPONDERS.id_channel AND CHANNEL.name LIKE ?)
OR (RESPONDERS.id_channel = 0))
A simple INNER JOIN would exclude global responders (id_channel = 0) because there’s no matching CHANNEL row. The fix uses LEFT JOIN:
FROM RESPONDERS
LEFT JOIN CHANNEL ON CHANNEL.id_channel = RESPONDERS.id_channel
WHERE ((CHANNEL.name LIKE ? AND CHANNEL.id_channel IS NOT NULL)
OR RESPONDERS.id_channel = 0)
This preserves the global-scope logic while making the relationship explicit.
Hailo.pm (1), Mediabot.pm (1), ChannelCommands.pm (2), Radio.pm (1), mediabot.pl (2) — all straightforward two-table or three-table conversions.
getDbh() → ensure_connected()mediabot.pl had 5 remaining calls to $mediabot->getDbh->prepare(...) — the RadioPub timer, the RandomQuote timer, the JOIN handler, the WHOIS automode lookup, and the random quote responder. These were all converted to:
$mediabot->{db}->ensure_connected()->prepare(...)
ensure_connected() was added to DB.pm in a previous session. It pings the database handle and reconnects automatically if the connection has dropped. The five hottest long-running paths in the bot — the ones most likely to run after a multi-hour idle period — now all go through it.
Three handlers were logging at level 0 (always visible) for purely informational messages that have no operational significance:
# Before
$mediabot->{logger}->log(0, "on_message_WHO() $target_name");
# After
$mediabot->{logger}->log(3, "on_message_WHO() $target_name");
Level 3 is debug output. These messages are only useful when diagnosing a specific WHO/WHOIS interaction — not something you want in the production log by default.
Nine messages related to missing API keys and empty responses were at level 0. These indicate misconfiguration, not crashes — level 1 is appropriate:
Four messages downgraded from 0 to 1: Trying to join, checkAntiFlood() End of antiflood, could not find CHANNEL_FLOOD record, Mediabot is up to date.
| Change | Files | Count |
|---|---|---|
| Implicit JOIN → explicit JOIN | 9 | 27 |
getDbh() → ensure_connected() |
1 | 5 |
log(0) → log(1) |
2 | 9 |
log(0) → log(3) |
1 | 3 |
Zero functional changes. The bot behaves identically — it just does it more clearly, more safely, and with less noise in the logs.
“The spells we cast most often are the ones most worth casting correctly.”
— Not Hermione, but she would have approved of the explicit JOIN syntax
You must be logged in to reply.