🛡️ Security Hardening & Code Cleanup: The Sorting Hat Said "Gryffindor"

in Mediabot · started by TeuK · 2mo ago

TeuK · 2mo ago

Why This Matters

A codebase that’s been evolving since 2007 accumulates technical debt. Some of it is cosmetic. Some of it is a ticking clock. After a methodical audit of all 32 Perl modules, five categories of real issues surfaced — ranging from broken database calls to latent shell injection vectors. This article walks through each one.

No functionality was changed. Every fix is surgical.

Phase 1 — The `mysql_insertid` Graveyard

DBD::mysql is end-of-life. Mediabot migrated to DBD::MariaDB weeks ago. But 11 occurrences of $sth->{mysql_insertid} were silently returning undef or 0 on the new driver, meaning every operation that needed the auto-incremented ID of a freshly inserted row was flying blind.

Affected files: Channel.pm, DBCommands.pm, Helpers.pm, Quotes.pm, Radio.pm, install/conf_servers.pl

The fix is uniform across all 11 sites:

# Before — always returns 0 with DBD::MariaDB
my $id = $sth->{mysql_insertid};

# After — standard DBI, works everywhere
my $id = $sth->{Database}->last_insert_id(undef, undef, undef, undef);

A bonus cleanup in Helpers.pm removed a redundant || fallback that was calling last_insert_id twice:

# Before
my $id = $dbh->{mysql_insertid} || eval { $dbh->last_insert_id(undef, undef, 'USER', 'id_user') };

# After
my $id = $dbh->last_insert_id(undef, undef, undef, undef);

Phase 2 — SQL Interpolation: Playing With Fire

Auth.pm

Two functions were building SQL queries by interpolating a $where variable directly into the query string:

# Before — $where interpolated into SQL
my ($where, $val) = ($input =~ /^\d+$/)
    ? ('id_user = ?', $input)
    : ('nickname = ?', $input);
my $sql = "SELECT ... FROM USER WHERE $where";

While the $where value was internally controlled (only two possible strings), this pattern is fragile: any future modification could accidentally introduce user-controlled data into the SQL structure. The fix uses explicit branches with fully literal SQL strings:

# After — SQL is always a string literal
if ($input =~ /^\d+$/) {
    $sql = "SELECT ... FROM USER WHERE id_user = ?";
    $val = $input;
} else {
    $sql = "SELECT ... FROM USER WHERE nickname = ?";
    $val = $input;
}

ChannelCommands.pm

The qlog command was building a SQL WHERE clause dynamically with join(' AND ', @where) and then interpolating the result into a heredoc. The clause itself only contained hardcoded column names with ? placeholders — so the risk was low — but the $limit variable was also interpolated without sanitization. Fixed by rewriting as explicit string concatenation with int() enforcement on the limit:

my $sql = "SELECT cl.ts, cl.nick, cl.publictext\n"
        . "FROM CHANNEL_LOG cl\n"
        . "JOIN CHANNEL c ON c.id_channel = cl.id_channel\n"
        . "WHERE $where_sql\n"   # only hardcoded column refs with ?
        . "ORDER BY cl.ts DESC\n"
        . "LIMIT $limit\n";      # $limit = int(...)

Phase 3 — Shell Injection via `open()` Pipes

Perl’s two-argument open(FH, "command $var |") passes the string through a shell. If $var contains shell metacharacters (; & | > <), arbitrary commands can execute. Three occurrences were found with user-influenced data:

External.pm — YouTube API

$sYoutubeId comes from an IRC message argument. A user could craft a YouTube ID containing shell metacharacters.

# Before — shell injection vector
open YOUTUBE_INFOS, "curl ... \"...?id=$sYoutubeId&key=$APIKEY...\" |";

# After — list form, no shell involved
open $fh_yt, "-|", "curl", "--connect-timeout", "5", "-f", "-s", $yt_url;

Helpers.pm — IP Geolocation

$userIP is extracted from a user’s IRC hostmask — entirely attacker-controlled.

# Before — $userIP directly in shell string
open WHEREIS, "curl ... https://api.country.is/$userIP |";

# After — list form
open $fh_whereis, "-|", "curl", "--connect-timeout", "3", "-f", "-s",
    "https://api.country.is/$userIP";

Radio.pm — Icecast Status

$JSON_STATUS_URL comes from config, so the risk was lower, but consistency matters.

# Before
open ICECAST_STATUS_JSON, "curl ... $JSON_STATUS_URL |";

# After
open $fh_icecast, "-|", "curl", "--connect-timeout", "3", "-f", "-s", $JSON_STATUS_URL;

All three fixes also replaced the old bare-word filehandles (YOUTUBE_INFOS, WHEREIS, ICECAST_STATUS_JSON) with lexical $fh_* variables — a long-overdue modernization.

Phase 4 — Laying `youtube-dl` to Rest

youtube-dl hasn’t been meaningfully maintained since 2021. Two of the three download blocks in Radio.pm were still calling it. The third was calling yt-dlp but with a hardcoded /usr/local/bin/yt-dlp path.

Three changes in one go:

All three blocks unified to yt-dlp via the list-form open:

my $ytdlp_bin = $self->{conf}->get('radio.YTDLP_PATH') || '/usr/bin/yt-dlp';
open $fh_yt, "-|", $ytdlp_bin, "-x", "--audio-format", "mp3", "--audio-quality", "0", $ytUrl;

Regex unified from [ffmpeg] Destination: (youtube-dl format) to [ExtractAudio] Destination: (yt-dlp format) across all three blocks.

New config key added to mediabot.sample.conf:

[radio]
YTDLP_PATH=/usr/bin/yt-dlp

Update your mediabot.conf accordingly if the binary lives elsewhere (e.g. /usr/local/bin/yt-dlp).

Phase 5 — Partyline Fortification

The Partyline is a TCP telnet-style admin interface. It requires Master-level authentication, but once inside, there were no safeguards against abuse.

5a — Rate Limiting

Without throttling, an authenticated session could flood the bot with hundreds of commands per second. Added a sliding window counter per session: 10 commands per 5-second window. Excess commands are silently dropped with a warning sent back to the client and a log entry at level 2.

if ($session->{rate_count} > 10) {
    $stream->write("Rate limit exceeded. Slow down.\r\n");
    return;
}

5b — Nick Validation

The .nick command accepted any string and forwarded it directly to change_nick. Added an RFC-compliant IRC nick format check before the call:

unless ($newnick =~ /^[A-Za-z\[\\\`_\^\{\|\}][A-Za-z0-9\[\\\`_\-\^\{\|\}]{0,14}$/) {
    $stream->write("Invalid nick format.\r\n");
    return;
}

5c — Timer Command Whitelist

Timers execute raw IRC commands on a schedule. An Owner can set them up via !addtimer. Before this fix, any IRC verb could be injected — including QUIT, NICK, OPER, or multi-line payloads. Added a whitelist check on the command verb:

my @allowed_verbs = qw(PRIVMSG NOTICE JOIN PART TOPIC MODE KICK INVITE WHO WHOIS PING PONG);
my ($verb) = ($cmd =~ /^(\S+)/);
unless (grep { uc($verb) eq $_ } @allowed_verbs) {
    $self->botNotice($nick, "Timer command must start with a valid IRC verb.");
    return;
}

Summary

Phase	What	Files	Risk
1	`mysql_insertid` → `last_insert_id()`	6	🔴 Broken in prod
2	SQL interpolation cleanup	2	🟠 Fragile/latent
3	Shell injection via `open()`	3	🔴 Exploitable
4	`youtube-dl` → `yt-dlp`	1 + conf	🟡 Reliability
5	Partyline hardening	2	🟠 Abuse vector

13 files modified. Zero functional regressions. The bot runs exactly as before — just without the landmines.

“It does not do to dwell on bugs and forget to fix them.” — Not quite Dumbledore, but close enough.

You must be logged in to reply.