Forum teuk.org

🪄 *Unicodeus Correctum!* - Keeping Beautiful IRC Output Without Crashing Partyline

in Mediabot · started by TeuK · 1mo ago

TeuK · 1mo ago

Mediabot v3 hit a nasty little bug after the new activity heatmap output was tested on an ircu2/Undernet-style setup.

The command itself was nice:

m heatmap Te[u]K

And the output looked good:

te[u]k activity by hour on #teuk:
  00-05  ██░░░░░░░░  110 msgs
  06-11  ██░░░░░░░░  139 msgs
  12-17  █░░░░░░░░░  47 msgs
  18-23  ███░░░░░░░  197 msgs

But then the bot crashed with:

Wide character in syswrite

That was not acceptable.

The easy workaround would have been to replace the nice Unicode bars with plain ASCII.

But that would have been the wrong fix.


🧩 What really happened

The heatmap uses Unicode block characters:

█
░

Those are perfectly reasonable for a modern IRC display.

The IRC output path itself was not the real problem.

The problem appeared when the same live log line was forwarded to connected Partyline console sessions through .console.

The Partyline console hook was writing Perl character strings directly to an IO::Async::Stream:

$s->write($line . "\n")

That eventually reaches syswrite(), which expects bytes.

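When $line contains characters above code point 255, such as the block characters, syswrite() cannot downgrade the string to bytes and throws an exception:

Wide character in syswrite

Unlike print, which only warns about wide characters, syswrite() dies. Nothing caught that exception, so it killed the bot.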
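The failure can be reproduced outside the bot. The following is a minimal sketch, not Mediabot code: it assumes a pipe as a stand-in for the Partyline socket and an illustrative log line written with escapes for the two block characters.

```perl
use strict;
use warnings;

# Illustrative stand-in for a heatmap log line: U+2588 and U+2591 are the
# full and light block characters, both above code point 255.
my $line = "\x{2588}\x{2588}\x{2591} 110 msgs";

# A pipe gives us a real file descriptor to call syswrite on.
pipe(my $reader, my $writer) or die "pipe failed: $!";

# eval traps the exception so this demo keeps running where the bot did not.
my $ok    = eval { syswrite($writer, $line . "\n"); 1 };
my $error = $ok ? '' : $@;

print $ok ? "write succeeded\n" : "write died: $error";
```

Without the eval, the same call would terminate the script, which is what happened to the bot.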


✅ The real fix

Instead of making the heatmap ugly, the fix was applied at the correct boundary:

Encode Partyline console output to UTF-8 bytes before writing to the socket.

The console hook now does the right thing:

use Encode qw(encode);

# Turn the character string into UTF-8 bytes before it reaches the socket.
my $wire = encode('UTF-8', ($line // '') . "\n");
$s->write($wire);

That keeps Unicode output intact while making the socket write byte-safe.
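To see the boundary fix end to end, here is a small sketch under a few assumptions: console_wire is a hypothetical helper name wrapping the same encode call, and a pipe again stands in for the Partyline socket. It writes the encoded bytes, reads them back, and decodes them the way a UTF-8 aware client would.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper name; it wraps the same encode call as the console hook.
sub console_wire {
    my ($line) = @_;
    # An undefined line becomes an empty line instead of a warning.
    return encode('UTF-8', ($line // '') . "\n");
}

my $line = "\x{2588}\x{2588}\x{2591} 110 msgs";
my $wire = console_wire($line);

# A pipe stands in for the Partyline socket.
pipe(my $reader, my $writer) or die "pipe failed: $!";
my $written = syswrite($writer, $wire);
die "syswrite failed: $!" unless defined $written;

# Read the bytes back and decode them, as a UTF-8 aware client would.
sysread($reader, my $received, 1024);
my $decoded = decode('UTF-8', $received);

printf "%d characters sent as %d bytes\n", length($line) + 1, $written;
```

The byte count is larger than the character count because each block character takes three bytes in UTF-8. That difference is exactly what the socket layer needs to be told about explicitly.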


🧙 Why this is better than ASCII fallback

Replacing █░ with #. would have avoided the crash, but it would also have hidden the real issue.

The actual problem was not the visual bar.

The actual problem was:

Unicode string passed to a byte-level socket writer

By fixing the Partyline transport layer, Mediabot can keep richer output where it makes sense:

heatmaps
titles
emoji
UTF-8 channel text
future formatted output

without risking the same crash through .console.


Practical result

The heatmap can keep its nice display:

██░░░░░░░░

Partyline .console can remain active.

And the bot no longer crashes on ircu2/Undernet-style setups with:

Wide character in syswrite

🪄 Spell of the day

Unicodeus Correctum!

The pretty bars stay.
The socket gets bytes.
Partyline survives.

Mischief managed, without making the heatmap ugly. 📊🪄
