Trace the Rainbow: writing a Wireshark dissector

I recently found a good excuse to learn about writing packet dissectors in Wireshark, which are super helpful if you’re working on a custom protocol. If you’re unfamiliar, Wireshark is an open-source packet analyzer for inspecting network traffic. Teaching Wireshark about your protocol allows you to sort and filter packets by its fields, extract them to columns, and apply conditional highlighting.

I’ve done a number of the Protohackers networking challenges. When testing my solution to problem #7 last fall, Wireshark helped me track down a tricky parsing bug. Afterward, I thought, “This would’ve been even easier if Wireshark understood my protocol! I want good search. I want colors to mean something!” So I added a simple Lua dissector, and thought I’d share a bit about that.

I had the pleasure of pairing with Brian Palmer on this while spending time at the Recurse Center, a fun retreat for programmers to collaborate. Maybe you’d like it, too!

A quick look at LRCP

Protohackers #7 involves implementing a custom protocol called LRCP, or “Line Reversal Control Protocol.” It’s really an application protocol (which just reverses lines of text) over a simple connection-oriented transport layer, which is itself tunneled over UDP. Without going into too many details, LRCP packets look like this:

/connect/SESSION/
/ack/SESSION/LENGTH/
/data/SESSION/POSITION/DATA/
/close/SESSION/

… where SESSION, LENGTH, and POSITION are all nonnegative integers, specified as ASCII text. DATA is bytes, with character escaping for / and \.
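
To make the escaping rule concrete, here is a minimal Lua sketch of unescaping a DATA field. It assumes, per the LRCP spec, that a literal / or \ inside DATA is sent with a backslash in front of it. The unescape_data helper is a hypothetical name for illustration and isn't part of the dissector later in this post.

```lua
-- Hypothetical helper: undo LRCP's escaping in a DATA field.
-- Assumes "\/" stands for "/" and "\\" stands for "\".
local function unescape_data(data)
	-- Lua patterns use % (not \) as their escape character, so the pattern
	-- \([/\]) means: a literal backslash, then capture one "/" or "\".
	-- The extra parentheses drop gsub's second return value (the match count).
	return (data:gsub("\\([/\\])", "%1"))
end

print(unescape_data("foo\\/bar\\\\baz")) --> foo/bar\baz
```

The dissector in this post shows the data bytes exactly as they appear on the wire, so it doesn't need this step. Something like it would only matter if you wanted to display the decoded text.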

The spec I linked earlier includes an example exchange between a client and server:

<-- /connect/12345/
--> /ack/12345/0/
<-- /data/12345/0/hello\n/
--> /ack/12345/6/
--> /data/12345/0/olleh\n/
<-- /ack/12345/6/
<-- /data/12345/6/Hello, world!\n/
--> /ack/12345/20/
--> /data/12345/6/!dlrow ,olleH\n/
<-- /ack/12345/20/
<-- /close/12345/
--> /close/12345/

That’s all we really need to know to get started on the dissector, but there’s a full spec and automated testing on the problem page if you’re curious.

Initial setup

You’ll need to configure Wireshark to run your Lua script. I did this from Wireshark’s init.lua; here’s what I put in mine:

-- init.lua
disable_lua = false
-- Run my dissector script
dofile("/home/b/b/ws/protohackers7.lua")

…and here’s what I started with in lrcp.lua:

-- lrcp.lua
-- The first arg to Proto is the programmatic label to be used in filters, e.g. `lrcp.session == SESSION`
-- The second arg is the label to be applied to the protocol in Wireshark's tree UI.
lrcp_proto = Proto("lrcp", "LRCP")

function lrcp_proto.dissector(buffer, pinfo, tree)
	pinfo.cols.protocol = "LRCP"
	-- This will associate our protocol with its bytes, and label them in the subtree with our protocol name.
	-- We're at the end of the line - all the bytes of the UDP payload - so we pass the entire buffer.
	local subtree = tree:add(lrcp_proto, buffer(), "Line Reversal Control Protocol")
end

-- Register our protocol to be matched for any UDP packets with a port equal to 4321.
udp_table = DissectorTable.get("udp.port")
udp_table:add(4321, lrcp_proto)

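Our dissector function accepts three inputs: buffer, the packet bytes handed to us (here, the UDP payload); pinfo, the packet’s metadata, including the columns shown in the packet list; and tree, the root of the packet details tree that we add our own nodes to.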

All we’ve done so far is register a new node (called “LRCP”) in the tree for any UDP packets with a port equal to 4321, but this is enough to update the “Protocol” column in Wireshark with “LRCP”.

Writing the dissector

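Now, we want to start extracting our packet fields and adding them to the subtree we’ve created. This just requires a few steps: declare a ProtoField for each field, register those fields on the protocol, and add each parsed value to the subtree along with the bytes it came from.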

That’s all that’s needed for building out our tree! If, like me, you’re new to Lua, or at least to Wireshark’s Lua API, the comments in the full script call out a couple of things that tripped me up:

-- lrcp.lua
-- LRCP Packet Dissector
-- See https://protohackers.com/problem/7

lrcp_proto = Proto("lrcp", "LRCP")

local msg_type = ProtoField.new("Message Type", "lrcp.msg_type", ftypes.STRING)
-- Numeric but functionally a string
local session = ProtoField.new("Session", "lrcp.session", ftypes.STRING)
-- Length and Position could be typed as numerics if we wanted them to be usable elsewhere
local ack_length = ProtoField.new("Ack Length", "lrcp.ack_len", ftypes.STRING)
local data_position = ProtoField.new("Position", "lrcp.data_pos", ftypes.STRING)
local data_bytes = ProtoField.new("Data", "lrcp.data_bytes", ftypes.BYTES)
lrcp_proto.fields = { msg_type, session, ack_length, data_position, data_bytes }

function lrcp_proto.dissector(buffer, pinfo, tree)
	pinfo.cols.protocol = "LRCP"
	local subtree = tree:add(lrcp_proto, buffer(), "Line Reversal Control Protocol")

	-- i is our current index into the buffer.
	-- x is the index at which a Lua pattern match starts, and y is the index where it ends.
	-- Note: Lua 1-indexes its strings and tables, but Wireshark's buffer is 0-indexed,
	-- hence the funny +1/-1 stuff you'll see scattered below.
	local i, x, y, mtype
	-- (Lua patterns aren't full regexes and have no alternation, so
	-- "^/(ack|close|connect|data)/" isn't supported; we match any letters and check them below.)
	x, y, mtype = string.find(buffer:raw(), "^/(%a+)/")
	if mtype ~= "ack" and mtype ~= "close" and mtype ~= "connect" and mtype ~= "data" then
		-- bad mtype
		return
	end

	-- 1 to skip the leading slash
	subtree:add_packet_field(msg_type, buffer(1, mtype:len()), ENC_UTF_8)
	i = y + 1

	-- Parse session
	x, y = string.find(buffer:raw(), "(%d+)/", i)
	if not x then
		return
	end
	subtree:add_packet_field(session, buffer(x-1, y-x), ENC_UTF_8)
	i = y + 1

	if mtype == "close" or mtype == "connect" then
		return
	end

	-- Parse ack length / data position
	x, y = string.find(buffer:raw(), "(%d+)/", i)
	if not x then
		return
	elseif mtype == "ack" then
		subtree:add_packet_field(ack_length, buffer(x-1, y-x), ENC_UTF_8)
		-- We're done
		return
	elseif mtype == "data" then
		subtree:add_packet_field(data_position, buffer(x-1, y-x), ENC_UTF_8)
	end
	i = y

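	-- Parse data bytes: just the rest of the packet, trim final byte
	-- (which we assume to be a trailing slash).
	-- Skip empty or truncated data so we never ask for an empty or negative-length range.
	local data_len = buffer:len() - i - 1
	if data_len > 0 then
		subtree:add_packet_field(data_bytes, buffer(i, data_len), ENC_NA)
	end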
end

-- Register our protocol
udp_table = DissectorTable.get("udp.port")
-- I've been using 4321, but it'd be nice to make this more configurable.
udp_table:add(4321, lrcp_proto)

That’s it! This produces all the fields we want in the packet tree (bottom-left corner), highlights only the field values in the hex dump (bottom-right corner), and makes the fields accessible to the rest of Wireshark for filtering. (See screenshot below.)

For example, we can filter to a single session with lrcp.session == "1234" or see only data and ack packets with lrcp.msg_type == "data" or lrcp.msg_type == "ack".
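
The +1/-1 index arithmetic in the dissector is easy to get wrong, so here is a small sketch that checks it in plain Lua, outside Wireshark. string.find returns 1-indexed, inclusive positions, while the dissector needs a 0-indexed offset and a length. The field_range helper is a hypothetical name for illustration, and the payload is taken from the example exchange earlier in the post.

```lua
-- Hypothetical helper: find the next "digits/" field at or after init and
-- return it as a 0-indexed offset and a length, plus the next search position.
local function field_range(payload, init)
	-- x: 1-indexed start of the digits; y: 1-indexed position of the slash
	local x, y = string.find(payload, "%d+/", init)
	if not x then
		return nil
	end
	return x - 1, y - x, y + 1
end

local payload = "/ack/12345/6/"
local _, type_end = string.find(payload, "^/%a+/")

local offset, length, next_init = field_range(payload, type_end + 1)
-- Convert back to 1-indexed positions to read the digits out of the string.
print(offset, length, payload:sub(offset + 1, offset + length))

offset, length = field_range(payload, next_init)
print(offset, length, payload:sub(offset + 1, offset + length))
```

This reports offset 5 and length 5 for the session (12345), then offset 11 and length 1 for the ack length (6), which matches what you’d count by hand in the hex dump.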

Using the dissector

I took a packet capture of some of my test traffic here. This includes one long test over an unreliable link (25% UDP packet loss) and a few short parallel tests at the end.

In the screenshot, I’ve narrowed the capture to one short conversation by filtering on lrcp.session. You can see how Wireshark now shows our data in the subtree on the left, and highlighting a section in the subtree highlights the corresponding data in the hex view.

[Screenshot: Wireshark with colorful highlighting applied to LRCP packets of different types]

I’ve also customized the color scheme for the protocol. I wanted something easy on the eyes but also relatively accessible, so I went with Paul Tol’s “Muted” colorblind-friendly palette, as described in his paper, Colour Schemes. It’s a nice read!

You can view (and download) the color rules here. To use them in Wireshark: View > Coloring Rules ... > Import....

Epilogue

I think it would be cool to figure out how to cleanly track which member of a conversation started it (i.e. sent a connect message). Then I could apply conditional highlighting, like a slightly lighter shade for the client than the server, and add something like a “Role” field to indicate this in a column or filter. Maybe next time!
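
One way this could work, as a rough sketch rather than a tested Wireshark feature: remember the first sender of a connect message for each session, then compare later senders against it. The names below (initiators, role_for) are hypothetical, and src stands in for whatever sender address the dissector would read from the packet info. A real dissector would also have to cope with Wireshark dissecting the same packet more than once, which this sketch doesn’t address.

```lua
-- Hypothetical state: session id -> address of whoever sent the first connect.
local initiators = {}

-- Return "client", "server", or "unknown" for one packet.
-- session and mtype are the parsed strings; src is the sender's address as a string.
local function role_for(session, mtype, src)
	if mtype == "connect" and initiators[session] == nil then
		initiators[session] = src
	end
	local initiator = initiators[session]
	if initiator == nil then
		-- We never saw the connect, e.g. the capture started mid-session.
		return "unknown"
	end
	if initiator == src then
		return "client"
	end
	return "server"
end

print(role_for("12345", "connect", "10.0.0.2:50000")) --> client
print(role_for("12345", "ack", "10.0.0.1:4321"))      --> server
print(role_for("99999", "data", "10.0.0.3:50001"))    --> unknown
```

The addresses here are made up. Keying on the session alone is a simplification, and it assumes session ids aren’t reused by different clients within one capture.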

If you want a few references for writing your own dissectors:

You can also write them in C as a Wireshark plugin, which will run a lot faster, and the process seems only slightly more involved.