Trace the Rainbow: writing a Wireshark dissector
I recently found a good excuse to learn about writing packet dissectors in Wireshark, which are super helpful if you’re working on a custom protocol. If you’re unfamiliar, Wireshark is an open-source packet analyzer for inspecting network traffic. Teaching Wireshark about your protocol allows you to sort and filter packets by its fields, extract them to columns, and apply conditional highlighting.
I’ve done a number of the Protohackers networking challenges. When testing my solution to problem #7 last fall, Wireshark helped me track down a tricky parsing bug. Afterward, I thought, “This would’ve been even easier if Wireshark understood my protocol! I want good search. I want colors to mean something!” So I added a simple Lua dissector, and thought I’d share a bit about that.
I had the pleasure of pairing with Brian Palmer on this while spending time at the Recurse Center, a fun retreat for programmers to collaborate. Maybe you’d like it, too!
A quick look at LRCP
Protohackers #7 involves implementing a custom protocol called LRCP, or “Line Reversal Control Protocol.” It’s really an application protocol (which just reverses lines of text) over a simple connection-oriented transport layer, which is itself tunneled over UDP. Without going into too many details, LRCP packets look like this:
/connect/SESSION/
/ack/SESSION/LENGTH/
/data/SESSION/POSITION/DATA/
/close/SESSION/
… where SESSION, LENGTH, and POSITION are all nonnegative integers, specified as ASCII text. DATA is bytes, with character escaping for / and \.
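The dissector in this post only displays the raw DATA bytes, but if you ever want to decode them, here is a minimal sketch of the escaping. It assumes, per my reading of the spec, that `/` is sent as `\/` and `\` as `\\`; the helper names `lrcp_escape` and `lrcp_unescape` are made up for illustration.

```lua
-- Hypothetical helpers, not part of the dissector below.
-- Assumption: LRCP escapes "/" as "\/" and "\" as "\\" inside DATA.

local function lrcp_unescape(data)
  -- Drop the backslash in front of an escaped "/" or "\".
  -- The extra parentheses discard gsub's second return value (the match count).
  return (data:gsub("\\([/\\])", "%1"))
end

local function lrcp_escape(data)
  -- Prefix every "/" and "\" with a backslash.
  return (data:gsub("[/\\]", "\\%0"))
end

print(lrcp_escape("foo/bar\\baz"))
print(lrcp_unescape(lrcp_escape("foo/bar\\baz")))
```

Note that Lua patterns use `%` as their escape character, so the backslash needs no special treatment in the pattern itself, only in the Lua string literal.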
The spec I linked earlier includes an example exchange between a client and server:
<-- /connect/12345/
--> /ack/12345/0/
<-- /data/12345/0/hello\n/
--> /ack/12345/6/
--> /data/12345/0/olleh\n/
<-- /ack/12345/6/
<-- /data/12345/6/Hello, world!\n/
--> /ack/12345/20/
--> /data/12345/6/!dlrow ,olleH\n/
<-- /ack/12345/20/
<-- /close/12345/
--> /close/12345/
That’s all we really need to know to get started on the dissector, but there’s a full spec and automated testing on the problem page if you’re curious.
Initial setup
You’ll need to configure Wireshark to run your Lua script:
- Confirm your copy of Wireshark was compiled with Lua: navigate to Help > About Wireshark, which should list something like “Lua x.y.z” for the version it shipped with. If it’s missing, you need a different version.
- Find or create init.lua. I’m not sure about other platforms, but on Linux:
  - Check in $XDG_CONFIG_HOME/wireshark.
  - If that variable isn’t set, check ~/.config/wireshark.
  - Whichever folder exists: if init.lua isn’t there, create it.
Here’s what I put in mine:
-- init.lua
disable_lua = false
-- Run my dissector script
dofile("/home/b/b/ws/protohackers7.lua")
…and here’s what I started with in lrcp.lua:
-- lrcp.lua
-- The first arg to Proto is the programmatic label to be used in filters, e.g. `lrcp.session == SESSION`
-- The second arg is the label to be applied to the protocol in Wireshark's tree UI.
lrcp_proto = Proto("lrcp", "LRCP")
function lrcp_proto.dissector(buffer, pinfo, tree)
  pinfo.cols.protocol = "LRCP"
  -- This will associate our protocol with its bytes, and label them in the subtree with our protocol name.
  -- We're at the end of the line - all the bytes of the UDP payload - so we pass the entire buffer.
  local subtree = tree:add(lrcp_proto, buffer(), "Line Reversal Control Protocol")
end
-- Register our protocol to be matched for any UDP packets with a port equal to 4321.
udp_table = DissectorTable.get("udp.port")
udp_table:add(4321, lrcp_proto)
Our dissector function should accept three inputs:
- buffer is a Tvb - effectively a byte array containing our packet data, beginning wherever previous layers stopped parsing. In our case, that’s the UDP payload.
- pinfo is a Pinfo instance, which provides a lot of information about the packet. For example, we can access hardware and network addresses parsed by previous layers, but we just use it to provide a value for the packet’s Protocol column.
- tree is a TreeItem. Packet data is inherently nested, with fields and potentially subfields, so Wireshark models this with a tree. We’ll be appending our parsed data to this structure with TreeItem:add_packet_field.
All we’ve done so far is register a new node (called “LRCP”) in the tree for any UDP packets with a port equal to 4321, but this is enough to update the “Protocol” column in Wireshark with “LRCP”.
Writing the dissector
Now, we want to start extracting our packet fields and adding them to the subtree we’ve created. This just requires a few steps:
- Create a ProtoField for each of our packet’s fields.
- Use string.find to get the beginning and end index of the field in the input buffer.
- Slice the buffer to the field’s bounds, and associate it with the ProtoField by passing both to subtree:add_packet_field along with an encoding indicating how Wireshark should interpret the bytes.
That’s all that’s needed for building out our tree! If, like me, you’re new to Lua, or at least to Wireshark’s Lua API, a few things are worth knowing:
- Lua structures (and functions like string.find) are usually 1-indexed, but Wireshark’s buffer is thankfully 0-indexed. This does require some conversion below.
- string.find returns the index of the start of the match and the index of the end of the match, but it also returns an additional value for each capture group in the pattern. (Lua patterns look like regular expressions but are more limited; there’s no alternation with |, for example.)
- Calling buffer(a, b) doesn’t use slice notation - it’s buffer(index, length). (Again, 0-indexed.)
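To make those points concrete, here is a small standalone sketch that runs in plain Lua with no Wireshark involved. The `packet` string stands in for what `buffer:raw()` would return, and the variable names are just for illustration.

```lua
-- Standalone Lua, no Wireshark needed. `packet` stands in for buffer:raw().
local packet = "/data/12345/0/hello\n/"

-- string.find returns the match start, the match end, then one value per capture.
local x, y, mtype = string.find(packet, "^/(%a+)/")
-- Here x = 1, y = 6, mtype = "data" (all positions are 1-indexed).

-- Search for the session, starting just past the previous match.
local sx, sy, session = string.find(packet, "(%d+)/", y + 1)
-- Here sx = 7, sy = 12, session = "12345"; the match includes the trailing slash.

-- Convert to the 0-indexed (offset, length) pair that buffer(offset, length) expects.
local offset = sx - 1   -- 1-indexed start -> 0-indexed offset
local length = sy - sx  -- match length (sy - sx + 1) minus the trailing slash

print(mtype, session, offset, length)
```

Inside the dissector, `buffer(offset, length)` with these values would cover exactly the session digits, which is where the `x-1` and `y-x` expressions in the full script come from.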
-- lrcp.lua
-- LRCP Packet Dissector
-- See https://protohackers.com/problem/7
lrcp_proto = Proto("lrcp", "LRCP")
local msg_type = ProtoField.new("Message Type", "lrcp.msg_type", ftypes.STRING)
-- Numeric but functionally a string
local session = ProtoField.new("Session", "lrcp.session", ftypes.STRING)
-- Length and Position could be typed as numerics if we wanted them to be usable elsewhere
local ack_length = ProtoField.new("Ack Length", "lrcp.ack_len", ftypes.STRING)
local data_position = ProtoField.new("Position", "lrcp.data_pos", ftypes.STRING)
local data_bytes = ProtoField.new("Data", "lrcp.data_bytes", ftypes.BYTES)
lrcp_proto.fields = { msg_type, session, ack_length, data_position, data_bytes }
function lrcp_proto.dissector(buffer, pinfo, tree)
  pinfo.cols.protocol = "LRCP"
  local subtree = tree:add(lrcp_proto, buffer(), "Line Reversal Control Protocol")
  -- i is our current index into the buffer.
  -- x is the (1-indexed) position at which a pattern match starts, and y is where it ends.
  -- Note: Lua likes to 1-index its tables, but Wireshark's buffer is 0-indexed,
  -- hence the funny +1/-1 stuff you'll see scattered below.
  local i, x, y, mtype
  -- (Lua patterns have no alternation, so "^/(ack|close|connect|data)/" isn't supported;
  -- we match any word and check it afterwards.)
  x, y, mtype = string.find(buffer:raw(), "^/(%a+)/")
  if mtype ~= "ack" and mtype ~= "close" and mtype ~= "connect" and mtype ~= "data" then
    -- bad mtype
    return
  end
  -- 1 to skip the leading slash
  subtree:add_packet_field(msg_type, buffer(1, mtype:len()), ENC_UTF_8)
  i = y + 1
  -- Parse session (the match includes the trailing slash, so the digits are y-x bytes long)
  x, y = string.find(buffer:raw(), "(%d+)/", i)
  if not x then
    return
  end
  subtree:add_packet_field(session, buffer(x-1, y-x), ENC_UTF_8)
  i = y + 1
  if mtype == "close" or mtype == "connect" then
    return
  end
  -- Parse ack length / data position
  x, y = string.find(buffer:raw(), "(%d+)/", i)
  if not x then
    return
  elseif mtype == "ack" then
    subtree:add_packet_field(ack_length, buffer(x-1, y-x), ENC_UTF_8)
    -- We're done
    return
  elseif mtype == "data" then
    subtree:add_packet_field(data_position, buffer(x-1, y-x), ENC_UTF_8)
  end
  -- y is the 1-indexed position of the slash after POSITION,
  -- which is also the 0-indexed offset of the first data byte.
  i = y
  -- Parse data bytes: just the rest of the packet, trim final byte
  -- (which we assume to be a trailing slash)
  subtree:add_packet_field(data_bytes, buffer(i, buffer:len()-i-1), ENC_NA)
end
-- Register our protocol
udp_table = DissectorTable.get("udp.port")
-- I've been using 4321, but it'd be nice to make this more configurable.
udp_table:add(4321, lrcp_proto)
That’s it! This produces all the fields we want in the packet tree (bottom-left corner), highlights only the field values in the hex dump (bottom-right corner), and makes the fields accessible to the rest of Wireshark for filtering. (See screenshot below.)
For example, we can filter to a single session with lrcp.session == "1234" or see only data and ack packets with lrcp.msg_type == "data" or lrcp.msg_type == "ack".
Using the dissector
I took a packet capture of some of my test traffic here. This includes one long test over an unreliable link (25% UDP packet loss) and a few short parallel tests at the end.
In the screenshot, I’ve narrowed the capture to one short conversation by filtering on lrcp.session.
You can see how Wireshark now shows our data in the subtree on the left, and highlighting a section in the subtree highlights the corresponding data in the hex view.
I’ve also customized the color scheme for the protocol. I wanted something easy on the eyes but also relatively accessible, so I went with Paul Tol’s “Muted” colorblind-friendly palette, as described in his paper, Colour Schemes. It’s a nice read!
You can view (and download) the color rules here.
To use them in Wireshark: View > Coloring Rules... > Import...
Epilogue
I think it would be cool to figure out how to cleanly track which member of a conversation started it (i.e. sent a connect message). Then I could apply conditional highlighting, like a slightly lighter shade for the client than the server, and add something like a “Role” field to indicate this in a column or filter.
Maybe next time!
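I haven’t built this, but the bookkeeping might look something like the sketch below: remember who sent the first connect for each session, then label every later packet relative to that. Everything here is hypothetical, including the `lrcp_role` name and the idea of building `endpoint` from `tostring(pinfo.src)` and `pinfo.src_port` inside the dissector.

```lua
-- Hypothetical sketch; the dissector in this post does not do this.
-- Maps session id -> the endpoint that sent the first connect for it.
local initiators = {}

-- `endpoint` is any string identifying the sender. Inside a dissector it
-- could be built as tostring(pinfo.src) .. ":" .. pinfo.src_port (assumption).
local function lrcp_role(session, mtype, endpoint)
  if mtype == "connect" and initiators[session] == nil then
    initiators[session] = endpoint
  end
  local first = initiators[session]
  if first == nil then
    return "unknown"  -- no connect seen yet for this session
  end
  return first == endpoint and "client" or "server"
end

print(lrcp_role("12345", "connect", "10.0.0.1:5000"))
print(lrcp_role("12345", "ack", "10.0.0.2:4321"))
print(lrcp_role("99999", "data", "10.0.0.3:6000"))
```

One caveat I’d expect to matter: Wireshark can call a dissector more than once per packet and not always in capture order, so the table would need to be filled during the first pass and cleared when a new capture is loaded.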
If you want a few references for writing your own dissectors:
- The Wireshark wiki has a page on dissectors that is a brief but decent entry point.
- The Wireshark manual provides a Lua API Reference in Chapter 11.
- Programming in Lua is free online. Chapter 20 covers pattern matching.
You can also write them in C as a Wireshark plugin, which will run a lot faster, and the process seems only slightly more involved.