DTB parser implementing notes
Ever find yourself needing to implement a device tree blob (aka FDT, flattened device tree) parser and want to save yourself some time? Learn from my mistakes!
If you try to do it in one pass, you will hurt yourself
I charged headlong into writing dtb.zig by starting at the top of the Devicetree Specification page on the “Flattened Devicetree (DTB)” Format” and reading down. It looked delightfully simple. Keep in mind, I still didn’t know what I yet needed out of it, just that I probably needed to reference the DTB to get it. (I kind of know better now.)
The tree was taking shape, and then I had to parse the contents of one field by
the contents of a prop in its parent (#address-cells
and
#size-cells
).
Add some contexts and derive them from their parent, allowing overriding for
children. Easy.
Then I needed to parse
interrupts.
It turns out the interrupts
property of a node has its format defined by the
#interrupt-cells
of the “binding of the interrupt domain root”. It turns out
your “interrupt
parent”
might be defined forward in the file, as referenced by its phandle.
You find out the same thing about clocks, though the documentation is harder
to
find.
A clock provider specifies #clock-cells
, which is usually 0 or 1. When
another node refers to a clock on that node, it addresses the phandle of the
clock provider, followed by #clock-cells
worth of cells to index which clock
on that provider.
In other words, a clocks
like this:
00000000: 00000085 0000001c 0000002e
could refer to either:
- one clock specified by phandle
0x85
, with a#clock-cells
of 2, the index being0x1c 0x2e
, - two clocks;
- either a clock at phandle
0x85
with a#clock-cells
of 1 indexed by0x1c
, and a clock at0x2e
with no index, or, - a clock at
0x85
with no index, and a clock at0x1c
indexed by0x2e
; or,
- either a clock at phandle
- three clocks, all with no index;
0x85
,0x1c
,0x2e
.
You need to be able to look up the clocks and obtain their properties to interpret this, so you need a second pass, or delayed/on-time resolution of fields, or whatever. There end up being quite a few props that need a second pass.
How big?
It’s worth noting all numbers and indexes in DTB are in big-endian, unsigned 32-bit integer cells. That makes hexdumps easier, since you can read them byte-by-byte or in groups of 4 and don’t need to rearrange them in your head.
You’ll see #address-cells
of 2 and similar for most 64-bit devices. I saw an
#address-cells
of 3 once in a PCIe node and it scared me.
Strings are NUL-terminated, and NUL padded
This tripped me up. Strings are NUL-terminated, and then the field will be
padded with NULs (if needed) to align on a u32
(i.e. offset divisible by 4).
This is helpful, because a u32
is literally what will always follow, and Arm
devices (which DTBs are often used on) don’t like unaligned reads.
So, when you need to read a NUL-terminated string, don’t do what I did first:
It seems reasonable at first blush: count the NULs (there’s a much better
way),
then advance the index past the name, plus align to advance past padding.
(Hack for aligning to a power of two, n
: add n-1
, then logical AND with
n-1
.)
The problem is that I never advanced past the NUL terminator, which is still part of the string. Here are some example NUL-terminated strings:
00000000: 6100 a.
00000000: 616200 ab.
00000000: 61626300 abc.
00000000: 61626364 00 abcd.
Here are the same strings padded with NULs to align on u32
:
00000000: 61000000 a...
00000000: 61620000 ab..
00000000: 61626300 abc.
00000000: 61626364 00000000 abcd....
Here’s the corrected code:
That’s all for now
I ended up separating dtb.zig into two parts, given it’s used in boot-time code where allocating memory can mess around with things:
- allocator-free
Traverser
, which emits events SAX style. I tried using Zig’ssuspend
/resume
here, and it works pretty well. - allocating
Parser
which uses theTraverser
and creates an AST, parsing props into an immediately usable AST in two passes.
The Traverser
is used in daintree’s
bootloader,
dainboot,
to find a probable serial UART device. I’ll use the Parser
later in the OS
proper to bring up more devices.