Commit Revealed: Issue 2
This issue unpacks the work of James Prestwich (Summa) and Ted Blackman (Urbit)
Commit Revealed is a forum to learn from the industry’s brightest builders. It is a newsletter where top cryptocurrency engineers share their most interesting code commit, then explain how it works and why it matters.
Welcome to the first Commits Revealed of 2020!
Contribution 1: James Prestwich reveals Summa’s new Bitcoin Relay design
Why this commit is meaningful:
In our first commit, we get a look at the new Bitcoin Relay design Summa is using to reduce gas costs by roughly 75%. This commit finalizes the last feature of the design, Bitcoin chain-tip tracking, which makes it possible for nodes to handle cross-chain swaps, bridging assets, and much more. This design comes from an amazing engineer and can be applied to Bitcoin, Ethereum, and any EVM chain; let’s dive in.
Commit 2fb4010
*Commit commentary is mostly James’; slight edits were made for readability.
File: contracts/Relay.sol
Lines: 1100-1122
The benefits of understanding this:
This is commit shares an implementation of a core aspect behind Nakamoto consensus: the heaviest-chain rule—the rule that requires you to use the heaviest of two chains. Specifically, this code determines which header (of two possible Bitcoin headers) is the heaviest descendant of a common ancestor, as cheaply as possible. Hence the 75% gas savings.
The most interesting bits:
Because storing headers is expensive, we choose to avoid this problem altogether. In practice, this means not storing the Merkle root or header difficulty. Instead, we store the hash of a header.
But in order to follow the heaviest-chain rule, we still need to know the difficulty. So here’s the trick to handling this: in almost all cases we can check Bitcoin’s heaviest chain without knowing the difficulty of the blocks involved. So instead of traversing both header chains and summing work as is typically done, we can (in most cases) take a significant shortcut by simply counting the headers and inferring work. And we can do this because Bitcoin difficulty only changes every 2016 blocks.
Practically this means that if the chains haven’t crossed a difficulty boundary, their work hasn’t changed. So we really only need to define four cases. And only in the rarest case do we actually count the difficulty: when both descendants are in new epochs (when each chain has changed to a different proof of work). This result is significantly cheaper than traversing both header arrays.
Important lessons learned:
The GCD (greatest common denominator) is also called the LCA (latest common ancestor).
The solidity compiler does not optimize reads. If you read the same state variable several times, you’ll pay for it each time. Do yourself a favor and assign it to a local variable.
There are a bizarre number of consensus edge cases in Bitcoin, and many, many more in Ethereum.
Tricks of the trade:
State storage is the enemy and our entire relay is carefully designed to minimize it. The loop on lines 272 - 280 is the most expensive part of the operation.
If we naively stored raw headers instead of digests, it would cost 3x more.
Storage costs 625 gas/byte, and 25 gas/byte to read. Calldata costs 16 gas/byte. It is never cheaper to store something than to pass that data up again, even if you read it many times (note that this is not true in Ethereum Classic).
Sharing the Spotlight
James wanted to call out the work of Elena Nadolinski and Georgios Konstantopoulos; these two have been inspiring him recently. If you’re in need of further inspiration, look them up.
Contribution 2: Ted Blackman reveals Ames: Urbit’s P2P messaging protocol
Why this commit is meaningful:
Our second commit gives you a closer look at one of the key functions behind Urbit’s Ames protocol. This uniquely designed p2p overlay has a blockchain-backed Sybil-resistant PKI, is packet-oriented, has exactly-once delivery semantics, and does its own serialization and safe deserialization. Might be something to understand if you’re, you know, building crypto networks.
Also, this commit will give you a (small) sense for how Urbit works. If you aren’t reading big chunks of Hoon in your spare time, yet still want to understand what’s going on at Urbit, this commit provides a small enough diff to glean some insights.
This commit focuses on Ames, Urbit’s TCP-like peer-to-peer messaging protocol. But unlike TCP, Ames packets are authenticated and encrypted using keys retrieved from Urbit’s Ethereum-based address ownership ledger. Ames’ connections also persist permanently, this allows Urbit to provide exactly-once delivery, which benefits everyone on the network by reducing the load on infrastructure relay nodes and decreasing latency for personal nodes.
Commit 9c3050a
*Commit commentary is mostly Ted’s; slight edits were made for readability.
**This commit had a typo that was quickly patched; it is explained in one of the lessons Ted shares below.
File: pkg/arvo/sys/vane/ames.hoon
Lines: 1312-1315
The benefits of understanding this:
This commit determines whether we need to update a route (a piece of data the Urbit runtime provides to represent the physical address of Urbit ‘ships’) to another Urbit server based on receiving a packet from them.
We do this because ships can’t always talk to one another directly—they might be behind a NAT or firewall. For this reason our general procedure is for ships to ping infrastructure nodes on a regular interval, so they can still receive messages when they’re not directly connected.
To facilitate this, Urbit addresses don’t map to a static IP’s. Rather, Urbit ‘ships’ can move to different machines or IP routers, and the Ames peer discovery system reinstates connections to other ships automatically. This is a unique design constraint for Urbit.
In previous designs, we held onto the relay's address if we heard from it before we heard from the peer directly. But this created cases where ships would get stuck talking through relays even if they could talk directly. This caused a handful of issues (see Philip’s comments in his commit message).
But thanks to this commit, when a packet is received the runtime tells us which IP address it came from. Now, if we get a packet from a relay but still have a direct route to the peer, we maintain the direct route. This helps keep connections as “tight” as possible, meaning they’ll stay direct if they can as opposed to falling back to relaying.
The most interesting bits:
It’s interesting that Ames peer discovery is a pretty simple design, but it degrades gracefully and implicitly into something very similar to the traditional NAT traversal techniques of STUN and TURN. But because Ames packets are all encrypted and authenticated, it avoids legacy security issues.
Also, the simplicity of the design and implementation is unusual (Ames is about 3,000 lines of heavily commented Hoon), but STUN and TURN are pretty standard. Like many parts of Urbit, Ames isn’t exactly doing something new; it's just doing it with extreme minimalism, digging down into the bare essence of what a network does.
Important lessons learned:
I think the biggest thing I learned this year is that modern networking is pretty complicated. It also confirmed my impression that push-based programming (subscriptions, change notifications, reactivity) is easier to reason about than pull-based (polling, checking if a piece of state has changed). Old Ames had essentially pull-based timer logic, which was exceedingly difficult to understand. New Ames timers are push-based and are pretty straightforward.
Layering is important. New Ames separates the packet processing layer from message processing, which improved the design and code quality.
Philip Monk, who patched my code (remember the typo mentioned above?), is a great engineer and colleague. I already knew that, really, but this and other bug fixes confirmed it. I spent 2019 rewriting Ames, with Philip’s help. (Editorial note: you’re one hell of an awesome colleague for pointing this out, Ted.)
We should improve our compiler to prevent this kind of typo from happening again; a type system should be able to eliminate these types of bugs.
Tricks of the trade:
This commit uses pretty standard, run-of-the-mill Hoon idioms. The ‘=? route’ syntax means “change the ‘route’ variable’s value if this condition is true”. And because Hoon is pure-functional, “changing a variable” is a slightly subtle concept, but it basically means “return a value where a field has been mutated from the original”; similar to record updates in the Elm language.
Sharing the Spotlight
Ted wanted to call out ZeroTier for some of their great work with peer-to-peer routing.
Note: Don’t confuse their “planet” and “moon” terminology with Urbit’s. They don’t mean the same thing in both systems ;)
Ted also shared that Van Jacobson’s work on networking has been an inspiration—especially his work on Named Data Networking and the BBR congestion control algorithm. Ames implemented the old-style, loss-based congestion control this round, but should try BBR in the future.
Parting thanks
A huge thanks to both James and Ted for sharing their work and disseminating the lessons learned while shipping production level code. And another thank you to my colleagues at IDEO CoLab Ventures and the teams we’re working with—you make this a blast.
Get involved…
Share your feedback—this is a living prototype
See you in the next issue!