Optimizing Chia Puzzles
All Chia puzzles (for example the NFT state layer, NFT ownership layer, and CAT2 puzzle) are compiled to CLVM before being used on-chain. It’s important to make sure that these puzzles are as optimized as possible to maximize the throughput of transactions. Let’s look into some of the optimizations in the Rue compiler, and how they compare to Chialisp.
CLVM Cost
The cost of a coin spend is measured in a unit called “cost”. This doesn’t directly correspond to the XCH paid as a transaction fee, but the two are related.
The following things have a CLVM cost:
12,000 per byte stored in the blockchain database for a program
1,200,000 per AGG_SIG condition
1,800,000 per CREATE_COIN condition
Various costs for allocating memory and using operators at runtime
The thing that tends to add up the most dramatically is the 12,000 cost per byte. So this is what Chia puzzle developers and the compiler optimize most heavily.
There is a maximum of 11,000,000,000 (11 billion) cost per block. If your program uses 11,000,000 (11 million) cost, you can spend 1,000 of them per block. Thus, there’s a direct relationship between cost and throughput.
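To make the relationship between bytes, conditions, and throughput concrete, here’s a minimal sketch in Python using only the figures listed above. The 600-byte puzzle is hypothetical, and runtime operator costs are ignored, so this is a lower bound:

```python
# Cost figures from this section.
COST_PER_BYTE = 12_000
COST_PER_AGG_SIG = 1_200_000
COST_PER_CREATE_COIN = 1_800_000
MAX_BLOCK_COST = 11_000_000_000

def estimate_cost(program_bytes: int, agg_sigs: int, create_coins: int) -> int:
    """Lower-bound cost estimate; runtime operator costs are ignored."""
    return (
        program_bytes * COST_PER_BYTE
        + agg_sigs * COST_PER_AGG_SIG
        + create_coins * COST_PER_CREATE_COIN
    )

def spends_per_block(cost_per_spend: int) -> int:
    return MAX_BLOCK_COST // cost_per_spend

# A hypothetical ~600-byte puzzle with one signature and one output:
cost = estimate_cost(600, agg_sigs=1, create_coins=1)
print(cost)                    # 10200000
print(spends_per_block(cost))  # 1078
```

Shaving bytes off the program directly increases how many of these spends fit in a block.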
The mempool allows transactions that are at most half the size of a block, with a total capacity of approximately 10 times the size of a block. There isn’t a minimum fee unless the mempool is full, but transactions are prioritized for block inclusion by the highest fee-per-cost ratio.
You can read more about the mempool on the official Chia docs.
Constant Folding
The simplest optimization is constant folding, wherein expressions are simplified at compile time. This isn’t particularly interesting, but it allows nested calls to concat, sha256, and various other operators to be simplified as much as possible, reducing the number of bytes required to express the intended value.
Importantly, the Chialisp compiler decides to interpret calls to sha256 with known values and insert the resulting hash instead of the call to the operator. This results in 32 bytes (approximately 384,000 CLVM cost) being added to the program, which is in most cases quite inefficient. Rue instead elects to preserve the operator call unless you explicitly use sha256_inline.
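As a rough illustration of why folding a hash can be a net loss: the 32-byte digest alone costs 384,000 in per-byte storage, while keeping a call with a short preimage stores far fewer bytes. The ~7-byte serialized size of the call below is a hypothetical figure for illustration, not an exact CLVM serialization:

```python
import hashlib

COST_PER_BYTE = 12_000

# Folding a call like (sha256 "abc") at compile time replaces the
# call with a 32-byte digest literal in the program.
digest = hashlib.sha256(b"abc").digest()
inlined_storage_cost = len(digest) * COST_PER_BYTE
print(inlined_storage_cost)  # 384000

# Keeping the operator call stores the operator plus the quoted
# 3-byte preimage instead. The ~7-byte size here is a rough,
# hypothetical figure, not an exact serialization.
call_bytes = 7
call_storage_cost = call_bytes * COST_PER_BYTE
print(call_storage_cost)     # 84000
```

The trade-off is that the preserved call still pays its runtime cost on every execution, which is why leaving the choice to the developer (via sha256_inline) makes sense.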
Let Bindings
Because classic Chialisp doesn’t have a way to declare and reuse bindings, it’s common to see code like this:
(mod (first second)
  (defun stager (first second computed_third computed_fourth)
    ...
  )
  (stager first second (+ first second) (- second first))
)
This is essentially achieving the same thing - you calculate a value and bind it to a parameter of a function. However, notice that first and second are being passed around unchanged - you have to manually thread values from one function to the next just to reuse calculated values without recomputing them. This adds unnecessary runtime and byte cost in the construction of the parameters to stager.
In Rue, you can simply declare bindings like so:
fn main(first: Int, second: Int) -> Int {
    let computed_third = first + second;
    let computed_fourth = second - first;
    ...
}
This is both more readable and more efficient. Instead of passing the individual values through to another function, the compiler groups the bindings and prepends them to the rest of the environment. This can sometimes result in slightly larger CLVM paths (which are how values in the environment are referenced), but it’s more efficient in terms of both runtime cost and byte cost, so it’s well worth it.
Binary Trees
All credit for the idea of this optimization goes to
(Twitter).

In CLVM, the environment of a program is a binary tree. However, we often use it as a glorified list - both for function calls and for things like captures and bindings. This throws away a lot of the cost savings you could get from organizing things in a tree structure.
As a simple example, as far as I can tell it isn’t even possible in Chialisp to use the dot operator at the end of a function’s argument list to indicate that it shouldn’t be nil-terminated. This adds an extra call to the cons operator for every function call and lengthens CLVM paths unnecessarily. Rue has the spread operator to let you do this without writing CLVM directly.
However, as a result of a discussion with
, we figured out how this could be optimized further. Instead of simply prepending binding groups to the environment, you can organize the symbols into an optimized binary tree (more frequently used symbols come first) and cons it with the rest of the environment.

This provided some cost savings, but after generalizing it to function captures and function parameters, the costs of Rue programs dropped dramatically. Every single function call, including functions defined in the standard library such as tree_hash, requires fewer bytes in the generated CLVM and has a lower runtime cost.
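A simplified model of why the tree layout helps: a CLVM path spends roughly one bit per traversal step (plus a leading sentinel bit), so reaching element i of a list takes i rest steps and one first step, while a balanced tree reaches any of n leaves in about log2(n) steps. The byte counts below ignore atom-encoding edge cases:

```python
from math import ceil, log2

def list_path_bits(index: int) -> int:
    # Element i of a nil-terminated list sits i `rest` steps deep,
    # followed by one `first` step; add one sentinel bit.
    return index + 2

def balanced_tree_path_bits(count: int) -> int:
    # In a balanced binary tree of `count` leaves, every leaf is
    # reachable in about log2(count) steps; add one sentinel bit.
    return ceil(log2(count)) + 1

def path_bytes(bits: int) -> int:
    return ceil(bits / 8)

# With 16 symbols in the environment:
print(path_bytes(list_path_bits(15)))           # 3 (last element of a list)
print(path_bytes(balanced_tree_path_bits(16)))  # 1 (any leaf of the tree)
```

The gap widens as environments grow, and it compounds because every function call pays these path costs.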
The caveat with this optimization is that functions converted into values (closures) lose symbol information, so the compiler can no longer figure out how to organize the binary tree (or whether it even should). Thus, this optimization is disabled for all function values (i.e. whenever you don’t call a function directly by its name). Additionally, there will be an extern keyword which can be used to disable this behavior.
The main function could benefit from this optimization as well, but it’s currently disabled to preserve the layout of the solution for now. This will likely be configurable in the future.
Verification Statements
In CLVM, there are four similar operators called bls_pairing_identity, bls_verify, secp256k1_verify, and secp256r1_verify. These take in a list of arguments (whose length currently can’t be determined at runtime) and either return nil if the verification succeeds or raise an error if it doesn’t. I call these, and any function with a similar pattern, “verification statements”, since the return value is meaningless.
In Chialisp, you would have to awkwardly figure out where to put these in your program, since the return value will be nil. This often means manually constructing lists and inserting these verification opcodes in the place of the nil terminator. In Rue, this is done for you so you don’t have to think about it.
If you use an expression as a statement (ending it with a semicolon), the compiler will include it in the generated CLVM in the best way it can, with a few optimizations to improve on CLVM cost.
The compiler picks among three strategies:
First, if there are enough instances of nil in the current scope of the generated LIR, it inserts each verification statement in place of one of them.
Otherwise, or if any of the verification statements may return a non-nil value according to the type system, it inserts all of them in an all operator in the first nil slot.
Finally, if there were no nil values to substitute, it wraps the rest of the expression in a cons pair, inserts the verifications as the first value, and then unwraps the rest value.
This has the effect of running the verification statements while discarding their values (so any assertions will, as expected, raise an error if they fail).
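As a toy model of that final fallback strategy, here’s a Python sketch. The verification operator is a hypothetical stub, and a tuple stands in for a CLVM cons pair:

```python
def secp_verify_stub(ok: bool) -> None:
    # Hypothetical stand-in for a CLVM verification operator: returns
    # nil (None here) on success, raises an error on failure.
    if not ok:
        raise ValueError("verification failed")

def wrap_with_verification(verify, rest_value):
    # Models cons-wrapping then unwrapping the rest: roughly
    # (r (c <verification> <rest>)) in CLVM terms. Building the pair
    # evaluates the verification for its side effect (a potential
    # raise); taking the rest discards its nil result.
    pair = (verify(), rest_value)
    return pair[1]

print(wrap_with_verification(lambda: secp_verify_stub(True), 42))  # 42
```

If the verification fails, the whole expression raises before the rest value is ever returned, which is exactly the behavior you want from an assertion.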
The Results
I’ve been experimenting with porting common Chialisp puzzles to Rue, both to test the compiler’s correctness as I make changes, and to incrementally optimize the generated CLVM.
Here are a few of the latest comparisons of the number of bytes (less than 100% is better):
CAT2 is 93.5% of the size
Partial offers are 95.9% of the size
Royalty split is 101% of the size (more work to be done here)
For the CAT example, this could mean that if you were to spend 100 CAT coins in Chialisp, you could instead spend 107 in Rue with the same CLVM cost.
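A quick check of that arithmetic, assuming cost scales linearly with byte size:

```python
# A puzzle at 93.5% of the original size fits roughly 1/0.935 times
# as many spends into the same total cost budget.
chialisp_spends = 100
rue_size_ratio = 0.935
rue_spends = round(chialisp_spends / rue_size_ratio)
print(rue_spends)  # 107
```

This is approximate, since per-byte storage cost dominates but runtime costs scale a little differently.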
Conclusion
When designing Rue, my main concern was that puzzles written in it wouldn’t be efficient enough to be used in production, and that as a result nobody would adopt the language (even if the syntax and type system were nice to use). However, I’ve been pleasantly surprised by the optimizations I’ve been able to implement in the compiler to bring its output close to, and in many cases beyond, that of the Chialisp compiler.
I’ll keep working on optimizing the language, even after the first release is out. But it’s also important to keep writing unit tests to ensure program correctness isn’t left behind in the pursuit of efficiency. Thanks for reading!