While taking the Stepik TON Blockchain Course, I summarized the whole message serialization format in a handwritten note. Here is what we can observe:
- Apart from `begin_cell()` and `.end_cell()`, it consists of 15 bit fields serialized one after another.
- Those 15 fields are grouped into five batches, namely: (a) `0x18`, (b) `<addr>`, (c) `<grams>`, (d) `1+4+4+64+32+1+1`, and (e) `<*>`.
- Batch (a) is self-explanatory: those are the message flags. Written as 6 bits, `0x18` is `011000`; the last two bits are the placeholder for the source address (`src`), which validators rewrite.
- In fact, and more generally, quoting from Lesson 4.3 of the course:

  > If a message is sent from the smart contract, some of those fields will be rewritten to the correct values. In particular, validator will rewrite `bounced`, `src`, `ihr_fee`, `fwd_fee`, `created_lt` and `created_at`. That means two things: first, another smart-contract during handling message may trust those fields (sender may not forge source address, bounced flag, etc); and second, that during serialization we may put to those fields any valid values (anyway those values will be overwritten).
- Batches (b) and (c) serialize the destination address and the coin amount, respectively. (View `grams` as an alias for `coins`.)
- For Batch (d), again I would like to quote from the course:

  > Technically we just write a big amount of zeros into the cell (namely the amount of zeros equals to 1 + 4 + 4 + 64 + 32 + 1 + 1). Why so? We know that there is a clear structure of how those values are consumed and that means every 0 we put there is for some reason.
  >
  > Interesting is the part that those zeros are working in two ways - some of them are put as 0 because the validator will rewrite the value anyway, some of them are put as 0 because this feature is not supported yet (ex. extra currencies).
  >
  > Just to be sure we understand why there is so many zeros, let's break down its intended structure:
  >
  > First bit stands for empty extra-currencies dictionary. Then we have two 4-bit long fields. Since `ihr_fee` and `fwd_fee` will be overwritten, we may as well put there zeroes. Then we put zero to `created_lt` and `created_at` fields. Those fields will be overwritten as well; however, in contrast to fees, these fields have a fixed length and are thus encoded as 64- and 32-bit long strings. Next zero-bit means that there is no `init` field. The last zero-bit means that `msg_body` will be serialized in-place. This basically indicates if there is `msg_body` coming with custom layout.
- Finally, for Batch (e), I have put an asterisk, `<*>`, to indicate that it has a custom layout. For example, in the course it is formatted as `.store_uint(op_code, 32).store_uint(query_id, 64)`.
- However, instead of spelling out the above long list of fields, developers usually use shortcuts in practice. For example, the course uses this:

```func
var msg = begin_cell()
    .store_uint(0x18, 6)
    .store_slice(addr)
    .store_coins(grams)
    .store_uint(0, 1 + 4 + 4 + 64 + 32 + 1 + 1)
    .store_uint(op_code, 32)
    .store_uint(query_id, 64);
send_raw_message(msg.end_cell(), mode);
```
In the above snippet, the sequence of methods matches the sequence of Batches (a)--(e), with the remark that Batch (e) corresponds to the last two methods combined.
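To see what `store_uint(0x18, 6)` actually puts into the cell, here is a minimal Python sketch (Python rather than FunC, and not part of the course). The helper `store_uint`, the `FLAG_FIELDS` list and `decode_flags` are hypothetical names that only model the bit string; the field order assumed here is tag, `ihr_disabled`, `bounce`, `bounced`, then a two-bit `src` placeholder.

```python
def store_uint(value: int, bits: int) -> str:
    """Model FunC's store_uint: `value` as a big-endian string of exactly `bits` bits."""
    if bits <= 0 or not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return format(value, "b").zfill(bits)

# Assumed (name, width) pairs for Batch (a), in serialization order.
FLAG_FIELDS = [("tag", 1), ("ihr_disabled", 1), ("bounce", 1), ("bounced", 1), ("src", 2)]

def decode_flags(bits: str) -> dict:
    """Split the 6-bit prefix into named fields."""
    fields, pos = {}, 0
    for name, width in FLAG_FIELDS:
        fields[name] = bits[pos:pos + width]
        pos += width
    return fields

header = store_uint(0x18, 6)
flags = decode_flags(header)
print(header)  # 011000
print(flags)
```

Read this way, the prefix says: internal message (tag 0), IHR disabled, bouncing allowed, not bounced itself, and an empty two-bit source address.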
To illustrate the shortcut further, I have taken a real-world example: a code snippet from TONScan (you can access the source code by clicking on the "CONTRACT" tab there):

```func
var msg = begin_cell()
    .store_uint(0x18, 6)
    .store_uint(4, 3).store_slice(wc_n_address)
    .store_grams(plugin_balance)
    .store_uint(4 + 2 + 1, 1 + 4 + 4 + 64 + 32 + 1 + 1 + 1)
    .store_ref(state_init)
    .store_ref(body);
send_raw_message(msg.end_cell(), 3);
```
Can you spot the similarities and the differences between the two snippets above?
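As a hint, the following Python sketch (an illustration only, not code from either contract) writes out the bits produced by the `store_uint` calls that differ between the two snippets. The helper `store_uint` is a hypothetical model of the FunC builtin. The reading of the trailing bits is my understanding of the layout: one flag for whether `init` is present, one for whether `init` sits in a reference, and one for whether the body sits in a reference.

```python
def store_uint(value: int, bits: int) -> str:
    """Model FunC's store_uint: `value` as a big-endian string of exactly `bits` bits."""
    if bits <= 0 or not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return format(value, "b").zfill(bits)

COURSE_TAIL = 1 + 4 + 4 + 64 + 32 + 1 + 1        # first snippet: 107 bits
TONSCAN_TAIL = 1 + 4 + 4 + 64 + 32 + 1 + 1 + 1   # second snippet: 108 bits

course_bits = store_uint(0, COURSE_TAIL)
tonscan_bits = store_uint(4 + 2 + 1, TONSCAN_TAIL)

# Both start with the same zero bits: extra currencies, two fees, created_lt, created_at.
shared = 1 + 4 + 4 + 64 + 32
print(course_bits[:shared] == tonscan_bits[:shared])  # True
print(course_bits[shared:])    # 00
print(tonscan_bits[shared:])   # 111

# The explicit address prefix used in the second snippet.
addr_prefix = store_uint(4, 3)
print(addr_prefix)             # 100
```

So the first snippet ends its header with `00` (no `init`, body in place), while the second ends with `111` (`init` present, `init` in a reference, body in a reference), which is why it finishes with two `store_ref` calls. The `100` written before `wc_n_address` is the two-bit standard-address tag `10` plus a `0` for "no anycast"; in the first snippet that prefix is presumably already part of the `addr` slice.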
Hope the above comments further clarify the complex subject of TON Message Serialization! 🙂
This is a great question! (Took me a long time to understand as well)
So basically, you are asking:
- Why do we store the uint(....) there?
- And why do we deal with int_msg_info over there?
1/ The message structure
To understand why we store the uint(...) in the message, you need to understand how TVM handles messages. A message is not sent as loose values: it is serialized, field by field according to the message layout, into a "Cell", and that cell is what gets passed to `send_raw_message`.
2/ Default values for message fields
Several fields, such as bounced, src, ihr_fee and fwd_fee, are rewritten by validators, so in most cases we only need to fill them with valid default values.
For example, below is the full layout of a message we put in a cell, field by field:
```func
var msg = begin_cell()
    .store_uint(0, 1) ;; tag
    .store_uint(1, 1) ;; ihr_disabled
    .store_uint(1, 1) ;; allow bounces
    .store_uint(0, 1) ;; not bounced itself
    .store_slice(source)
    .store_slice(destination)
    ;; serialize CurrencyCollection (see below)
    .store_coins(amount)
    .store_dict(extra_currencies)
    .store_coins(0) ;; ihr_fee
    .store_coins(fwd_value) ;; fwd_fee
    .store_uint(cur_lt(), 64) ;; lt of transaction
    .store_uint(now(), 32) ;; unixtime of transaction
    .store_uint(0, 1) ;; no init-field flag (Maybe)
    .store_uint(0, 1) ;; in-place message body flag (Either)
    .store_slice(msg_body)
    .end_cell();
```
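The two `store_coins` calls for the fees are where the `4 + 4` in the shortcut comes from. As a rough Python model (the function name `store_coins` here is just a stand-in for the FunC builtin, and this sketch only produces a bit string), a coins value is written as a 4-bit byte count followed by that many bytes, so a zero amount occupies only the 4-bit length prefix:

```python
def store_coins(amount: int) -> str:
    """Model the variable-length coins encoding: 4-bit byte count, then the value bytes."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    n_bytes = (amount.bit_length() + 7) // 8   # 0 bytes when amount == 0
    if n_bytes > 15:
        raise ValueError("amount does not fit in 15 bytes")
    length_prefix = format(n_bytes, "b").zfill(4)
    value_bits = format(amount, "b").zfill(n_bytes * 8) if n_bytes else ""
    return length_prefix + value_bits

print(store_coins(0))                    # 0000
print(store_coins(1))                    # 000100000001
print(len(store_coins(1_000_000_000)))   # 36
```

This is also why the fees have no fixed width in the layout: a non-zero fee would take 4 bits plus one or more bytes, whereas `created_lt` and `created_at` always take 64 and 32 bits.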
3/ The meaning of integers in `.store_uint(0, 1 + 4 + 4 + 64 + 32 + 1 + 1)`
The integers used in `.store_uint(0, 1 + 4 + 4 + 64 + 32 + 1 + 1)` are bit lengths taken from the TL-B scheme: each integer is the length in bits of a specific field in the header, and the value written into all of them is 0.
- The first `1` is for the empty extra-currencies dictionary
- Followed by two `4`s for the `ihr_fee` and `fwd_fee` fields (a zero coins amount takes 4 bits)
- Then 64 bits for the `created_lt` field
- 32 bits for the `created_at` field
- And finally two `1`s for the `init` flag and the in-place `body` flag.

In the example given, all of these fields are zero, so a single `store_uint` call writes 1 + 4 + 4 + 64 + 32 + 1 + 1 = 107 zero bits at once.
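The sum can be checked with a few lines of Python. This is only a bookkeeping sketch: the `FIELDS` list is a hypothetical table whose names follow the course's description of this part of the header (extra-currencies flag, two fees, `created_lt`, `created_at`, `init` flag, body flag), and nothing here talks to the blockchain.

```python
# (field name, width in bits) for the part covered by
# store_uint(0, 1 + 4 + 4 + 64 + 32 + 1 + 1)
FIELDS = [
    ("extra_currencies (empty dict)", 1),
    ("ihr_fee (zero coins)", 4),
    ("fwd_fee (zero coins)", 4),
    ("created_lt", 64),
    ("created_at", 32),
    ("init flag", 1),
    ("body in-place flag", 1),
]

offset = 0
for name, width in FIELDS:
    # Print the bit range each field occupies inside the zero block.
    print(f"bits {offset:3d}..{offset + width - 1:3d}  {name}")
    offset += width

total_bits = offset
print(total_bits)  # 107
```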
Reference: