Taking these comments first because I think they epitomise a difference in attitude:
"I'm trying to figure out what you're suggesting MessagePack has done there that CBOR hasn't."
"if your format doesn't expect (say) 16-bit floats ... you could just ... reject them"
Protocol defects don't consist only of missing features or simple mistakes. Features can themselves be bugs, particularly in an area where a very simple protocol is possible.
Most applications will be using general-purpose encoding and decoding libraries. All those applications pay a price for every format feature. Additional format complexity is not simply a matter of coding up the library, paid once. It has ongoing costs. Additional complexity means more security bugs, worse performance, bigger programs, more maintenance work, more cognitive load for humans debugging things, hex dumps that are harder to eyeball, etc. And of course there is the reimplementation cost whenever a new implementation is needed for any reason.
Protocol complexity has a significant cost even for applications that implement a narrower subset of the spec. The designers must make a decision about which parts to support; the more complex the spec, the more work that is, and the more likely it is that there will be interactions between the spec's pieces that lead to mistakes while designing the subset. Typically, more complex protocols lean on their own features, so that even a workably narrow subset of a complex protocol is more complicated than the whole of a simpler protocol.
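To make the subsetting cost concrete, here is a rough sketch of the very first decision a "narrow CBOR" decoder has to make: which initial bytes it accepts. The profile (`ALLOWED_MAJORS`) and the function name `check_initial_byte` are hypothetical, invented for illustration; only the split of the initial byte into a 3-bit major type and 5-bit additional info comes from the CBOR spec.

```python
# Names of the eight CBOR major types (high 3 bits of the initial byte).
MAJOR_NAMES = {
    0: "unsigned int", 1: "negative int", 2: "byte string", 3: "text string",
    4: "array", 5: "map", 6: "tag", 7: "simple/float",
}

# Hypothetical application profile: an assumption for this sketch only.
ALLOWED_MAJORS = {0, 3, 4}


def check_initial_byte(b: int) -> str:
    """Describe an item head, or raise ValueError if this profile rejects it."""
    major, info = b >> 5, b & 0x1F
    if 28 <= info <= 30:
        raise ValueError(f"reserved additional info {info}")
    if info == 31:
        # Indefinite-length item (or a break code): excluded by this profile.
        raise ValueError("indefinite-length item or break: not in this profile")
    if major not in ALLOWED_MAJORS:
        # Covers 0xf9 (16-bit float), tags, maps, and so on.
        raise ValueError(f"{MAJOR_NAMES[major]} not in this profile")
    return f"{MAJOR_NAMES[major]}, additional info {info}"
```

Even this much already embeds three separate policy choices (reserved values, indefinite lengths, excluded major types), and it has not yet touched tags, simple values, or how lengths nest.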
I still don't agree that the streaming version of these binary encoding formats is valuable. In those cases when streaming is wanted, it is usually better achieved by sending a sequence of binary-format messages. MessagePack and CBOR are both self-delimiting so this is straightforward. If you need the overall thing to be self-delimiting you can have an explicit EOF message, but normally the next layer down will have something for you to do that. Streaming is the minority use case, and makes considerably more awkward demands of a decoder (particularly one in a resource-constrained environment) and it makes sense for that complexity to be present only in the applications that need it. Particularly since leaving open indefinitely the outer wrapper(s) in a format like MessagePack/CBOR is not very nice and often isn't the right thing even for a streaming application.
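As a sketch of the "sequence of messages" approach: the code below hand-rolls a tiny subset of MessagePack (nil, booleans, positive fixint, fixstr, fixarray) so that it needs no third-party library. Using a top-level nil as the explicit EOF message is an assumption made for this example, not something either format defines, and the function names are illustrative.

```python
import io


def encode(obj) -> bytes:
    """Encode a small MessagePack subset: nil, bool, 0..127, short str, short list."""
    if obj is None:
        return b"\xc0"
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])  # positive fixint
    if isinstance(obj, str) and len(obj.encode("utf-8")) <= 31:
        raw = obj.encode("utf-8")
        return bytes([0xA0 | len(raw)]) + raw  # fixstr
    if isinstance(obj, list) and len(obj) <= 15:
        return bytes([0x90 | len(obj)]) + b"".join(encode(x) for x in obj)  # fixarray
    raise ValueError("outside the subset this sketch supports")


def read_exact(stream, n: int) -> bytes:
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated message")
    return data


def decode(stream):
    """Decode exactly one self-delimiting message from the stream."""
    b = read_exact(stream, 1)[0]
    if b <= 0x7F:
        return b
    if 0x90 <= b <= 0x9F:
        return [decode(stream) for _ in range(b & 0x0F)]
    if 0xA0 <= b <= 0xBF:
        return read_exact(stream, b & 0x1F).decode("utf-8")
    if b == 0xC0:
        return None
    if b == 0xC2:
        return False
    if b == 0xC3:
        return True
    raise ValueError(f"unsupported type byte 0x{b:02x}")


def read_messages(stream):
    """Yield complete messages until the (assumed) nil end-of-sequence marker."""
    while True:
        msg = decode(stream)
        if msg is None:
            return
        yield msg


wire = b"".join(encode(m) for m in [["temp", 21], ["temp", 22], None])
messages = list(read_messages(io.BytesIO(wire)))
```

Each message is complete by the time the application sees it, so the decoder never has to hold an outer container open; the only streaming logic is the loop in `read_messages`.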
FTAOD I did not mean to imply that CBOR does not have extension mechanisms. I presented MessagePack's single extension mechanism because it was simple enough to describe. CBOR has, of course, extension mechanisms. It has three extension mechanisms and three IANA registries to go along with them, each with two or three subranges with different allocation policies. CBOR's extension mechanisms are too complicated to review in a paragraph or two even in a fairly long blog posting like mine!
It is likely of course that there are applications where CBOR, given that it exists, is a better choice than MessagePack. But I think the existence of CBOR in the world is quite possibly a net negative. The people who will be helped by CBOR's features are fairly few. But many people will be impeded (even if they don't realise it) by CBOR's complexity, not having realised they should have chosen MessagePack instead; or by the division of effort between two competing protocols in a very similar problem space. This is particularly true given that CBOR is often presented as superior to, or even the successor to, MessagePack.
Re: CBOR features vs bugs
Date: 2020-07-15 11:48 am (UTC)