Question: How to use streaming?
Date: 2010-September-22 @ 20:42
Tags: streaming, support
A forum post raises the question of how to use streaming in practice with the ASN.1 code generated by asn1c:
> Hello, I want to use the protocol to send a lot of data, but the device isn't able to hold the whole amount of received data in RAM. It has to work only with parts of the structure at a time. For example,
>
>     BigMessage ::= SEQUENCE {
>         time-stamp UTCTime,
>         messages SEQUENCE OF SmallMessage
>     }
>
> If I get only the first half of the BigMessage, I want to work with all received messages, and then I have to free my buffer due to limited resources. When I get the second part, I want to work on it, and I want to know if the message was valid... On the encoder side, I also have to send "half" messages.
>
> How can I do this?

ASN.1 streaming is easily the most exciting thing asn1c has to offer. It's not needed too often, but when it is, it's invaluable. Let's split the problem into two parts: how to decode in constant space and how to encode in constant space.
Decoding big messages in constant space
We first assume that data comes into the system in parts via some lower-level streaming protocol, such as HTTP or bare TCP.
If you weren't constrained by the amount of RAM, you would have essentially two ways to proceed: either accumulate everything and then call ber_decode(), or repeatedly invoke ber_decode() whenever the next chunk of data comes in.
If you are constrained by RAM, the first option is out, so the only way to proceed is to:
1. Have a resizable buffer of a certain size (say, 1k);
2. Add data to that buffer whenever a new chunk of data comes in, possibly extending it;
3. Invoke ber_decode() with that buffer as an argument;
4. Extract data out of the partially decoded target structure. See below for details;
5. When ber_decode() returns with RC_WMORE, you may shift the buffer contents to the left by the number of bytes reported in the .consumed field of the return value (to reclaim the memory which won't be needed anymore by ber_decode()), and go to step 2;
6. When ber_decode() returns with RC_OK, you're done.
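The buffer handling in the steps above can be sketched on its own, without any asn1c calls. This is only a minimal illustration: rx_buffer_t, rx_buffer_append() and rx_buffer_shift() are hypothetical names, not part of asn1c, and the sketch assumes a fixed 1k capacity instead of a buffer that grows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity receive buffer (not part of asn1c). */
typedef struct {
    unsigned char contents[1024];
    size_t size;                /* Number of valid bytes in contents[] */
} rx_buffer_t;

/* Append a freshly received chunk; returns 0 on success, -1 if it won't fit. */
int rx_buffer_append(rx_buffer_t *b, const void *chunk, size_t len) {
    if(len > sizeof(b->contents) - b->size) return -1;
    memcpy(b->contents + b->size, chunk, len);
    b->size += len;
    return 0;
}

/* Drop the first `consumed` bytes, moving the undecoded tail to the front. */
void rx_buffer_shift(rx_buffer_t *b, size_t consumed) {
    if(consumed > b->size) consumed = b->size;
    memmove(b->contents, b->contents + consumed, b->size - consumed);
    b->size -= consumed;
}
```

In the decoding loop, the value passed to rx_buffer_shift() would be the .consumed field returned by ber_decode(); whatever remains in the buffer is the tail the decoder has not processed yet.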
The most interesting part of decoding in constrained space is the ability to use
the structure before it is completely decoded. If you have an extensive
structure without repeated elements (like SEQUENCE OF), you're by and large
out of luck: there is no easy way to determine which parts of the structure are
already decoded and which aren't. But in your case it is significantly
easier, since you have a big SEQUENCE OF smaller messages
as a central part of the structure to be decoded.
The BigMessage structure is going to look somewhat like this when compiled into C:

    typedef struct BigMessage {
        UTCTime_t time_stamp;
        struct messages {
            A_SEQUENCE_OF(struct SmallMessage) list;
            /* Context for parsing across buffer boundaries */
            asn_struct_ctx_t _asn_ctx;
        } messages;
        /* Context for parsing across buffer boundaries */
        asn_struct_ctx_t _asn_ctx;
    } BigMessage_t;

The approach is to extract the complete SmallMessage structures
out of the messages member as they become available. A proper
way to do that is to check messages.list.count:
once it reaches N, you can extract the first N-1 messages out of it.
Be aware, though, that you can't extract the last available element
too soon (that is, before ber_decode() returns with RC_OK),
since it might not have been decoded to completion yet.
Here's the pseudocode:
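
    BigMessage_t *big_msg = 0;
    asn_dec_rval_t ret;
    int i;

    while(receive_data_into_buffer(&buffer)) {
        ret = ber_decode(0, &asn_DEF_BigMessage, (void **)&big_msg,
                         buffer.contents, buffer.size);
        if(ret.code == RC_FAIL) break;    // Broken encoding
        // Reclaim the bytes the decoder won't need anymore
        // (shift_buffer_left() is a hypothetical helper, like receive_data_into_buffer())
        shift_buffer_left(&buffer, ret.consumed);
        if(!big_msg) continue;

        // Make sure we don't use the last element until we know it's complete.
        int count = big_msg->messages.list.count;
        int upto = count - (ret.code == RC_OK ? 0 : 1);
        for(i = 0; i < upto; i++) {
            consume_small_message(big_msg->messages.list.array[i]);
        }

        // Shift used-up members of the list
        if(upto > 0) {
            big_msg->messages.list.array[0] = big_msg->messages.list.array[count - 1];
            big_msg->messages.list.count -= upto;
        }
        if(ret.code == RC_OK) break;      // The whole BigMessage is decoded
    }
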
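To try the list handling from the pseudocode above without asn1c, here is a small stand-alone sketch. The msg_list_t type only mimics the array and count fields that A_SEQUENCE_OF() produces; drain_complete() and count_message() are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the list generated by A_SEQUENCE_OF(): an array plus a count. */
typedef struct {
    void **array;
    int count;
} msg_list_t;

/*
 * Hand every element known to be complete to `consume`, then keep only the
 * possibly incomplete last element at the front of the list.
 * `decoding_done` is nonzero once the decoder has reported RC_OK.
 * Returns the number of elements consumed.
 */
int drain_complete(msg_list_t *list, int decoding_done,
                   void (*consume)(void *msg)) {
    int upto = list->count - (decoding_done ? 0 : 1);
    int i;
    if(upto <= 0) return 0;
    for(i = 0; i < upto; i++)
        consume(list->array[i]);
    if(upto < list->count)      /* Preserve the element still being decoded */
        list->array[0] = list->array[list->count - 1];
    list->count -= upto;
    return upto;
}

/* Example consumer: just counts the messages it has been given. */
int consumed_total = 0;
void count_message(void *msg) { (void)msg; consumed_total++; }
```

With three elements in the list and decoding still in progress, only the first two are consumed and the third moves to index 0; once decoding is done, a second call consumes that last element as well.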
Encoding big messages in constant space
Encoding big messages in constant space is trickier, and has no direct support in asn1c.
You can't encode the length of the structure before you know the lengths of all its components, and in streaming mode the components' lengths may not be easily available.
In the DER encoding, the length of the message being encoded needs to be known in advance.
Therefore, der_encode() won't be able to generate the proper output.
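To see why the length has to be known up front, it helps to look at how definite-form length octets are produced: they come right after the tag and before the first content byte. The following is a minimal sketch of that rule; der_length_octets() is a hypothetical helper, not an asn1c function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the definite-form length octets for a content of `len` bytes.
 * Returns the number of octets written; buf must hold 1 + sizeof(size_t) bytes.
 */
size_t der_length_octets(size_t len, unsigned char *buf) {
    size_t n = 0, tmp, i;
    if(len < 128) {                         /* Short form: a single octet */
        buf[0] = (unsigned char)len;
        return 1;
    }
    for(tmp = len; tmp; tmp >>= 8) n++;     /* Count significant octets */
    buf[0] = (unsigned char)(0x80 | n);     /* Long form: 0x80 | octet count */
    for(i = 0; i < n; i++)                  /* Big-endian length value */
        buf[1 + i] = (unsigned char)(len >> (8 * (n - 1 - i)));
    return 1 + n;
}
```

A content of 1000 bytes, for instance, is announced as 82 03 E8 before any of those bytes is written, which is exactly what a constant-space encoder cannot do when the messages are still being produced.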
However, in the broader BER encoding, it is possible to encode a so-called “indefinite length” in place of the component size. Unlike the Tag-Length-Value encoding used when the length is available, the “indefinite length” in BER acts like an opening brace in a programming language: you have to terminate the encoding after all the components of the structure are encoded. The termination is done using two consecutive zero octets.
Essentially, you encode the components of a particular ASN.1 structure yourself, using DER encoding, but wrap it all up in an “indefinite length” framing.
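The framing itself, stripped of anything asn1c-specific, can be sketched as follows. The sink_t type and the sink_put(), indefinite_open() and indefinite_close() functions are hypothetical names used only for this illustration; a real sink would write to a socket instead of a small in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory output sink. */
typedef struct {
    unsigned char bytes[64];
    size_t len;
} sink_t;

/* Append already-encoded bytes (for example, a DER-encoded component). */
void sink_put(sink_t *s, const void *data, size_t n) {
    assert(n <= sizeof(s->bytes) - s->len);
    memcpy(s->bytes + s->len, data, n);
    s->len += n;
}

/* Open a constructed value of unknown length: tag octet, then 0x80. */
void indefinite_open(sink_t *s, unsigned char tag_octet) {
    unsigned char hdr[2];
    hdr[0] = tag_octet | 0x20;  /* Indefinite length requires the constructed bit */
    hdr[1] = 0x80;              /* "Indefinite length" marker */
    sink_put(s, hdr, 2);
}

/* Close it with the end-of-contents marker: two zero octets. */
void indefinite_close(sink_t *s) {
    static const unsigned char eoc[2] = { 0x00, 0x00 };
    sink_put(s, eoc, 2);
}
```

Opening with tag 0x30 (SEQUENCE), putting the DER encodings of INTEGER 5 and INTEGER 7, and closing gives the octets 30 80 02 01 05 02 01 07 00 00. No length had to be computed for the outer structure.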
Here's the pseudocode for the whole procedure:

    typedef SmallMessage_t *(small_message_callback_f)();

    BER_encode_big_message(BigMessage_t *big_msg,
            small_message_callback_f *get_next_small_message) {
        SmallMessage_t *small_msg;

        // Encode Tag-Length of the outer structure
        // (the tag octet must carry the constructed bit, 0x20)
        assert(asn_DEF_BigMessage.all_tags_count == 1);    // Showing the simplest case.
        ber_tlv_tag_serialize(asn_DEF_BigMessage.all_tags[0]);
        write(0x80);    // “Indefinite length” encoding

        // Write out the first member of the BigMessage
        der_encode(&asn_DEF_UTCTime, &big_msg->time_stamp);

        // Write out the Tag-Length for the second member of BigMessage
        assert(strcmp(asn_DEF_BigMessage.elements[1].name, "messages") == 0);
        assert(asn_DEF_BigMessage.elements[1].tag_mode == 0);
        assert(asn_DEF_BigMessage.elements[1].type->all_tags_count == 1);
        ber_tlv_tag_serialize(asn_DEF_BigMessage.elements[1].tag);
        write(0x80);    // “Indefinite length” encoding of the inner structure

        // Obtain and serialize smaller messages one by one
        while((small_msg = get_next_small_message())) {
            der_encode(&asn_DEF_SmallMessage, small_msg);
            ASN_STRUCT_FREE(asn_DEF_SmallMessage, small_msg);
        }

        // Terminate “indefinite length” framing for messages
        write(0x0); write(0x0);
        // Terminate “indefinite length” framing of BigMessage
        write(0x0); write(0x0);
    }

The above pseudocode is very delicate, since it assumes that there is no
additional tagging besides what's shown in your example,
and that no AUTOMATIC TAGS module option is in effect.
But you get the idea.
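To illustrate why the tagging assumption matters, here is a small sketch of how a single identifier octet is built from the tag class, the constructed flag and the tag number. The identifier_octet() helper and the tag_class enum are hypothetical and cover only tag numbers up to 30.

```c
#include <assert.h>

enum tag_class { TC_UNIVERSAL = 0, TC_APPLICATION = 1, TC_CONTEXT = 2, TC_PRIVATE = 3 };

/* Build a single identifier octet; valid for tag numbers 0..30 only. */
unsigned char identifier_octet(enum tag_class cls, int constructed, unsigned number) {
    assert(number <= 30);
    return (unsigned char)((cls << 6) | (constructed ? 0x20 : 0) | number);
}
```

Without AUTOMATIC TAGS, the messages member carries the universal SEQUENCE tag, identifier_octet(TC_UNIVERSAL, 1, 16), which is 0x30. With AUTOMATIC TAGS in effect, it would be tagged [1] and start with identifier_octet(TC_CONTEXT, 1, 1), which is 0xA1, so the hand-written framing has to follow whatever tags the module actually defines.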