Lev Walkin → ASN.1 Exposed → Question: How to use streaming?

Question: How to use streaming?

Date: 2010-September-22 @ 20:42
Tags: streaming, support

A forum post raises a question about the practical use of streaming in the ASN.1 code generated by asn1c:

Hello, I want to use the protocol to send a lot of data, but the device isn't able to hold the whole amount of received data in RAM. It has to work only with parts of the structure at a time.
For example,
BigMessage ::= SEQUENCE {
    time-stamp UTCTime,
    messages SEQUENCE OF SmallMessage
}
If I get only the first half of the Big Message, I want to work with all received messages and then I have to free my buffer due to limited resources. When I get the second part, I want to work on it and I want to know if the message was valid...
On the encoder side, I also have to send "half" messages.
How can I do this?

ASN.1 streaming is easily the most exciting thing asn1c has to offer. It's not something that's needed too often, but when it is, it's invaluable. Let's split the problem into two parts: how to decode in constant space and how to encode in constant space.

Decoding big messages in constant space

We first assume that data comes into the system in parts via some lower-level streaming protocol, such as HTTP or bare TCP.

If you weren't constrained by the amount of RAM, you would have essentially two ways to proceed: either accumulate everything and then call ber_decode(), or repeatedly invoke ber_decode() whenever the next chunk of data comes in.

If you are constrained by RAM, the first option is out, so the only way to proceed is to:

1. Have a resizeable buffer of a certain size (say, 1k);
2. Add data to that buffer whenever a new chunk of data comes in, possibly extending the buffer;
3. Invoke ber_decode() with that buffer as an argument;
4. Extract data out of the partially decoded target structure. See below for details;
5. When ber_decode() returns with RC_WMORE, you may shift the buffer contents to the left by the number of bytes reported in the .consumed field of the return value (to reclaim the memory which ber_decode() won't need anymore), and go to step 2;
6. When ber_decode() returns with RC_OK, you're done.
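
The buffer handling in steps 2 and 5 can be sketched independently of asn1c. This is a minimal illustration, assuming a fixed-capacity buffer instead of a resizeable one; the names stream_buf_t, stream_buf_append() and stream_buf_shift() are hypothetical and not part of asn1c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { STREAM_BUF_CAP = 1024 };   /* hypothetical fixed capacity ("say, 1k") */

typedef struct stream_buf {
    unsigned char contents[STREAM_BUF_CAP];
    size_t size;                  /* number of valid bytes in contents */
} stream_buf_t;

/* Step 2: append a freshly received chunk.
 * Returns 0 on success, -1 if the chunk does not fit. */
static int stream_buf_append(stream_buf_t *b, const void *chunk, size_t len) {
    if(len > STREAM_BUF_CAP - b->size) return -1;
    memcpy(b->contents + b->size, chunk, len);
    b->size += len;
    return 0;
}

/* Step 5: drop the first `consumed` bytes, i.e. what the decoder
 * reported as used, and slide the remaining bytes to the front. */
static void stream_buf_shift(stream_buf_t *b, size_t consumed) {
    assert(consumed <= b->size);
    memmove(b->contents, b->contents + consumed, b->size - consumed);
    b->size -= consumed;
}
```

After each decoder call, the value to pass to stream_buf_shift() would be the .consumed field of the decoder's return value.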

The most interesting part of decoding in constrained space is the ability to use the structure before it is completely decoded. If you have an extensive structure without repeated elements (like SEQUENCE OF), you're by and large out of luck: there is no easy way to determine which parts of the structure are already decoded and which aren't. But in your case it is significantly easier, since you have a big SEQUENCE OF smaller messages as a central part of the structure to be decoded.

The BigMessage structure is going to look somewhat like this when compiled into C:

typedef struct BigMessage {
	UTCTime_t	 time_stamp;
	struct messages {
		A_SEQUENCE_OF(struct SmallMessage) list;
		
		/* Context for parsing across buffer boundaries */
		asn_struct_ctx_t _asn_ctx;
	} messages;
	
	/* Context for parsing across buffer boundaries */
	asn_struct_ctx_t _asn_ctx;
} BigMessage_t;

The approach is to extract the complete SmallMessage structures out of the messages member as they become available. A proper way to do that is to check messages.list.count: once it reaches N, you can extract the first N-1 messages. Be aware, though, that you can't extract the last available element too soon (that is, until ber_decode() returns with RC_OK), since it might not have been decoded to completion yet.

Here's the pseudocode:

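BigMessage_t *big_msg = 0;
while(receive_data_into_buffer(buffer)) {
  asn_dec_rval_t ret = ber_decode(0, &asn_DEF_BigMessage,
                                  (void **)&big_msg, buffer.contents, buffer.size);
  if(ret.code == RC_FAIL || !big_msg) break;  // Malformed data or out of memory
  // Reclaim the buffer space which ber_decode() won't need anymore.
  memmove(buffer.contents, buffer.contents + ret.consumed, buffer.size - ret.consumed);
  buffer.size -= ret.consumed;
  // Make sure we don't use the last element until we know it's complete.
  int upto = big_msg->messages.list.count - (ret.code == RC_OK ? 0 : 1);
  for(int i = 0; i < upto; i++) {
    // The consumer takes ownership of the message and frees it when done.
    consume_small_message(big_msg->messages.list.array[i]);
  }
  // Shift used-up members of the list
  if(upto > 0) {
    big_msg->messages.list.array[0]
      = big_msg->messages.list.array[big_msg->messages.list.count - 1];
    big_msg->messages.list.count -= upto;
  }
  if(ret.code == RC_OK) break;  // The whole BigMessage is decoded and valid
}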

Encoding big messages in constant space

Encoding big messages in constant space is trickier, and has no direct support in asn1c.

You can't encode the length of a structure before you know the lengths of all its components, and in streaming mode the components' lengths may not be readily available.

In the DER encoding, the length of the message being encoded needs to be known in advance. Therefore, der_encode() won't be able to generate the proper output.
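
To see why the length has to be known up front, here is a minimal sketch of how a definite length is serialized under the BER/DER rules. The function name der_put_length() is hypothetical and is not the asn1c API; it only illustrates that the length octets precede the contents, so they cannot be written until the total size is known.

```c
#include <assert.h>
#include <stddef.h>

/* Serialize a definite length in DER form into out[].
 * out[] must have room for 1 + sizeof(size_t) bytes.
 * Returns the number of bytes written. */
static size_t der_put_length(size_t len, unsigned char *out) {
    assert(out != NULL);
    if(len < 128) {
        /* Short form: a single octet holds the length itself. */
        out[0] = (unsigned char)len;
        return 1;
    }
    /* Long form: 0x80 | N, followed by N big-endian length octets. */
    size_t n = 0;
    for(size_t tmp = len; tmp > 0; tmp >>= 8) n++;
    out[0] = (unsigned char)(0x80 | n);
    for(size_t i = 0; i < n; i++)
        out[1 + i] = (unsigned char)(len >> (8 * (n - 1 - i)));
    return 1 + n;
}
```

For example, a length of 300 is written as the three octets 0x82 0x01 0x2C, all of which come before the first content byte.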

However, in the broader BER encoding, it is possible to encode a so-called “indefinite length” in place of the component size. Unlike the usual Tag-Length-Value encoding, where the length is known up front, the “indefinite length” in BER acts like an opening brace in a programming language: you have to terminate the encoding after all the components of the structure are encoded. The termination is done using two consecutive zero octets.
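
The framing itself can be shown at the byte level without asn1c. The sketch below is only an illustration: all function names are hypothetical, and a short OCTET STRING stands in for whatever components the structure really contains. It wraps two definite-length components in an indefinite-length SEQUENCE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one byte to out[] at *pos; the caller guarantees enough room. */
static void put(unsigned char *out, size_t *pos, unsigned char byte) {
    out[(*pos)++] = byte;
}

/* "Opening brace": a constructed SEQUENCE tag with indefinite length. */
static void open_indefinite_sequence(unsigned char *out, size_t *pos) {
    put(out, pos, 0x30);  /* UNIVERSAL 16 (SEQUENCE), constructed bit set */
    put(out, pos, 0x80);  /* length octet: indefinite form */
}

/* A component in ordinary definite-length form (len must be below 128). */
static void put_octet_string(unsigned char *out, size_t *pos,
                             const void *data, size_t len) {
    assert(len < 128);
    put(out, pos, 0x04);                /* UNIVERSAL 4 (OCTET STRING) */
    put(out, pos, (unsigned char)len);  /* short-form definite length */
    memcpy(out + *pos, data, len);
    *pos += len;
}

/* "Closing brace": the two zero octets that end the indefinite length. */
static void close_indefinite(unsigned char *out, size_t *pos) {
    put(out, pos, 0x00);
    put(out, pos, 0x00);
}
```

Calling open_indefinite_sequence(), then put_octet_string() with "hi" and "yo", then close_indefinite() yields the bytes 30 80 04 02 68 69 04 02 79 6F 00 00. Each component can be sent as soon as it is written, because nothing before it depends on the total size.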

Essentially, the idea is that you encode the components of a particular ASN.1 structure yourself, using DER encoding, but wrap them all in “indefinite length” framing.

Here's the pseudocode:

typedef SmallMessage_t *(small_message_callback_f)();
void BER_encode_big_message(BigMessage_t *big_msg, small_message_callback_f *get_next_small_message) {
    SmallMessage_t *small_msg;

    // Encode Tag-Length of the outer structure.
    // Indefinite length is valid only for constructed encodings, so the
    // "constructed" bit (0x20) must be set in the first octet of each tag written here.
    assert(asn_DEF_BigMessage.all_tags_count == 1);  // Showing the simplest case.
    ber_tlv_tag_serialize(asn_DEF_BigMessage.all_tags[0]);
    write(0x80);    // “Indefinite length” encoding

    // Write out the first member of the BigMessage
    der_encode(&asn_DEF_UTCTime, &big_msg->time_stamp);

    // Write out the Tag-Length for the second member of BigMessage
    assert(strcmp(asn_DEF_BigMessage.elements[1].name, "messages") == 0);
    assert(asn_DEF_BigMessage.elements[1].tag_mode == 0);
    assert(asn_DEF_BigMessage.elements[1].type->all_tags_count == 1);
    ber_tlv_tag_serialize(asn_DEF_BigMessage.elements[1].tag);
    write(0x80);    // “Indefinite length” encoding of the inner structure

    // Obtain and serialize smaller messages one by one
    while((small_msg = get_next_small_message())) {
        der_encode(&asn_DEF_SmallMessage, small_msg);
        ASN_STRUCT_FREE(asn_DEF_SmallMessage, small_msg);
    }

    // Terminate “indefinite length” framing for messages
    write(0x0); write(0x0);

    // Terminate “indefinite length” framing of BigMessage
    write(0x0); write(0x0);
}

The above pseudocode is fragile: it assumes that there is no additional tagging besides what's shown in your example, and that the AUTOMATIC TAGS module option is not in effect.

But you get the idea.
