Comet server message sizes, Bandwidth considerations
One of the questions I raised in a previous blog post – Comet servers for a Single-Dealer Platform – was that of bandwidth. The main factor affecting bandwidth that a Comet server has some control over is message size. At the time of that post I hadn’t looked into this issue in much detail for all the servers, so I decided to dive a little deeper and look at the protocols of some of the Comet servers covered in that post.
A customer once asked about message sizes, but it was the wrong question: they asked what performance Liberator could handle with 1K messages, because they wanted to compare Liberator to another Comet server sending 1K messages. I pointed out that they should tell us what payload they want to send so that we can see the size of the resulting message in our protocol, and they should do the same with the other Comet server. Maybe we could represent a much bigger payload in a 1K message than the other server could.
I have looked at four servers for this blog – Liberator, Lightstreamer, Adobe LCDS and my-Channels Nirvana. They all take slightly different approaches, but can all be used to achieve similar results. My test case is a single message payload, but I am showing the structure of the messages and will comment on how different payloads may be represented better or worse in the different servers.
The payload I am looking at is a fairly typical financial data update. To help highlight a few problems I am looking at what is often called an ‘image’ and an ‘update’. An image is the initial update, containing all the data about the instrument; an update is any subsequent message, in which only a subset of the fields changes.
The image is:
| Field1 | 123456 |
| Field2 | 234567 |
| Time1 | 1274181389019 |
| Low | 100000 |
| High | 200000 |
| FullName | Company Name Ltd |
The update is:
| Field1 | 123459 |
| Field2 | 234569 |
| Time1 | 1274181389999 |
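The relationship between the image and the update can be sketched in a few lines of Python. This is only an illustration of the idea, not how any of the servers implement it: `build_update` is a hypothetical helper, and treating every value as a string is an assumption made for simplicity.

```python
def build_update(previous, current):
    """Return only the fields whose values differ from the previous state."""
    return {name: value for name, value in current.items()
            if previous.get(name) != value}


# The image from the table above (all values kept as strings for simplicity).
image = {
    "Field1": "123456",
    "Field2": "234567",
    "Time1": "1274181389019",
    "Low": "100000",
    "High": "200000",
    "FullName": "Company Name Ltd",
}

# The instrument's state after the next tick: three fields have changed.
latest = dict(image, Field1="123459", Field2="234569", Time1="1274181389999")

update = build_update(image, latest)
print(update)
```

The resulting dictionary holds just Field1, Field2 and Time1, which is the update shown above.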
These may seem like small messages, but they are fairly typical, and it should be easy to see how a few more fields will affect the message sizes. Let’s first look at the summary of results; if you are interested, the detail of the messages is further down.
The table shows the size in bytes of the message that each server will send. For LCDS and Nirvana in particular, the two entries are different protocols with quite different message sizes. I assume anyone using LCDS would be using the binary AMF protocol, although you get to choose. With Nirvana, if you want a native web client you have to use the web protocol, but the protocol available to other technologies has far smaller message sizes.
| Server | Protocol | Image Message Size (bytes) | Update Message Size (bytes) |
| Liberator | Web | 101 | 64 |
| Liberator | Other | 80 | 43 |
| Lightstreamer | Web | 95 | 67 |
| Lightstreamer | Other | 64 | 36 |
| Nirvana | NHP (Web clients) | 760 | 675 |
| Nirvana | NSP (Java/Other clients) | 154 | 102 |
| LCDS | AMF binary | 403 | 403 |
| LCDS | AMF XML | 937 | 937 |
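To get a feel for what these sizes mean in bandwidth terms, here is a rough back-of-the-envelope sketch. The update sizes are copied from the table; the client count and update rate are made-up numbers for illustration, and the calculation ignores TCP/IP and HTTP framing, so treat the output as a lower bound rather than a measurement.

```python
# Update message sizes in bytes, copied from the table above.
UPDATE_SIZES = {
    ("Liberator", "Web"): 64,
    ("Liberator", "Other"): 43,
    ("Lightstreamer", "Web"): 67,
    ("Lightstreamer", "Other"): 36,
    ("Nirvana", "NHP"): 675,
    ("Nirvana", "NSP"): 102,
    ("LCDS", "AMF binary"): 403,
    ("LCDS", "AMF XML"): 937,
}

# Hypothetical load, chosen only for illustration.
CLIENTS = 1000
UPDATES_PER_SECOND = 10


def megabits_per_second(message_bytes, clients, updates_per_second):
    """Message bytes only; ignores TCP/IP and HTTP framing overhead."""
    return message_bytes * 8 * clients * updates_per_second / 1_000_000


for (server, protocol), size in UPDATE_SIZES.items():
    mbps = megabits_per_second(size, CLIENTS, UPDATES_PER_SECOND)
    print(f"{server:14} {protocol:11} {mbps:8.2f} Mbit/s")
```

With these assumed numbers the Liberator web update works out at 5.12 Mbit/s and the Nirvana NHP update at 54 Mbit/s, for the same data.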
Let’s now look at the actual message structures and how they might differ with other payloads.
Liberator
Liberator sends a script-tag-wrapped Javascript function call with a single argument containing a message in its own protocol. The messages do not contain the subject (or channel), and field names are mapped onto single characters (for the first 64 fields, then two characters, and so on). This is quite compact: adding more fields or larger field values adds little overhead.
Web Image message = 101 bytes
<script>a("7O5W0001 a=123456 b=234567 c=1274181389019 d=100000 e=200000 f=Company+Name+Ltd")</script>
Web Update message = 64 bytes
<script>a("6c5W0002 a=123459 b=234569 c=1274181389999")</script>
Other APIs such as Java and .Net don’t need the script wrapper, which makes the messages that much smaller.
Other Image message = 80 bytes
7O5W0001 a=123456 b=234567 c=1274181389019 d=100000 e=200000 f=Company+Name+Ltd\n
Other Update message = 43 bytes
6c5W0002 a=123459 b=234569 c=1274181389999\n
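The sizes above can be reproduced with a small sketch that rebuilds strings of the same shape as the samples. It is not Liberator's actual encoder. The leading token (`7O5W0001`, `6c5W0002`) is treated as opaque and copied from the samples, the a, b, c... code assignment is assumed from the sample order, and replacing spaces with `+` is inferred from `Company+Name+Ltd`.

```python
import string

# Assumption: short codes are assigned a, b, c... in field order, as in the samples.
FIELD_ORDER = ["Field1", "Field2", "Time1", "Low", "High", "FullName"]
FIELD_CODES = dict(zip(FIELD_ORDER, string.ascii_lowercase))


def encode(header, fields, web=False):
    """Build a message shaped like the samples above.

    `header` is the opaque leading token, copied verbatim from the samples.
    """
    pairs = [f"{FIELD_CODES[name]}={value.replace(' ', '+')}"
             for name, value in fields.items()]
    body = " ".join([header] + pairs)
    if web:
        # Web clients get the script-tag wrapper around the same body.
        return f'<script>a("{body}")</script>'
    return body + "\n"


image = {"Field1": "123456", "Field2": "234567", "Time1": "1274181389019",
         "Low": "100000", "High": "200000", "FullName": "Company Name Ltd"}
update = {"Field1": "123459", "Field2": "234569", "Time1": "1274181389999"}

for label, header, fields in [("image", "7O5W0001", image),
                              ("update", "6c5W0002", update)]:
    print(label, len(encode(header, fields, web=True)), len(encode(header, fields)))
```

Each extra field costs its value plus three bytes (a space, the one-character code and the equals sign), which is why the overhead stays small as the payload grows.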
Lightstreamer
Lightstreamer is similar to Liberator in that it also sends a single Javascript function call. Again, no subject name is needed, and field names are not used at all. The fields are known by the subscriber, so the order the values come back in is significant. This has the benefit that field names do not need to be sent, but it does mean that fields that have not updated still need to be sent, albeit as an empty string. This is illustrated by the three empty strings at the end of the update message. In the worst case, a subscription to many fields of which only a few update at a time would carry some overhead, but it is not that significant.
Web Image message = 95 bytes
<script>z(0,1,"123456","234567","1274181389019","100000","200000","Company Name Ltd");</script>
Web Update message = 67 bytes
<script>d(0,1,"123459","234569","1274181389999","","","");</script>
For non web APIs the messages are again smaller.
Other Image message = 64 bytes
0,1|123456|234567|1274181389019|100000|200000|Company Name Ltd\r\n
Other Update message = 36 bytes
0,1|123459|234569|1274181389999|||\r\n
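The positional approach can be sketched as follows. Again this is only an illustration that reproduces the stated sizes, not Lightstreamer's implementation. The `0,1` prefix and the `z` / `d` function names are copied from the samples without assuming what they mean, and `encode_positional` is a hypothetical helper.

```python
# The subscriber already knows this field order, so no names go on the wire.
FIELD_ORDER = ["Field1", "Field2", "Time1", "Low", "High", "FullName"]


def encode_positional(prefix, changed, web_function=None):
    """Emit one value per subscribed field, in order; unchanged fields are empty."""
    values = [changed.get(name, "") for name in FIELD_ORDER]
    if web_function is not None:
        quoted = ",".join(f'"{value}"' for value in values)
        return f"<script>{web_function}({prefix},{quoted});</script>"
    return "|".join([prefix] + values) + "\r\n"


image = {"Field1": "123456", "Field2": "234567", "Time1": "1274181389019",
         "Low": "100000", "High": "200000", "FullName": "Company Name Ltd"}
update = {"Field1": "123459", "Field2": "234569", "Time1": "1274181389999"}

print(len(encode_positional("0,1", image, web_function="z")))
print(len(encode_positional("0,1", update, web_function="d")))
print(len(encode_positional("0,1", image)))
print(len(encode_positional("0,1", update)))
```

In this sketch an unchanged field costs three bytes in the web form (a comma and two quotes) and one byte in the other form (a pipe), which is the overhead described above.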
Nirvana
Nirvana’s NHP protocol, used by browser Javascript clients, is quite verbose. Once again it is Javascript being streamed directly, but it is not that optimised for size. The channel name is sent twice, and field names are sent both as part of the map and as a list. There is also some extra data in there, which may be useful to some of the features of Nirvana, but seems like quite an overhead.
Note that this is a message using the properties style message (the ‘fprops’ section), it is also possible to send an opaque message (or a combination of the two) – an opaque message uses the ‘badata’ part of the message and would then not need all the fprops information – but you would have to pack some of that into your opaque message.
Since you can send what data/fields you want, the update message is smaller, not containing anything about the fields that haven’t updated.
With much larger payloads, the overhead will become less significant.
The messages below have been formatted for readability by adding line breaks; the sizes reflect the actual messages as sent.
Image message = 760 bytes
<script>
try{
window.parent.handleNVLLastEID("/PATH/TO/CHANNEL1","28");
window.parent.handleNVLEvents({ eid : 29,
cname : "/PATH/TO/CHANNEL1",
tag : "notag",
badata : "",
hasData : true,
hasDictionary : true,
'fprops': {
'Field1':"123456",
'Field2':"234567",
'Time1':1274181389019,
'Low':"100000",
'High':"200000",
'FullName':"Company Name Ltd",
'nrvpub.time':1274187070518,
'nrvpub.host':"host.domain.com",
'nrvpub.name':"username",
'JMSMessageID':"My Message Id",
'JMSXUserID':"My User id",
'JMSDeliveryMode':"NON_PERSISTENT",
nrvkeys : ['Field1','Field2','Time1','Low','High',
'FullName','nrvpub.time','nrvpub.host','nrvpub.name',
'JMSMessageID','JMSXUserID','JMSDeliveryMode'],
nrvprops : "",
nrvproparrays : "" }
},
"eventHandlerCallbackFunc");
}catch(Exception){}
</script>
Update message = 675 bytes
<script>
try{
window.parent.handleNVLLastEID("/PATH/TO/CHANNEL1","28");
window.parent.handleNVLEvents({ eid : 29,
cname : "/PATH/TO/CHANNEL1",
tag : "notag",
badata : "",
hasData : true,
hasDictionary : true,
'fprops': {
'Field1':"123459",
'Field2':"234569",
'Time1':1274181389999,
'nrvpub.time':1274187070518,
'nrvpub.host':"host.domain.com",
'nrvpub.name':"username",
'JMSMessageID':"My Message Id",
'JMSXUserID':"My User id",
'JMSDeliveryMode':"NON_PERSISTENT",
nrvkeys : ['Field1','Field2','Time1',
'nrvpub.time','nrvpub.host','nrvpub.name',
'JMSMessageID','JMSXUserID','JMSDeliveryMode'],
nrvprops : "",
nrvproparrays : "" }
},
"eventHandlerCallbackFunc");
}catch(Exception){}
</script>
Nirvana’s NSP protocol is used by the Java API (and other Enterprise APIs). It is a binary protocol with much smaller messages. Field names are sent but channel names are not.
Image message = 154 bytes
..............Field1..123456.Field2..234567.Time1......J.
Low..100000.High..200000.FullName..Company Name Ltd.........J.
.......host.domain.com.username....
Update message = 102 bytes
..............Field1..123459.Field2..234569.Time1......J.
........J........host.domain.com.username....
Adobe LCDS
LCDS sends serialised representations of objects, which means it sends all the fields whether they have changed or not. You could probably construct objects to only contain changed fields, but it doesn’t play to the strengths of the intended ease of use of sending objects and object remoting.
The XML version of the message is very verbose, but I don’t think there is a reason to use it, so it is only here to show the data more clearly than the binary version does. Class names, channel names, field names and XML tags are all sent, which makes the message very large. Again, line breaks and indentation have been added for readability.
AMF XML message = 937 bytes
<object type="flex.messaging.messages.AsyncMessage">
<traits>
<string>destination</string>
<string>headers</string>
<string>correlationId</string>
<string>messageId</string>
<string>timestamp</string>
<string>clientId</string>
<string>timeToLive</string>
<string>body</string>
</traits>
<string>market-data-feed</string>
<object>
<traits>
<string>DSSubtopic</string>
</traits>
<string>PATH.TO.INSTRUMENT</string>
</object>
<null/>
<string>DCE35EDA-AF0F-B3B5-3ECC-DA2D0F7BE56B</string>
<double>1.243526266655E12</double>
<string>DCE25D41-DC0A-6A55-8F54-E7E2C34D1B07</string>
<double>0.0</double>
<object type="flex.samples.marketdata.Stock">
<traits>
<string>Field1</string>
<string>Field2</string>
<string>Time1</string>
<string>Low</string>
<string>High</string>
<string>FullName</string>
</traits>
<double>123456</double>
<double>234567</double>
<date>1274181389019</date>
<double>100000</double>
<double>200000</double>
<string>Company Name Ltd</string>
</object>
</object>
The binary AMF protocol strips out the XML tags, which are replaced by single-byte type markers (shown by the percent symbol in the message below). This cuts the message size a lot, but it is still pretty big.
AMF binary message = 403 bytes
1ae%%
1a7%%
%%%
%flex.messaging.messages.AsyncMessage
%destination
%headers
%correlationId
%messageId
%timestamp
%clientId
%timeToLive
%body
%market-data-feed
%%%
%DSSubtopic
%PATH.TO.INSTRUMENT
%
%IC49204A4-D0D5-6C35-476C-B0170F6444AA
%%%%%%%%%
%IC47E1B72-01DF-9FE4-6EEE-D7D230F6A28E
%%%%%%%%%
%%%
%flex.samples.marketdata.Stock
%Field1
%Field2
%Time1
%Low
%High
%FullName
%123456
%234567
%%%%%%%%%
%100000
%200000
%Company Name Ltd
%%%%%
Conclusions
As you can see, message sizes can differ a lot, even when they represent the same data payload. Message size may not seem that important at first, but if you want to deliver real-time data to a lot of clients it quickly adds up to a large bandwidth bill and saturated networks, both of which can affect hosting costs. Bandwidth at the client end may also be an issue, and latency can be affected in some situations. Smaller is better!