Data Transparency: Why You Should Care

Supreet had a great post recently about the importance of giving your data model significant design attention, just as you do your system’s performance, determinism, and functionality. (Indeed, you can hardly separate these things.) I’d like to take that theme a bit further, and talk about some of the things RTI Data Distribution Service can do to specifically support you in that.

One of the most interesting (and, as far as I know, unique) capabilities of DDS middleware is that you can associate type definitions with your data streams. (For an introduction to DDS concepts, see the RTI whitepaper “Is DDS for You?”) Most other messaging systems do one of the following instead:

  1. Use opaque data only, and make the application handle marshalling/serialization itself, including data encapsulation, endianness conversions, and the like.
  2. Include complete data structure information with every message, including the names and types of any fields it contains.

Multi-vendor interoperability is fiction without some kind of type system, which disqualifies (1). How can I so much as send you a single string of text if we can’t agree whether that string should be length-prefixed or NUL-terminated, how wide the characters should be, or what encoding they should use? Self-describing messages, as in (2), settle those questions, but they are awfully heavy on the network, and they fail to capture generalities that already exist in your system.
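
To make the string question concrete, here is a minimal Python sketch. The function names and the two conventions are my own illustrative assumptions, not something any particular middleware mandates. It encodes the same text two ways and shows that a reader expecting one convention cannot make sense of the other.

```python
import struct

def encode_length_prefixed(text: str) -> bytes:
    """One possible convention: 4-byte big-endian length, then UTF-8 bytes."""
    payload = text.encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def encode_nul_terminated(text: str) -> bytes:
    """Another convention: UTF-16 little-endian code units, then a 2-byte NUL."""
    return text.encode("utf-16-le") + b"\x00\x00"

def decode_length_prefixed(data: bytes) -> str:
    """Reader that assumes the length-prefixed UTF-8 convention."""
    (length,) = struct.unpack_from(">I", data, 0)
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("declared length exceeds available bytes")
    return payload.decode("utf-8")

message = "héllo"
a = encode_length_prefixed(message)
b = encode_nul_terminated(message)

# Sender and receiver agree: the text survives the round trip.
assert decode_length_prefixed(a) == message

# Sender and receiver disagree: the first bytes of the UTF-16 text are
# misread as a length, and decoding fails.
try:
    decode_length_prefixed(b)
    misread = False
except (ValueError, UnicodeDecodeError):
    misread = True
```

Neither convention is wrong on its own. The failure comes entirely from the two sides not sharing a definition of what a "string" is.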

The fact is, whichever component inserts a data value into a message almost certainly knows the name and type of that data value, and whichever component receives the message has some expectation about the names and types of the values it will find inside. What DDS gives you is the ability to expose that meta-information so that you can derive some value from it.

  • The programming language you use to interface with your middleware probably supports strong typing. With DDS, you can extend that type system — including the definitions of your application’s specific data types — across the network.
  • The database to which you connect has a type definition implicit in its tables’ schemas. Wouldn’t it be nice if the same information model made sense when your data was in-flight as when it was recorded?
  • Use a CEP engine? Leading implementations rely on SQL-like languages to inspect and correlate data values based on well-known fields within the data.
  • Some people like to pull their data into Microsoft Excel to analyze and visualize it. There are those rows and columns again: I’m noticing a pattern.

When you declare your data type up front, the middleware can also share that type’s definition up front — instead of with every message — so that you enjoy all of the benefits of type awareness as well as those of data compactness. Who says you can’t have your cake and eat it too?
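
The trade-off can be sketched in a few lines of Python. This is a rough illustration only: the sample fields, the JSON self-describing form, and the `struct` layout are assumptions made up for the example, and none of it is the actual DDS wire format. The idea it shows is that once both sides hold the same type definition, each message needs to carry values only.

```python
import json
import struct

# Hypothetical type definition, declared once and known to both sides.
FIELDS = [("sensor_id", "I"), ("temperature", "d"), ("humidity", "d")]
LAYOUT = struct.Struct("<" + "".join(code for _, code in FIELDS))

def encode_self_describing(sample: dict) -> bytes:
    """Every message repeats its own field names (option 2 in the list above)."""
    return json.dumps(sample).encode("utf-8")

def encode_typed(sample: dict) -> bytes:
    """Only values travel; names and types come from the shared definition."""
    return LAYOUT.pack(*(sample[name] for name, _ in FIELDS))

def decode_typed(data: bytes) -> dict:
    """Rebuild the named fields on the receiving side from the shared definition."""
    values = LAYOUT.unpack(data)
    return {name: value for (name, _), value in zip(FIELDS, values)}

sample = {"sensor_id": 7, "temperature": 21.5, "humidity": 0.25}
heavy = encode_self_describing(sample)
compact = encode_typed(sample)

print(len(heavy), len(compact))  # the typed encoding is the smaller of the two
assert decode_typed(compact) == sample
```

The receiver still ends up with named, typed fields. It simply gets the names from the definition it was given once rather than from each message.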
