Three Simple Steps to Achieving Peak DDS Performance

RTI Connext® DDS provides an order of magnitude performance improvement over most other messaging middleware. But occasionally we run into customers who are trying to improve the performance of their DDS communications. This performance improvement can be achieved in either throughput or latency. In this blog, I will go through the three simple steps required to assess the performance of your system and will also review some of the most common ways customers have improved performance of their DDS communications.

Step 1: What performance should you be getting?

Compare the numbers you are getting with the comprehensive DDS benchmarks that RTI provides here: https://www.rti.com/products/dds/benchmarks.

If you are not getting close to the numbers you see in the DDS benchmarks, there are a couple of things to try:

Use RTI Perftest to make sure you’re comparing apples to apples.

The configuration of the NIC and the network switch, as well as the maximum network throughput and the CPU, all have an impact on the final DDS performance results. So, to make a fair comparison, run the DDS benchmarks on your own hardware. RTI makes the DDS benchmark program, “RTI Perftest,” available in source code format with complete documentation. You can find a copy of Perftest here: https://community.rti.com/downloads/rti-connext-dds-performance-test

Make sure you are running your tests using the network interface you think you are using.

DDS enables the shared memory and UDPv4 transports by default. If shared memory is available between two nodes, DDS will use it by default. But if there are many network interfaces available, DDS will only use the first four. I’ve seen developers want to test out a certain network interface, say InfiniBand, but it was not one of the first four listed and so DDS was not adding it to the mix. On Windows systems, the order in which network interfaces are listed by the OS, and thus selected by DDS, is random, so the network interface you are actually using can change from run to run. On top of that, DDS will send the same data over two paths, if they exist, to the same endpoint. This can take up CPU time and slow throughput. You can explicitly select the interfaces you want (or do not want) using the transport QOS “allow_interfaces” and “deny_interfaces” properties. Here is a good RTI Community article on the subject: https://community.rti.com/howto/control-or-restrict-network-interfaces-nics-used-discovery-and-data-distribution.

Following is the actual XML code for “allow_interfaces” and “deny_interfaces” QOS that lets you explicitly pick the network interface you want to use or do not want to use:

<participant_qos>
 <property>
  <value>
   <element>
    <name>dds.transport.UDPv4.builtin.parent.deny_interfaces</name>
    <value>10.15.*</value>
   </element>
   <element>
    <name>dds.transport.UDPv4.builtin.parent.allow_interfaces</name>
    <value>10.10.*,192.168.*</value>
   </element>
  </value>
 </property>
</participant_qos>

Step 2: Use the RTI DDS tools to diagnose your performance issues.

Use RTI Monitor to look for the number of ACKs, NACKs, dropped packets, and duplicate packets.  If these numbers are high, it can be due to several things:

  • Transport buffer sizes are too small
  • The MTU is not optimized for the switch
  • There may be too many heartbeats causing multiple resends for single NACKs, indicating the reader is not keeping up
  • The process(es) are CPU-bound or memory-bound

Use RTI Monitor or Admin Console to compare QOS settings of the DataReaders and DataWriters.  Sometimes you are not using the QOS values you think you are using.

A great way to learn about using the Admin Console and the Monitor tools is to watch this video from our tools lead, Ken Brophy.

Step 3: Now let’s start to look at your application to see how we can speed things up by changing the “shape” of the data in motion.

RTI DDS gives you many ways to fine-tune your system using QOS settings. This flexibility is great because you have a lot of control over how DDS works. But all the options can be daunting! I won’t go over every setting (this blog would quickly grow to be a textbook), but I will hit on what I feel are the most important settings to check with regard to performance.

First, don’t use strict reliability if it is not needed. Strict reliability makes sure that every sample reaches every reliable destination and will re-send samples if necessary. Resending samples and the structure that supports them take time and memory.  Many applications would be fine missing a sample very occasionally or waiting longer for it to be re-transmitted.
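As a minimal sketch of relaxing reliability, the fragment below requests best-effort delivery on both the DataWriter and the DataReader for a topic that can tolerate an occasional lost sample. The profile name is hypothetical, and the fragment is assumed to sit inside a QOS library in your own profiles file.

```xml
<!-- Sketch only: "BestEffortSketch" is a hypothetical profile name. -->
<qos_profile name="BestEffortSketch">
 <datawriter_qos>
  <reliability>
   <!-- No resends and no ACKNACK traffic for this writer -->
   <kind>BEST_EFFORT_RELIABILITY_QOS</kind>
  </reliability>
 </datawriter_qos>
 <datareader_qos>
  <reliability>
   <kind>BEST_EFFORT_RELIABILITY_QOS</kind>
  </reliability>
 </datareader_qos>
</qos_profile>
```

A best-effort DataReader still matches a reliable DataWriter, so relaxing only the readers that can tolerate loss is also an option.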

If you do need to use strict reliability then start with the DDS built-in profile “StrictReliable.HighThroughput”.  It is a good idea in general to use the built-in profiles that RTI provides. These built-in profiles are set up by RTI to have all of the default settings needed for the most common DDS use cases. The built-in profiles can be used as-is or can be used as the basis for your QOS configuration and then tweaked for your specific needs. You can read about using DDS built-in profiles and get a working example here:  https://community.rti.com/examples/built-qos-profiles 
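One way to build on a built-in profile is to inherit from it with base_name and then override only what you need. This is a sketch under assumptions: the library and profile names below are hypothetical, and the built-in library prefix (shown here as BuiltinQosLibExp) differs between Connext versions, so check the name that ships with your installation.

```xml
<!-- Sketch only: "MyLibrary" and "MyHighThroughput" are hypothetical names. -->
<qos_library name="MyLibrary">
 <!-- Assumption: the built-in profile lives in BuiltinQosLibExp in your version -->
 <qos_profile name="MyHighThroughput"
              base_name="BuiltinQosLibExp::Generic.StrictReliable.HighThroughput">
  <datawriter_qos>
   <!-- Add writer-side overrides for your use case here -->
  </datawriter_qos>
  <datareader_qos>
   <!-- Add reader-side overrides for your use case here -->
  </datareader_qos>
 </qos_profile>
</qos_library>
```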

Using Extensible types (XTypes) and sequences of structures can hurt performance. DDS serializes and de-serializes data it sends and receives, and this process takes a lot longer with complicated data types.

Adjust the heartbeat_period/ACKNACK combo. In reliable communications, the DataWriter sends DDS data samples and heartbeats to reliable DataReaders. A DataReader responds to a heartbeat by sending an ACKNACK, which tells the DataWriter what the DataReader has received so far. In addition, the DataReader can request missing DDS samples (by sending an ACKNACK) and the DataWriter will respond by resending the missing DDS samples. So, the heartbeat_period controls how quickly a DataReader can acknowledge receipt of a sample or ask for a sample to be re-sent, which impacts performance. Here is an article that talks about how the heartbeat_period can impact latency and throughput.
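To make the heartbeat tuning concrete, here is a sketch of where the heartbeat periods live in the DataWriter QOS. The values are purely illustrative, not recommendations; all three periods are set together so they stay consistent with one another.

```xml
<!-- Sketch only: period values are illustrative assumptions. -->
<datawriter_qos>
 <protocol>
  <rtps_reliable_writer>
   <!-- Normal heartbeat rate: 100 ms -->
   <heartbeat_period>
    <sec>0</sec>
    <nanosec>100000000</nanosec>
   </heartbeat_period>
   <!-- Faster rate used when the writer queue is filling up: 10 ms -->
   <fast_heartbeat_period>
    <sec>0</sec>
    <nanosec>10000000</nanosec>
   </fast_heartbeat_period>
   <!-- Rate used while a late-joining reader catches up: 10 ms -->
   <late_joiner_heartbeat_period>
    <sec>0</sec>
    <nanosec>10000000</nanosec>
   </late_joiner_heartbeat_period>
  </rtps_reliable_writer>
 </protocol>
</datawriter_qos>
```

Shorter periods let readers ACK or NACK sooner at the cost of more protocol traffic, so it is worth measuring both latency and throughput after each change.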

Modify the Asynchronous Publisher configuration to use flow control to lower the data rate. Sometimes if the data rate from the writer is too fast, the reader gets swamped and the resulting dropped samples and resends slow down the system. Lowering the writer’s data rate a little leaves room for repairs, etc. This gives DDS time to handle incoming data and avoids costly resends. You can use a flow controller to shape the output traffic your publisher will generate. By using an asynchronous publisher and custom flow controller you can lower the data rate. You can see a working example of how to use the asynchronous publisher here: https://community.rti.com/examples/asynchronous-publisher
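The shape of such a configuration might look like the sketch below: a token-bucket flow controller defined through participant properties, and a DataWriter that publishes asynchronously through it. The controller name "SketchFlowController" is hypothetical, and the property names are written from memory of the asynchronous publisher example, so treat them as assumptions and confirm them against that example before use.

```xml
<!-- Sketch only: controller name and all values are illustrative assumptions. -->
<participant_qos>
 <property>
  <value>
   <!-- 10 tokens x 8192 bytes every 10 ms is roughly 8 MB/s -->
   <element>
    <name>dds.flow_controller.token_bucket.SketchFlowController.token_bucket.max_tokens</name>
    <value>10</value>
   </element>
   <element>
    <name>dds.flow_controller.token_bucket.SketchFlowController.token_bucket.tokens_added_per_period</name>
    <value>10</value>
   </element>
   <element>
    <name>dds.flow_controller.token_bucket.SketchFlowController.token_bucket.bytes_per_token</name>
    <value>8192</value>
   </element>
   <element>
    <name>dds.flow_controller.token_bucket.SketchFlowController.token_bucket.period.sec</name>
    <value>0</value>
   </element>
   <element>
    <name>dds.flow_controller.token_bucket.SketchFlowController.token_bucket.period.nanosec</name>
    <value>10000000</value>
   </element>
  </value>
 </property>
</participant_qos>

<datawriter_qos>
 <publish_mode>
  <kind>ASYNCHRONOUS_PUBLISH_MODE_QOS</kind>
  <!-- Must match the controller defined in the participant properties -->
  <flow_controller_name>dds.flow_controller.token_bucket.SketchFlowController</flow_controller_name>
 </publish_mode>
</datawriter_qos>
```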

For smaller sample sizes, use batching and/or Turbo Mode. Batching groups small samples into a single large packet, which is more efficient to send and can result in a large throughput increase. But note that while the use of batching increases throughput, it can hurt latency when little data is being sent (because of the added time needed to batch small samples). In high-throughput cases, though, average latency can improve because of all the CPU saved on the subscriber side of the interface.

Turbo Mode is an experimental feature that uses an intelligent algorithm that adjusts the number of bytes in each batch at runtime according to current system conditions, such as write speed (or write frequency) and sample size. This intelligence gives Turbo Mode the ability to increase throughput at high message rates and avoid negatively impacting message latency at low message rates.

Here is an article that goes into detail on how to use batching and includes a working example: https://community.rti.com/examples/batching-and-turbo-mode
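For orientation, a batching configuration on the DataWriter can be sketched as follows. The byte limit and flush delay are illustrative assumptions; the right values depend on your sample size and on how much added latency you can accept.

```xml
<!-- Sketch only: limits are illustrative assumptions. -->
<datawriter_qos>
 <batch>
  <enable>true</enable>
  <!-- Flush the batch once it holds about 30 KB of sample data... -->
  <max_data_bytes>30720</max_data_bytes>
  <!-- ...or after 1 ms, whichever comes first, to bound the added latency -->
  <max_flush_delay>
   <sec>0</sec>
   <nanosec>1000000</nanosec>
  </max_flush_delay>
 </batch>
</datawriter_qos>
```

The flush delay is the knob that trades latency against throughput: a longer delay fills batches more completely at low data rates but holds samples back longer.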

Use multicast for topics with more than a couple of subscribers. Multicast allows a publisher to send to multiple readers with a single write, which greatly reduces network and publisher-side processor utilization. Note that sometimes this feature is not available at the network level. Here is a good article on how to implement multicast: https://community.rti.com/best-practices/use-multicast-one-many-data
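Multicast is requested on the receiving side. As a sketch, the DataReader QOS below asks to receive its topic on a multicast group; the address is a hypothetical example and must be one your network actually routes.

```xml
<!-- Sketch only: 239.255.5.1 is a hypothetical multicast group address. -->
<datareader_qos>
 <multicast>
  <value>
   <element>
    <receive_address>239.255.5.1</receive_address>
   </element>
  </value>
 </multicast>
</datareader_qos>
```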

For reliable communications modify the Send Window size. When a reliable DataWriter writes a DDS sample, it keeps that sample in its queue until it has received acknowledgments from all of its subscribing DataReaders that the sample was received. The number of outstanding DDS samples allowed is referred to as the DataWriter’s “send window.” Once the number of outstanding DDS samples has reached the send window size, subsequent writes will block until an outstanding DDS sample is acknowledged. Anytime the writer blocks, it hurts performance. You can read about adjusting the Send Window in section 6.5.3.4 of the DDS User’s Manual.
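The send window bounds sit alongside the heartbeat settings in the DataWriter QOS. The sketch below uses illustrative values only; a larger window lets the writer keep more samples in flight before it blocks, at the cost of more memory and a larger burst for a slow reader to absorb.

```xml
<!-- Sketch only: window sizes are illustrative assumptions. -->
<datawriter_qos>
 <protocol>
  <rtps_reliable_writer>
   <!-- The window can vary between these bounds at runtime -->
   <min_send_window_size>32</min_send_window_size>
   <max_send_window_size>256</max_send_window_size>
  </rtps_reliable_writer>
 </protocol>
</datawriter_qos>
```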

Modify the transport settings. Whether you are using UDPv4 or shared memory or a custom transport, having the right buffer sizes and message sizes configured is extremely important when trying to optimize performance. Following is XML code for modifying the transport message size and buffer sizes for the UDPv4 transport:

<participant_qos>
 <property>
  <value>
   <element>
    <name>dds.transport.UDPv4.builtin.parent.message_size_max</name>
    <value>65536</value>
   </element>
   <element>
    <name>dds.transport.UDPv4.builtin.send_socket_buffer_size</name>
    <value>524288</value>
   </element>
   <element>
    <name>dds.transport.UDPv4.builtin.recv_socket_buffer_size</name>
    <value>2097152</value>
   </element>
  </value>
 </property>
</participant_qos>

Note that the sizes used here are suggestions for optimizing performance when using large samples. You can make these values smaller for smaller samples.

I hope this advice is helpful in getting the best performance out of your DDS Application. I’ve listed the tips I’ve found most helpful for improving DDS performance but there are other methods that can also be helpful depending on the circumstances. In order to get more information on improving throughput or latency (or really help with any other Connext DDS issue), I encourage you to check out the RTI Community portal. The RTI Community portal is an excellent source of information and support! And of course, always feel free to contact our great support department or your local Field Application Engineer for further help.

Hey, Charlie Miller! Let’s Talk About Securing Autonomous Vehicles

A recent Wired article on Charlie Miller (infamously known for remotely hacking and controlling a Jeep) claims that “open conversation and cooperation among companies” are necessary prerequisites to building secure autonomous vehicles. This seems rather far-fetched when so many companies are racing to dominate the future of the once-nearly-dead-but-newly-revived (remember the Big Three bailouts?) automotive industry. As naive as that part of the article sounds, what really blew my mind was the implication that the answer to re-designing security lies solely within the autonomous-vehicle industry.

The concept of security is not isolated to autonomous vehicles, so there is no benefit in pretending that’s the case. Every IIoT industry is trying to solve similar problems and is surprisingly open to sharing its findings. I’m not saying that Miller needs to go on a journey of enlightenment through all other industries to create the ideal solution for security. I’m saying this has already been done for us, compliments of the Industrial Internet Consortium (IIC).

The IIC consists of 250+ companies across several industries – including automotive suppliers like Bosch, Denso, and TTTech – with the same fundamental problem of balancing security, safety, performance, and of course costs for their connected systems. If Wired and Miller are looking for an open conversation – it’s happening at the IIC. The IIC published the Industrial Internet Reference Architecture, which is available to everyone for free – as in “free beer,” especially if the car is doing the driving for you! The extensions to this document are the Industrial Internet Connectivity Framework (IICF) and Industrial Internet Security Framework (IISF). These documents provide guidance from a business perspective down to implementation, and the IISF is particularly applicable as it addresses Wired’s brief mentions of securing the connectivity endpoints and the data that passes between them.

Take a ride with me and see how we might modify the connected car’s architecture to protect against potential adversaries. Since we do not have any known malicious attacks on cars, we can start with Miller’s Jeep hack. Thanks to a backdoor “feature” in the Harman Kardon head unit, Miller was able to execute unprotected remote commands quite easily. Through this initial exploit, he was able to reprogram a chip connected to the CAN Bus. From there, he had nearly full control of the car. You’re thinking, “just remove that unprotected interface,” right?

Miller would not have stopped there, so neither shall we. Assuming we could still find an exploit that granted us access to reprogram the ARM chip, Wired’s article rightly suggests establishing an authenticated application – perhaps starting with secure boot for the underlying kernel, leveraging ARM TrustZone for the next stage of critical-only software, and implementing some sort of authentication for higher-level OS and application processes. Your device endpoint might start to look like a trusted application stack (Figure 1 below). I can only guess how much this head unit costs now, but to be fair, these are valid considerations to run a trusted application. The problem now is that we haven’t actually connected to anything, let alone securely. Don’t worry, I won’t leave you by the roadside.

Figure 1. Trusted Application Stack

Many of these trusted applications connect directly to the CAN Bus, which arguably expands the attack surface to the vehicle control. The data passed between these applications is not protected from unauthorized data writers and readers. In the case of autonomous taxis, as Wired points out, potential hackers now have physical access to their target, increasing their chance of taking over an application or introducing an imposter. Now the question becomes: can applications trust each other and the data on the CAN Bus? How does the instrument cluster trust the external temperature data? Does it really need to? Maybe not, and that’s ok. However, I am pretty sure that the vehicle control needs to trust LIDAR, Radar, cameras, and so on. The last thing anyone wants to worry about is a hacker remotely taking the car for a joyride.

We are really talking about data authenticity and access control: two provisions that would have further mitigated risk against Miller’s hack. Securing the legacy applications is a good step, but let’s consider the scenario where an unauthorized producer of data is introduced to the system. This trespasser can inject commands on the CAN Bus – messages that control steering and braking. The CAN Bus does not prevent unauthorized publishers of data, nor does it ensure that the data comes from the authenticated producer. I’m not suggesting that replacing the CAN Bus is the way forward – although I’m not opposed to the idea of replacing it with a more data-centric solution. Realistically, with a framework like Data Distribution Service (DDS), we can create a layered architecture as guided by the IISF (Figure 2 below). The CAN Bus and critical drive components are effectively legacy systems for which security risk can be mitigated by creating a DDS databus barrier. New components can then be securely integrated using DDS without further compromising your vehicle control. So what is DDS? And how does it help secure my vehicle? Glad you asked.

Figure 2. Industrial Internet Security Framework Protecting Legacy Endpoints

Imagine a network of automotive sensors, controllers, and other “participants” that communicate peer-to-peer. Every participant receives only the data it needs from another participant and vice versa. With peer-to-peer, participants in that network can mutually authenticate and if our trusted applications hold up, so does our trusted connectivity. How do we secure those peer-to-peer connections? TLS, right? Possibly, but with the complexity of securing our vehicle we want the flexibility to trade off between performance and security and apply access control mechanisms.

Let’s back up a little and re-visit our conversation about the IICF, which provides guidance on connectivity for industrial control systems. The IICF identifies existing open standards and succinctly attributes them to precise functions of an Industrial IoT system. At its core, an autonomous vehicle, as cool as it sounds, is just an Industrial IoT system in a sleek aerodynamic body with optional leather seats. So what does the IICF suggest for integrating software for an Industrial IoT system, or more specifically, autonomous systems? You guessed it! DDS: an open set of standards designed and documented through open conversations by the Object Management Group (OMG). An ideal automotive solution leveraging DDS allows system applications to publish and subscribe to only messages that they need (see Figure 3 below for our view of an autonomous architecture). With this data-centric approach, we can architecturally break down messages based on criticality for safety or need for data integrity.

Figure 3. Autonomous Vehicle Data-Centric Architecture

And now that we’ve established a connectivity solution for our autonomous vehicle, we can get back to talking about security and our TLS alternative: a data-centric security solution for a data-centric messaging framework. With DDS Security, Industrial IoT system architects can use security plugins to fine-tune security and performance trade-offs, a necessary capability not offered by TLS (Figure 4 below). Authenticate only select data topics but no more? Check. Encrypt only sensitive information but no more? Check. Actually, there is more. Casting aside centralized brokers, DDS Security offers distributed access control mechanisms that dictate which participants can publish or subscribe to certain topics, without single points of vulnerability. This means that Miller’s unauthorized application would be denied permission to publish commands to control braking or steering. Or if Miller compromised the data in motion, the data subscriber could cryptographically authenticate the message and discard anything that doesn’t match established policies. Can we say our autonomous vehicle is now completely secure? No, because as Miller made perfectly clear, we still need more conversations. However, we can certainly say that DDS and DDS Security provide the forward-looking flexibility needed to help connect and secure autonomous systems.

Figure 4. Connext DDS Secure Pluggable Architecture
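To give a feel for per-topic trade-offs, here is a sketch of governance rules in the style of the OMG DDS Security specification: one safety-critical topic is access-controlled and encrypted, while a low-risk topic is left unprotected to save CPU. The topic names are hypothetical, the element names follow my reading of the specification and should be checked against the version you deploy, and a real governance document must also be signed before the security plugins will accept it.

```xml
<!-- Sketch only: "BrakeCommand" and "CabinTemperature" are hypothetical topics. -->
<dds>
 <domain_access_rules>
  <domain_rule>
   <domains>
    <id>0</id>
   </domains>
   <!-- Every participant must authenticate before joining the domain -->
   <allow_unauthenticated_participants>false</allow_unauthenticated_participants>
   <enable_join_access_control>true</enable_join_access_control>
   <discovery_protection_kind>SIGN</discovery_protection_kind>
   <liveliness_protection_kind>SIGN</liveliness_protection_kind>
   <rtps_protection_kind>NONE</rtps_protection_kind>
   <topic_access_rules>
    <!-- Safety-critical: only permitted writers and readers, payload encrypted -->
    <topic_rule>
     <topic_expression>BrakeCommand</topic_expression>
     <enable_discovery_protection>true</enable_discovery_protection>
     <enable_liveliness_protection>true</enable_liveliness_protection>
     <enable_read_access_control>true</enable_read_access_control>
     <enable_write_access_control>true</enable_write_access_control>
     <metadata_protection_kind>SIGN</metadata_protection_kind>
     <data_protection_kind>ENCRYPT</data_protection_kind>
    </topic_rule>
    <!-- Low-risk: no per-topic protection -->
    <topic_rule>
     <topic_expression>CabinTemperature</topic_expression>
     <enable_discovery_protection>false</enable_discovery_protection>
     <enable_liveliness_protection>false</enable_liveliness_protection>
     <enable_read_access_control>false</enable_read_access_control>
     <enable_write_access_control>false</enable_write_access_control>
     <metadata_protection_kind>NONE</metadata_protection_kind>
     <data_protection_kind>NONE</data_protection_kind>
    </topic_rule>
   </topic_access_rules>
  </domain_rule>
 </domain_access_rules>
</dds>
```

With write access control enabled on the braking topic, a participant whose permissions do not grant publishing on that topic would be refused, which is the behavior described above for an unauthorized application.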

So, to Mr. Charlie Miller (and of course Mr. Chris Valasek), your work is amazing and your vision inspiring, but I think you need to look across industries if you want to talk openly about redesigning automotive architecture. When you and all the other Charlie Millers in the world want to have that open conversation, come knock on our door. At RTI, we are ready to talk to you about autonomy, Industrial IoT, safety and security, and everything else you believe should define the cars of tomorrow.

Mission: ace the initial screening call and get asked back for in-depth interviews

Congratulations! Hopefully the tips from Mission: score an interview with a Silicon Valley company were helpful, and you have been contacted to talk to the hiring manager. Here are a few tips on how to ace the initial call.

Before the interview

  • Test your system. We often use a video call to interview candidates: Skype or Google Hangouts. Make sure your camera and microphone work. Do not use a phone for video conferencing. Sometimes we do hands-on exercises. Have a working IDE, editor and compiler installed. Lastly, take a few moments to clean up your computer background and desktop files.
  • Be professional. Dress appropriately in a shirt or blouse. A suit or tie is overkill. Also, pay attention to your environment and background. Clean up the room. Coordinate with your family or roommates that you will be in an important interview call.
  • Prepare by exploring the company website. Read about the products, download the evaluation software, and check out the videos and forums. Learn as much as you can about the company. Look up the interviewer on LinkedIn. You may discover a common passion.

During the interview

  • Be friendly and personable. Smile. Show a spark.
  • Be confident, but don’t oversell. And no BS.
  • Ask clarifying questions, especially if your English is not that great.
  • Be sincere if you do not know something but try to answer anyway. For example, “I am not a Java expert, though if it works similar to C#, then …”
  • Don’t give up. Brainstorm. An interview is not a pass/fail test. One candidate felt he should have been asked back for an in-depth interview because he answered six of the ten questions correctly; in his eyes, he had passed the interview. Unfortunately, for the four questions he missed, he didn’t even try to answer them. Being able to figure out things you don’t know is one of the most important skills of an engineer.
  • Don’t play hard to get or show that you are uninterested. Don’t act as if you are too selective for the job.
  • Think of some questions to ask about the company, job, customers, etc.

Good luck on your mission to reach the in-depth interviews. Apply now to start the process!