#TBT: From Predicting to Propelling the Industrial IoT

If you missed it, you should check out the recent press release about RTI’s growth in the Industrial Internet of Things (IIoT). It’s really a great time to be RTI! Sure, from a business perspective all the vectors point the right way. But for me, the most exciting things in that press release aren’t numbers.  I’m more amazed that we get to play with so many futuristic applications. Carbots? Renewable energy? Smart healthcare? Hyperloop? Flying cars? Wind turbines? CT scanners? We got ‘em all. And new things show up all the time.

How can a small company like RTI play in so many areas? It’s Thursday, so in honor of #TBT, I thought I’d take a look back and see how we got here.

For those of you RTI history buffs, we sold our tools business to Wind River in 2005. It was 80% of our revenues at the time. So, in 2006, we had some money and great people, but not much in the way of products. So, we started looking for new trends. What did we find? Here’s an excerpt from our 2006 vision paper called “The Data-Centric Future”:

Truly profound technologies become part of everyday life. Motors, plastics, computers, and now networking have made this transition in the last 100 years. These technologies are embedded in billions of devices; they have melted into the assumed background of the modern world.

Another step is emerging in this progression: pervasive, real-time data. This differs from the Internet in that this pervasive information infrastructure will connect devices, not people. Just as the “web” connected people and fundamentally changed how we all interact, pervasive data will connect devices and change how they interact.

Today’s network makes it easy to connect nodes, but not easy to find and access the information resident in networks of connected nodes. This is changing; we will soon assume the ability to pool information from many distributed sources, and access it at rates meaningful to physical processes.  Many label this new capability the “network-centric” architecture. We prefer the term “data centric” because the change, fundamentally, is driven by a fast and easy availability of information, not the connectivity of the network itself. Whatever the name, it will drive the development of vast, distributed, information-critical applications.

If you change “pervasive information infrastructure” to “Industrial IoT,” that was a pretty good prediction. Too bad the name didn’t stick, but I suppose “PII” sounds more like the second act of a play or a stuttering math teacher than a technology revolution. Anyway, we thought it was coming soon, but 6 years later, we were still predicting. An RTI handbook entry from 2012 (still before the IoT and before the IIC launched the Industrial IoT in 2014) expanded on this a bit:

There is amazing value in distributed information. Connecting people to information transforms society – news feeds, weather satellites, and the Internet are only a few examples – timely information flow drives value in every industry and every endeavor.

However, current technologies only connect people at human speeds. There’s an entirely new opportunity to connect machines at physics speeds. Just as the Internet connected people and fundamentally changed how we all interact, a new “pervasive information infrastructure” will connect devices at speeds fast enough to drive distributed applications. RTI’s people and technology are the best in the world at delivering that data to the right place at the right time. We fundamentally connect complex systems at extreme speeds better than any organization on the planet.

Our technology enables tens, or hundreds, or thousands, or (soon) millions of processors to work together as a single application. Why does that matter? Because intimately-connected systems can do things that weakly-connected systems cannot. They can request and access data from far-flung reaches fast enough to react intelligently. They can read deeply-embedded sensors, use that data to control high-speed machines, and feed the results to the enterprise for monitoring and optimization. This powerful connectivity is a fundamental transformation that will make currently difficult things commonplace and currently impossible things possible. We are already working on many of these applications. Our work will help astronomers probe the deepest reaches of space, protect passengers from injury on tomorrow’s roads, make our nation more secure from attack, and improve the efficiency of renewable energy generation. RTI is leading the new wave of large connected systems, systems that work together as one.

So how can we play in so many futuristic areas? Call us lucky or call us prescient, but the IIoT has been our target for over a decade. RTI is influential in the IIoT because we got a head start. We’re no longer predicting. We’re now realizing the future.

So, what’s next? Our new tagline is “RTI lives at the intersection of functional artificial intelligence and pervasive networking.” Think about that one for a while. AI and connectivity are perhaps the most important trends for the next 40 years. They will combine to bring new wonders to light in every industry on the planet. The IIoT is much more than a name or a connectivity technology; it’s a new infrastructure for, well, everything. Connecting things together is powerful. Connecting those things to smarts is a whole new game. Carbots will improve transportation, autonomous hospital devices will take better care of patients and smart networks will make green energy practical. But, these are just the ENIACs of the IIoT. It won’t be long until every device on the planet is part of a connected, intelligent infrastructure. That infrastructure will make everything more efficient, more useful and friendlier.  These are exciting times indeed!

In any case, it’s an honor to be a leader in the “new” IIoT revolution. It’s even more of an honor to be associated with the technical visionaries at RTI that put us there.

A Foggy Forecast for the Industrial Internet of Things

Signs on I-280 up the San Francisco peninsula proclaim it the “World’s Most Beautiful Freeway.” It’s best when the fog rolls over the hills into the valley, as in this picture I took last summer.

That fog is not just pretty, it’s also the natural refrigerator responsible for California’s famously perfect weather. Clouds in the right place work wonders.

[Sidebar: What is Fog? (IIoT glossary)]

This is a perfect analogy for the impending future of Industrial Internet of Things (IIoT) computing. In weather, fog is the same thing as clouds, only close to the ground. In the IoT, fog is defined as cloud technology close to the things. Neither is a precise term, but it’s true in both cases: clouds in the right place work wonders.

The major industry consortia, including the Industrial Internet Consortium (IIC) and the OpenFog Consortium, are working hard to better define this future.  All agree that many aspects that drive the spectacular success of the cloud must extend beyond data centers.  They also agree that the real world contains challenges that cloud systems do not handle.  And they bandy about names and brand positioning; see the sidebar for a quick weather map.  By any name, the fog, or layered edge computing, is critical to the operation of the industrial infrastructure.

Perhaps the best way to understand fog is to examine real use cases.

Example: Connected Medical Devices

Consider first the future of intelligent medical systems.  The driving issue is an alarming fact: the 3rd leading cause of death in the US is hospital error.  Despite extensive protocols that check and recheck assumptions, device alarms, training on alarm fatigue, and years of experience, the sad truth is that hundreds of thousands of people die every year because of miscommunications and errors.  It is increasingly clear that compensating for human error in such a complex environment is not the solution.  The best path is to use technology to take better care of patients.

The Integrated Clinical Environment standard is a leading effort to create an intelligent, distributed system to monitor and care for patients.  The key idea is to connect medical devices to each other and to an intelligent “supervisory” computing function.  The supervisor acts like a tireless member of the care team, checking patient status and intelligently alerting human caretakers or even taking autonomous actions when there are problems.


The supervisor combines and analyzes oximeter, capnometer, and respirator readings to reduce false alarms and stop drug infusion to prevent overdose. The DDS “databus” connects all the components with real-time reliable delivery.
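
To make the supervisor’s logic concrete, here is a minimal sketch in plain Python (not the ICE standard or a real DDS application); the vital-sign thresholds and action names are hypothetical, chosen only to illustrate how agreement between independent sensors suppresses false alarms:

```python
# Hypothetical sketch of an ICE-style supervisor rule: require agreement
# between independent vital signs before stopping a drug infusion pump.
from dataclasses import dataclass

@dataclass
class Vitals:
    spo2: float        # oxygen saturation (%), from the pulse oximeter
    etco2: float       # end-tidal CO2 (mmHg), from the capnometer
    resp_rate: float   # breaths per minute, from the respirator

def supervise(v: Vitals) -> str:
    """Return the action the supervisor would publish on the databus."""
    low_oxygen = v.spo2 < 90.0
    depressed_breathing = v.etco2 > 50.0 or v.resp_rate < 8.0
    if low_oxygen and depressed_breathing:
        # Independent sensors agree: likely overdose, act autonomously.
        return "STOP_INFUSION_AND_ALERT"
    if low_oxygen or depressed_breathing:
        # A single out-of-range reading is often a probe artifact: alert only.
        return "NOTIFY_CARE_TEAM"
    return "OK"

if __name__ == "__main__":
    print(supervise(Vitals(spo2=85.0, etco2=55.0, resp_rate=6.0)))   # STOP_INFUSION_AND_ALERT
    print(supervise(Vitals(spo2=85.0, etco2=38.0, resp_rate=14.0)))  # NOTIFY_CARE_TEAM
```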

This sounds simple.  However, consider the real-world challenges.  The problem is not just the intelligence.  Current medical devices do not communicate at all.  They have no idea that they are connected to the same patient.  There’s no obvious way to ensure data consistency, staff monitoring, or reliable operation.

Worse, the above diagram covers only one patient.  That’s not the reality of a hospital, which has hundreds or thousands of beds.  Patients move between rooms every day.  The environment includes a mix of wired and wireless networks. Finding and delivering information within this treatment-critical environment is a formidable challenge.

A realistic hospital environment includes thousands of patients and hundreds of thousands of devices. Reliable monitoring technology must find the right patient and guarantee delivery of that patient’s data to the right analysis or staff. In the connectivity map above, every red dot is a “fog routing node”, responsible for passing the right data up to the next layer.

This scenario exposes the key need for a layered fog system.  Complex systems like this must be built from hierarchical subsystems.  Each subsystem shares internal data, with possibly complex dataflow, to execute its functions.  For instance, a ventilator is a complex device that controls gas flows, monitors patient state, and delivers assisted breathing.  Internally, it includes many sensors, motors, and processors that share this data.  Externally, it presents a much simpler interface that conveys the patient’s physiological state.  Each of the hundreds of types of devices in a hospital faces a similar challenge.  The fog computing system must exchange the right information up the chain at each level.

Note that this use case is not a good candidate for cloud-based technology.  These machines must exchange fast, real-time data flows, such as signal waveforms, to properly make decisions.  Also, patient health is at stake.  Thus, each critical component will need a very reliable connection and even redundant implementation for failover.  Those failovers must occur in a matter of seconds.  It’s not safe or practical to rely on remote connections.

Example: Autonomous Cars

The “driverless car” is the most disruptive innovation in transportation since the “horseless carriage”.  Autonomous Drive (AD) cars and trucks will change daily life and the economy in ways that are hard to imagine.  They will move people and things faster, safer, cheaper, farther, and easier than the primitive “bio-drive” cars of the last century.  And the economic impact is stunning: 30% of all US jobs will end or change; trucking, delivery, traffic control, urban transport, child & elder care, roadside hotels, restaurants, insurance, auto body, law, real estate, and leisure will never again be the same.

Autonomous car software exchanges many data types and sources. Video and Lidar sensors are very high volume; feedback control signals are fast. Infrastructure that reliably sends exactly the right information to exactly the right places at the right time makes system development much easier. The vehicle thus combines the performance of embedded systems with the intelligence of the cloud…aka fog.

Intelligent vehicles are complex distributed systems.  An autonomous car combines vision, radar, lidar, proximity sensors, GPS, mapping, navigation, planning, and control.  These components must work together as a reliable, safe, secure system that can analyze its surroundings in real time and react quickly enough to negotiate chaotic environments.  Autonomy is thus a supreme technical challenge.  An autonomous car is more a robot on wheels than it is a car. Automotive vendors suddenly face a very new challenge.  They need fog.

Fog integrates all the components in an autonomous car design. Each of these components is a complex module on its own. As in the hospital patient monitoring case, this is only one car; fog routing nodes (red) are required to integrate subsystems and connect the car into a larger cloud-based system. This system also requires fast performance, extreme reliability, integration of many types of dataflow, and controlled module interactions. Note that cloud-based applications are also critical components. Fog systems must seamlessly merge with cloud-based applications as well.

How Can Fog Work?

So, how can this all work?  I’ve hinted at a few of the requirements above.  Connectivity is perhaps the greatest challenge.  Enterprise-class technologies cannot deliver the performance, reliability, redundancy, and distributed scale that IIoT systems need.

The key insight is that systems are all about the data.  The enabling technology is data-centricity.

A data-centric system has no hard-coded interactions between applications.  When applied to fog connectivity, this concept overcomes the problems of point-to-point system integration, such as poor scalability, limited interoperability, and difficulty evolving the architecture. It enables plug-and-play simplicity, scalability, and exceptionally high performance.

The leading standard for data-centric connectivity is the Data Distribution Service (DDS).  DDS is not like other middleware.  It directly addresses real-time systems. It features extensive fine control of real-time Quality of Service (QoS) parameters, including reliability, bandwidth control, delivery deadlines, liveliness status, resource limits, and security.  It explicitly manages the communications “data model”, or types and QoS used to communicate between endpoints.  It is thus a “data-centric” technology.

DDS is all about the data: finding data, communicating data, ensuring fresh data, matching data needs, and controlling data.  Like a database, which provides data-centric storage, DDS understands the contents of the information it manages.  This data-centric nature, analogous to a database, justifies the term “databus”.


Traditional communications architectures directly connect applications. This connection takes many forms, including messaging, remote object-oriented invocation, and service oriented architectures. Data-centric systems fundamentally differ because applications interact only with the data and properties of data. Data-centricity decouples applications and greatly enables scalability, interoperability and integration. Because many applications may interact with the data independently, data-centricity also makes redundancy natural.

Note that the databus replaces the application-application interaction with application-data-application interaction.  This abstraction is the crux of data-centricity and it’s absolutely critical.  Data-centricity decouples applications and greatly eases scaling, interoperability, and system integration.

Continuing the analogy above, a database implements this same trick for data-centric storage.  It saves old information that you can later search by relating properties of the stored data.  A databus implements data-centric interaction.  It manages future information by letting you filter by properties of the incoming data.  Data-centricity makes a database essential for large storage systems.  Data-centricity makes a databus a fundamental technology for large software-system integration.

The databus automatically discovers and connects publishing and subscribing applications.  No configuration changes are required to add a new smart machine to the network.  The databus matches and enforces QoS.  The databus insulates applications from the execution, or even existence, of other applications.  As long as its data specifications are met, an application can run successfully.

A databus also requires no servers.  It uses a protocol to discover possible connections.  All dataflow is directly peer-to-peer for the lowest possible latency.  And, with no servers to clog or fail, the fundamental infrastructure is both scalable and reliable.

To scale as in our examples above, we must combine hierarchical subsystems; that’s important to fog.  This requires a component that isolates subsystem interfaces, a “fog routing node”.  Note that this is a conceptual term.  It does not have to be, and often is not, implemented as a hardware device.  It is usually implemented as a service, or running application.  That service can run anywhere needed: on the device itself, in a separate box, or in the higher-level system.  Its function is to “wrap a box around” a subsystem, thus hiding the complexity.  The subsystem thus exports only the needed data, allows only controlled access, and even presents a single security domain (certificate).  Also, because the databus so naturally supports redundancy, the service design allows highly reliable systems to simply run many parallel routing nodes.

Hierarchical systems require containment of subsystem internal data. The fog routing node maps data models between levels, controls information export, enables fast internal discovery, and maps security domains. The external interface is thus a much simpler view that hides the internal system.
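
A minimal sketch of the fog routing node idea, in plain Python rather than any real DDS routing service; the topic names and export rules are hypothetical, but they show how a subsystem can expose a simple external view while keeping internal data contained:

```python
# Hypothetical sketch of a fog routing node implemented as a service: it
# subscribes to a subsystem's internal topics and republishes only a reduced,
# controlled view to the next level up.
from typing import Callable, Dict, Optional

class FogRoutingNode:
    def __init__(self, export_rules: Dict[str, Callable[[dict], Optional[dict]]]):
        # export_rules maps an internal topic to a function that returns the
        # (possibly summarized) sample to export, or None to keep it internal.
        self.export_rules = export_rules

    def on_internal_sample(self, topic: str, sample: dict,
                           publish_external: Callable[[str, dict], None]) -> None:
        rule = self.export_rules.get(topic)
        if rule is None:
            return                      # topics with no rule never leave the subsystem
        exported = rule(sample)
        if exported is not None:
            publish_external(f"hospital/{topic}", exported)

# Example: a ventilator exports patient state, not its raw motor telemetry.
node = FogRoutingNode({
    "ventilator/patient_state": lambda s: {"bed": s["bed"], "resp_rate": s["resp_rate"]},
    # "ventilator/motor_current" has no rule, so it stays inside the subsystem.
})
node.on_internal_sample(
    "ventilator/patient_state",
    {"bed": 12, "resp_rate": 14, "raw_flow": [0.1, 0.2, 0.3]},
    lambda topic, s: print(topic, s),
)
```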

RTI has immense experience with this design, with over 1000 projects.  These include fast 3kHz feedback loops for robotics, NASA KSC’s huge 300k-point launch control SCADA system, Siemens Wind Power’s largest offshore turbine farms, the Grand Coulee dam, GE Healthcare’s CT imaging and  patient monitoring product lines, almost all Navy ships of the US and its allies, Joy Global’s continuous mining machines, many pilotless drones and ground stations, Audi’s hardware-in-the-loop testing environment, and a growing list of autonomous car and truck designs.

The key benefits of a databus include:

  • Reliability: Easy redundancy and no servers to fail allow extremely reliable operation. The DDS databus supports systems that cannot tolerate being offline even for a short period, whether 5 minutes or 5 milliseconds.
  • Real-time: Databus peer-to-peer delivery easily supports latencies measured in milliseconds and even tens of microseconds.
  • Interface scale: Large software projects with more than 10 interacting modules must carefully define, coordinate, and evolve interfaces. Data-centric technology moves this responsibility from manual processes to automatic, enforced infrastructure.  RTI has experience with systems with over 1500 teams of programmers building thousands of interacting applications.
  • Data scale: When systems grow large, they must control dataflow. It’s simply not practical to send everything to every application.  The databus allows filtering by content, rate, and more.  Thus, applications receive only what they truly need.  This greatly reduces both network and processor load.  This is critical for any system with more than 1000 independently-addressable data items.
  • Architecture: Data-centricity is not easily “added” to a system. It is instead adopted as the core design.  Thus, the transformation makes sense only for next-generation IIoT designs.  Most system designs have lifecycles of many years.

Any system that meets most of these requirements should seriously consider a data-centric design.


The Foggy Future

Like the California fog blanket, a cloud in the right place works wonders.  Databus technology enables elastic computing by reliably bringing the data where it’s needed.  It supports real-time, reliable, scalable system building. Of course, communication is only one of the required functions of the evolving fog architecture.  But it is key and relatively mature.  It is thus driving many designs.

The Industrial IoT will change nearly every industry, including transportation, medical, power, oil and gas, agriculture, and more.  It will be the primary driving trend in technology for the next several decades, the technology story of our lifetimes.  Fog computing will move powerful processing currently only available in the cloud out to the field.  The forecast is foggy indeed.

Databus vs. Database: The 6 Questions Every IIoT Developer Needs to Ask


The Industrial Internet of Things (IIoT) is full of confusing terms.  That’s unavoidable; despite its reuse of familiar concepts in computing and systems, the IIoT is a fundamental change in the way things work.  Fundamental changes require fundamentally new concepts.  One of the most important is the concept of a “databus”.

The soon-to-be-released IIC reference architecture version 2 contains a new pattern called the “layered databus” pattern.  I can’t say much more now about the IIC release, but going through the documentation process has been great for driving crisp definitions.

The databus definition is:

A databus is a data-centric information-sharing technology that implements a virtual, global data space.  Software applications read and update entries in a global data space. Updates are shared between applications via a publish-subscribe communications mechanism.

Key characteristics of a databus are:

  1. the participants/applications directly interface with the data,
  2. the infrastructure understands, and can therefore selectively filter, the data, and
  3. the infrastructure imposes rules and guarantees of Quality of Service (QoS) parameters such as rate, reliability, and security of data flow.
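
A toy illustration of those three characteristics, in plain Python (this is not a real databus implementation, and the topic and field names are made up): applications write to and read from a shared data space, and the infrastructure, not the applications, decides which updates are delivered where:

```python
# Toy "global data space" (illustrative only): applications write data values;
# the infrastructure understands the data and selectively delivers it.
from typing import Callable, List, Tuple

class DataSpace:
    def __init__(self):
        # Each subscription is (topic, filter predicate, callback).
        self._subs: List[Tuple[str, Callable[[dict], bool], Callable[[dict], None]]] = []

    def subscribe(self, topic: str, where: Callable[[dict], bool],
                  on_data: Callable[[dict], None]) -> None:
        self._subs.append((topic, where, on_data))

    def write(self, topic: str, sample: dict) -> None:
        # The writer never addresses a specific reader: the infrastructure
        # (this class) filters the data and delivers it to matching readers.
        for sub_topic, where, on_data in self._subs:
            if sub_topic == topic and where(sample):
                on_data(sample)

bus = DataSpace()
bus.subscribe("patient/vitals", lambda s: s["spo2"] < 90, lambda s: print("ALERT", s))
bus.write("patient/vitals", {"bed": 7, "spo2": 97})   # filtered out, never delivered
bus.write("patient/vitals", {"bed": 7, "spo2": 86})   # delivered: ALERT {...}
```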

Of course,  new concepts generate questions.  Some of the best questions came from an architect from a large database company.  We usually try to explain the databus concept from the perspective of a networking or software architect.  But, data science is perhaps a better approach.  Both databases and databuses are, after all, data science concepts.

Let’s look at the 6 most common questions.

Question 1: How is a databus different from a database (of any kind)?

Short answer: A database implements data-centric storage.  It saves old information that you can later search by relating properties of the stored data.  A databus implements data-centric interaction.  It manages future information by letting you filter by properties of the incoming data.

Long answer: Data centricity can be defined by these properties:

  • The interface is the data. There are no artificial wrappers or blockers to that interface like messages, or objects, or files, or access patterns.
  • The infrastructure understands that data. This enables filtering/searching, tools, & selectivity.  It decouples applications from the data and thereby removes much of the complexity from the applications.
  • The system manages the data and imposes rules on how applications exchange data. This provides a notion of “truth”.  It enables data lifetimes, data model matching, CRUD interfaces, etc.

A relational database is a data-centric storage technology. Before databases, storage systems were files with application-defined (ad hoc) structure.  A database is also a file, but it’s a very special file.  A database knows how to interpret the data and enforces access control.  A database thus defines “truth” for the system; data in the database can’t be corrupted or lost.

By enforcing simple rules that control the data model, databases ensure consistency.  By exposing the data to search and retrieval by all users, databases greatly ease system integration.  By allowing discovery of data and schema, databases also enable generic tools for monitoring, measuring, and mining information.

Like a database, data-centric middleware (a databus) understands the content of the transmitted data.  The databus also sends messages, but it sends very special messages.  It sends only messages specifically needed to maintain state.  Clear rules govern access to the data, how data in the system changes, and when participants get updates.  Importantly, only the infrastructure sends messages.  To the applications, the system looks like a controlled global data space.  Applications interact directly with data and data “Quality of Service” (QoS) properties like age and rate.  There is no application-level awareness or concept of “message”.  Programs using a databus read and write data, they do not send and receive messages.


A database replaces files with data-centric storage that finds the right old data through search. A databus replaces messages with data-centric connectivity that finds the right future data through filtering. Both technologies make system integration much easier, supporting much larger scale, better reliability, and application interoperability.

With knowledge of the structure and demands on data, the databus infrastructure can do things like filter information, selecting when or even if to do updates.  The infrastructure itself can control QoS like update rate, reliability, and guaranteed notification of peer liveliness.  The infrastructure can discover data flows and offer those to applications and generic tools alike.  This knowledge of data status, in a distributed system, is a crisp definition of “truth”.  As in databases, the infrastructure exposes the data, both structure and content, to other applications.  This accessible source of truth greatly eases system integration.  It also enables generic tools and services that monitor and view information flow, route messages, and manage caching.
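
The contrast can be made concrete in a few lines of Python. The database half uses the standard sqlite3 module; the databus half is only a toy filter, not the real DDS API, but it applies the same predicate to future data instead of stored rows:

```python
# Past data: a database query searches what was already stored.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vehicle (id TEXT, distance_m REAL)")
db.executemany("INSERT INTO vehicle VALUES (?, ?)", [("a", 150.0), ("b", 450.0)])
print(db.execute("SELECT id FROM vehicle WHERE distance_m < 200").fetchall())  # [('a',)]

# Future data: a databus subscription (toy sketch, not the real DDS API)
# registers the same kind of predicate, but it is applied to samples as they
# are produced, and matching samples are pushed to the application.
subscriptions = []

def subscribe(where, on_data):
    subscriptions.append((where, on_data))

def publish(sample):
    for where, on_data in subscriptions:
        if where(sample):
            on_data(sample)

subscribe(lambda s: s["distance_m"] < 200, lambda s: print("delivered:", s))
publish({"id": "c", "distance_m": 120.0})   # delivered to the subscriber
publish({"id": "d", "distance_m": 800.0})   # filtered at the source, never delivered
```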

Question 2: “Software applications read and update entries in a global data space. Updates are shared between applications via a publish-subscribe communications mechanism.”  Does that mean that this is a database that you interact with via a pub-sub interface?

Short answer: No, there is no database.  A database implies storage: the data physically resides somewhere.  A databus implements a purely virtual concept called a “global data space”.

Long answer: The databus data space defines how to interact with future information.  For instance, if “you” are an intersection controller, you can subscribe to updates of vehicles within 200m of your position.  Those updates will then be delivered to you, should a vehicle ever approach.  Delivery is guaranteed in many ways (start within .01 secs, updated 100x/sec, reliable, etc.).  Note that the data may never be stored at all.  (Although some QoS settings like reliability may require some local storage.)  You can think of a data space as a set of specially-controlled data objects that will be filled with information in the exact way you specify, although that information is not (in general) saved by the databus…it’s just delivered.

Question 3: “The participants/applications directly interface with the data.”  Could you elaborate on what that means?

With “message-centric” middleware, you write an application that sends data, wrapped in messages, to another application.  You may do that by having clients send data to servers, for instance.  Both ends need to know something about the other end, usually including things like the schema, but also likely assumed properties of the data like “it’s less than .01 seconds old”, or “it will come 100x/second”, or at least that there is another end alive, e.g. the server is running.  All these assumed properties are completely hidden in the application code, making reuse, system integration, and interoperability really hard.

With a databus, you don’t need to know anything about the source applications.  You make your data needs clear, and the databus delivers the data.  Thus, with a databus, each application interacts only with the data space.  As an application, you simply write to the data space or read from the data space with a CRUD interface.  Of course, you may require some QoS from that data space, e.g. you need your data updated 100x per second.  The data space itself (the databus) will guarantee you get that data (or flag an error).  You don’t need to know whether there is one source of that data or 27 redundant sources, or if it comes over a network or shared memory, or if it’s a C program on Linux or a C# program on Windows.  All interactions are with your own view of the data space.  It also makes sense, for instance, to write data to a space with no recipients.  In this case, the databus may do absolutely nothing, or it may cache information for later delivery, depending on your QoS settings.

Note that both database and databus technologies replace the application-application interaction with application-data-application interaction.  This abstraction is absolutely critical.  It decouples applications and greatly eases scaling, interoperability, and system integration.  The difference is really one of old data stored in a (likely centralized) database, vs future data sent directly to the applications from a distributed data space.

Question 4: “The infrastructure understands, and can therefore selectively filter the data.” Isn’t that true of all pub-sub, where you can register for “events” of interest to you?

Most pub-sub is very primitive.  An application “registers interest”, and then everything is simply sent to that application.  So, for instance, an intersection collision detection algorithm could subscribe to “vehicle positions”.   The infrastructure then sends messages from any sensor capable of producing positions, with no knowledge of the data inside that message.  Even “content filtering” pub-sub offers only very simple specs and requires the system to pre-select what’s important for all.  There’s no real control of flow.

A databus is much more expressive.  That intersection could say “I am interested only in vehicle positions within 200m, moving at 10m/s towards me.  If a vehicle falls into my specs, I need to be updated 200 times a second.  You (the databus) need to guarantee me that all sensors feeding this algorithm promise to deliver data that fast…no slower or faster.  If a sensor updates 1000 times a second, then only send me every 5th update.  I also need to know that you actually are in touch with currently-live sensors (which I define as producing in the last 0.01secs) on all possible roadway approaches at all times.  Every sensor must be able to store 600 old samples (3 seconds worth), and update me with that old data if I need it.”   (These are a few of the 20+ QoS settings in the DDS standard.)
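
Translated into code, that paragraph is essentially a subscription specification. The sketch below is hypothetical plain Python, not DDS syntax; in a real DDS system these fields would roughly map to a content filter expression plus QoS policies such as DEADLINE, TIME_BASED_FILTER, LIVELINESS, and HISTORY:

```python
# Hypothetical sketch: the intersection's requirements from the paragraph
# above, expressed as a subscription "spec" (illustrative only, not the DDS API).
from dataclasses import dataclass

@dataclass
class SubscriptionSpec:
    content_filter: str   # evaluated at the *source*, so non-matching data is never sent
    deadline_hz: float    # sources must update matching data at least this often
    max_rate_hz: float    # downsample faster sources (e.g. every 5th of a 1 kHz feed)
    liveliness_s: float   # a source counts as "alive" only if heard from this recently
    history_depth: int    # old samples each source must keep for the subscriber

intersection = SubscriptionSpec(
    content_filter="distance_m < 200 AND closing_speed_mps > 10",
    deadline_hz=200,
    max_rate_hz=200,
    liveliness_s=0.01,
    history_depth=600,    # 3 seconds of data at 200 updates per second
)
print(intersection)
```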

Note that a subscribing application in the primitive pub-sub case is very dependent on the actual properties of its producers.  It has to somehow trust that they are alive (!), that they have enough buffers to save the information it may need, that they won’t flood it with information nor provide it too slowly.  If there are 10,000 cars being sensed 1000x/sec, but only 3 within 200m, it will have to receive 10,000*1000 = 10m samples every second just to find the 3*200 = 600 it needs to pay attention to.  It will have to ping every single sensor 100x/second just to ensure it is active.  If there are redundant sensors on different paths, it has to ping them all independently and somehow make sure all paths are covered.  If there are many applications, they all have to ping all the sensors independently.  It also has to know the schema of the producers, etc.

The application in the second case will, by contrast, receive exactly the 600 samples it cares about, comfortable in the knowledge that at least one sensor for each path is active.  The rate of flow is guaranteed.  Sufficient reliability is guaranteed.  The total dataflow is reduced by 99.994% (we only need 600/10m samples, and smart middleware does filtering at the source).  For completeness, note that the collision algorithm is completely independent of the sensors themselves.  It can be reused on any other intersection, and it will work with one sensor per path or 17.  If during runtime, the network gets too loaded to meet the data specs (or something fails), the application will be immediately notified.

Question 5: How does a databus differ from a CEP engine?

Short answer: a databus is a fundamentally distributed concept that selects and delivers data from local producers that match a simple specification.  A CEP engine is a centralized executable service that is capable of much more complex specifications, but must have all streams of data sent to one place.

Long answer: A Complex Event Processing (CEP) engine examines an incoming stream of data, looking for patterns you program it to identify.  When it finds one of those patterns, you can program it to take action. The patterns can be complex combinations of past and incoming future data.  However, it is a single service, running on a single CPU somewhere.  It transmits no information.

A databus also looks for patterns of data.  However, the specifications are simpler; it makes decisions about each data item as it’s produced.  The actions are also simpler; the only action it may take is to send that data to a requestor.  The power of a databus is that it is fundamentally distributed.  The looking happens locally on potentially hundreds, thousands, or even millions of nodes.  Thus, the databus is a very powerful way to select the right data from the right sources and send them to the right places.  A databus is sort of like a distributed set of CEP engines, one for every possible source of information, that are automatically programmed by the users of that information.  Of course, the databus has many other properties beyond pattern matching, such as schema mediation, redundancy management, transport support, an interoperable protocol, etc.
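
A toy contrast, in plain Python, of where the work happens (illustrative only; real databuses and CEP engines are far richer): the databus-style function filters at each source so little data crosses the network, while the CEP-style function must first collect every sample in one place:

```python
# Toy contrast: filter at each source (databus-like) vs. ship everything to a
# single central process (CEP-like) before any pattern can be evaluated.
def databus_style(sources, predicate):
    delivered = []
    for samples in sources:                 # the filtering runs *at* each source
        delivered.extend(s for s in samples if predicate(s))
    return delivered                        # only these samples ever leave the nodes

def cep_style(sources):
    central_stream = [s for samples in sources for s in samples]  # everything is shipped
    return central_stream                   # the engine then runs its complex patterns here

sensors = [[{"id": i, "distance_m": d} for d in (50, 500, 5000)] for i in range(3)]
print(len(databus_style(sensors, lambda s: s["distance_m"] < 200)))  # 3 samples transmitted
print(len(cep_style(sensors)))                                        # 9 samples transmitted
```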

Question 6: What application drove the DDS standard and databuses?

The early applications were in intelligent robots, “information superiority”, and large coordinated systems like navy combat management.  These systems needed reliability even when components fail, data fast enough to control physical processes, and selective discovery and delivery to scale.  Data centricity really simplified application code and controlled interfaces, letting teams of programmers work on large software systems over time.  The DDS standard is an active, growing family of standards that was originally driven by both vendors and customers.  It has significant use across many verticals, including medical, transportation, smart cities, and energy.

If you’d like to learn about how intelligent software is sweeping the IIoT, be sure to download our whitepaper on the future of the automotive industry, “The Secret Sauce of Autonomous Cars.”