After spending years developing enterprise applications and platforms, moving to developing technologies for the “edge” with real-time needs has been very refreshing. One of the more interesting aspects of distributed application development is being aware of the physics involved in the deployment. Any signal experiences a propagation delay resulting from the finite speed of light, which is about 300,000 kilometers per second, or about 1 nanosecond per foot. A signal in a cable or optical fiber travels at approximately 2/3 the speed of light in a vacuum. This gets more complicated when routers, satellite links, and other variances in the network topology are introduced.
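As a quick back-of-the-envelope sketch (plain Python, independent of any middleware), the one-way propagation delay over a fiber link is just distance divided by roughly 2/3 of the speed of light:

```python
# Back-of-the-envelope one-way propagation delay in optical fiber.
# A signal in fiber travels at roughly 2/3 the speed of light in a vacuum.

SPEED_OF_LIGHT_KM_S = 300_000   # ~300,000 km/s in a vacuum
FIBER_FACTOR = 2 / 3            # propagation speed in fiber relative to c

def fiber_delay_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds, ignoring routers and queuing."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR) * 1000

# A 4,000 km link costs about 20 ms one way, before a single router hop.
print(round(fiber_delay_ms(4000), 1))  # → 20.0
```

No amount of tuning removes this floor; it is physics, which is exactly why the rest of the design has to work around it.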
In real-time applications, speed and performance matter, and often define the success of the project.
The designer of a sensor-based distributed application with real-time needs has to factor in the topology of the network. By analogy, it would be as if a CRM architect had to account for disk rotation speed when designing her SQL queries, which she does not need to do, since the disk problem is localized and those applications do not care about microseconds.
While adopting a faster middleware such as RTI Data Distribution Service is part of the solution, …
(Reference: RTI Data Distribution Services performance numbers)
… the comprehensive way to address this issue is by distributing the intelligence in the network. In many cases, an individual sensor reading is not important on its own; the data becomes meaningful for transmission to (say) an intrusion-detection engine only when it crosses a threshold (example: a Ganglia sensor reporting > 25% disk usage) or when the event happens in the context of another event (example: a Ganglia sensor reporting > 25% disk usage AND a new port opened within 10 milliseconds on the same node).
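A minimal sketch of that edge-side decision might look like this (generic Python, not any particular product's API; the field names and the 10 ms window simply mirror the example above):

```python
# Edge-side filtering sketch: forward a sensor reading only when it is
# meaningful on its own (threshold crossed) or in the context of another
# event (a new port opened within a short window on the same node).

DISK_THRESHOLD_PCT = 25.0       # publish when disk usage exceeds this
CORRELATION_WINDOW_S = 0.010    # 10 milliseconds

def should_publish(disk_usage_pct, reading_time_s, port_open_time_s=None):
    """Decide at the edge whether this reading is worth putting on the wire."""
    if disk_usage_pct > DISK_THRESHOLD_PCT:
        return True
    if (port_open_time_s is not None
            and abs(reading_time_s - port_open_time_s) <= CORRELATION_WINDOW_S):
        return True
    return False
```

Everything that fails both checks stays local, and the wire carries only events an intrusion-detection engine would actually act on.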
So, besides designing an efficient information model, architects of distributed applications with real-time needs also need to move as much information processing as possible close to the source of the data. This is useful both to protect the bandwidth of the network link and to ensure that the subscriber is not overwhelmed by the rate of data it needs to consume. That is where an intelligent middleware comes in… RTI Data Distribution Service provides many rich features that can be enabled through parameters. One interesting feature is Content-Filtered Topics, which lets subscribers use SQL-like expressions to specify which data they are interested in, saving network resources and CPU.
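To illustrate the idea behind content filtering (and only the idea: this toy Python is not RTI's ContentFilteredTopic API), a subscriber registers an SQL-like expression such as "disk_usage > 25" and only matching samples are delivered to it:

```python
import operator
import re

# Toy content filter: compile a simple SQL-like expression of the form
# "field OP literal" into a predicate over dict-shaped data samples.
# Samples that fail the predicate are never delivered to the subscriber.

_OPS = {">": operator.gt, ">=": operator.ge,
        "<": operator.lt, "<=": operator.le, "=": operator.eq}

def compile_filter(expression):
    """Turn e.g. 'disk_usage > 25' into a predicate on samples."""
    field, op, literal = re.match(
        r"(\w+)\s*(>=|<=|>|<|=)\s*(\S+)", expression).groups()
    return lambda sample: _OPS[op](sample[field], float(literal))

matches = compile_filter("disk_usage > 25")
print(matches({"disk_usage": 40.0}))  # → True
print(matches({"disk_usage": 10.0}))  # → False
```

In a real deployment the middleware can push such filters toward the writer side, so non-matching samples never consume network bandwidth at all.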
Another technology that is really useful for adding intelligence to the network is Complex Event Processing, which gives users RDBMS-like operations on streaming data, enabling lower processing latency (since the data does not need to be persisted to disk before running the queries).
(Reference: Read how Complex Event Processing adds intelligence to the distributed system)
With Complex Event Processing (CEP) you can build an application that is more context-aware (since it lets you correlate different data streams based on time or samples), with only the meaningful data published on the wire for consumption.
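The core of such a time-based correlation can be sketched in a few lines of generic Python (no specific CEP engine's API is implied; the event names are just the disk-and-port example from earlier):

```python
# CEP-style time correlation sketch: join two event streams, emitting a
# composite event only when items from both occur within a time window.
# Events are (timestamp_seconds, payload) tuples; a simple nested scan
# keeps the sketch readable (a real engine would use sliding windows).

def correlate(events_a, events_b, window_s):
    """Return (a, b) pairs whose timestamps lie within window_s of each other."""
    return [
        (a, b)
        for a in events_a
        for b in events_b
        if abs(a[0] - b[0]) <= window_s
    ]

disk_alerts = [(0.100, "disk > 25%"), (0.500, "disk > 25%")]
port_opens = [(0.105, "port 4444 opened")]

# Only the first disk alert falls within 10 ms of a port-open event,
# so only one composite event would be published on the wire.
print(correlate(disk_alerts, port_opens, 0.010))
```

The uncorrelated readings never leave the node, which is precisely the bandwidth- and subscriber-protecting behavior described above.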
So, this blog ends on the same note as the last post. Distributed systems architects need to think beyond just tuning the network link. By using intelligent middleware and making intelligent choices about their information model, they can circumvent many of the challenges posed by physics.