Monitoring and anomaly detection in time series data with Elastic X-Pack Machine Learning
The customer provides solutions that reduce HVAC energy and maintenance costs for small-format retail and restaurant chains without a heavy investment of money or time. The company is transforming energy management from a hardware-heavy industrial controls approach into a software-centric data services solution. Its solution reduces energy costs by 20-30%, lowers repair costs through an early warning system, provides a premium customer experience by improving site comfort, and saves time through remote management and data-insight services. The software-based approach eliminates the need to pay for, manage, and support complex HVAC control hardware.
The web-based energy management portal runs on the AWS Cloud and is built for multi-site enterprises, providing centralized control and access to a range of energy cost, usage, and operational information, alarms, diagnostics, and analytics.
The product generates data at high velocity and volume, which the energy management platform handles using the AWS IoT platform. The business also needed a Big Data analytics platform to ingest millions of data points and to analyze and optimize the IoT device controls, saving energy and improving the customer experience.
The challenge, however, is to monitor how the data is generated and to detect anomalies in it, such as:
- Unusually high traffic on topics received from Internet-of-Things (IoT) devices for the application used by the customer
- Unusually large payloads received from IoT devices
- Unusual traffic at a particular time of day for the application
- Changes in the status of the Message Queuing Telemetry Transport (MQTT) client connection
- Unusually high response times for the VEM application, and any case where the average response time is unusually high
To resolve these issues, CT's approach was to use next-generation monitoring and anomaly detection on the time series data with Elastic X-Pack Machine Learning.
The Elastic Stack makes it possible to reliably and securely take data from any source, in any format, and search, analyze, and visualize it in real time. Elasticsearch is a real-time, distributed storage, search, and analytics engine. It can be used for many purposes, but one context where it excels is indexing streams of semi-structured data, such as logs or decoded network packets.
CT used this for:
- Automated analysis of time-series data
- Creating accurate baselines of normal behavior in the data
- Identifying anomalous patterns against those baselines
- Applying unsupervised machine learning algorithms
- Detecting and scoring anomalies, and linking them with statistically significant influencers in the data
- Detecting anomalies related to temporal deviations in counts and frequencies
- Detecting statistical rarity
- Detecting unusual behaviors for a member of a population
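As an illustration of how the capabilities listed above map onto an anomaly detection job, the following is a minimal sketch of a job configuration built in Python. It is not the configuration CT actually deployed. The field names (`topic`, `device_id`, `payload_bytes`, `response_time_ms`, `connection_status`), the time field, and the bucket span are all assumptions chosen for the example. The detector functions (`high_count`, `high_mean`, `count`, `rare`) and the overall body layout follow the Elastic machine learning job format. The body would be sent to the create-job endpoint, whose path depends on the stack version (`_xpack/ml/anomaly_detectors/<job_id>` in X-Pack-era releases, `_ml/anomaly_detectors/<job_id>` in later ones).

```python
import json


def build_job_config(bucket_span="15m"):
    """Build a request body for an anomaly detection job.

    All field names below are hypothetical; replace them with the
    fields that actually exist in the indexed IoT documents.
    """
    detectors = [
        # Temporal deviation in counts: unusually high message volume,
        # modeled separately for each MQTT topic.
        {
            "function": "high_count",
            "partition_field_name": "topic",
            "detector_description": "High message count per topic",
        },
        # Unusually large payloads received from devices.
        {
            "function": "high_mean",
            "field_name": "payload_bytes",
            "partition_field_name": "topic",
            "detector_description": "High mean payload size per topic",
        },
        # Unusually high average response time for the application.
        {
            "function": "high_mean",
            "field_name": "response_time_ms",
            "detector_description": "High mean response time",
        },
        # Population analysis: a device behaving unlike its peers.
        {
            "function": "count",
            "over_field_name": "device_id",
            "detector_description": "Unusual device versus population",
        },
        # Statistical rarity: connection states that are seldom seen.
        {
            "function": "rare",
            "by_field_name": "connection_status",
            "detector_description": "Rare MQTT connection status",
        },
    ]
    return {
        "description": "IoT traffic and response time anomalies (sketch)",
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": detectors,
            # Influencers are the entities anomalies get linked to.
            "influencers": ["topic", "device_id"],
        },
        "data_description": {"time_field": "@timestamp"},
    }


if __name__ == "__main__":
    print(json.dumps(build_job_config(), indent=2))
```

Splitting count-based detectors by `topic` gives each topic its own baseline, so a busy topic does not mask unusual traffic on a quiet one. Whether to keep all detectors in one job or use several jobs is a design choice that depends on the data.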
The following diagram depicts time series analysis of one metric, MQTT client connections.
Figure: MQTT Client Connection, Metric Viewer
With this next-generation monitoring approach, there is no need to define algorithms, create and manage rules, or keep tweaking threshold and baseline values. The anomaly detection model is based on unsupervised machine learning algorithms that learn from the ingested data and improve over time.
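To make the idea of a learned baseline concrete, here is a deliberately simplified sketch in Python. It is not the algorithm Elastic machine learning uses, which is considerably more sophisticated. It only shows the principle that "normal" can be derived from recent data rather than set as a fixed threshold. The function name, the window size, the z-score threshold, and the sample connection counts are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def rolling_anomalies(values, window=12, z_threshold=3.0):
    """Return indices whose value deviates strongly from the trailing window.

    The baseline (mean and standard deviation) is recomputed from the
    previous `window` points, so it adapts as the data changes.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline = mean(history)
        # Guard against a perfectly flat history (zero spread).
        spread = max(stdev(history), 1e-9)
        if abs(values[i] - baseline) / spread > z_threshold:
            flagged.append(i)
    return flagged


# Hypothetical MQTT client connection counts per time bucket.
sample = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 102, 100, 250, 101]
print(rolling_anomalies(sample))  # prints [13]
```

No threshold on the connection count itself is ever specified. The spike at index 13 is flagged only because it is far outside what the preceding points made look normal.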