Sensor Diagnostic Based Anomaly Detection in Weather Stations
Description
From influencing flight planning by airline companies to motivating preventive action ahead of natural catastrophes, weather forecasting is important to individuals and organizations at all levels. As with any data-driven model, accurate weather prediction starts with good-quality data. The problem is that weather data is challenging to model and effectively impossible to detrend. Moreover, some weather variables, such as precipitation, have a heavy-tailed distribution, with more than 70% of the observations being zeros (no rain observed, in the case of liquid precipitation).
In this work, we present a regression-based approach for anomaly detection in weather stations. We apply this method to individual sensors in a network for failure diagnosis. Unlike traditional joint anomaly detection methods, which report anomalous observations without identifying the broken sensor, our method not only finds outliers but also gives information about the probable cause of the anomaly. We model the conditional distribution of a weather variable given the observations at its k nearest neighbors. Our hypothesis is that a diagnostic approach in a network of sensors is more efficient at detecting anomalies than rule-based approaches.
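The neighbor-conditioned idea can be illustrated with a minimal sketch. It assumes a linear least-squares model with roughly Gaussian residuals, which is a stand-in chosen for brevity; the description above does not specify the regression form. The function names, the synthetic temperature data, the value k = 3, and the threshold of 4 standard deviations are all hypothetical choices made for illustration only.

```python
import numpy as np

def k_nearest_stations(coords, target, k):
    """Indices of the k stations closest to `target` (Euclidean distance)."""
    d = np.linalg.norm(coords - coords[target], axis=1)
    d[target] = np.inf  # exclude the target station itself
    return np.argsort(d)[:k]

def fit_neighbor_regression(X, y):
    """Least-squares fit of y on X with an intercept.

    Returns the coefficients and the residual standard deviation.
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma = np.std(y - A @ beta, ddof=A.shape[1])
    return beta, sigma

def anomaly_scores(X, y, beta, sigma):
    """Absolute standardized residuals of y given the neighbor readings X."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.abs(y - A @ beta) / sigma

# Synthetic data (assumption): 10 stations sharing one regional
# temperature signal, each with independent sensor noise.
rng = np.random.default_rng(0)
n_stations, n_times, k = 10, 500, 3
coords = rng.uniform(0, 100, size=(n_stations, 2))
regional = 20 + 5 * np.sin(np.linspace(0, 20, n_times))
temps = regional[:, None] + rng.normal(0, 0.5, size=(n_times, n_stations))

target = 0
nbrs = k_nearest_stations(coords, target, k)

# Fit on the first 400 time steps, score the remaining 100.
beta, sigma = fit_neighbor_regression(temps[:400][:, nbrs], temps[:400, target])

test_X = temps[400:, nbrs]
test_y = temps[400:, target].copy()
test_y[50] += 10.0  # simulated fault on the target sensor only

scores = anomaly_scores(test_X, test_y, beta, sigma)
flagged = np.flatnonzero(scores > 4.0)  # hypothetical threshold
print("flagged test indices:", flagged)
```

Because the model is fitted per sensor, a large residual points at the target sensor itself rather than at the network as a whole, which is the diagnostic property the approach relies on. A zero-inflated variable such as precipitation would likely need a different conditional model than the linear one assumed here.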
Our data set is a record of measurements from 120 weather stations in Oklahoma in 2008 and 2009. It covers more than 13 sensors, with measurements recorded at 5-minute intervals. The weather variables include Relative Humidity, Air Temperature, Average Wind Speed, Wind Direction, Pressure, Liquid Precipitation and Solar Radiation.