Class OutlierDetector

java.lang.Object
org.apache.hadoop.hdfs.server.datanode.metrics.OutlierDetector

@Private @Unstable public class OutlierDetector extends Object
A utility class to help detect resources (nodes/disks) whose aggregate latency is an outlier within a given set. We use the median absolute deviation (MAD) for outlier detection, as described in the following publication:
Leys, C., et al., "Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median." http://dx.doi.org/10.1016/j.jesp.2013.03.013
We augment the above scheme with the following heuristics to be even more conservative:
1. Skip outlier detection if the sample size is too small.
2. Never flag resources whose aggregate latency is below a low threshold.
3. Never flag resources whose aggregate latency is less than a small multiple of the median.
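The scheme described above can be illustrated with a minimal, self-contained sketch. It is not the Hadoop implementation. The class name `MadOutlierSketch`, its method names, and the two cut-off multipliers are assumptions made for illustration, since this page does not state the values the class uses. The scale factor 1.4826 is the constant given by Leys et al. for normally distributed data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and cut-offs are assumptions. */
final class MadOutlierSketch {
  // Scale factor from Leys et al.; makes the MAD comparable to a
  // standard deviation when the data are normally distributed.
  static final double MAD_SCALE = 1.4826;
  // Assumed cut-offs (hypothetical values, not taken from this page).
  static final double MAD_CUTOFF = 3.0;
  static final double MEDIAN_CUTOFF = 3.0;

  /** Median of a list that is already sorted in ascending order. */
  static double median(List<Double> sorted) {
    int n = sorted.size();
    if (n == 0) {
      throw new IllegalArgumentException("empty list");
    }
    return (n % 2 == 1)
        ? sorted.get(n / 2)
        : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
  }

  /** Scaled median absolute deviation of a sorted list. */
  static double mad(List<Double> sorted) {
    double med = median(sorted);
    List<Double> deviations = new ArrayList<>(sorted.size());
    for (double v : sorted) {
      deviations.add(Math.abs(v - med));
    }
    Collections.sort(deviations);
    return MAD_SCALE * median(deviations);
  }

  /** Returns the entries of stats whose latency exceeds the upper limit. */
  static Map<String, Double> outliers(
      Map<String, Double> stats, long minNumResources, long lowThresholdMs) {
    Map<String, Double> result = new HashMap<>();
    // Heuristic 1: skip detection when the sample is too small.
    if (stats.size() < minNumResources) {
      return result;
    }
    List<Double> sorted = new ArrayList<>(stats.values());
    Collections.sort(sorted);
    double med = median(sorted);
    // MAD criterion: flag values above median + MAD_CUTOFF * MAD.
    double upper = med + MAD_CUTOFF * mad(sorted);
    // Heuristic 2: never flag values below the low threshold.
    upper = Math.max(upper, (double) lowThresholdMs);
    // Heuristic 3: never flag values below a small multiple of the median.
    upper = Math.max(upper, MEDIAN_CUTOFF * med);
    for (Map.Entry<String, Double> e : stats.entrySet()) {
      if (e.getValue() > upper) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}
```

Taking the maximum of the three limits means each heuristic can only raise the bar for flagging a resource, which is what makes the combined scheme more conservative than the MAD criterion alone.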
  • Field Details

    • LOG

      public static final org.slf4j.Logger LOG
  • Constructor Details

    • OutlierDetector

      public OutlierDetector(long minNumResources, long lowThresholdMs)
  • Method Details

    • getOutliers

      public Map<String,Double> getOutliers(Map<String,Double> stats)
      Return a map of the nodes/disks whose latency is much higher than that of their counterparts. The input is a map of (resource -> aggregate latency) entries. The aggregate may be an arithmetic mean or a percentile, e.g. the 90th percentile. Percentiles are a better choice than the mean since latency is usually not normally distributed. This method allocates O(n) temporary memory and runs in O(n log n) time, where n = stats.size().
      Returns:
      map of outlier resources to their aggregate latency.
    • getOutlierMetrics

      public Map<String,org.apache.hadoop.hdfs.server.protocol.OutlierMetrics> getOutlierMetrics(Map<String,Double> stats)
      Return a map of the nodes whose latency is much higher than that of their counterparts, together with their outlier metrics. The input is a map of (resource -> aggregate latency) entries. The aggregate may be an arithmetic mean or a percentile, e.g. the 90th percentile. Percentiles are a better choice than the mean since latency is usually not normally distributed.
      Parameters:
      stats - map of aggregate latency entries.
      Returns:
      map of outlier nodes to outlier metrics.
    • computeMad

      public static Double computeMad(List<Double> sortedValues)
      Compute the Median Absolute Deviation of a sorted list.
    • computeMedian

      public static Double computeMedian(List<Double> sortedValues)
      Compute the median of a sorted list.
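Both helpers above take a list that is already sorted, so callers are expected to sort first. The following sketch shows that contract on a small sample. The class `SortedListStats` and its methods are hypothetical stand-ins, not the Hadoop code. The 1.4826 scale factor follows Leys et al., and this page does not say whether computeMad applies it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper illustrating the sorted-input contract. */
final class SortedListStats {
  // Assumed scale factor (from Leys et al.), not confirmed by this page.
  static final double MAD_SCALE = 1.4826;

  /** Returns an ascending copy, leaving the caller's list untouched. */
  static List<Double> sortedCopy(List<Double> values) {
    List<Double> copy = new ArrayList<>(values);
    Collections.sort(copy);
    return copy;
  }

  /** Median of an ascending list: middle element, or mean of the two middle elements. */
  static double median(List<Double> sortedValues) {
    int n = sortedValues.size();
    if (n == 0) {
      throw new IllegalArgumentException("empty list");
    }
    int mid = n / 2;
    return (n % 2 == 1)
        ? sortedValues.get(mid)
        : (sortedValues.get(mid - 1) + sortedValues.get(mid)) / 2.0;
  }

  /** Scaled median of the absolute deviations from the median. */
  static double mad(List<Double> sortedValues) {
    double med = median(sortedValues);
    List<Double> deviations = new ArrayList<>(sortedValues.size());
    for (double v : sortedValues) {
      deviations.add(Math.abs(v - med));
    }
    // The deviations are not in order even when the input is, so sort again.
    Collections.sort(deviations);
    return MAD_SCALE * median(deviations);
  }
}
```

For the sorted sample 1, 3, 3, 6, 8, 10, 10, 1000 the median is 7. The sorted absolute deviations are 1, 1, 3, 3, 4, 4, 6, 993, whose median is 3.5, so the scaled MAD is approximately 5.19. The single extreme value 1000 barely affects either statistic.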
    • setMinNumResources

      public void setMinNumResources(long minNodes)
    • getMinOutlierDetectionNodes

      public long getMinOutlierDetectionNodes()
    • setLowThresholdMs

      public void setLowThresholdMs(long thresholdMs)
    • getLowThresholdMs

      public long getLowThresholdMs()