Class ExponentiallySmoothedTaskRuntimeEstimator

java.lang.Object
org.apache.hadoop.mapreduce.v2.app.speculate.ExponentiallySmoothedTaskRuntimeEstimator
All Implemented Interfaces:
TaskRuntimeEstimator

public class ExponentiallySmoothedTaskRuntimeEstimator extends Object
This estimator exponentially smooths the rate of progress versus wallclock time. Conceivably we could write an estimator that smooths time per unit progress, and get different results.
  • Field Details

    • context

      protected AppContext context
    • startTimes

      protected final Map<org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId,Long> startTimes
    • mapperStatistics

      protected final Map<Job,DataStatistics> mapperStatistics
    • reducerStatistics

      protected final Map<Job,DataStatistics> reducerStatistics
    • doneTasks

      protected final Set<Task> doneTasks
  • Constructor Details

    • ExponentiallySmoothedTaskRuntimeEstimator

      public ExponentiallySmoothedTaskRuntimeEstimator()
  • Method Details

    • contextualize

      public void contextualize(org.apache.hadoop.conf.Configuration conf, AppContext context)
      Specified by:
      contextualize in interface TaskRuntimeEstimator
    • estimatedRuntime

      public long estimatedRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
      Description copied from interface: TaskRuntimeEstimator
      Estimate a task attempt's total runtime. Includes the time already elapsed.
      Parameters:
      id - the TaskAttemptId of the attempt we are asking about
      Returns:
      our best estimate of the attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.
    • runtimeEstimateVariance

      public long runtimeEstimateVariance(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
      Description copied from interface: TaskRuntimeEstimator
      Computes the width of the error band of our estimate of the task runtime as returned by TaskRuntimeEstimator.estimatedRuntime(TaskAttemptId)
      Parameters:
      id - the TaskAttemptId of the attempt we are asking about
      Returns:
      our best estimate of the attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.
    • updateAttempt

      public void updateAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status, long timestamp)
      Specified by:
      updateAttempt in interface TaskRuntimeEstimator
    • enrollAttempt

      public void enrollAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status, long timestamp)
      Specified by:
      enrollAttempt in interface TaskRuntimeEstimator
    • attemptEnrolledTime

      public long attemptEnrolledTime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId attemptID)
      Specified by:
      attemptEnrolledTime in interface TaskRuntimeEstimator
    • dataStatisticsForTask

      protected DataStatistics dataStatisticsForTask(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)
    • thresholdRuntime

      public long thresholdRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)
      Description copied from interface: TaskRuntimeEstimator
      Find a maximum reasonable execution wallclock time. Includes the time already elapsed. Find a maximum reasonable execution time. Includes the time already elapsed. If the projected total execution time for this task ever exceeds its reasonable execution time, we may speculate it.
      Specified by:
      thresholdRuntime in interface TaskRuntimeEstimator
      Parameters:
      taskID - the TaskId of the task we are asking about
      Returns:
      the task's maximum reasonable runtime, or MAX_VALUE if we don't have enough information to rule out any runtime, however long.
    • estimatedNewAttemptRuntime

      public long estimatedNewAttemptRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId id)
      Description copied from interface: TaskRuntimeEstimator
      Estimates how long a new attempt on this task will take if we start one now
      Specified by:
      estimatedNewAttemptRuntime in interface TaskRuntimeEstimator
      Parameters:
      id - the TaskId of the task we are asking about
      Returns:
      our best estimate of a new attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.