Class BlockPlacementPolicy

java.lang.Object
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Direct Known Subclasses:
BlockPlacementPolicyDefault

@Private public abstract class BlockPlacementPolicy extends Object
This abstract class is used for choosing the desired number of targets for placing block replicas.
  • Field Details

    • LOG

      public static final org.slf4j.Logger LOG
  • Constructor Details

    • BlockPlacementPolicy

      public BlockPlacementPolicy()
  • Method Details

    • chooseTarget

      public abstract DatanodeStorageInfo[] chooseTarget(String srcPath, int numOfReplicas, org.apache.hadoop.net.Node writer, List<DatanodeStorageInfo> chosen, boolean returnChosenNodes, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, org.apache.hadoop.hdfs.protocol.BlockStoragePolicy storagePolicy, EnumSet<org.apache.hadoop.hdfs.AddBlockFlag> flags)
      Choose numOfReplicas data nodes for writer to re-replicate a block with size blocksize. If that many targets cannot be found, return as many as can be chosen.
      Parameters:
      srcPath - the file to which this chooseTargets is being invoked.
      numOfReplicas - additional number of replicas wanted.
      writer - the writer's machine, null if not in the cluster.
      chosen - datanodes that have been chosen as targets.
      returnChosenNodes - decide if the chosenNodes are returned.
      excludedNodes - datanodes that should not be considered as targets.
      blocksize - size of the data to be written.
      storagePolicy - the storage policy of the file.
      flags - Block placement flags.
      Returns:
      array of DatanodeStorageInfo instances chosen as targets and sorted as a pipeline.
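The default policy's selection rules can be illustrated with a stdlib-only sketch. The class, method, and map names below are assumptions for illustration, not the Hadoop API: the first replica goes on the writer's machine if it is in the cluster, the second on a different rack, the third on the same rack as the second, and a final pass relaxes the rack constraints to "return as many as we can".

```java
import java.util.*;

// Simplified, rack-aware target selection in the spirit of
// BlockPlacementPolicyDefault. All names here are illustrative
// assumptions, not the Hadoop API.
public class RackAwareSketch {
    static List<String> chooseTargets(String writer,
                                      SortedMap<String, String> nodeToRack,
                                      int numOfReplicas) {
        List<String> chosen = new ArrayList<>();
        // 1st replica: the writer's machine, if it is in the cluster
        if (writer != null && nodeToRack.containsKey(writer)) {
            chosen.add(writer);
        }
        for (String node : nodeToRack.keySet()) {
            if (chosen.size() >= numOfReplicas) break;
            if (chosen.contains(node)) continue;
            String rack = nodeToRack.get(node);
            // 2nd replica: must sit on a different rack than the 1st
            if (chosen.size() == 1
                    && rack.equals(nodeToRack.get(chosen.get(0)))) continue;
            // 3rd replica: prefer the same rack as the 2nd
            if (chosen.size() == 2
                    && !rack.equals(nodeToRack.get(chosen.get(1)))) continue;
            chosen.add(node);
        }
        // "return as many as we can": relax the rack constraints
        for (String node : nodeToRack.keySet()) {
            if (chosen.size() >= numOfReplicas) break;
            if (!chosen.contains(node)) chosen.add(node);
        }
        return chosen;
    }
}
```

A real implementation also accounts for excluded nodes, storage types, and node load; this sketch only shows the rack-spreading shape of the pipeline.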
    • chooseTarget

      public DatanodeStorageInfo[] chooseTarget(String srcPath, int numOfReplicas, org.apache.hadoop.net.Node writer, List<DatanodeStorageInfo> chosen, boolean returnChosenNodes, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, org.apache.hadoop.hdfs.protocol.BlockStoragePolicy storagePolicy, EnumSet<org.apache.hadoop.hdfs.AddBlockFlag> flags, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes)
      An overload of chooseTarget that additionally restricts the storage types chosen as targets.
      Parameters:
      storageTypes - storage types that should be used as targets.
    • verifyBlockPlacement

      public abstract BlockPlacementStatus verifyBlockPlacement(org.apache.hadoop.hdfs.protocol.DatanodeInfo[] locs, int numOfReplicas)
      Verify that the block's placement meets the requirements of the placement policy, i.e. that replicas are placed on no fewer than minRacks racks in the system.
      Parameters:
      locs - block with locations
      numOfReplicas - replica number of file to be verified
      Returns:
      the result of verification
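The invariant this method checks can be sketched with plain collections. The names below are illustrative assumptions, not the Hadoop API: a placement is satisfied when the replicas span at least minRacks distinct racks.

```java
import java.util.*;

// Stdlib-only illustration of the check verifyBlockPlacement performs:
// the replicas of a block should span at least minRacks distinct racks.
public class PlacementCheck {
    static boolean meetsMinRacks(List<String> replicaRacks, int minRacks) {
        // count distinct racks among the replica locations
        return new HashSet<>(replicaRacks).size() >= minRacks;
    }

    // number of additional racks needed to satisfy the policy
    static int additionalRacksNeeded(List<String> replicaRacks, int minRacks) {
        return Math.max(0, minRacks - new HashSet<>(replicaRacks).size());
    }
}
```

The returned BlockPlacementStatus in the real API carries both the boolean outcome and a diagnostic, which additionalRacksNeeded approximates here.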
    • chooseReplicasToDelete

      public abstract List<DatanodeStorageInfo> chooseReplicasToDelete(Collection<DatanodeStorageInfo> availableReplicas, Collection<DatanodeStorageInfo> delCandidates, int expectedNumOfReplicas, List<org.apache.hadoop.fs.StorageType> excessTypes, DatanodeDescriptor addedNode, DatanodeDescriptor delNodeHint)
      Select the excess replica storages for deletion, based on either delNodeHint or the excess storage types.
      Parameters:
      availableReplicas - available replicas
      delCandidates - Candidates for deletion. For normal replication, this set is the same with availableReplicas. For striped blocks, this set is a subset of availableReplicas.
      expectedNumOfReplicas - The expected number of replicas remaining in the delCandidates
      excessTypes - type of the storagepolicy
      addedNode - New replica reported
      delNodeHint - Hint for excess storage selection
      Returns:
      Returns the list of excess replicas chosen for deletion
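A minimal sketch of the selection order, under assumed names (not the Hadoop API): honor the deletion hint first, then prefer victims on racks that still host more than one replica so rack diversity is preserved.

```java
import java.util.*;

// Stdlib-only sketch of excess-replica selection in the spirit of
// chooseReplicasToDelete. Names are illustrative assumptions.
public class ExcessPicker {
    static List<String> pickExcess(Map<String, String> nodeToRack,
                                   List<String> candidates,
                                   int expectedNumOfReplicas,
                                   String delNodeHint) {
        List<String> toDelete = new ArrayList<>();
        List<String> remaining = new ArrayList<>(candidates);
        while (remaining.size() > expectedNumOfReplicas) {
            String victim = null;
            // 1) honor the hint, once, if it is still a candidate
            if (delNodeHint != null && remaining.contains(delNodeHint)) {
                victim = delNodeHint;
                delNodeHint = null;
            } else {
                // 2) prefer a node on a rack that still has > 1 replica
                Map<String, Integer> perRack = new HashMap<>();
                for (String n : remaining) {
                    perRack.merge(nodeToRack.get(n), 1, Integer::sum);
                }
                for (String n : remaining) {
                    if (perRack.get(nodeToRack.get(n)) > 1) { victim = n; break; }
                }
                // 3) otherwise take any remaining candidate
                if (victim == null) victim = remaining.get(0);
            }
            remaining.remove(victim);
            toDelete.add(victim);
        }
        return toDelete;
    }
}
```

The real method additionally weighs excess storage types and free space; this sketch keeps only the hint-then-rack ordering.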
    • initialize

      protected abstract void initialize(org.apache.hadoop.conf.Configuration conf, FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap)
      Used to set up a BlockPlacementPolicy object. This should be defined by all implementations of a BlockPlacementPolicy.
      Parameters:
      conf - the configuration object
      stats - retrieve cluster status from here
      clusterMap - cluster topology
      host2datanodeMap - map from host to its datanodes
    • isMovable

      public abstract boolean isMovable(Collection<org.apache.hadoop.hdfs.protocol.DatanodeInfo> candidates, org.apache.hadoop.hdfs.protocol.DatanodeInfo source, org.apache.hadoop.hdfs.protocol.DatanodeInfo target)
      Check if the move is allowed. Used by balancer and other tools.
      Parameters:
      candidates - all replicas including source and target
      source - source replica of the move
      target - target replica of the move
      Returns:
      true if moving the replica from source to target is allowed
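For a rack-aware policy, one natural rule is that a move must not reduce the number of racks the replicas span. The sketch below illustrates that rule with assumed names, not the Hadoop API:

```java
import java.util.*;

// Stdlib-only illustration of an isMovable-style check: the move of a
// replica from source to target is allowed only if the replicas still
// span at least as many racks afterwards.
public class MoveCheck {
    static boolean isMoveAllowed(Map<String, String> nodeToRack,
                                 Collection<String> replicas,
                                 String source, String target) {
        Set<String> racksBefore = new HashSet<>();
        Set<String> racksAfter = new HashSet<>();
        for (String node : replicas) {
            racksBefore.add(nodeToRack.get(node));
            // after the move, the replica on source lives on target
            racksAfter.add(nodeToRack.get(node.equals(source) ? target : node));
        }
        return racksAfter.size() >= racksBefore.size();
    }
}
```

This is the kind of invariant the balancer relies on so that rebalancing never degrades fault tolerance.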
    • adjustSetsWithChosenReplica

      public void adjustSetsWithChosenReplica(Map<String,List<DatanodeStorageInfo>> rackMap, List<DatanodeStorageInfo> moreThanOne, List<DatanodeStorageInfo> exactlyOne, DatanodeStorageInfo cur)
      Adjust rackmap, moreThanOne, and exactlyOne after removing replica on cur.
      Parameters:
      rackMap - a map from rack to replica
      moreThanOne - The List of replica nodes on rack which has more than one replica
      exactlyOne - The List of replica nodes on rack with only one replica
      cur - current replica to remove
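The bookkeeping this method performs can be sketched with plain collections, under assumed names (not the Hadoop API): drop cur from its rack's list, and if that rack is left with a single replica, move the survivor from moreThanOne to exactlyOne.

```java
import java.util.*;

// Stdlib-only sketch of adjustSetsWithChosenReplica's bookkeeping
// after the replica on cur is removed. Names are illustrative.
public class AdjustSketch {
    static void adjust(Map<String, List<String>> rackMap,
                       Map<String, String> nodeToRack,
                       List<String> moreThanOne, List<String> exactlyOne,
                       String cur) {
        String rack = nodeToRack.get(cur);
        List<String> onRack = rackMap.get(rack);
        onRack.remove(cur);
        // cur leaves whichever set it was in
        if (!moreThanOne.remove(cur)) exactlyOne.remove(cur);
        if (onRack.isEmpty()) {
            rackMap.remove(rack);
        } else if (onRack.size() == 1) {
            // the rack's last replica now belongs in exactlyOne
            String survivor = onRack.get(0);
            moreThanOne.remove(survivor);
            exactlyOne.add(survivor);
        }
    }
}
```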
    • getDatanodeInfo

      protected <T> org.apache.hadoop.hdfs.protocol.DatanodeInfo getDatanodeInfo(T datanode)
    • getRack

      protected String getRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo datanode)
      Get the rack string of a data node.
      Returns:
      rack of data node
    • splitNodesWithRack

      public <T> void splitNodesWithRack(Iterable<T> availableSet, Collection<T> candidates, Map<String,List<T>> rackMap, List<T> moreThanOne, List<T> exactlyOne)
      Split data nodes into two sets: one set contains nodes on racks with more than one replica, and the other contains the remaining nodes.
      Parameters:
      availableSet - all the available DataNodes/storages of the block
      candidates - DatanodeStorageInfo/DatanodeInfo to be split into two sets
      rackMap - a map from rack to datanodes
      moreThanOne - contains nodes on racks with more than one replica
      exactlyOne - contains the remaining nodes
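The split can be sketched with plain collections, under assumed names (not the Hadoop API): group candidates by rack into rackMap, then route each rack's nodes into moreThanOne or exactlyOne by the rack's replica count.

```java
import java.util.*;

// Stdlib-only sketch of splitNodesWithRack. Names are illustrative
// assumptions, not the Hadoop API.
public class SplitSketch {
    static void split(Collection<String> candidates,
                      Map<String, String> nodeToRack,
                      Map<String, List<String>> rackMap,
                      List<String> moreThanOne, List<String> exactlyOne) {
        // build rack -> nodes for the candidates
        for (String node : candidates) {
            rackMap.computeIfAbsent(nodeToRack.get(node),
                                    k -> new ArrayList<>()).add(node);
        }
        // racks hosting > 1 replica feed moreThanOne; the rest, exactlyOne
        for (List<String> onRack : rackMap.values()) {
            (onRack.size() > 1 ? moreThanOne : exactlyOne).addAll(onRack);
        }
    }
}
```

These two output lists are exactly the shape adjustSetsWithChosenReplica expects to maintain as replicas are removed.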
    • setExcludeSlowNodesEnabled

      public abstract void setExcludeSlowNodesEnabled(boolean enable)
      Updates the value used for excludeSlowNodesEnabled, which is set by DFSConfigKeys.DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY initially.
      Parameters:
      enable - if true, slow nodes are filtered out when choosing targets for blocks; if false, they are not.
    • getExcludeSlowNodesEnabled

      public abstract boolean getExcludeSlowNodesEnabled()
    • setMinBlocksForWrite

      public abstract void setMinBlocksForWrite(int minBlocksForWrite)
      Updates the value used for minBlocksForWrite, which is set by DFSConfigKeys.DFS_NAMENODE_BLOCKPLACEMENTPOLICY_MIN_BLOCKS_FOR_WRITE_KEY.
      Parameters:
      minBlocksForWrite - the minimum number of blocks required for write operations.
    • getMinBlocksForWrite

      public abstract int getMinBlocksForWrite()