Class BlockPlacementPolicyDefault

java.lang.Object
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Direct Known Subclasses:
AvailableSpaceBlockPlacementPolicy, BlockPlacementPolicyRackFaultTolerant, BlockPlacementPolicyWithNodeGroup, BlockPlacementPolicyWithUpgradeDomain

@Private public class BlockPlacementPolicyDefault extends BlockPlacementPolicy
The class is responsible for choosing the desired number of targets for placing block replicas. The replica placement strategy is: if the writer is on a datanode, the 1st replica is placed on the local machine by default. (By passing the CreateFlag.NO_LOCAL_WRITE flag, the client can request that no block replica be placed on the local datanode; subsequent replicas still follow the default block placement policy.) If the writer is not on a datanode, the 1st replica is placed on a random node. The 2nd replica is placed on a datanode on a different rack. The 3rd replica is placed on a different node on the same rack as the second replica.
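As an illustration of the strategy above, a toy rack-selection routine; node and rack names are hypothetical, and the real policy operates on NetworkTopology and DatanodeDescriptor objects rather than strings:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy model of the default placement strategy; not the real implementation. */
public class PlacementSketch {
    /** Pick "rack/node" labels for three replicas given the writer's location.
     *  writerRack == null models a writer that is not on a datanode. */
    static List<String> chooseThreeReplicas(String writerNode, String writerRack,
                                            List<String> racks, Random rng) {
        List<String> targets = new ArrayList<>();
        // 1st replica: the local node if the writer is a datanode, else a random node.
        String firstRack = (writerRack != null) ? writerRack
                : racks.get(rng.nextInt(racks.size()));
        targets.add(firstRack + "/" + (writerNode != null ? writerNode : "randomNode"));
        // 2nd replica: a node on a different rack.
        String secondRack = firstRack;
        while (secondRack.equals(firstRack)) {
            secondRack = racks.get(rng.nextInt(racks.size()));
        }
        targets.add(secondRack + "/nodeA");
        // 3rd replica: a different node on the same rack as the 2nd replica.
        targets.add(secondRack + "/nodeB");
        return targets;
    }
}
```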
  • Field Details

    • considerLoad

      protected boolean considerLoad
    • considerLoadFactor

      protected double considerLoadFactor
    • clusterMap

      protected org.apache.hadoop.net.NetworkTopology clusterMap
    • host2datanodeMap

      protected org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap
    • heartbeatInterval

      protected long heartbeatInterval
    • tolerateHeartbeatMultiplier

      protected int tolerateHeartbeatMultiplier
      The replica deletion policy tolerates a datanode missing this many heartbeats before treating it as inactive.
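The staleness cutoff implied by these two fields can be sketched as follows; this assumes both values are in milliseconds, and the exact check in the stock policy may differ:

```java
/** Sketch: a node whose last heartbeat is older than
 *  heartbeatInterval * tolerateHeartbeatMultiplier is treated as
 *  inactive when picking a replica to delete. */
public class HeartbeatSketch {
    static boolean isStale(long lastHeartbeatMs, long nowMs,
                           long heartbeatIntervalMs, int tolerateMultiplier) {
        // Tolerated window = interval * multiplier; beyond that, the node is stale.
        return nowMs - lastHeartbeatMs > heartbeatIntervalMs * tolerateMultiplier;
    }
}
```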
  • Constructor Details

    • BlockPlacementPolicyDefault

      protected BlockPlacementPolicyDefault()
  • Method Details

    • initialize

      public void initialize(org.apache.hadoop.conf.Configuration conf, FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap)
      Description copied from class: BlockPlacementPolicy
      Used to setup a BlockPlacementPolicy object. This should be defined by all implementations of a BlockPlacementPolicy.
      Specified by:
      initialize in class BlockPlacementPolicy
      Parameters:
      conf - the configuration object
      stats - retrieve cluster status from here
      clusterMap - cluster topology
      host2datanodeMap - the mapping from hosts to datanodes
    • chooseTarget

      public DatanodeStorageInfo[] chooseTarget(String srcPath, int numOfReplicas, org.apache.hadoop.net.Node writer, List<DatanodeStorageInfo> chosenNodes, boolean returnChosenNodes, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, org.apache.hadoop.hdfs.protocol.BlockStoragePolicy storagePolicy, EnumSet<org.apache.hadoop.hdfs.AddBlockFlag> flags)
      Description copied from class: BlockPlacementPolicy
      Choose numOfReplicas datanodes for the writer to replicate a block of size blocksize. If not enough targets can be found, return as many as possible.
      Specified by:
      chooseTarget in class BlockPlacementPolicy
      Parameters:
      srcPath - the file for which chooseTarget is being invoked.
      numOfReplicas - additional number of replicas wanted.
      writer - the writer's machine; null if not in the cluster.
      chosenNodes - datanodes that have been chosen as targets.
      returnChosenNodes - whether the chosenNodes are included in the returned array.
      excludedNodes - datanodes that should not be considered as targets.
      blocksize - size of the data to be written.
      flags - Block placement flags.
      Returns:
      array of DatanodeStorageInfo instances chosen as targets and sorted as a pipeline.
    • chooseTarget

      public DatanodeStorageInfo[] chooseTarget(String srcPath, int numOfReplicas, org.apache.hadoop.net.Node writer, List<DatanodeStorageInfo> chosen, boolean returnChosenNodes, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, org.apache.hadoop.hdfs.protocol.BlockStoragePolicy storagePolicy, EnumSet<org.apache.hadoop.hdfs.AddBlockFlag> flags, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes)
      Overrides:
      chooseTarget in class BlockPlacementPolicy
      storageTypes - storage types that should be used as targets.
    • chooseFavouredNodes

      protected void chooseFavouredNodes(String src, int numOfReplicas, List<DatanodeDescriptor> favoredNodes, Set<org.apache.hadoop.net.Node> favoriteAndExcludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • getMaxNodesPerRack

      protected int[] getMaxNodesPerRack(int numOfChosen, int numOfReplicas)
      Calculate the maximum number of replicas to allocate per rack. It also limits the total number of replicas to the total number of nodes in the cluster. Caller should adjust the replica count to the return value.
      Parameters:
      numOfChosen - The number of already chosen nodes.
      numOfReplicas - The number of additional nodes to allocate.
      Returns:
      integer array. Index 0: The number of nodes allowed to allocate in addition to already chosen nodes. Index 1: The maximum allowed number of nodes per rack. This is independent of the number of chosen nodes, as it is calculated using the target number of replicas.
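The contract above can be sketched as follows; the cluster-size cap follows the documented behavior, while the per-rack formula is an assumption for illustration, not the exact stock code:

```java
/** Sketch of the documented contract: index 0 caps the additional replicas
 *  by cluster size, index 1 caps the number of replicas per rack. */
public class MaxNodesPerRackSketch {
    static int[] getMaxNodesPerRack(int numOfChosen, int numOfReplicas,
                                    int clusterSize, int numOfRacks) {
        int total = numOfChosen + numOfReplicas;
        if (total > clusterSize) {            // cannot place more replicas than nodes
            numOfReplicas -= (total - clusterSize);
            total = clusterSize;
        }
        if (numOfRacks <= 1 || total <= 1) {  // single rack: no per-rack limit helps
            return new int[]{numOfReplicas, total};
        }
        // Assumed per-rack spread: derived from the target replica count,
        // independent of how many nodes are already chosen.
        int maxNodesPerRack = (total - 1) / numOfRacks + 2;
        return new int[]{numOfReplicas, maxNodesPerRack};
    }
}
```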
    • chooseTargetInOrder

      protected org.apache.hadoop.net.Node chooseTargetInOrder(int numOfReplicas, org.apache.hadoop.net.Node writer, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, boolean newBlock, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseLocalStorage

      protected DatanodeStorageInfo chooseLocalStorage(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseLocalOrFavoredStorage

      protected DatanodeStorageInfo chooseLocalOrFavoredStorage(org.apache.hadoop.net.Node localOrFavoredNode, boolean isFavoredNode, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Choose storage of local or favored node.
      Parameters:
      localOrFavoredNode - local or favored node
      isFavoredNode - if target node is favored node
      excludedNodes - datanodes that should not be considered as targets
      blocksize - size of the data to be written
      maxNodesPerRack - max nodes allowed per rack
      results - the target nodes already chosen
      avoidStaleNodes - avoid stale nodes in replica choosing
      storageTypes - storage type to be considered for target
      Returns:
      storage of local or favored node (not chosen node)
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseLocalStorage

      protected DatanodeStorageInfo chooseLocalStorage(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes, boolean fallbackToLocalRack) throws BlockPlacementPolicy.NotEnoughReplicasException
      Choose localMachine as the target. If localMachine is not available, choose a node on the same rack.
      Returns:
      the chosen storage
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • addToExcludedNodes

      protected int addToExcludedNodes(DatanodeDescriptor localMachine, Set<org.apache.hadoop.net.Node> excludedNodes)
      Add localMachine and related nodes to excludedNodes for the next replica choice. Subclasses can add more nodes within the same failure domain as localMachine.
      Returns:
      number of new excluded nodes
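A subclass override might widen the exclusion to a whole failure domain. A minimal sketch with a hypothetical node-to-domain map standing in for what a real subclass would derive from the topology (the real method works on DatanodeDescriptor and Node sets):

```java
import java.util.Map;
import java.util.Set;

/** Sketch: exclude every node sharing the chosen node's failure domain. */
public class ExcludeSketch {
    static int addToExcludedNodes(String chosenNode, Map<String, String> nodeToDomain,
                                  Set<String> excludedNodes) {
        String domain = nodeToDomain.get(chosenNode);
        int added = 0;
        for (Map.Entry<String, String> e : nodeToDomain.entrySet()) {
            if (e.getValue().equals(domain) && excludedNodes.add(e.getKey())) {
                added++;  // count only newly excluded nodes, matching the contract above
            }
        }
        return added;
    }
}
```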
    • chooseLocalRack

      protected DatanodeStorageInfo chooseLocalRack(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Choose one node from the rack that localMachine is on. If no such node is available, choose one node from the rack that holds a second replica. If still no such node is available, choose a random node in the cluster.
      Returns:
      the chosen node
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseRemoteRack

      protected void chooseRemoteRack(int numOfReplicas, DatanodeDescriptor localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxReplicasPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Choose numOfReplicas nodes from the racks that localMachine is NOT on. If not enough nodes are available, choose the remaining ones from the local rack.
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseRandom

      protected DatanodeStorageInfo chooseRandom(String scope, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Randomly choose one target from the given scope.
      Returns:
      the chosen storage, if there is any.
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseRandom

      protected DatanodeStorageInfo chooseRandom(int numOfReplicas, String scope, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType,Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasException
      Randomly choose numOfReplicas targets from the given scope.
      Returns:
      the first chosen node, if there is any.
      Throws:
      BlockPlacementPolicy.NotEnoughReplicasException
    • chooseDataNode

      protected DatanodeDescriptor chooseDataNode(String scope, Collection<org.apache.hadoop.net.Node> excludedNodes)
      Choose a datanode from the given scope.
      Returns:
      the chosen node, if there is any.
    • chooseDataNode

      protected DatanodeDescriptor chooseDataNode(String scope, Collection<org.apache.hadoop.net.Node> excludedNodes, org.apache.hadoop.fs.StorageType type)
      Choose a datanode from the given scope with specified storage type.
      Returns:
      the chosen node, if there is any.
    • logNodeIsNotChosen

      protected static void logNodeIsNotChosen(DatanodeDescriptor node, BlockPlacementPolicyDefault.NodeNotChosenReason reason, String reasonDetails)
    • verifyBlockPlacement

      public BlockPlacementStatus verifyBlockPlacement(org.apache.hadoop.hdfs.protocol.DatanodeInfo[] locs, int numberOfReplicas)
      Description copied from class: BlockPlacementPolicy
      Verify that the block's placement meets the requirement of the placement policy, i.e. replicas are placed on no fewer than minRacks racks in the system.
      Specified by:
      verifyBlockPlacement in class BlockPlacementPolicy
      Parameters:
      locs - block with locations
      numberOfReplicas - replica number of file to be verified
      Returns:
      the result of verification
    • chooseReplicaToDelete

      @VisibleForTesting public DatanodeStorageInfo chooseReplicaToDelete(Collection<DatanodeStorageInfo> moreThanOne, Collection<DatanodeStorageInfo> exactlyOne, List<org.apache.hadoop.fs.StorageType> excessTypes, Map<String,List<DatanodeStorageInfo>> rackMap)
      Decide whether deleting the specified replica of the block still makes the block conform to the configured block placement policy.
      Parameters:
      moreThanOne - replica locations of this block on racks that hold more than one replica.
      exactlyOne - replica locations of this block on racks that hold exactly one replica.
      excessTypes - The excess StorageTypes according to the BlockStoragePolicy.
      Returns:
      the replica that is the best candidate for deletion
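One common description of the stock deletion heuristic is: delete from a node that has been silent the longest, otherwise from the node with the least remaining space. A toy version under that assumption (the real method works on DatanodeStorageInfo and the rack sets above):

```java
import java.util.Collection;

/** Toy replica record; fields stand in for datanode heartbeat and free space. */
public class ReplicaSketch {
    public final String node;
    public final long lastHeartbeatMs;
    public final long remainingBytes;

    public ReplicaSketch(String node, long lastHeartbeatMs, long remainingBytes) {
        this.node = node;
        this.lastHeartbeatMs = lastHeartbeatMs;
        this.remainingBytes = remainingBytes;
    }

    /** Sketch of the assumed heuristic: a stale node wins; otherwise the
     *  node with the least free space is relieved. */
    static ReplicaSketch chooseReplicaToDelete(Collection<ReplicaSketch> candidates,
                                               long nowMs, long staleWindowMs) {
        ReplicaSketch oldest = null, smallest = null;
        for (ReplicaSketch r : candidates) {
            if (oldest == null || r.lastHeartbeatMs < oldest.lastHeartbeatMs) oldest = r;
            if (smallest == null || r.remainingBytes < smallest.remainingBytes) smallest = r;
        }
        return (oldest != null && nowMs - oldest.lastHeartbeatMs > staleWindowMs)
                ? oldest : smallest;
    }
}
```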
    • chooseReplicasToDelete

      public List<DatanodeStorageInfo> chooseReplicasToDelete(Collection<DatanodeStorageInfo> availableReplicas, Collection<DatanodeStorageInfo> delCandidates, int expectedNumOfReplicas, List<org.apache.hadoop.fs.StorageType> excessTypes, DatanodeDescriptor addedNode, DatanodeDescriptor delNodeHint)
      Description copied from class: BlockPlacementPolicy
      Select the excess replica storages for deletion, based on the delNodeHint and/or the excess storage types.
      Specified by:
      chooseReplicasToDelete in class BlockPlacementPolicy
      Parameters:
      availableReplicas - available replicas
      delCandidates - Candidates for deletion. For normal replication, this set is the same with availableReplicas. For striped blocks, this set is a subset of availableReplicas.
      expectedNumOfReplicas - The expected number of replicas remaining in the delCandidates
      excessTypes - the excess StorageTypes according to the storage policy
      addedNode - New replica reported
      delNodeHint - Hint for excess storage selection
      Returns:
      Returns the list of excess replicas chosen for deletion
    • isMovable

      public boolean isMovable(Collection<org.apache.hadoop.hdfs.protocol.DatanodeInfo> locs, org.apache.hadoop.hdfs.protocol.DatanodeInfo source, org.apache.hadoop.hdfs.protocol.DatanodeInfo target)
      Description copied from class: BlockPlacementPolicy
      Check if the move is allowed. Used by balancer and other tools.
      Specified by:
      isMovable in class BlockPlacementPolicy
      Parameters:
      locs - all replicas including source and target
      source - source replica of the move
      target - target replica of the move
    • pickupReplicaSet

      protected Collection<DatanodeStorageInfo> pickupReplicaSet(Collection<DatanodeStorageInfo> moreThanOne, Collection<DatanodeStorageInfo> exactlyOne, Map<String,List<DatanodeStorageInfo>> rackMap)
      Pick the set of replica nodes from which an over-replicated block's excess replica should be deleted. The first set contains replica nodes on racks holding more than one replica, while the second set contains the remaining replica nodes. If there is only 1 rack, pick all. If there are 2 racks, pick the replicas that share a rack with another replica; if there are no such replicas, pick all. If there are 3 or more racks, pick all.
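The core of the rule, preferring replicas that share a rack and falling back to all replicas, can be sketched as (a minimal reading of the description above):

```java
import java.util.Collection;

public class PickupSketch {
    /** Sketch: if any rack holds more than one replica, those replicas are
     *  the deletion candidates; otherwise every replica is. */
    static <T> Collection<T> pickupReplicaSet(Collection<T> moreThanOne,
                                              Collection<T> exactlyOne) {
        return moreThanOne.isEmpty() ? exactlyOne : moreThanOne;
    }
}
```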
    • setExcludeSlowNodesEnabled

      public void setExcludeSlowNodesEnabled(boolean enable)
      Description copied from class: BlockPlacementPolicy
      Updates the value used for excludeSlowNodesEnabled, which is set by DFSConfigKeys.DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY initially.
      Specified by:
      setExcludeSlowNodesEnabled in class BlockPlacementPolicy
      Parameters:
      enable - if true, slow nodes are filtered out when choosing targets for blocks; if false, they are not.
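The initial value comes from hdfs-site.xml. The property name below is assumed from the DFSConfigKeys constant named above and may vary by release:

```xml
<!-- hdfs-site.xml: filter out slow datanodes when choosing block targets.
     Key name assumed from DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY. -->
<property>
  <name>dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled</name>
  <value>true</value>
</property>
```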
    • getExcludeSlowNodesEnabled

      public boolean getExcludeSlowNodesEnabled()
      Specified by:
      getExcludeSlowNodesEnabled in class BlockPlacementPolicy
    • setMinBlocksForWrite

      public void setMinBlocksForWrite(int minBlocksForWrite)
      Description copied from class: BlockPlacementPolicy
      Updates the value used for minBlocksForWrite, which is set by DFSConfigKeys.DFS_NAMENODE_BLOCKPLACEMENTPOLICY_MIN_BLOCKS_FOR_WRITE_KEY.
      Specified by:
      setMinBlocksForWrite in class BlockPlacementPolicy
      Parameters:
      minBlocksForWrite - the minimum number of blocks required for write operations.
    • getMinBlocksForWrite

      public int getMinBlocksForWrite()
      Specified by:
      getMinBlocksForWrite in class BlockPlacementPolicy