Class BlockPlacementPolicyWithNodeGroup
java.lang.Object
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup
The class is responsible for choosing the desired number of targets
for placing block replicas on environment with node-group layer.
The replica placement strategy is adjusted to:
If the writer is on a datanode, the 1st replica is placed on the local
node(or local node-group or on local rack), otherwise a random datanode.
The 2nd replica is placed on a datanode that is on a different rack with 1st
replica node.
The 3rd replica is placed on a datanode which is on a different node-group
but the same rack as the second replica node.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
BlockPlacementPolicyDefault.NodeNotChosenReasonNested classes/interfaces inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
BlockPlacementPolicy.NotEnoughReplicasException -
Field Summary
Fields inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
clusterMap, considerLoad, considerLoadFactor, heartbeatInterval, host2datanodeMap, tolerateHeartbeatMultiplierFields inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
LOG -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprotected intaddToExcludedNodes(DatanodeDescriptor chosenNode, Set<org.apache.hadoop.net.Node> excludedNodes) Find other nodes in the same nodegroup of localMachine and add them into excludeNodes as replica should not be duplicated for nodes within the same nodegroupprotected voidchooseFavouredNodes(String src, int numOfReplicas, List<DatanodeDescriptor> favoredNodes, Set<org.apache.hadoop.net.Node> favoriteAndExcludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) choose all good favored nodes as target.protected DatanodeStorageInfochooseLocalRack(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) Choose one node from the rack that localMachine is on.protected DatanodeStorageInfochooseLocalStorage(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes, boolean fallbackToNodeGroupAndLocalRack) choose local node of localMachine as the target.protected voidchooseRemoteRack(int numOfReplicas, DatanodeDescriptor localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxReplicasPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) Choose numOfReplicas nodes from the racks that localMachine is NOT on.protected StringgetRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo cur) Get rack string from a data nodevoidinitialize(org.apache.hadoop.conf.Configuration conf, FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap) Used to setup a BlockPlacementPolicy object.booleanisMovable(Collection<org.apache.hadoop.hdfs.protocol.DatanodeInfo> locs, org.apache.hadoop.hdfs.protocol.DatanodeInfo source, org.apache.hadoop.hdfs.protocol.DatanodeInfo target) Check if there are any replica (other than source) on the same node group with target.pickupReplicaSet(Collection<DatanodeStorageInfo> first, Collection<DatanodeStorageInfo> second, Map<String, List<DatanodeStorageInfo>> rackMap) Pick up replica node set for deleting replica as over-replicated.verifyBlockPlacement(org.apache.hadoop.hdfs.protocol.DatanodeInfo[] locs, int numberOfReplicas) Verify if the block's placement meets requirement of placement policy, i.e. replicas are placed on no less than minRacks racks in the system.Methods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
chooseDataNode, chooseDataNode, chooseLocalOrFavoredStorage, chooseLocalStorage, chooseRandom, chooseRandom, chooseReplicasToDelete, chooseReplicaToDelete, chooseTarget, chooseTarget, chooseTargetInOrder, getExcludeSlowNodesEnabled, getMaxNodesPerRack, getMinBlocksForWrite, logNodeIsNotChosen, setExcludeSlowNodesEnabled, setMinBlocksForWriteMethods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
adjustSetsWithChosenReplica, getDatanodeInfo, splitNodesWithRack
-
Constructor Details
-
BlockPlacementPolicyWithNodeGroup
protected BlockPlacementPolicyWithNodeGroup()
-
-
Method Details
-
initialize
public void initialize(org.apache.hadoop.conf.Configuration conf, FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap, org.apache.hadoop.hdfs.server.blockmanagement.Host2NodesMap host2datanodeMap) Description copied from class:BlockPlacementPolicyUsed to setup a BlockPlacementPolicy object. This should be defined by all implementations of a BlockPlacementPolicy.- Overrides:
initializein classBlockPlacementPolicyDefault- Parameters:
conf- the configuration objectstats- retrieve cluster status from hereclusterMap- cluster topology
-
chooseFavouredNodes
protected void chooseFavouredNodes(String src, int numOfReplicas, List<DatanodeDescriptor> favoredNodes, Set<org.apache.hadoop.net.Node> favoriteAndExcludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasExceptionchoose all good favored nodes as target. If no enough targets, then choose one replica from each bad favored node's node group.- Overrides:
chooseFavouredNodesin classBlockPlacementPolicyDefault- Throws:
BlockPlacementPolicy.NotEnoughReplicasException
-
chooseLocalStorage
protected DatanodeStorageInfo chooseLocalStorage(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes, boolean fallbackToNodeGroupAndLocalRack) throws BlockPlacementPolicy.NotEnoughReplicasExceptionchoose local node of localMachine as the target. If localMachine is not available, will fallback to nodegroup/rack when flag fallbackToNodeGroupAndLocalRack is set.- Overrides:
chooseLocalStoragein classBlockPlacementPolicyDefault- Returns:
- the chosen node
- Throws:
BlockPlacementPolicy.NotEnoughReplicasException
-
chooseLocalRack
protected DatanodeStorageInfo chooseLocalRack(org.apache.hadoop.net.Node localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasExceptionDescription copied from class:BlockPlacementPolicyDefaultChoose one node from the rack that localMachine is on. if no such node is available, choose one node from the rack where a second replica is on. if still no such node is available, choose a random node in the cluster.- Overrides:
chooseLocalRackin classBlockPlacementPolicyDefault- Returns:
- the chosen node
- Throws:
BlockPlacementPolicy.NotEnoughReplicasException
-
chooseRemoteRack
protected void chooseRemoteRack(int numOfReplicas, DatanodeDescriptor localMachine, Set<org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxReplicasPerRack, List<DatanodeStorageInfo> results, boolean avoidStaleNodes, EnumMap<org.apache.hadoop.fs.StorageType, Integer> storageTypes) throws BlockPlacementPolicy.NotEnoughReplicasExceptionDescription copied from class:BlockPlacementPolicyDefaultChoose numOfReplicas nodes from the racks that localMachine is NOT on. If not enough nodes are available, choose the remaining ones from the local rack- Overrides:
chooseRemoteRackin classBlockPlacementPolicyDefault- Throws:
BlockPlacementPolicy.NotEnoughReplicasException
-
getRack
Description copied from class:BlockPlacementPolicyGet rack string from a data node- Overrides:
getRackin classBlockPlacementPolicy- Returns:
- rack of data node
-
addToExcludedNodes
protected int addToExcludedNodes(DatanodeDescriptor chosenNode, Set<org.apache.hadoop.net.Node> excludedNodes) Find other nodes in the same nodegroup of localMachine and add them into excludeNodes as replica should not be duplicated for nodes within the same nodegroup- Overrides:
addToExcludedNodesin classBlockPlacementPolicyDefault- Returns:
- number of new excluded nodes
-
pickupReplicaSet
public Collection<DatanodeStorageInfo> pickupReplicaSet(Collection<DatanodeStorageInfo> first, Collection<DatanodeStorageInfo> second, Map<String, List<DatanodeStorageInfo>> rackMap) Pick up replica node set for deleting replica as over-replicated. First set contains replica nodes on rack with more than one replica while second set contains remaining replica nodes. If first is not empty, divide first set into two subsets: moreThanOne contains nodes on nodegroup with more than one replica exactlyOne contains the remaining nodes in first set then pickup priSet if not empty. If first is empty, then pick second.- Overrides:
pickupReplicaSetin classBlockPlacementPolicyDefault
-
isMovable
public boolean isMovable(Collection<org.apache.hadoop.hdfs.protocol.DatanodeInfo> locs, org.apache.hadoop.hdfs.protocol.DatanodeInfo source, org.apache.hadoop.hdfs.protocol.DatanodeInfo target) Check if there are any replica (other than source) on the same node group with target. If true, then target is not a good candidate for placing specific replica as we don't want 2 replicas under the same nodegroup.- Overrides:
isMovablein classBlockPlacementPolicyDefault- Parameters:
locs- all replicas including source and targetsource- source replica of the movetarget- target replica of the move- Returns:
- true if there are any replica (other than source) on the same node group with target
-
verifyBlockPlacement
public BlockPlacementStatus verifyBlockPlacement(org.apache.hadoop.hdfs.protocol.DatanodeInfo[] locs, int numberOfReplicas) Description copied from class:BlockPlacementPolicyVerify if the block's placement meets requirement of placement policy, i.e. replicas are placed on no less than minRacks racks in the system.- Overrides:
verifyBlockPlacementin classBlockPlacementPolicyDefault- Parameters:
locs- block with locationsnumberOfReplicas- replica number of file to be verified- Returns:
- the result of verification
-