Class DatanodeAdminManager
A DataNode can be decommissioned in a few situations:
- If a DN is dead, it is decommissioned immediately.
- If a DN is alive, it is decommissioned after all of its blocks are sufficiently replicated. Merely under-replicated blocks do not block decommissioning as long as they are above a replication threshold.
DECOMMISSION_INPROGRESS nodes that become dead do not progress to DECOMMISSIONED until they become live again. This prevents potential durability loss for singly-replicated blocks (see HDFS-6791).
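The decommission rules above can be read as a small state machine. The following is a minimal sketch under that reading, not the Hadoop implementation: the class `DecommissionSketch`, its `State` enum, and the methods `onStart` and `onCheck` are hypothetical names introduced only for illustration.

```java
// Hypothetical, simplified model of the decommission rules described above.
// None of these names come from Hadoop; they exist only for this sketch.
final class DecommissionSketch {
    enum State { NORMAL, DECOMMISSION_INPROGRESS, DECOMMISSIONED }

    /** State of a node right after decommissioning is requested. */
    static State onStart(boolean alive) {
        // Rule from the text: a dead node is decommissioned immediately,
        // a live node must first wait for its blocks.
        return alive ? State.DECOMMISSION_INPROGRESS : State.DECOMMISSIONED;
    }

    /** One monitoring pass over a node that may be decommissioning. */
    static State onCheck(State current, boolean alive, boolean allBlocksSufficient) {
        if (current != State.DECOMMISSION_INPROGRESS) {
            return current; // nothing to do for other states
        }
        if (!alive) {
            // Rule from the text: an in-progress node that dies does not
            // progress until it becomes live again.
            return current;
        }
        return allBlocksSufficient ? State.DECOMMISSIONED : current;
    }

    public static void main(String[] args) {
        System.out.println(onStart(false));
        System.out.println(onCheck(State.DECOMMISSION_INPROGRESS, false, true));
        System.out.println(onCheck(State.DECOMMISSION_INPROGRESS, true, true));
    }
}
```

Keeping a dead in-progress node in DECOMMISSION_INPROGRESS is the part that guards against losing the last replica of a block, as the HDFS-6791 reference above indicates.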
DataNodes can also be put into a maintenance state for short-duration maintenance operations. Unlike decommissioning, entering maintenance does not always require blocks to be re-replicated. When the blocks have at least dfs.namenode.maintenance.replication.min replicas, the DataNode transitions directly to the IN_MAINTENANCE state. Otherwise, as with decommissioning, it transitions to the ENTERING_MAINTENANCE state, waits for the blocks to be sufficiently replicated, and then transitions to IN_MAINTENANCE. The block replication factor is relaxed only until the maintenance expiry time. If the DataNode has not transitioned or rejoined the cluster by the expiry time, its blocks are re-replicated just as in the decommissioning case, to avoid read or write performance degradation.
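The maintenance transitions can be sketched in the same spirit. This is a simplified model under stated assumptions, not the actual implementation: `MaintenanceSketch` and its methods are hypothetical, the per-block replica counts are assumed to be the live replicas available on other nodes, and the expiry time is assumed to be an absolute timestamp in milliseconds.

```java
// Hypothetical, simplified model of entering maintenance.
// Names and inputs are assumptions made for illustration only.
final class MaintenanceSketch {
    enum State { ENTERING_MAINTENANCE, IN_MAINTENANCE }

    /**
     * Picks the initial maintenance state of a node.
     *
     * @param liveReplicasElsewhere assumed input: for each block on the node,
     *        the number of live replicas held by other nodes
     * @param minMaintenanceReplication stands in for the value of
     *        dfs.namenode.maintenance.replication.min
     */
    static State stateOnStart(int[] liveReplicasElsewhere, int minMaintenanceReplication) {
        for (int live : liveReplicasElsewhere) {
            if (live < minMaintenanceReplication) {
                // At least one block still needs more replicas first.
                return State.ENTERING_MAINTENANCE;
            }
        }
        return State.IN_MAINTENANCE;
    }

    /** Assumes the expiry time is an absolute timestamp in milliseconds. */
    static boolean maintenanceExpired(long nowMs, long maintenanceExpireTimeInMs) {
        return nowMs >= maintenanceExpireTimeInMs;
    }

    public static void main(String[] args) {
        System.out.println(stateOnStart(new int[] {1, 2, 2}, 1));
        System.out.println(stateOnStart(new int[] {0, 2, 2}, 1));
        long now = System.currentTimeMillis();
        System.out.println(maintenanceExpired(now, now + 60_000L));
    }
}
```

In this model, once `maintenanceExpired` returns true the relaxed replication requirement no longer applies and the blocks would be treated as in the decommissioning case.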
This class depends on the FSNamesystem lock for synchronization.
-
Method Summary
Modifier and Type, Method, Description:
- int getBlocksPerLock()
- int getNumNodesChecked()
- int getNumPendingNodes()
- int getNumTrackedNodes()
- int getPendingRepLimit()
- protected boolean isSufficient(BlockInfo block, BlockCollection bc, NumberReplicas numberReplicas, boolean isDecommission, boolean isMaintenance): Checks whether a block is sufficiently replicated/stored for DECOMMISSION_INPROGRESS or ENTERING_MAINTENANCE datanodes.
- protected void logBlockReplicationInfo(BlockInfo block, BlockCollection bc, DatanodeDescriptor srcNode, NumberReplicas num, Iterable<DatanodeStorageInfo> storages)
- void refreshBlocksPerLock(int blocksPerLock, String key)
- void refreshPendingRepLimit(int pendingRepLimit, String key)
- protected void setDecommissioned
- protected void setInMaintenance
- void startDecommission: Start decommissioning the specified datanode.
- void startMaintenance(DatanodeDescriptor node, long maintenanceExpireTimeInMS): Start maintenance of the specified datanode.
- void stopDecommission: Stop decommissioning the specified datanode.
- void stopMaintenance: Stop maintenance of the specified datanode.
-
Method Details
-
startDecommission
Start decommissioning the specified datanode.
Parameters: node
-
stopDecommission
Stop decommissioning the specified datanode.
Parameters: node
-
startMaintenance
@VisibleForTesting public void startMaintenance(DatanodeDescriptor node, long maintenanceExpireTimeInMS)
Start maintenance of the specified datanode.
Parameters: node
-
stopMaintenance
Stop maintenance of the specified datanode.
Parameters: node
-
setDecommissioned
-
setInMaintenance
-
isSufficient
protected boolean isSufficient(BlockInfo block, BlockCollection bc, NumberReplicas numberReplicas, boolean isDecommission, boolean isMaintenance)
Checks whether a block is sufficiently replicated/stored for DECOMMISSION_INPROGRESS or ENTERING_MAINTENANCE datanodes. For replicated blocks or striped blocks, full-strength replication or storage is not always necessary, hence "sufficient".
Returns: true if sufficient, else false.
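To illustrate what "sufficient" rather than full-strength could mean at the block level, here is a minimal sketch that works on plain replica counts. It is an assumption-laden simplification and not the body of the real method: `SufficiencySketch`, the `threshold` parameter, and the integer inputs are all hypothetical, and the real method takes BlockInfo, BlockCollection and NumberReplicas arguments instead.

```java
// Hypothetical, count-based model of a block-level sufficiency check.
// It only mirrors the rules stated in this page's prose.
final class SufficiencySketch {
    /**
     * @param live            live replicas of the block (assumed input)
     * @param expected        full-strength replication of the block
     * @param threshold       hypothetical lower bound that still allows decommissioning
     * @param minMaintenance  stands in for dfs.namenode.maintenance.replication.min
     */
    static boolean isSufficient(int live, int expected, int threshold,
                                int minMaintenance,
                                boolean isDecommission, boolean isMaintenance) {
        if (isMaintenance && live >= minMaintenance) {
            return true; // relaxed requirement while entering maintenance
        }
        if (live >= expected) {
            return true; // fully replicated
        }
        // Under-replicated, but tolerated for decommission above the threshold.
        return isDecommission && live >= threshold;
    }

    public static void main(String[] args) {
        System.out.println(isSufficient(2, 3, 2, 1, true, false));
        System.out.println(isSufficient(1, 3, 2, 1, true, false));
        System.out.println(isSufficient(1, 3, 2, 1, false, true));
    }
}
```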
-
logBlockReplicationInfo
protected void logBlockReplicationInfo(BlockInfo block, BlockCollection bc, DatanodeDescriptor srcNode, NumberReplicas num, Iterable<DatanodeStorageInfo> storages)
-
getNumPendingNodes
@VisibleForTesting public int getNumPendingNodes()
-
getNumTrackedNodes
@VisibleForTesting public int getNumTrackedNodes()
-
getNumNodesChecked
@VisibleForTesting public int getNumNodesChecked()
-
getPendingNodes
-
refreshPendingRepLimit
-
getPendingRepLimit
@VisibleForTesting public int getPendingRepLimit()
-
refreshBlocksPerLock
-
getBlocksPerLock
@VisibleForTesting public int getBlocksPerLock()
-