Class NetworkTopology
- Direct Known Subclasses:
NetworkTopologyWithNodeGroup
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type, Field: Description
static final String DEFAULT_RACK
static final org.slf4j.Logger LOG
protected ReadWriteLock netlock: the lock used to manage access
protected int numOfRacks: rack counter
-
Constructor Summary
Constructors
NetworkTopology()
-
Method Summary
Modifier and Type, Method: Description
void add(Node node): Add a leaf node. Update node counter and rack counter if necessary.
Node chooseRandom(String scope): Randomly choose a node.
protected Node chooseRandom(String scope, String excludedScope, Collection<Node> excludedNodes)
Node chooseRandom(String scope, Collection<Node> excludedNodes): Randomly choose one node from scope.
boolean contains(Node node): Check if the tree contains the given node.
int countNumOfAvailableNodes(String scope, Collection<Node> excludedNodes): Return the number of leaves in scope but not in excludedNodes. If scope starts with ~, return the number of nodes that are in neither scope nor excludedNodes.
void decommissionNode(Node node): Update the empty rack count when a node is removed, for example on decommission.
List<Node> getDatanodesInRack(String loc): Given a string representation of a rack, return its children.
int getDistance(Node node1, Node node2): Return the distance between two nodes. It is assumed that the distance from one node to its parent is 1. The distance between two nodes is calculated by summing up their distances to their closest common ancestor.
static int getDistanceByPath(Node node1, Node node2): Return the distance between two nodes by comparing their network paths, without checking by reference whether they belong to the same ancestor node.
static String getFirstHalf(String networkLocation)
static NetworkTopology getInstance(Configuration conf): Get an instance of NetworkTopology based on the value of the configuration parameter net.topology.impl.
static NetworkTopology getInstance(Configuration conf, InnerNode.Factory factory)
static String getLastHalf(String networkLocation)
List<Node> getLeaves(String scope): Return the leaves in scope.
Node getNode(String loc): Given a string representation of a node, return its reference.
protected Node getNodeForNetworkLocation(Node node): Return a reference to the node given its string representation.
int getNumOfLeaves()
int getNumOfNonEmptyRacks()
int getNumOfRacks()
String getRack(String loc): Given a string representation of a network location, return its rack string. To be overridden in subclasses for specific NetworkTopology implementations, as an alternative to overriding the full getRack(String) method.
protected int getWeight(Node reader, Node node): Returns an integer weight which specifies how far node is from reader.
protected static int getWeightUsingNetworkLocation(Node reader, Node node): Returns an integer weight which specifies how far node is from reader.
boolean hasClusterEverBeenMultiRack()
protected void incrementRacks()
protected NetworkTopology init(InnerNode.Factory factory)
protected static boolean isChildScope(String parentScope, String childScope): Checks whether one scope is contained in the other scope.
boolean isNodeGroupAware()
protected static boolean isNodeInScope(Node node, String scope): Checks whether a node belongs to the scope.
boolean isOnSameNodeGroup(Node node1, Node node2)
boolean isOnSameRack(Node node1, Node node2): Check if two nodes are on the same rack.
protected boolean isSameParents(Node node1, Node node2): Compare the parents of each node for equality.
void recommissionNode(Node node): Update the empty rack count when a node is added, for example on recommission.
void remove(Node node): Remove a node. Update node counter and rack counter if necessary.
void shuffle(Node[] nodes, int activeLen): Randomly permute the active nodes of the node array.
void sortByDistance(Node reader, Node[] nodes, int activeLen): Sort nodes array by network distance to reader.
<T extends Node> void sortByDistance(Node reader, T[] nodes, int activeLen, Consumer<List<T>> secondarySort): Sort nodes array by network distance to reader with secondary sort.
void sortByDistanceUsingNetworkLocation(Node reader, Node[] nodes, int activeLen): Sort nodes array by network distance to reader, using network location.
<T extends Node> void sortByDistanceUsingNetworkLocation(Node reader, T[] nodes, int activeLen, Consumer<List<T>> secondarySort): Sort nodes array by network distance to reader with secondary sort, using network location.
String toString(): Convert a network tree to a string.
-
Field Details
-
DEFAULT_RACK
-
LOG
public static final org.slf4j.Logger LOG
-
numOfRacks
protected int numOfRacks
rack counter
-
netlock
the lock used to manage access
-
-
Constructor Details
-
NetworkTopology
public NetworkTopology()
-
-
Method Details
-
getInstance
Get an instance of NetworkTopology based on the value of the configuration parameter net.topology.impl.
- Parameters:
conf - the configuration to be used
- Returns:
- an instance of NetworkTopology
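The configuration-driven lookup described above can be illustrated with a minimal sketch. It assumes plain java.util.Properties and reflection rather than Hadoop's Configuration class; the Topology interface, FlatTopology, and GroupedTopology are hypothetical stand-ins, and only the key name net.topology.impl comes from this page.

```java
import java.util.Properties;

/** Hypothetical stand-ins; none of these types come from Hadoop. */
final class TopologyFactorySketch {
    /** Minimal topology interface used only by this sketch. */
    interface Topology {
        String describe();
    }

    /** Fallback implementation used when the key is absent (an assumption). */
    static final class FlatTopology implements Topology {
        @Override
        public String describe() {
            return "flat";
        }
    }

    /** Alternative implementation selected through configuration. */
    static final class GroupedTopology implements Topology {
        @Override
        public String describe() {
            return "grouped";
        }
    }

    static final String IMPL_KEY = "net.topology.impl";

    /** Instantiate the class named under IMPL_KEY, or the fallback if unset. */
    static Topology getInstance(Properties conf) {
        String className = conf.getProperty(IMPL_KEY, FlatTopology.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(Topology.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

Setting the key to the fully qualified name of another implementation switches the returned type without changing the calling code.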
-
getInstance
public static NetworkTopology getInstance(Configuration conf, InnerNode.Factory factory)
-
init
protected NetworkTopology init(InnerNode.Factory factory)
-
add
Add a leaf node. Update node counter and rack counter if necessary.
- Parameters:
node - node to be added; can be null
- Throws:
IllegalArgumentException - if a node is added to a leaf, or the node to be added is not a leaf
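As an illustration of the counter bookkeeping described for add and remove, here is a minimal sketch. It assumes a two-level layout in which every leaf is identified by a "/rack/host" path string; the class RackCounterSketch and its storage are hypothetical and are not how NetworkTopology is implemented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical miniature topology keyed by "/rack/host" paths. */
final class RackCounterSketch {
    private final Map<String, Set<String>> leavesByRack = new HashMap<>();
    private int numOfLeaves = 0;
    private boolean everMultiRack = false;

    /** Add a leaf such as "/r1/h1"; null and duplicates are ignored. */
    void add(String leafPath) {
        if (leafPath == null) {
            return;
        }
        int sep = leafPath.lastIndexOf('/');
        if (sep <= 0 || sep == leafPath.length() - 1) {
            throw new IllegalArgumentException("Not a /rack/host path: " + leafPath);
        }
        String rack = leafPath.substring(0, sep);
        // A rack entry is created only when its first leaf arrives.
        Set<String> leaves = leavesByRack.computeIfAbsent(rack, r -> new HashSet<>());
        if (leaves.add(leafPath)) {
            numOfLeaves++;
        }
        if (leavesByRack.size() > 1) {
            everMultiRack = true;
        }
    }

    /** Remove a leaf; the rack disappears once its last leaf is gone. */
    void remove(String leafPath) {
        if (leafPath == null) {
            return;
        }
        int sep = leafPath.lastIndexOf('/');
        if (sep <= 0) {
            return;
        }
        String rack = leafPath.substring(0, sep);
        Set<String> leaves = leavesByRack.get(rack);
        if (leaves != null && leaves.remove(leafPath)) {
            numOfLeaves--;
            if (leaves.isEmpty()) {
                leavesByRack.remove(rack);
            }
        }
    }

    int getNumOfRacks() {
        return leavesByRack.size();
    }

    int getNumOfLeaves() {
        return numOfLeaves;
    }

    /** Stays true after the cluster shrinks back to a single rack. */
    boolean hasClusterEverBeenMultiRack() {
        return everMultiRack;
    }
}
```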
-
incrementRacks
protected void incrementRacks()
-
getNodeForNetworkLocation
Return a reference to the node given its string representation. The default implementation delegates to getNode(String).
To be overridden in subclasses for specific NetworkTopology implementations, as an alternative to overriding the full add(Node) method.
- Parameters:
node - the string representation of this node's network location is used to retrieve a Node object
- Returns:
- a reference to the node; null if the node is not in the tree
-
getDatanodesInRack
Given a string representation of a rack, return its children.
- Parameters:
loc - a path-like string representation of a rack
- Returns:
- a newly allocated list with all the node's children
-
remove
Remove a node. Update node counter and rack counter if necessary.
- Parameters:
node - node to be removed; can be null
-
contains
Check if the tree contains the given node.
- Parameters:
node - a node
- Returns:
- true if node is already in the tree; false otherwise
-
getNode
Given a string representation of a node, return its reference.
- Parameters:
loc - a path-like string representation of a node
- Returns:
- a reference to the node; null if the node is not in the tree
-
hasClusterEverBeenMultiRack
public boolean hasClusterEverBeenMultiRack()
- Returns:
- true if this cluster has ever consisted of multiple racks, even if it is not now a multi-rack cluster.
-
getRack
Given a string representation of a network location, return its rack string.
To be overridden in subclasses for specific NetworkTopology implementations, as an alternative to overriding the full getRack(String) method.
- Parameters:
loc - a path-like string representation of a network location
- Returns:
- a rack string
-
getNumOfRacks
public int getNumOfRacks()
- Returns:
- the total number of racks
-
getNumOfLeaves
public int getNumOfLeaves()
- Returns:
- the total number of leaf nodes
-
getDistance
Return the distance between two nodes. It is assumed that the distance from one node to its parent is 1. The distance between two nodes is calculated by summing up their distances to their closest common ancestor.
- Parameters:
node1 - one node
node2 - another node
- Returns:
- the distance between node1 and node2, which is zero if they are the same, or Integer.MAX_VALUE if node1 or node2 does not belong to the cluster
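The distance rule above (one hop per parent link, summed up to the closest common ancestor) can be sketched with a hypothetical tree of parent pointers. TreeNode below is an illustrative type, not the Node interface documented here, and returning Integer.MAX_VALUE for null or unrelated nodes is this sketch's reading of the Returns clause.

```java
/** Hypothetical tree with parent pointers; not Hadoop's Node interface. */
final class TreeDistanceSketch {
    static final class TreeNode {
        final String name;
        final TreeNode parent;
        final int level; // depth below the root; the root has level 0

        TreeNode(String name, TreeNode parent) {
            this.name = name;
            this.parent = parent;
            this.level = (parent == null) ? 0 : parent.level + 1;
        }
    }

    /** Sum of hops from each node up to their closest common ancestor. */
    static int distance(TreeNode a, TreeNode b) {
        if (a == null || b == null) {
            return Integer.MAX_VALUE;
        }
        int hops = 0;
        // Lift the deeper node until both sit on the same level.
        while (a.level > b.level) {
            a = a.parent;
            hops++;
        }
        while (b.level > a.level) {
            b = b.parent;
            hops++;
        }
        // Climb in lockstep until the two paths meet.
        while (a != b) {
            if (a.parent == null || b.parent == null) {
                return Integer.MAX_VALUE; // roots of two different trees
            }
            a = a.parent;
            b = b.parent;
            hops += 2;
        }
        return hops;
    }
}
```

With a data center, two racks, and hosts under each rack, two hosts on the same rack are 2 apart and two hosts on different racks are 4 apart.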
-
getDistanceByPath
Return the distance between two nodes by comparing their network paths, without checking by reference whether they belong to the same ancestor node. It is assumed that the distance from one node to its parent is 1. The distance between two nodes is calculated by summing up their distances to their closest common ancestor.
- Parameters:
node1 - one node
node2 - another node
- Returns:
- the distance between node1 and node2
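A path-only comparison can be sketched as follows. The sketch assumes each node is represented by its full path string with "/" as the separator (for example "/d1/r1/h1"); the method name distanceByPath and the string-based signature are illustrative and differ from the Node-based method documented here.

```java
/** Hypothetical helper that compares path strings instead of node references. */
final class PathDistanceSketch {
    /** Distance between two absolute paths such as "/d1/r1/h1". */
    static int distanceByPath(String path1, String path2) {
        String[] a = split(path1);
        String[] b = split(path2);
        // Count the leading path components the two locations share.
        int common = 0;
        while (common < a.length && common < b.length && a[common].equals(b[common])) {
            common++;
        }
        // Each remaining component is one hop up to the common ancestor.
        return (a.length - common) + (b.length - common);
    }

    private static String[] split(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }
}
```

Because only strings are compared, two nodes with equal path prefixes are treated as sharing an ancestor even if they come from different trees.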
-
isOnSameRack
Check if two nodes are on the same rack.
- Parameters:
node1 - one node (can be null)
node2 - another node (can be null)
- Returns:
- true if node1 and node2 are on the same rack; false otherwise
- Throws:
IllegalArgumentException - when either node1 or node2 is null, or node1 or node2 does not belong to the cluster
-
isNodeGroupAware
public boolean isNodeGroupAware()
- Returns:
- true if the network topology is aware of NodeGroup; false otherwise
-
isOnSameNodeGroup
- Parameters:
node1 - input node1
node2 - input node2
- Returns:
- false, because this class is not aware of NodeGroup; to be overridden in subclasses
-
isSameParents
Compare the parents of each node for equality.
To be overridden in subclasses for specific NetworkTopology implementations, as an alternative to overriding the full isOnSameRack(Node, Node) method.
- Parameters:
node1 - the first node to compare
node2 - the second node to compare
- Returns:
- true if their parents are equal, false otherwise
-
chooseRandom
Randomly choose a node.
- Parameters:
scope - range of nodes from which a node will be chosen
- Returns:
- the chosen node
-
chooseRandom
Randomly choose one node from scope. If scope starts with ~, choose one from all nodes except those in scope; otherwise, choose one from scope. If excludedNodes is given, choose a node that is not in excludedNodes.
- Parameters:
scope - range of nodes from which a node will be chosen
excludedNodes - nodes to be excluded from the selection
- Returns:
- the chosen node
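The scope rule above, including the ~ prefix, can be illustrated with a small sketch over a flat list of leaf paths. The names chooseRandom and inScope, the String-based leaves, and the choice to return null when no candidate remains are assumptions of this sketch, not behavior stated on this page.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Random;

/** Hypothetical selection over leaf paths such as "/r1/h1". */
final class ChooseRandomSketch {
    /** True if leafPath equals scope or lies underneath it. */
    static boolean inScope(String leafPath, String scope) {
        String prefix = scope.endsWith("/") ? scope : scope + "/";
        return leafPath.equals(scope) || leafPath.startsWith(prefix);
    }

    /** Pick one leaf uniformly from the candidates, or null if none remain. */
    static String chooseRandom(List<String> leaves, String scope,
                               Collection<String> excluded, Random random) {
        boolean inverted = scope.startsWith("~");
        String base = inverted ? scope.substring(1) : scope;
        List<String> candidates = new ArrayList<>();
        for (String leaf : leaves) {
            // With "~", keep leaves outside the scope; otherwise keep those inside.
            boolean matches = inScope(leaf, base) != inverted;
            if (matches && (excluded == null || !excluded.contains(leaf))) {
                candidates.add(leaf);
            }
        }
        return candidates.isEmpty()
                ? null
                : candidates.get(random.nextInt(candidates.size()));
    }
}
```

Appending the separator before the prefix test keeps "/r1" from matching leaves under "/r10".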
-
chooseRandom
protected Node chooseRandom(String scope, String excludedScope, Collection<Node> excludedNodes)
-
getLeaves
Return the leaves in scope.
- Parameters:
scope - a path string
- Returns:
- the leaf nodes under the given scope
-
countNumOfAvailableNodes
@VisibleForTesting public int countNumOfAvailableNodes(String scope, Collection<Node> excludedNodes)
Return the number of leaves in scope but not in excludedNodes. If scope starts with ~, return the number of nodes that are in neither scope nor excludedNodes.
- Parameters:
scope - a path string that may start with ~
excludedNodes - a list of nodes
- Returns:
- number of available nodes
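The counting rule can be checked against a small sketch. It assumes leaves are given as path strings and that scope membership is a path-prefix test; countAvailable is a hypothetical name used only for illustration.

```java
import java.util.Collection;
import java.util.List;

/** Hypothetical counter over leaf paths such as "/r1/h1". */
final class CountAvailableSketch {
    static int countAvailable(List<String> leaves, String scope,
                              Collection<String> excluded) {
        boolean inverted = scope.startsWith("~");
        String base = inverted ? scope.substring(1) : scope;
        String prefix = base.endsWith("/") ? base : base + "/";
        int count = 0;
        for (String leaf : leaves) {
            boolean inside = leaf.equals(base) || leaf.startsWith(prefix);
            // "~scope" selects the leaves outside the scope.
            boolean selected = inverted ? !inside : inside;
            if (selected && (excluded == null || !excluded.contains(leaf))) {
                count++;
            }
        }
        return count;
    }
}
```

An excluded node that is already outside the selected set does not change the count, which is why excluding "/r1/h1" has no effect on a "~/r1" query.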
-
toString
Convert a network tree to a string.
-
getFirstHalf
- Parameters:
networkLocation - input network location
- Returns:
- the first part of networkLocation when the string is divided in two at the last separator
-
getLastHalf
- Parameters:
networkLocation - input network location
- Returns:
- the second part of networkLocation when the string is divided in two at the last separator
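Splitting at the last separator can be sketched as below. The sketch assumes "/" is the separator and keeps it at the start of the second part so that the two halves concatenate back to the input; both choices, and the names firstHalf and lastHalf, are assumptions of this example.

```java
/** Hypothetical helpers that split a location such as "/d1/r1/ng1". */
final class LocationSplitSketch {
    private static final char SEPARATOR = '/';

    private static int lastSeparator(String networkLocation) {
        int index = networkLocation.lastIndexOf(SEPARATOR);
        if (index < 0) {
            throw new IllegalArgumentException("No separator in " + networkLocation);
        }
        return index;
    }

    /** Everything before the last separator, e.g. "/d1/r1". */
    static String firstHalf(String networkLocation) {
        return networkLocation.substring(0, lastSeparator(networkLocation));
    }

    /** The last separator and everything after it, e.g. "/ng1". */
    static String lastHalf(String networkLocation) {
        return networkLocation.substring(lastSeparator(networkLocation));
    }
}
```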
-
getWeight
Returns an integer weight which specifies how far node is from reader. A lower value signifies that a node is closer.
- Parameters:
reader - Node where data will be read
node - Replica of data
- Returns:
- weight
-
getWeightUsingNetworkLocation
Returns an integer weight which specifies how far node is from reader. A lower value signifies that a node is closer. It uses network location to calculate the weight.
- Parameters:
reader - Node where data will be read
node - Replica of data
- Returns:
- weight
-
sortByDistance
Sort nodes array by network distance to reader.
In a three-level topology, a node can be either local, on the same rack, or on a different rack from the reader. Sorting the nodes based on network distance from the reader reduces network traffic and improves performance.
As an additional twist, we also randomize the nodes at each network distance. This helps with load balancing when there is data skew.
- Parameters:
reader - Node where data will be read
nodes - Available replicas with the requested data
activeLen - Number of active nodes at the front of the array
-
sortByDistance
public <T extends Node> void sortByDistance(Node reader, T[] nodes, int activeLen, Consumer<List<T>> secondarySort)
Sort nodes array by network distance to reader with secondary sort.
In a three-level topology, a node can be either local, on the same rack, or on a different rack from the reader. Sorting the nodes based on network distance from the reader reduces network traffic and improves performance.
As an additional twist, we also randomize the nodes at each network distance. This helps with load balancing when there is data skew.
- Type Parameters:
T - the type of the nodes, a subtype of Node
- Parameters:
reader - Node where data will be read
nodes - Available replicas with the requested data
activeLen - Number of active nodes at the front of the array
secondarySort - a secondary sorting strategy, supplied by the caller, that orders nodes at the same distance
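One way to picture the primary and secondary ordering is the sketch below: the first activeLen entries are bucketed by weight, each bucket is handed to the secondary sort, and the buckets are written back in ascending weight order. The weight function is supplied by the caller here, and sortByWeight, the bucket approach, and the generic signature are assumptions of this sketch rather than the implementation behind this method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.ToIntFunction;

/** Hypothetical distance sort over the active prefix of an array. */
final class SortByDistanceSketch {
    static <T> void sortByWeight(T[] nodes, int activeLen,
                                 ToIntFunction<T> weight,
                                 Consumer<List<T>> secondarySort) {
        // Bucket the active prefix by weight; TreeMap iterates keys in ascending order.
        Map<Integer, List<T>> buckets = new TreeMap<>();
        for (int i = 0; i < activeLen; i++) {
            buckets.computeIfAbsent(weight.applyAsInt(nodes[i]), w -> new ArrayList<>())
                   .add(nodes[i]);
        }
        int idx = 0;
        for (List<T> sameDistance : buckets.values()) {
            // Only groups with ties need a secondary ordering.
            if (sameDistance.size() > 1) {
                secondarySort.accept(sameDistance);
            }
            for (T node : sameDistance) {
                nodes[idx++] = node;
            }
        }
    }
}
```

Passing a consumer that calls Collections.shuffle gives the randomization within each distance described above, while a deterministic comparator-based consumer gives a stable tie-break. Entries at index activeLen and beyond are left untouched.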
-
sortByDistanceUsingNetworkLocation
Sort nodes array by network distance to reader, using network location. This is used when the reader is not a datanode. Sorting the nodes based on network distance from the reader reduces network traffic and improves performance.
- Parameters:
reader - Node where data will be read
nodes - Available replicas with the requested data
activeLen - Number of active nodes at the front of the array
-
sortByDistanceUsingNetworkLocation
public <T extends Node> void sortByDistanceUsingNetworkLocation(Node reader, T[] nodes, int activeLen, Consumer<List<T>> secondarySort)
Sort nodes array by network distance to reader with secondary sort, using network location. This is used when the reader is not a datanode. Sorting the nodes based on network distance from the reader reduces network traffic and improves performance.
- Type Parameters:
T - the type of the nodes, a subtype of Node
- Parameters:
reader - Node where data will be read
nodes - Available replicas with the requested data
activeLen - Number of active nodes at the front of the array
secondarySort - a secondary sorting strategy, supplied by the caller, that orders nodes at the same distance
-
isChildScope
Checks whether one scope is contained in the other scope.
- Parameters:
parentScope - the parent scope to check
childScope - the child scope which needs to be checked
- Returns:
- true if childScope is contained within the parentScope
-
isNodeInScope
Checks whether a node belongs to the scope.
- Parameters:
node - the node to check
scope - scope to check
- Returns:
- true if node lies within the scope
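Both scope checks can be read as path-prefix tests. The sketch below assumes scopes and node locations are "/"-separated path strings and follows the Returns clauses on this page (true when the child or node lies within the parent scope); the String-based signatures are illustrative, since the documented isNodeInScope takes a Node.

```java
/** Hypothetical prefix-based scope checks on path strings. */
final class ScopeCheckSketch {
    private static String withTrailingSeparator(String scope) {
        return scope.endsWith("/") ? scope : scope + "/";
    }

    /** True if childScope equals parentScope or lies underneath it. */
    static boolean isChildScope(String parentScope, String childScope) {
        // The trailing separator stops "/d1/r1" from matching "/d1/r10".
        return withTrailingSeparator(childScope)
                .startsWith(withTrailingSeparator(parentScope));
    }

    /** True if the node's path lies within the scope. */
    static boolean isNodeInScope(String nodePath, String scope) {
        return isChildScope(scope, nodePath);
    }
}
```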
-
getNumOfNonEmptyRacks
public int getNumOfNonEmptyRacks()
- Returns:
- the number of nonempty racks
-
recommissionNode
Update the empty rack count when a node is added, for example on recommission.
- Parameters:
node - node to be added; can be null
-
decommissionNode
Update the empty rack count when a node is removed, for example on decommission.
- Parameters:
node - node to be removed; can be null
-
shuffle
Randomly permute the active nodes of the node array.
- Parameters:
nodes - Available replicas with the requested data
activeLen - Number of active nodes at the front of the array
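Permuting only the active prefix can be sketched with a Fisher-Yates shuffle restricted to the first activeLen entries. Fisher-Yates is this sketch's choice of technique and is not stated on this page; the explicit Random parameter is also an assumption, added so that runs can be made repeatable.

```java
import java.util.Random;

/** Hypothetical prefix shuffle; entries from activeLen onward stay in place. */
final class PrefixShuffleSketch {
    /** Fisher-Yates shuffle restricted to nodes[0..activeLen). */
    static <T> void shuffle(T[] nodes, int activeLen, Random random) {
        for (int i = activeLen - 1; i > 0; i--) {
            int j = random.nextInt(i + 1); // 0 <= j <= i
            T tmp = nodes[i];
            nodes[i] = nodes[j];
            nodes[j] = tmp;
        }
    }
}
```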
-