Class FederationInterceptor
java.lang.Object
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AbstractRequestInterceptor
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
- All Implemented Interfaces:
org.apache.hadoop.conf.Configurable,org.apache.hadoop.yarn.api.ApplicationMasterProtocol,org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol,RequestInterceptor
Extends the AbstractRequestInterceptor and provides an implementation for
federation of YARN RM and scaling an application across multiple YARN
sub-clusters. All the federation specific implementation is encapsulated in
this class. This is always the last interceptor in the chain.
-
Field Summary
Fields -
Constructor Summary
ConstructorsConstructorDescriptionCreates an instance of the FederationInterceptor class. -
Method Summary
Modifier and TypeMethodDescriptionorg.apache.hadoop.yarn.api.protocolrecords.AllocateResponseallocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest request) Sends the heart beats to the home RM and the secondary sub-cluster RMs that are being used by the application.protected voidcacheAllocatedContainersForSubClusterId(List<org.apache.hadoop.yarn.api.records.Container> containers, org.apache.hadoop.yarn.server.federation.store.records.SubClusterId subClusterId) protected voidOnly for unit test cleanup.protected org.apache.hadoop.yarn.server.AMHeartbeatRequestHandlercreateHomeHeartbeatHandler(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.yarn.api.records.ApplicationId appId, org.apache.hadoop.yarn.server.AMRMClientRelayer rmProxyRelayer) protected <T> TcreateHomeRMProxy(AMRMProxyApplicationContext appContext, Class<T> protocol, org.apache.hadoop.security.UserGroupInformation user) Create a proxy instance that is used to connect to the Home resource manager.protected org.apache.hadoop.yarn.server.uam.UnmanagedAMPoolManagercreateUnmanagedAMPoolManager(ExecutorService threadPool) Create the UAM pool manager for secondary sub-clusters.org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponsefinishApplicationMaster(org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest request) Sends the finish application master request to all the resource managers used by the application.protected org.apache.hadoop.yarn.api.protocolrecords.AllocateResponsePrepare the base allocation response.Map<org.apache.hadoop.yarn.server.federation.store.records.SubClusterId,List<org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse>> protected org.apache.hadoop.yarn.api.records.ApplicationAttemptIdprotected Map<org.apache.hadoop.yarn.api.records.ContainerId,org.apache.hadoop.yarn.server.federation.store.records.SubClusterId> protected org.apache.hadoop.yarn.server.AMHeartbeatRequestHandlerprotected org.apache.hadoop.yarn.server.federation.utils.FederationRegistryClientprotected Set<org.apache.hadoop.yarn.server.federation.store.records.SubClusterId>getTimedOutSCs(boolean verbose) protected org.apache.hadoop.yarn.server.uam.UnmanagedAMPoolManagerprotected intvoidinit(AMRMProxyApplicationContext appContext) Initializes the instance using specified context.static <T> booleanisNullOrEmpty(Collection<T> c) Utility method to check if the specified Collection is null or empty.static <T1,T2> boolean isNullOrEmpty(Map<T1, T2> c) Utility method to check if the specified Collection is null or empty.protected TokenAndRegisterResponselaunchUAMAndRegisterApplicationMaster(org.apache.hadoop.yarn.conf.YarnConfiguration config, String subClusterId, org.apache.hadoop.yarn.api.records.ApplicationId applicationId) protected voidmergeAllocateResponse(org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse homeResponse, org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse otherResponse, org.apache.hadoop.yarn.server.federation.store.records.SubClusterId otherRMAddress) protected voidreAttachUAMAndMergeRegisterResponse(org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse homeResponse, org.apache.hadoop.yarn.api.records.ApplicationId appId) Try re-attach to all existing and running UAMs in secondary sub-clusters launched by previous application attempts if any.voidRecoverRequestInterceptorstate from store.org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponseregisterApplicationMaster(org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest request) Sends the application master's registration request to the home RM.voidSets theRequestInterceptorin the chain.voidshutdown()This is called when the application pipeline is being destroyed.protected Map<org.apache.hadoop.yarn.server.federation.store.records.SubClusterId,List<org.apache.hadoop.yarn.api.records.ResourceRequest>> splitResourceRequests(List<org.apache.hadoop.yarn.api.records.ResourceRequest> askList) Splits the specified request to send it to different sub clusters.Methods inherited from class org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AbstractRequestInterceptor
allocateForDistributedScheduling, getApplicationContext, getConf, getNextInterceptor, getNMStateStore, registerApplicationMasterForDistributedScheduling, setConf
-
Field Details
-
NMSS_CLASS_PREFIX
- See Also:
-
NMSS_REG_REQUEST_KEY
- See Also:
-
NMSS_REG_RESPONSE_KEY
- See Also:
-
NMSS_SECONDARY_SC_PREFIX
When AMRMProxy HA is enabled, secondary AMRMTokens will be stored in Yarn Registry. Otherwise, if NM recovery is enabled, the UAM token are stored in local NMSS instead under this directory name.- See Also:
-
STRING_TO_BYTE_FORMAT
- See Also:
-
-
Constructor Details
-
FederationInterceptor
public FederationInterceptor()Creates an instance of the FederationInterceptor class.
-
-
Method Details
-
init
Initializes the instance using specified context.- Specified by:
initin interfaceRequestInterceptor- Overrides:
initin classAbstractRequestInterceptor- Parameters:
appContext- AMRMProxy application context
-
recover
Description copied from class:AbstractRequestInterceptorRecoverRequestInterceptorstate from store.- Specified by:
recoverin interfaceRequestInterceptor- Overrides:
recoverin classAbstractRequestInterceptor- Parameters:
recoveredDataMap- states for all interceptors recovered from NMSS
-
registerApplicationMaster
public org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse registerApplicationMaster(org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest request) throws org.apache.hadoop.yarn.exceptions.YarnException, IOException Sends the application master's registration request to the home RM. Between AM and AMRMProxy, FederationInterceptor modifies the RM behavior, so that when AM registers more than once, it returns the same register success response instead of throwingInvalidApplicationMasterRequestException. Furthermore, we present to AM as if we are the RM that never fails over (except when AMRMProxy restarts). When actual RM fails over, we always re-register automatically. We did this because FederationInterceptor can receive concurrent register requests from AM because of timeout between AM and AMRMProxy, which is shorter than the timeout + failOver between FederationInterceptor (AMRMProxy) and RM. For the same reason, this method needs to be synchronized.- Throws:
org.apache.hadoop.yarn.exceptions.YarnExceptionIOException
-
allocate
public org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse allocate(org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest request) throws org.apache.hadoop.yarn.exceptions.YarnException, IOException Sends the heart beats to the home RM and the secondary sub-cluster RMs that are being used by the application.- Throws:
org.apache.hadoop.yarn.exceptions.YarnExceptionIOException
-
finishApplicationMaster
public org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponse finishApplicationMaster(org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest request) throws org.apache.hadoop.yarn.exceptions.YarnException, IOException Sends the finish application master request to all the resource managers used by the application.- Throws:
org.apache.hadoop.yarn.exceptions.YarnExceptionIOException
-
setNextInterceptor
Description copied from class:AbstractRequestInterceptorSets theRequestInterceptorin the chain.- Specified by:
setNextInterceptorin interfaceRequestInterceptor- Overrides:
setNextInterceptorin classAbstractRequestInterceptor- Parameters:
next- the next interceptor to set
-
shutdown
public void shutdown()This is called when the application pipeline is being destroyed. We will release all the resources that we are holding in this call.- Specified by:
shutdownin interfaceRequestInterceptor- Overrides:
shutdownin classAbstractRequestInterceptor
-
cleanupRegistry
@VisibleForTesting protected void cleanupRegistry()Only for unit test cleanup. -
getRegistryClient
@VisibleForTesting protected org.apache.hadoop.yarn.server.federation.utils.FederationRegistryClient getRegistryClient() -
getAttemptId
@VisibleForTesting protected org.apache.hadoop.yarn.api.records.ApplicationAttemptId getAttemptId() -
getHomeHeartbeatHandler
@VisibleForTesting protected org.apache.hadoop.yarn.server.AMHeartbeatRequestHandler getHomeHeartbeatHandler() -
createUnmanagedAMPoolManager
@VisibleForTesting protected org.apache.hadoop.yarn.server.uam.UnmanagedAMPoolManager createUnmanagedAMPoolManager(ExecutorService threadPool) Create the UAM pool manager for secondary sub-clusters. For unit test to override.- Parameters:
threadPool- the thread pool to use- Returns:
- the UAM pool manager instance
-
createHomeHeartbeatHandler
@VisibleForTesting protected org.apache.hadoop.yarn.server.AMHeartbeatRequestHandler createHomeHeartbeatHandler(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.yarn.api.records.ApplicationId appId, org.apache.hadoop.yarn.server.AMRMClientRelayer rmProxyRelayer) -
createHomeRMProxy
protected <T> T createHomeRMProxy(AMRMProxyApplicationContext appContext, Class<T> protocol, org.apache.hadoop.security.UserGroupInformation user) Create a proxy instance that is used to connect to the Home resource manager.- Type Parameters:
T- the type of the proxy- Parameters:
appContext- AMRMProxyApplicationContextprotocol- the protocol class for the proxyuser- the ugi for the proxy- Returns:
- the proxy created
-
reAttachUAMAndMergeRegisterResponse
protected void reAttachUAMAndMergeRegisterResponse(org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse homeResponse, org.apache.hadoop.yarn.api.records.ApplicationId appId) Try re-attach to all existing and running UAMs in secondary sub-clusters launched by previous application attempts if any. All running containers in the UAMs will be combined into the registerResponse. For the first attempt, the registry will be empty for this application and thus no-op here. -
launchUAMAndRegisterApplicationMaster
protected TokenAndRegisterResponse launchUAMAndRegisterApplicationMaster(org.apache.hadoop.yarn.conf.YarnConfiguration config, String subClusterId, org.apache.hadoop.yarn.api.records.ApplicationId applicationId) throws IOException, org.apache.hadoop.yarn.exceptions.YarnException - Throws:
IOExceptionorg.apache.hadoop.yarn.exceptions.YarnException
-
generateBaseAllocationResponse
protected org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse generateBaseAllocationResponse()Prepare the base allocation response. Use lastSCResponse and lastHeartbeatTimeStamp to assemble entries about cluster-wide info, e.g. AvailableResource, NumClusterNodes. -
mergeAllocateResponse
@VisibleForTesting protected void mergeAllocateResponse(org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse homeResponse, org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse otherResponse, org.apache.hadoop.yarn.server.federation.store.records.SubClusterId otherRMAddress) -
getTimedOutSCs
protected Set<org.apache.hadoop.yarn.server.federation.store.records.SubClusterId> getTimedOutSCs(boolean verbose) -
splitResourceRequests
protected Map<org.apache.hadoop.yarn.server.federation.store.records.SubClusterId,List<org.apache.hadoop.yarn.api.records.ResourceRequest>> splitResourceRequests(List<org.apache.hadoop.yarn.api.records.ResourceRequest> askList) throws org.apache.hadoop.yarn.exceptions.YarnException Splits the specified request to send it to different sub clusters. The splitting algorithm is very simple. If the request does not have a node preference, the policy decides the sub cluster. If the request has a node preference and if locality is required, then it is sent to the sub cluster that contains the requested node. If node preference is specified and locality is not required, then the policy decides the sub cluster.- Parameters:
askList- the ask list to split- Returns:
- the split asks
- Throws:
org.apache.hadoop.yarn.exceptions.YarnException- if split fails
-
getUnmanagedAMPoolSize
@VisibleForTesting protected int getUnmanagedAMPoolSize() -
getUnmanagedAMPool
@VisibleForTesting protected org.apache.hadoop.yarn.server.uam.UnmanagedAMPoolManager getUnmanagedAMPool() -
getUamRegisterFutures
-
getAsyncResponseSink
-
isNullOrEmpty
Utility method to check if the specified Collection is null or empty.- Type Parameters:
T- element type of the collection- Parameters:
c- the collection object- Returns:
- whether is it is null or empty
-
isNullOrEmpty
Utility method to check if the specified Collection is null or empty.- Type Parameters:
T1- key type of the mapT2- value type of the map- Parameters:
c- the map object- Returns:
- whether is it is null or empty
-
cacheAllocatedContainersForSubClusterId
@VisibleForTesting protected void cacheAllocatedContainersForSubClusterId(List<org.apache.hadoop.yarn.api.records.Container> containers, org.apache.hadoop.yarn.server.federation.store.records.SubClusterId subClusterId) -
getContainerIdToSubClusterIdMap
@VisibleForTesting protected Map<org.apache.hadoop.yarn.api.records.ContainerId,org.apache.hadoop.yarn.server.federation.store.records.SubClusterId> getContainerIdToSubClusterIdMap()
-