java.lang.Object
org.apache.hadoop.service.AbstractService
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.service.Service

@Private @Unstable public class ZKRMStateStore extends RMStateStore
RMStateStore implementation backed by ZooKeeper. The znode structure is as follows:

ROOT_DIR_PATH
|--- VERSION_INFO
|--- EPOCH_NODE
|--- RM_ZK_FENCING_LOCK
|--- RM_APP_ROOT
|     |----- HIERARCHIES
|     |        |----- 1
|     |        |      |----- (#ApplicationId barring last character)
|     |        |      |       |----- (#Last character of ApplicationId)
|     |        |      |       |       |----- (#ApplicationAttemptIds)
|     |        |      ....
|     |        |
|     |        |----- 2
|     |        |      |----- (#ApplicationId barring last 2 characters)
|     |        |      |       |----- (#Last 2 characters of ApplicationId)
|     |        |      |       |       |----- (#ApplicationAttemptIds)
|     |        |      ....
|     |        |
|     |        |----- 3
|     |        |      |----- (#ApplicationId barring last 3 characters)
|     |        |      |       |----- (#Last 3 characters of ApplicationId)
|     |        |      |       |       |----- (#ApplicationAttemptIds)
|     |        |      ....
|     |        |
|     |        |----- 4
|     |        |      |----- (#ApplicationId barring last 4 characters)
|     |        |      |       |----- (#Last 4 characters of ApplicationId)
|     |        |      |       |       |----- (#ApplicationAttemptIds)
|     |        |      ....
|     |        |
|     |----- (#ApplicationId1)
|     |        |----- (#ApplicationAttemptIds)
|     |
|     |----- (#ApplicationId2)
|     |        |----- (#ApplicationAttemptIds)
|     ....
|
|--- RM_DT_SECRET_MANAGER_ROOT
|     |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
|     |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
|     |       |----- 1
|     |       |      |----- (#TokenId barring last character)
|     |       |      |       |----- (#Last character of TokenId)
|     |       |      ....
|     |       |----- 2
|     |       |      |----- (#TokenId barring last 2 characters)
|     |       |      |       |----- (#Last 2 characters of TokenId)
|     |       |      ....
|     |       |----- 3
|     |       |      |----- (#TokenId barring last 3 characters)
|     |       |      |       |----- (#Last 3 characters of TokenId)
|     |       |      ....
|     |       |----- 4
|     |       |      |----- (#TokenId barring last 4 characters)
|     |       |      |       |----- (#Last 4 characters of TokenId)
|     |       |      ....
|     |       |----- Token_1
|     |       |----- Token_2
|     |       ....
|     |
|     |----- RM_DT_MASTER_KEYS_ROOT_ZNODE_NAME
|     |       |----- Key_1
|     |       |----- Key_2
|     |       ....
|
|--- AMRMTOKEN_SECRET_MANAGER_ROOT
|     |----- currentMasterKey
|     |----- nextMasterKey
|
|--- RESERVATION_SYSTEM_ROOT
|     |----- PLAN_1
|     |       |----- RESERVATION_1
|     |       |----- RESERVATION_2
|     |       ....
|     |----- PLAN_2
|     ....
|
|--- PROXY_CA_ROOT
      |----- caCert
      |----- caPrivateKey

Note:
Changes from 1.1 to 1.2 - AMRMTokenSecretManager state has been saved separately. The currentMasterkey and nextMasterkey have been stored. Also, AMRMToken has been removed from ApplicationAttemptState.
Changes from 1.2 to 1.3 - Addition of ReservationSystem state.
Changes from 1.3 to 1.4 - Change the structure of application znode by splitting it in 2 parts, depending on a configurable split index. This limits the number of application znodes returned in a single call while loading app state.
Changes from 1.4 to 1.5 - Change the structure of delegation token znode by splitting it in 2 parts, depending on a configurable split index. This limits the number of delegation token znodes returned in a single call while loading tokens state.
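The split described in the 1.3 to 1.4 and 1.4 to 1.5 notes can be illustrated with a small, self-contained sketch. It is not the code used by ZKRMStateStore: the class ZnodeSplitSketch, its methods, and the sample application ID are hypothetical, and the path layout is only inferred from the znode tree above (a split index of 0 is assumed to mean the ID is stored unsplit directly under the root).

```java
/** Hypothetical helper, not part of ZKRMStateStore. */
class ZnodeSplitSketch {

    /**
     * Splits an ID into {parent, leaf}, where leaf is the last splitIndex
     * characters. Returns a single-element array when no split applies.
     */
    static String[] split(String id, int splitIndex) {
        if (splitIndex <= 0 || splitIndex >= id.length()) {
            return new String[] {id};
        }
        int cut = id.length() - splitIndex;
        return new String[] {id.substring(0, cut), id.substring(cut)};
    }

    /** Path of an application znode relative to RM_APP_ROOT (assumed layout). */
    static String relativeAppPath(String appId, int splitIndex) {
        String[] parts = split(appId, splitIndex);
        if (parts.length == 1) {
            return parts[0];
        }
        return "HIERARCHIES/" + splitIndex + "/" + parts[0] + "/" + parts[1];
    }

    public static void main(String[] args) {
        String appId = "application_1352994193343_0001"; // sample ID, for illustration only
        for (int i = 0; i <= 4; i++) {
            System.out.println(i + " -> " + relativeAppPath(appId, i));
        }
    }
}
```

With a split index of 2, every parent znode in this sketch holds at most 100 leaf znodes for four-digit sequence numbers, which is the kind of bound on a single child listing that the notes describe.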
  • Field Details

    • ROOT_ZNODE_NAME

      @VisibleForTesting public static final String ROOT_ZNODE_NAME
    • CURRENT_VERSION_INFO

      protected static final org.apache.hadoop.yarn.server.records.Version CURRENT_VERSION_INFO
    • RM_APP_ROOT_HIERARCHIES

      @VisibleForTesting public static final String RM_APP_ROOT_HIERARCHIES
    • znodeWorkingPath

      @VisibleForTesting protected String znodeWorkingPath
    • delegationTokenNodeSplitIndex

      @VisibleForTesting protected int delegationTokenNodeSplitIndex
    • opDurations

      @VisibleForTesting protected ZKRMStateStoreOpDurations opDurations
  • Constructor Details

    • ZKRMStateStore

      public ZKRMStateStore()
  • Method Details

    • constructZkRootNodeACL

      @VisibleForTesting @Private @Unstable protected List<org.apache.zookeeper.data.ACL> constructZkRootNodeACL(org.apache.hadoop.conf.Configuration conf, List<org.apache.zookeeper.data.ACL> sourceACLs) throws NoSuchAlgorithmException
      Given the Configuration and ACLs used (sourceACLs) for ZooKeeper access, construct the ACLs for the store's root node. In the constructed ACL, all the users allowed by sourceACLs are given read-write-admin access, while the current RM has exclusive create-delete access. To be called only when HA is enabled and the configuration doesn't set an ACL for the root node.
      Parameters:
      conf - the configuration
      sourceACLs - the source ACLs
      Returns:
      ACLs for the store's root node
      Throws:
      NoSuchAlgorithmException - thrown if the digest algorithm used by ZooKeeper cannot be found
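The permission split described for constructZkRootNodeACL can be modeled without a ZooKeeper dependency. The sketch below is an assumption-laden illustration, not the method's implementation: RootAclSketch and rootNodeAcl are hypothetical names, identities are plain strings rather than ZooKeeper Id objects, and the permission constants are local copies of the bit values defined in org.apache.zookeeper.ZooDefs.Perms.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the root-node ACL layout; not the ZKRMStateStore code. */
class RootAclSketch {
    // Local copies of the bit values in org.apache.zookeeper.ZooDefs.Perms.
    static final int READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16;
    static final int READ_WRITE_ADMIN = READ | WRITE | ADMIN;
    static final int CREATE_DELETE = CREATE | DELETE;

    /**
     * Every source identity gets read-write-admin; only the current RM
     * identity (rmId, an assumed placeholder) gets create-delete.
     */
    static List<Map.Entry<String, Integer>> rootNodeAcl(List<String> sourceIds, String rmId) {
        List<Map.Entry<String, Integer>> acl = new ArrayList<>();
        for (String id : sourceIds) {
            acl.add(new SimpleEntry<String, Integer>(id, READ_WRITE_ADMIN));
        }
        acl.add(new SimpleEntry<String, Integer>(rmId, CREATE_DELETE));
        return acl;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : rootNodeAcl(Arrays.asList("alice", "bob"), "rm1")) {
            System.out.println(e.getKey() + " perms=" + e.getValue());
        }
    }
}
```

Because only one identity holds create-delete on the root node, another ResourceManager cannot create or delete children under it, which is what makes this layout usable for fencing.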
    • initInternal

      public void initInternal(org.apache.hadoop.conf.Configuration conf) throws IOException, NoSuchAlgorithmException
      Description copied from class: RMStateStore
      Derived classes initialize themselves using this method.
      Specified by:
      initInternal in class RMStateStore
      Parameters:
      conf - Configuration.
      Throws:
      IOException
      NoSuchAlgorithmException
    • startInternal

      public void startInternal() throws Exception
      Description copied from class: RMStateStore
      Derived classes start themselves using this method. The base class is started and the event dispatcher is ready to use at this point.
      Specified by:
      startInternal in class RMStateStore
      Throws:
      Exception - if an error occurs.
    • closeInternal

      protected void closeInternal() throws Exception
      Description copied from class: RMStateStore
      Derived classes close themselves using this method. The base class will be closed and the event dispatcher will be shut down after this.
      Specified by:
      closeInternal in class RMStateStore
      Throws:
      Exception - if an error occurs.
    • getCurrentVersion

      protected org.apache.hadoop.yarn.server.records.Version getCurrentVersion()
      Description copied from class: RMStateStore
      Get the current version of the underlying state store.
      Specified by:
      getCurrentVersion in class RMStateStore
      Returns:
      current version.
    • storeVersion

      protected void storeVersion() throws Exception
      Description copied from class: RMStateStore
      Derived classes use this method to store the version information.
      Specified by:
      storeVersion in class RMStateStore
      Throws:
      Exception - if an error occurs.
    • loadVersion

      protected org.apache.hadoop.yarn.server.records.Version loadVersion() throws Exception
      Description copied from class: RMStateStore
      Derived classes use this method to load the version information from the state store.
      Specified by:
      loadVersion in class RMStateStore
      Returns:
      current version.
      Throws:
      Exception - if an error occurs.
    • getAndIncrementEpoch

      public long getAndIncrementEpoch() throws Exception
      Description copied from class: RMStateStore
      Get the current epoch of RM and increment the value.
      Specified by:
      getAndIncrementEpoch in class RMStateStore
      Returns:
      current epoch.
      Throws:
      Exception - if an error occurs.
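The get-then-increment contract of getAndIncrementEpoch can be shown with an in-memory stand-in. This is only a sketch of the semantics: EpochSketch is a hypothetical class, a null field models a missing EPOCH_NODE, and the starting value of 0 is an assumption rather than something stated on this page.

```java
/** Hypothetical in-memory model of the epoch counter; not backed by ZooKeeper. */
class EpochSketch {
    private Long stored; // null models a store with no EPOCH_NODE yet

    /** Returns the epoch as it was before the call and persists epoch + 1. */
    synchronized long getAndIncrementEpoch() {
        long current = (stored == null) ? 0L : stored; // assumed starting value
        stored = current + 1;
        return current;
    }

    public static void main(String[] args) {
        EpochSketch store = new EpochSketch();
        System.out.println(store.getAndIncrementEpoch());
        System.out.println(store.getAndIncrementEpoch());
    }
}
```

The caller sees the pre-increment value, so each ResourceManager start observes a distinct epoch as long as the stored value survives restarts.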
    • loadState

      public RMStateStore.RMState loadState() throws Exception
      Description copied from class: RMStateStore
      Blocking API. The derived class must recover state from the store and return a new RMState object populated with that state. This must not be called on the dispatcher thread.
      Specified by:
      loadState in class RMStateStore
      Returns:
      RMState.
      Throws:
      Exception - if an error occurs.
    • storeApplicationStateInternal

      public void storeApplicationStateInternal(org.apache.hadoop.yarn.api.records.ApplicationId appId, ApplicationStateData appStateDataPB) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the state of an application.
      Specified by:
      storeApplicationStateInternal in class RMStateStore
      Parameters:
      appId - application Id.
      appStateDataPB - application StateData.
      Throws:
      Exception - if an error occurs.
    • updateApplicationStateInternal

      protected void updateApplicationStateInternal(org.apache.hadoop.yarn.api.records.ApplicationId appId, ApplicationStateData appStateDataPB) throws Exception
      Specified by:
      updateApplicationStateInternal in class RMStateStore
      Throws:
      Exception
    • storeApplicationAttemptStateInternal

      protected void storeApplicationAttemptStateInternal(org.apache.hadoop.yarn.api.records.ApplicationAttemptId appAttemptId, ApplicationAttemptStateData attemptStateDataPB) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the state of an application attempt.
      Specified by:
      storeApplicationAttemptStateInternal in class RMStateStore
      Parameters:
      appAttemptId - Application AttemptId.
      attemptStateDataPB - Application AttemptStateData.
      Throws:
      Exception - if an error occurs.
    • updateApplicationAttemptStateInternal

      protected void updateApplicationAttemptStateInternal(org.apache.hadoop.yarn.api.records.ApplicationAttemptId appAttemptId, ApplicationAttemptStateData attemptStateDataPB) throws Exception
      Specified by:
      updateApplicationAttemptStateInternal in class RMStateStore
      Throws:
      Exception
    • removeApplicationAttemptInternal

      protected void removeApplicationAttemptInternal(org.apache.hadoop.yarn.api.records.ApplicationAttemptId appAttemptId) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to remove the state of the specified attempt.
      Specified by:
      removeApplicationAttemptInternal in class RMStateStore
      Parameters:
      appAttemptId - application attempt id.
      Throws:
      Exception - exception occurs.
    • removeApplicationStateInternal

      protected void removeApplicationStateInternal(ApplicationStateData appState) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to remove the state of an application and its attempts.
      Specified by:
      removeApplicationStateInternal in class RMStateStore
      Parameters:
      appState - ApplicationStateData.
      Throws:
      Exception - error occurs.
    • storeRMDelegationTokenState

      protected void storeRMDelegationTokenState(org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the state of RMDelegationToken and sequence number.
      Specified by:
      storeRMDelegationTokenState in class RMStateStore
      Parameters:
      rmDTIdentifier - RMDelegationTokenIdentifier.
      renewDate - token renew date.
      Throws:
      Exception - if an error occurs.
    • removeRMDelegationTokenState

      protected void removeRMDelegationTokenState(org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier rmDTIdentifier) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to remove the state of RMDelegationToken.
      Specified by:
      removeRMDelegationTokenState in class RMStateStore
      Parameters:
      rmDTIdentifier - RMDelegationTokenIdentifier.
      Throws:
      Exception - error occurs.
    • updateRMDelegationTokenState

      protected void updateRMDelegationTokenState(org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to update the state of RMDelegationToken and sequence number.
      Specified by:
      updateRMDelegationTokenState in class RMStateStore
      Parameters:
      rmDTIdentifier - RMDelegationTokenIdentifier.
      renewDate - token renew date.
      Throws:
      Exception - error occurs.
    • storeRMDTMasterKeyState

      protected void storeRMDTMasterKeyState(org.apache.hadoop.security.token.delegation.DelegationKey delegationKey) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the state of DelegationToken Master Key.
      Specified by:
      storeRMDTMasterKeyState in class RMStateStore
      Parameters:
      delegationKey - DelegationToken Master Key.
      Throws:
      Exception - if an error occurs.
    • removeRMDTMasterKeyState

      protected void removeRMDTMasterKeyState(org.apache.hadoop.security.token.delegation.DelegationKey delegationKey) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to remove the state of DelegationToken Master Key.
      Specified by:
      removeRMDTMasterKeyState in class RMStateStore
      Parameters:
      delegationKey - DelegationKey.
      Throws:
      Exception - exception occurs.
    • deleteStore

      public void deleteStore() throws Exception
      Description copied from class: RMStateStore
      Derived classes must implement this method to delete the state store.
      Specified by:
      deleteStore in class RMStateStore
      Throws:
      Exception - exception occurs.
    • removeApplication

      public void removeApplication(org.apache.hadoop.yarn.api.records.ApplicationId removeAppId) throws Exception
      Description copied from class: RMStateStore
      Derived classes must implement this method to remove application from the state store.
      Specified by:
      removeApplication in class RMStateStore
      Parameters:
      removeAppId - application Id.
      Throws:
      Exception - exception occurs.
    • storeOrUpdateAMRMTokenSecretManagerState

      protected void storeOrUpdateAMRMTokenSecretManagerState(AMRMTokenSecretManagerState amrmTokenSecretManagerState, boolean isUpdate) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store or update the state of AMRMToken Master Key.
      Specified by:
      storeOrUpdateAMRMTokenSecretManagerState in class RMStateStore
      Parameters:
      amrmTokenSecretManagerState - amrmTokenSecretManagerState.
      isUpdate - true to update the existing state; false to store it as new state.
      Throws:
      Exception - exception occurs.
    • removeReservationState

      protected void removeReservationState(String planName, String reservationIdName) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to remove the state of a reservation allocation.
      Specified by:
      removeReservationState in class RMStateStore
      Parameters:
      planName - plan Name.
      reservationIdName - reservationId Name.
      Throws:
      Exception - exception occurs.
    • storeReservationState

      protected void storeReservationState(org.apache.hadoop.yarn.proto.YarnProtos.ReservationAllocationStateProto reservationAllocation, String planName, String reservationIdName) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the state of a reservation allocation.
      Specified by:
      storeReservationState in class RMStateStore
      Parameters:
      reservationAllocation - reservation Allocation.
      planName - plan Name.
      reservationIdName - reservationId Name.
      Throws:
      Exception - error occurs.
    • storeProxyCACertState

      protected void storeProxyCACertState(X509Certificate caCert, PrivateKey caPrivateKey) throws Exception
      Description copied from class: RMStateStore
      Blocking API. Derived classes must implement this method to store the CA Certificate and Private Key.
      Specified by:
      storeProxyCACertState in class RMStateStore
      Parameters:
      caCert - X509Certificate.
      caPrivateKey - PrivateKey.
      Throws:
      Exception - error occurs.
    • safeDeleteAndCheckNode

      public void safeDeleteAndCheckNode(String path, List<org.apache.zookeeper.data.ACL> fencingACL, String fencingPath) throws Exception
      Deletes the path more safely. If a NoNodeException is encountered because the node does not exist, the exception is ignored, which avoids triggering a ResourceManager failover and its wider impact on the cluster.
      Parameters:
      path - Path to be deleted.
      fencingACL - fencingACL.
      fencingPath - fencingNodePath.
      Throws:
      Exception - if any problem occurs while performing deletion.
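The "ignore a missing node" behavior of safeDeleteAndCheckNode can be sketched against an in-memory set of paths. Everything here is a stand-in: SafeDeleteSketch, its methods, and the nested NoNodeException are hypothetical (the real exception is ZooKeeper's KeeperException.NoNodeException), and the fencingACL and fencingPath parameters are left out entirely.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of an idempotent delete; no ZooKeeper involved. */
class SafeDeleteSketch {
    /** Stand-in for ZooKeeper's KeeperException.NoNodeException. */
    static class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("no node: " + path);
        }
    }

    private final Set<String> nodes = new HashSet<>();

    void create(String path) {
        nodes.add(path);
    }

    boolean exists(String path) {
        return nodes.contains(path);
    }

    /** Strict delete: fails when the node is missing. */
    void delete(String path) throws NoNodeException {
        if (!nodes.remove(path)) {
            throw new NoNodeException(path);
        }
    }

    /** Returns true if the node was deleted, false if it was already absent. */
    boolean safeDelete(String path) {
        try {
            delete(path);
            return true;
        } catch (NoNodeException e) {
            if (!exists(path)) {
                return false; // already gone: treat the delete as a no-op
            }
            throw new IllegalStateException("node still exists after failed delete", e);
        }
    }

    public static void main(String[] args) {
        SafeDeleteSketch store = new SafeDeleteSketch();
        store.create("/rmstore/app_1");
        System.out.println(store.safeDelete("/rmstore/app_1"));
        System.out.println(store.safeDelete("/rmstore/app_1"));
    }
}
```

Treating "already absent" as success makes a retried delete harmless, so a repeated removal does not escalate into a fatal state-store error.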