Class DFSUtil

java.lang.Object
org.apache.hadoop.hdfs.DFSUtil

@Private public class DFSUtil extends Object
  • Field Details

    • LOG

      public static final org.slf4j.Logger LOG
    • helpOptions

      public static final org.apache.commons.cli.Options helpOptions
    • helpOpt

      public static final org.apache.commons.cli.Option helpOpt
  • Method Details

    • getSecureRandom

      public static SecureRandom getSecureRandom()
      Returns:
      a secure pseudo-random number generator.
    • isValidName

      public static boolean isValidName(String src)
      Whether the pathname is valid. Currently prohibits relative paths, names which contain a ":" or "//", or other non-canonical paths.
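As a JDK-only illustration of the rules described above (absolute paths required; ":" and "//" prohibited), the following sketch checks a pathname. The class name is for illustration only; this is not the Hadoop implementation, which also rejects other non-canonical forms.

```java
// Illustrative sketch of the documented validity rules; not the real DFSUtil.
public class PathNameCheck {
    public static boolean isValidName(String src) {
        if (src == null || !src.startsWith("/") || src.contains("//")) {
            return false;  // relative paths and "//" are prohibited
        }
        for (String component : src.split("/")) {
            if (component.contains(":")
                || component.equals(".") || component.equals("..")) {
                return false;  // ':' and relative components are prohibited
            }
        }
        return true;
    }
}
```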
    • isValidNameForComponent

      public static boolean isValidNameForComponent(String component)
      Checks if a string is a valid path component. For instance, components cannot contain a ":" or "/", and cannot be equal to a reserved component like ".snapshot".

      The primary use of this method is for validating paths when loading the FSImage. During normal NN operation, paths are sometimes allowed to contain reserved components.

      Returns:
      If component is valid
    • isReservedPathComponent

      public static boolean isReservedPathComponent(String component)
      Returns if the component is reserved.

      Note that some components are only reserved under certain directories, e.g. "/.reserved" is reserved, while "/hadoop/.reserved" is not.

      Returns:
      true, if the component is reserved
    • bytes2String

      public static String bytes2String(byte[] bytes)
      Converts a byte array to a string using UTF-8 encoding.
    • bytes2String

      public static String bytes2String(byte[] bytes, int offset, int length)
      Decodes a specific range of bytes of the given byte array to a string using UTF-8.
      Parameters:
      bytes - The bytes to be decoded into characters
      offset - The index of the first byte to decode
      length - The number of bytes to decode
      Returns:
      The decoded string
    • string2Bytes

      public static byte[] string2Bytes(String str)
      Converts a string to a byte array using UTF-8 encoding.
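The UTF-8 conversions above can be sketched with the JDK alone. The class name below is for illustration; the real methods have additional HDFS-specific handling.

```java
import java.nio.charset.StandardCharsets;

// Minimal sketch of the documented UTF-8 conversions, JDK only.
public class Utf8Codec {
    public static byte[] string2Bytes(String str) {
        return str.getBytes(StandardCharsets.UTF_8);
    }
    public static String bytes2String(byte[] bytes) {
        return bytes2String(bytes, 0, bytes.length);
    }
    // Decodes the range [offset, offset + length) of the array.
    public static String bytes2String(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }
}
```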
    • byteArray2PathString

      public static String byteArray2PathString(byte[][] components, int offset, int length)
      Given a list of path components, returns the combined path as a UTF-8 string.
    • byteArray2PathString

      public static String byteArray2PathString(byte[][] pathComponents)
    • strings2PathString

      public static String strings2PathString(String[] components)
      Converts a list of path components into a path using Path.SEPARATOR.
      Parameters:
      components - Path components
      Returns:
      Combined path as a UTF-8 string
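A minimal sketch of the join described above, assuming Path.SEPARATOR is "/" (its documented value). Edge cases such as empty input may differ in the real method.

```java
// Sketch: join path components with '/' (the value of Path.SEPARATOR).
public class PathJoin {
    public static String strings2PathString(String[] components) {
        return String.join("/", components);
    }
}
```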
    • path2String

      public static String path2String(Object path)
      Convert an object representing a path to a string.
    • getPathComponents

      public static byte[][] getPathComponents(String path)
      Converts a UTF-8 path string into an array of byte arrays, one per path component.
    • bytes2byteArray

      public static byte[][] bytes2byteArray(byte[] bytes, byte separator)
      Splits an array of bytes into an array of byte arrays on the given separator byte.
      Parameters:
      bytes - the array of bytes to split
      separator - the delimiting byte
    • bytes2byteArray

      public static byte[][] bytes2byteArray(byte[] bytes, int len, byte separator)
      Splits the first len bytes of bytes into an array of byte arrays on the given separator byte.
      Parameters:
      bytes - the byte array to split
      len - the number of bytes to split
      separator - the delimiting byte
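The split described above can be sketched as a plain byte-level tokenizer. This keeps empty segments, mirroring a plain split; the real method has HDFS-specific special cases this sketch does not reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split the first len bytes on a separator byte, keeping empty segments.
public class ByteSplit {
    public static byte[][] bytes2byteArray(byte[] bytes, int len, byte separator) {
        List<byte[]> result = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < len; i++) {
            if (bytes[i] == separator) {
                result.add(Arrays.copyOfRange(bytes, start, i));
                start = i + 1;
            }
        }
        result.add(Arrays.copyOfRange(bytes, start, len));
        return result.toArray(new byte[0][]);
    }
}
```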
    • addKeySuffixes

      public static String addKeySuffixes(String key, String... suffixes)
      Returns a configuration key of the form key.suffix1.suffix2...suffixN.
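The key construction described above amounts to dot-joining the suffixes onto the base key. A sketch (skipping null suffixes is an assumption, not taken from this documentation):

```java
// Sketch: build key.suffix1.suffix2...suffixN, skipping null suffixes.
public class KeySuffix {
    public static String addKeySuffixes(String key, String... suffixes) {
        StringBuilder sb = new StringBuilder(key);
        for (String suffix : suffixes) {
            if (suffix != null) {
                sb.append('.').append(suffix);
            }
        }
        return sb.toString();
    }
}
```

This is how HA/federation settings such as dfs.namenode.rpc-address.ns1.nn1 are typically formed from a generic key plus nameservice and namenode IDs.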
    • getRpcAddressesForNameserviceId

      public static Map<String,InetSocketAddress> getRpcAddressesForNameserviceId(org.apache.hadoop.conf.Configuration conf, String nsId, String defaultValue)
      Get all of the RPC addresses of the individual NNs in a given nameservice.
      Parameters:
      conf - Configuration
      nsId - the nameservice whose NNs addresses we want.
      defaultValue - default address to return in case key is not found.
      Returns:
      A map from nnId -> RPC address of each NN in the nameservice.
    • getAllNnPrincipals

      public static Set<String> getAllNnPrincipals(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns:
      a collection of all configured NN Kerberos principals.
      Throws:
      IOException
    • getJournalNodeAddresses

      public static Set<String> getJournalNodeAddresses(org.apache.hadoop.conf.Configuration conf) throws URISyntaxException, IOException
      Returns the set of JournalNode addresses from the configuration.
      Parameters:
      conf - configuration
      Returns:
      set of JournalNode host names
      Throws:
      URISyntaxException
      IOException
    • getBackupNodeAddresses

      public static Map<String,Map<String,InetSocketAddress>> getBackupNodeAddresses(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns the InetSocketAddresses corresponding to backup node RPC addresses from the configuration.
      Parameters:
      conf - configuration
      Returns:
      list of InetSocketAddresses
      Throws:
      IOException - on error
    • getSecondaryNameNodeAddresses

      public static Map<String,Map<String,InetSocketAddress>> getSecondaryNameNodeAddresses(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns the InetSocketAddresses corresponding to secondary NameNode HTTP addresses from the configuration.
      Parameters:
      conf - configuration
      Returns:
      list of InetSocketAddresses
      Throws:
      IOException - on error
    • getNNServiceRpcAddresses

      public static Map<String,Map<String,InetSocketAddress>> getNNServiceRpcAddresses(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns list of InetSocketAddresses corresponding to namenodes from the configuration. Returns namenode address specifically configured for datanodes (using service ports), if found. If not, regular RPC address configured for other clients is returned.
      Parameters:
      conf - configuration
      Returns:
      list of InetSocketAddress
      Throws:
      IOException - on error
    • getNNServiceRpcAddressesForCluster

      public static Map<String,Map<String,InetSocketAddress>> getNNServiceRpcAddressesForCluster(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns list of InetSocketAddresses corresponding to the namenode that manages this cluster. Note this is to be used by datanodes to get the list of namenode addresses to talk to. Returns namenode address specifically configured for datanodes (using service ports), if found. If not, regular RPC address configured for other clients is returned.
      Parameters:
      conf - configuration
      Returns:
      list of InetSocketAddress
      Throws:
      IOException - on error
    • getNNLifelineRpcAddressesForCluster

      public static Map<String,Map<String,InetSocketAddress>> getNNLifelineRpcAddressesForCluster(org.apache.hadoop.conf.Configuration conf) throws IOException
      Returns list of InetSocketAddresses corresponding to lifeline RPC servers at namenodes from the configuration.
      Parameters:
      conf - configuration
      Returns:
      list of InetSocketAddress
      Throws:
      IOException - on error
    • getNamenodeLifelineAddr

      public static String getNamenodeLifelineAddr(org.apache.hadoop.conf.Configuration conf, String nsId, String nnId)
      Map a logical namenode ID to its lifeline address. Use the given nameservice if specified, or the configured one if none is given.
      Parameters:
      conf - Configuration
      nsId - which nameservice nnId is a part of, optional
      nnId - the namenode ID to get the service addr for
      Returns:
      the lifeline addr, null if it could not be determined
    • flattenAddressMap

      public static List<DFSUtil.ConfiguredNNAddress> flattenAddressMap(Map<String,Map<String,InetSocketAddress>> map)
      Flatten the given map, as returned by other functions in this class, into a flat list of DFSUtil.ConfiguredNNAddress instances.
    • addressMapToString

      public static String addressMapToString(Map<String,Map<String,InetSocketAddress>> map)
      Format the given map, as returned by other functions in this class, into a string suitable for debugging display. The format of this string should not be considered an interface, and is liable to change.
    • nnAddressesAsString

      public static String nnAddressesAsString(org.apache.hadoop.conf.Configuration conf)
    • getInternalNsRpcUris

      public static Collection<URI> getInternalNsRpcUris(org.apache.hadoop.conf.Configuration conf)
      Get a URI for each internal nameservice. If a nameservice is HA-enabled, and the configured failover proxy provider supports logical URIs, then the logical URI of the nameservice is returned. Otherwise, a URI corresponding to an RPC address of the single NN for that nameservice is returned, preferring the service RPC address over the client RPC address.
      Parameters:
      conf - configuration
      Returns:
      a collection of all configured NN URIs, preferring service addresses
    • getNameServiceIdFromAddress

      public static String getNameServiceIdFromAddress(org.apache.hadoop.conf.Configuration conf, InetSocketAddress address, String... keys)
      Given the InetSocketAddress this method returns the nameservice Id corresponding to the key with matching address, by doing a reverse lookup on the list of nameservices until it finds a match. Since the process of resolving URIs to Addresses is slightly expensive, this utility method should not be used in performance-critical routines.
      Parameters:
      conf - - configuration
      address - - InetSocketAddress for configured communication with NN. Configured addresses are typically given as URIs, but we may have to compare against a URI typed in by a human, or the server name may be aliased, so we compare unambiguous InetSocketAddresses instead of just comparing URI substrings.
      keys - - list of configured communication parameters that should be checked for matches. For example, to compare against RPC addresses, provide the list DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_RPC_ADDRESS_KEY. Use the generic parameter keys, not the NameServiceId-suffixed keys.
      Returns:
      nameserviceId, or null if no match found
    • getInfoServer

      public static URI getInfoServer(InetSocketAddress namenodeAddr, org.apache.hadoop.conf.Configuration conf, String scheme) throws IOException
      Returns the server HTTP or HTTPS address from the configuration for a given namenode RPC address.
      Parameters:
      namenodeAddr - - namenode RPC address
      conf - configuration
      scheme - - the scheme (http / https)
      Returns:
      server http or https address
      Throws:
      IOException
    • getInfoServerWithDefaultHost

      public static URI getInfoServerWithDefaultHost(String defaultHost, org.apache.hadoop.conf.Configuration conf, String scheme) throws IOException
      Looks up the HTTP / HTTPS address of the namenode, replacing its hostname with defaultHost when the configured address is a wildcard or local address.
      Parameters:
      defaultHost - The default host name of the namenode.
      conf - The configuration
      scheme - HTTP or HTTPS
      Throws:
      IOException
    • getHttpClientScheme

      public static String getHttpClientScheme(org.apache.hadoop.conf.Configuration conf)
      Determine whether HTTP or HTTPS should be used to connect to the remote server. Currently the client only connects to the server via HTTPS if the policy is set to HTTPS_ONLY.
      Returns:
      the scheme (HTTP / HTTPS)
    • setGenericConf

      public static void setGenericConf(org.apache.hadoop.conf.Configuration conf, String nameserviceId, String nnId, String... keys)
      Sets the node specific setting into generic configuration key. Looks up value of "key.nameserviceId.namenodeId" and if found sets that value into generic key in the conf. If this is not found, falls back to "key.nameserviceId" and then the unmodified key. Note that this only modifies the runtime conf.
      Parameters:
      conf - Configuration object to lookup specific key and to set the value to the key passed. Note the conf object is modified.
      nameserviceId - nameservice Id to construct the node specific key. Pass null if federation is not configured.
      nnId - namenode Id to construct the node specific key. Pass null if HA is not configured.
      keys - The key for which node specific value is looked up
    • roundBytesToGB

      public static int roundBytesToGB(long bytes)
      Round bytes to GiB (gibibyte)
      Parameters:
      bytes - number of bytes
      Returns:
      number of GiB
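A sketch of the rounding described above, assuming conventional rounding to the nearest gibibyte (2^30 bytes); the rounding mode is an assumption, not stated in this documentation.

```java
// Sketch: round a byte count to the nearest GiB (2^30 bytes).
public class GbRound {
    public static int roundBytesToGB(long bytes) {
        return Math.round((float) bytes / (1024 * 1024 * 1024));
    }
}
```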
    • getNamenodeNameServiceId

      public static String getNamenodeNameServiceId(org.apache.hadoop.conf.Configuration conf)
      Get nameservice Id for the NameNode based on namenode RPC address matching the local node address.
    • getBackupNameServiceId

      public static String getBackupNameServiceId(org.apache.hadoop.conf.Configuration conf)
      Get nameservice Id for the BackupNode based on backup node RPC address matching the local node address.
    • getSecondaryNameServiceId

      public static String getSecondaryNameServiceId(org.apache.hadoop.conf.Configuration conf)
      Get nameservice Id for the secondary node based on secondary http address matching the local node address.
    • getBindAddress

      public static InetSocketAddress getBindAddress(org.apache.hadoop.conf.Configuration conf, String confKey, String defaultValue, String bindHostKey)
      Determine the InetSocketAddress to bind to, for any service. In case of HA or federation, the address is assumed to be specified as confKey.NAMESPACEID.NAMENODEID as appropriate.
      Parameters:
      conf - configuration.
      confKey - configuration key (prefix if HA/federation) used to specify the address for the service.
      defaultValue - default value for the address.
      bindHostKey - configuration key (prefix if HA/federation) specifying host to bind to.
      Returns:
      the address to bind to.
    • createUri

      public static URI createUri(String scheme, InetSocketAddress address)
      Creates a URI from a scheme and an address.
    • createUri

      public static URI createUri(String scheme, String host, int port)
      Creates a URI from a scheme, host, and port.
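Both overloads can be sketched with java.net.URI alone. Wrapping URISyntaxException in an unchecked exception is an assumption of this sketch; the class name is for illustration.

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: build a URI from scheme + host + port, or scheme + socket address.
public class UriBuild {
    public static URI createUri(String scheme, String host, int port) {
        try {
            // userInfo, path, query, and fragment are left null.
            return new URI(scheme, null, host, port, null, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
    public static URI createUri(String scheme, InetSocketAddress address) {
        return createUri(scheme, address.getHostName(), address.getPort());
    }
}
```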
    • addInternalPBProtocol

      @Private @Unstable public static void addInternalPBProtocol(org.apache.hadoop.conf.Configuration conf, Class<?> protocol, org.apache.hadoop.thirdparty.protobuf.BlockingService service, org.apache.hadoop.ipc.RPC.Server server) throws IOException
      Add protobuf based protocol to the RPC.Server. This method is for exclusive use by the hadoop libraries, as its signature changes with the version of the shaded protobuf library it has been built with.
      Parameters:
      conf - configuration
      protocol - Protocol interface
      service - service that implements the protocol
      server - RPC server to which the protocol & implementation are added
      Throws:
      IOException - failure
    • addPBProtocol

      @Deprecated public static void addPBProtocol(org.apache.hadoop.conf.Configuration conf, Class<?> protocol, org.apache.hadoop.thirdparty.protobuf.BlockingService service, org.apache.hadoop.ipc.RPC.Server server) throws IOException
      Deprecated.
      Add protobuf based protocol to the RPC.Server. Deprecated as it will only reliably compile if an unshaded protobuf library is also on the classpath.
      Parameters:
      conf - configuration
      protocol - Protocol interface
      service - service that implements the protocol
      server - RPC server to which the protocol & implementation are added
      Throws:
      IOException
    • addPBProtocol

      @Deprecated public static void addPBProtocol(org.apache.hadoop.conf.Configuration conf, Class<?> protocol, com.google.protobuf.BlockingService service, org.apache.hadoop.ipc.RPC.Server server) throws IOException
      Deprecated.
      Add protobuf based protocol to the RPC.Server. This engine uses Protobuf 2.5.0. Recommended to upgrade to Protobuf 3.x from hadoop-thirdparty and use addInternalPBProtocol(Configuration, Class, BlockingService, RPC.Server).
      Parameters:
      conf - configuration
      protocol - Protocol interface
      service - service that implements the protocol
      server - RPC server to which the protocol & implementation are added
      Throws:
      IOException
    • getNamenodeServiceAddr

      public static String getNamenodeServiceAddr(org.apache.hadoop.conf.Configuration conf, String nsId, String nnId)
      Map a logical namenode ID to its service address. Use the given nameservice if specified, or the configured one if none is given.
      Parameters:
      conf - Configuration
      nsId - which nameservice nnId is a part of, optional
      nnId - the namenode ID to get the service addr for
      Returns:
      the service addr, null if it could not be determined
    • getNamenodeWebAddr

      public static String getNamenodeWebAddr(org.apache.hadoop.conf.Configuration conf, String nsId, String nnId)
      Map a logical namenode ID to its web address. Use the given nameservice if specified, or the configured one if none is given.
      Parameters:
      conf - Configuration
      nsId - which nameservice nnId is a part of, optional
      nnId - the namenode ID to get the service addr for
      Returns:
      the service addr, null if it could not be determined
    • getWebAddressesForNameserviceId

      public static Map<String,InetSocketAddress> getWebAddressesForNameserviceId(org.apache.hadoop.conf.Configuration conf, String nsId, String defaultValue)
      Get all of the Web addresses of the individual NNs in a given nameservice.
      Parameters:
      conf - Configuration
      nsId - the nameservice whose NNs addresses we want.
      defaultValue - default address to return in case key is not found.
      Returns:
      A map from nnId -> Web address of each NN in the nameservice.
    • getOnlyNameServiceIdOrNull

      public static String getOnlyNameServiceIdOrNull(org.apache.hadoop.conf.Configuration conf)
      If the configuration refers to only a single nameservice, return the name of that nameservice. If it refers to 0 or more than 1, return null.
    • parseHelpArgument

      public static boolean parseHelpArgument(String[] args, String helpDescription, PrintStream out, boolean printGenericCommandUsage)
      Parse the arguments for commands
      Parameters:
      args - the arguments to be parsed
      helpDescription - help information to be printed out
      out - Printer
      printGenericCommandUsage - whether to print the generic command usage defined in ToolRunner
      Returns:
      true when the argument matches help option, false if not
    • getInvalidateWorkPctPerIteration

      public static float getInvalidateWorkPctPerIteration(org.apache.hadoop.conf.Configuration conf)
      Get DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION from configuration.
      Parameters:
      conf - Configuration
      Returns:
      Value of DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION
    • getReplWorkMultiplier

      public static int getReplWorkMultiplier(org.apache.hadoop.conf.Configuration conf)
      Get DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION from configuration.
      Parameters:
      conf - Configuration
      Returns:
      Value of DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION
    • getSpnegoKeytabKey

      public static String getSpnegoKeytabKey(org.apache.hadoop.conf.Configuration conf, String defaultKey)
      Get SPNEGO keytab Key from configuration
      Parameters:
      conf - Configuration
      defaultKey - default key to be used for config lookup
      Returns:
      DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY if its value is not empty; otherwise defaultKey
    • getHttpPolicy

      public static org.apache.hadoop.http.HttpConfig.Policy getHttpPolicy(org.apache.hadoop.conf.Configuration conf)
      Get http policy.
    • loadSslConfToHttpServerBuilder

      public static org.apache.hadoop.http.HttpServer2.Builder loadSslConfToHttpServerBuilder(org.apache.hadoop.http.HttpServer2.Builder builder, org.apache.hadoop.conf.Configuration sslConf)
    • getPassword

      public static String getPassword(org.apache.hadoop.conf.Configuration conf, String alias)
      Leverages the Configuration.getPassword method to attempt to get passwords from the CredentialProvider API before falling back to clear text in config - if falling back is allowed.
      Parameters:
      conf - Configuration instance
      alias - name of the credential to retrieve
      Returns:
      String credential value or null
    • dateToIso8601String

      public static String dateToIso8601String(Date date)
      Converts a Date into an ISO-8601 formatted datetime string.
    • durationToString

      public static String durationToString(long durationMs)
      Converts a time duration in milliseconds into DDD:HH:MM:SS format.
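A sketch of the DDD:HH:MM:SS formatting described above. The zero-padding widths and the handling of negative durations are assumptions based on the documented format, not taken from the implementation.

```java
// Sketch: format a millisecond duration as DDD:HH:MM:SS.
public class DurationFmt {
    public static String durationToString(long durationMs) {
        String sign = durationMs < 0 ? "-" : "";
        long secs = Math.abs(durationMs) / 1000;
        long days = secs / 86400;
        long hours = (secs % 86400) / 3600;
        long minutes = (secs % 3600) / 60;
        long seconds = secs % 60;
        return String.format("%s%03d:%02d:%02d:%02d",
            sign, days, hours, minutes, seconds);
    }
}
```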
    • parseRelativeTime

      public static long parseRelativeTime(String relTime) throws IOException
      Converts a relative time string into a duration in milliseconds.
      Throws:
      IOException
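A sketch of parsing a relative time string. The accepted syntax here, a number followed by a unit letter s, m, h, or d (e.g. "30s", "2d"), is an assumption; the real method declares IOException, while this sketch throws an unchecked exception to stay self-contained.

```java
// Sketch: parse "<number><unit>" into milliseconds; units s/m/h/d assumed.
public class RelTime {
    public static long parseRelativeTime(String relTime) {
        if (relTime == null || relTime.length() < 2) {
            throw new IllegalArgumentException(
                "Unable to parse relative time value: " + relTime);
        }
        long value = Long.parseLong(relTime.substring(0, relTime.length() - 1));
        switch (relTime.charAt(relTime.length() - 1)) {
            case 's': return value * 1000L;
            case 'm': return value * 60L * 1000L;
            case 'h': return value * 60L * 60L * 1000L;
            case 'd': return value * 24L * 60L * 60L * 1000L;
            default:
                throw new IllegalArgumentException(
                    "Unknown time unit in: " + relTime);
        }
    }
}
```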
    • loadSslConfiguration

      public static org.apache.hadoop.conf.Configuration loadSslConfiguration(org.apache.hadoop.conf.Configuration conf)
      Load HTTPS-related configuration.
    • getHttpServerTemplate

      public static org.apache.hadoop.http.HttpServer2.Builder getHttpServerTemplate(org.apache.hadoop.conf.Configuration conf, InetSocketAddress httpAddr, InetSocketAddress httpsAddr, String name, String spnegoUserNameKey, String spnegoKeytabFileKey) throws IOException
      Return an HttpServer2.Builder that the JournalNode, NameNode, secondary NameNode, Router, or Balancer can use to initialize their HTTP / HTTPS server.
      Throws:
      IOException
    • assertAllResultsEqual

      public static void assertAllResultsEqual(Collection<?> objects) throws AssertionError
      Assert that all objects in the collection are equal. Returns silently if so, throws an AssertionError if any object is not equal. All null values are considered equal.
      Parameters:
      objects - the collection of objects to check for equality.
      Throws:
      AssertionError
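The check described above can be sketched by comparing every element against the first, with Objects.equals treating two nulls as equal. The class name is for illustration.

```java
import java.util.Collection;
import java.util.Objects;

// Sketch: throw AssertionError unless all elements are equal (nulls equal).
public class AllEqual {
    public static void assertAllResultsEqual(Collection<?> objects) {
        Object first = null;
        boolean haveFirst = false;
        for (Object o : objects) {
            if (!haveFirst) {
                first = o;
                haveFirst = true;
            } else if (!Objects.equals(first, o)) {
                throw new AssertionError(
                    "Not all elements match: " + first + " != " + o);
            }
        }
    }
}
```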
    • createKeyProviderCryptoExtension

      public static org.apache.hadoop.crypto.key.KeyProviderCryptoExtension createKeyProviderCryptoExtension(org.apache.hadoop.conf.Configuration conf) throws IOException
      Creates a new KeyProviderCryptoExtension by wrapping the KeyProvider specified in the given Configuration.
      Parameters:
      conf - Configuration
      Returns:
      new KeyProviderCryptoExtension, or null if no provider was found.
      Throws:
      IOException - if the KeyProvider is improperly specified in the Configuration
    • decodeDelegationToken

      public static org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier decodeDelegationToken(org.apache.hadoop.security.token.Token<org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier> token) throws IOException
      Decodes an HDFS delegation token to its identifier.
      Parameters:
      token - the token
      Returns:
      the decoded identifier.
      Throws:
      IOException
    • checkProtectedDescendants

      public static void checkProtectedDescendants(FSDirectory fsd, INodesInPath iip) throws org.apache.hadoop.security.AccessControlException, org.apache.hadoop.fs.UnresolvedLinkException, org.apache.hadoop.fs.ParentNotDirectoryException
      Throw if the given directory has any non-empty protected descendants (including itself).
      Parameters:
      fsd - the namespace tree.
      iip - directory whose descendants are to be checked.
      Throws:
      org.apache.hadoop.security.AccessControlException - if a non-empty protected descendant was found.
      org.apache.hadoop.fs.ParentNotDirectoryException
      org.apache.hadoop.fs.UnresolvedLinkException
    • getFlags

      public static EnumSet<org.apache.hadoop.hdfs.protocol.HdfsFileStatus.Flags> getFlags(boolean isEncrypted, boolean isErasureCoded, boolean isSnapShottable, boolean hasAcl)
      Generates HdfsFileStatus flags.
      Parameters:
      isEncrypted - Sets HAS_CRYPT
      isErasureCoded - Sets HAS_EC
      isSnapShottable - Sets SNAPSHOT_ENABLED
      hasAcl - Sets HAS_ACL
      Returns:
      HdfsFileStatus Flags
    • isParentEntry

      public static boolean isParentEntry(String path, String parent)
      Check if the given path is the child of parent path.
      Parameters:
      path - Path to be checked.
      parent - Parent path.
      Returns:
      True if parent path is parent entry for given path.
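A sketch of the parent-entry check. It guards against false prefix matches like "/ab" under "/a" by requiring a '/' after the parent prefix; treating an equal path as its own parent entry is an assumption of this sketch.

```java
// Sketch: is `path` under `parent` (or equal to it)?
public class ParentCheck {
    public static boolean isParentEntry(String path, String parent) {
        if (!path.startsWith(parent)) {
            return false;
        }
        return path.length() == parent.length()   // equal paths match
            || parent.equals("/")                 // everything is under root
            || path.charAt(parent.length()) == '/';
    }
}
```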
    • addTransferRateMetric

      public static void addTransferRateMetric(DataNodeMetrics metrics, long read, long durationInNS)
      Add transfer rate metrics in bytes per second.
      Parameters:
      metrics - metrics for datanodes
      read - bytes read
      durationInNS - read duration in nanoseconds
    • getTransferRateInBytesPerSecond

      public static long getTransferRateInBytesPerSecond(long bytes, long durationInNS)
      Calculates the transfer rate in bytes per second. The read duration is kept in nanoseconds for precision on transfers that take only a few nanoseconds; durations below 1 ns are treated as 1 ns so that sub-nanosecond reads are still captured. To avoid overflow, the bytes read are not multiplied by 10^9; instead the duration is first converted to seconds as a double (preserving the fractional part for short durations), and the bytes read are divided by that value to obtain the rate. A negative value for transferred bytes is replaced with 0.
      Parameters:
      bytes - bytes read
      durationInNS - read duration in nanoseconds
      Returns:
      bytes per second
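The steps described above (clamp the duration to at least 1 ns, clamp negative byte counts to 0, convert the duration to seconds as a double, then divide) can be sketched as:

```java
// Sketch of the documented rate calculation, avoiding overflow by
// converting the duration to seconds rather than scaling the byte count.
public class TransferRate {
    public static long getTransferRateInBytesPerSecond(long bytes,
                                                       long durationInNS) {
        bytes = Math.max(bytes, 0);           // negative byte counts -> 0
        durationInNS = Math.max(durationInNS, 1);  // sub-ns reads -> 1 ns
        double durationInSeconds = durationInNS / 1_000_000_000.0;
        return (long) (bytes / durationInSeconds);
    }
}
```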