Setting Up Hadoop Credential Provider API

Security is a primary concern today. When a product is deployed on premises, the application has to be given a few secrets, the most basic example being a database password. Most organisations are no longer willing to keep such values in cleartext in a configuration file and want them encrypted instead. An encrypted store of this kind is commonly called a vault.

Here I’ve prepared a working vault using the Hadoop Credential Provider API.

Passwordless

This command creates the hdfs.jceks keystore on HDFS, so there is no need to localise it:

HDFS: Create alias and save password
hadoop credential create db.password -value db_123 -provider jceks://hdfs/credentials/hdfs.jceks

This command creates the file.jceks keystore on the local file system:

FS: Create alias and save password
hadoop credential create db.password -value db_123 -provider jceks://file/credentials/file.jceks
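
To confirm that the alias was stored, the keystore can be listed with the `list` subcommand of the same CLI. The sketch below assumes the HDFS provider path used above and that `hadoop` is on the PATH; the guard is only there so the snippet does nothing harmful on a machine without Hadoop.

```shell
# Provider path taken from the HDFS create command above.
PROVIDER="jceks://hdfs/credentials/hdfs.jceks"

if command -v hadoop >/dev/null 2>&1; then
  # List every alias stored in the keystore; db.password should be among them.
  hadoop credential list -provider "$PROVIDER"
else
  echo "hadoop CLI not found on PATH; skipping" >&2
fi
```

Note that passing the secret with `-value` leaves it in the shell history. If `-value` is omitted, `hadoop credential create` prompts for the secret instead.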

Java API to access the password:

Fetch password using credential API
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.alias.CredentialProviderFactory

object HC {

  def main(args: Array[String]): Unit = {

    // Local-file alternative:
    // val path = "jceks://file/home/ec2-user/example/file.jceks"
    val path = "jceks://hdfs/credentials/hdfs.jceks"
    val conf = new Configuration()

    // Point Hadoop at the keystore created above.
    conf.set(CredentialProviderFactory.CREDENTIAL_PROVIDER_PATH, path)

    // Only one provider path is configured, so take the first provider.
    val credentialProvider = CredentialProviderFactory.getProviders(conf).get(0)
    println(credentialProvider.getAliases)
    // getCredential returns an Array[Char]; mkString joins it into a String.
    val password = credentialProvider.getCredentialEntry("db.password").getCredential.mkString

    // Printed for demonstration only.
    println(password)
  }

}
Output:
[db.password, aws.secret.key.password]
db_123

With Password

  1. Set the password using an environment variable:

    Set Password
    export HADOOP_CREDSTORE_PASSWORD=TEST-password@12
  2. Alternatively, put the password in a file and make that file available on the Hadoop classpath.
    1. The file name is given by the `hadoop.security.credstore.java-keystore-provider.password-file` property. Hadoop searches the classpath for a file with that name and reads the password from it.
    2. HDFS: Create alias and save password
      hadoop credential -Dhadoop.security.credstore.java-keystore-provider.password-file=hdfs.jceks.password create db.password -value db_123 -provider jceks://hdfs/credentials/hdfs.jceks

      Here hdfs.jceks.password is the name of the password file.

    3. The `hadoop.security.credstore.java-keystore-provider.password-file` property can also be set in core-site.xml, but it is then visible to every job running on the same cluster.
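
The password file from option 2 can be prepared roughly as follows. The relative path `./hdfs.jceks.password` and the reuse of the example password from option 1 are assumptions for illustration; the only requirements stated above are that the file name matches the property value and that the directory holding it is on the Hadoop classpath.

```shell
# Hypothetical location; the directory holding this file must be on the Hadoop classpath.
PASSWORD_FILE="./hdfs.jceks.password"

# Write the password with owner-only permissions and no trailing newline.
( umask 077 && printf '%s' 'TEST-password@12' > "$PASSWORD_FILE" )
chmod 600 "$PASSWORD_FILE"

# Show the resulting permissions.
ls -l "$PASSWORD_FILE"
```

For the `hadoop` CLI, one way to put that directory on the classpath is the `HADOOP_CLASSPATH` environment variable.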

Java API to access the password:

Fetch password using credential API
import java.io.{File, IOException}
import java.net.{URL, URLClassLoader}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.CommonConfigurationKeysPublic
import org.apache.hadoop.security.alias.{CredentialProvider, CredentialProviderFactory}
import scala.util.{Failure, Success, Try}

// `Logger` is assumed to be a project-specific trait that exposes a `logger` field.
object VaultConfig extends Logger {

  // private val vaultLocation = "jceks://file/home/ec2-user/example/file.jceks"
  private val vaultLocation = "jceks://hdfs/credentials/hdfs.jceks"
  // Must be on the local file system; localise it first if it is not.
  private val localVaultPasswordFile = new File("/credentials/hdfs.jceks.password")

  /** Returns the value stored under `alias`, or `default` if the alias is not in the vault. */
  def getCredential(alias: String, default: Option[String] = None): String = {

    Try(getCredentialProvider) match {
      case Success(provider) =>
        if (provider.getAliases.contains(alias)) {
          logger.info(s"Fetching value for alias $alias from the vault")
          provider.getCredentialEntry(alias).getCredential.mkString
        } else {
          default.getOrElse(throw new Exception(s"Value not found for $alias in the vault"))
        }
      case Failure(ex) =>
        logger.error(s"Failed to load from $vaultLocation", ex)
        throw new Exception("Failed to load from vault", ex)
    }
  }

  private def getCredentialProvider: CredentialProvider = {

    logger.info(s"Loading vault from config $vaultLocation")
    logger.info(s"Loading password from $localVaultPasswordFile")

    val conf = new Configuration()
    conf.set(CredentialProviderFactory.CREDENTIAL_PROVIDER_PATH, vaultLocation)

    // Hadoop looks the password file up on the classpath, so add its directory first.
    dynamicallyLoadDirToClassPath(localVaultPasswordFile.getParent)

    // Only the file name is configured; the directory comes from the classpath.
    conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_CREDENTIAL_PASSWORD_FILE_KEY, localVaultPasswordFile.getName)
    CredentialProviderFactory.getProviders(conf).get(0)
  }

  // Adds `dir` to the context class loader's search path via reflection.
  // This only works when the context class loader is a URLClassLoader, which is
  // the case on Java 8 but not for the default application loader on Java 9+.
  private def dynamicallyLoadDirToClassPath(dir: String): Unit = {

    try {
      logger.info(s"Dynamically adding dir $dir to the classpath")
      val dirUrl = new File(dir).toURI.toURL
      val method = classOf[URLClassLoader].getDeclaredMethod("addURL", classOf[URL])
      method.setAccessible(true)
      method.invoke(Thread.currentThread().getContextClassLoader, dirUrl)
    } catch {
      case e: Exception =>
        // Keep the original exception as the cause instead of only printing it.
        throw new IOException(s"Error, could not add $dir to the classloader", e)
    }
  }
}
NOTE: You can export HADOOP_CREDSTORE_PASSWORD to supply the password when creating the vault file, then put the same password in the password file and use the Java API above to read from the vault with it.

For more details, see the Hadoop CredentialProvider API guide.
