# Use multiple Kerberos principals within the same JVM

This is my second post in the Data Engineering category. In this post, I discuss how to authenticate and use multiple Kerberos principals within the same JVM. Note that the examples in this post use the Hadoop UserGroupInformation API.

Note: This post is neither an introduction to Kerberos, nor a deep dive. Some familiarity with Kerberos is required.

## What is Kerberos?

Kerberos [1] is a sophisticated and widely used network authentication protocol developed at MIT. Client-server applications typically use it so that a client can prove its identity to the server (and, optionally, vice versa). It is frequently combined with SASL and TLS to protect communication between the client and server. The Hadoop ecosystem, specifically the Hadoop `UserGroupInformation` (abbreviated UGI) API, provides a comprehensive framework for using Kerberos and TLS in your applications, but it suffers from a complex implementation and a lack of comprehensive documentation.

Note: In all code examples in this post, Scaladoc comments are provided where useful, and omitted where unnecessary for brevity.

The vast majority of applications built for Hadoop use a single Kerberos principal for the lifetime of the application. Hadoop’s UGI API provides a simple way to work with this:
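A minimal sketch of such a single-principal login, in Scala. The object name, the principal `etl@EXAMPLE.COM`, and the keytab path are hypothetical placeholders, and the sketch assumes `hadoop-common` is on the classpath and that the cluster is configured for Kerberos authentication:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object SinglePrincipalLogin {

  /** Logs the whole JVM in as the given principal.
    *
    * @param principal  Kerberos principal, e.g. "etl@EXAMPLE.COM" (hypothetical)
    * @param keytabPath path to the keytab holding that principal's keys (hypothetical)
    */
  def login(principal: String, keytabPath: String): Unit = {
    val conf = new Configuration()
    // Tell Hadoop security to use Kerberos rather than the default "simple" mode.
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Authenticates against the KDC and stores the result as the JVM-wide login user.
    UserGroupInformation.loginUserFromKeytab(principal, keytabPath)
  }

  def main(args: Array[String]): Unit = {
    login("etl@EXAMPLE.COM", "/path/to/etl.keytab")
    println(s"Logged in as ${UserGroupInformation.getLoginUser.getUserName}")
  }
}
```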

The above code works well when the app uses only a single Kerberos principal, but not when it needs several. This is because the `UserGroupInformation.loginUserFromKeytab` method sets static, JVM-wide state: the most recently authenticated principal becomes the login user and is used for all subsequent operations that require authentication.
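To make the static-state problem concrete, here is a small sketch that logs in twice and inspects the login user after each call. The principals and keytab paths are hypothetical, and the behaviour in the comments is what I would expect from the API rather than captured output:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object StaticLoginDemo {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // First login: userA becomes the JVM-wide login user.
    UserGroupInformation.loginUserFromKeytab("userA@EXAMPLE.COM", "/path/to/userA.keytab")
    println(UserGroupInformation.getLoginUser.getUserName) // expected to report userA

    // Second login: replaces the static login user, so code that relies on
    // the login user now acts as userB, not userA.
    UserGroupInformation.loginUserFromKeytab("userB@EXAMPLE.COM", "/path/to/userB.keytab")
    println(UserGroupInformation.getLoginUser.getUserName) // expected to report userB
  }
}
```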

To use multiple principals within the same JVM, Hadoop provides another method, `UserGroupInformation.loginUserFromKeytabAndReturnUGI`, which authenticates the specified principal using the specified keytab and returns the authenticated principal as a `UserGroupInformation` object instead of installing it as the JVM-wide login user. This makes it straightforward to work with multiple principals within the same application. This is where I find even Hadoop experts fumble, so here’s (an attempt at) a comprehensive example:
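The sketch below is one way such an example might look. The method names `authenticateAndGetUGI`, `ugiDoAs`, and `doYourThing` follow the description in this post, but their exact signatures, the enclosing object name, the principals, and the keytab paths are assumptions made for illustration:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object MultiPrincipalExample {

  /** Authenticates `principal` to the KDC using `keytabPath` and returns it as a UGI.
    * Unlike `loginUserFromKeytab`, this does not replace the JVM-wide login user.
    */
  def authenticateAndGetUGI(principal: String, keytabPath: String): UserGroupInformation =
    UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytabPath)

  /** Runs `body` as the principal represented by `ugi` and returns its result. */
  def ugiDoAs[T](ugi: UserGroupInformation)(body: => T): T =
    ugi.doAs(new PrivilegedExceptionAction[T] {
      override def run(): T = body
    })

  def doYourThing(): Unit = {
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Hypothetical principals and keytab paths.
    val userA = authenticateAndGetUGI("userA@EXAMPLE.COM", "/path/to/userA.keytab")
    val userB = authenticateAndGetUGI("userB@EXAMPLE.COM", "/path/to/userB.keytab")

    // Inside each block, getCurrentUser resolves to the UGI passed to ugiDoAs.
    ugiDoAs(userA) {
      println(s"Running as ${UserGroupInformation.getCurrentUser.getUserName}")
    }
    ugiDoAs(userB) {
      println(s"Running as ${UserGroupInformation.getCurrentUser.getUserName}")
    }
  }

  def main(args: Array[String]): Unit = doYourThing()
}
```

Taking the body as a by-name parameter keeps call sites short: any Hadoop client call, such as opening a file system or a table connection, can be placed directly inside the braces.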

In the code above, the method `authenticateAndGetUGI` authenticates the specified principal to the KDC using the specified keytab, and returns this authenticated principal as a `UserGroupInformation` object. The method `ugiDoAs` provides a helpful abstraction for executing an arbitrary piece of code as the specified Kerberos principal. Finally, the method `doYourThing` demonstrates both: it authenticates two users, `userA` and `userB`, and then invokes `ugiDoAs` to execute arbitrary code as each of them.

That’s it for this post! Hope that helps you navigate the rather complicated Hadoop and Kerberos maze.

1. Neuman, B. C., & Ts’o, T. (1994). Kerberos: An authentication service for computer networks. IEEE Communications Magazine, 32(9), 33-38.

