Workaround for com.microsoft.aad.adal4j.AuthenticationException when accessing SQL Server table via Active Directory in Databricks

Symptom

When using Databricks 5.5 LTS to read a table from SQL Server using Azure Active Directory (AAD) authentication, the following exception occurs:

Error : java.lang.NoClassDefFoundError: com/microsoft/aad/adal4j/AuthenticationException Error : java.lang.NoClassDefFoundError: com/microsoft/aad/adal4j/AuthenticationException
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.getFedAuthToken(SQLServerConnection.java:3609)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.onFedAuthInfo(SQLServerConnection.java:3580)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.processFedAuthInfo(SQLServerConnection.java:3548)
 at com.microsoft.sqlserver.jdbc.TDSTokenHandler.onFedAuthInfo(tdsparser.java:261)
 at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:103)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4290)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3157)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:82)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3121)
 at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151)
 at ...io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
 at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: com.microsoft.aad.adal4j.AuthenticationException
 at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 59 more...

Cause

https://github.com/Azure/azure-sqldb-spark/issues/28

Workaround steps

1 – Create a new init script which will remove legacy MSSQL drivers from the cluster. The following commands create a new directory on DBFS and then create a shell script with a single command to remove mssql driver JARs:

%sh
mkdir /dbfs/myInitScriptDir
echo "rm /databricks/jars/*mssql*" > /dbfs/myInitScriptDir/myInitScript.sh

2 – Add the cluster init script in Clusters > Cluster > Edit > Advanced Options:

3 – Add the following two libraries to the cluster via Clusters > Cluster > Libraries > Install new:

com.microsoft.azure:adal4j:1.6.5
com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8

4 – Restart the cluster.

5 – Run the following R code in aworkbook cell to validate that AAD authentication is working. NB – Replace the placeholder values in bold:

library(sparklyr)

connection <- spark_connect(method = "databricks")

x <- spark_read_jdbc(
connection,
name = 'mytemptable',
options = list(
url = 'jdbc:sqlserver://myazuresqlserver.database.windows.net:1433;database=myazuresqldatabase;authentication=ActiveDirectoryPassword;',
driver = 'com.microsoft.sqlserver.jdbc.SQLServerDriver',
user = 'myuser@example.com',
password = 'XXXXXXXX',
hostNameInCertificate = '*.database.windows.net',
dbtable = 'dbo.mytable'
)
)

x

After running the command “x” above, the table data should be displayed.

Conclusion

The Azure SQL Database table can now be read and the AuthenticationException no longer occurs:

Successful table query after spark_read_jdbc()

Credit: This workaround is based on thereverand‘s very helpful post on GitHub here.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s