
Best Practices for Kafka Security

Nitesh Jangir

Data Engineering

Overview

We will cover the security concepts of Kafka and walk through the implementation of encryption, authentication, and authorization for a Kafka cluster.

This article will explain how to configure SASL_SSL (Simple Authentication and Security Layer over SSL) for your Kafka cluster and how to protect data in transit. SASL_SSL is a security protocol in which clients authenticate using mechanisms like PLAIN, SCRAM, etc., and the server presents SSL certificates to establish an encrypted connection. We will use the SCRAM mechanism on the client side to establish mutual authentication between the client and the server. We'll also discuss authorization and ACLs, which are important for securing your cluster.

Prerequisites

A running Kafka cluster and a basic understanding of security components.

Need for Kafka Security

The primary reason is to prevent unauthorized access to data and its misuse, modification, disruption, or disclosure. To understand what makes a Kafka cluster secure, we need to know three terms:

  • Authentication - The process by which the server verifies the identity of a client (and the client can verify the server) before any data is exchanged.
  • Authorization - Built on top of authentication, authorization determines what an authenticated client is allowed to do. Basically, it grants each client only the access it actually needs.
  • Encryption - The process of transforming data so that it is unreadable without the decryption key. Encryption ensures that no other party can intercept and read the data in transit.

Here is the quick start guide by Apache Kafka, so check it out if you still need to set up Kafka.

https://kafka.apache.org/quickstart

We’ll not cover the theoretical aspects here, but you can find plenty of sources on how these three components work internally. For now, we’ll focus on the implementation and how security fits into Kafka.

This image illustrates SSL communication between the Kafka client and server.

We are going to implement the steps in the below order:

  • Create a Certificate Authority
  • Create a Truststore & Keystore

Certificate Authority - It is a trusted entity that issues SSL certificates. As such, a CA is an independent entity that acts as a trusted third party, issuing certificates for use by others. A certificate authority validates the credentials of a person or organization that requests a certificate before issuing one.

Truststore - A truststore contains certificates from other parties with which you want to communicate, or certificate authorities that you trust to identify other parties. In simple words, it is the list of CAs whose signed certificates you are willing to accept.

KeyStore - A KeyStore contains private keys and certificates with their corresponding public keys. Keystores can have one or more CA certificates depending upon what’s needed.

For the Kafka server, we need a server certificate, and this is where the keystore comes into the picture, since it stores the server certificate. The server certificate should be signed by a Certificate Authority (CA): a signing request is generated from the keystore, and in response the CA sends back a signed certificate (CRT), which is imported into the keystore.

We will create our own certificate authority for demonstration purposes. If you don’t want to create a private certificate authority, there are many certificate providers you can go with, like IdenTrust and GoDaddy. Since we are creating one, we need to tell our Kafka client to trust our private certificate authority using the Trust Store.

This block diagram shows you how all the components communicate with each other and their role to generate the final certificate.

So, let’s create our Certificate Authority. Run the below command in your terminal:

“openssl req -new -x509 -keyout <private_key_name> -out <public_certificate_name> -days <validity_days>”

It will ask for a passphrase; keep it safe for later use. After the command runs successfully, we should have two files: private_key_name (the CA's private key) and public_certificate_name (the CA's public certificate).
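For reference, a concrete invocation might look like this (the file names and the one-year validity are just example choices):

openssl req -new -x509 -keyout ca-key -out ca-cert -days 365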

Now, let’s create a KeyStore and trust store for brokers; we need both because brokers also interact internally with each other. Let’s understand with the help of an example: Broker A wants to connect with Broker B, so Broker A acts as a client and Broker B as a server. We are using the SASL_SSL protocol, so A needs SASL credentials, and B needs a certificate for authentication. The reverse is also possible where Broker B wants to connect with Broker A, so we need both a KeyStore and a trust store for authentication.

Now let’s create a trust store. Execute the below command in the terminal, and it should ask for the password. Save the password for future use:

“keytool -keystore <truststore_name.jks> -alias <alias name of the entry to process> -import -file <public_certificate_name>”

Here, we are using the .jks extension for the file, which stands for Java KeyStore. You can also use Public-Key Cryptography Standards #12 (PKCS12) instead of .jks, but that’s totally up to you. public_certificate_name is the same CA certificate we created above.
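As a concrete sketch (the truststore file name and the CARoot alias are just example choices, and ca-cert is the example CA certificate from above):

keytool -keystore kafka.broker.truststore.jks -alias CARoot -import -file ca-cert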

For the KeyStore configuration, run the below command and store the password:

“keytool -keystore <keystore_name.jks> -alias <alias_name> -validity <number_of_days> -genkey -keyalg <key_algorithm_name> -storepass <store_password> -ext SAN=DNS:localhost”

This creates the keystore file in the current working directory. When keytool asks for "First and Last Name", enter a fully qualified domain name: some certificate authorities, such as VeriSign, expect this field to be an FQDN. Not every CA requires it, but using one is recommended for portability. All other information should be valid; if it cannot be verified, a CA such as VeriSign will not sign the CSR generated from this entry. I’m using localhost as the domain name here, as seen in the SAN extension of the command above.
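For reference, a filled-in version might look like this (the keystore name, alias, validity, key algorithm, and password are example choices):

keytool -keystore kafka.broker.keystore.jks -alias broker -validity 365 -genkey -keyalg RSA -storepass changeit -ext SAN=DNS:localhost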

The keystore now has an entry under alias_name containing the private key and the information needed to generate a CSR. Next, let's create a certificate signing request (CSR), which will be used to get a signed certificate from the Certificate Authority.

Execute the below command in your terminal:

“keytool -keystore <keystore_name.jks> -alias <alias_name> -certreq -file <file_name.csr>”

We have now generated a certificate signing request from the keystore (use the same keystore name and alias as in the previous step). It should ask for the keystore password, so enter the one used while creating the keystore.

Now, execute the below command. It will ask for the password, so enter the CA password, and now we have a signed certificate:

“openssl x509 -req -CA <public_certificate_name> -CAkey <private_key_name> -in <csr file> -out <signed_file_name> -CAcreateserial”

Finally, we need to import both the CA's public certificate and the signed server certificate into the keystore. Run the below command first; it adds the CA certificate to the keystore.

“keytool -keystore <keystore_name.jks> -alias <public_certificate_name> -import -file <public_certificate_name>”

Now, let’s run the below command; it will add the signed certificate to the KeyStore.

“keytool -keystore <keystore_name.jks> -alias <alias_name> -import -file <signed_file_name>”
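Taken together, the signing round trip with the example file names used above might look like this (ca-key, ca-cert, and the keystore/alias names are the same illustrative choices as before):

keytool -keystore kafka.broker.keystore.jks -alias broker -certreq -file broker.csr
openssl x509 -req -CA ca-cert -CAkey ca-key -in broker.csr -out broker-signed.crt -days 365 -CAcreateserial
keytool -keystore kafka.broker.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore kafka.broker.keystore.jks -alias broker -import -file broker-signed.crt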

As of now, we have generated all the security files for the broker. For internal broker communication, we are using SASL_SSL (see security.inter.broker.protocol in server.properties). Now we need to create a broker username and password using the SCRAM mechanism; see the Kafka documentation on SASL/SCRAM for more details.

Run the below command:

“kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <username> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'”

NOTE: Credentials for inter-broker communication must be created before Kafka brokers are started.
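For example, creating a broker user named admin might look like this (the username, password, and ZooKeeper address are placeholders):

kafka-configs.sh --zookeeper localhost:2181 --entity-type users --entity-name admin --alter --add-config 'SCRAM-SHA-512=[password=admin-secret]'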

Now, we need to configure the Kafka broker property file, so update the file as given below:

CODE: https://gist.github.com/velotiotech/253f7d226d1b4551bafb783976a2d84b.js
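The linked gist isn't reproduced here, but a minimal sketch of the kind of broker settings involved looks roughly like this (the listener addresses, file paths, passwords, and the admin credentials are placeholder assumptions; the property names are standard Kafka broker settings):

listeners=SASL_SSL://0.0.0.0:9092
advertised.listeners=SASL_SSL://localhost:9092
security.inter.broker.protocol=SASL_SSL
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
sasl.enabled.mechanisms=SCRAM-SHA-512
ssl.keystore.location=/path/to/kafka.broker.keystore.jks
ssl.keystore.password=<keystore_password>
ssl.key.password=<key_password>
ssl.truststore.location=/path/to/kafka.broker.truststore.jks
ssl.truststore.password=<truststore_password>
listener.name.sasl_ssl.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="admin" password="admin-secret";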

NOTE: If you are using an external JAAS config file, then remove the ScramLoginModule line and set this environment variable before starting the broker: “export KAFKA_OPTS=-Djava.security.auth.login.config={path/to/broker.conf}”

Now, if we start Kafka, the broker should come up on port 9092 without any failure. If you run multiple brokers, the same config file can be replicated among them, but brokers on the same host each need their own port.

Producers and consumers need a username and a password to access the broker, so let’s create their credentials and update respective configurations.

Create a producer user and then update producer.properties inside the bin directory. Execute the below command in your terminal:

“bin/kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <producer_name> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'”

We need a trust store file for our clients (producer and consumer), but as we already know how to create a trust store, this is a small task for you. It is suggested that producers and consumers should have separate trust stores because when we move Kafka to production, there could be multiple producers and consumers on different machines.

CODE: https://gist.github.com/velotiotech/130838becee4808be721a2eb9fb77854.js
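Again, the gist isn't reproduced here, but a minimal sketch of what producer.properties typically contains is shown below (the truststore path and password are placeholders, and <producer_name>/<password> are the SCRAM credentials created above):

security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
ssl.truststore.location=/path/to/kafka.producer.truststore.jks
ssl.truststore.password=<truststore_password>
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<producer_name>" password="<password>";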

The below command creates a consumer user, so now let’s update consumer.properties inside the bin directory:

“bin/kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <consumer_name> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'”

CODE: https://gist.github.com/velotiotech/5d93e9b739b4e706353270b69a91ccf7.js
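A minimal sketch of the consumer side looks much the same (again, paths, passwords, and credentials are placeholders; we will add a group ID to this file in the authorization section below):

security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
ssl.truststore.location=/path/to/kafka.consumer.truststore.jks
ssl.truststore.password=<truststore_password>
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<consumer_name>" password="<password>";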

As of now, we have implemented encryption and authentication for Kafka brokers. To verify that our producer and consumer are working properly with SCRAM credentials, run the console producer and consumer on some topics.

Authorization is not implemented yet. Kafka uses access control lists (ACLs) to specify which users can perform which actions on specific resources or groups of resources. Each ACL has a principal, a permission type, an operation, a resource type, and a name.

The default authorizer is AclAuthorizer, provided by Kafka; Confluent also provides the Confluent Server Authorizer, which is quite different from AclAuthorizer. An authorizer is a server plugin used by Kafka to authorize actions. Specifically, the authorizer decides whether an operation should be allowed based on the principal and the resource being accessed.

Format of ACLs - Principal P is [Allowed/Denied] Operation O from Host H on any Resource R matching ResourcePattern RP

Execute the below command to create an ACL with writing permission for the producer:

“bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<producer_name> --operation WRITE --topic <topic_name>”

The above command creates an ACL allowing the WRITE operation for producer_name on topic_name.

Now, execute the below command to create an ACL with reading permission for the consumer:

“bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<consumer_name> --operation READ --topic <topic_name>”

A consumer also needs permission on its consumer group, so the below command grants the consumer READ access on a given consumer group:

“bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<consumer_name> --operation READ --group <consumer_group_name>”
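Putting it together, concrete invocations might look like this (the user names, topic, group, and ZooKeeper address are example values):

bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:producer-user --operation WRITE --topic demo-topic
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:consumer-user --operation READ --topic demo-topic
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:consumer-user --operation READ --group demo-group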

Now, we need to add some configuration to two files: the broker properties file and consumer.properties.

CODE: https://gist.github.com/velotiotech/7536047d4f6e3b11224f2aec49adcb31.js

The above line indicates that the AclAuthorizer class is used for authorization.
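As a sketch, the broker-side addition typically looks like the following (the class path is the one used by recent Kafka versions, and the super.users entry is an assumption so that the broker's own user keeps full access):

authorizer.class.name=kafka.security.authorizer.AclAuthorizer
super.users=User:admin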

CODE: https://gist.github.com/velotiotech/d50f6602e2eb5a46082e6f6f0d4c6141.js

A consumer group ID is mandatory: if we do not specify a group, the consumer will not be able to read data from the topic, so a group ID must be provided when starting the consumer.
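In consumer.properties this is a single line; the group name below is only an example and must match the group used in the ACL above:

group.id=demo-group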

Let’s test the producer and consumer one by one: run the console producer, and run the console consumer in another terminal. Both should run without errors.

console-producer
console-consumer
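For example, the test might look like this (the topic name and bootstrap address are placeholders; the --producer.config and --consumer.config flags point at the client property files configured above):

bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic demo-topic --producer.config producer.properties
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic demo-topic --from-beginning --consumer.config consumer.properties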

Voila!! Your Kafka is secured.

Summary

In a nutshell, we have implemented security in our Kafka cluster using SASL_SSL and learned how to create ACLs and grant different permissions to different users.

Apache Kafka is the wild west without security. By default, there is no encryption, authentication, or access control. Any client can communicate with the Kafka broker using the PLAINTEXT port. Access to this port should be restricted to trusted clients only; you can use network segmentation and/or authentication and ACLs to restrict access to trusted addresses. If none of these are used, the cluster is wide open and available to anyone. A basic knowledge of Kafka authentication, authorization, encryption, and audit trails is required to safely move a system into production.

Did you like the blog? If yes, we're sure you'll also like to work with the people who write them - our best-in-class engineering team.

We're looking for talented developers who are passionate about new emerging technologies. If that's you, get in touch with us.

Explore current openings
