How to use kafka and storm on cloudfoundry?


I want to know if it is possible to run kafka as a cloud-native application, and can I create a kafka cluster as a service on Pivotal Web Services. I don't want only client integration, I want to run the kafka cluster/service itself?

Thanks, Anil


Answers:


I can point you at a few starting points, there would be some work involved to go from those starting points to something fully functional.

One option is to deploy the kafka cluster on Cloud Foundry (e.g. Pivotal Web Services) using docker images. Spotify has Dockerized kafka and kafka-proxy (including Zookeeper). One thing to keep in mind is that PWS currently doesn't support apps with persistence (although this work is starting) so if you were to go this route right now, you would lose the data in kafka when the application is rolled. Looking at that Spotify repo, it looks like the docker images are generally run without any mounted volumes, so this persistence-less kafka seems like it may be a valid use case (I don't know enough about kafka to say).

The other option is to deploy kafka directly on some IaaS (e.g. AWS) using BOSH. BOSH can be hard if you're seeing it for the first time, but it is the ideal way to deploy any distributed software that you want running on VMs. You will also be able to have persistent volumes attached to your kafka VMs if necessary. Here is a kafka BOSH release which may work.

Once you have your cluster running, you have two ways to integrate your Cloud Foundry applications with it. The simplest is just to provide it to your applications as a "user-provided service", which lets you flow kafka cluster access info to your apps. The alternative would to put a service broker in front of your cluster, which would be especially useful if you have many different people who will be pushing apps that need to talk to the kafka cluster. Rather than you having to manually tell people the access info each time, they can do something simple like cf bind-service SOME_APP YOUR_KAFKA_SERVICE. Here is a kafka service broker along with more info about service brokers in general.