
Kafka Summit 2023: Announcements & Trends

The Kafka Summit, an annual event dedicated to Apache Kafka and its ecosystem, brought together industry experts, developers, and enthusiasts to discuss the latest advancements and practical applications of event streaming and microservices. In this article, our CTO Sabri Skhiri will delve into some of the keynotes and talks that took place during the summit, highlighting the noteworthy insights and innovations shared by industry leaders.


Main Announcements for Kafka Enthusiasts

The first keynote was delivered by Jay Kreps, the Kafka guru who created Kafka with Neha Narkhede and Jun Rao at LinkedIn in 2011 and then co-founded Confluent in 2014. Kreps presented the main trends of the conference and listed key announcements that are set to transform the Kafka experience.

Interestingly, it reminded me of the Spark Summit in 2018, when Databricks went 100% cloud with Delta Lake. At that time, there was a notable trend among major players in the data processing and analytics space, such as Confluent (Kafka), Cloudera, and Databricks (Spark), to shift towards offering their platforms as cloud-based services. They realised that their existing customer bases were huge but were not being financially exploited, because of the limitations associated with on-premises deployments. The solution: transitioning to the cloud.

In 2017, Confluent released a fully managed platform that could welcome these millions of unreachable customers, relieving them of the burden of infrastructure management. Indeed, that should have made their Kafka experience much easier: auto-scaling, high availability, multi-AZ management (a nightmare in Kafka), and so on. It should have been great, right? But if you use a central component like Kafka, you need several things:

  • Integration with your existing data landscape
  • Integration with strong governance 
  • A strong data flow processing system that can offer a real-time data service, but also cataloguing, data mining, etc. 
  • Integration with your partners’ ecosystem by sharing your data. 

All these elements were missing until… the announcements of the Kafka 2023 Summit: 

  • A new cloud management platform: KORA is a large-scale, multi-tenant Kafka platform in the cloud. With KORA, users can scale their Kafka infrastructure, ensure high availability, and manage multi-AZ deployments, thereby simplifying the Kafka experience for millions of users previously beyond reach.
  • Integration of Flink SQL within the Confluent Cloud platform: this integration brings powerful stream processing capabilities, data exploration, and real-time analytics to users. This is a really nice piece of work. By connecting to a Flink environment in Confluent Cloud, users gain access to a streamlined experience that includes viewing the database catalogue, exploring tables, materialising results in persistent Kafka topics, and more (see the sketch after this list). The Confluent Cloud management platform seamlessly handles deployment, upgrades, and continuous integration across different environments.
  • Enhanced Connectivity with Custom Connectors: Confluent answered the demand for greater flexibility in data integration by announcing the general availability (GA) of custom connectors within the Confluent Cloud platform. This allows users to deploy their own connectors or choose from a selection of over 100 open-source connectors. That should help boost both on-premises and cloud integration.
  • Empowering Data Governance: they also announced the GA of the governance feature, which supports schema management, data validation, and lineage tracking, and now incorporates data quality rules! Users can apply data quality rules directly to their data streams, configure the necessary transformations, and implement a dead-letter queue pattern.
  • Stream Sharing Beyond Enterprise Boundaries: they announced the GA of the stream-sharing feature. It enables users to seamlessly share streams of data outside their enterprise, facilitating data-driven partnerships and enabling deeper insights and innovation across the ecosystem.
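As an aside, here is a minimal sketch of the kind of workflow the Flink SQL integration targets, written in Java against a self-managed Flink cluster with the open-source Kafka SQL connector; in Confluent Cloud, topics are exposed as tables for you, so the DDL below is not needed there. The topic name, schema, and broker address are hypothetical.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class KafkaFlinkSqlSketch {
        public static void main(String[] args) {
            // Streaming Table API environment (requires flink-table and the Kafka SQL connector on the classpath).
            TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Register a Kafka topic as a dynamic table; topic, schema, and broker address are placeholders.
            tEnv.executeSql(
                "CREATE TABLE orders (" +
                "  order_id STRING," +
                "  amount   DOUBLE," +
                "  ts       TIMESTAMP(3)" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'orders'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format' = 'json'," +
                "  'scan.startup.mode' = 'earliest-offset'" +
                ")");

            // Explore the catalogue and run a continuous query whose result could be materialised back into a topic.
            tEnv.executeSql("SHOW TABLES").print();
            tEnv.executeSql("SELECT order_id, SUM(amount) AS total FROM orders GROUP BY order_id").print();
        }
    }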
 

The Big Trends

Categories of Talks

Most of the talks at the Kafka Summit can be categorised into three key areas:

  1. Customer deployments showcased specific use cases where Kafka is implemented in production contexts.
  2. Speakers delved into best practices to enhance understanding and utilisation of the Kafka ecosystem. Topics such as rebalancing partitions, the role of consumer groups, Kafka Connect architecture, transactional operations, and other intricate aspects were discussed, providing attendees with insights to optimise their Kafka implementations.
  3. Observability emerged as a central theme, with a focus on distributed application tracing.

The Rise of Apache Flink and Java Usage

Two other things are worth adding here. One notable trend observed throughout the summit was the widespread adoption of Apache Flink as the preferred stream processing framework. Talks highlighted Flink’s scalability, stateful streaming capabilities, and seamless integration with Kafka, positioning it as the de facto standard for stateful streaming applications. It was fascinating to see the complete absence of Apache Spark Structured Streaming. Is there a strong communication strategy from Confluent behind this? I don’t know, but it was cool to see Flink everywhere!

Another interesting observation was the prevalent usage of Java as the primary development language in the Kafka ecosystem. Presenters predominantly showcased examples and demonstrations using Java, emphasising its widespread adoption among developers. This reinforces the notion that Java and Scala remain prominent languages for building event-driven architectures and leveraging the Kafka ecosystem.

Jay Kreps Keynote

Jay Kreps, in his keynote, provided an overview of the data streaming platform, highlighting achievements, ongoing work, and future plans. The talk started slowly, walking through the ongoing work and achievements, with nothing particularly crunchy until Jay arrived at the cloud strategy, which is where most of the biggest announcements were made.

Here is a summary of the key points:

Achievements

Jay revisited the main achievements from the community:

  • ZooKeeper has been removed in favour of KRaft, which is now production-ready.
  • Introduction of tiered storage, enabling the decoupling of compute and storage, providing more flexibility and cost-efficiency.

Ongoing work

  • Simplifying the client protocol by moving logic to the server, making client updates easier to roll out and expanding the ecosystem to even more languages.
  • Introducing KIP-932, which aims to decouple consumer groups and partitions through share groups, enabling both queue and stream use cases within Kafka.

What may happen in the future

  • Introduction of directories to provide a hierarchical structure for organising topics, simplifying access management and enterprise-scale topic organisation.
  • Exploring autoscaling or partition-free topics, abstracting or removing the notion of partitions to enable seamless scalability.

What mattered: the cloud service

Jay emphasised that deploying Kafka in the cloud involves rethinking the cloud-native architecture, considering factors such as storage (S3, Blob storage), managing 10K servers, multi-tenancy, cloud networking, and automation.

Confluent introduced KORA, a multi-cloud Kafka management platform, addressing challenges such as multi-AZ support, multi-cluster partition, durability, optimised cloud networking, tenant-aware dynamic quotas, and predictive auto-balancing.

They claimed interesting results: 30x the workload without additional support, 10x resiliency with four-nines uptime and multi-AZ support, and infinite storage. What’s in it for me? The idea is that if Confluent is able to optimise its resources and workload, then WE are supposed to save costs! This tool is mainly for Confluent-managed environments, but they are exploring the open-source path.

Importance in the Kafka Ecosystem

Connectors

Connectors play a strategic role in Kafka, enabling integration with various systems.

Confluent provides over a hundred open-source connectors and offers 70+ fully managed connectors, eliminating the need to deploy and manage a Kafka Connect cluster.

Custom connectors are now supported in Confluent Cloud, allowing users to bring and deploy their own connectors on the platform.

Governance

Data governance is crucial for Kafka, and Confluent has been working on evolving features related to schema versioning, compatibility, data lineage, metadata management, integrity, and validation.

Confluent Cloud now offers stream data governance as a key feature, along with the announced support for data quality rules. From the discussion I had with Confluent experts, the rules are essentially a constraint language applied directly at the schema level, compatible with Google CEL. Currently, nothing more than that is supported, such as referential integrity or cross-message constraints.

Stream processing

Flink was positioned as the emerging de facto standard for stateful streaming applications. For three minutes, there was a complete marketing pitch to sell Flink to the community! Jay Kreps went back over Apache Flink’s unified runtime, the similar adoption curves of Kafka and Flink, the many big names using it, the fact that it is one of the most scalable engines on the market, and so on.

Confluent also announced a cloud-native Flink service integrated into the Confluent Cloud platform, providing fully managed and auto-scaling capabilities.

The Confluent Cloud platform offers a unified platform for Kafka, Flink, and connectors, available on Azure, AWS, and GCP.

Early access registration for the integrated Flink SQL streaming interface was opened, offering seamless integration with the connector layer.

Favourite Talks

Pragmatic Patterns and Pitfalls for Event Streaming in Brownfield Environments

Ana McDonald delivered an insightful talk focusing on event streaming patterns and best practices when integrating external data systems in brownfield environments. The talk addressed common challenges faced in day-to-day data streaming integration and provided simple yet elegant solutions. Here is a summary of the key points:

Stream Only What’s Necessary:

Ana emphasised that not everything needs to be streamed in an event-streaming architecture. Consider if the knowledge of real-time events is actionable and if there are integration requirements or user expectations for real-time information before deciding to stream the data.

Getting Data:

Instead of directly using Kafka producers (good but risky), Ana recommended leveraging Kafka Connect and its extensive ecosystem. Kafka Connect handles high availability, format translation, and connection protocols. A CDC (Change Data Capture) connector is particularly useful as it provides valuable information about changes made to a table.
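As a rough illustration of that recommendation, here is a hedged sketch of registering a Debezium-style CDC connector through the Kafka Connect REST API from Java. The connector class exists in Debezium, but the host names, credentials, and exact configuration keys are placeholders and depend on your Debezium version and environment.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class RegisterCdcConnector {
        public static void main(String[] args) throws Exception {
            // Hypothetical Debezium Postgres connector definition; adjust keys to your Debezium version.
            String connector = """
                {
                  "name": "orders-cdc",
                  "config": {
                    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
                    "database.hostname": "db.example.internal",
                    "database.port": "5432",
                    "database.user": "cdc_user",
                    "database.password": "secret",
                    "database.dbname": "shop",
                    "topic.prefix": "shop"
                  }
                }
                """;

            // Kafka Connect exposes a REST API; POST /connectors creates the connector,
            // and Connect then takes care of availability, offsets, and format translation.
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://connect.example.internal:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connector))
                .build();

            HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }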

Moving from Synchronous to Asynchronous Workflows:

Ana discussed the challenge of transitioning from synchronous workflows to asynchronous event streaming, highlighting a specific scenario involving two tables: “order” and “orderID” with “itemID.”

This is a tricky issue, and a customer raised this very question a few days before the summit. Interestingly, I had a nice discussion with an expert architect from Confluent on this topic and came away with a bunch of options. But let’s stick to Ana’s propositions: to ensure the completeness of events, she proposed two options:

a) Embrace completion criteria by implementing the Outbox pattern. Create a third table in the database (even if it is denormalised) that is updated through SQL triggers or views. Ingest CDC events only from this view, allowing the source database to handle completeness.

b) Use derivative events in the streaming transformation layer. Wait for an event that indicates completeness, such as the number of items in a transaction. For example, use a KTable<OrderID, ItemNumber> to track the number of items and filter events based on the completeness criterion (see the sketch below).
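Here is a minimal Kafka Streams sketch of option (b), my own illustration rather than Ana’s code. It assumes two hypothetical topics: order-items (one record per item, keyed by order id) and order-expected-counts (the expected number of items per order, e.g. derived from the CDC feed of the order header table).

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.Topology;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Produced;

    public class OrderCompletenessTopology {
        public static Topology build() {
            StreamsBuilder builder = new StreamsBuilder();

            // One record per order item, keyed by order id (hypothetical topic).
            KStream<String, String> items =
                builder.stream("order-items", Consumed.with(Serdes.String(), Serdes.String()));

            // Expected number of items per order, e.g. from the CDC feed of the order table (hypothetical topic).
            KTable<String, Long> expected =
                builder.table("order-expected-counts", Consumed.with(Serdes.String(), Serdes.Long()));

            // Count the items actually seen so far for each order.
            KTable<String, Long> seen = items.groupByKey().count();

            // Emit the order id once the seen count reaches the expected count (the completeness criterion).
            seen.join(expected, (seenCount, expectedCount) -> seenCount >= expectedCount ? "COMPLETE" : null)
                .toStream()
                .filter((orderId, status) -> status != null)
                .to("completed-orders", Produced.with(Serdes.String(), Serdes.String()));

            return builder.build();
        }
    }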

I asked whether the order ID table would grow infinitely. The answer: yes, indeed; you need a separate service that is notified once the order is complete and cleans up the corresponding entry in the Kafka Streams state.

Handling Incomplete Transactions:

Ana addressed the scenario where the completion criteria never arrive. She suggested putting a timer on the element in the state store: the Processor API can evaluate how long an entry has existed in a state store and act as a timer to raise an alarm for incomplete transactions (see the sketch below).
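A minimal sketch of that timer idea with the Kafka Streams Processor API, using a wall-clock punctuator over a state store; the store name, the 15-minute timeout, and the alarm record are assumptions for illustration.

    import java.time.Duration;

    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;
    import org.apache.kafka.streams.state.KeyValueStore;

    // The "first-seen-store" key-value store must be attached to this processor when building the topology.
    public class IncompleteOrderAlarmProcessor implements Processor<String, String, String, String> {

        private ProcessorContext<String, String> context;
        private KeyValueStore<String, Long> firstSeen; // order id -> timestamp of the first item seen

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
            this.firstSeen = context.getStateStore("first-seen-store");

            // Scan the store once a minute and raise an alarm for orders stuck for more than 15 minutes.
            context.schedule(Duration.ofMinutes(1), PunctuationType.WALL_CLOCK_TIME, now -> {
                try (var iter = firstSeen.all()) {
                    while (iter.hasNext()) {
                        var entry = iter.next();
                        if (now - entry.value > Duration.ofMinutes(15).toMillis()) {
                            context.forward(new Record<>(entry.key, "INCOMPLETE", now));
                            firstSeen.delete(entry.key);
                        }
                    }
                }
            });
        }

        @Override
        public void process(Record<String, String> record) {
            // Remember when we first saw an item for this order, then pass the event downstream unchanged.
            if (firstSeen.get(record.key()) == null) {
                firstSeen.put(record.key(), record.timestamp());
            }
            context.forward(record);
        }
    }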

 

Editor’s note: I did not know that brownfield was the opposite of greenfield! In greenfield environments, you can do almost whatever you want: this is true freedom to play with any fancy, shiny tech. But in environments that have been running for years, you must deal with EBCDIC, mainframes, old tech, and so on; these are nicknamed brownfield environments.

 

Building Streaming Event-Driven Microservices with Kafka and Apache Flink

Ali Alemi from AWS delivered a talk on building streaming microservices on top of an event-driven architecture. This is not something new; in our streaming courses for KUL and UGhent, we have been explaining this kind of architecture since 2018. However, what was interesting here is that the speaker proposed reusing Statefun, a concept presented three years ago at Flink Forward, to address the challenge of maintaining stateful applications while using a stream processing backend.

The intuition is fascinating. Traditionally, when developing microservice architectures, developers must choose between maintaining a central state and code for orchestrating the services or using an asynchronous choreography. Until now, if you wanted an event-driven architecture, you had to choose the choreography. However, writing this kind of architecture is way more complicated. If you want more information about this concept, do not hesitate to contact me. 

Ali proposed combining the benefits of both approaches by leveraging Statefun. Statefun allows Apache Flink to orchestrate serverless lambda functions, enabling the maintenance of stateful applications while centralising orchestration. Each lambda function can have its own state stored in Flink, and the orchestration logic can be modelled in Statefun as a job. This setup resembles a streaming application where each operator maintains a state, and the execution is delegated to serverless functions, such as AWS Lambda functions accessed through a gateway.
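For readers who have not seen Statefun before, here is a minimal sketch of a stateful function using the Flink StateFun Java SDK; the function type, state name, and message handling are hypothetical, and in the setup Ali described the function body would run remotely, for example inside an AWS Lambda reached through a gateway.

    import java.util.concurrent.CompletableFuture;

    import org.apache.flink.statefun.sdk.java.Context;
    import org.apache.flink.statefun.sdk.java.StatefulFunction;
    import org.apache.flink.statefun.sdk.java.StatefulFunctionSpec;
    import org.apache.flink.statefun.sdk.java.TypeName;
    import org.apache.flink.statefun.sdk.java.ValueSpec;
    import org.apache.flink.statefun.sdk.java.message.Message;

    // Hypothetical order-handling function: Flink manages and scopes the state to the function address,
    // while the body itself can be served as a remote function (e.g. behind AWS Lambda and a gateway).
    public class OrderFn implements StatefulFunction {

        static final TypeName TYPE = TypeName.typeNameFromString("example/order");
        static final ValueSpec<Integer> SEEN_ITEMS = ValueSpec.named("seen_items").withIntType();

        static final StatefulFunctionSpec SPEC =
            StatefulFunctionSpec.builder(TYPE)
                .withValueSpecs(SEEN_ITEMS)
                .withSupplier(OrderFn::new)
                .build();

        @Override
        public CompletableFuture<Void> apply(Context context, Message message) {
            // Read and update this order's state, persisted by Flink.
            int seen = context.storage().get(SEEN_ITEMS).orElse(0) + 1;
            context.storage().set(SEEN_ITEMS, seen);

            // Orchestration step: send messages to other functions or to an egress here,
            // e.g. once the order is complete.
            return context.done();
        }
    }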

Overall, the talk explored a novel approach to building event-driven microservices by combining Statefun, Apache Flink, and AWS Lambda, providing an alternative to traditional orchestration methods in event-driven architectures. 
