Connecting Millions of Cars using EMQX MQTT and Upstash Kafka

Noah Fischer, DevRel @Upstash


Vehicle connectivity is set to shape the future of automotive transportation and smart mobility. In the upcoming decade, we anticipate a diverse array of transportation modes and interconnected infrastructure, forming a cohesive, integrated system. This evolution demands vehicles equipped with fast and reliable messaging capabilities to facilitate seamless communication within this interconnected ecosystem.

Here are some of the connected car features:

  • Real-Time Traffic Updates: Empower drivers with real-time traffic information, including congestion, accidents, and road closures, enabling informed route decisions.

  • Advanced Driver Assistance Systems (ADAS): Enhance road safety with ADAS, featuring systems like lane departure warnings, forward collision warnings, and automatic emergency braking to help drivers avoid accidents.

  • Driver Behavior Monitoring: Employ connected car data to monitor and analyze driver behavior. Provide feedback to drivers on safe driving practices, encouraging responsible behavior on the road.

  • Emergency Assistance and Crash Response: Implement automatic crash detection and emergency response systems. Connected cars can transmit critical information to emergency services, enabling faster response times and potentially saving lives.

  • Environmental Impact Monitoring: Monitor and report the environmental impact of vehicles, including emissions data. Provide drivers with insights into their carbon footprint and encourage eco-friendly driving habits.

With robust connectivity and data infrastructure, organizations can implement these connected car features, delivering enhanced value to drivers and passengers, elevating the overall driving experience, and positively impacting brand perception. MQTT and Kafka stand ready to assist in building these features for connected car services and other IoT initiatives.

Architectural Challenges in Building a Connected Car Platform

Developing a connected car platform involves creating the software infrastructure essential for new connected car services. This infrastructure encompasses the technology needed to establish connections between vehicles and the cloud, and facilitate the transmission of data and events between the vehicle and the cloud.

The construction of a connected car platform introduces distinct technical challenges. The mobility of vehicles and the proliferation of connected devices simultaneously give rise to unique architectural considerations, including:

  • Data Volume and Variety: Managing large volumes of diverse data generated by connected vehicles, including sensor data, multimedia streams, and diagnostic information.

    • Move millions of MQTT messages
    • Guarantee end-to-end delivery across multiple MQTT QoS levels
  • Standardization and Interoperability: Adherence to industry standards for communication and data exchange to promote interoperability among diverse vehicles and systems.

  • Network Connectivity Challenges: Vehicles on cellular networks may traverse blind spots, leading to interruptions in the connection between a vehicle and the cloud. Reconnecting may result in slow response times and message loss.

  • Scalability Challenges: The cloud platform supporting the system must handle millions of simultaneous connections reliably to ensure a consistent customer experience, even during peak usage periods.

  • Network Latency Issues: Similar to blind spots, variations in network speed and latency can cause uneven data flow between the vehicle and the cloud. A responsive user experience should mitigate the impact of network latency.

  • Bidirectional Data Movement: Connected cars must move data between the vehicle and the cloud instantly and efficiently in both directions. Traditional client request/response architectures are inadequate for platforms communicating with millions of connected cars.

  • Integration Complexity: Incorporating IoT data into cloud services and enterprise systems, such as Kafka, SQL, NoSQL, and time series databases.

    • Integration with Various Databases, Streaming Platforms or Cloud Services
    • Transform and process the data at scale
  • Cost Management: Effectively managing the costs associated with the infrastructure, data storage, and communication networks, especially as the scale of connected devices increases.
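
The network connectivity challenge above can be made concrete with a small sketch. The class below is a hypothetical in-vehicle buffer (the name `OfflineBuffer` and its methods are assumptions for illustration, not part of EMQX or any MQTT client library). It queues messages while the link is down, drops the oldest entries once a size limit is reached, and flushes in order when the connection returns.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class OfflineBuffer:
    """Hypothetical bounded buffer for messages produced while offline."""

    def __init__(self, max_messages: int = 1000) -> None:
        # A deque with maxlen discards the oldest entry when it is full.
        self._queue: Deque[Tuple[str, bytes]] = deque(maxlen=max_messages)

    def enqueue(self, topic: str, payload: bytes) -> None:
        self._queue.append((topic, payload))

    def flush(self, send: Callable[[str, bytes], bool]) -> int:
        """Send buffered messages in order and stop at the first failure.

        `send` must return True on success. Returns the number sent.
        """
        sent = 0
        while self._queue:
            topic, payload = self._queue[0]
            if not send(topic, payload):
                break  # still offline: keep the message for the next attempt
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._queue)
```

Dropping the oldest data first is only one possible policy. A platform that must never lose crash or emergency events would likely keep those in a separate, persistent queue.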

Design Decisions

Upstash Kafka

  • Webhook API for Kafka: The Upstash Kafka Webhook API lets you publish events directly to a user-defined topic without relying on third-party infrastructure or services.
  • Connectors: Via Kafka Sink Connectors, you can export your data to other storage systems. Via Kafka Source Connectors, you can pull data into your Kafka topics from other systems. Examples of available connectors are the MongoDB Source/Sink Connector, Debezium MongoDB Source Connector, Google BigQuery Sink Connector, and Snowflake Sink Connector.


EMQX

EMQX is a large-scale distributed MQTT messaging platform that offers unlimited connections, seamless integration, and anywhere deployment. It provides the following capabilities:

  • Distributed MQTT Broker: Construct a global MQTT access network spanning multiple clouds, facilitating communication among devices, systems, and apps from diverse network endpoints.
  • Effortless Scale: Connect hundreds of millions of IoT devices to the cloud seamlessly and reliably. EMQX can efficiently manage a high volume of MQTT messages per second by leveraging a broker cluster equipped with a high-performance, real-time processing engine.
  • Integration with Streaming Platforms: Connect to popular streaming platforms such as Kafka/Confluent, Pulsar, and RabbitMQ, facilitating message routing between the MQTT broker and enterprise systems.

Architecture and Data Flow

The data flow for connected cars involves the seamless transfer of information between the cars and the backend systems. Here's a high-level overview of the data flow, incorporating EMQX, Upstash Kafka, and other relevant IT systems:

Data Generation at Connected Cars

  • Sensor Data: Connected cars generate a variety of data from onboard sensors, including GPS, accelerometers, cameras, and other relevant sensors.
  • Telemetry Data: The cars transmit telemetry data such as speed, fuel consumption, engine health, and other performance metrics.

Data Ingestion through EMQX (MQTT)

  • MQTT Protocol: Connected cars use the MQTT protocol to publish data to an EMQX MQTT broker.
  • EMQX MQTT Broker: EMQX acts as the MQTT broker, facilitating communication between connected cars and backend systems.
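
As a rough illustration of the ingestion step, the sketch below builds a telemetry message and publishes it over MQTT. The topic layout `vehicles/<id>/telemetry`, the payload field names, and the broker address are assumptions made for this example and are not prescribed by EMQX. Publishing uses `publish.single` from the third-party `paho-mqtt` package, imported lazily so the message-building part runs without it.

```python
import json
import time
from typing import Optional, Tuple


def build_telemetry_message(vehicle_id: str, speed_kmh: float,
                            fuel_level_pct: float,
                            timestamp: Optional[float] = None) -> Tuple[str, str]:
    """Return (topic, JSON payload) for one telemetry reading."""
    topic = f"vehicles/{vehicle_id}/telemetry"  # hypothetical topic layout
    payload = json.dumps({
        "vehicle_id": vehicle_id,
        "speed_kmh": speed_kmh,
        "fuel_level_pct": fuel_level_pct,
        "ts": timestamp if timestamp is not None else time.time(),
    })
    return topic, payload


def publish_telemetry(topic: str, payload: str,
                      host: str = "localhost", port: int = 1883) -> None:
    """Publish one message with QoS 1 (at-least-once delivery).

    Requires paho-mqtt; host and port are placeholders for a real broker.
    """
    import paho.mqtt.publish as publish
    publish.single(topic, payload=payload, qos=1, hostname=host, port=port)
```

Putting the vehicle ID in the topic lets backend consumers subscribe to one car or, with an MQTT wildcard such as `vehicles/+/telemetry`, to the whole fleet.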

Upstash Kafka for Event Streaming

  • Event Streaming: Use Kafka for high-throughput event streaming. This is particularly useful for scenarios where data needs to be reliably persisted, processed asynchronously, or shared among multiple microservices.
  • Topics and Partitions: Kafka topics can be used to categorize different types of events, and partitions can help parallelize the processing of events.
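
To show why partitions help with parallelism, here is a minimal sketch of key-based partition assignment. It is illustrative only: the function name is hypothetical, and real Kafka clients apply their own hash to the record key (murmur2 in the Java client), so the indexes computed here will not match a real producer. The point is that a stable hash of the vehicle ID keeps each car's events on one partition, which preserves their order.

```python
import hashlib


def partition_for(vehicle_id: str, num_partitions: int) -> int:
    """Map a vehicle ID to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # A cryptographic hash is used only because it is stable across
    # processes and machines; Python's built-in hash() is not.
    digest = hashlib.sha256(vehicle_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

In practice you would pass the vehicle ID as the record key and let the Kafka client choose the partition.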

Data Processing and Transformation

  • Real-time Processing (Optional): If real-time processing is required, systems like Kafka Streams or Apache Flink can process and analyze incoming data streams in real time.
  • Data Transformation: Transform and enrich raw data as needed, converting it into a format suitable for storage, analytics, or further processing.
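
A transformation step might look like the sketch below, which parses a raw JSON event and adds derived fields. The input field names (`vehicle_id`, `ts`, `speed_kmh`) and the speeding threshold are assumptions made for this example, not a schema defined by EMQX or Upstash.

```python
import json
from typing import Any, Dict

SPEED_LIMIT_KMH = 130.0  # hypothetical threshold used to flag events
KMH_TO_MPH = 0.621371


def transform_event(raw: bytes) -> Dict[str, Any]:
    """Parse a raw JSON telemetry message and add derived fields."""
    event = json.loads(raw)
    speed_kmh = float(event["speed_kmh"])
    return {
        "vehicle_id": str(event["vehicle_id"]),
        "ts": float(event["ts"]),
        "speed_kmh": speed_kmh,
        "speed_mph": round(speed_kmh * KMH_TO_MPH, 2),  # unit conversion
        "speeding": speed_kmh > SPEED_LIMIT_KMH,        # enrichment flag
    }
```

Normalizing types at this stage means downstream storage and analytics code can rely on one consistent record shape.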

Data Storage

  • Databases: Store processed and transformed data in appropriate databases based on the data type and requirements. For example:
    • Use relational databases (e.g., PostgreSQL) for structured data.
    • Use NoSQL databases (e.g., MongoDB) for unstructured or semi-structured data.
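
For the structured case, the following sketch stores readings in a relational table. It uses Python's built-in `sqlite3` module purely as a stand-in so the example is self-contained. The table and column names are hypothetical, and a PostgreSQL deployment would use its own driver and `INSERT ... ON CONFLICT` in place of SQLite's `INSERT OR REPLACE`.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the telemetry table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS telemetry (
               vehicle_id TEXT NOT NULL,
               ts         REAL NOT NULL,
               speed_kmh  REAL NOT NULL,
               PRIMARY KEY (vehicle_id, ts)
           )"""
    )
    return conn


def store_reading(conn: sqlite3.Connection, vehicle_id: str,
                  ts: float, speed_kmh: float) -> None:
    """Insert one reading; a redelivered message overwrites the same row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO telemetry VALUES (?, ?, ?)",
            (vehicle_id, ts, speed_kmh),
        )
```

Keying rows on vehicle ID and timestamp makes writes idempotent, which matters when events are delivered at least once and may arrive more than once.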

Real-time Analytics and Reporting

  • Analytics Engine: Implement a real-time analytics engine to derive insights from the aggregated and processed data.
  • Reporting Services: Develop reporting services or dashboards for users or administrators to visualize the analytics results.
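
One simple form of real-time analytics is a per-vehicle sliding-window average. The sketch below is an in-memory illustration with hypothetical names. It assumes readings for each vehicle arrive in timestamp order, and it is not a substitute for a stream processor such as Kafka Streams or Flink.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Tuple


class SlidingAverage:
    """Average speed per vehicle over the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self._window = window_seconds
        self._readings: DefaultDict[str, Deque[Tuple[float, float]]] = (
            defaultdict(deque)
        )

    def add(self, vehicle_id: str, ts: float, speed_kmh: float) -> float:
        """Record a reading and return the current windowed average."""
        readings = self._readings[vehicle_id]
        readings.append((ts, speed_kmh))
        cutoff = ts - self._window
        # Evict readings that are older than the window.
        while readings[0][0] < cutoff:
            readings.popleft()
        return sum(speed for _, speed in readings) / len(readings)
```

A dashboard or reporting service could poll values like these, or they could be written back to a Kafka topic for other consumers.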

This data flow provides a high-level overview, and the specifics may vary based on the requirements of your connected car system. It's important to continuously assess and update the architecture to accommodate evolving needs and technologies.

Closing Notes

In our design, IoT-enabled connected cars seamlessly communicate through EMQX, ensuring efficient data exchange using MQTT. Upstash Kafka provides a robust foundation for reliable event streaming. Security is paramount, with end-to-end encryption and access controls safeguarding sensitive information. The system is designed to scale horizontally, ensuring flexibility and fault tolerance for future growth. Real-time processing extracts valuable insights from incoming data, contributing to dynamic analytics. Continuous monitoring, logging, and API-driven integrations enhance the system's reliability and adaptability, ensuring it stays at the forefront of technological advancements.