Delivering one lakh ride stats every day!

hitesh.bhat · January 20, 2022, 11:33am

How do you define your relationship with your bike? Is it just buying it one auspicious day, clicking a picture tagging ‘owner to new ride’, and then making it part of your daily routine? It goes a bit beyond that.

Most of us are used to the concept of tracking, be it our finances or the few miles we ran over the weekend. Tracking gives us a sense of progress and helps us see how far we have come from day one. At Ather, we believe that a vehicle, especially a connected EV, can be more than just a mode of commute. The Ather 450X can help you celebrate your tiny moments of joy, be it clocking your longest ride on a Sunday morning or reaffirming your decision to switch to electric, whether you made the switch for significantly lower pollution, to save money, or both!


Over these years the tens of scooters have become tens of thousands, and as we write this article, we have reached 1.5 lakh (150,000) trips per day across the country. Over 90% of these rides have been available to view on the Ather app in near real time (within 15 minutes). Two revolutions around the sun, and the quality has stayed consistent.

To support this massive number of rides each day, our platforms have gone through at least three iterations, most of them requiring us to go back to the drawing board and come up with scalable pipelines to give consistent outcomes.

The Ride Stats Saga: The Tech Stack and its Nuances!

A ride on your scooter is one of the many transitions it goes through every day, and identifying each of these transitions is fairly complex. There could be tiny rides that were meant to clear the parking lot, or a long ride for a weekend getaway. There could also be rides navigating through the city’s traffic, with each stop at a signal stretching longer than a few minutes.

The ride engine should identify these nuances and categorize each scenario accurately, accounting for the rides to be ignored and the rides that need to be stitched together.
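To make the "ignore" and "stitch" cases concrete, here is a minimal sketch of one way such rules could look. It is not the actual ride engine: the `Segment` shape, the function names, and both thresholds are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, chosen only for illustration.
MAX_STOP_GAP_S = 300       # a stop shorter than this is treated as part of the same ride
MIN_RIDE_DISTANCE_M = 200  # a ride shorter than this is ignored (e.g. clearing a parking lot)


@dataclass
class Segment:
    start_s: int       # segment start, in seconds
    end_s: int         # segment end, in seconds
    distance_m: float  # distance covered during the segment, in metres


def stitch_segments(segments: List[Segment]) -> List[Segment]:
    """Merge consecutive segments that are separated by only a short stop."""
    stitched: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        if stitched and seg.start_s - stitched[-1].end_s <= MAX_STOP_GAP_S:
            prev = stitched[-1]
            stitched[-1] = Segment(
                prev.start_s,
                max(prev.end_s, seg.end_s),
                prev.distance_m + seg.distance_m,
            )
        else:
            stitched.append(seg)
    return stitched


def identify_rides(segments: List[Segment]) -> List[Segment]:
    """Stitch first, then drop whatever is still too short to count as a ride."""
    return [r for r in stitch_segments(segments) if r.distance_m >= MIN_RIDE_DISTANCE_M]
```

Stitching before filtering is a deliberate choice in this sketch: a short crawl between two signals gets merged into the longer ride around it instead of being discarded as a tiny ride.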

Identifying a Ride and what isn't one

The stream processing engine: computing the ride stats in real time

Catering to the nuances seen above is more complicated than it seems; think of how differently each person rides a two-wheeler. The high-frequency data generated by the scooter hits the cloud IoT broker, which is constantly tackling various scenarios (such as network challenges, reconnection issues, and of course, the asynchronous nature of data).

The IoT broker pushes data via topics using a message queue (MQ), Kafka in this case. The MQ is set up with different topics, and a group of microservices act as publishers and subscribers on them, each playing an independent but pivotal role in this “Session” enrichment.

Every ride is a session, and sessions can be different transitional states of the vehicle like ride or charging. Defining a session is done through multiple stages:

Segregation: Make sense of the data in the message queue
Timesorting: Micro-batch the data to build a robust session definition and take care of real-world traffic nuances
Enrichment/Enhancement: Define the business logic and the primary metadata for the session
Publishing: Publish the inferred data to a data store and the customer’s mobile application
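The four stages above can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the message shape (`vehicle_id`, `ts`, `odometer_m`), the function names, and the plain list standing in for the data store and mobile app are all hypothetical, and the real stages run as separate services connected through Kafka topics.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

# Hypothetical message shape: {"vehicle_id": str, "ts": int, "odometer_m": float}
Message = Dict[str, Any]


def segregate(messages: Iterable[Message]) -> Dict[str, List[Message]]:
    """Segregation: group raw queue messages by vehicle."""
    by_vehicle: Dict[str, List[Message]] = defaultdict(list)
    for msg in messages:
        by_vehicle[msg["vehicle_id"]].append(msg)
    return dict(by_vehicle)


def time_sort(batch: List[Message]) -> List[Message]:
    """Timesorting: order a micro-batch by timestamp, since messages may arrive out of order."""
    return sorted(batch, key=lambda m: m["ts"])


def enrich(vehicle_id: str, ordered: List[Message]) -> Dict[str, Any]:
    """Enrichment: derive the session metadata from the time-ordered messages."""
    first, last = ordered[0], ordered[-1]
    return {
        "vehicle_id": vehicle_id,
        "start_ts": first["ts"],
        "end_ts": last["ts"],
        "duration_s": last["ts"] - first["ts"],
        "distance_m": last["odometer_m"] - first["odometer_m"],
    }


def publish(session: Dict[str, Any], store: List[Dict[str, Any]]) -> None:
    """Publishing: append to a list that stands in for the data store and the app."""
    store.append(session)


def run_pipeline(messages: Iterable[Message], store: List[Dict[str, Any]]) -> None:
    """Run one micro-batch through all four stages."""
    for vehicle_id, batch in segregate(messages).items():
        publish(enrich(vehicle_id, time_sort(batch)), store)
```

Keeping each stage a separate function mirrors the idea that every stage plays an independent role, so the business logic in the enrichment step can change without touching how data is grouped or ordered.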

Going (growing) by the numbers

With the first generation of the Ather 450, all of this intelligence was embedded in the pipeline. The heuristics for session computation accounted for a few corner cases, but as the vehicles on the road increased, the session definition got far more complex. Adding to this was the layer of OTA orchestration needed for every small change to the definition. Hence, it made sense to compute the session in real time on the cloud.

The initial real-time pipeline was built with an off-the-shelf Google IoT broker, managed data stores such as Influx and BigQuery, and a quick MVP of a Node.js API server to aggregate the metrics that are today published on the mobile app. This enabled us to get this feature to the market quickly, but leveraging managed services did prove to be an expensive option at scale.

With the challenge of enabling this feature for further versions of improvement and catering to tens of thousands of bikes, our best bet was to evaluate open-source alternatives and track the per-bike cloud costs closely as a driving factor. The real-time stack today (see the image in the previous section) leverages a custom-tweaked flavor of the Artemis & Kafka integration and a series of components built on Java and Node.js, each performing a distinct function. We also leverage Elasticsearch as a temporary data store during the flow of data through these stages.

All of the application, infrastructure, and data nodes are containerized applications running on the Kubernetes engine. The Kubernetes orchestration helps us spin up pods and scale up applications quickly depending on the volume of data, and the operation is semi-automated through Terraform.

Our DevOps processes have evolved through tackling various scenarios that are unique to Ather. For example, the scooters are dependent on a network provider. We’ve faced scenarios where a drop in network connectivity followed by sudden reconnections put significant back-pressure on Kafka, and the only option to handle such erratic bursts of data on the platform was to manually scale the applications and data nodes by 10X. Today, with processes and scale tests in place, we’re able to handle such situations with minimal or no intervention at all.
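As a rough illustration of how a reconnection burst could translate into a scaling decision, here is a minimal sketch that sizes the consumer pool from the backlog. It is not how the platform actually scales: the function name, its parameters, and the sample numbers are assumptions, and only the 10X ceiling mirrors the manual scale-up described above.

```python
import math


def desired_replicas(
    total_lag: int,            # messages waiting in the queue (hypothetical input)
    per_replica_rate: float,   # messages one replica can process per second (assumed)
    drain_target_s: float,     # how quickly the backlog should be cleared (assumed)
    base_replicas: int,        # replica count under normal load
    max_multiplier: int = 10,  # ceiling, mirroring the 10X manual scale-up
) -> int:
    """Return how many replicas are needed to drain the backlog within the target time."""
    if per_replica_rate <= 0 or drain_target_s <= 0:
        raise ValueError("per_replica_rate and drain_target_s must be positive")
    needed = math.ceil(total_lag / (per_replica_rate * drain_target_s))
    # Never go below the normal footprint, never above the ceiling.
    return max(base_replicas, min(needed, base_replicas * max_multiplier))
```

For example, with 3 replicas under normal load, each handling 500 messages per second, a backlog of 600,000 messages and a 60-second drain target gives `desired_replicas(600_000, 500, 60, 3) == 20`, while a much larger backlog is capped at 30.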

Gazing at the horizon

While the ride statistics have been our core feature, we believe there are more than a few ways of looking at what this data and analytics could mean to customers. We are constantly innovating, and one such innovation is the Savings tracker, developed as part of our Ather Labs offering (more on that in an upcoming blog).

Psst. We have some ideas on leveraging the personalised efficiency of each rider/bike to make the best of the juice in the battery as well.

Here’s to clocking more revolutions!

tanishqkhare · January 20, 2022, 3:52pm

Thank you for this insightful blog. It was somewhat technical, but it did get the point across.

thejus.g · January 20, 2022, 10:19pm

This was quite an insightful read on how the data is read and processed. Kudos to the Ather team, and wishing you all the best for more innovations.

harish.gautham · January 21, 2022, 3:16am

It’s very useful data, and it’s been made user-friendly; all of this is a customer delight. Well-designed software, and all credit goes to Team Ather under the guidance of Tarun Mehta. Hitesh Bhatt, well explained; we were able to understand the behind-the-screen activity.

abhishek.balaji · January 27, 2022, 5:30am

A post was split to a new topic: Ride stats not showing up since purchase