Data analysis. The taxi sector

Big Data

Recently I had the chance to talk with a taxi driver from Madrid. He complained about how new technologies were damaging the business and how the taxi sector was losing clients. When he finished, I asked him: but do you use #BigData to help you make decisions?

A clear answer: no.

My next question was: does the taxi sector save its data (positions, times, …)?

Answer: no.

After my questions, he asked me: can saving and analysing data benefit us?

Saving data

The main objective should be:

If you want to make the most of your working time, the ideal is to have the flag down for as much of the time as possible. This may mean more runs or fewer, but always with the flag down and, of course, with as little waiting time as possible.

The data can be saved from what GPS devices already store, together with the free or busy indicator.
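As a minimal sketch of what could be done with such a log, the snippet below measures how much of the recorded time one taxi spent with the flag down. The log format (a timestamp plus a free/busy status each time the indicator changes) and the sample values are assumptions made for illustration, not a real device format.

```python
from datetime import datetime

# Hypothetical log for one taxi: (timestamp, status), written each time the
# free/busy indicator changes. Format and values are illustrative only.
log = [
    ("2015-01-12 08:00", "free"),
    ("2015-01-12 08:20", "busy"),
    ("2015-01-12 08:50", "free"),
    ("2015-01-12 09:00", "busy"),
    ("2015-01-12 09:30", "free"),
]


def busy_share(records):
    """Return the fraction of logged time spent with the flag down (busy)."""
    parsed = [(datetime.strptime(t, "%Y-%m-%d %H:%M"), s) for t, s in records]
    busy = total = 0.0
    # Each status lasts until the next record; the last record only closes
    # the previous interval.
    for (start, status), (end, _) in zip(parsed, parsed[1:]):
        seconds = (end - start).total_seconds()
        total += seconds
        if status == "busy":
            busy += seconds
    return busy / total if total else 0.0


print(f"Flag down {busy_share(log):.0%} of the logged time")
```

The same function could be run per driver, per day or per zone to see where the flag stays down the longest.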

Perhaps the data from only one car is not a valid source, but what if you had the data from all the taxis in a city over several years? Things change!

Another useful piece of information is the number of occupants for each service, which the driver should enter into the system before starting the run.

Data Analysis

The first question to analyze is: where are the most probable starting points?

I understand that the majority of runs should start from the official stops, but is it possible to put a value on the average time that a car has to wait at a given stop?

With data you can know the average waiting time at every stop under the conditions of the moment (working day, weekend, one-off event, …). If you combine this information with how many taxis are at a stop in real time, a free taxi can decide whether to stay or to go to a different stop.
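A small sketch of that idea, assuming each waiting record has already been reduced to a stop, a type of day and the minutes waited before the next fare. The stop names, the day types and the numbers are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (stop, day_type, minutes waited before the next fare).
waits = [
    ("Stop A", "working day", 12),
    ("Stop A", "working day", 8),
    ("Stop A", "weekend", 25),
    ("Stop B", "working day", 5),
    ("Stop B", "weekend", 9),
    ("Stop B", "weekend", 11),
]


def average_wait(records):
    """Average waiting time in minutes for every (stop, day_type) pair."""
    grouped = defaultdict(list)
    for stop, day_type, minutes in records:
        grouped[(stop, day_type)].append(minutes)
    return {key: mean(values) for key, values in grouped.items()}


def best_stop(averages, day_type):
    """Stop with the lowest average wait for the given type of day, or None."""
    candidates = {stop: avg for (stop, dt), avg in averages.items() if dt == day_type}
    return min(candidates, key=candidates.get) if candidates else None


averages = average_wait(waits)
print(best_stop(averages, "weekend"))
```

With years of data from a whole city, the same grouping could include the hour of the day or the weather, and the real-time number of taxis queuing at each stop could be used to adjust the choice.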

Another important question is how likely a taxi is to pick up a fare while driving along a street, away from an official stop. Going even further, the official stops could be relocated in order to maximize benefits.
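As a rough first approximation, the share of services that start away from a stop can be counted directly, along with the streets where this happens most often. The record format and the street names below are assumptions for illustration, and this share is only a proxy for the real probability of finding a fare on a given street.

```python
from collections import Counter

# Hypothetical service records: (street, started_at_official_stop).
services = [
    ("Street A", True),
    ("Street B", False),
    ("Street B", False),
    ("Street C", True),
    ("Street B", True),
]


def street_pickup_share(records):
    """Fraction of services that started away from an official stop."""
    if not records:
        return 0.0
    off_stop = sum(1 for _, at_stop in records if not at_stop)
    return off_stop / len(records)


def busiest_off_stop_streets(records, top=3):
    """Streets with the most pickups away from a stop, most frequent first."""
    counts = Counter(street for street, at_stop in records if not at_stop)
    return counts.most_common(top)


print(street_pickup_share(services))
print(busiest_off_stop_streets(services))
```

Streets that keep appearing at the top of that list are natural candidates when thinking about relocating stops.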

Analyzing the information geographically is also important: which neighbourhoods are more likely to take a taxi, by day, by hour, …. The big objective is to know how the city moves, in order to adjust the taxi sector to it.

One-off events

One-off events are events that concentrate people in a specific part of the city, such as a football match or a concert by a famous artist.

These events change the normal behaviour (in theory) and should be taken into account, because being quick to anticipate demand (with prediction models) is an advantage that allows the taxis to be managed more efficiently. As an example, imagine a football match with 100,000 people, where you have already found that 1% of the public takes a taxi home. That is 1,000 riders, so, assuming one rider per taxi, you need at least 1,000 taxis ready around the stadium when the match finishes. With all these data you can also know which parts of the city are using your services.
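The arithmetic of the football example can be written as a tiny function. The parameter names and the idea of several riders sharing one taxi are assumptions added for illustration; the 100,000 attendance and the 1% share come from the example above.

```python
import math


def taxis_needed(attendance, taxi_share_percent, riders_per_taxi=1):
    """Estimate the taxis needed after an event.

    attendance: number of people at the event
    taxi_share_percent: percentage of them expected to take a taxi
    riders_per_taxi: assumed average number of riders sharing one taxi
    """
    riders = attendance * taxi_share_percent / 100
    return math.ceil(riders / riders_per_taxi)


# The football match from the example: 100,000 people, 1% take a taxi.
print(taxis_needed(100_000, 1))
# Same match, assuming two riders share each taxi on average.
print(taxis_needed(100_000, 1, riders_per_taxi=2))
```

In practice the percentage itself would come from the stored data of past events of the same kind.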

Benefits of data analysis

Being a fragmented sector, as the taxi sector is, may be a disadvantage for collecting and analysing data, but the truth is that the benefit for a single driver can be enormous, and for the sector as a whole too, since the analysis offers several strategies for the short and the long run.

As an example of studying routes, there is a study of where runners run in Madrid and Barcelona, based only on their GPS data: http://geospatialtraininges.com/2014/12/09/runners-bigdata-y-gis/ (in Spanish).

These are my reflections after that conversation, and I look forward to your comments! Have a nice day!