Rearden Commerce deems a decade of Big Data worth crunching to enhance flight search
In a Deem Labs project, Rearden personnel used Apache Hadoop, Apache Pig and MySQL to transform and analyze US Bureau of Transportation Statistics data on the 67 million flights operated by 24 US airlines from January 1, 2001, to December 31, 2010.
While static infographics for blogs have been a data and marketing craze in the travel industry of late, Rearden has gone a step further and created what it calls an “animated infographic” in the form of a YouTube video, Heat Map, Seasonal On-Time Flight Performance for US Airports.
The video, which is “best viewed at 720p resolution,” depicts a color-coded analysis of flight performance on a daily basis over an average year during the decade studied, Rearden says.
Rearden’s goal in all of this “is to overlay the information onto our Deem Travel booking platform, helping users select the best travel routes for any date or context, based on history,” a spokeswoman says.
The company already has integrated a limited amount of this flight reliability data into Deem — with more to come — for its corporate clients and their employees.
“Fliers will no longer have to rely on superstitions when planning travel,” the spokeswoman says. “Through Deem, they’ll have historical data to gauge the best route for any domestic flight, any time of the year.”
That is the idea, anyway.
“Right now, when a traveler sees flight search results on Deem, they see historical on-time performance for each segment — as a probability percentage for that flight arriving on-time,” says Steve Bernstein, Rearden’s head of analytics. “This new analysis gives us the ability to guide travelers toward alternate routes. For example, when one segment within a journey has poor historical performance, we can automatically suggest a new connection with better performance.”
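Rearden has not published how it picks the alternate connection, so the following is only a minimal sketch of the idea Bernstein describes. It assumes each segment carries a historical on-time probability, flags an itinerary whose weakest segment falls below a cutoff, and suggests the alternative whose weakest segment performs best. The `Segment` class, the function names, the 0.75 threshold and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    origin: str
    destination: str
    on_time_probability: float  # historical share of on-time arrivals, 0.0 to 1.0


def weakest_segment(itinerary):
    """Return the segment with the lowest historical on-time probability."""
    return min(itinerary, key=lambda seg: seg.on_time_probability)


def suggest_alternative(chosen, alternatives, threshold=0.75):
    """Suggest a replacement itinerary, or None if no suggestion is needed.

    A suggestion is made only when the chosen itinerary has a segment below
    `threshold` (an assumed cutoff) and some alternative has a better weakest
    segment. Among those, the one with the best weakest segment wins.
    """
    worst = weakest_segment(chosen).on_time_probability
    if worst >= threshold:
        return None  # every segment already performs acceptably
    better = [
        alt for alt in alternatives
        if weakest_segment(alt).on_time_probability > worst
    ]
    if not better:
        return None
    return max(better, key=lambda alt: weakest_segment(alt).on_time_probability)


# Made-up figures for illustration only; not taken from Rearden's data.
chosen = [Segment("SFO", "EWR", 0.60), Segment("EWR", "BOS", 0.85)]
via_ord = [Segment("SFO", "ORD", 0.80), Segment("ORD", "BOS", 0.82)]
via_den = [Segment("SFO", "DEN", 0.70), Segment("DEN", "BOS", 0.90)]

suggestion = suggest_alternative(chosen, [via_ord, via_den])
if suggestion is not None:
    print(" -> ".join([suggestion[0].origin] + [s.destination for s in suggestion]))
```

Ranking by the weakest segment is one design choice among several. A production system might instead weigh total travel time, fare or the combined probability of making every connection.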
Among the myriad findings gleaned in the Big Data analysis of US flights were:
- There was a 13% chance of a flight departing or arriving late or being diverted or cancelled;
- December 23 was the worst day of the year to fly during the 10-year period, with a 25% chance of a flight being late, cancelled or diverted;
- October 3, on average, was the best day to fly with 7 in 1,000 arrivals and departures diverted, cancelled or late;
- The airport with the worst arrival performance was Dutch Harbor, Alaska, with about 33% of flights arriving late;
- Newark Airport, with 25% of flights arriving late, was the worst-performing major airport in the US in terms of arrivals over the 10-year period;
- The summer vacation period is just as bad in terms of on-time flight performance as is the winter; and
- July 4 averages “exceptional on-time performance,” Rearden says.
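Findings such as the best and worst days of the year come down to grouping flights by calendar day across all years and computing the share that were late, cancelled or diverted. Rearden did this at scale with Hadoop and Pig; the snippet below is only a small in-memory sketch of the same aggregation, with hypothetical function names and made-up sample records.

```python
from collections import defaultdict
from datetime import date


def disruption_rate_by_day(flights):
    """Map (month, day) to the share of flights that were disrupted.

    `flights` is an iterable of (flight_date, was_disrupted) pairs, where
    was_disrupted is True for a late, cancelled or diverted flight. Flights
    from different years are pooled by calendar day.
    """
    totals = defaultdict(int)
    disrupted = defaultdict(int)
    for flight_date, was_disrupted in flights:
        key = (flight_date.month, flight_date.day)
        totals[key] += 1
        if was_disrupted:
            disrupted[key] += 1
    return {key: disrupted[key] / totals[key] for key in totals}


def best_and_worst_day(rates):
    """Return the calendar days with the lowest and highest disruption rate."""
    best = min(rates, key=rates.get)
    worst = max(rates, key=rates.get)
    return best, worst


# Toy records for illustration only; not real Bureau of Transportation Statistics data.
sample = [
    (date(2001, 12, 23), True),
    (date(2002, 12, 23), False),
    (date(2001, 10, 3), False),
    (date(2002, 10, 3), False),
    (date(2001, 7, 4), False),
    (date(2002, 7, 4), True),
    (date(2003, 7, 4), False),
    (date(2004, 7, 4), False),
]

rates = disruption_rate_by_day(sample)
best, worst = best_and_worst_day(rates)
print("best day:", best, "worst day:", worst)
```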
That is just a taste of what Rearden found. After all, the Big Data was huge.
Dennis Schaal was North American editor for Tnooz.