4 years ago

Big travel data becomes fodder for some intriguing hacks at THack Boston

The results from the THack event in Boston are in: more than 60 people formed 10 teams to compete for three prizes of $1,000 each, plus $500 worth of hotel rooms as a door prize. The teams were given about 10 hours of hack time – plus the night before to come up with potential hacks – to deliver their concepts to the packed space at hack/reduce.

Each team was given the following data:

  • 1.2 billion global flight records from 2011, broken down by month, from ARC
  • 6 months of historical search data from Amadeus
  • comprehensive flight records (2011) from FlightStats
  • recent search data from early September via Travelport

The winners were determined by a three-judge panel: Sundar Narasimhan, now a Director at Google Travel following the company’s acquisition of ITA Software; Ryan Betts, field CTO of VoltDB; and our own Gene Quinn, co-founder of Tnooz.

Judges’ feedback

Looking at the competition as a whole, the judges were excited by both the results and the potential.

Gene Quinn, of Tnooz:

“Hackathons today are the science fairs of yesterday that built things like the space program. The digital distribution and marketing of travel are dependent on both creative and business intelligence people who use tools that first get dreamed up at events like hacks and meetups.”

“Individually, the teams showed a solid effort, especially given only 10 hours to come up with a hack.”

Ryan Betts, from VoltDB:

“Nice team work. Folks still struggle with data that’s ‘bigger than a laptop.’ Personally, as a data management tool maker, this is always of interest to me.

“My main takeaway – even though travel is pervasive and broadly ‘online’ now, the hacks and questions raised made it clear consumers are under-informed – definite opportunity here. Possible to disrupt the airline industry though? Unclear.”

Gene Quinn:

“All of the projects showed different levels of the hackers’ skill with data analysis tools and knowledge of the machinery that makes travel happen.

“Several of the teams had difficulty processing some of the vast data sets, so they took samples and developed models and assumptions. That was smart and nimble on their part. And most of them tried to visualize their findings in presentations, which is always powerful.”

Sundar Narasimhan, of Google:

“I was impressed at the energy and engagement that travel applications and data still generate given that the industry’s maturity; I thought the teams had great ideas and focused on the right sort of questions in order to understand and expose the right data / patterns to consumers and airlines/airports — even though the data sets and arcana that the airline industry represents and deals with day to day is not exactly trivial to digest in a day.”

“It was a shame that some of the hacks couldn’t use all the data provided, but understandable given the volumes. The scope for innovation and redefining the travel exploration and purchase for users is still rich and wide. Made me wonder what might have happened if we had extended the hack to day 2!”

The project breakdown

Here’s a breakdown of the various projects, followed by a highlight of the winning teams.


This one-man team didn’t have a full working hack to demo, but focused on a very specific travel hacker use case: the ability to determine whose metal (i.e., which airline’s branded aircraft) a passenger would actually fly on when booking a codeshare flight.

Within global alliances, many different airlines market the same flights, and it’s not always clear which airline’s plane one will be traveling on. Other interesting data points from the analysis included which airline marketed the most codeshares, and which was responsible for the most actual flights.
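To make the codeshare idea concrete, here is a minimal sketch in Python. It is not the team’s code: the record layout (marketing carrier, operating carrier, flight number) and the sample rows are made up for illustration, and each row is assumed to be one booked segment. A segment counts as a codeshare when the carrier that sold it differs from the carrier that flew it.

```python
from collections import Counter

# Hypothetical sample rows: (marketing_carrier, operating_carrier, flight_number).
# Each row is assumed to represent one booked segment.
SEGMENTS = [
    ("AA", "AA", "AA100"),
    ("BA", "AA", "BA1511"),  # sold by BA, flown on AA metal
    ("IB", "AA", "IB4218"),  # sold by IB, flown on AA metal
    ("BA", "BA", "BA178"),
    ("AA", "BA", "AA6143"),  # sold by AA, flown on BA metal
    ("BA", "IB", "BA7001"),  # sold by BA, flown on IB metal
]


def is_codeshare(marketing, operating):
    """A segment is a codeshare when seller and operator differ."""
    return marketing != operating


def codeshare_summary(segments):
    """Count codeshares per marketing carrier and segments per operating carrier."""
    marketed = Counter()
    operated = Counter()
    for marketing, operating, _flight in segments:
        operated[operating] += 1
        if is_codeshare(marketing, operating):
            marketed[marketing] += 1
    return marketed, operated


marketed, operated = codeshare_summary(SEGMENTS)
print("Most codeshares marketed:", marketed.most_common(1))
print("Most segments operated:", operated.most_common(1))
```

Real schedule data would need de-duplication, since one physical flight can appear under several marketing flight numbers.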

FlyMate Corporation

A group of undergrad statisticians from Harvard was not able to get through the travel data in just one day – as mentioned in our previous article about the big travel data hack, many of the participants had never worked with travel data sets before.

So rather than simply skip a presentation, the gregarious William Chen introduced FlyMate Corporation, a company dedicated to providing delay insurance to both airlines and passengers, in the context of the $19 billion in delay-related losses. With FlyMate, delays would no longer go uncompensated.

A fun, non-coded concept that highlighted how much weather-related delays affect the travel experience, alongside more standard maintenance, staffing and aircraft availability information.


This one-man team was working towards creating a domain-specific language (DSL) that would allow different data sets to interoperate. Given some of the challenges of working with data from different providers at this hackathon, this is indeed an interesting open travel challenge and opportunity. How can we work better and more effectively across different siloed data sets?

Team Awesome

Using the provided data sets, this team attempted to answer the following question: How can we use query volume to come up with a demand profile? The team focused specifically on demand for LON-NYC.

By comparing search queries with actual ticket sales, this hack promises to surface patterns that would let airlines predict demand ahead of time. The airlines, which have their own forecasting tools, could benefit from more industry-wide patterns identified by search data – and could also see the percentage of searches that turn into actual bookings.

The team definitely felt the time pinch, and said that, “With more time it would be easier to build a predictive model for airlines.”
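The search-versus-booking comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team’s model: the monthly counts are invented, and the function name `conversion_pct` is hypothetical.

```python
def conversion_pct(searches, bookings):
    """Return {month: percentage of searches that became bookings}.

    Months with no recorded searches map to None, since no rate can be computed.
    """
    rates = {}
    for month, looks in searches.items():
        books = bookings.get(month, 0)
        rates[month] = 100 * books / looks if looks else None
    return rates


# Invented monthly counts for a single route such as LON-NYC.
searches = {"2011-01": 1200, "2011-02": 900, "2011-03": 1500}
bookings = {"2011-01": 60, "2011-02": 30}

rates = conversion_pct(searches, bookings)
for month, rate in sorted(rates.items()):
    print(month, rate)
```

A month where searches rise while the conversion rate holds steady would suggest rising demand, which is the kind of signal a predictive model could build on.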

Inspire to Travel

This team wanted to be the link between consumers and airlines, which have their own discrete needs that could be combined for better trips and profits. Consumers want to know where to travel, where others are traveling and to find destination information. Airlines want to know where people fly the most, where there are currently delays, and how they can be more effective in getting people from A to B.

So the team used the available data to show the popularity of destinations month by month, both to show consumers where they might be interested in going, based on where others are traveling, and to show airlines where consumer interest is trending.
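A month-by-month popularity ranking of this kind can be sketched as follows. The `(month, destination)` record shape and the sample airport codes are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def top_destinations_by_month(records, n=2):
    """Return {month: [the n most frequent destinations, most popular first]}."""
    counts = defaultdict(Counter)
    for month, destination in records:
        counts[month][destination] += 1
    return {
        month: [dest for dest, _ in counter.most_common(n)]
        for month, counter in sorted(counts.items())
    }


# Invented records, one per flight or search: (month, destination airport).
records = [
    ("2011-01", "MIA"), ("2011-01", "MIA"), ("2011-01", "DEN"),
    ("2011-07", "LHR"), ("2011-07", "LHR"), ("2011-07", "LHR"),
    ("2011-07", "CDG"), ("2011-07", "CDG"), ("2011-07", "MIA"),
]

ranking = top_destinations_by_month(records)
print(ranking)
```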


This team was looking at searches date by date, with the eventual goal of showing how searches change over time in particular regions.



By showing the historical trends, the tool would be able to look at things like seasonal spikes in search, while also offering the ability to switch between destination and departure point. So the origin or destination becomes the focus of historical trends, offering insight into where’s popular when.

Data Slaves

Despite some technological challenges related to dealing with billions of data points, this team’s ambitious approach employed Elasticsearch to try to determine the best time to book a flight based on various parameters.

By looking at the available data, they had hoped to create a visualization that would allow people to really know just how far out they should book a specific ticket – not necessarily route-by-route, but specifically flight-by-flight. The team was stymied by the ambitious scope of the project, but nonetheless, the use of Elasticsearch was very interesting to the judges as a potential way to quickly search through reams of data with less resource use.

Team 420

This team was focused on matching supply with demand for airlines, and creating a robust analytical tool to facilitate this process.

WINNER: SmartTrip Planners

SmartTrip Planners were able to create a visually compelling dashboard that brought together the variety of available data for perusal. The dashboard went beyond time and price: what if travelers were also given average load factor, average security time, and other information, so that fares are not the only basis for travel decisions?


This could lead to customers being incentivized to book flights with lower load factors, and for airlines to promote specific routes that could use more customers.

The interface was very well thought out, given the short timeframe. The color of the line shows load factor and the line length shows the fare cost. There was also an analysis tool to measure KPIs against competitors. Key data points, such as the look/book ratio from the ARC data, allowed airlines to identify potential opportunities for improvement in marketing.

Another interesting feature for airlines is the “Route Analyser,” a tool for carrier route planning that shows the most profitable routes flown, and which routes have low fares and/or empty seats. The team also proposed a “Simulator View” which can actually include storm and delay data, among other points.

Finally, the team’s comprehensive look at the data included blog and social information to show airlines what people are researching and talking about in open forums.

The team did a fantastic job at pulling together valuable data points into an interface that could be easily used as a data-rich dashboard for data-driven decisions.

FlightR provided a very simple and visually appealing interface for flight intelligence.


The tool demonstrated the correlation between when people fly and the time of day they book their flights. The user can set a variety of parameters, such as fare cost and time of day, using a slider to see how changing one variable affects another.

Sundar Narasimhan, of Google:

“I thought flightR created a search/exploration user experience that focused on simplicity and business value.”

This visualization tool lets users see which flights are cheaper during which daypart, which is especially appealing to those who have a set budget and want to search by budget rather than time of day. A surprisingly fresh twist, as many folks are simply changing dates and times to look for a flight within their already-defined budget.

The tool also allows simple visualization of pricing trends over time, again allowing fare-minded individuals to see when the most affordable time to travel could be.

AUDIENCE FAVORITE: Amadeus’ pop culture travel concept

A large team from Amadeus came together to first determine the search spikes associated with place-based pop culture content.


By parsing search data following popular television shows, like The Amazing Race, the team was able to show how pop culture exposure delivers a visible boost in destination-related searches. They found 100 correlations between travel search spikes and pop culture mentions.
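One simple way to check for such a boost is to compare average daily searches just after an episode airs with the average just before. The sketch below assumes that approach; the team’s actual method is not described, and the daily counts, window length and function name `search_lift` are all hypothetical.

```python
from statistics import mean


def search_lift(daily_searches, air_index, window=3):
    """Ratio of mean searches in the `window` days from airing to the mean before.

    Returns None when either window is empty or the baseline is zero.
    """
    before = daily_searches[max(0, air_index - window):air_index]
    after = daily_searches[air_index:air_index + window]
    if not before or not after or mean(before) == 0:
        return None
    return mean(after) / mean(before)


# Invented daily search counts for one destination; the show airs on day 3.
daily = [100, 110, 90, 180, 160, 140, 105]

lift = search_lift(daily, air_index=3)
print("Lift after airing:", lift)
```

A lift well above 1.0 would flag a candidate spike; a real analysis would also need to rule out seasonal effects and other causes.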

They then used that knowledge to put together an interface that would allow users to search according to specific shows. This interface would not only allow travelers to search trips according to their favorite shows, but scheduled events related to showtimes could be integrated into destination marketing campaigns.

The team believes that predictive models could be built based on further analysis, so they would be able to provide more real-time feedback. The input data could also be expanded to include social media details that can inform the pop culture travel destinations.

Judge and Tnooz co-founder Gene Quinn summed the results up very well, focusing on the difficulties of the time limitation – and how future hacks could take advantage of more time to crunch data as a key step towards even more robust final hacks.

“With few exceptions, all the teams presented findings that begged for more time and analysis of the data. The point of this big data hackathon was not to wind up with a product, but with questions for more study that could lead to something like a product or a tool.”

Nick Vivion

About the Writer :: Nick Vivion

Nick is the Editorial Director for Tnooz. Prior to this role, Nick has multi-hyphenated his way through a variety of passions: restaurateur, photographer, filmmaker, corporate communicator, Lyft driver, Airbnb host, journalist, and event organizer.



  1. Gene Quinn

    Thanks, Mike.

    From the hackers’ POV, the big takeaways were tactical: 1) get us the data earlier, 2) give us more context about the data (most hackers don’t have travel industry experience) and 3) make the hackathon duration longer so we can dig deeper.

    From the data providers, among the takeaways: 1) it’s hard moving such large data sets from home systems to “temporary” computing clusters … lots more learning needed here, 2) take more time up front explaining the data sets to the hacking group, 3) offer pre-loaded public data sets that teams could mash up with the travel companies’ data, and 4) have more mentors available to explain the various software tools needed to work efficiently with the data.

    Even with all of these issues, it was exciting to witness lots of workarounds and sampling/modeling short cuts by these teams.


  2. Michael Premo

    Hi Nick,

    Very exciting stuff. We at ARC were happy to have our data in the mix as well as some of our staff who worked on the SmartTrip Planner winning concept. Would welcome feedback from you, Gene or others who judged or participated on ARC data and suggestions on how to make it more accessible for future THacks!

    Mike Premo
    President & CEO

