But as with most up-and-coming technologies, there is much confusion about what big data really means and how it can improve the customer experience and lead to more sales.
This article is meant to serve as a technical primer and will show some of the potential future applications across the travel ecosphere.
Big Data 101
The term “Big Data” is applied in several ways, but it is largely self-explanatory. Big data just means “a lot of data”: datasets that exceed the capabilities of a typical database in size, workload and overall cost, together with the technologies that enable the extraction of meaning from them.
So where might we encounter a lot of data in travel?
A prime example would be the analytics logs of an online travel agency. For years, analytics tools have enabled companies to keep track of conversion funnels, detailed demographic statistics, and other pertinent information such as which pages convert the best, have the highest bounce rates, etc.
These reports are then used to optimize sites to ensure the best possible conversions. But in the era of big data, companies are gathering and paying attention to much more data than seemed possible just a few years ago.
For example, some sites are now gathering the detailed mouse movements of customers in real time as they move around pages.
This generates literally millions of coordinates and data points for every user, allowing companies unparalleled insights into what users are doing when they’re on a page.
Just a few years ago, storing the amount of data required for projects like this would’ve been prohibitively expensive.
These days, cheap storage and distributed file systems that spread data across dozens (or hundreds) of commodity computers make it possible to store petabytes of data efficiently and without massive cost.
As storage technology improves, the costs of keeping every last byte of data for later analysis will keep going down.
Big data analysis, MapReduce and the extraction of meaning
Having all this data is nice, but the real value lies in the extraction of meaning from it. Big data tools such as MapReduce, based on technology originally invented at Google, make it easy to discover common trends in data and to act on those trends.
This is more easily demonstrated by an example:
Imagine you had an Excel spreadsheet of every hotel in the world and wanted to find the ones most commonly described as “awesome”. [NB: Ed - Are you sure? ]
The raw data you have might look something like the following:
Hotel Name, Review
“Hotel A”,“Miserable experience”
“Hotel B”,“Awesome pool!”
“Hotel C”,“Liked it”
“Hotel B”,“Awesome restaurants”
“Hotel A”,“Loved it”
“Hotel C”,“Awesome experience”
Even though we’re only showing a few records here for simplicity, this spreadsheet would have hundreds of thousands of rows.
With MapReduce, you could write a Map function that tags each review with its hotel name, so that all the reviews for the same hotel are gathered into one group.
In the above example, “Hotel B” occurs twice, so the Map step would produce a collection resembling something like this:
“Hotel B”: “Awesome pool!”, “Awesome restaurants”
Already, the map function has helped us find every review for “Hotel B”. But we’re not done yet — we’ll let the Reduce function do its job.
This function lets us perform any type of analysis we want on the collection that Map created for us. In our case, we wanted to find only reviews containing the word “awesome.” Reduce would contain computer code that did this:
“If the review contains the word awesome, increment an internal counter by 1”
This internal counter would essentially act as a score, or number of times the word “awesome” appeared for a given collection. In our example, “Hotel B” would come out on top with 2 “awesome” reviews, “Hotel C” would come out with 1, and “Hotel A” with 0.
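The walkthrough above can be sketched in a few lines of Python. This is a minimal, single-machine illustration rather than a real distributed job: the function names (`map_phase`, `shuffle`, `reduce_phase`, `awesome_scores`) are made up for this example, and the grouping step that a MapReduce framework normally performs between Map and Reduce is written out by hand.

```python
from collections import defaultdict

# Sample rows from the article: (hotel name, review text).
REVIEWS = [
    ("Hotel A", "Miserable experience"),
    ("Hotel B", "Awesome pool!"),
    ("Hotel C", "Liked it"),
    ("Hotel B", "Awesome restaurants"),
    ("Hotel A", "Loved it"),
    ("Hotel C", "Awesome experience"),
]


def map_phase(rows):
    """Emit one (hotel, review) key-value pair per input row."""
    for hotel, review in rows:
        yield hotel, review


def shuffle(pairs):
    """Group emitted values by key (a framework would do this between phases)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(hotel, reviews):
    """Count the reviews for one hotel that contain the word 'awesome'."""
    score = sum(1 for review in reviews if "awesome" in review.lower())
    return hotel, score


def awesome_scores(rows):
    """Run map, shuffle and reduce in sequence and return {hotel: score}."""
    grouped = shuffle(map_phase(rows))
    return dict(reduce_phase(hotel, reviews) for hotel, reviews in grouped.items())


scores = awesome_scores(REVIEWS)
# Rank hotels from most to fewest "awesome" reviews.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranking)
```

On a real cluster, the Map and Reduce functions would run in parallel on many machines, each handling a slice of the data; the logic inside each function stays this simple.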
So what did we accomplish in the above example? We found commonality in raw, unstructured data and then analyzed it for a business purpose.
While simple, this is the raw power of MapReduce and similar technologies: Extracting meaning when there previously wasn’t any.
The amazing part of this is that MapReduce can analyze billions – or trillions – of data points and find patterns in them. All on clusters of commodity hardware, and at a cost that even a startup can afford.
Can you see the potential yet?
State of Big Data in travel
The ability of big data technology to enable us to find intelligence in vast amounts of data presents a clear, massive opportunity to reshape the way consumers are marketed and sold to in travel.
There is no longer any question that companies with ongoing big data projects are at the forefront of an entirely new way to sell travel to customers.
Emmanuel Marchal, currently at big data storage company Acunu and a former director at LikeCube, a company that leveraged big data for consumer travel sites and was later acquired by TimeOut, spoke with me briefly about the current state of big data in the travel industry:
- Big data applications are moving from profiling to true personalization. For example, true personalization would enable a site to recommend a specific hotel to a specific traveler based on their specific wants, needs, and previous purchase patterns, rather than a generic set of recommendations based on the type of traveler.
- True personalization is the main driver and “holy grail” of all big data efforts in travel.
- OTAs and other sellers of travel now see their big data efforts as “must haves” rather than “nice to haves”.
- 2009 was the year of talking about it. 2010 was the year of starting big data projects. 2011 was the year of the prototypes. Will 2012 be the year these efforts finally see broad implementation?
In addition to personalization, efforts such as Hopper (NB: Disclosure – Hopper CEO Fred Lalonde is chairman of Tnooz) are demonstrating another prime example of how big data can empower consumers.
Large-scale analysis of data related to places, combined with natural language processing (NLP), enables search queries such as “nearby beach vacations under $500”. While this represents a combination of technologies, all of them are enabled by the application of big data processing.
Into the future
While personalization is surely a holy grail, there are literally thousands of potential uses of big data in travel.
Geo-fencing, the process of knowing when a traveler is near a certain attraction or vendor, is starting to emerge.
An example of this is the recently launched Foursquare Radar feature, which alerts you when you are near a place you previously asked to be reminded of.
This technology is pure big data: gathering your coordinates in real time via your mobile phone’s GPS and realizing when you are in a certain boundary.
Enabled across millions of consumers, the amount of data gathering and processing required for efforts like this was unthinkable just a few years ago.
Think about it: how many GPS coordinates do you generate in a given day? The potential of geo-fencing to marketers is nothing short of amazing.
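At its core, the boundary check behind geo-fencing is a distance test. The sketch below assumes the simplest possible fence, a circle around a point, and uses the standard haversine formula for great-circle distance. The function names, the coordinates and the 200 metre radius are illustrative assumptions, not details of Foursquare Radar or any other product.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_fence(position, fence_center, radius_m):
    """Return True if a (lat, lon) position lies within a circular fence."""
    return haversine_m(*position, *fence_center) <= radius_m


# Illustrative values only: a fence of 200 m around a hypothetical attraction.
ATTRACTION = (48.8584, 2.2945)
FENCE_RADIUS_M = 200

nearby_traveler = (48.8590, 2.2950)   # a short walk from the attraction
distant_traveler = (48.8738, 2.2950)  # well over a kilometre away

print(inside_fence(nearby_traveler, ATTRACTION, FENCE_RADIUS_M))
print(inside_fence(distant_traveler, ATTRACTION, FENCE_RADIUS_M))
```

The big data challenge is not this check itself but running it continuously for millions of travelers against millions of fences, which is where distributed storage and processing come in.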
Image data processing: Billions of photographs representing petabytes of storage are uploaded to the internet every day.
Photo sharing apps such as Instagram might seem a great way to share moments with friends, but the true value behind these startups lies in the data they’re collecting on the backend. Each photo generates a mountain of data that, with further analysis, reveals a host of information on the user uploading it.
Color, the much-maligned startup that debuted with a stratospheric valuation before even launching its product, surely wowed its investors with the potential of all the data it was hoping to collect.
In travel, the startup Jetpac is already using these image processing methods to put a different spin on the social travel concept.
Recently acquired by eBay, the startup Hunch represented one of the finest examples of finding commonality in previously disparate forms of data.
Its API and publicly available test tool allow anyone to find out things like “People who prefer Subaru cars also prefer to stay at 5 star resorts 40% of the time” (this is not a real example!).
While these types of correlations might seem of questionable use, imagine the power of putting these concepts to work when you’re trying to market and cross-sell to a traveler.
Taste Graphs turn knowing “something” about your customer into knowing a lot of other things about that customer. The use and application of these Taste Graphs is sure to become much wider in the coming year in everything from online retailers to, you guessed it, travel.
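One simple way to arrive at statements of the form “people who prefer X also prefer Y some percentage of the time” is to count how often preferences occur together. The sketch below is an assumption about how such a graph could be built, not a description of Hunch's actual method, and the customer data is entirely made up.

```python
from collections import Counter
from itertools import permutations

# Hypothetical data: one set of stated preferences per customer.
CUSTOMER_PREFERENCES = [
    {"Subaru", "5-star resort", "hiking"},
    {"Subaru", "5-star resort"},
    {"Subaru", "hostel", "hiking"},
    {"sedan", "5-star resort"},
    {"Subaru", "hiking"},
]


def build_taste_graph(preference_sets):
    """Return {(a, b): share of customers who like a that also like b}."""
    item_counts = Counter()
    pair_counts = Counter()
    for prefs in preference_sets:
        item_counts.update(prefs)
        # Count every ordered pair (a, b) of distinct items liked together.
        pair_counts.update(permutations(sorted(prefs), 2))
    return {
        (a, b): pair_counts[(a, b)] / item_counts[a]
        for (a, b) in pair_counts
    }


graph = build_taste_graph(CUSTOMER_PREFERENCES)
share = graph[("Subaru", "5-star resort")]
print(f"People who prefer Subaru also prefer 5-star resorts {share:.0%} of the time")
```

Each key in `graph` is a directed edge, so knowing one preference lets a marketer look up every other preference that tends to accompany it. With real data the same counting would be spread across a cluster, for instance as a MapReduce job.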
To sum it all up, big data isn’t an amazing new magic box you’ll be able to buy from a vendor.
Big data merely sums up the concept that we each generate billions of data points every day and that with the economically viable application of technology, we can be sold to better than ever before.
Are you ready?