Data warehouses: get answers fast
14 October 2011
In November of 2010, IBM completed its acquisition of Netezza, a maker of data warehouse appliances -- integrated sets of servers, storage resources and software specifically designed for data warehousing. The acquisition raised a few eyebrows, but given the value IBM Software and its clients received, the buy looks to me to be a bargain... especially when you consider that at many organizations, data warehousing works less like magic, and more like a Magic Eight Ball.
Perhaps you’re not familiar with this classic children’s toy? Essentially, it was a source of quick but dubious wisdom. The idea behind it was simple: begin with a question, shake the eight ball, and get your answer.
So you could ask it things like: Who stole my cookies during recess?
Then the toy would deliver an answer that was usually both wonderfully vague and wholly unrelated to your question. Example: Signs point to maybe.
This is not an answer steeped in obvious business value. Still, it did have the virtue of arriving in five seconds flat.
And what the Magic Eight Ball lacked in predictive accuracy it made up for in management complexity -- which is to say, it had none. Unless you threw it violently against a wall, as my friend Steve did on two separate occasions in the third grade, the Magic Eight Ball would continue to function without maintenance. So it was both fast and trouble-free.
Unfortunately, that statement doesn’t always apply to today’s enterprise-class data warehousing and analytics architectures.
Too frequently, the architecture is simply too slow, too complex and much too difficult to develop from scratch. Even just keeping it running properly requires excessive time, effort and money. The result is higher operational costs, reduced business agility and a slower, less competitive response to market dynamics.
As a result, business leaders could be forgiven if, at times, they feel an urge similar to my childhood friend Steve's.
Transactions-based systems can't match architectures that are designed for analytics
How did this situation come about? Often, it stems from the fact that the data analysis architecture was never designed properly for data analysis in the first place. Instead, it was cobbled together from existing database systems, which were originally intended for a fundamentally different purpose (business transactions).
And while this jury-rigged design may have worked reasonably well at first, its shortcomings have, over time, become increasingly exposed. In part, this is because both the need for quality analytics and the scope of the data being analyzed have grown to levels unimaginable when the system was originally implemented.
Recently I had an opportunity to speak with Michael Kearney, Senior Product Marketing Director for IBM Netezza, about these and related issues, and he confirmed my suspicions on this topic.
“Big data creates challenges and opportunities for the warehouse,” said Kearney. “Database systems designed to process transactions cannot perform at the scale of systems designed exclusively for advanced analytic processing -- and organizations can create value from their data by processing analytic algorithms.”
What, exactly, is meant by the phrase “designed exclusively for advanced analytic processing”?
In short, it means systems that were created with huge data volumes and advanced analytics in mind from day one. In other words, instead of general-purpose systems that would be just as well suited to serving as e-mail servers or web hosts, these are systems that really take into account what businesses want to achieve via data warehousing and analytics, and make that happen with incredible speed and efficiency.
So, for instance, imagine having this capability: A fleet of as many as a hundred different server blades, each of which has multiple processors, each of which has multiple processing cores. Together, these blades form a vast analytical engine that is both (a) load-balanced to handle unpredictable demand levels, and (b) optimized for performance via massively parallel processing technology.
To that, add even more intelligence in the form of query analysis and execution that is physically close to where data is stored, unlike older architectures that attempt to move data to the query.
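To make the idea of moving the query to the data a little more concrete, here is a minimal Python sketch. It is not Netezza's implementation; the blade contents, function names and revenue figures are all hypothetical, and threads merely stand in for blades working at the same time. Each "blade" reduces its own slice of records to a small summary, and only those summaries travel to a coordinator that merges them.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each "blade" holds its own slice of records
# as (region, revenue) pairs. Names and numbers are illustrative only.
BLADES = [
    [("east", 120.0), ("west", 80.0)],
    [("east", 30.0), ("south", 50.0)],
    [("west", 20.0), ("south", 10.0), ("east", 50.0)],
]


def local_revenue_by_region(rows):
    """Runs next to the data: reduce one blade's rows to a small summary."""
    totals = {}
    for region, revenue in rows:
        totals[region] = totals.get(region, 0.0) + revenue
    return totals


def merge(partials):
    """Runs on the coordinator: combine the small per-blade summaries."""
    combined = {}
    for partial in partials:
        for region, subtotal in partial.items():
            combined[region] = combined.get(region, 0.0) + subtotal
    return combined


def parallel_revenue_by_region(blades):
    # One worker per blade stands in for blades working simultaneously.
    with ThreadPoolExecutor(max_workers=len(blades)) as pool:
        partials = list(pool.map(local_revenue_by_region, blades))
    return merge(partials)


result = parallel_revenue_by_region(BLADES)
print(result)
```

The point of the design is that the raw rows never leave the blade that stores them. However large each slice grows, the coordinator only ever receives one small dictionary per blade.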
Get the answers you need -- not just faster, but orders of magnitude faster
That doesn't sound like your father's data analytics architecture, does it? It's not.
It is, in fact, so much better than the traditional approach that many organizations find the difference almost shocking -- whole orders of magnitude faster than the transactions-based systems that are being replaced.
Said Kearney: “By replacing a 40-terabyte Oracle warehouse with a 2-petabyte Netezza appliance, T-Mobile now responds to events as they unfold. This is business agility. This is technology supporting, not hindering, the business.”
XO Communications has had a similar experience. Every day, this business telecommunications provider faces a situation typical of the telecom sector: razor-thin margins and the need to establish, in quantified terms, whether its operations and business strategies are actually making money or not.
Obviously, the faster that kind of determination can be made, the better. But the company's prior analytics architecture, which took some two months to finish a profitability analysis, wasn't exactly fast.
Thanks to XO's new architecture and IBM Software, all that has changed…dramatically. How dramatically?
“Netezza gave us the ability to determine profitability on a day-to-day basis,” said Danny Sangster, Senior Manager of Enterprise Business Intelligence at XO Communications.
Old system: a two-month delay. New system: daily results. That's quite a change.
True data warehousing appliances deliver far more business value, yet require far less ongoing maintenance
Of course, some of that acceleration stems from the fact that the ongoing maintenance XO's team used to have to perform is simply no longer required. “With Netezza, there are no indexes, no summary tables, no partitioning,” said Sangster. “Just load and go.”
XO Communications isn't the only organization to find that vastly reduced maintenance goes hand in hand with far superior performance. Ideally, in fact, analytics architectures should be de facto turnkey appliances, requiring as little tuning or management as possible.
When they are, it frees time-challenged IT team members to attend to more critical tasks. It also accelerates the business's ability to implement new strategies to respond to customer interests or needs.
“Sure,” said Kearney. “Before Netezza disrupted the market with its appliances, operating a data warehouse required a team of administrators dedicated to the care and feeding of the data management system.”
I found “care and feeding” an interesting phrase, suggesting as it does a high-maintenance beast likely to devour almost as much business value as it creates.
What kinds of results can organizations get by moving to a far more autonomous architecture?
“Good question,” said Kearney. “A large financial services company analyzed the effort required to keep their Oracle warehouse up-and-running and discovered that 90 percent of their database administrators' time was consumed by waste or non-added-value processing. Consider that, in a two-week period, their team of administrators was available to work alongside their business colleagues for [only] a single day. Freed from this tyranny of complexity, IBM Netezza customers create value from data.”
All of which leads me to ask: What's your data warehousing strategy?