Latest Gartner Magic Quadrant for Data Warehouse Database Management Systems
February 16, 2012. The market for data warehouse DBMSs is undergoing serious transformation, driven by the emergence of "big data" and the corresponding new requirements for the technologies and methods used to work with such data. In 2011 the importance of integrated solutions that combine professional services with individual IT products also grew. (The source material is in English.)
In 2011, the economic "new normal" became better understood and organizations in nearly every vertical market began to target more holistic and comprehensive efforts to leverage the information available to them as a means of differentiating their business performance.
In 2010, revenue in the relational DBMS market was up almost 10% over 2009, at $20.7 billion (see "Market Share: RDBMS Software by Operating System, Worldwide, 2010" [forthcoming]). There are many factors contributing to the DBMS market growth, only one of which is the implementation of data warehouses supporting analytics. However, the combination of consumerized information management with consumer-driven analytics makes a strong case for asserting that data warehouse implementations were a significant contributor to market growth in 2011.
DBMS licenses can be implemented for any information management use case (for example, analytics, OLTP, metadata management and master data management), which means that the size of the data warehouse market can only be estimated rather than reported as actual revenue.
Demand for the logical data warehouse (LDW) in the marketplace is significant, but it is being pursued primarily by organizations that lead in analytics architecture. The LDW incorporates a combined infrastructure of repositories, data virtualization, distributed processing, system auditing metadata, end-user service level declarations and a decision engine that determines which of the available data solutions best satisfies the negotiation between the SLAs and the system auditing results (see "Does the 21st Century 'Big Data' Warehouse Mean the End of the Enterprise Data Warehouse?").
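The decision-engine idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not something the report specifies: the `Sla` and `AuditedSource` types, the two metrics (latency and data staleness), the sample figures and the "fastest qualifying source wins" rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sla:
    """Hypothetical end-user service level declaration."""
    max_latency_ms: float
    max_staleness_s: float


@dataclass(frozen=True)
class AuditedSource:
    """Hypothetical data solution with metrics taken from system auditing."""
    name: str
    observed_latency_ms: float
    observed_staleness_s: float


def choose_source(sla: Sla, sources: Sequence[AuditedSource]) -> Optional[AuditedSource]:
    """Return the fastest source whose audited metrics satisfy the SLA, or None."""
    candidates = [
        s for s in sources
        if s.observed_latency_ms <= sla.max_latency_ms
        and s.observed_staleness_s <= sla.max_staleness_s
    ]
    if not candidates:
        return None  # no available solution meets the declared service level
    return min(candidates, key=lambda s: s.observed_latency_ms)


# Illustrative figures only (assumed, not measured).
SOURCES = [
    AuditedSource("repository", observed_latency_ms=200, observed_staleness_s=3600),
    AuditedSource("virtualized", observed_latency_ms=1500, observed_staleness_s=5),
    AuditedSource("distributed", observed_latency_ms=60000, observed_staleness_s=86400),
]

if __name__ == "__main__":
    near_real_time = Sla(max_latency_ms=2000, max_staleness_s=60)
    print(choose_source(near_real_time, SOURCES))
```

A real engine would weigh many more dimensions (cost, concurrency, data quality), but the shape is the same: declared service levels on one side, audited behavior on the other, and a rule that picks among the available solutions.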
During the period from September 2010 through November 2011, Gartner inquiries mentioning some aspect of the LDW design increased from virtually nil to approximately 15% of data warehouse inquiries. We anticipate that inquiries regarding the LDW hybrid environment will increase at a faster rate, with some aspect of the approach appearing in 25% (or slightly higher) of data warehousing advice inquiries by the end of 2012. However, actual market adoption will be uneven and very few (if any) fully deployed LDWs will exist by the end of 2012. We anticipate that it will remain a strong component of vision evaluation criteria for some time.
A more subtle aspect of the LDW is that it completely changes the definition of the "size" of a data warehouse, moving it away from repository concepts and toward access and performance. Performance and information asset value, defined by ease of access and the ability to apply information to use cases, will become the new and most important value metrics. Even with early adoption, the impact on the Magic Quadrant this year is primarily related to vendor vision. In "Does the 21st Century 'Big Data' Warehouse Mean the End of the Enterprise Data Warehouse?" Gartner released the LDW concept after nearly 20 months of tracking the phenomenon.
The volume, variety, velocity and complexity issues that constitute the quantitative aspects of big data, together with the ability to address them, made up a significant portion of the ability to execute in 2011, and we anticipate that their importance will increase by the end of 2012. In "'Big Data' Is Only the Beginning of Extreme Information Management" Gartner defined a twelve-dimensional representation of big data solution design, which we call Extreme Information Management issues. Prior to the complete vision, Gartner expressed the quantitative aspects of extreme information management (big data) (see "Findings: 'Big Data' Is More Extreme Than Volume"). Gartner first identified this trend, stating "Many organizations that are creating large amounts of data that need to be analyzed and used are turning to MapReduce-enabled DBMSs to gain performance by processing these large sets of data in a parallel environment" (see "Hype Cycle for Data Management, 2011" and "Top 10 Technology Trends Impacting Information Infrastructure, 2012").
As part of the answer to a troubled economic environment, the data warehouse has become a central element in information management and analytics for organizations in differentiating their performance relative to their peers. Organizations expect architecture and implementation leadership from vendors' professional services and support organizations and/or their partner and distribution channels.
Gartner first identified this trend as part of the overall data warehouse delivery in its Magic Quadrant analysis in 2010 (for 2009 market research), where we stated that vendors have placed "significant and appropriate emphasis on the formalization of professional services to support data warehouse delivery in 2009. Some have purchased consultancy organizations, others have introduced formal approaches for identifying best practices from their existing field delivery teams and are creating standards of delivery based on those experiences." In our 2011 Magic Quadrant analysis, it became an important execution evaluation criterion, and under a solution selling model throughout 2011, organizations implementing warehouses attributed significant positive effects to the presence of qualified professional services teams.
Appliances remain popular and most data warehouse environments will eventually include an appliance somewhere. However, the market has not yet determined the acceptable threshold regarding "how much" of an environment is appliance driven. It is important to note that while appliances continued to be popular in 2011, the No. 1 complaint is inflexibility regarding hardware. Further, the appliance market (even if all of Teradata, Exadata, IBM/Netezza and others are included), after 30 years of Teradata, seven years of Netezza and three years of Oracle Exadata, constitutes less than 15% of the delivered units in the total data warehouse market. Given that most large data warehouses undergo a major revision and retrofit between years five and seven of their life cycle, the timing indicates that 2012 to 2013 could see an acceleration of appliance adoption.
In 2011, the draw toward analytics has provided significant new opportunities for entrants into the market, as well as new opportunities for some struggling vendors already in the market. The NoSQL movement (which really stands for "not only SQL") has opened the door to information repositories that more closely resemble content systems than relational databases (see "Selecting a Database Technical Architecture").
The market also changed in another way. Many of the previous visionaries were acquired by the mega vendors (IBM/Netezza, HP/Vertica, SAP/Sybase and EMC/Greenplum), and hardware and infrastructure vendors found themselves searching for less threatening hardware partners with which to share and build their own channels. As a result, opportunities for the smaller data warehouse DBMS vendors abound as the market builds a new set of options for configured infrastructures and ecosystem partners.
At the same time, some challenges have emerged for the traditional leaders. Oracle has more than three years' experience in the market with Exadata, which is an inflection point for managing and scaling most data warehouses, making 2012 a bellwether year for determining whether Oracle's appliance strategy will continue to grow or pause.
Additionally, IBM has begun to leverage the Netezza acquisition by gaining significant new customers (especially relative to Linux). Teradata's appliance strategy for its 2600s and 1600s series of products has resulted in both an "on ramp" for more Teradata customers and a perimeter defense against the incursions of other appliance vendors.
The market demand is clear: more data miners are competing with more reporting and more basic analytics, in a manner that is approaching a 24-hour-a-day operational window. Some vendors have embraced the dual strategy by developing or acquiring fast replication and synchronization technology between two identical "warehouses," while others advise their customers to scale a single warehouse with more processing capacity and memory as well as load balancing. Most of the leaders offer multiple alternatives.
Gartner clients reported an increasing number of "dual" warehouses in 2011. Sometimes, these warehouses are two-tiered with a base warehouse underneath and a query-optimized second warehouse in production above it (these are complete copies of the warehouse simply stored differently). These are sometimes referred to as side-by-side operations. However, regardless of what it is named, this is an optimization strategy based on separating physical workloads — usually isolating loads and basic reporting or basic OLAP from the more data-intensive data mining efforts.
Organizations are also seeking alternatives to the traditional model in which they own software licenses, servers and storage. The managed services warehouse is gaining market traction, and providers such as IBM's managed services, HP (via EDS) and even Cognizant (a professional services vendor) offer one alternative. Data warehouse database as a service (dbaaS) vendors such as Kognitio and 1010data offer another: DBMS implementations hosted off-site on behalf of their customers.
We continue our stance in 2011 and 2012 that POCs are not only mandatory to evaluate implementation options, but should also be comprehensive examples of each of the workload types that regularly occur in your own data warehouse (see "The State of Data Warehousing in 2011" [forthcoming in 1Q12]).
Organizations executing POCs using their own data at their own sites have reported experiences different from common market experiences for their chosen vendor. Additionally, Gartner clients report that one of the most important results of POCs is simply assessing how quickly a solution can be deployed and configured for operations, even though the vendor POC team can overtly influence this experience. For this same reason, while lab-based POCs are acceptable for examining workload mix and performance metrics in general, they do not give specific information about your actual time to delivery.
The data warehousing solution space now exhibits two highly distinct populations: traditional data warehouses and hybrid-enabled warehouses combining structured data and content (either in one management system or via database management system-enabled functionality such as UDFs, managed external processing and so on). The traditional data warehouse solution continues to pursue high performance, integrated data analysis, primarily for structured or tabular data. The performance demands in this space continue to rise.
The hybrid warehouse takes many forms, but in general, the market is demanding repository, virtualization and distributed processing capability, managed by a single system and able to respond to various use cases, which is another incarnation of the logical data warehouse.
In addition, we believe the data warehouse DBMS market will continue to change in 2012 to fulfill the demand for high speed, lower latency and large volumes of data brought about by new high-value applications (see "The State of Data Warehousing in 2011" and "Data Warehousing Trends for the CIO, 2011-2012").
As stated in the previous version of this Magic Quadrant analysis, we believe vendors have begun to establish their positions in preparation for a major battle over the data management role in the enterprise. Vendors that do not differentiate their offerings will either leave the market by choice or be forced out by economic necessity.
Once vendors have established their positions, the major tussle will begin, toward the end of 2013. It is becoming clearer that this will represent a major upheaval in the market, one that the larger vendors need to prepare for and that will give smaller vendors a market opportunity.
The new analytics infrastructure is a combination of services, platforms, repositories, metadata and optimization techniques that all work in concert. The "data warehouse" will become "data warehousing" again. The concept of a single grand repository managing all the information for analytics use cases will be increasingly challenged, and by around 2017 a new infrastructure of highly distributed processes and information assets will have emerged.
As described in "The State of Data Warehousing in 2011" several aspects of this battle are emerging:
- The combination of repositories, data virtualization and data buses is now possible, given the state of hardware technology.
- The reduced influence of BI platform optimization, in favor of DBMS optimization.
- The increasing influence of master data management and data quality.
- The demand for cloud solutions.
- The rising demand for combining structured and content information.
Organizations have expressed an interest in technical solutions, which is starting to erode the 2009 effect in which everyone prioritized vendor financial viability. A spirit of experimentation, pilot schemes and prototypes has re-emerged. Organizations are reminded to closely align their analytics strategies and vendor road maps when choosing vendors.
The data warehouse DBMS market is complex, with a mix of mature and new products. Its complexity reflects many factors, such as:
- The need for DBMSs to support database sizes ranging from 2 terabytes to more than 1 petabyte.
- The complexity of data in data warehouses, not only in terms of interrelationships, but also of desired data types.
- The fact that data warehouses are built on many different hardware and operating systems, which a DBMS needs to support.
- The growing and regularly changing variety of operations performed in data warehouses, which requires continuous management of the DBMS.
- A DBMS has to support workloads ranging from simple to complex and to manage mixed workloads in many different combinations.
- Users are getting better at creating specific SLAs and the implications of not meeting them are more serious.
The data warehouse DBMS has evolved from being an information store to a support system for reporting and traditional BI platforms, and now into a broader analytics infrastructure supporting operational analytics, performance management and other new applications and uses, such as operational BI, real-time fraud detection or consumer experience personalization, and operational technologies (technologies that stream data from devices such as smart meters). Organizations are adding workloads with OLTP access, and data loading latency is falling toward near-continuous loading.
There are many other aspects to the data warehouse DBMS market, such as pricing models, geographic reach, partner channels, third-party software partnerships and data warehouse services (see "The State of Data Warehousing, 2012" [forthcoming] and "Data Warehousing Trends for the CIO, 2011-2012" for further information on these trends).
Source: gartner.com
