While many organizations are using Hadoop for big data warehousing, it’s also worth considering a reliable alternative: Microsoft SSAS (SQL Server Analysis Services).
Here in Silicon Valley, you can’t have a cup of coffee without overhearing a conversation about big data. “Build-Measure-Learn”, the mantra of today’s corporate world, results in every organization within a company needing a system where data from disparate sources can be easily analyzed and reported on. Enter “data warehousing”.
As familiar a term as “big data” is, there are many misconceptions regarding traditional data warehousing tools such as SQL Server Analysis Services (Microsoft SSAS) and more recent evolutions like Hadoop. In this blog post, we deal with one of these misconceptions – SSAS and its applicability while dealing with larger datasets.
Crocodile, crocodile – which color do you want?
Consider this scenario: Company A has generated reams of data regarding their operations, and needs to generate useful insights to streamline their processes. So which data warehousing tool should they choose? The decision to use one type over the other depends on the type of data, its size and the reporting needs. And no, the answer is not Hadoop every single time. Here’s a look at the benefits of an alternative solution, SQL Server Analysis Services.
SQL Server Analysis Services (SSAS)
SQL Server Analysis Services (SSAS) has existed for a long time, and has proven time and again that it can handle really complex data. One can implement complex business logic into the design of the datamart that uses SSAS. There are many reasons to consider it as your go-to analytical tool:
Multidimensional data models: SSAS works well with multidimensional data models. You can design as many dimensions and measures as you would like to have in your datamart while maintaining a good design that focuses on relevant metrics.
Dimension Hierarchies: SSAS lets you build hierarchies, which let you slice and dice and drill through your dimension data for all reporting needs.
Speed: SSAS cubes have the ability to precompute and physically store the aggregations used to summarize the data. When a query is run against the cube, it does not compute the calculations at run time, but shares precomputed aggregations, which allows results to be shared immediately. A query that was once run against the cube will remain in the cache until the cache is updated next time. You also have control of when to run the cache update.
Built-in time calculations: Analysis services lets you perform many time-based calculations such as rolling averages, and year-to-date metrics.
Historical Data Analysis: The Microsoft SSAS cube allows for historical data storage and comparison through its MDX and DMX calculations.
Integrates well with many reporting tools: SSAS cube solutions can be integrated with many front end reporting tools, and no customized software builds are necessary in order to run reports. It works well with Excel, Tableau, and SQL Server Reporting Services, to name a few. The user does not need extensive training to start using the tool for reporting.
Data security: Microsoft SSAS features some of the most complex data security at a cell level. You can set security for various dimensions, as well as dimension attributes for users with different security levels.
Capacity to handle HUGE data volumes: Despite common misconceptions, it is indeed possible to build multi-terabyte sized cube solutions. The key to building a successful big cube is to keep all the design best practices in mind, and follow best practices for data processing and query performance tuning. There are several techniques one can apply to get the best out of their SSAS cube with multi-terabyte sized data. Microsoft has released several white papers for this purpose. With careful design and adding performance-tuning methods, you can get the most out of the SSAS cube for very large datasets.
In summary, for data sets ranging from a few gigabytes to tens of terabytes, Microsoft’s SSAS could be your tool of choice. Decisions regarding which data warehousing tool needs to be used should be made after detailed analysis of the type of data being reported on and the reporting needs.
“If you do not know how to ask the right question, you discover nothing”.
– W. Edwards Deming
Pravina Parthasarthy is a Senior Business Intelligence Consultant in DesignMind’s Business Intelligence group. She specializes in data warehouse design and implementation.