Posted by: bratboy November 24, 2008
SQL DBAs
Login in to Rate this Post:     0       ?        
Partitioning is basically physically arranging data in one single table across multiple physical database files so that only a subset of data is read for most of the queries. It normally requires a Partition key and partition function. It also requires the use SQL Server Enterprise Edition 2005 or above. First a little background - if you have huge amount of data - say a billion row or above, it is very expensive to have them in one file. In this case you partition them into multiple physical database files and SQL Server will try to read the data - based on your query - from certain partitions only. This is called partition elimination. Auto increment columns by the way make for very poor partitioned column. To give a concrete example . Say you have a fact table of sales data for last two years, based on your need you might want to consider it to partition based on every month - so your table will consist of 24 partition. When a new month comes in you simply slide it in and swap out the oldest partition (it is called a sliding Window). I manage 4 SQL Server enterprise Databases from 3 TB to 13 TB (yes that's Tera byte) with the largest one containing 4.1 billion rows in the biggest fact table with 18 partition for last 18 months (partitioned by a day serial key ). The performance is OK. We are though next year migrating these database either to Vertica, Netezza or TeraData(most probably Vertica if it meets the data load times). My 13 experience of working with TeraData, Oracle and then SQL Server suggests that you will need partitioned tables only for massive data warehouse applications - and seldom in a OLTP application. Also generally you will see it in traditional big names databases (shared everything database in tech parlance) grafted over the traditional file access mechanism (which hinders parallel data loads by the way) . If you need to support really huge BI/Data Warehouses with huge dataload and brief windows of loading time - you are better served if you look at something else - for example Vertica, Netezaa or Teradata. You will save yourself a lot of grief. And at that kind of data volume, the price you will pay - work out to be cheaper for Netezaa than for SQL Server (100,000 /TB for Netezaa - hardware/software inclusive - compared to about 50 K for SQL Server EE licences, OS, Server and Storage ) Cheers Brat.
Read Full Discussion Thread for this article