SQL Server Database and Guest OS Storage Design
The starting point for any storage architecture for SQL Server Databases is actually with our last design principle: KISS (Keep It Standardized and Simple). But all of the principles apply. We will determine the smallest number of templates that are required to virtualize the majority (95%) of database systems, and anything that falls outside this will be handled as an exception.
Your first step is to analyze the inventory of the SQL Server Databases that will be virtualized as part of your project (refer to Chapter 4, “Virtualizing SQL Server 2012: Doing It Right”). From this inventory, you will now put each database and server into a group with similar-sized databases that have similar requirements. The storage requirements for all of these existing and new databases, based on their grouping, will be used to define the storage layouts and architecture for each of the SQL Server Databases, Guest OS, and VM template.
SQL Server Database File Layout
Database file layout provides an important component of database storage performance. If you have existing databases that will be virtualized, you or your DBAs will likely have already developed some practices around the number of database files, the size of database files, and the database file layout on the file system. If you don’t have these practices already in place, here we provide you with some guidelines to start with that have proven successful.
Your SQL Server database has three primary types of files you need to consider when architecting your storage to ensure optimal performance: data files, transaction log files, and Temp DB files. Temp DB is a special system database used in certain key operations, and it has a big performance impact on your overall system. The file extensions you’ll see are .mdf (primary data file), .ndf (secondary data files), and .ldf (transaction log files). We will go over all of these different file types later in this chapter.
Number of Database Files
First, we need to determine the number of database files. There are two main drivers for the number of files you will specify. The first driver is the number of vCPUs allocated to the database, and the second is the total capacity required for the database now and in the future.
Two design principles come into play here: The parallelism of access to storage should be maximized by having multiple database files, and storage performance is more than just the underlying devices. In the case of data files and Temp DB files, they are related to the number of CPU cores allocated to your database. Table 6.1 provides recommendations from Microsoft and the authors in relation to file type.
Table 6.1 Number of Data Files and Temp DB Files Per CPU

| File Type | Microsoft Recommended Setting | Author Recommended Setting |
| --- | --- | --- |
| Temp DB Data File | 1 per CPU core | 8 or fewer vCPU: 1 per vCPU; more than 8 vCPU: 8 total (increase in increments of four at a time if required) |
| Database Data File | 0.25 to 1.0 per file group, per CPU core | Min 1 per vCPU, max 32 |
| Database Transaction Log File | 1 | 1 |
| Temp DB Transaction Log File | 1 | 1 |
Microsoft recommends as a best practice that you should configure one Temp DB data file per CPU core and 0.25 to 1 data file (per file group) per CPU core. Based on our experience, our recommendation is slightly different.
If your database is allocated eight or fewer vCPUs, we recommend as a starting point that you configure at least one Temp DB file per vCPU. If your database is allocated more than eight vCPUs, we recommend you start with eight Temp DB files and increase in increments of four when performance bottlenecks or capacity requirements dictate.
We recommend in all cases you configure at least one data file (per file group) per vCPU. We recommend a maximum of 32 files for Temp DB or per file group for database files, because you’ll start to see diminishing performance returns beyond 16 files. An insufficient number of data files can lead to many writer processes queuing to update GAM pages. This is known as GAM page contention. The Global Allocation Map (GAM) tracks which extents have been allocated in each file. GAM contention manifests as high PAGELATCH wait times. For extremely large databases into the many tens of terabytes, 32 files of each type should be sufficient.
Updates to GAM pages must be serialized to preserve consistency; therefore, the optimal way to scale and avoid GAM page contention is to design sufficient data files and ensure all data files are the same size and have the same amount of data. This ensures that GAM page updates are equally balanced across data files. Generally, 16 data files for tempdb and user databases is sufficient. For Very Large Database (VLDB) scenarios, up to 32 can be considered. See http://blogs.msdn.com/b/sqlserverstorageengine/archive/2009/01/04/what-is-allocation-bottleneck.aspx.
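A quick way to check for allocation page contention in Temp DB is to look for PAGELATCH waits on the allocation pages of database ID 2. The following T-SQL sketch is a simplified heuristic, not a definitive diagnostic; it assumes VIEW SERVER STATE permission, and the LIKE patterns only roughly match the allocation page IDs:

```sql
-- Rough check for Temp DB allocation contention (PFS/GAM/SGAM pages).
-- In each data file: PFS = page 1, GAM = page 2, SGAM = page 3.
-- PAGELATCH waits on tempdb (database ID 2) pages 1-3 suggest contention.
SELECT session_id,
       wait_type,
       wait_duration_ms,
       resource_description          -- format: dbid:fileid:pageid
FROM   sys.dm_os_waiting_tasks
WHERE  wait_type LIKE 'PAGELATCH%'
       AND resource_description LIKE '2:%'      -- tempdb is database ID 2
       AND ( resource_description LIKE '%:1'    -- PFS page
          OR resource_description LIKE '%:2'    -- GAM page
          OR resource_description LIKE '%:3' ); -- SGAM page
```

If this query regularly returns rows under load, adding Temp DB data files (in increments of four, as recommended above) is the usual first remediation step.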
If you expect your database to grow significantly long term, we would recommend that you consider configuring more data files up front. The reason we specify at least one file per CPU is to increase the parallelism of access from CPU to data files, which will reduce any unnecessary data access bottlenecks and lower latency. This also allows for even data growth, which will reduce IO hotspots.
Figure 6.3 shows an example of data files, Temp DB files, and transaction log files allocated to a SQL Server 2012 Database on a sample system with four vCPU and 32GB RAM.
Figure 6.3 SQL Database data file allocation.
Size of Database Files
Let’s start the discussion on data file sizes with some fundamentals that are important to understand. Data files, unlike transaction log files, are accessed in parallel and the IO pattern is more random. Temp DB files are accessed in parallel in a round-robin fashion. This is why having more database files improves the parallelism of IO access to storage. In effect, the IO is striped across the data files.
It is important to configure your database files to be equal size to start with. SQL Server will write data to the available files evenly if they are the same size, as it uses a proportional fill algorithm that favors allocations in files with more free space. If the files are the same size, then each file will have the same free space. Having equally sized files ensures even growth and more predictable performance.
The next important point is that you should preallocate all your data files and transaction log files. This will eliminate the need for the database to constantly grow the files and resize them, which will degrade performance and put more stress on your storage platform. The files can’t be accessed for the period of time they are being extended, and this will introduce avoidable latency.
It is a Microsoft best practice and our recommendation to manually and proactively manage file sizes. Because you are presizing and proactively managing your database files, you shouldn’t need to rely on Auto Grow as much. Even though it may not be needed, we recommend that Auto Grow be left active as a safety net.
If you are unsure what your underlying block size is, set Auto Grow to a multiple of 1MB. To prevent Auto Grow from being active too often, consider configuring it to grow at around 10% of your initial database size rounded up to the nearest 1MB (or block size), up to a maximum of 4GB. In most cases, an Auto Grow amount of 256MB to 512MB should be sufficient. This will ensure the grow operation doesn’t take too long and is aligned to the underlying storage subsystem.
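Auto Grow increments are set per file, and can be scripted rather than configured through the GUI. A minimal sketch, with a hypothetical database and logical file name:

```sql
-- Set a fixed 512MB growth increment (rather than a percentage) on a
-- data file. Database and logical file names here are hypothetical.
ALTER DATABASE SalesDB
MODIFY FILE ( NAME = SalesDB_Data1, FILEGROWTH = 512MB );
```

A fixed increment like this keeps growth operations short and aligned, whereas a percentage increment grows ever larger as the file grows.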
Now that we’ve covered the fundamentals, we can calculate the initial size of the database files. The initial file sizes are fairly easy to determine if you’re migrating an existing system—in which case, we recommend you preset your files to be the same size as the system being migrated, as you would in a standard physical-to-virtual migration. If this is a new database being virtualized, you will need to estimate the initial size of the database files.
Data File Sizing
For data files, the preset size you should use is based on the estimated or actual size of your database. You should allow for reasonable estimated growth (three to six months). Once you have the total estimated size of your database, including growth, divide that by the number of files to get the size of each file. For example, if you had a database 200GB in size with four vCPUs configured, you would have four data files, assuming one file per vCPU, with a preset size of 50GB each. Each data file should always be of equal size and be extended at the same rate.
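The worked example above can be expressed as a database creation script. This is a sketch only: the database name, logical file names, file paths, and drive letters are all hypothetical, and the growth increments follow the Auto Grow guidance earlier in this chapter:

```sql
-- Hypothetical example: a 200GB database on a 4-vCPU VM, presized as
-- four equal 50GB data files (one per vCPU) plus a transaction log.
CREATE DATABASE SalesDB
ON PRIMARY
    ( NAME = SalesDB_Data1, FILENAME = 'D:\SQLData\SalesDB_Data1.mdf',
      SIZE = 50GB, FILEGROWTH = 512MB ),
    ( NAME = SalesDB_Data2, FILENAME = 'E:\SQLData\SalesDB_Data2.ndf',
      SIZE = 50GB, FILEGROWTH = 512MB ),
    ( NAME = SalesDB_Data3, FILENAME = 'D:\SQLData\SalesDB_Data3.ndf',
      SIZE = 50GB, FILEGROWTH = 512MB ),
    ( NAME = SalesDB_Data4, FILENAME = 'E:\SQLData\SalesDB_Data4.ndf',
      SIZE = 50GB, FILEGROWTH = 512MB )
LOG ON
    ( NAME = SalesDB_Log,   FILENAME = 'L:\SQLLog\SalesDB_Log.ldf',
      SIZE = 8GB, FILEGROWTH = 8GB );
```

Note that the data files are spread across two drive letters here, matching the file system layout guidance later in this chapter (two data files per drive).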
Temp DB File Sizing
The size of your Temp DB files should be based on the high watermark usage you estimate for your queries and the overall size of your database. This can be hard to estimate without knowledge of your workload because different queries will impact your Temp DB usage in different ways. The best way to determine the appropriate size will be to monitor Temp DB usage during a proof of concept test, or benchmarking and baselining activities.
As a starting point, we recommend you consider sizing Temp DB to 1% the size of your database. Each file would then be equal to Total size of Temp DB divided by the number of files. For example, if you had a 100GB database with four vCPUs configured, you would have an initial total Temp DB size of 1GB, and each Temp DB data file would be 250MB in size. If you see significantly more Temp DB use during ongoing operations, you should adjust the preset size of your files.
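The 1% starting point above can be scripted against the tempdb system database. The logical name tempdev is the default for the first Temp DB data file; the additional file names and paths below are hypothetical:

```sql
-- Hypothetical example: 100GB database, 4 vCPUs -> Temp DB presized to
-- 1% (1GB total) across four equal 250MB data files.
ALTER DATABASE tempdb MODIFY FILE
    ( NAME = tempdev,  SIZE = 250MB, FILEGROWTH = 256MB );
ALTER DATABASE tempdb ADD FILE
    ( NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf',
      SIZE = 250MB, FILEGROWTH = 256MB );
ALTER DATABASE tempdb ADD FILE
    ( NAME = tempdev3, FILENAME = 'T:\TempDB\tempdev3.ndf',
      SIZE = 250MB, FILEGROWTH = 256MB );
ALTER DATABASE tempdb ADD FILE
    ( NAME = tempdev4, FILENAME = 'T:\TempDB\tempdev4.ndf',
      SIZE = 250MB, FILEGROWTH = 256MB );
```

Because Temp DB is re-created at every instance restart, these size settings persist in the system catalog and take effect each time the instance starts.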
Transaction Log File Sizing
The total size that your database transaction log file should be preset to will primarily be based on the actual or estimated high water mark of transaction storage required before the next backup or transaction log truncation. We are assuming for the purposes of this section that you care about data protection and preventing data loss of your database and are therefore using the full recovery model. Data loss is a risk when using the other available recovery models.
If you are doing daily backups, you will need to ensure that your log file is sufficiently sized to allow up to at least a day’s worth of transactions. This will allow you to recover back to the point in time your database goes down by using the last backup and replaying the transaction logs. In some large database systems, you will need to back up the transaction logs much more frequently than every day.
When it comes to storage performance and sizing of your transaction log, the total size and how fast you can write transactions to it are important, but they are not the only considerations. You must also consider the performance of file growth, database restart, and backup and recovery operations. With this in mind, it is critical not only that the total size of your transaction log is appropriate, but also how you grow your transaction log to that size. The reason this is so critical is that in SQL Server, even though your transaction log may be one physical file, internally it is not one contiguous log.
Your one physical transaction log is actually made up of a number of smaller units called Virtual Log Files (VLFs). VLFs are written to sequentially, and when one VLF is filled, SQL Server will begin writing to the next. They play a critical part in the performance of database backup and recovery operations.
The number of VLFs is determined at the time a file is created or extended by the initial size allocated to the transaction log and the growth amount “chunk” each time it is increased in size. If you leave the default settings with a large database, you can quickly find yourself with tens if not hundreds of thousands of VLFs, and this will cause a negative performance impact. This is why the process of preallocating the transaction log file and growing it by the right amount is so important.
If the VLFs are too small, your maintenance, reboots, and database recovery operations will be excruciatingly slow. If your VLFs are too big, your log backups and clearing inactive logs will be excruciatingly slow and may impact production performance. The reason for the former is that SQL Server must load the list of VLFs into memory and determine the state of each, either active or inactive, when doing a DB restart or recovery. The latter is because a VLF can’t be cleared until the SQL Server moves onto the next one.
As you can see from Table 6.2, if you create or grow a transaction log file by 64MB or less at a time, you will get four VLFs each time. If you need 200GB of transaction log, and it is created or grown by this amount, you end up with 12,800 VLFs, with each VLF being 16MB. At or before this point, you’d start to notice performance problems.
Table 6.2 Number of VLFs Allocated per Chunk Size

| Chunk Size | Number of VLFs |
| --- | --- |
| <= 64MB | 4 |
| > 64MB and <= 1GB | 8 |
| > 1GB | 16 |
Let’s take a look at another example: Suppose you have a 128GB log file created at 128GB to begin with. This file will have 16 VLFs, and each VLF will be 8GB. Each VLF can only be cleared once it is completely inactive, and clearing an 8GB log segment will likely have a direct impact on the performance of the database.
To avoid the performance problems covered previously, you should ensure your VLF size is between 256MB and 512MB. This guarantees that even if your transaction log were to reach the maximum size of 2TB, it would not contain more than 10,000 VLFs. To achieve this, you can preset your log file to either 4GB or 8GB and grow it (either manually or with Auto Grow) by the same amount each time. If we take the example of the 128GB transaction log, you would initially create an 8GB log file and then grow it by 8GB fifteen times. This leaves you with a 128GB log file containing 256 VLFs of 512MB each. You should set your transaction log file Auto Grow size to the same growth increment you have decided upon.
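The "grow in fixed steps" process above can be scripted. This is a sketch with hypothetical database and file names; dynamic SQL is used because ALTER DATABASE does not accept a variable for the SIZE argument:

```sql
-- Presize a 128GB transaction log in 8GB increments, so each growth
-- creates sixteen 512MB VLFs (chunks over 1GB yield 16 VLFs each).
-- Assumes the log file was initially created at 8GB.
DECLARE @size_gb int = 16;    -- first target size after the initial 8GB
DECLARE @sql nvarchar(400);
WHILE @size_gb <= 128
BEGIN
    SET @sql = N'ALTER DATABASE SalesDB MODIFY FILE '
             + N'( NAME = SalesDB_Log, SIZE = '
             + CAST(@size_gb AS nvarchar(10)) + N'GB );';
    EXEC sp_executesql @sql;
    SET @size_gb += 8;
END;

-- DBCC LOGINFO returns one row per VLF; for this example you would
-- expect around 256 rows of roughly 512MB each.
DBCC LOGINFO ('SalesDB');
```

DBCC LOGINFO is a handy way to audit VLF counts on existing databases as well, particularly those that grew under default Auto Grow settings.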
Even if your database were relatively small, we would recommend that you start with a 4GB or 8GB (4,000MB or 8,000MB) transaction log file size. You should proactively and manually manage the size of your transaction log. Proactive management will avoid Auto Grow kicking in during production periods, which will impact performance. This is especially important when considering the transaction log will be growing at 4GB or 8GB at a time and having all those blocks zeroed out. However, just as with data files and Temp DB files, you should have Auto Grow enabled as a safety net and set it to either 4GB or 8GB, depending on the growth size you have selected.
Instant File Initialization
When a database file is created or extended, SQL Server will by default zero out the newly created file space. This will cause performance to degrade if it occurs during periods of intense database write activity, which is most likely if database files are not proactively managed and Auto Grow is extending the files. There is also the length of time required to write zeros to all the new blocks during which access to the file is blocked, as mentioned previously. To greatly improve the speed and reduce performance impacts of file growth operations, you can configure SQL Server to instantly initialize the database files without zeroing them out.
To allow your database instance to leverage Instant File Initialization (IFI), you need to add the SQL Server service account to the Perform Volume Maintenance Tasks security policy (using the Local Group Policy Editor, gpedit.msc, or the Local Security Policy editor, secpol.msc), as shown in Figure 6.4. We strongly recommend this setting be applied in group policy, especially when AlwaysOn Availability Groups are used or when there are multiple databases that will have this setting enabled.
Figure 6.4 Enabling the Perform Volumes Maintenance Tasks security policy.
After you have made this change, you will need to restart your SQL Server services for it to take effect. We recommend you make this setting a standard for all your SQL Server databases and include it in your base template.
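One way to confirm IFI is actually in effect is to temporarily log file zeroing operations to the error log and create a throwaway database. This sketch assumes sysadmin rights and a test instance; the database name is hypothetical:

```sql
-- Trace flag 3004 logs file zeroing operations; 3605 routes that output
-- to the SQL Server error log. With IFI working, only the log file
-- shows "Zeroing" messages -- log files are always zero-initialized.
DBCC TRACEON (3004, 3605, -1);
CREATE DATABASE IFITest;                    -- hypothetical test database
EXEC sp_readerrorlog 0, 1, 'Zeroing';       -- search current error log
DROP DATABASE IFITest;
DBCC TRACEOFF (3004, 3605, -1);
```

If the data file also appears in the "Zeroing" messages, IFI is not active, and the service account permissions and service restart should be rechecked.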
SQL Server File System Layout
We have covered how you determine how many files your databases need and how big each file should be preallocated. We now need to assign these files to the appropriate locations on the file system and configure the file system within the Windows operating system to achieve the best possible performance. The file system layout may differ slightly between database instances that are standalone, that use AlwaysOn Failover Clustering, or that use AlwaysOn Availability Groups. We will discuss the possible differences and give you recommendations based on our experience to help you achieve performance that can be built in to your standardized base templates.
OS, Application Binaries, and Page File
The OS, application binaries, and page file should be separated from the data, Temp DB, and log files in a SQL Server database. These components of the system generally produce little IO, but we don’t want any IO interfering with productive IO from the database. For this reason, we recommend that OS, application binaries, and the page file be on a separate drive (or drives) and IO controller from data files, log files, and Temp DB.
From a database storage performance perspective, any paging is bad and should be avoided. Details of the page file and SQL Server memory configuration will be covered in Chapter 7, “Architecting for Performance: Memory.” Chapter 7 will show you how to avoid paging and optimize performance from the memory configuration of your SQL Server.
File System Layout for Data Files, Log Files, and Temp DB
When considering the design of the file system layout for data files, log files, and Temp DB, our objectives are as follows:
- Optimize parallelism of IO (Principle 1).
- Isolate different types of IO from each other that may otherwise cause a bottleneck or additional latency, such as OS and page file IO from database IO, or sequential log file IO from random data file IO.
- Minimize management overheads by using the minimum number of drive letters or mount points required to achieve acceptable performance (Principle 5).
In order to achieve objectives 1 and 2, we recommend splitting out data files and Temp DB files from log files onto separate drive letters or mount points. This has the effect of killing two birds with one stone. By separating log files into their own drive or mount point, you maintain the sequential nature of their IO access pattern and can optimize this further at the hypervisor and physical storage layer later if necessary. If the log files share a drive or mount point, the access pattern of that device will instantly become random. Random IO is generally harder for storage devices to service. At the same time, you are able to increase the parallelism needed for the IO patterns of the data files and Temp DB files.
To achieve greater IO parallelism at the database and operating system layer, you need to allocate more drive letters or mount points. The reason for this is that each storage device (mount point or drive) in Windows has a certain queue depth, depending on the underlying IO controller type being used. Optimizing the total number of queues available to the database by using multiple drives or mount points allows more commands to be issued to the underlying storage devices in parallel. We will discuss the different IO controllers and queue depths in detail later.
As a starting point for standalone database instances, we recommend that you configure one drive letter or mount point per two data files and one Temp DB file. This recommendation assumes each file will not require the maximum performance capability of the storage device at the same time. The actual number of drive letters or mount points you need will be driven by your actual database workload, but having fewer drives and mount points will simplify your design and make it easier to manage. The more users, connections, and queries you have, the higher the IO, queue depth, and parallelism requirements will be, and the more drive letters and mount points you will need.
The example in Figure 6.5 illustrates how you might arrange your database files for a standalone instance. If you start to see IO contention and your database is growing (or is expected to grow) very large or makes a lot of use of Temp DB, then you may wish to separate out Temp DB files onto their own drive letters or mount points. This would remove the chance of Temp DB IO activity impacting the IO activity of your other data files and allow you to put Temp DB onto a separate IO controller (point 2 of our file system layout objectives).
Figure 6.5 Sample SQL Server file system layout—Temp DB with data files.
Having a single Temp DB file on the same drive as two data files will generally balance the IO activity patterns and achieve acceptable performance without an excessive number of drives to manage. This layout is more likely on a standalone instance than on a clustered instance, for reasons that will become clear shortly.
In the example in Figure 6.6, we have split out the Temp DB files onto separate drive letters from the data files of the production database. If you have a very large database or your database will have heavy IO demands on Temp DB, it makes sense to split it out onto its own drives and a separate IO controller.
Figure 6.6 Sample SQL Server file system layout—data files separate from Temp DB.
In databases that make extremely heavy use of Temp DB, such as peaking at more than 50% of total database size, it might make sense for each Temp DB file to be on its own drive or mount point to allow each file access to more parallel IO resources. This assumes that the underlying storage infrastructure can deliver more IO in parallel, which we will cover later in this chapter.
In an AlwaysOn Failover Cluster Instance, an additional reason to separate Temp DB onto different drives or mount points from other data files is that it can be hosted locally to the cluster node, rather than on the shared storage. This makes a lot of sense given that Temp DB data doesn’t survive instance restarts. It allows you to optimize the performance of Temp DB without impacting the data files and log files that are shared between cluster nodes. If you have extreme Temp DB IO requirements, you could locate Temp DB on local flash storage, but bear in mind that this would prevent the guest from restarting on another host in a VMware HA event; if the local flash storage failed, the cluster node would be unavailable, triggering a failover to another node. Hosting Temp DB locally is a new feature available with SQL Server 2012 AlwaysOn that wasn’t previously available (see http://technet.microsoft.com/en-us/sqlserver/gg508768.aspx). More details about AlwaysOn Availability Groups and Failover Cluster Instances are provided in Chapter 9, “Architecting for Availability: Choosing the Right Solutions.”
NTFS File System Allocation Unit Size
Now that we have covered the SQL Server database layout on the file system, we need to cover another important aspect of the database file system design: the NTFS Allocation Unit Size (also known as Cluster Size). When you format a drive or mount point in Windows, you have the option of choosing a different NTFS Allocation Unit Size from the default (4KB in most cases). The NTFS Allocation Unit Size is important because it’s the smallest amount of disk space that can be allocated to hold a file. If a file doesn’t use the entire Allocation Unit, the remaining space in that unit is wasted.
Having a small (default) Allocation Unit Size means there are many times more blocks at the file system level for the operating system to manage. For file systems that hold thousands or millions of small files, this is fine, because a smaller Allocation Unit saves a lot of space in that scenario. But for a SQL Server database that consists of a few very large files, a much larger Allocation Unit is more efficient from a file system, operating system management, and performance perspective.
For the OS and Application Binary drive, keeping the default of 4KB Allocation Unit is recommended. There is no benefit in changing from the default. If your page file is on a separate drive from the OS, you should use a 64KB Allocation Unit size. For all SQL Server database drives and mount points (data files, log files, and Temp DB files), we recommend you use 64KB as your Allocation Unit Size setting (see Figure 6.7).
Figure 6.7 NTFS Allocation Unit Size.
Each storage device reads and writes data at different underlying block sizes. A block on a storage device is the least amount of data that is read from or written to with each storage option. If your file system partition is not aligned to the underlying blocks on the storage device, you get a situation called Split IO in which multiple storage operations are required to service a single operation from your application and operating system. Split IOs reduce the available storage performance for productive IO operations, and this gets even worse when RAID is involved, due to the penalty of certain operations, which we’ll cover later in this chapter.
Figure 6.8 shows what would be considered a worst-case scenario, where the file system partition and the VMware vSphere VMFS partition are misaligned. In this case, every productive IO requires three backend IOs. This can triple the latency of each IO operation, which is like getting roughly a third of the performance from 100% of your storage investment. Fortunately, with Windows 2008 and above, and with VMFS volumes created through VMware vCenter, this problem is much less likely.
Figure 6.8 File system and storage that is not correctly aligned.
Starting with Windows 2008, all partitions are aligned to the 1MB boundary, so in almost all cases they will be aligned correctly. The same is true of VMFS5 partitions created through VMware vCenter; they also align to the 1MB boundary. However, if you have an environment that has been upgraded over time, you may still have volumes that are not correctly aligned. The easiest way to check is to monitor for Split IOs in either ESXTOP or Windows Performance Monitor.
Figure 6.9 shows reading of one frontend block will require only one backend IO operation, thus providing lower latency and higher IO performance.
Figure 6.9 File system and storage that is aligned.
SQL Server Buffer Pool Impact on Storage Performance
The Buffer Pool is a critical region of memory used in SQL Server, and it has a large impact on storage performance. The important thing to note from a storage performance perspective is that a larger Buffer Pool produces less read IO on your storage and lower transaction latency at your database. The Buffer Pool is a big read cache for your database. If you size it incorrectly, or if the Buffer Pool is paged out by the operating system, you will start to experience performance degradation and a large amount of additional read IO hitting your storage. The Buffer Pool is covered in detail in Chapter 7, including how to avoid Windows paging out the Buffer Pool when virtualizing business-critical databases.
Updating Database Statistics
The SQL Server Query Plan Optimizer uses statistics compiled from tables to try and estimate the lowest cost execution path for a given query. By default, statistics are updated automatically at defined thresholds (refer to http://msdn.microsoft.com/en-us/library/dd535534%28v=sql.100%29.aspx), such as when 20% of a table changes since statistics were last gathered.
The Query Optimizer’s cost-based algorithm takes into account system resources such as CPU and IO, as well as overall table size and the distribution of data, to calculate the most efficient query plan. For example, it is better to join a three-row table to a million-row table than to join a million-row table to a three-row table.
The performance cost of outdated statistics, and their impact on your storage, can be high. Outdated statistics cause suboptimal query execution paths, which can result in many more full table scans and therefore higher IO than would otherwise be required. For large databases that have hundreds of millions or billions of rows in a particular table, which can be common with SAP systems, the impact can be very severe. Therefore, it is important that you keep statistics up to date.
There are two primary methods to deal with the problem of outdated statistics impacting your database and storage IO performance.
Trace Flag 2371—Dynamic Threshold for Automatic Statistics Update
The first method involves using trace flag 2371 by setting startup option –T2371 or DBCC TRACEON (2371, -1). This is documented in Microsoft KB 2754171 (http://support.microsoft.com/kb/2754171). This trace flag tells SQL Server to dynamically change the percentage a table needs to change before the statistics are automatically updated. In very large tables, an automatic update of statistics can now be triggered by a change of less than 1%. Using this option could result in significantly improved performance for situations where you have very large tables.
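Enabling and verifying the trace flag for a running instance takes two statements; to persist the setting across restarts, add -T2371 as a startup parameter instead:

```sql
-- Enable trace flag 2371 globally (dynamic automatic statistics update
-- threshold), then confirm that it is active for the instance.
DBCC TRACEON (2371, -1);
DBCC TRACESTATUS (2371, -1);
```

Because this only changes when automatic statistics updates trigger, it is a low-risk setting, but as with any trace flag you should validate it against your workload first.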
Updating Database Statistics Using a Maintenance Plan
The second method for addressing out-of-date statistics is by using a maintenance plan. If you need more control over when database statistics updates occur, you can schedule a maintenance plan task for your databases. Ideally, the maintenance plan would be scheduled to happen when it would have the least impact on the database, and run only as frequently as needed. To determine when and how often it should run requires you to know your database workload patterns and to monitor query plan execution efficiency. Depending on your database, you may wish to schedule it to initially happen daily and adjust the schedule based on observed performance. Figure 6.10 shows the Update Statistics option in the Maintenance Plan Wizard. A full step-by-step example is provided in Chapter 11.
Figure 6.10 Maintenance Plan Wizard’s Statistics Update option.
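If you prefer T-SQL over the Maintenance Plan Wizard, the same work can be done from a scheduled SQL Agent job. A sketch, with a hypothetical table name:

```sql
-- Refresh statistics across the current database using the built-in
-- procedure, which skips statistics that are already up to date.
EXEC sp_updatestats;

-- Or target a specific large, volatile table with a full scan for
-- maximum accuracy. Table name is a hypothetical example.
UPDATE STATISTICS dbo.OrderDetails WITH FULLSCAN;
```

The FULLSCAN option gives the optimizer the most accurate histogram at the cost of a full read of the table, so it is best scheduled in a quiet window.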
Data Compression and Column Storage
Data Compression and Column Storage (also known as xVelocity memory optimized column store indexes) are features available only with SQL Server Enterprise Edition. They are not available in other editions. (See http://technet.microsoft.com/en-us/library/cc645993.aspx for a list of which features are supported in which SQL Server editions.) If you are licensed and using SQL Server Enterprise Edition, we would recommend you make use of these features where appropriate.
Data Compression was originally introduced in SQL Server 2008 and has improved markedly in 2012. One of the most important things to understand about Data Compression is that it’s not just about space savings, although the savings can be significant. Using Data Compression can also have a very positive impact on storage performance and Buffer Pool usage by reducing the number of IOPS and allowing the database to store more pages in memory in the Buffer Pool in compressed form. Using compression can also dramatically reduce query execution time, as fewer pages need to be read from cache or disk and analyzed for a given query.
In SQL Server 2012, you can choose to compress either a table or index using row or page compression. By default, when you choose page compression, it automatically does row compression at the same time. Based on our experience, space savings and performance improvements of up to 75% with SQL Server 2012 can be achieved in many cases. Data Compression can be used with both OLTP and OLAP type workloads, including Data Warehouse and Batch.
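Before compressing a table, you can estimate the space savings, then rebuild with page compression. The table name below is a hypothetical example:

```sql
-- Estimate the benefit of page compression before committing to it.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'OrderDetails',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Rebuild the table with page compression (which includes row
-- compression automatically).
ALTER TABLE dbo.OrderDetails
REBUILD WITH ( DATA_COMPRESSION = PAGE );
```

The rebuild is an offline, IO-intensive operation by default, so as with statistics maintenance it should be scheduled outside peak periods.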
Column Storage, also known as xVelocity memory optimized column store index, is a new feature of SQL Server 2012 aimed at data warehouse workloads and batch processing. Column Storage is much more space and memory efficient at storing and aggregating massive amounts of data. Leveraging this feature can greatly improve the performance of data warehouse queries. However, to use it you must make some tradeoffs.
When using Column Storage, you will not be able to use Large Pages and Lock Pages in Memory (trace flag 834) because this will increase the work the translation look-aside buffer (TLB, see Chapter 7) has to do. Also, the tables using the column store index will be read-only. Any time you need to write data to the table, you need to drop and re-create the column store index, but this can easily be done with scheduled batch jobs. For the types of workloads that Column Storage is well suited to, these tradeoffs are normally worth the benefits.
The documented benefits of Column Storage include:
- Index compression—Column Storage indexes are far smaller than their B-Tree counterparts.
- Parallelism—The query algorithms are built from the ground up for parallel execution.
- Optimized and smaller memory structures
From a storage perspective, the benefit of Column Storage is that far less storage capacity and performance are required to achieve the desired query performance. The improvement in query performance ranges from 3X to 6X on average, up to 50X. See http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-components-postattachments/00-10-36-36-43/SQL_5F00_Server_5F00_2012_5F00_Column_2D00_Store.pdf.
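Because SQL Server 2012 column store indexes are read-only, the usual batch-load pattern is drop, load, re-create. A sketch with hypothetical fact table and index names:

```sql
-- Create a columnstore index on a data warehouse fact table.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesAmount);

-- Nightly load window: drop the index, load data, re-create it.
DROP INDEX NCCI_FactSales ON dbo.FactSales;
-- ... bulk load into dbo.FactSales here ...
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesAmount);
```

For partitioned fact tables, an alternative is to switch new data in via partition switching rather than rebuilding the index over the whole table.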
Database Availability Design Impacts on Storage Performance
The database availability design you choose will have a direct impact on your storage performance, and the impact varies between the different availability types. In this book, we focus mainly on standalone instances protected by VMware vSphere HA, SQL Server 2012 AlwaysOn Availability Groups (AAG), and AlwaysOn Failover Cluster Instances (FCIs). Standalone instances and FCIs have roughly the same storage capacity requirements (unless local Temp DB is used in the FCI case) and the same storage performance requirements. AlwaysOn Availability Groups, which have some advantages from an availability and data protection standpoint, at least double the total capacity requirements and add IO overhead that depends on the workload and on how many inserts, updates, and deletes there are, because each database change must be replicated.
Volume Managers and Storage Spaces
When you set up storage within Windows, you have the option of using the Windows Volume Manager with Basic or Dynamic Disks or using Storage Spaces (Windows 2012 onwards). Dynamic Disks and Storage Spaces provide options that include spanning volumes, striping volumes, and fault-tolerant volumes inside the guest operating system. Managing spanned, striped, or fault-tolerant volumes inside Windows adds an unnecessary IO overhead when you are virtualizing your SQL Server—especially as you are being provided with these services through your underlying storage devices, and your virtual disks can be expanded online without disruption. Because of the way that SQL Server manages its data files, and effectively accesses them and stripes the files anyway, there is no need to add any additional layers of striping for performance inside Windows.

We recommend the use of basic disks in Windows and the GPT (GUID Partition Table) partition format for all SQL Server partitions. Using GPT, you will be able to expand the partitions beyond 2TB in the future (vSphere 5.5 or above required) if the need arises. As a best practice, you should configure only one partition per virtual disk for your database.