Thursday, June 19, 2014

Designing and Implementing Microsoft Clustering SQL Server with VMware

In today’s software market, companies demand production environments that are always on, 24/7, along with a disaster recovery (DR) plan with minimal recovery time and recovery point objectives. These requirements are expensive, however, and businesses that cannot afford them have to accept some downtime for maintenance and DR. One technology that has been part of Microsoft Windows Server since the NT 4.0 era is Microsoft Clustering. Clustering allows a service, such as SQL Server, to run across multiple nodes, which provides failover within a data center if one node goes down and keeps downtime during maintenance windows to a minimum. In addition to clustering, virtualization has become an integral part of IT infrastructure. Virtualization makes it possible to run multiple instances of most operating systems on one physical machine; the physical machines are referred to as hypervisors and the instances running on them are called virtual machines. VMware in particular offers features that provide higher availability for production systems in a fashion similar to clustering: virtual machines can be moved between hypervisors with similar CPU architectures (Intel vs. AMD) through a process called vMotion. I will be going over the design and implementation of Microsoft Clustering for SQL Server using VMware technology, and specifically the issues I ran into with my production and DR sites.

1. Introduction
   a. Data center Design
      i. Storage
         1. Fibre Channel (FC)
            a. Expensive to implement
               i. Fabric switches
               ii. Fiber cables
               iii. SFP+ transceivers
               iv. HBAs (up to 16 Gbps)
         2. iSCSI
            a. Gigabit Ethernet
               i. Inexpensive, as it utilizes standard gigabit infrastructure
                  1. Cat5/6 cables
                  2. 1 Gbps NICs
            b. 10 Gigabit Ethernet (10 GbE)
               i. Expensive
                  1. Requires additional infrastructure
                     a. 10 GbE switches, cables, and transceivers (XFP)
                     b. Can utilize Fibre Channel over Ethernet (FCoE) if migrating from FC back to a copper medium
            c. iSCSI initiators
               i. Hardware based
                  1. Increased performance, as the NIC/HBA uses a TCP offload engine (TOE) to take packet processing off the CPU
                  2. Typically you would want to stay with homogeneous NICs
               ii. Software based
                  1. Can utilize heterogeneous NICs
                  2. Potential cost savings if the network is not the bottleneck for I/O
                  3. Recommended by VMware for most basic iSCSI setups (see the software-initiator sketch below)
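
As a minimal sketch of the software-based option on the host side (assuming ESXi 5.x; the adapter name vmhba33 and the portal address are placeholders, not values from my environment), the software initiator can be enabled and pointed at a discovery portal from the ESXi shell:

   # enable the ESXi software iSCSI initiator and confirm it is on
   esxcli iscsi software set --enabled=true
   esxcli iscsi software get

   # find the software adapter (e.g. vmhba33), then add the SAN's discovery portal
   esxcli iscsi adapter list
   esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.50.10:3260
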
            d. Physical or virtual Raw Device Mappings (RDMs) (see the vmkfstools sketch below)
               i. Physical
                  1. Passes the storage through directly to the guest
                     a. Able to utilize SAN management software to maximize performance
                  2. Unable to create VMware snapshots
                  3. Higher performance (on paper)
               ii. Virtual
                  1. The VMkernel only sends read/write commands to the presented storage
                  2. Able to utilize VMFS features such as file locking and VMware snapshots; also easier to migrate
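
For reference, this is roughly how the two RDM modes are created from the ESXi shell with vmkfstools (the device ID, datastore, and file names are examples only, not values from my environment):

   # physical (pass-through) mode RDM pointer file for a LUN
   vmkfstools -z /vmfs/devices/disks/naa.6000d31000123400000000000000abcd /vmfs/volumes/datastore1/sqlnode1/sql_data_prdm.vmdk

   # virtual mode RDM for the same LUN, which keeps VMFS features such as snapshots
   vmkfstools -r /vmfs/devices/disks/naa.6000d31000123400000000000000abcd /vmfs/volumes/datastore1/sqlnode1/sql_data_vrdm.vmdk
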
            e. Requirements
               i. IOPS (see the rough sizing example below)
                  1. Number of disks
                  2. Types of disks
               ii. Redundancy
                  1. Number of controllers (active/active [expensive] or active/passive [cheaper])
               iii. Load balancing
                  1. Paths to storage
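
To tie the IOPS and disk-count requirements together, here is a back-of-the-envelope sizing example using common rule-of-thumb figures (the 70/30 read/write mix and the per-spindle number are assumptions for illustration, not measurements from my environment):

   front-end requirement: 1,500 IOPS at a 70/30 read/write mix
   backend IOPS = reads + (writes x RAID write penalty)
                = (1,500 x 0.7) + (1,500 x 0.3 x 2)   [RAID 10 write penalty = 2]
                = 1,050 + 900 = 1,950 IOPS
   at ~175 IOPS per 15k SAS spindle, that is 1,950 / 175, or roughly 12 drives before any cache or tiering
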
      ii. Servers
         1. CPUs, RAM, HBAs
      iii. WAN
         1. Considerations
            a. Throughput
            b. Synchronous or asynchronous replication
            c. Homogeneous or heterogeneous production and DR sites
2. Actual Implementation
   a. Production Environment
      i. Hardware
         1. SAN - Compellent (active/passive)
            a. Two-tier storage
            b. 15k SAS and 7.2k SAS-class drives
         2. Servers - 2x Dell R810
            a. Intel E7-4830, 2.13 GHz octa-core w/ Hyper-Threading (16 physical / 32 logical cores total)
            b. 64 GB RAM
         3. HBAs - QLogic 8 Gbps
         4. 2x 24-port Brocade 300 SAN switches (FC)
      ii. Requirements
         1. Current load on the leased hardware was ~1,500 IOPS
         2. Multipathing for redundancy
      iii. Setup
         1. Two hypervisors with a dedicated NIC for the heartbeat channel
         2. Two-port HBA per host
         3. Physical RDMs
            a. No requirement for VMware snapshots and a preference for increased performance
            b. Allocation unit (block) size set to 64 KB for the presented storage dedicated to SQL DB files, logs, etc. (see the format example after this list)
         4. Straightforward install and setup
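
For reference, an NTFS volume can be formatted with a 64 KB allocation unit from an elevated command prompt and then verified as follows (the drive letter and label are placeholders):

   rem format the RDM-backed volume with a 64 KB allocation unit
   format F: /FS:NTFS /Q /A:64K /V:SQLDATA

   rem verify: "Bytes Per Cluster" should report 65536
   fsutil fsinfo ntfsinfo F:
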
      iv. Caveats
         1. Expanding drive space on a physical RDM requires downtime
         2. VMware Fault Tolerance and vMotion are not supported
         3. Round robin is not supported with the Native Multipathing Plugin; Dell/Compellent provides its own multipathing plugin for ESXi (see the esxcli check below)
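
To confirm which multipathing plugin and path selection policy a LUN is actually using, the host can be checked from the ESXi shell (a sketch assuming ESXi 5.x):

   # show each device's owning plugin and its Path Selection Policy
   esxcli storage nmp device list

   # list the multipathing plugins loaded on the host (NMP plus any vendor plugin)
   esxcli storage core plugin list --plugin-class=MP
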
   b. Disaster Recovery
      i. Hardware
         1. SAN - Dell EqualLogic iSCSI (active/passive)
            a. 10k SAS and 7.2k SAS-class drives
         2. Servers - 2x Dell R810
            a. Intel E7-4830, 2.13 GHz octa-core w/ Hyper-Threading (16 physical / 32 logical cores total)
            b. 64 GB RAM
         3. NICs - 4-port Intel and 4-port Broadcom
            a. The Broadcom NICs support hardware iSCSI, but there are known issues with the TCP offload engine on our specific model
            b. Utilizing software iSCSI instead
         4. 2x 48-port Dell PowerConnect switches
      ii. Requirements
         1. Handle a full disaster recovery situation for the production environment
         2. Multipathing for redundancy
         3. Ability to test disaster recovery at least 4 times a year
         4. Less than half the budget of the production environment
      iii. Setup
         1. Two ESXi hypervisors with a dedicated NIC for the heartbeat channel
         2. Two NICs dedicated to iSCSI traffic to fulfill the redundancy/multipathing requirement
         3. VMware Site Recovery Manager (SRM) as the DR solution
            a. vSphere Replication for heterogeneous SAN solutions
         4. iSCSI storage
            a. Allocation unit (block) size set to 64 KB for the presented storage dedicated to SQL DB files, logs, etc.
         5. Initial issues
            a. Cannot use iSCSI storage presented to the host and then mapped to the MSCS nodes with RDMs
               i. Not supported because VMware cannot pass SCSI-3 reservation commands through to the SAN
                  1. This fails the initial validation tests when creating an MSCS cluster through the Microsoft wizards
            b. Had to add virtual NICs to the MSCS nodes with access to the iSCSI network
            c. Presented the storage directly to the guests, bypassing ESXi (see the in-guest iSCSI sketch below)
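
For items b and c, the LUNs are connected with the Microsoft iSCSI initiator inside each guest. A minimal sketch for Windows Server 2008 R2 follows; the portal address and target IQN are placeholders for your EqualLogic group:

   rem make sure the Microsoft iSCSI Initiator service is running and starts automatically
   sc config msiscsi start= auto
   net start msiscsi

   rem point the initiator at the SAN's discovery portal, list what it finds, then log in to the target
   iscsicli QAddTargetPortal 192.168.50.10
   iscsicli ListTargets
   iscsicli QLoginTarget iqn.2001-05.com.equallogic:0-example-sqlcluster-vol01
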
            d. With the NICs presented to the guests (whether Intel or Broadcom), we had to modify settings within the Windows Server 2008 R2 OS
               i. The symptom was that the nodes would randomly lose access to the storage, destroying the cluster altogether
               ii. After a week of working with VMware, we identified that the issue was Windows attempting to force TCP chimney offload and task/segmentation offload on the virtual NICs; the fix was to disable both (applied as shown below):
                  1. netsh int tcp set global chimney=disabled
                  2. HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, Value (DWORD): DisableTaskOffload = 1
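
The registry value can be applied from an elevated command prompt rather than by hand in the registry editor (a reboot of the node may still be needed for it to take effect):

   reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v DisableTaskOffload /t REG_DWORD /d 1 /f

   rem confirm that chimney offload is disabled globally
   netsh int tcp show global
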
            e. Jumbo frames
               i. Do not forget to enable them on every NIC and every switch port, end to end from the servers to the SAN (see the verification sketch below)
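
A sketch of setting and verifying a 9000-byte MTU end to end, assuming ESXi 5.x (vSwitch1, vmk1, and the SAN address are placeholders for your environment):

   # on the ESXi host: raise the MTU on the iSCSI vSwitch and its VMkernel port
   esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
   esxcli network ip interface set --interface-name=vmk1 --mtu=9000

   # from the ESXi host: 8972 bytes of payload plus headers equals 9000, and -d forbids fragmentation
   vmkping -d -s 8972 192.168.50.10

From inside a Windows guest that carries an in-guest iSCSI NIC, the equivalent test is ping -f -l 8972 against the same SAN address; if either test fails while a standard ping succeeds, a device in the path is still at the default 1500 MTU.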

