In today's software market, society and companies expect production environments to be available 24/7 and also demand a disaster recovery (DR) plan with minimal recovery time and recovery point objectives. These requirements are expensive to meet, and businesses that cannot afford them have to accept some downtime for maintenance and DR. One technology that has been part of Microsoft Windows Server since the NT 4.0 era is Microsoft Clustering. Clustering allows a service, such as SQL Server, to run on multiple nodes. This provides failover within a data center if one node goes down and keeps downtime during maintenance windows to a minimum. In addition to clustering, virtualization has become an integral part of IT infrastructure. Virtualization grants the ability to run multiple instances of most operating systems within one physical machine. These physical machines are referred to as hypervisors, and the instances running on them are called virtual machines. VMware in particular offers features that provide higher availability for production systems in a fashion similar to clustering: through a process called vMotion, virtual machines can be moved between hypervisors that share a compatible CPU architecture (i.e., all Intel or all AMD, not mixed). I will be going over the design and implementation of Microsoft Clustering for SQL Server using VMware technology, and specifically the issues I ran into with my production and DR sites.
1. Introduction
   a. Data Center Design
      i. Storage
         1. Fibre Channel (FC)
            a. Expensive to implement
               i. Fabric switches
               ii. Fiber cables
               iii. SFP+ transceivers
               iv. HBAs (up to 16 Gbps)
         2. iSCSI
            a. Gigabit Ethernet
               i. Inexpensive, as it utilizes standard gigabit infrastructure
                  1. Cat5/6 cables
                  2. 1 Gbps NICs
            b. 10 Gigabit Ethernet (10 GbE)
               i. Expensive
                  1. Requires additional infrastructure
                     a. 10 GbE switches, cables, and transceivers (XFP)
                     b. Can utilize Fibre Channel over Ethernet (FCoE) if migrating from FC back to a copper medium
            c. iSCSI initiators
               i. Hardware based
                  1. Increased performance, as the NIC/HBA uses a TCP offload engine (TOE) to move packet processing off the CPU
                  2. Typically you would want to stay with homogeneous NICs
               ii. Software based (see the example commands after this item)
                  1. Can utilize heterogeneous NICs
                  2. Potentially cost saving if the network is not the bottleneck for I/O
                  3. Recommended by VMware for most basic iSCSI setups
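      If you go the software route, the initiator can be enabled and inspected from the ESXi command line. A minimal sketch for ESXi 5.x (earlier releases used the esxcli swiscsi namespace instead; adapter names will differ per host):

          esxcli iscsi software set --enabled=true    # enable the software iSCSI initiator
          esxcli iscsi software get                   # confirm the initiator is enabled
          esxcli iscsi adapter list                   # list iSCSI adapters (software and hardware)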
            d. Physical or virtual Raw Device Mappings (RDMs)
               i. Physical
                  1. Passes the storage device directly through to the guest
                     a. Able to utilize SAN management software to maximize performance
                  2. Unable to create VMware snapshots
                  3. Higher performance (on paper)
               ii. Virtual
                  1. The VMkernel only sends read/write commands to the presented storage
                  2. Able to utilize VMFS features such as file locking and VMware snapshots, and easier to migrate
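      Either RDM type can also be created from the ESXi shell with vmkfstools rather than the vSphere Client; a sketch, where the naa identifier and the datastore/VM folder paths are placeholders for your own LUN and VM:

          # physical-compatibility (pass-through) RDM pointer file
          vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx /vmfs/volumes/Datastore1/SQLNODE1/sqlnode1_rdmp.vmdk
          # virtual-compatibility RDM pointer file
          vmkfstools -r /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx /vmfs/volumes/Datastore1/SQLNODE1/sqlnode1_rdm.vmdk

      The pointer .vmdk is then attached to the VM like any other disk (for MSCS nodes on separate hosts, on a dedicated SCSI controller with physical bus sharing).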
            e. Requirements
               i. IOPS (see the rough sizing example after this list)
                  1. Number of disks
                  2. Types of disks
               ii. Redundancy
                  1. Number of controllers (active/active [expensive] or active/passive [cheaper])
               iii. Load balancing
                  1. Paths to storage
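      As a rough, back-of-the-envelope sizing example (rule-of-thumb figures, not vendor numbers): a 15k SAS spindle is commonly credited with roughly 175-200 random IOPS and a 7.2k spindle with roughly 75-100. A hypothetical tier of twelve 15k drives would therefore offer on the order of 12 × 180 ≈ 2,100 raw IOPS, which then has to be divided down by the RAID write penalty for the write portion of the workload (roughly ×2 for RAID 10, ×4 for RAID 5). Working the numbers this way early tells you how many spindles, and of what type, the requirement actually implies.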
      ii. Servers
         1. CPUs, RAM, HBAs
      iii. WAN
         1. Considerations
            a. Throughput
            b. Synchronous or asynchronous
            c. Homogeneous or heterogeneous production and DR sites
2. Actual Implementation
   a. Production Environment
      i. Hardware
         1. SAN – Compellent (active/passive)
            a. Two-tier storage
            b. 15k SAS and 7.2k SAS-class drives
         2. Servers – 2x Dell R810
            a. Intel E7-4830 – 2.13 GHz octa-core with Hyper-Threading (16 physical cores / 32 logical cores total)
            b. 64 GB RAM
         3. HBAs – QLogic 8 Gbps
         4. 2x 24-port Brocade 300 SAN switches (FC)
      ii. Requirements
         1. Current load on the leased hardware was ~1,500 IOPS
         2. Multipathing for redundancy
      iii. Setup
         1. Two hypervisors with a dedicated NIC for the heartbeat channel
         2. Two-port HBA per host
         3. Physical RDMs
            a. No requirement for VMware snapshots, and a preference for the increased performance
            b. Block size (allocation unit) set to 64 KB for the presented storage dedicated to SQL DB files, logs, etc. (see the formatting example below)
         4. Otherwise a straightforward install and setup
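      For the 64 KB allocation unit, the volume can be formatted from inside the guest; a sketch from an elevated command prompt on Windows Server 2008 R2, where the drive letter and label are placeholders:

          format F: /FS:NTFS /A:64K /V:SQLData /Q

      The 64 KB unit matches SQL Server's 64 KB extent size, which is why it is the commonly recommended allocation unit for data and log volumes.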
      iv. Caveats
         1. Expanding drive space on a physical RDM requires downtime
         2. VMware Fault Tolerance and vMotion are not supported
         3. Round robin is not supported with the Native Multipathing Plugin; Dell/Compellent provides its own multipathing plugin for ESXi (see the path policy commands below)
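      To see which path selection policy the native multipathing stack has claimed for each LUN, and to change it where the array vendor calls for something other than round robin, the ESXi 5.x shell offers the following (the naa identifier is a placeholder; confirm the recommended policy with your SAN vendor first):

          esxcli storage nmp device list                                               # show each device, its SATP and current PSP
          esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_FIXED   # pin a single LUN to the Fixed policy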
   b. Disaster Recovery
      i. Hardware
         1. SAN – Dell EqualLogic iSCSI (active/passive)
            a. 10k SAS and 7.2k SAS-class drives
         2. Servers – 2x Dell R810
            a. Intel E7-4830 – 2.13 GHz octa-core with Hyper-Threading (16 physical cores / 32 logical cores total)
            b. 64 GB RAM
         3. NICs – 4-port Intel and 4-port Broadcom
            a. Broadcom supports hardware iSCSI, but there are known issues with the TCP offload engine on our specific model
            b. Utilizing software iSCSI instead
         4. 2x 48-port Dell PowerConnect switches
      ii. Requirements
         1. Handle a full disaster recovery of the production environment
         2. Multipathing for redundancy
         3. Ability to test disaster recovery at least four times a year
         4. Less than half the budget of the production environment
      iii. Setup
         1. Two ESXi hypervisors with a dedicated NIC for the heartbeat channel
         2. Two NICs dedicated to iSCSI traffic to fulfill the redundancy/multipathing requirement (see the port binding example after this list)
         3. VMware Site Recovery Manager (SRM) for the DR solution
            a. vSphere Replication, since the production and DR SANs are heterogeneous
         4. iSCSI storage
            a. Block size (allocation unit) set to 64 KB for the presented storage dedicated to SQL DB files, logs, etc.
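      For the two dedicated iSCSI NICs, multipathing on ESXi 5.x is normally achieved by binding one VMkernel port per NIC to the software iSCSI adapter; a sketch, where vmhba37, vmk1, and vmk2 are placeholders for your own adapter and VMkernel port names:

          esxcli iscsi networkportal add --adapter=vmhba37 --nic=vmk1
          esxcli iscsi networkportal add --adapter=vmhba37 --nic=vmk2
          esxcli iscsi networkportal list --adapter=vmhba37     # confirm both ports are bound

      Each bound VMkernel port should sit on a port group with only its own physical NIC active, so that the two paths are genuinely independent.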
         5. Initial issues
            a. Cannot use iSCSI storage presented to the host and then mapped with RDMs to the MSCS nodes
               i. Not supported, because VMware cannot pass the SCSI-3 reservations through to the SAN
                  1. This fails the initial validation tests when creating an MSCS cluster through the Microsoft wizards
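      That validation is the same set of checks exposed by the Failover Clustering PowerShell module, so it is a quick way to re-test storage after any change; a sketch with hypothetical node names:

          Import-Module FailoverClusters
          Test-Cluster -Node SQLNODE1,SQLNODE2     # runs the validation report, including the storage/SCSI-3 persistent reservation tests

      On Server 2008 R2 the same checks are also available through the "Validate a Configuration" wizard in Failover Cluster Manager.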
            b. Had to add virtualized NICs to the MSCS nodes with access to the iSCSI network
            c. Presented the storage directly to the guests, bypassing ESXi
            d. Whether the presented NICs were Intel or Broadcom, we had to modify settings within the Windows Server 2008 R2 OS
               i. The symptom was that the nodes would randomly lose access to storage, destroying the cluster altogether
               ii. After a week working with VMware, we identified that Windows was attempting to force TCP chimney offload and segmentation offload on the virtualized NICs; the fix was to disable both:
                  1. netsh int tcp set global chimney=disabled
                  2. HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, Value (DWORD): DisableTaskOffload = 1
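      For reference, the same registry value can be set and the chimney state verified from an elevated command prompt (a sketch; a reboot is typically needed for the task offload change to take effect):

          reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v DisableTaskOffload /t REG_DWORD /d 1 /f
          netsh int tcp show global     # the Chimney Offload State should now read "disabled"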
            e. Jumbo frames
               i. DO NOT FORGET to enable jumbo frames on every NIC and every switch port, end to end from the server through the switches to the SAN (see the MTU commands below)
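      On the ESXi side that means the vSwitch and the iSCSI VMkernel ports both need a 9000-byte MTU, and the easiest end-to-end check is an unfragmented jumbo ping to the SAN; a sketch for ESXi 5.x, where vSwitch1, vmk1, and the target IP are placeholders:

          esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
          esxcli network ip interface set --interface-name=vmk1 --mtu=9000
          vmkping -d -s 8972 192.168.50.10     # 8972 = 9000 minus IP/ICMP headers; -d sets do-not-fragment

      If the vmkping fails while a normal ping works, some device in the path is still dropping jumbo frames.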
References:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009517
http://en.community.dell.com/dell-groups/dtcmedia/m/mediagallery/20094620/download.aspx