VIJAY SWAMI

thoughts and musings regarding enterprise technology & business

Archive for the ‘EMC’ Category

The details behind VMAX Cloud Edition

Posted by Vijay Swami on February 27, 2013

Anytime there is a product with the name “cloud” it tends to stir up a lot of interest from customers & peers. On one end of the spectrum you have vendors that simply take existing products and rebrand them as “cloud” versions, while others actually make something worthy of the name “cloud”. I believe the VMAX Cloud Edition to be the latter, albeit a first step towards the eventual goal of Storage As a Service, Software Defined Storage, Enterprise Cloud-Friendly Storage, <insert any buzz word here>, etc. But, what is it REALLY?

To summarize what a VMAX CE is: it’s a VMAX system with a self-service portal, REST API support, chargeback/showback reporting and multi-tenancy built in. It abstracts away storage requirements into a discrete number of “storage offerings”, doing away with the traditional methods of provisioning and thinking around storage.

Architecture:

VMAX CE Architecture

A few facts:

  • The VMAX Cloud Edition product is based on the VMAX10K hardware. This means 1-4 engines, D@RE (Encryption) capability, etc. One point to note, however, is that you can have up to 10 VMAX Cloud Edition frames running behind a single management interface provided by two management appliances connected in an HA fashion. These appliances connect via VPN to EMC’s datacenter as well as via FC to the array / switch fabric.
  • As can be seen above, the portal is actually located in EMC’s data centers. If you or your customer strictly will not allow external connectivity, this product will not work. This may or may not change in the future to where it can be 100% hosted on-prem, but for now this is a requirement: the “consumers” of storage connect to EMC’s data centers via secure web access, and the storage admins connect to EMC’s data center via VPN.
  • You cannot manage the VMAX Cloud Edition using Unisphere. You must use the self-service portal or the REST API. However, managing the VMAX CE through Unisphere would defeat the purpose of it in the first place.
  • From a front-end host connectivity perspective, all it supports is FC (Fibre Channel). There may or may not be plans to support iSCSI/FCoE in the future.
  • You absolutely cannot upgrade an existing VMAX to VMAX CE
  • Chargeback style reporting is built-in with a granularity on a per-tenant basis (customer, business unit, department, etc)
  • The service catalog specifies the following storage offerings, and each offering has a $/GB cost associated with it when purchasing the VMAX CE: Diamond 3-4 IOPS/GB, Platinum 1-3 IOPS/GB, Gold .5-1 IOPS/GB, Silver .25-.5 IOPS/GB, Bronze .05-.15 IOPS/GB, as shown below

VMAX CE Storage Offerings
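To make the service catalog a bit more concrete, here is a rough Python sketch of mapping a workload onto one of the offerings by its IOPS density. The band ranges come from the list above, with the Silver band read as 0.25-0.5 IOPS/GB (my assumption). The `pick_band` helper and its "lowest band whose upper bound covers the workload" rule are purely illustrative and are not how the portal itself assigns offerings.

```python
# IOPS/GB ranges quoted in the post; the Silver range is read here as 0.25-0.5
# (an assumption, since the source text is ambiguous on that band).
BANDS = [
    ("Bronze", 0.05, 0.15),
    ("Silver", 0.25, 0.5),
    ("Gold", 0.5, 1.0),
    ("Platinum", 1.0, 3.0),
    ("Diamond", 3.0, 4.0),
]


def pick_band(required_iops: float, capacity_gb: float) -> str:
    """Return the lowest band whose upper IOPS/GB bound covers the workload.

    Hypothetical helper for illustration only.
    """
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    density = required_iops / capacity_gb
    for name, _low, high in BANDS:
        if density <= high:
            return name
    raise ValueError(f"{density:.2f} IOPS/GB exceeds the Diamond band")


# 2,000 IOPS over 4,000 GB is 0.5 IOPS/GB; 6,000 IOPS over 2,000 GB is 3.0 IOPS/GB.
print(pick_band(2000, 4000))
print(pick_band(6000, 2000))
```

The point is the shift in thinking: the input is an IOPS and capacity requirement, not a spindle count or RAID type.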

As with any movement towards “cloud” based consumption there are two aspects: a technology enablement and a business process change. The technology enablement on the VMAX CE comes in the form of a portal, service catalog, REST API, chargeback style reporting and multi-tenancy. From a business process perspective, it allows customers to purchase the array on a $/GB basis instead of worrying about the cost of individual components such as drives, engines, cache, etc. To go from XXTB to YYTB on a VMAX may require an engine upgrade and its associated cost, but to go from XXTB to YYTB on a VMAX CE will carry a linear $/GB fee that is pre-determined based on the storage band, regardless of what extra components are required beyond drives. Storage planning becomes a simple task of understanding the capacity requirements for each service level and immediately determining a cost. This is huge compared to how storage budgets are forecast today. And lastly, by providing a service catalog (and REST API) for storage provisioning, it allows customers to automate & orchestrate the storage tasks.
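The linear pricing model is simple enough to sketch in a few lines of Python. The $/GB rates and capacities below are made-up numbers for illustration only (actual rates are set when the VMAX CE is purchased), and `forecast_cost` is a hypothetical helper, not part of any EMC tool.

```python
# Hypothetical $/GB rates per storage band; real rates are fixed at purchase time.
RATE_PER_GB = {"Gold": 4.00, "Silver": 2.50, "Bronze": 1.00}


def forecast_cost(capacity_gb_by_band: dict) -> float:
    """Total cost is just the sum of (band rate x band capacity)."""
    return sum(RATE_PER_GB[band] * gb for band, gb in capacity_gb_by_band.items())


# Made-up capacity plan: today's footprint vs. next year's.
current = {"Gold": 10_000, "Silver": 20_000}
planned = {"Gold": 15_000, "Silver": 20_000, "Bronze": 50_000}

growth_cost = forecast_cost(planned) - forecast_cost(current)
print(f"Current: ${forecast_cost(current):,.0f}")
print(f"Cost of growth: ${growth_cost:,.0f}")
```

Notice there is no term for engines, cache or any other component; the forecast depends only on GB per service level.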

It is extremely important that customers turn their frame of thinking towards “service levels” and not RAID groups, # of 15K spindles, and so on to truly provide storage as a service. What will also be of tremendous help is when storage best practices white papers get re-written with service levels in mind and not specific storage configurations. For example, while a customer may buy a VMAX CE today, the Exchange 2010 best practices whitepaper still speaks in terms of # of spindles and an “old school” methodology of storage architecture, not in Gold or Silver bands à la VMAX CE verbiage. This can be a challenge when trying to translate an application architecture into storage requirements in the above model. There is no doubt this is the way of the future in storage, but we have a long way to go until it’s status quo. Software Defined Storage is in its infancy.

Posted in EMC, storage, VMAX | Leave a Comment »

Is vSphere Data Protection the same thing as Avamar?

Posted by Vijay Swami on February 21, 2013

One of the common conversations in my customer base and among SE peers is around vSphere Data Protection and how it compares to Avamar. It is no secret that the latest incarnation of VDP and VDP Advanced have Avamar technology under the covers: in-line deduplication, variable length block & segment size, leveraging VADP and CBT, etc. EMC and VMware teamed up to bring Avamar technology into the VMware data protection portfolio, but the question is, is it the right solution for a particular data protection requirement? As always, the devil is in the details.

First, it’s worth pointing out that VDP comes in one deployment scenario, a software only virtual appliance, whereas Avamar can be deployed as a SW only virtual appliance (Avamar/VE) or a hardware based solution (“full” Avamar).

Architecturally, VDP looks like this:

VDP Architecture

It is a virtual appliance that runs on an ESX host. One of the major pluses to VDP is that the UI is integrated with the vCenter (web) client instead of being an external UI like Avamar. And let me tell you, the VDP UI is simple, easy to use and very intuitive. This is a huge win for VDP IMO. The installation of VDP is also very straightforward through an OVF. It should also be noted, however, that VDP requires the vCenter web client and vCenter 5.1. It does not work with previous versions of vCenter and does not work with the “full” thick Windows vCenter client. Under the covers both VDP and Avamar work very similarly and you can expect the same de-duplication rates since they share the algorithm.

VDP comes in two editions: VDP and VDP Advanced. The major differences lie in the configuration maximums, scalability and application level agents.

Each VDP appliance can be as large as 2TB whereas each VDP Advanced appliance can be as large as 8TB. VDP supports a maximum of 100 VMs per appliance whereas VDP Advanced supports 400 (of course the actual number of VMs will vary based on VM size, dedupe rates, retention requirements, etc., but those are upper limits). Both allow up to 10 appliances per vCenter, but each appliance is treated independently, meaning there is no de-duplication across multiple VDP appliances even in the same vCenter. VDP Advanced also includes agents for application consistent backup of SQL 2008 and 2012 as well as Exchange 2003, 2007 and 2010. With standard VDP the only option is an image based backup at the VM level. And while VDP is free with vSphere Essentials Plus and higher, VDP Advanced carries a $1095/CPU list price tag (although it can be purchased bundled through some of the higher end suites).
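As a back-of-the-napkin aid, the limits above can be turned into a quick sizing check. This is only a sketch: the `appliances_needed` helper and the example inputs are hypothetical, and the post-dedupe capacity estimate is something you would have to supply yourself. Because there is no dedupe across appliances, a real multi-appliance deployment will likely need more capacity than a single-pool estimate suggests.

```python
import math

# Per-appliance limits quoted in the post.
LIMITS = {
    "VDP": {"capacity_tb": 2, "max_vms": 100},
    "VDP Advanced": {"capacity_tb": 8, "max_vms": 400},
}
MAX_APPLIANCES_PER_VCENTER = 10


def appliances_needed(edition: str, vm_count: int, dedupe_store_tb: float):
    """Return (appliance count, fits within the per-vCenter limit).

    dedupe_store_tb is an assumed, user-supplied estimate of post-dedupe capacity.
    """
    limits = LIMITS[edition]
    by_vms = math.ceil(vm_count / limits["max_vms"])
    by_capacity = math.ceil(dedupe_store_tb / limits["capacity_tb"])
    needed = max(by_vms, by_capacity, 1)
    return needed, needed <= MAX_APPLIANCES_PER_VCENTER


# Hypothetical environment: 250 VMs and roughly 3TB of deduplicated backup data.
for edition in LIMITS:
    print(edition, appliances_needed(edition, 250, 3))
```

In this made-up example the VM count, not the capacity, is what forces standard VDP to multiple appliances.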

Now with some of the feeds & speeds of VDP out of the way, here are what I see as the major roadblocks to adoption (especially in my customer base):

First and foremost, there is no mechanism to get the data off-site for disaster recovery or compliance purposes. While Avamar supports both off-site replication to a second set of HW or Virtual Appliance, as well as tape out for off-siting through the Avamar Extended Retention capability, VDP offers neither. There are probably a couple of kludgy, error prone, labor intensive methods of restoring at a DR site, such as replicating the VDP appliance with array based replication, or backing up the VDP appliance via another backup program which is capable of doing tape-out, but none of these would be acceptable in any of my customer environments. The lack of a clean methodology for off-site recovery would be a deal breaker for almost all of my customers. If you are backing up data that is critical to your business, I don’t see how this could be acceptable regardless of organizational size. And yet, VDP Advanced is being marketed towards companies with 200-300 VMs; in my customer base this is HUGE.

Another drawback is in the application level consistency. As a reminder, this is only available with VDP Advanced and currently there are only two applications supported: SQL and Exchange. While the SQL agent supports most of the features one would expect, the Exchange agent is lacking a very major one: granular level restore at the mailbox & message level. While you can do a restore of an individual mailbox or message with the Avamar Exchange plug-in, the VDP Advanced Exchange plug-in only goes down to the Exchange database level, which is quite a nuisance if all that is needed is a single mailbox.

The VDP backup job scheduling also has a limitation: all backups start at the same time. While the backup job frequency, retention, etc. can be altered, there is only one backup window per day with VDP. What this means is that there is no good way to stagger backup jobs. In smaller environments this may not pose a problem, but any backup admin managing an environment of size will tell you that customizing the backup start time of various heavy hitter servers to spread the workload is one of the most critical components to keeping backups running smoothly. I see this as a potential point of concern for customers of any size & complexity.

I’ve also heard rumors circulating that VDP/VDP-A can be upgraded to “full blown Avamar” and I am being told that simply isn’t true. This could change in the future, but as of right now it’s not even in the realm of possibility… buyer beware. If this is an environment that will grow beyond the single 8TB appliance limit of VDP Advanced, and taking advantage of global deduplication across all datasets is desired, it is better to look at full Avamar from the beginning. The deduplication across the entire dataset will bring large efficiencies compared to simply deduplicating in discrete 8TB silos.

With all of those things said, I don’t want to make it seem like I feel VDP is a bad product. Far from it. It does many things very well: image level backup with file level restore capability; UI integrated right into vCenter; ease of use is a 10 out of 10, and I cannot stress this enough; installation is a snap; and the functionality it does provide works very well AND I feel it is a good value. But as always, it’s about matching up the requirements with the solution, and it’s important to be aware of the limitations of any product.

If the drawbacks outlined above do not conflict with your requirements, I would fully recommend giving VDP/VDP-A serious consideration. But while it may have Avamar technology under the covers, it is NOT Avamar.

Posted in backup, EMC, Virtualization, vmware | 1 Comment »

A study in VMAX & VNX auto-tiering

Posted by Vijay Swami on December 17, 2012

One of the major differences between a VMAX and a VNX is in the pooling & FAST-VP (auto-tiering SW) implementations. As more and more VNX customers are considering VMAX systems (thanks largely to the introduction of the VMAXe/VMAX10K price point), these differences are often a topic of conversation. There are some noticeable differences in the theory & operation, and subsequently the real-world management, which are worth understanding.

Data Movement Frequency & Data Movement Granularity:

The most obvious difference in the FAST-VP implementation between the two systems is how often data relocations can occur & the granularity of that data movement.

  • VMAX: data is collected continuously, analyzed continuously and can be moved continuously. The granularity of this data movement can be as small as 12 Symmetrix tracks, or 768KB.
  • VNX: data is collected continuously, analyzed once an hour, and moved once per 24hrs. The way to think about auto-tiering on a VNX is that the system builds a “plan” based on a data collection window and then executes on that plan once every 24hrs during a specified relocation window. The granularity of data movement is 1GB.

It’s important to take note of this major architectural difference between the two systems. On the VNX, auto-tiering is designed as more of a slow-moving mechanism, and for this reason it should be considered mandatory that all VNX systems utilizing FAST-VP also have the appropriate amount of Fast Cache capacity. Fast Cache operates at 64K granularity and allows instant promotion of data. In the event there were some cold data living on NL-SAS that suddenly became hot, the VNX can immediately promote it (or queue it for promotion) to Fast Cache to absorb that incoming workload spike, while FAST-VP would come around on the next relocation window to decide if it makes sense to actually up-tier it permanently into the EFD pool drives. If that same scenario were to occur on the VMAX, the data would be promoted (or queued to the DA for promotion) almost instantly to the EFD tier. This is why the VMAX does not need the Fast Cache feature at a fundamental level (gobs of engine cache help too!)
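To put the granularity numbers side by side, here is a small Python sketch of how much data each mechanism has to move to promote a small hot spot. The 64KB track size is implied by the "12 tracks = 768K" figure above; the 1MB hot region, the assumption that it is aligned to granule boundaries, and the `relocated_kb` helper are all hypothetical and for illustration only.

```python
import math

TRACK_KB = 64                      # implied by the post: 12 tracks = 768K
VMAX_EXTENT_KB = 12 * TRACK_KB     # smallest VMAX FAST-VP movement unit
VNX_SLICE_KB = 1024 * 1024         # VNX FAST-VP slice: 1GB expressed in KB
FAST_CACHE_PAGE_KB = 64            # VNX Fast Cache granularity


def relocated_kb(hot_kb: int, granule_kb: int) -> int:
    """KB that must move to promote a hot region, rounded up to whole granules."""
    return math.ceil(hot_kb / granule_kb) * granule_kb


hot_region_kb = 1024  # hypothetical 1MB hot spot, assumed granule-aligned
for name, granule in [
    ("VMAX FAST-VP", VMAX_EXTENT_KB),
    ("VNX FAST-VP", VNX_SLICE_KB),
    ("VNX Fast Cache", FAST_CACHE_PAGE_KB),
]:
    print(f"{name}: moves {relocated_kb(hot_region_kb, granule)} KB")
```

Under these assumptions the VNX has to relocate a full 1GB slice to up-tier that 1MB hot spot, which is a big part of why Fast Cache is there to catch the spike in the meantime.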

Nerd knobs A.K.A. The “tunability” of FAST-VP across the two systems:

  • VMAX:
    • Performance Time Window: when is the relevant time window to look at the data, usually set to 24/7/365 for continuous analysis
    • Data Movement Window: when can data be moved, usually set to 24/7/365 for continuous data movement
    • Workload Analysis Period: used to tune how aggressively to decay “older” IO compared to “newer” IO for data promotion/demotion ranking purposes
    • Initial Analysis Period: on a newly created LUN, how long is data collected before doing the first movements
    • FAST-VP Relocation Rate: a number from 1 to 10 on how quickly FAST-VP should move data
  • VNX:
    • Data Relocation Rate: Low/Medium/High
    • Data Relocation Schedule: which days of the week & during what window should relocation run

It should be clear which system has a more “end-user tunable” FAST-VP system. To be fair, not every customer tweaks every single FAST-VP knob on the VMAX, or any of them for that matter. The defaults or “best practices” (and how I hate that term!) actually work very well for the majority of customers.

Taxonomy of storage pools & LUN Configuration Elements:

VNX:

The VNX has 4 main configuration elements as it pertains to this discussion:

  • the physical drives
  • the storage pool
  • the LUN
  • the tiering policy of the LUN

VNX Pool Abstractions (Annotated from pg 11 of VNX Virtual Provisioning White Paper)

Based on the above, let’s examine how to create some auto-tiered storage on the VNX.

Take the EFD/SAS/NL-SAS drives, create a pool:

VNX Create Pool

Create a LUN in that pool:

VNX Create LUN

Set the tiering policy on that LUN:

VNX Select Tiering Policy

The tiering policies explained:
  • Start High Then Auto: place all the slices on the highest tier with available capacity (in this case EFD) and then examine the performance as time goes on and move the slices into the correct tiers
  • Highest: place the slices in the highest tiers possible and remain there, with the busiest slices in the highest tier
  • Auto: place the slices in the right tier based on performance thresholds, but initially slices are distributed throughout all 3 tiers (so some slices end up on EFD, some on SAS, and some on NL-SAS)
  • Lowest: place slices on the lowest available tier (in this case NL-SAS)
Assuming the first tiering policy (the recommended one for the majority of workloads), the LUN will have its slices start out on the highest tiers and over time auto-tier as appropriate.

VMAX:

The VMAX configuration elements are slightly different due to pooling architectural differences:

  • disk groups: collection of like type disks (e.g. 200GB EFD)
  • Virtual Pools: pooled storage capacity formed from disk groups
  • FAST-VP Tiers: association of a tier with the previously mentioned pooled capacity; multiple pools of like drive type & RAID protection type can be associated with a single FAST-VP Tier
  • FAST-VP Policies: auto-tiering policy specifying how much of each tier can be utilized, e.g. 100/100/100 would tell the system that 100% of the EFD, 100% of the FC and 100% of the SATA tier can be utilized for a given LUN/Storage Group
  • storage groups: collection of host LUNs

VMAX Pool Abstractions

On the VMAX each type of drive is associated with a disk group:

VMAX Disk Groups

Then virtual pools can be created from each of the drive types. A RAID protection and capacity are specified for each pool (can be full, or a subset of the entire system capacity based on the disk groups):

VMAX Pools

Next FAST-VP Tiers need to be created and associated with the Virtual Pools. Typical would be an EFD Tier, a FC Tier, and a SATA Tier. The example below shows the creation of an “EFD” FAST-VP Tier.

Creating a VMAX FAST-VP Tier

VMAX FAST-VP Tiers

All 3 tiers have been created above.

Next comes the FAST Policy. This determines the % of each tier (by capacity) that a LUN can occupy. 100/100/100 is the ideal policy as it basically tells the system “I’m not placing any restrictions on the tier percentages, you decide the best place to land the data”. This gives the system full control and if the VNX had a FAST-VP Policy parameter, this would be the setting (100/100/100):

Creating a FAST Policy

idealP FAST Policy Created

Alternatively, different policies can be created such as 20/30/50, which tells the system that at most 20% EFD by capacity and 30% FC by capacity can be utilized and the rest needs to live on the SATA tier. So again, the FAST Policy on the VMAX allows plenty of “nerd knob” tweaking if desired. Other uses for this would be in multi-tenant or storage as a service models where internal/external customers are paying different $/GB rates based on expected SLAs.
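The policy percentages translate directly into per-tier capacity ceilings for a storage group. Here is a minimal Python sketch of that arithmetic. The `tier_caps_gb` helper and the 10TB storage group are hypothetical, and the check that the percentages add up to at least 100 is simply this sketch's own sanity check (otherwise there would be nowhere to place all of the data).

```python
TIERS = ("EFD", "FC", "SATA")


def tier_caps_gb(storage_group_gb: float, policy: tuple) -> dict:
    """Max GB of a storage group allowed on each tier for a policy like (20, 30, 50).

    Hypothetical helper for illustration only.
    """
    if len(policy) != len(TIERS):
        raise ValueError("policy needs one percentage per tier")
    if any(not 0 <= pct <= 100 for pct in policy):
        raise ValueError("each tier percentage must be between 0 and 100")
    if sum(policy) < 100:
        raise ValueError("percentages must total at least 100 to hold all the data")
    return {tier: storage_group_gb * pct / 100 for tier, pct in zip(TIERS, policy)}


# Hypothetical 10,000GB storage group under two different policies.
print(tier_caps_gb(10_000, (20, 30, 50)))
print(tier_caps_gb(10_000, (100, 100, 100)))
```

With 100/100/100 every ceiling equals the full size of the storage group, which is another way of saying the system is free to place the data wherever it sees fit.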

Next a storage group and host devices (LUNs) must be created. This is as expected, but one item that needs to be specified is the initial pool binding, meaning which virtual pool to associate the LUN with initially. The best practice is to choose the middle / FC tier for this. Note that if thin provisioning is used, no space is actually occupied in the pool until host writes are sent. Additionally, there is a setting in VMAX 5876 code which allows new writes to land on a tier that FAST-VP decides is best based on the data collected on host IO activity. With this setting, FAST-VP may decide to land the new write on SATA even though the initial binding was on FC, if it deems this appropriate. This avoids tracks landing on FC first, only to be moved down to SATA later (or up to EFD). This is a system wide setting called “allocate by FAST policy” and the recommendation is to enable it unless there is a good reason not to.

VMAX Create Storage Group

Next the storage group has to be associated with a FAST Policy:

VMAX Associate a FAST Policy

… and now the LUN is auto-tiered via the pools created utilizing the FAST Policy specified (100/100/100 in this case).

The FAST Policy can be changed anytime on the fly. For example it could be changed from 100/100/100 to 20/30/50 or any combination based on business needs. This gives a lot of flexibility in the management of the performance & capacity of the array.

To summarize the data movement process between a VMAX and VNX as it pertains to auto-tiering:

-VMAX: TDEV (LUN) is bound to a pool/tier (best practice FC unless low workload); after the Initial Analysis Period performance metrics are analyzed; extents are marked for promotion / demotion; data movements are queued up on the DA (disk adapters); the TDEV remains bound to the pool it was originally bound to (for statistics purposes) regardless of where the tracks live; new host write behavior depends on the “allocate by FAST Policy” setting.

-VNX: LUN created in a Pool; initial allocation determined by Tiering Policy of LUN; data collected immediately, analyzed every hour, and if necessary slices will be moved during the next relocation window once every 24hrs.

Other notable behavioral differences:

  • on a VNX, extents are only moved down to lower tiers if space is needed on higher tiers to accommodate up-movement. On the VMAX, extents will be proactively moved down if the system deems them of lesser performance regardless of available capacity in higher tiers.
  • during replication the VNX on the target side has no information on FAST-VP movements that have occurred. The VMAX does have a feature called “FAST-VP RDF Coordination” which, when enabled at the storage group level, will allow the source VMAX to communicate with the target VMAX on FAST-VP movements, and thus the R2 volumes on the target side will have data located on the appropriate tier as per the R1 workload. Note: This is for SRDF ONLY, and does not work in RecoverPoint environments (as of today).
  • in environments where block side compression is being utilized, the VMAX can automatically compress inactive data under FAST-VP management with a toggle of between 40-400 days of inactivity. The compression on the VNX is more manual in nature with a simple enable/disable mechanism.

Summary:

It’s important to understand the differences in pooling & FAST-VP between the two systems. The VMAX offers -much- more flexibility in how the pools & FAST-VP can be configured, however not all customers require it. Many of the differences are due to how much more global memory & processing power a VMAX has compared to the VNX. This is especially true of the VMAX40K and the new VMAX10K “989” engines.

Comments/questions welcome.

Posted in EMC, storage, VMAX, VNX | 5 Comments »

Simplifying SAN management for VMware Boot from SAN, utilizing Cisco UCS and Palo

Posted by Vijay Swami on May 31, 2011

One of the great features of the Cisco UCS is the Palo or Virtual Interface Card (VIC). When utilizing this card with UCS, it allows the administrator to create many virtual NICs (vNICs) and virtual HBAs (vHBAs) (up to 128 with some limitations). In a VMware environment, the use of vNICs is well understood: you can create individual vNICs for service console, vMotion, VM network traffic, IP storage traffic, and so on. You can then apply QoS policies to them to guarantee service levels. Additionally, you have the ability to utilize dynamic vNICs and Pass-Through-Switching, which bypasses VMware’s vSwitch and dynamically assigns vNICs to VMs as they are created. The benefits of creating vNICs are clear, but how about vHBAs?

At first glance, it doesn’t seem that useful to create more than 2 vHBAs (one per SAN fabric); and after all this is something that you can do with the standard UCS mezzanine cards from Qlogic and Emulex. There is one use case where the ability to create more than two vHBAs comes in handy — that is boot from SAN in VMware environments. This applies equally to boot from SAN servers in other clustered environments, but I will be using VMware to illustrate this design option, with EMC’s midrange Clariion/VNX storage.

Read the rest of this entry »

Posted in Cisco, EMC, storage, UCS | 3 Comments »

VMAX on a Clariion Planet, Part2: storage layout and provisioning

Posted by Vijay Swami on April 27, 2011

In part2 of this series, we’ll take a look at the storage layout and provisioning basics comparison between VMAX and Clariion.

First a look at how storage is composed on the two arrays.

Read the rest of this entry »

Posted in EMC, storage, VMAX | Leave a Comment »

VMAX on a Clariion Planet, Part1: A look at architecture and IO flows

Posted by Vijay Swami on April 25, 2011

This article focuses on understanding VMAX from the perspective of users who are familiar with Clariion arrays, terminology and architecture. Put another way, it is a guide to VMAX for Clariion users. We’ll take a look at the architecture similarities/differences, terminology and basic storage administrative tasks. When Clariion is mentioned in this article, it applies equally to VNX arrays, as they are similar for the purposes of this article.

Part1 will focus on architecture and IO flows, and Part2 will discuss some storage design and provisioning concepts.

With that said, let’s examine I/O flow from the host to a back-end disk of each array type.

Read the rest of this entry »

Posted in EMC, storage, VMAX | 5 Comments »

EMC Storage Pool Deep Dive: Design Considerations & Caveats

Posted by Vijay Swami on March 5, 2011

This has been a common topic of discussion with my customers and peers for some time. Proper design information has been scarce at best, and some of these details appear to not be well known or understood, so I thought I would conduct my own research and share.

Some time ago, EMC introduced the concept of Virtual Provisioning and Storage Pools in their Clariion line of arrays. The main idea for doing this is to make management for the storage admin simple. The traditional method of managing storage is to take an array full of disks, create discrete RAID groups with a set of disks, and then carve LUNs out of those RAID groups and assign them to hosts. An array could have dozens to hundreds of RAID groups depending on its size, and often times this would result in stranded islands of storage in these RAID groups. Some of this could be alleviated by properly planning the layout of the storage array to avoid the wasted space, but the problem is that for most customers, their storage requirements change and they very rarely can plan how to lay out an entire array on day 1. There was a need for flexible and easy storage management, and hence the concept of Storage Pools was born.

Storage pools, as the name implies, allow the storage admin to create “pools” of storage. You could even, in some cases, create one big pool with all of the disks in the array, which could greatly simplify the management. No more stranded space, no more deep architectural design into RAID group size, layout, etc. Along with this comes a complementary technology called FAST VP, which allows you to place multiple disk-tiers into a storage pool and lets the array move the data blocks to the appropriate tier as needed based on performance needs. Simply assign storage from this pool as needed, in a dynamic, flexible fashion, and let the array handle the rest via auto tiering. Sounds great right? Well, that’s what the marketing says anyway. :)

First let’s take a brief look at the difference between the traditional RAID group based architecture and Storage Pools.

Read the rest of this entry »

Posted in EMC, storage | 38 Comments »

Comparing storage virtualization technologies: EMC VPLEX, Netapp VSeries, HDS USP-V

Posted by Vijay Swami on June 14, 2010

The release of VPLEX has spawned some interesting conversations with customers. When exploring the concept of storage federation, which is enabled by storage virtualization, the question that often comes up is, how do the main storage virtualization technologies differ?

The three mainstream storage virtualization technologies in the market today are Netapp’s V-Series, HDS USP-V, and now EMC VPLEX. They all have some commonalities and differences in their architecture, function and use cases……

Read the rest of this entry »

Posted in EMC, HDS, Netapp, storage virtualization, VPLEX | 15 Comments »

Interesting use cases for VPLEX

Posted by Vijay Swami on May 17, 2010

As I’m sure most are aware, EMC released its VPLEX product for federated storage access (a virtual storage engine). The idea is to be able to share storage across sites using distributed caching algorithms, making active/active data centers a reality. It is currently available across synchronous distances (100km) and for local site use cases. This product does all the things Invista does, plus cache coherency, plus across distance, plus a scale out cluster model, plus simplified setup and management and not performing the virtualization at the SAN switch level (thus not requiring stretched fabrics between sites), plus doing it with an appliance model, plus….. well you get the point.

The main reason for this technology is to further enable cloud based IT architectures: the ability to dynamically move storage workloads across data centers (or even between different arrays in the same data center) seamlessly without any application downtime or interaction. One can also make a case for easy tech refreshes, since virtualizing the storage volume decouples it from the physical array, allowing you to play the “storage vmotion” game with which back-end array the data actually resides on. This is very similar to how VMware decouples the application/OS workload from the physical server.

Those are all pretty standard use cases for storage virtualization, but how about some others?

Read the rest of this entry »

Posted in EMC, Virtualization, VPLEX | 2 Comments »

 