Author Archives: Chuck Hollis

About Chuck Hollis

Chuck loves enterprise IT infrastructure and the people who make it work every day -- with a special emphasis on storage. Over the years, he has become a popular industry blogger and speaker at chucksblog.typepad.com. When he's not working, his true love is playing keyboards in bar bands.

VMware VSAN vs Nutanix Head-To-Head Pricing Comparison — Why Pay More?

When presented with a huge differential in acquisition costs for ostensibly similar products, every IT professional inevitably asks the question: “why pay more?”

Maybe there’s a good reason for a substantial difference … and maybe not.

We’d like to share with you some preliminary data points we’ve collected around comparative pricing for VMware Virtual SAN and Nutanix.

There’s an important difference in the underlying business models.  Nutanix sells a complete, turnkey appliance.  VMware VSAN (and vSphere) is software that runs on the customer’s choice of hardware — it’s software-defined storage.  Both deliver a hyperconverged solution.

A few caveats before we begin:

In reality, the only prices that matter are the ones that you pay.  Portions of the IT industry can be relatively opaque when it comes to finding typical street pricing.  Bigger buyers tend to get better prices than smaller buyers.  Initial small buys might be priced very competitively if the vendor believes they might lead to larger transactions.

Your mileage may vary — we’ve done our best here!
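
One way to cut through the opacity is to normalize every quote to a common denominator before comparing. Here is a minimal Python sketch of that kind of normalization; the figures below are hypothetical placeholders, not real quotes from either vendor, so substitute the street pricing you actually receive.

```python
# Illustrative only: normalize a quote to cost per usable TB per year.
# All inputs below are hypothetical placeholders -- plug in the street
# pricing you actually receive.

def cost_per_usable_tb_year(acquisition_usd, annual_support_usd,
                            usable_tb, years=3):
    """Total cost of ownership per usable TB per year over the term."""
    total = acquisition_usd + annual_support_usd * years
    return total / (usable_tb * years)

# Two ostensibly similar configurations with hypothetical pricing:
quote_a = cost_per_usable_tb_year(200_000, 30_000, usable_tb=40)
quote_b = cost_per_usable_tb_year(350_000, 50_000, usable_tb=40)
print(f"A: ${quote_a:,.0f}  B: ${quote_b:,.0f} per usable TB per year")
```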

We’re presenting the data we have, so you can draw your own conclusions.  If someone spots an error, please leave a comment and we’ll address it quickly. Continue reading

VSAN vs Nutanix Head-to-Head Performance Testing — Part 1

Many, many factors go into an informed IT purchasing decision.  Priorities may vary for different people, but relative performance can matter greatly to many IT buyers when considering alternatives.  If the differences are significant and meaningful, it can certainly tilt the scale.

The usual baseline is the core question — is the performance fast enough for what I want to do?

But answering that question is not as easy as it sounds.  Real-world workloads can be notoriously difficult to characterize and size reliably.  And it’s good to have a healthy margin of performance headroom in case you’ve guessed wrong, or — more likely — the workloads have changed.  Nobody likes to hit a wall.

But there’s a deeper level as well.  A poorly performing product can require more hardware, licenses and environmentals to match the results of a much better performing product.  While not everyone may care about absolute performance, almost everyone would care about having to pay much more to get equivalent work done.

Unfortunately, there is a paucity of good head-to-head performance testing in the marketplace today.  Ideally, that would be done by an independent third party, but while we’re waiting for that, we’ve done our own.

To be clear, this isn’t about vendors beating each other over the head with benchmarks — it’s about helping IT professionals make informed choices.  More data is better.

We’ve already published an extensive set of performance testing for VSAN, but nothing in terms of a direct comparison.  We wanted to correct that situation — and here’s what we’re attempting to do. Continue reading

The Collapse Of Storage

The storage industry is undergoing rapid structural change of a kind not seen in decades.

My best soundbite is that storage is in the process of collapsing. Once a standalone topic, storage is clearly pulling away from our familiar model of external storage arrays, and disappearing into the fabric of servers and hypervisors.

While we all like to talk about disruptive industry changes, this one is perhaps the ultimate disruption, because it impacts every aspect of storage: the core technology, the consumption model, the integration model and the operational model.

As a result, most everything we’ve come to know about storage changes going forward.  Before too long, what you think you know won’t be how things actually work.

Let’s take a closer look at each of these “collapses” going on with storage today. Continue reading

IndonesianCloud and VSAN

IT service providers can be some of the most demanding IT users on the planet.

Why?  They serve very demanding customers.  And there are plenty of competitors doing similar things.

At the same time, they have to keep a watchful eye on costs, as margins can be thin.  Better, faster, cheaper — they demand all three!  As a category, IT service providers generally spend serious effort evaluating technology as part of their core business model.

Recently, Duncan Epping posted a deeper look at what IndonesianCloud is doing today with VSAN.  So far, they’re quite happy — and are planning to do much more.

An interesting read!

20+ Common VSAN Questions

VMware’s Virtual SAN radically simplifies storage in vSphere clusters.  Even though the product has been available for about a year, we still continue to get all sorts of interesting questions.

What follows is a list of the questions we encounter most often, along with quick answers.

I hope you find this helpful! Continue reading

Cache and All-Flash Virtual SAN

Most of us are familiar with the role that flash cache plays with hybrid storage systems that are a mix of flash and traditional disk.  Cache is there as a performance accelerator: storing recent reads, and buffering writes to disk.

But when VSAN 6.0 announced its new all-flash configuration, there was still a recommendation for cache in addition to the flash devices used for capacity.  Why is this — aren’t the capacity flash devices fast enough?  And why the 10% recommendation?

With all-flash VSAN, cache is used to extend the life of lower-endurance (and less expensive!) capacity flash devices.  Unlike hybrid configs, cache is not about performance — it’s about economics.

Recently, Cormac Hogan put together an excellent post explaining how cache works differently with all-flash VSAN, and — more importantly — explains the logic behind the 10% usable capacity sizing recommendation.

If you’re into optimized configuration of VSAN — or just want to understand how things work behind the scenes — it’s excellent reading!

http://cormachogan.com/2015/05/19/vsan-6-0-part-10-10-cache-recommendation-for-af-vsan/
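
As a quick back-of-the-envelope, here is a minimal Python sketch of that 10% rule of thumb; treat it as an illustration rather than official sizing guidance, and note the assumption that the rule is applied against anticipated consumed capacity.

```python
# Illustrative sketch of the ~10% cache sizing rule of thumb for all-flash VSAN.
# Assumption: the rule is applied against anticipated *consumed* capacity --
# see Cormac's post above for the authoritative reasoning.

def recommended_cache_gb(consumed_capacity_gb, cache_fraction=0.10):
    """Rule-of-thumb cache tier size for a VSAN host."""
    return consumed_capacity_gb * cache_fraction

# Example: a host expected to hold ~8 TB of consumed data
print(recommended_cache_gb(8_000))  # -> 800.0 GB of write-endurant cache flash
```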

The Future Of Hyperconverged Is Already Here

One of my favorite William Gibson quotes is “the future is already here, it’s just not evenly distributed yet”.  That statement could easily be applied to the work Gabriel Long is doing on behalf of his employer.

Here’s the story in a nutshell: he’s built a serious VSAN cluster that uses Diablo’s Memory Channel Storage™ technology for flash storage, which means he’s well ahead of the pack.  No, it’s not officially supported yet.  Regardless, very impressive stuff — and a great example of hyperconverged architectures to come.

Gabriel (or Gabe as he prefers) was kind enough to speak to me about what he’s doing: the motivations, the thinking and the experience that resulted.  It was an amazing story.

I hope you’ll agree as well …
Continue reading

Intel and VSAN Team Up At EMC World

At EMC World 2015, Intel brought serious game to the industry’s largest storage show — demonstrating a slick, all-flash 32-node VSAN configuration — complete with a cool, animated bezel.

In initial testing, the configuration shows ~3.25 million IOPS for 4K random reads, and an impressive ~2 million IOPS with a 70% read / 30% write 4K random mix.  Amazing performance in a single, dense rack designed to support 3,200 VMs.
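
A quick bit of arithmetic (mine, not Intel’s) puts those numbers in perspective:

```python
# Back-of-the-envelope math on the Intel demo figures quoted above.
cluster_read_iops = 3_250_000   # ~3.25M IOPS, 4K random reads
cluster_mixed_iops = 2_000_000  # ~2M IOPS, 70% read / 30% write 4K mix
nodes, vms = 32, 3200

print(cluster_read_iops / nodes)   # ~101K read IOPS per node
print(cluster_mixed_iops / vms)    # ~625 mixed IOPS available per VM
```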

You could almost feel the “drool factor” as people crowded around to ask questions about the configuration.  Our CEO Pat Gelsinger even stopped by for a quick photo!

[Photo: Pat Gelsinger visiting the Intel all-flash VSAN demo]

Click here for more details on this powerful combination of Intel and VMware technology.

Video: Virtual SAN From An Architect’s Perspective

Have you ever wanted a direct discussion with the people responsible for designing a product?

Recently, Stephen Foskett brought a cadre of technical bloggers to VMware as part of Storage Field Day 7 to discuss Virtual SAN in depth.  Christos Karamanolis (@XtosK), Principal Engineer and Chief Architect for our storage group, went deep on VSAN: why it was created, its architectural principles, and why the design decisions matter to customers.

The result is two hours of lively technical discussion — the next best thing to being there.  What works about this session is that the attendees are not shy — they keep peppering Christos with probing questions, which he handles admirably.

The first video segment is from Alberto Farronato, explaining the broader VMware storage strategy.

The second video segment features Christos going long and deep on the thinking behind VSAN.

The third video segment picks up where the second leaves off.  Christos presents the filesystem implementations, and the implications for snapshots and general performance.

Our big thanks to Stephen Foskett for making this event possible, and to EMC for sponsoring our session.

How To Double Your VSAN Performance

VSAN 6.0 is now generally available!

Among many significant improvements, performance has been dramatically improved for both hybrid and newer all-flash configurations.

VSAN is almost infinitely configurable: how many capacity devices, disk groups, cache devices, storage controllers, and so on.  Which brings up the question: how do you get the maximum storage performance out of a VSAN-based cluster?

Our teams are busy running different performance characterizations, and the results are starting to surface.  The case for performance growth by simply expanding the number of storage-contributing hosts in your cluster has already been well established — performance scales linearly as more hosts are added to the cluster.

Here, we look at the impact of using two disk groups per host vs. the traditional single disk group.  Yes, additional hardware costs more — but what do you get in return?

As you’ll see, these results present a strong case that by simply doubling the disk-related resources (e.g. using two storage controllers, each with a caching device and some number of capacity devices), cluster-wide storage performance can be doubled — or more.

Note: just to be clear, two storage controllers are not required to create multiple disk groups with VSAN.  A single controller can support multiple disk groups.  But for this experiment, that is what we tested.

This is a particularly useful finding, as many people unfamiliar with VSAN mistakenly assume that performance might be limited by the host or network.  Not true — at least, based on these results.

For our first result, let’s establish a baseline of what we should expect with a single disk group per host, using a hybrid (mixed flash and disks) VSAN configuration.

Here, each host is running a single VM with IOmeter.  Each VM has 8 VMDKs, and 8 worker tasks driving IO to each VMDK.  The working set is adjusted to fit mostly in available cache, as per VMware recommendations.

More details: each host is using a single S3700 400GB cache device, and four 10K RPM SAS disk drives.  Outstanding IOs (OIOs) are set to provide a reasonable balance between throughput and latency.

[Chart: VSAN_perf_1, hybrid VSAN with a single disk group per host, IOPS and latency as cluster size grows from 4 to 64 hosts]

On the left, you can see the results of a 100% random read test using 4KB blocks.  As the cluster size increases from 4 to 64, performance scales linearly, as you’d expect.  Latency stays at a great ~2msec, yielding an average of 60K IOPS per host.  The cluster maxes out at a very substantial ~3.7 million IOPS.

When the mix shifts to random 70% read / 30% write (the classic OLTP mix), we still see linear scaling of IOPS performance, and a modest increase in latency from ~2.5msec to ~3msec.  VSAN is turning in a very respectable 15.5K IOPS per host.  The cluster maxes out very close to ~1 million IOPS.
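
If you want to sanity-check the per-host math yourself, it’s straightforward (a quick sketch using the figures above):

```python
# Sanity-checking the single-disk-group baseline numbers quoted above.
hosts = 64
read_cluster_iops = 3_700_000   # ~3.7M IOPS, 100% random 4K reads
oltp_cluster_iops = 1_000_000   # ~1M IOPS, 70% read / 30% write mix

print(read_cluster_iops / hosts)  # ~57.8K IOPS per host, in line with ~60K
print(oltp_cluster_iops / hosts)  # ~15.6K IOPS per host, matches the ~15.5K
```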

Again, quite impressive.  Now let’s see what happens when more storage resources are added.

For this experiment, we added an additional controller, cache and set of capacity devices to each host.  And the resulting performance is doubled — or sometimes even greater!

[Chart: VSAN_perf_2, hybrid VSAN with two disk groups per host, IOPS and latency as cluster size grows]

Note that now we are seeing 116K IOPS per host for the 100% random read case, with a maximum cluster output of a stunning ~7.4 million IOPS.

For the OLTP-like 70% read / 30% write mix, we see a similar result: 31K IOPS per host, and a cluster-wide performance of ~2.2 million IOPS.

For all-flash configurations of VSAN, we see similar results, with one important exception: all-flash configurations are far less sensitive to the working set size.  They deliver predictable performance and latency almost regardless of what you throw at them.  Cache in all-flash VSAN is used to extend the life of write-sensitive capacity devices, and not as a performance booster as is the case with hybrid VSAN configurations.

In this final test, we look at an 8-node VSAN configuration, and progressively increase the working set size to well beyond available cache resources.  Note: these configurations use a storage IO controller for the capacity devices, and a PCI-e cache device which does not require a dedicated storage controller.

On the left, we can see the working set increasing from 100GB to 600GB, using our random 70% read / 30% write OLTP mix as before.

Note that IOPS and latency remain largely constant:  ~40K IOPS per node with ~2msec latency.  Pretty good, I’d say.

On the right, we add another disk group (with dedicated controllers) to each node, and instead vary the working set size from an initial 100GB to a more breathtaking 1.2TB.  Keep in mind, these very large working set sizes are essentially worst-case stress tests, and not the sort of thing you’d see in a normal environment.

[Chart: VSAN_perf_3, 8-node VSAN, IOPS and latency as the working set size grows beyond available cache]

Initially, performance is as you’d expect: roughly double that of the single disk group configuration (~87K IOPS per node, ~2msec latency).  But as the working set size increases (and, correspondingly, pressure on write cache), note that per-node performance declines to ~56K IOPS per node, and latency increases to ~2.4 msec.
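
One way to reason about that decline is a simple cache hit-rate model.  This is purely my own illustrative model with assumed parameters, not a description of how VSAN’s caching and destaging actually work:

```python
# Toy model: as the working set outgrows cache, effective IOPS degrade.
# The cache size and uncached rate below are assumptions for illustration;
# VSAN's actual caching and destaging behavior is more nuanced.

def effective_iops(cache_gb, working_set_gb, cached_iops, uncached_iops):
    """Blend cached and uncached rates by the fraction that fits in cache."""
    hit_fraction = min(1.0, cache_gb / working_set_gb)
    return hit_fraction * cached_iops + (1.0 - hit_fraction) * uncached_iops

# Dual disk group node: ~87K IOPS while cache-resident, declining under pressure
for ws_gb in (100, 600, 1200):
    print(ws_gb, round(effective_iops(800, ws_gb, 87_000, 30_000)))
```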

What Does It All Mean?

VSAN was designed to scale with available hardware resources.  For even modest cluster sizes (4 or greater), VSAN delivers substantial levels of storage performance.

With these results, we can clearly see two axes to linear scalability — one as you add more hosts to your cluster, and the other as you add more disk groups to each host.

Still on the table (and not discussed here): things like faster caching devices, faster spinning disks, more spinning disks, larger caches, etc.

It’s also important to point out what is not a limiting factor here: compute, memory and network resources.  The limit is the IO subsystem itself, which consists of a storage IO controller, a cache device and one or more capacity devices.

The other implication is incredibly convenient scaling of performance as you grow — by either adding more hosts with storage to your cluster, or adding another set of disk groups to your existing hosts.
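
Put together, the two scaling axes make rough capacity planning pleasantly mechanical.  A sketch, assuming perfectly linear scaling and the ~15.5K OLTP IOPS per host per disk group observed above; your hardware and workload will vary:

```python
# Rough planner for the two scaling axes: hosts, and disk groups per host.
# Assumes perfectly linear scaling and the ~15.5K IOPS per host per disk
# group (70/30 mix, hybrid) observed in the tests above.

PER_HOST_PER_DISK_GROUP_IOPS = 15_500

def estimated_cluster_iops(hosts, disk_groups_per_host):
    return hosts * disk_groups_per_host * PER_HOST_PER_DISK_GROUP_IOPS

print(estimated_cluster_iops(16, 1))  # ~248K IOPS
print(estimated_cluster_iops(16, 2))  # ~496K IOPS
```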

What I find interesting is that we really haven’t found the upper bounds of VSAN performance yet.  Consider, for example, that a host may have as many as FIVE disk groups, vs. the two presented here.   The mind boggles …

I look forward to sharing more performance results in the near future!

———–

Chuck Hollis
http://chucksblog.typepad.com
@chuckhollis