Category Archives: Storage

Overview of Modern Storage for the Enterprise


Hyper-converged, scale-out, black-box, roll-your-own: on-premises enterprise data storage is a bit more than a hobby of ours here at Symbio. Until recently, most medium-scale enterprises bought their storage from one of a few vendors: HP, Dell/Compellent, Nimble, NetApp, EMC, etc. These products arrive as more-or-less plug-and-play appliances and are usually fully supported by their respective vendors. They are also expensive. Symbio got its start using a home-brewed Linux-based storage appliance we built ourselves because we couldn’t afford anything else.

This was great until I found myself debugging production storage issues during (literally) the birth of my first child. After that experience, and at the urging of my wife, we committed to a Compellent investment. The cost was extreme for a company of Symbio’s size: the initial purchase price was nearly 20% of our annual revenue, and annual support renewals were almost 10% of revenue for the first couple of years. The Compellent served us well, and its stability allowed us to reliably quadruple the size of our customer base. After the first couple of years, however, we outgrew its performance capabilities.

As technology decision makers, we’re used to basing a storage platform choice on metrics like cost per GB, IOPS, features, and support reputation. This time, however, we wanted to take a deeper look at our options, because it’s clear that the very ways we think about storing data are changing. That shift presents fantastic opportunities for cost control and new capabilities, but it also presents new risks for businesses to contend with. This series of articles explores a few of the emerging trends in on-premises enterprise storage, the ideal applications for each technology, and our specific experiences with each approach. Symbio currently runs all of these systems in production.

Hyper-converged – VMware vSAN: Solutions like vSAN or Nutanix place the storage directly inside your compute hosts, then use software on the “back end” to provide performance and data redundancy. These solutions can offer extreme performance at a moderate cost, but they dramatically change failure models and require very careful, experienced planning to implement reliably. Symbio uses vSAN as our primary storage for high-performance needs, specifically databases and virtual desktops.

Traditional “Black Box” SAN – Nimble Storage: The “usual” enterprise approach, with an appliance provided and supported by a vendor. This approach offers moderate performance and generally very high reliability, and it fits existing thinking about failure modes (storage and compute can be treated as isolated components of an overall system). Cost is often high compared to the alternative approaches, but the “one-ass-to-kick” nature of the support can be of tremendous value to shops that lack deep IT talent. Symbio uses Nimble for our “general purpose” workloads: things that don’t demand extreme performance or capacity, but where we derive value from some of the “nice to have” features that aren’t available on our other solutions.

Open Source, Scale Out – Red Hat Ceph: Ceph is rapidly emerging as a favorite low-cost, high-capacity solution for shops with strong technical capability. Ceph uses a mathematical placement model (CRUSH) to decide where data lands on the underlying disks, and clients talk directly to those disks to request it, which means the controller is no longer a bottleneck or failure point as it is with a traditional SAN. Ceph can scale to petabytes simply and without the enormous cost a traditional SAN would require. It is open source, community supported (though enterprise support is available), and runs on commodity hardware. Symbio repurposed all of our old Compellent hardware into Ceph clusters (which, yes, we will write a blog post about), and we use them as low-performance, high-capacity storage for backups and our SymbioVault off-site backup product. Ceph is presently limited in some very important ways, however.
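
As a quick illustration of that placement model (a hedged sketch; the pool and object names below are hypothetical), you can ask a Ceph cluster which placement group and disks (OSDs) an object maps to without reading or writing any data; the answer falls straight out of the CRUSH calculation:

# Show where an object would be placed; no I/O is performed.
# "rbd" is a pool name and "some-object" an object name, both made up for this example.
ceph osd map rbd some-object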

We’ll explore each of these technologies in depth in the coming series of articles.

Practical vSAN: Increasing Congestion Thresholds


As I described in Practical vSAN: Measuring Congestion, vSAN uses congestion as one of the primary metrics in deciding whether or not to add latency to incoming writes. As congestion increases, the vSAN’s overall performance declines until eventually it stops accepting incoming writes altogether, and machines on the vSAN will effectively hang, similar to an APD condition in ESXi’s core storage (NFS/iSCSI) subsystem.

One incredibly useful technique to “buy yourself time” is to increase these congestion thresholds. The upper limit, according to PSS, is 128GB. Remember, this only buys you time; it does not resolve the underlying problem. “LowLimitGB” is the threshold at which latency starts to be added, and “HighLimitGB” is the threshold at which incoming writes are halted. 64 and 128 appear to be the maximum allowed values. You will need to execute these commands on every host in the cluster that is experiencing congestion; we suggest setting them identically on all hosts. Also, don’t set limits larger than the size of your cache devices.

esxcfg-advcfg -s 64 /LSOM/lsomLogCongestionLowLimitGB
esxcfg-advcfg -s 128 /LSOM/lsomLogCongestionHighLimitGB
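
As a quick sanity check (a minimal sketch using the same settings as above), you can read the values back with the -g flag after making the change, which makes it easy to confirm that every host in the cluster matches:

# Read the congestion thresholds back; every host in the cluster should report the same values.
esxcfg-advcfg -g /LSOM/lsomLogCongestionLowLimitGB
esxcfg-advcfg -g /LSOM/lsomLogCongestionHighLimitGB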

In one recent case, we were able to use these commands to buy ourselves a few hours while we diagnosed an underlying hardware issue, and to finish out the working day without any further performance complaints. We ended up leaving the values at these levels rather than reverting them to the defaults as PSS recommends; I don’t really see a downside as long as you’re proactive about monitoring for congestion generally. In our next post, Practical vSAN: Part 3 – Adjusting the number of background resync processes, we’ll show you how to change the number of simultaneous resync processes if your storage woes are being exacerbated by a background resync.

Practical vSAN: Measuring Congestion


VMware’s vSAN, while an amazing product in many respects, leaves something to be desired when it comes to troubleshooting issues in production. Many of the knobs and dials are “under the hood” and not exposed in a way that is obvious in a crisis. In this series of posts we’ll document some of the troubleshooting techniques and tools we’ve gathered over the last several months. As always, use these tools at your own risk; we highly advise engaging VMware PSS if you can.

Part 1: Measuring Congestion

In vSAN, as writes are committed to the cache tier, other writes are concurrently destaged to the capacity tier. In essence, the cache tier is a buffer for incoming writes bound for the capacity tier. If there is an issue with the underlying capacity tier, this buffer can start to fill. In vSANese, this is known as “log congestion.” In my opinion, congestion is one of the primary health metrics of a vSAN. If you start to experience persistent log congestion during a resync or another intensive IO operation, that’s a very good sign that some underlying component has a fault or that there is a driver/firmware issue. I should also note that log congestion does not always indicate a problem with the capacity tier; logs are also used when persisting data to the caching tier.

As an aside, the health check plugin reports these logs on what appears to be a scale of 0 to 255, with 255 being 100% full (though support is unclear on exactly how this maps).

As the log buffer fills, vSAN starts to “add latency” to incoming write requests as a way of throttling them. When a resync is triggered and the underlying storage can’t keep up for whatever reason, these buffers WILL fill. Once they hit a certain threshold (16GB by default), latency is added. More and more latency is added until a second threshold is reached (24GB by default), at which point incoming writes are halted completely until enough data has been destaged to resume operation. At this point your entire cluster may enter a kind of bizarro-world APD-like state, where individual hosts start dropping out of vCenter and virts hang or pause. Note that the only indication you will get from vCenter or the vSAN itself at this point is that log congestion is too high.

You can check the current size of these log buffers by running this command on each host in your vSAN:
esxcli vsan storage list |grep -A1 "SSD: true" |grep UUID|awk '{print $3}'|while read i;do sumTotal=$(vsish -e get /vmkModules/lsom/disks/$i/info |grep "Log space consumed"|awk -F \: '{print $2}'|sed 'N;s/\n/ /'|awk '{print $1 + $2}');gibTotal=$(echo $sumTotal|awk '{print $1 / 1073741824}');echo "SSD \"$i\" total log space consumed: $gibTotal GiB";done

Another useful tip is to save this into a shell script and use the unix “watch” command to observe these values over time (e.g. watch ./congestion.sh).
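
For reference, here is one way congestion.sh might look with the one-liner above broken out into a readable script; same commands and logic, just reformatted with comments:

#!/bin/sh
# congestion.sh - report the log space consumed on each cache-tier (SSD) device, in GiB.
# This is the same pipeline as the one-liner above, just broken out for readability.

# Pull the UUID of each cache device in this host's vSAN disk groups.
esxcli vsan storage list | grep -A1 "SSD: true" | grep UUID | awk '{print $3}' |
while read uuid; do
  # Sum the "Log space consumed" counters (reported in bytes) for this device.
  bytes=$(vsish -e get /vmkModules/lsom/disks/$uuid/info \
    | grep "Log space consumed" \
    | awk -F: '{print $2}' \
    | sed 'N;s/\n/ /' \
    | awk '{print $1 + $2}')
  # Convert bytes to GiB so the value can be compared against the 16GB/24GB thresholds.
  gib=$(echo "$bytes" | awk '{print $1 / 1073741824}')
  echo "SSD \"$uuid\" total log space consumed: $gib GiB"
done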

Again, any value below 16GB should not cause the vSAN to introduce latency. Any value between 16GB and 24GB is bad news, and if you hit 24GB you’re really having a bad time. Also watch these values over time: if they are generally trending upward, that’s bad; if they are generally decreasing, you can start breathing again.

These values can be increased, which can buy you some breathing room in a crisis. You can read about that in our next post: Practical vSAN Part 2: Increasing Congestion Thresholds