We’ve just announced two new NVMe Flash storage systems: the PowerMax 2000 and the PowerMax 8000. Working with NVMe flash is exciting, and of course strongly motivated by high performance and low latencies. However, this post is about another feature that PowerMax adds: the ability to perform both deduplication (dedupe) and compression, and how they apply to Oracle databases.
As you may know, VMAX All Flash already had a hardware compression module in each director (and there are two directors per engine). PowerMax not only brings more powerful compression modules, but these modules can also generate hash IDs, opening the door to supporting both compression and dedupe.
In fact, with PowerMax, when compression is enabled on a group of devices (storage group), dedupe is also enabled. It is not possible to enable one but not the other.
What I want to explore in this blog is how PowerMax dedupe and compression interact with Oracle databases under different circumstances.
Before I do that, let’s take a quick look at the PowerMax architecture that allows it to offer compression and dedupe while maintaining high performance, efficiency, and the rest of the storage data services.
A brief overview of PowerMax data reduction architecture
All PowerMax devices are thin. First, this means that a thin device is merely a set of pointers. The actual data can be anywhere in the storage array, and can be moved transparently from one media type to another (e.g. NVMe flash, or DRAM) while the host is none the wiser.
Second, until the thin devices’ capacity is allocated, they don’t actually consume storage (though they each consume minimal metadata space such as for their pointers). Note that when PowerMax stores actual data, its extent granularity (aka thin device extent, or cache-slot track size) is 128 KB.
This works very well with Oracle ASM, as ASM stores data in Disk Groups by allocating its own extents. The ASM extents have a minimum size of an Allocation Unit (AU), which is 1, 2, 4, …, 64 MB. While the ASM disk group reports its total capacity based on the devices comprising it, it only allocates extents that are written to by the database. In other words, an empty 4 TB ASM disk group barely consumes any capacity in the storage array (just ASM-related metadata).
As a result, the combination of ASM and PowerMax thin devices keeps storage allocations to just what is actually consumed by the database! I’ll dedicate another post to discuss different ways of freeing deleted capacity, including while the ASM disk group is online, but that’s for another time.
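To make the thin-provisioning idea concrete, here is a minimal sketch in Python. The class name, the backing-location strings, and the 4 MB AU are illustrative assumptions; only the 128 KB extent granularity comes from the description above.

```python
# Hypothetical sketch of thin-device allocation. Names and the AU size
# are illustrative, not the actual PowerMax/ASM implementation.
EXTENT_SIZE = 128 * 1024      # PowerMax thin-device extent: 128 KB
AU_SIZE = 4 * 1024 * 1024     # example ASM Allocation Unit: 4 MB

class ThinDevice:
    """A thin device is just a map of logical extents to backing storage."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.extents = {}     # logical extent index -> backing location

    def write(self, offset, length):
        # Allocate only the 128 KB extents the write actually touches.
        first = offset // EXTENT_SIZE
        last = (offset + length - 1) // EXTENT_SIZE
        for idx in range(first, last + 1):
            self.extents.setdefault(idx, f"backing-{idx}")

    def allocated_bytes(self):
        return len(self.extents) * EXTENT_SIZE

# A "4 TB" device consumes no data capacity until something writes to it.
dev = ThinDevice(4 * 1024**4)
assert dev.allocated_bytes() == 0

# When ASM writes one AU, only those 4 MB of real capacity are allocated.
dev.write(0, AU_SIZE)
```

The point of the sketch: allocation is driven by writes, not by the device’s advertised size, which is why an empty disk group costs almost nothing.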
Another critical aspect of PowerMax architecture is that it has a large persistent cache (512 GB – 16 TB raw). The cache is mirrored and protected from power failures (it can vault). That allows for many cool features, such as acknowledging all host writes as soon as they are registered in the cache (no need to destage them first), very fast host reads of frequently accessed data, and optimized writes to the storage media (periodic, larger writes containing both data and parity).
This is important because in the first step of our data reduction story, the Oracle data I/Os arrive at the PowerMax, are stored in its persistent cache, and are immediately acknowledged to the host. The thin devices’ pointers point to the appropriate cache slots containing the Oracle data.
At some point later, PowerMax determines that it’s time to destage the data from cache to storage. If the storage group has compression enabled, the cache slots are sent to the hardware compression modules, where the data is split into four chunks that are compressed in parallel, and hash IDs are generated. Next comes the question: is the data new (unique)?
If it is, then its compressed version is stored, and the thin devices’ pointers are updated appropriately. All this happens transparently to the database activity.
But what if the data is not unique?
In that case, a compressed version of the data is already stored. Therefore, PowerMax just updates the pointers appropriately, without consuming any additional capacity! That is what dedupe is: a single copy of duplicate data.
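The destage path above can be sketched in a few lines of Python. This is a loose software analogy, assuming `zlib` as a stand-in for the hardware compression and SHA-256 as a stand-in for the hardware-generated hash IDs; the real array does the chunking, compression, and hashing in the compression modules, not in software.

```python
import hashlib
import zlib

# Illustrative sketch of the destage path: compress a cache slot, hash
# its contents, and store the compressed bytes only if the hash is new.
backing_store = {}    # hash ID -> compressed bytes (one copy per unique slot)
device_pointers = {}  # (device, extent index) -> hash ID

def destage(device, extent_idx, data: bytes):
    compressed = zlib.compress(data)
    hash_id = hashlib.sha256(data).hexdigest()
    if hash_id not in backing_store:   # unique data: store its compressed form
        backing_store[hash_id] = compressed
    # Unique or duplicate, only the thin device's pointer is updated.
    device_pointers[(device, extent_idx)] = hash_id

slot = b"oracle block contents" * 1000
destage("dev1", 0, slot)
destage("dev2", 7, slot)   # duplicate: no extra capacity is consumed
```

After the second call, both device pointers reference the same stored copy, which is the essence of dedupe.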
That’s pretty much it!
For completeness, I’ll add just a few more aspects of PowerMax data reduction:
- If compression is enabled on a storage group prior to the Oracle data allocations, the mechanism described above of inline compression and dedupe is in effect. If, however, the storage group contained existing data, it will get compressed/deduped over time as a low priority task.
- Compressed data that hasn’t been touched in over 30 days gets Extended Data Compression (EDC) for a stronger compression ratio.
- Based on typical data access patterns, where the most recent data is accessed much more frequently than older data, PowerMax keeps the most active 20% of the allocated capacity uncompressed while capacity allows. Over time, when ‘hot’ storage extents cool down (and others heat up), if compression is enabled on that storage group, the extents automatically get compressed. This mechanism is called Activity Based Compression (ABC); it avoids unnecessary compression and decompression of the most active data, and therefore also prevents performance overhead.
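The ABC idea from the last bullet can be sketched as a simple ranking: keep the hottest ~20% of extents uncompressed, and treat the rest as compression candidates. The extent names and the activity metric below are made up for illustration; the actual heuristics are internal to PowerMax.

```python
# Hedged sketch of Activity Based Compression (ABC): the most active
# fraction of extents stays uncompressed; cooled-down extents are
# candidates for compression.
def compression_candidates(extent_activity, hot_fraction=0.20):
    """extent_activity: dict of extent id -> recent access count.
    Returns the set of extents that should be (or become) compressed."""
    ranked = sorted(extent_activity, key=extent_activity.get, reverse=True)
    hot_count = int(len(ranked) * hot_fraction)
    hot = set(ranked[:hot_count])       # most active: stay uncompressed
    return set(ranked) - hot            # everything else: compress

# Ten extents with a skewed access pattern (hypothetical counts).
activity = {f"ext{i}": c for i, c in
            enumerate([90, 75, 8, 5, 3, 2, 1, 1, 0, 0])}
cold = compression_candidates(activity)
```

As extents cool down and heat up, rerunning the ranking moves them in and out of the uncompressed set, which mirrors the automatic behavior described above.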
Let’s move on to the tests! (continue reading in part II)