Building a Sound Foundation for DB2 BLU Acceleration

by Michael Kwok

DB2 with BLU Acceleration is an innovative columnar technology for creating and running your analytic warehouse. It is easy to implement and self-optimizing. In addition to column-organized storage, it leverages new technologies such as actionable compression, parallel vector processing, and data skipping. DB2 with BLU Acceleration gives you an order-of-magnitude improvement in performance, storage savings, and time-to-value. But how can you get the most out of it?
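To give a sense of what "easy to implement" means in practice, here is a minimal setup sketch. Setting the DB2_WORKLOAD registry variable to ANALYTICS before creating a database makes column-organized tables the default and lets DB2 auto-configure memory and workload settings; the database name below is just a placeholder.

```shell
# Sketch: enabling the BLU Acceleration analytics defaults.
# DB2_WORKLOAD=ANALYTICS makes column-organized tables the default
# and auto-configures memory and workload management for analytics.
db2set DB2_WORKLOAD=ANALYTICS
db2 "CREATE DATABASE MYDB"    # MYDB is a placeholder database name
```

With that in place, tables you create are column-organized by default and BLU Acceleration self-optimizes from there.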

The first step is to choose a sound platform.

It is important to note that no matter which platform you choose, the BLU Acceleration in-memory technology will transparently leverage its hardware characteristics, with no user intervention required. That said, BLU Acceleration has demonstrated excellent performance running on the following platforms:

  • IBM POWER8 for Power/AIX
  • IBM POWER8 for Power/Linux (Little Endian)
  • Intel Xeon E5 / E7 v3 for x64/Linux
  • Intel Xeon E5 / E7 v3 for x64/Windows
  • IBM z13 Enterprise Linux Server for zLinux

Of course, BLU Acceleration also supports other hardware platforms; see the DB2 system requirements documentation for details.

In recent years, the number of cores and the amount of memory available have both continued to increase as costs decrease. For example, an IBM POWER8 E880 server can be configured with 16 TB of RAM. As a good rule of thumb, I’d recommend 16 GB of RAM per core, based on field experience and internal benchmark testing.

For optimal BLU Acceleration performance, memory is more critical than the core count. However, this does not mean that BLU Acceleration requires every piece of data to be in memory in order to process a query. BLU Acceleration has a dynamic list prefetching algorithm that is designed to keep the CPUs running at full speed by prefetching the next set of columnar data while the current set is being processed. Keep in mind that I/O is still much slower than memory; therefore, I’d suggest sizing the buffer pool (memory) to at least 50% of the size of the typical active data set.
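Putting the two rules of thumb together (16 GB of RAM per core, and a buffer pool of at least 50% of the typical active data set), a back-of-the-envelope sizing might look like this. The core count and active data size below are hypothetical, purely for illustration:

```shell
# Back-of-the-envelope sizing using the rules of thumb above.
cores=32                 # hypothetical server core count
active_data_gb=800       # hypothetical typical active data set (GB)

ram_gb=$((cores * 16))                 # 16 GB of RAM per core
bufferpool_gb=$((active_data_gb / 2))  # >= 50% of active data set

echo "Suggested RAM: ${ram_gb} GB"
echo "Minimum buffer pool: ${bufferpool_gb} GB"
```

So a 32-core server would suggest roughly 512 GB of RAM, and an 800 GB active data set would call for a buffer pool of at least 400 GB.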

Step two is to choose high performance storage. 

When BLU Acceleration does I/O, it is typically non-sequential, i.e., skip-sequential or random I/O. Hence, BLU Acceleration can take full advantage of high-performing storage (e.g., SSD), especially on workloads where the active data set is much larger than the available memory, and/or when queries are relatively complex and likely to induce a lot of temporary-table activity.

Another good rule of thumb is to define the table spaces for active data, as well as the temporary table spaces, on a high-performance SSD storage system such as IBM FlashSystem 900, which provides extremely fast random-read and random-write I/O.
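As a sketch of how that placement could look, you can define a storage group over SSD-backed paths and create the data and temporary table spaces on it. All names and paths below (SSD_SG, TS_DATA, TS_TEMP, /ssd/...) are placeholders, not product defaults:

```shell
# Sketch: directing active data and temps to SSD storage paths.
# SSD_SG, TS_DATA, TS_TEMP, and /ssd/... are placeholder names.
db2 "CREATE STOGROUP SSD_SG ON '/ssd/data1', '/ssd/data2'"
db2 "CREATE TABLESPACE TS_DATA USING STOGROUP SSD_SG"
db2 "CREATE SYSTEM TEMPORARY TABLESPACE TS_TEMP USING STOGROUP SSD_SG"
```

Keeping both the active-data and temporary table spaces on the fast storage matters because complex queries can spill significant temporary data.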

And finally, step three is to consider your network, operating system and DB2 software release levels.

Network throughput is important if client-server traffic is high, so I recommend a 10 Gbps network. How about software such as the operating system and the DB2 version? Follow the best-practices recommendations and keep current with maintenance levels.

The above recommendations will help you build a solid foundation and get the most out of DB2 with BLU Acceleration. You can then sit back and enjoy the speed, storage savings, and time-to-value it delivers.

About Michael

Michael Kwok is the Program Director and Architect of Analytic Warehouse Performance (dashDB, BLU Acceleration, and DB2 Warehouse) in the IBM Analytics Platform. He focuses on performance in the analytic warehouse space, helping to ensure that the products continue to deliver the best performance. He works extensively with customers and provides technical consultation. He is also one of the authors of the Best Practices paper “Optimizing Analytic Workloads using DB2 10.5 with BLU Acceleration.” Michael Kwok holds a Ph.D. degree in the area of scalability analysis of distributed systems.