Enterprise, Graphics, Hardware, Software Programs

Regex and CUDA Showcase Impressive GPU Performance


To those of us who took theory of computability classes, regular expressions are a common sight. We know them to be fundamental to a number of programming languages. However, most of us consider them on a theoretical basis, tracing them using DFAs (deterministic finite automatons) and NFAs (non-deterministic finite automatons), we do not necessarily think about the complexity that goes into computing them.

The way a traditional CPU works requires any search among regular expressions to process each DFA and NFA individually and in order. Multi-threading assists in this process, but even when using multiple threads, the OS kernel is limited by the number of physical cores that can be utilized at one time (after all, that is what task scheduling is for). The advantage of GPU computing using CUDA (Compute Unified Device Architecture) is the ability to implement parallel processing.

Graphics Processing Units have thousands of processing cores that can be utilized concurrently. This ability makes searches and regular expression evaluations much faster to compute, as various expressions can be tested at the same time.

Thanks to research by David Lehavi from HP Labs, we see just how much of a difference using effective GPU computing instead of CPU computing to evaluate regular expressions makes. Keep in mind, this comparison was made using Fermi-based Quadro4000 GPUs vs 2.67 Ghz Xeon CPUs (suspected model X3450), and the CPU used single threaded regular expression engines. Searches on the Xeon took almost 270 seconds to complete, whereas the Quadro completed the search in under 28 seconds. While that is only an improvement of 10x, that number jumps to about 30x when one includes factors like latency and performance per watt. Everything we have seen to date about Kepler leads us to believe that given Fermi?s performance of 30x that of a CPU, using a Kepler GPU would increase that ratio significantly.

One must also consider the other implications of such a drastic difference in performance because unlike most tasks done on a GPU, regular expressions provide an imbalanced workload. The ability to effectively and efficiently process these imbalanced workloads will undoubtedly lead to GPU computing becoming more commonplace for general use and a wider variety of workloads.