Swarm64 prepares to announce service-provider partners for database-acceleration stack
Swarm64, which raised $12.5m in a funding round led by Intel Capital at the start of the year, says it’s making good progress toward establishing its relational database accelerator, both on-premises and in the cloud. The company promises to announce a large EMEA cloud provider as a customer and partner in the October timeframe.
The 451 Take
It’s taken a while for the market to catch up with Swarm64’s plans, and we’ve seen some necessary repositioning over the past five years as things have evolved. But today the idea of specialist acceleration is well established. Swarm64 has had the time to fill out its stack of software technologies that can take full advantage of FPGA characteristics while hiding the fearsome complexities of FPGA programming. And significantly, FPGA instances in the cloud are here, and require higher-level software layers for customers to take advantage of them. Service providers in particular now have an urgent motivation to partner with companies such as Swarm64.
Swarm64 is a Norwegian company founded in 2013. It has offices in Oslo, but its headquarters is in Berlin. When 451 Research first covered the company in 2015, it was positioning itself as a chip startup providing FPGA acceleration for a broader set of big-data and analytics applications. But since then it has increasingly focused its parallel processing expertise specifically on speeding up open source relational database software, building out its software stack into a much more complete and integrated offering. The original CEO, Karsten Rönner, is still in place, as is cofounder Thomas Richter, now COO. Investors include Alliance Ventures, Target Partners and Investinor, which together contributed to an $8m series A round in September 2015. That was followed in January by a $12.5m series B round led by Intel Capital. Together with seed funding, that takes total funding up to $21m to date. Swarm64 is still a lean operation, with about 30 staff, but it’s been expanding in the US by opening a new subsidiary in Palo Alto, California.
The Swarm64 SDA (Scalable Data Accelerator) requires an Intel Programmable Acceleration Card with a 20nm Arria 10 GX FPGA (derived from Intel’s acquisition of Altera back in 2015). The PCIe-based PAC plugs into standard servers and runs the SDA software stack. Swarm64 uses the storage engine interfaces of open source databases to offload the main CPU and take advantage of FPGA chips. It’s effectively running its own storage engine and associated software stack alongside the original storage engine. The advantage is that existing databases can be used with no application changes required. MariaDB, MySQL and PostgreSQL 10 (or 11 beta) are the three databases supported. The strong price-performance ratio is what’s driving interest. Swarm64 claims a fourfold performance increase over standard PostgreSQL, for 1.7 to 1.8 times the price. Hardware costs (which are coming down) amount to a few thousand dollars, while Swarm64 charges about $1,000 per month for the software stack.
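The price-performance claim can be checked with simple arithmetic. The sketch below is a rough illustration rather than a vendor calculation: it divides the claimed fourfold speedup by the quoted cost multiple of 1.7 to 1.8. The function name is hypothetical.

```python
def perf_per_dollar_gain(speedup: float, cost_multiple: float) -> float:
    """Relative performance per unit of cost versus the baseline system."""
    if cost_multiple <= 0:
        raise ValueError("cost_multiple must be positive")
    return speedup / cost_multiple


# Figures quoted in the text: 4x performance at 1.7x to 1.8x the price.
CLAIMED_SPEEDUP = 4.0

for cost_multiple in (1.7, 1.8):
    gain = perf_per_dollar_gain(CLAIMED_SPEEDUP, cost_multiple)
    print(f"cost multiple {cost_multiple}: about {gain:.2f}x performance per dollar")
```

On those figures, the accelerated system would deliver somewhat more than twice the performance per dollar of standard PostgreSQL.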
Swarm64 says that its software stack addresses six technology areas that enable the acceleration. It’s reluctant to list them, but does call out three in particular. First, optimized columns use a proprietary indexing mechanism made for fast data ingestion, scalability and incremental query-speed improvements, based on an approximate data layout with a custom algorithm. Second, data compression is used to keep all data compressed until it is being processed, which increases the data throughput and helps mitigate the PCIe bottleneck – the larger the dataset, the more acceleration can be provided. And third, the stack implements query preprocessing and cooperative computing between FPGA and CPU. In contrast to GPUs, which typically offload and replace the CPU, the FPGA-based accelerator operates as a co-processor to the CPU. Aside from compression and decompression, which are handled directly on the FPGA, everything else supports algorithms that continue to run on the CPU, simplifying them by reducing the amount of data to cross-check.
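To see why keeping data compressed until the point of processing helps with the bus bottleneck, consider the following minimal sketch. It is not Swarm64's method: zlib stands in for whatever compression the FPGA implements, the column data is synthetic, and the 8 GB/s link rate and the function name are assumptions chosen for illustration.

```python
import zlib


def effective_rate_gb_s(link_gb_s: float, raw_bytes: int, compressed_bytes: int) -> float:
    """Uncompressed data delivered per second when only compressed bytes cross the link."""
    return link_gb_s * raw_bytes / compressed_bytes


# Synthetic, highly repetitive column values (an assumption; real ratios vary widely).
raw = b"".join(f"{i % 100:08d}".encode() for i in range(100_000))
compressed = zlib.compress(raw)

# Data stays compressed in transit and is expanded only when it is processed.
assert zlib.decompress(compressed) == raw

ratio = len(raw) / len(compressed)
rate = effective_rate_gb_s(8.0, len(raw), len(compressed))
print(f"compression ratio: {ratio:.1f}x")
print(f"effective rate on a hypothetical 8 GB/s link: {rate:.1f} GB/s")
```

The link carries the same number of bytes per second either way; the gain comes from each byte representing more rows once it is decompressed on the far side.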
Intel has claimed that FPGAs, used in conjunction with the Swarm64 acceleration stack, can boost real-time data-analytics performance by a factor of 20. While the Intel Arria 10, introduced in 2013, is a relatively mature part, Swarm64 says it will be taking advantage of newer FPGAs as they emerge, and not just from Intel – it promises additional support for Xilinx in 2019. The current version supports eight-lane PCIe Gen 3, but 16-lane Gen 4 is coming, which will potentially boost data rates fourfold. The company is also working on increasing the degree of parallelism that the database can achieve on the CPU, more fully utilizing all the CPU cores to provide a linear performance increase.
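The fourfold figure follows from the PCIe link arithmetic: Gen 4 doubles the per-lane signaling rate of Gen 3, and moving from eight lanes to 16 doubles it again. The sketch below works through the theoretical one-direction numbers; it ignores protocol overhead, so real-world throughput will be lower. The function and constant names are illustrative.

```python
# Per-lane signaling rates in GT/s. Gen 3 and Gen 4 both use 128b/130b encoding.
GT_PER_SEC = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    bits_per_sec_per_lane = GT_PER_SEC[gen] * ENCODING_EFFICIENCY
    return bits_per_sec_per_lane * lanes / 8  # 8 bits per byte


current = pcie_bandwidth_gb_s(3, 8)    # eight-lane Gen 3
planned = pcie_bandwidth_gb_s(4, 16)   # 16-lane Gen 4

print(f"Gen 3 x8:  {current:.2f} GB/s")
print(f"Gen 4 x16: {planned:.2f} GB/s")
print(f"ratio: {planned / current:.1f}x")
```

That works out to roughly 7.9 GB/s today against roughly 31.5 GB/s for the planned link.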
For MySQL and MariaDB, the SDA storage engine sits next to InnoDB or any other storage engine. For PostgreSQL, the interface is through the Foreign Data Wrapper. Tables can be shared between the FPGA and CPU efficiently. Customers don’t access the FPGA directly, and don’t even need to know it exists (aside from the purchasers and the database administrator, who just has to create the relevant database tables as Swarm tables and define optimized columns so that SDA can select just the data needed for a specific query).
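For the PostgreSQL case, the administrator's task might look something like the sketch below, which renders a standard CREATE FOREIGN TABLE statement. The statement form is ordinary PostgreSQL, but the server name `swarm64_server`, the `optimized_columns` option, the helper function and the example table are all hypothetical placeholders; the actual Swarm64 identifiers are not given here.

```python
def swarm_table_ddl(table: str, columns: dict[str, str], optimized: list[str],
                    server: str = "swarm64_server") -> str:
    """Render a PostgreSQL CREATE FOREIGN TABLE statement.

    The default server name and the 'optimized_columns' option are
    hypothetical placeholders, not confirmed Swarm64 identifiers.
    Identifiers are not quoted or escaped; this is an illustration only.
    """
    unknown = [name for name in optimized if name not in columns]
    if unknown:
        raise ValueError(f"optimized columns not in table: {unknown}")
    column_defs = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    optimized_list = ", ".join(optimized)
    return (
        f"CREATE FOREIGN TABLE {table} (\n    {column_defs}\n)\n"
        f"SERVER {server}\n"
        f"OPTIONS (optimized_columns '{optimized_list}');"
    )


# Hypothetical example table with one column marked for optimization.
ddl = swarm_table_ddl(
    "orders",
    {"id": "bigint", "order_date": "date", "amount": "numeric"},
    optimized=["order_date"],
)
print(ddl)
```

Applications would then query `orders` like any other table, which is the point of routing the accelerator through the database's existing extension interface.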
Swarm64’s technology is as applicable to cloud deployments as it is to on-premises servers. Its first major cloud win – with a large EMEA cloud provider that also has some US datacenters – is expected to be revealed in September or October. Customers will be able to rent an FPGA instance and then run the Swarm64 software stack on top of it. Running Swarm64 in a virtualized cloud environment will be an easier alternative for customers that don’t want to procure their own server and Intel Programmable Acceleration Card. There are further opportunities here for Swarm64 to have its software stack embedded in managed database or data-warehouse services (such as Amazon Redshift). The goal for Swarm64 is to become a platform play. As for on-premises deployments, system OEMs such as Dell, Fujitsu, HPE and Lenovo are now incorporating Arria 10 GX FPGAs into their mainstream server lines. They are key target platforms and potential partners for the Swarm64 stack, and some vendors might package SDA in appliance form.
Direct support for Oracle was once being considered as an option, until Swarm64 realized that most of the customer interest was coming from those looking to replace Oracle with open source alternatives. Postgres and MariaDB both provide Oracle compatibility layers and tools. Relational databases remain the backbone for data analytics in the enterprise, for both operational and customer data. Swarm64 has also seen interest from industrial users that want to control processes and assets in near real time. Healthcare, government and retail are among the key verticals.
GPUs, the majority of them from NVIDIA, are the incumbent acceleration technology in the marketplace today, and their adoption is gaining momentum at cloud service providers. GPUs are great for streaming applications and for new workloads such as machine learning and AI, but harder to optimize for in-memory applications. There are database software houses that use GPUs as their engine, and they claim some impressive performance gains. But that requires a lot of custom coding, which rules out support for standard open source database distributions. SQream DB, for instance, had to build its columnar database from the ground up. Others include BlazingDB, MapD, Kinetica and Brytlyt – the latter is based on PostgreSQL.
Mostly, however, Swarm64 competes as an alternative to traditional database software houses, such as Teradata, Oracle Exadata, legacy Netezza (IBM), Vertica (now owned by Micro Focus) and Pivotal Greenplum. The long-term threat – as well as partnership opportunities – will come from cloud providers that are implementing their own database and data-warehousing services in the cloud, and have GPUs, FPGAs and other specialist resources within their own datacenters. (Amazon, Microsoft Azure, OVH in Europe, and Alibaba and Tencent in Asia all now offer FPGA instances.) Their choice is either to partner with the likes of Swarm64 or to build their own software layers. An interesting example is the Acceleration-as-a-Service offering from Alibaba, which is also powered by Intel FPGAs. Swarm64 says that type of service doesn’t use the more sophisticated co-processor capabilities it can offer – nor do the cloud providers have an option for on-premises deployment.
Other companies that overlap with Swarm64 include Accelize (an OVH partner bringing together hardware, platforms, drivers, applications and services for FPGAs as a package for service providers), Bitfusion (a disaggregation platform for GPUs and FPGAs) and Ryft (a partner with Amazon, using its F1 FPGA accelerated instances to accelerate business analytics in the cloud and on-premises).
Source: 451 Research, Tuesday August 21st 2018, All rights reserved.