High-Speed Stateful Networking Redefined with Synogate HashCache

The World's Largest, Fastest Single-Device Stateful Firewall: Changing the Game for Heavy-Duty Networks

At FPGA Conference Europe 2023, Synogate presented a stateful firewall built around Synogate HashCache, demonstrating that it can statefully filter billions of concurrent connections at an insert rate of hundreds of millions of new connections per second on a single device.

Drive revenue with faster, smarter traffic processing

Unlock future-proof, DDoS-resilient stateful processing

Bring down hardware and operating costs

FPGA

The hostless Intel® Agilex™ 7 FPGA devboard running the stateful firewall

Booth

The Synogate booth at FPGA Conference Europe 2023

Dashboard

The dashboard showing performance numbers and control inputs

Server Rack

The servers used to generate and receive network traffic

Synogate HashCache is a revolutionary enabler for hardware-accelerated, stateful networking devices. At its heart is a patent-pending algorithm, implemented entirely in hardware, that uses DRAM to offer previously unseen storage speed and capacity on a single device. It provides what has so far been a missing keystone in the technology landscape: very fast state storage with very large capacity. This opens the door to better, more efficient network processing at present and future line rates that no other solution is able to provide.

Interested in learning more? Contact us to book a demo, or follow the short tour below about stateful network processing and HashCache. To understand what Synogate HashCache is, and why it is cool, we need to take a small detour and look at how network appliances function and what stateful processing is.

Packet Processing

What network appliances see

Network traffic is transmitted as packets. Any network appliance sitting between endpoints only sees a stream of those packets, and since a network appliance usually serves many endpoints, packets from many different connections are interleaved in that stream.

Any two packets belonging to the same connection can be potentially millions of packets apart. At the same time, many protocols allow content to be arbitrarily split across packets, making it difficult for network appliances to keep track of what each connection is doing or which state it is in.

Because of this, network processing can be roughly split into two categories: stateless and stateful processing.

Stateless Processing

Processing network traffic in easy mode

Because the relevant information about a connection is spread across packets, many network appliances limit themselves to stateless processing, in which decisions are made for each individual packet in isolation. This greatly simplifies the implementation: when a packet arrives, relevant information such as header fields can be extracted, used in the decision-making process, and immediately discarded once the decision has been executed.
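As a minimal sketch of what "each packet in isolation" means, the following Python snippet decides purely on header fields of the packet at hand. The `Packet` type, the `ALLOWED_SERVICES` rule set, and the function name are hypothetical and only serve as illustration; they are not taken from any Synogate product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str  # e.g. "udp" or "tcp"


# Hypothetical rule set: allowed (protocol, destination port) pairs.
ALLOWED_SERVICES = {("udp", 53), ("tcp", 443)}


def stateless_allow(packet: Packet) -> bool:
    """Decide using only the fields of this one packet; nothing is stored."""
    return (packet.protocol, packet.dst_port) in ALLOWED_SERVICES


print(stateless_allow(Packet("10.0.0.1", "10.0.0.2", 40000, 53, "udp")))
print(stateless_allow(Packet("10.0.0.1", "10.0.0.2", 40000, 22, "tcp")))
```

Note that the function has no memory: it cannot tell whether a packet is a legitimate reply to an earlier request, which is exactly the limitation described next.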

However, since no information from prior packets is used, the capabilities of stateless processing are severely limited. Information about connection states is not available. TLS-encrypted streams spanning multiple packets cannot be decrypted by the device. Most protocols cannot be verified, translated, or terminated. The list goes on.

Stateless Capabilities:

Routing
Basic filtering
Load balancing without stable connection pinning
Echo service
No rate limiting
Nothing that is connection based

Stateful Processing

More capable but more difficult network processing

In contrast, in stateful processing, information or decisions relating to a connection (i.e. state) are stored and used to inform future decisions that pertain to the same connection. The stored state can be as simple as the connection state of a flow, but could also be the state of a TCP state machine or a regex parser.

Being able to exploit state unlocks a plethora of capabilities as it allows the network appliance to actively participate and speak or verify protocols, rather than looking at individual packets. But it also increases complexity, as state now needs to be managed and updated at very high rates. The component responsible for managing this added complexity, the state storage, is usually implemented in the form of a key-value store.
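To make the role of the key-value store concrete, here is a small software sketch, assuming a heavily simplified TCP handshake as the per-connection state. The state names, the `TRANSITIONS` table, and the `ConnectionTracker` class are hypothetical; a real tracker would also check packet direction, sequence numbers, and timeouts.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, int, str, int]


def flow_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> FlowKey:
    """Direction-independent key so both directions share one table entry."""
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (*low, *high)


# Simplified handshake: (current state, flags seen) -> next state.
TRANSITIONS = {
    ("NEW", "SYN"): "SYN_SEEN",
    ("SYN_SEEN", "SYN-ACK"): "SYN_ACK_SEEN",
    ("SYN_ACK_SEEN", "ACK"): "ESTABLISHED",
}


class ConnectionTracker:
    def __init__(self) -> None:
        self.table: Dict[FlowKey, str] = {}  # the key-value store

    def on_packet(self, src_ip, src_port, dst_ip, dst_port, flags) -> str:
        key = flow_key(src_ip, src_port, dst_ip, dst_port)
        state = self.table.get(key, "NEW")              # lookup
        state = TRANSITIONS.get((state, flags), state)  # unexpected flags keep the state
        if state != "NEW":
            self.table[key] = state                     # insert or update
        return state


tracker = ConnectionTracker()
print(tracker.on_packet("10.0.0.1", 40000, "10.0.0.2", 443, "SYN"))
print(tracker.on_packet("10.0.0.2", 443, "10.0.0.1", 40000, "SYN-ACK"))
print(tracker.on_packet("10.0.0.1", 40000, "10.0.0.2", 443, "ACK"))
```

Every packet triggers one lookup and, in most cases, one write to the table, which is why the state storage has to keep up with the full packet rate.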

Stateful Capabilities:

Stateful Firewalls
Load Balancers
High fan-in or fan-out Gateways
VPN Aggregators
Protocol Parsers
Multi-stream Regex Filters
TCP/TLS Termination Devices
and many more

Key-Value Stores

The required component to build stateful network appliances

While such a key-value store is conceptually simple, implementing it such that it can sustain modern network bandwidths with commodity memory is quite difficult.

Modern and future network appliances are expected to handle 100-400 Gbit/s of traffic. This translates to tens to hundreds of millions of packets per second, each of which requires a real-time lookup in the key-value store. To be resilient to DoS attacks or connection-heavy machine-to-machine communication, these implementations must also support inserts at similar rates. Finally, if active sessions are to be preserved even under heavy load or attack, the key-value store also needs a large storage capacity, as well as resilient replacement policies to decide which sessions to drop once the capacity is reached.
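The step from line rate to packet rate can be checked with a short calculation. The sketch below assumes Ethernet framing, where every frame additionally occupies 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte minimum inter-packet gap); the function name and the chosen frame sizes are illustrative.

```python
# Per-frame overhead on the wire for Ethernet: preamble, start-of-frame
# delimiter, and minimum inter-packet gap.
WIRE_OVERHEAD_BYTES = 7 + 1 + 12


def packets_per_second(line_rate_bit_s: float, frame_bytes: int) -> float:
    """Maximum packet rate for back-to-back frames of the given size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bit_s / bits_per_frame


for gbit in (100, 400):
    for frame in (64, 1518):  # minimum and maximum standard frame size
        mpps = packets_per_second(gbit * 1e9, frame) / 1e6
        print(f"{gbit} Gbit/s, {frame}-byte frames: {mpps:.1f} Mpps")
```

With minimum-size frames this gives roughly 148.8 million packets per second at 100 Gbit/s and 595.2 million at 400 Gbit/s; with 1518-byte frames the figures drop to about 8.1 and 32.5 million.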

When exploring the state of the art, we found that none of the existing approaches offered these ideal properties for high speed, internet facing network appliances.

Required Key-Value Store Properties:

High Lookup Rate: (10s - 100s of millions/sec)
Resilient Policy: Guaranteed state retention time
High Insert Rate: (10s - 100s of millions/sec)
High Capacity: (100s - 1000s of millions)
Thread Scalability: The ability to trade area (more workers) for more performance

Approaches compared:

CAM (Special): Content-addressable memory, a very expensive type of memory that can do very fast lookups.
Rnd (Policy): Random replacement, a replacement policy that is easy to implement but discards active entries surprisingly quickly.
LRU (Policy): Least recently used, a replacement policy that keeps active entries alive but is hard to parallelize.
RB (Tree): Red-black trees, a balanced binary search tree with guarantees on insert and retrieval speeds.
Cu. (Hash-Tables): Cuckoo hashing, a concept for resolving hash collisions in hash tables that allows for very fast lookups, but suffers from complicated inserts.
Lin.P. (Hash-Tables): Linear probing, a concept for resolving hash collisions in hash tables with decent lookup speed and a simple insert mechanism.
Synogate HashCache

Property	Requirement	Synogate HashCache
High Lookup Rate	10s - 100s of millions/sec	
Resilient Policy	Guaranteed old/inactive	Guaranteed retention time
High Insert Rate	10s - 100s of millions/sec	240M/s
High Capacity	100s - 1000s of millions	16G
Thread Scalability	Scales well with parallelization	
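To illustrate why the replacement policy matters under a flood of new connections, here is a small software sketch of the least-recently-used (LRU) policy from the comparison, built on Python's `collections.OrderedDict`. It is a generic textbook LRU table, not the Synogate HashCache algorithm; the class name, capacity, and key names are made up for this example.

```python
from collections import OrderedDict


class LruTable:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first

    def lookup(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def insert(self, key, value) -> None:
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[key] = value


table = LruTable(capacity=1000)
table.insert("active-flow", "state")
for i in range(100_000):          # flood of never-repeated keys
    table.insert(("attack", i), "state")
    if i % 500 == 0:              # the active flow sees traffic now and then
        table.lookup("active-flow")

print("active-flow" in table.entries)  # True
print(len(table.entries))              # 1000
```

The active flow survives because it is touched more often than the table turns over. The price is a single global recency order that every lookup and insert has to modify, which hints at why LRU is hard to parallelize.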

Synogate HashCache

Our solution for implementing network appliances without sacrificing speed or DoS resilience

Because of this lack of an ideal state storage solution, we developed Synogate HashCache.

Synogate HashCache is the key-value store that satisfies the demanding requirements of high-speed, internet-facing network appliances. It has two key ingredients:

A novel algorithm that allows high lookup but also high insert speeds, while guaranteeing that active connections are not discarded from the state table.
An efficient RTL implementation for FPGAs or ASICs that can scale to high speeds while simultaneously using commodity DRAM for the state table, enabling very large numbers of concurrent connections.

Synogate HashCache is a complete solution for state management inside hardware accelerators. State insertion does not require any CPU or host intervention, allowing complete offloading into the accelerator (FPGA or ASIC), and even hostless setups.

Any questions so far? Get in touch

Benefits:

Very high lookup and insertion rates
Resilient to DoS attacks
Can utilize commodity memory (DRAM)
Can scale easily to future bandwidth demands

Demo Setup

What we showed live at FPGA Conference Europe 2023

To demonstrate the capabilities of Synogate HashCache and highlight its impact on network appliances, we implemented a stateful firewall that uses Synogate HashCache for its connection tracking.

The stateful firewall with Synogate HashCache was implemented on a hostless Intel® Agilex™ 7 FPGA. Four servers utilize their combined 192 hardware threads to generate and send random UDP packets (as used in QUIC, DNS, VoIP, …), i.e. “requests”, as quickly as possible. This packet flow is passed through the FPGA, i.e. the stateful firewall, to another two servers which receive the packets and “respond” to them by reflecting them back. These “responses” are passed back through the FPGA to the original four servers.

The randomized source and destination addresses of the “requests” ensure that every request is a new connection, a worst-case scenario for stateful firewalls. Requests must adhere to a certain port range to be allowed through by the firewall (stateless check), while responses are only allowed back if they belong to a connection that was first established by a request packet (stateful check). Because of the nature of UDP, the firewall is not aware when a connection is terminated.
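The two checks described above can be sketched in a few lines of Python. This is a software illustration only, not the RTL used in the demo: the port range `ALLOWED_PORTS` is a made-up value (the actual range is not stated here), and a plain dictionary stands in for the state table.

```python
ALLOWED_PORTS = range(1024, 2048)  # hypothetical; the demo's actual range is not stated


class DemoFirewall:
    def __init__(self) -> None:
        # Stand-in for the state table: connection -> number of responses seen.
        self.connections = {}

    def on_request(self, src_ip, src_port, dst_ip, dst_port) -> bool:
        if dst_port not in ALLOWED_PORTS:                  # stateless check
            return False
        key = (src_ip, src_port, dst_ip, dst_port)
        self.connections.setdefault(key, 0)                # state table insert
        return True

    def on_response(self, src_ip, src_port, dst_ip, dst_port) -> bool:
        # A response travels in the opposite direction, so swap the endpoints.
        key = (dst_ip, dst_port, src_ip, src_port)
        if key not in self.connections:                    # stateful check
            return False
        self.connections[key] += 1                         # state table update
        return True


fw = DemoFirewall()
print(fw.on_request("192.0.2.1", 5000, "198.51.100.7", 1500))   # legal request
print(fw.on_response("198.51.100.7", 1500, "192.0.2.1", 5000))  # matching response
print(fw.on_response("198.51.100.7", 1500, "192.0.2.9", 5000))  # no such connection
```

Because every request carries fresh random addresses, each call to `on_request` creates a new entry, so the insert path rather than the lookup path dominates the load.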

The servers can be switched to produce illegal requests or illegal responses to verify the correct filtering behavior of the firewall. The packet size can be modified to control the connection rate from 8 million per second to 140 million per second.

The demo shows that Synogate HashCache, in the implemented configuration on a single device, can handle around 120 million state table inserts (request packets) and 120 million state table updates (response packets) per second. Since both operations have the same complexity, this demonstrates a peak rate of 240 million new connections per second while using commodity DRAM to hold 16 billion state entries.

Demo Specs:

Property	Value	Note
Device (FPGA)	Intel® Agilex™ 7	sponsored by Intel®
External Memory (type and amount)	3x 64GB DDR4 + 1x 16GB DDR4 (HPS channel)	4x4 channels
Utilization (of FPGA)	6.8 %	ALMs
Frequency (of Synogate HashCache implementation)	333 MHz	EMIF clock domain
Power Draw (estimated maximum)	35 W	according to Quartus, including transceivers but excluding DRAM and board components
Capacity (entries in state table)	16 billion	per entry: 8 bytes + ca. 6 bytes overhead
Speed (connection rate)	120 million/s (x2)	1x insert + 1x update per connection

High Insert Rate

How our demo firewall compares to the best on the market in terms of insert speed

Because of this high insert rate, Synogate HashCache keeps accepting new connections even under DoS attacks.

Compared to flagship network appliances on the market, the stateful firewall using Synogate HashCache outperforms the best that money can buy by orders of magnitude.

In all fairness, these network appliances are fully featured, production ready devices while our stateful firewall is just a demonstrator. But outperforming 1000-2000W multi-CPU & multi-FPGA boxes with a single < 100W hostless FPGA board at 6.8 % utilization shows the potential that Synogate HashCache can have in network appliances.

The fact that these network appliances have much higher stateless processing speeds underlines the industry’s need for a key-value store such as Synogate HashCache.

Large State Table

How our demo firewall compares to the best on the market in terms of capacity

The 16 billion state entries, together with Synogate HashCache’s DoS-resilient replacement strategy, guarantee that even in a worst-case DoS attack scenario, every connection is kept alive as long as it experiences activity at least once every 8 seconds.

The benefit of being able to utilize commodity DRAM for state storage (of which almost arbitrary amounts can be attached) becomes evident when comparing to network appliances on the market, where Synogate HashCache again outperforms its competition by orders of magnitude.

Interested?

We are more than happy to discuss your project and how we can help.

Call us: +49-30-62932062

You can also schedule a demo directly:

Book Meeting

As a little treat for reading the entire presentation, here is a one-pager with the key facts about Synogate HashCache for you to download:

Product Brief

Get in touch!

mail@synogate.com
+49-30-62932062
linkedin.com/company/synogate
https://github.com/synogate
Synogate UG (haftungsbeschränkt)
Wegedornstr. 32
12524 Berlin
Germany
Handelsregister: Amtsgericht Charlottenburg, HRB 232733
UStID-Nr.: DE347409176

The development of the Synogate HashCache IP-Core in general and this demo in particular is sponsored by the German Federal Ministry of Education and Research.


Synogate was funded in 2021 under the EXIST program by the German Federal Ministry for Economic Affairs and Energy and the European Social Fund. The aim of the European Union is for all people to have career prospects. The European Social Fund (ESF) improves employment opportunities, supports people through training and qualification, and helps reduce disadvantages in the labor market. More about the ESF at: www.esf.de