Beyond Bloom Filters: Approximate Concurrent State Machines Michael Mitzenmacher Joint work with Flavio Bonomi, Rina Panigrahy, Sushil Singh, George Varghese Goals of the Talk Survey some of my recent work on Bloom filters and related hashing-based data structures. But lots of other people currently working in this area an area of research in full bloom. Highlight: new results from SIGCOMM, ESA, Allerton 2006. For more technical details and experimental results, see papers at my home page. Motivation: Router State Problem Suppose each flow has a state to be tracked. Applications:

Intrusion detection Quality of service Distinguishing P2P traffic Video congestion control Potentially, lots of others! Want to track state for each flow. But compactly; routers have small space. Flow IDs can be ~100 bits. Cant keep a big lookup table for hundreds of thousands or millions of flows! Approximate Concurrent State Machines Model for ACSMs We have underlying state machine, states 1X. Lots of concurrent flows. Want to track state per flow. Dynamic: Need to insert new flows and delete terminating flows. Can allow some errors. Space, hardware-level simplicity are key.

Problems to Be Dealt With Keeping state values with small space, small probability of errors. Handling deletions. Graceful reaction to adverarial/erroneous behavior. Invalid transitions. Non-terminating flows. Could fill structure if not eventually removed. Useful to consider data structures in well-behaved systems and ill-behaved systems. Results Comparison of multiple ACSM proposals. Based on Bloom filters, d-left hashing, fingerprints. Surprisingly, d-left hashing much better! Experimental evaluation. Validates theoretical evaluation. Demonstrates viability for real systems. New construction for Bloom filters. New d-left counting Bloom filter structure. Factor of 2 or better in terms of space.

Review: Bloom Filters Given a set S = {x1,x2,x3,xn} on a universe U, want to answer queries of the form: Is y S . Bloom filter provides an answer in Constant time (time to hash). Small amount of space. But with some probability of being wrong. Alternative to hashing with interesting tradeoffs. Bloom Filters Start with an m bit array, filled with 0s. B 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 Hash each item xj in S k times. If Hi(xj) = a, set B[a] = 1.

B 0 1 0 0 1 0 1 0 0 1 1 1

0 1 1 0 To check if y is in S, check B at Hi(y). All k values must be 1. B 0 1 0 0 1 0 1

0 0 1 1 1 0 1 1 0 Possible to have a false positive; all k values are 1, but y is not in S. B 0 n items

1 0 0 1 0 1 0 0 m = cn bits 1 1 1 0

1 1 0 k hash functions False Positive Probability Pr(specific bit of filter is 0) is p ' (1 1 / m) kn e kn / m p If is fraction of 0 bits in the filter then false positive probability is (1 ) k (1 p' ) k (1 p ) k (1 e k / c ) k Approximations valid as is concentrated around E[]. Martingale argument suffices. Find optimal at k = (ln 2)m/n by calculus. So optimal fpp is about (0.6185)m/n n items

m = cn bits k hash functions False positive rate Example 0.1 0.09 0.08 m/n = 8 0.07 0.06 0.05 0.04 0.03 Opt k = 8 ln 2 = 5.45... 0.02 0.01 0 0

1 2 3 4 5 6 7 8 9 10 Hash functions n items m = cn bits

k hash functions Handling Deletions Bloom filters can handle insertions, but not deletions. xi xj B 0 1 0 0 1 0 1 0

0 1 1 1 0 1 1 0 If deleting xi means resetting 1s to 0s, then deleting xi will delete xj. Counting Bloom Filters Start with an m bit array, filled with 0s. B 0

0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 Hash each item xj in S k times. If Hi(xj) = a, add 1 to B[a]. B 0 3 0 0 1 0 2 0 0

3 2 1 0 2 1 0 To delete xj decrement the corresponding counters. B 0 2 0 0

0 0 2 0 0 3 2 1 0 1 1 0 Can obtain a corresponding Bloom filter by reducing to 0/1.

B 0 1 0 0 0 0 1 0 0 1 1 1

0 1 1 0 Counting Bloom Filters: Overflow Must choose counters large enough to avoid overflow. Poisson approximation suggests 4 bits/counter. Average load using k = (ln 2)m/n counters is ln 2. Probability a counter has load at least 16: e ln 2 (ln 2)16 / 16!6.78E 17 Failsafes possible. We assume 4 bits/counter for comparisons. ACSM Basics Operations

Insert new flow, state Modify flow state Delete a flow Lookup flow state Errors False positive: return state for non-extant flow False negative: no state for an extant flow False return: return wrong state for an extant flow Dont know: return dont know Dont know may be better than other types of errors for many applications, e.g., slow path vs. fast path. ACSM via Counting Bloom Filters Dynamically track a set of current (FlowID,FlowState) pairs using a CBF. Consider first when system is well-behaved. Insertion easy. Lookups, deletions, modifications are easy when current state is given. If not, have to search over all possible states. Slow,

and can lead to dont knows for lookups, other errors for deletions. Direct Bloom Filter (DBF) Example 0 0 1 0 2 3 0 0 2 (123456,3) 0 0

0 0 1 3 1 0 1 1 2 0 0 1 2

0 0 (123456,5) 0 0 3 1 1 1 Timing-Based Deletion Motivation: Try to turn non-terminating flow problem into an advantage. Add a 1-bit flag to each cell, and a timer. If a cell is not touched in a phase, 0 it out. Non-terminating flows eventually zeroed. Counters can be smaller or non-existent; since

deletions occur via timing. Timing-based deletion required for all of our schemes. Timer Example Timer bits 1 0 0 0 1 0 1 0 3 0

0 2 1 0 1 1 RESET 0 0 0 0 0 0

0 0 3 0 0 0 1 0 1 0 Stateful Bloom Filters Each flow hashed to k cells, like a Bloom filter. Each cell stores a state. If two flows collide at a cell, cell takes on dont know value. On lookup, as long as one cell has a state value, and there

are not contradicting state values, return state. Deletions handled by timing mechanism (or counters in well-behaved systems). Similar in spirit to [KM], Bloom filter summaries for multiple choice hash tables. Stateful Bloom Filter (SBF) Example 1 4 3 4 3 3 0 0 2 (123456,3)

1 4 5 4 5 3 1 0 1 4 ? 0 2

4 ? 0 2 (123456,5) 0 0 2 1 0 1 What We Need : A New Design These Bloom filter generalizations were not doing the job. Poor performance experimentally.

Maybe we need a new design for Bloom filters! In real life, things went the other way; we designed a new ACSM structure, and found that it led to a new Bloom filter design. Interlude : The Main Point There are alternative ways to design Bloom filter style data structures that are more effective for some variations, applications. The goal is to accomplish this while maintaining the simplicity of the Bloom filter design. For ease of programming. For ease of design in hardware. For ease of user understanding! A New Approach to Bloom Filters Folklore Bloom filter construction. Recall: Given a set S = {x1,x2,x3,xn} on a universe U, want to answer membership queries. Method: Find an n-cell perfect hash function for S. Maps set of n elements to n cells in a 1-1 manner. Then keep bit fingerprint of item in each cell. Lookups have false positive

.2 (1 / ) Advantage: each bit/item reduces false positives by a factor of 1/2, vs ln 2 for a standard Bloom filter. Negatives: Perfect hash functions non-trivial to find. Cannot handle on-line insertions. Perfect Hashing Approach Element 1 Element 2 Element 3 Element 4 Element 5 Fingerprint(4)Fingerprint(5)Fingerprint(2)Fingerprint(1)Fingerprint(3) Near-Perfect Hash Functions In [BM96], we note that d-left hashing can give near-perfect hash functions. Review: d-left Hashing Split hash table into d equal subtables. To insert, choose a bucket uniformly for each subtable. Place item in a cell in the least loaded bucket, breaking ties to the left. Properties of d-left Hashing Analyzable using both combinatorial

methods and differential equations. Maximum load very small: O(log log n). Differential equations give very, very accurate performance estimates. Maximum load is extremely close to average load for small values of d. Example of d-left hashing Consider 3-left performance. Average load 6.4 Average load 4 Load 0 1.7e-08 Load 0 2.3e-05 Load 1 5.6e-07 Load 1

6.0e-04 Load 2 1.2e-05 Load 2 1.1e-02 Load 3 2.1e-04 Load 3 1.5e-01 Load 4 3.5e-03 Load 4 6.6e-01

Load 5 5.6e-02 Load 5 1.8e-01 Load 6 4.8e-01 Load 6 2.3e-05 Load 7 4.5e-01 Load 7 5.6e-31 Load 8

6.2e-03 Load 9 4.8e-15 Near-Perfect Hash Functions In [BM96], we note that d-left hashing can give near-perfect hash functions. Useful even with deletions. Main differences Multiple buckets must be checked, and multiple cells in a bucket must be checked. Not perfect in space usage. In practice, 75% space usage is very easy. In theory, can do even better. First Design : Just d-left Hashing For a Bloom filter with n elements, use a 3-left hash table with average load 4, 60 bits per bucket divided into 6 fixed-size fingerprints of 10 bits. Overflow rare, can be ignored. False positive rate of 12 2 10 0.01171875

Vs. 0.000744 for a standard Bloom filter. Problem: Too much empty, wasted space. Other parametrizations similarly impractical. Need to avoid wasting space. Just Hashing : Picture Bucket Empty Empty 0000111111 1010101000 0001110101 1011011100 Key: Dynamic Bit Reassignment Use 64-bit buckets: 4 bit counter, 60 bits divided equally among actual fingerprints. Fingerprint size depends on bucket load. False positive rate of 0.0008937 Vs. 0.0004587 for a standard Bloom filter. DBR: Within a factor of 2.

And would be better for larger buckets. But 64 bits is a nice bucket size for hardware. Can we remove the cost of the counter? DBR : Picture Bucket 000110110101 111010100001 101010101000 101010110101 010101101011 Count : 4 Semi-Sorting Fingerprints in bucket can be in any order. Semi-sorting: keep sorted by first bit. Use counter to track #fingerprints and #fingerprints starting with 0. First bit can then be erased, implicitly given by counter info. Can extend to first two bits (or more) but

added complexity. DBR + Semi-sorting : Picture Bucket 000110110101 111010100001 101010101000 101010110101 010101101011 Count : 4,2 DBR + Semi-Sorting Results Using 64-bit buckets, 4 bit counter. Semi-sorting on loads 4 and 5. Counter only handles up to load 6. False positive rate of 0.0004477 Vs. 0.0004587 for a standard Bloom filter. This is the tradeoff point. Using 128-bit buckets, 8 bit counter, 3-left hash table with average load 6.4. Semi-sorting all loads: fpr of 0.00004529

2 bit semi-sorting for loads 6/7: fpr of 0.00002425 Vs. 0.00006713 for a standard Bloom filter. Additional Issues Futher possible improvements Group buckets to form super-buckets that share bits. Conjecture: Most further improvements are not worth it in terms of implementation cost. Moving items for better balance? Underloaded case. New structure maintains good performance. Improvements to Counting Bloom Filter Similar ideas can be used to develop an improved Counting Bloom Filter structure. Same idea: use fingerprints and a d-left hash table. Counting Bloom Filters waste lots of space. Lots of bits to record counts of 0. Our structure beats standard CBFs easily, by factors of 2 or more in space. Even without dynamic bit reassignment.

End of Interlude : Back to ACSMs How do we use this new design for ACSMs? Fingerprint Compressed Filter Each flow hashed to d choices in the table, placed at the least loaded. Fingerprint and state stored. Deletions handled by timing mechanism or explicitly. False positives/negatives can still occur (especially in ill-behaved systems). Lots of parameters: number of hash functions, cells per bucket, fingerprint size, etc. Useful for flexible design. Fingerprint Compressed Filter (FCF) Example Fingerprint State 10001110011111100 3 01110100100010111 1 01110010010101111 6

11110101001000111 11110111001001011 00011110011101101 11111111110000000 2 2 1 4 10101110010101011 2 01110010001011111 3 11100010010111110 1 x : 11110111001001011 : State 2 to State 4 Experiment Summary FCF-based ACSM is the clear winner. Better performance than less space for the others in test situations. ACSM performance seems reasonable: Sub 1% error rates with reasonable size. Conclusions and Open Questions

Approximate concurrent state machines are very practical, potentially very useful. Natural progression from set membership to functions (Bloomier filter) to state machines. What is next? Surprisingly, d-left hashing variants appear much stronger that standard Bloom filter constructions. Leads to new Bloom filter/counting Bloom filter constructions, well suited to hardware implementation. Lots more to understand. Tradeoffs of different errors at the data structure level. Impact of different errors at the application level. On the fly dynamic optimization of data structure. Reduce fingerprint bits as load increases?