Max is curious

Does software design matter?

2025-05-07T20:20:00+00:00

Does it make a difference? Can spending time on it be justified?

It’s hard to argue if it matters for a company directly, if it affects its bottom line, which is by and large the main question for most companies. While an argument can be made that thoughtful design makes it easier and faster to modify, extend, and scale software, it’s hard to measure how much this impacts a company’s profit versus an approach of carelessly written code due to the lack of quantifiable evidence.

On the other hand, building as fast as possible with minimal resource expenditure seems like a no-brainer for the goal of maximizing profit, but these tactics directly conflict with the slower, more long-term approach of software design. Does software design then actually hurt profit by standing in contradiction to these tactics, and can we conclude that software design shouldn’t ever be considered because it makes no tangible difference and only hurts company outcomes? Is there maybe some example of when software design does make a positive difference in outcome that we can use as a starting point for rationalizing its existence?

The first thing that comes to mind is contract work. When building software for a client, there’s an understanding that, once built, you hand over the software to the client. The client inherits the software, so if it’s not designed well, they inherit its fragility, rigidity, and issues. In this case, good design has a direct benefit on the long-term outcome for the software and the client inheriting it.

Abstracting this example a bit, it seems anytime software is inherited or changes hands, good design is a worthwhile investment. So then, when does software change hands? Is it only in the case of contract work? It’s obviously much broader than that: software inheritance happens when a new developer onboards to a team, a developer is assigned to extend/modify the functionality of existing software that they didn’t build, an engineer on support needs to understand a piece of software to triage a bug, and so on. Even these few examples occur frequently enough to make a case for spending time on good software design, because on the side of well-designed software the experiences of the people affected by it are intuitive, simple, pain-free, and empowering, and on the other side, they’re frustrating, demotivating, difficult, and never-ending. This makes it feel like good software design isn’t just a self-indulgent activity but makes a real impact, no matter the company or product.

While it may not matter for the company’s goals directly, good design matters to the people working with and inheriting software. The fellow engineers who don’t have to wake up to alerts, or spend weeks reverse engineering a system with no documentation in order to extend its functionality, or deep dive to understand a system to triage some bug, or newcomers who have a faster and easier time onboarding, or the entire development team who is able to build faster and with more confidence. It may not make a direct impact on profit, but it makes an impact on the number of headaches for years to come. If we assume that empathy is a worthwhile value to uphold, then the benefits of good software design are self-evident, and maybe empathy becomes a measure of how good a design is. The more people a design impacts, the better it is.

These smaller impacts of good software design build up to the larger nurturing of a tech culture of collective ownership and consensus, a culture ultimately rooted in empathy. The question then becomes, do empathy and healthy culture matter? While I can’t speak for a company’s leadership, I can say for myself that empathy is a worthwhile value and aspiration and has a ripple effect on the people around me, so I will continue to fight for and champion good software design as one way of serving and supporting my fellow peers and making the impact I can.

3D Models and Animations

2021-11-23T20:20:00+00:00

Here is a short curation of 3D modeling and animation work I did in Blender.

Candy Controls

NYC Skyline

Ice Cream

Lillies

Subway

Dandies

Two-part tutorial on making an ice cream cone in Blender

Minecraft animation made in Blender

Whimsical Digital drawings

2021-11-22T20:20:00+00:00

Just two drawings I made in Photoshop a while ago :)

Girl and Boy

Blueberries and Firefly

CU-me Class Scheduler

2021-11-21T20:20:00+00:00

Project Link

Status

This project used to be live and publicly accessible but is currently archived.

Description

By senior year of college I had honed my class scheduling workflow. I used it every semester and it spanned four browser tabs:

CUNY DegreeAudit to see what courses I still need to take in my college career, including core and major requirements.
CUNYFirst for next semester’s course catalog. I mainly used the course search and primarily needed course session dates, times, and session professors.
RateMyProfessors for professor ratings. When choosing a session I needed to factor which professor was teaching it in addition to the session date/time because, when possible, it was worth comprimising on date for a better instructor.
Free College Schedule Maker for a simple interface to experiment with different schedules and keep track of chosen sessions.

Working across many tabs and experiencing CUNYFirst’s objectively bad user experience made the class scheduling process a cumbersome chore. With the skills I had accumulated by my last year of college, I decided to make a web app that integrated these three services in one experience.

Say hi to CU-me.

CU-me is a one-stop for class scheduling for all CUNY students. It works for all 22 campuses for undergraduate and graduate students. Using your CUNY login you get access to a single-page application with degree requirements on the left side, course search on the right side, and a schedule canvas in the middle. Course search results include RateMyProfessors ratings. Underlying the visual experience is a publicly queryable REST API for course searching. Without further ado, here’s a gallery:

Login page

First-time tutorial modal

Main interface

Course search results with RateMyProfessors ratings

A course session added to the schedule

Tech writeup

This is a Django application with a PostgreSQL database.

Database

The database stores a user’s saved schedule by linking a Django user with selected courses. User actions trigger background AJAX requests, which update database entries, so changes are automatically saved.

Database diagram

API

The REST API has one GET endpoint for CUNY class searching. It wraps the CUNYFirst search API without requiring login.

I tried finding various ways to query CUNYFirst without using an automated browser, but could only find an unauthenticated page that allows interactive search. I use Selenium to interact with the form on this page filling in values from the GET params of the search request. I then scrape, format, and return the results as JSON.

Frontend

When a user logs in with their CUNY credentials, I use them to log into DegreeWorks using Selenium and scrape all degree requirements that I then present on the main screen. This is also when the user’s schedule data is fetched from the database or initialized if the user is new. The Selenium process takes a long time, so I utilize Celery, an asynchronous task queue, to run it as a background task with progress status updates to the user. The login process still takes a long time, but progress is communicated transparently to the user.

The main schedule consists of multiple overlaid HTML canvases with mouse event listeners. The schedule can be downloaded as an image at the click of a button.

A more complete tech description is available here.

Enterprise-scale Firewall Decision Trees

2021-11-18T20:20:00+00:00

One of my highlight projects at Bloomberg was building a tool to conduct contextual firewall policy comparison. I ended up modifying the construction algorithm for an existing data structure, the firewall decision tree, for it to efficiently process enterprise-scale policies. This is a generalized overview of my work starting with an overview of the domain, terminology, and problem. Then, I go over various approaches culminating in the most efficient solution. A discussion of solution pitfalls and further work concludes this post.

A brief primer on firewall policies

Definition and representation

Firewall policies are collections of rules that state what action to take for certain flows. A flow is a 5-tuple consisting of (source, destination, protocol, source-port, destination-port) and some examples of actions are allow and block.

I will use CIDR notation for the source and destination IPs/subnets. Most policy rules do not contain source ports, so I will omit them in examples, instead writing a 4-tuple (source, destination, protocol, destination-port).

An example of a rule: (1.1.1.0/24, 2.2.2.0/24, tcp, 80) block

This rule blocks traffic from 1.1.1.0/24 to 2.2.2.0/24 on tcp-80, i.e. http.

Usually, to declare rules more efficiently, groups are defined, but no matter how rules are represented, they can be broken down into simple 5-tuples. For example, the composite rule

([1.1.1.0/24, 3.3.3.3], 2.2.2.0/24, tcp, [80, 443]) block, which means block http and https traffic from 1.1.1.0/24 and 3.3.3.3 to 2.2.2.0/24 can be broken down into four simple rules:

(1.1.1.0/24, 2.2.2.0/24, tcp, 80) block
(1.1.1.0/24, 2.2.2.0/24, tcp, 443) block
(3.3.3.3, 2.2.2.0/24, tcp, 80) block
(3.3.3.3, 2.2.2.0/24, tcp, 443) block

Breaking down composite rules into simple rules can be done via Cartesian product:

\(sources \times destinations \times protocols \times destination\_ports\).

Processing

Firewall rules are processed in order for matches. When a packet/flow comes in the first matching rule decides what action to take.

For example, for the set of rules in the previous example the flow (1.1.1.1, 2.2.2.2, tcp, 443) will be matched by the 2nd rule: (1.1.1.0/24, 2.2.2.0/24, tcp, 443) block, so the flow will be blocked.

Problem

A consequence of the rules being processed in order are shadowing. Shadowing is a form of redundancy when one rule partially or fully overlaps the flows of another rule.

An example of shadowing:

(1.1.1.0/24, 2.2.2.0/24, tcp, 80) allow
(1.1.1.1, 2.2.2.2, tcp, 80) block

Packets destined to 2.2.2.2 from 1.1.1.1 on tcp-80 will be allowed despite rule 2 stating they should be blocked because that flow will be matched by rule 1.

For the shadowing of one rule by another to occur there has to be an intersection between each of the fields for the two rules, so the following example will not cause shadowing since the d-ports are different:

(1.1.1.0/24, 2.2.2.0/24, tcp, 80) allow
(1.1.1.1, 2.2.2.2, tcp, 443) block

Shadowing can cause unintended changes when adding/removing/changing rules in a policy as the true effects of the change depend on the rule’s position and may not be visible via a surface-level comparison, especially when there are tens of thousands of rules, whose flow values may be abstracted behind objects with names, which is common to make rule management more understandable and intentful. Deleting a rule seems like a simple operation, but it may uninentionally expose the flows below it. Similarly, adding a rule may not only be ineffective if that rule is shadowed by preceding rules, it too can uninentionally shadow rules unless it is appended at the end of the list of rules.

A second consequence of shadowing is unoptimized policy, and as the number of rules grows longer it not only makes managing the policy cumbersome and more error-prone, but the policy size may exceed the storage capacity of some equipment like TCAMs, which have very limited storage capacity to maintain packet-processing speed.

Implementing a tool to do a flow comparison betwen two policies and return the set of unshadowed flows present in both policies, whose actions differ across the policies, is the subject of the solution.

Solution

The task of comparing two policies comprises two steps:
a. Comparing intersecting flows between the policies for differences in action
b. Reducing the differing flows to only unshadowed ones

These steps need not be completed in this order; step b) can precede step a) by first reducing the set of flows for each policy to unshadowed only and then comparing these flows with each other.

A naive approach

Implementation

An intuitive approach to attaining the list of unshadowed flows for a policy is to start with an empty set of unshadowed flows, iterate over each rule in order, and subtract the set of unshadowed flows from the flows of this rule. The resulting flows can be added to the set of unshadowed flows.

The time complexity of this solution is \(O(N^2)\) in the total number of flows. For each rule, we subtract at most \(N\) flows.

For two interscting flows, subtracting one flow from another involves subtracting the source CIDRs and destination CIDRs. CIDR subtraction is a time-consuming bit operation, which can yield more disjoint CIDRs than started with, the product of which yields more flows.

As an example, consider subtracting the flows (protocol, port omitted): (1.1.1.0/30, 1.1.10/30) and (1.1.1.1, 1.1.1.1). (1.1.1.0/30, 1.1.1.0/30) - (1.1.1.1, 1.1.1.1) = (1.1.1.0, 1.1.1.0/30), (1.1.1.1, 1.1.1.0), (1.1.1.1, 1.1.1.2/31), (1.1.1.2/31, 1.1.1.0/30).

The subtraction of two flows resulted in four new disjoint flows. The problem size expands rapdidly with each successive subtraction.

Improvements

There are several improvements we can make for each rule to improve the performance of this apprach.

Instead of subtracting all unshadowed flows from the current rule’s flow, some of which do not overlap with the flow, we can prune this set beforehand. We achieve this by using radix trees to store the unshadowed sources and destinations. This way instead of linearly scanning all the flows, we can narrow down the overlapping sources and destinations in \(log(N)\) time.

Since we must also consider that the protocols and d-ports overlap, we can bucket the flows by (protocol, d-port) pair, all flows for one (protocol, d-port) being in the same bucket. Now, for a given rule flow we only need to process the flows in the flow’s (protocol, d-port) bucket [if such a bucket exists]. And of those flows, we only need to subtract the ones, whose sources and destinations overlap with the given flow’s sources and destinations, which we find using the radix trees.

The amount of CIDR subtraction done for each flow is greatly reduced with these methods and there is an added benefit of allowing parallelization. Previously, since each rule’s unshadowed flows depended on the preceding flows, we could not parallelize the work, but, now, we can bucket all rule flows by (protocol, d-port) in advance and then process all of the buckets in parallel since there is no overlap between buckets. The rule flows within a bucket must still be processed synchronously.

Unfortunately, the above methods do not improve the asymptotic runtime of the solution and do not reduce enterprise policy processing time enough for practical use. One simple worst-case scenario that occurs frequently is the rule flow (0.0.0.0/0, 0.0.0.0/0, all, all). This usually appears at the end of a policy, the final “default drop” rule. This flow overlaps with every flow and since it is at the end we must subtract all the unshadowed flows from it. This single operation takes an impractical amount of time to seriously consider this solution.

Using firewall decision trees

Original structure and algorithms

To solve this problem I conducted an extensive literature review for existing solutions. A series of papers (Liu and Gouda 1237) described a data structure called the firewall decision tree [FDT], which had several uses:

Policy comparison
Policy optimization (shadowing analysis)
Policy testing (more efficient flow matching and coverage analysis)

This was a promising approach with benefits beyond solving this problem.

Using the FDT for policy comparison described in the paper:

Construct an FDT for each policy
Find differing flows between the policies by merging the constructed FDTs

The FDT construction process removes shadowed flows. It is more efficient than the aforementioned naive approach. The key to its efficiency is representing the fields of a flow as numeric intervals, [start, end] (Liu and Gouda 1241). Instead of representing sources and destinations as CIDRs, they are represented as numerical intervals in the range 0-2^32. Protocols can be represented numerically in the range 0-255 via this mapping. Finally, ports are numbers in the range 0-2^16.

The example flow (1.1.1.0/24, 2.2.2.0/24, tcp, 80) becomes ([16843008, 16843263], [33686016, 33686271], 16, 80).

Interval subtraction is less costly than CIDR subtraction. Indeed, no arithmetic or bit manipulation is needed to subtract two intervals: [1, 10] - [2, 4] = [1, 1], [5, 10]. For a set of overlapping intervals, we can use this algorithm to merge them while removing overlaps. The runtime complexity is \(O(NlogN)\) in the number of intervals.

The result of the construction algorithm is a tree whose levels comprise flow field intervals, i.e. level 1 is source intervals, level 2 is destination intervals, level 3 is protocol intervals, and level 4 is destination port intervals. At the end of each tree path is an action, so the actions can be considered the last level and they make up the leaves of the tree. Hence, every path of the tree is a flow, and the total number of unshadowed flows is equal to the number of paths/leaves of the tree.

A visual representation of a firewall decision tree. Internally protocols are stored as numeric intervals with equal start and end.

The construction is described on page 1243 of Liu and Gouda, but I have abridged it here:

Append each rule flow in order into the tree. Start from the tree’s root level of source intervals.
Merge flow’s intervals with overlapping tree’s intervals and create nodes for the resulting disjoint intervals accordingly.
Copy old interval subtrees to new interval nodes.
Proceed to the children (next level) of interval nodes covered by the flow’s intervals.
Repeat steps 2-3 until entire flow is merged.

Once FDTs for both policies are constructed, the paper describes the comparison algorithm, which consists of making the trees isomorphic to each other to make them comparable. Essentially, at the end of this process, all the flows in one tree exist in the other with potentially a different action (so, in fact, the trees are semi-isomoprhic). It is then a matter of iterating over all the flows to find ones with differing actions (Liu and Gouda 1245).

There are several pitfalls to this algorithm in its current state:

Deep copying subtrees is time-consuming.
Constructing an FDT for each policy and then shaping them to be semi-isomorphic is time-consuming.
There is no room for parallelization in FDT construction due to the synchronous [appending new paths to the tree] nature of the construction algorithm.

Vertical construction algorithm for a firewall policy. Compare with horizontal below.

Even though the overall efficiency of this solution is superior to the naive method present above (complexity analysis can be found at Liu and Gouda 1247), the policies it was benchmarked on (Liu and Gouda 1249) have orders of magnitude fewer rules than the enterprise policies I was working with, so I needed to rethink portions of the algorithm to improve performance.

New modifications to improve performance

My contribution to the solution was redesigning the implementation of this method to avoid any copying, avoid making the trees isomorphic to each other altogether, and utilize parallelization.

Modification 1: Instead of constructing two FDTs, one for each policy, and then making them isomorphic to each other, I immediately construct a combined tree that includes the flows of both policies. In this tree, flows have two action leaves, one from each policy. If only one of the policies contains a flow, the other action is empty. Since the construction algorithm is responsible for the majority of the runtime, this signficantly improves performance and space.

Modification 2: Instead of following a vertical construction approach of merging one flow after another to the tree, I use a horizontal level-based approach. I merge all the source intervals of all the flows first, then, for each source interval, merge all the destination intervals, then protocols, then ports. This simultaneously avoids subtree copying and opens up the potential to parallelize.

Horizontal construction algorithm for the same policy. Compare with vertical above.

The result of the construction algorithm is a tree, whose differing flows are trivial to find by iterating over all flows to filter ones with more than one action leaf and whose actions differ.

Further speed enhancements can be made by parallelizing construction and storing interval merging results in a cache so as not to solve the same interval merging problem more than once because intervals are often repeated, especially for protocols and ports, both of which have small sets of commonly used values. Target protocols in policy rules are typically tcp (6), udp (17), and icmp (1) and target ports are http (80), https (443), and dns (53). These auxiliary changes prove effective in practice.

Method	Sources	Destinations	Protocols	Ports	Total
serial	0.07	8.87	0.86	4.38	14.18
targeted parallel	0.07	1.07 (parallel)	0.86	4.38	6.38
parallel and cache	0.07	1.07 (parallel)	0.86	1.19 (cache)	3.19

A summary of FDT + speed enhancements performance on an enterprise policy with ~100,000 rules [results in minutes].

Simplifying the tree

Merging overlapping intervals during the construction process can make the flows overly granular. Presenting results is then cumbersome and the firewall decision tree takes up more space than necessary.

A tree with 24 flows even though all of the flows can be encapsulated by 2 flows with unions of intervals.

To simplify the reporting of differing flows and structure of the tree, we can union interval nodes whose subtrees are the same. Starting at the bottom of the tree at the port level, if two port nodes have the same action leaves, we can union the ports into one combined port node. Advancing up the tree to the protocol level, we check if any protocol nodes have equal subtrees, in which case we union them. We continue in this manner all the way up to the tree root. The idea of comparing child contents in this process is inspired by how Merkle trees work. The runtime complexity of this simplification algorithm is \(O(N)\) where \(N\) is the number of nodes in the tree since we visit each node once.

The result is a simpler, more understandale set of flows.

After simplification, the above tree reduces to 2 flows.

Discussion and further work

The described modifications to the firewall decision tree construction algorithm significantly improve performance and space. They enable parallelization through level-based construction and simplify the presentation of results. These enhancements make it feasible to conduct contextual firewall policy comparison at a massive scale.

A consequence of building a tree from the rules of multiple policies as described is that this process supports more than two policies. In fact, each port node can have as many action leaves as there are policies, and the algorithm simply works. The other aforementioned benefits of using firewall decision trees are maintained as well.

Further speed improvements could be made by preprocessing the flows before tree construction and splitting by (protocol, d-port) combo as mentioned in the improvements for the naive implementation. Another approach to consider is merging overlapping intervals for each field and then connecting intervals to their children to construct the levels of the tree.