Enigma’s Applied Technologies team is responsible for delivering scalable, highly available solutions to our largest customers. We interviewed senior software engineer Clinton Monk about his experiences developing a screening API to prevent money-laundering for some of the largest U.S. financial institutions.
Broadly speaking, we're a highly collaborative team, and we share all the responsibilities. At the same time, everyone on the team is expected to own at least some features, which means working with Product to understand the requirements, writing the engineering designs, and creating the implementation plan, all while ensuring the result meets our customers’ needs.
The stakes are uniquely high for our team and our work. For one thing, we're integrated into our customers’ critical systems. If something goes wrong with our API, then it's a big deal; it’s going to affect our customers’ bottom line. We also process PII and need to handle it carefully and responsibly according to proper data handling policies. Lastly, we have a very high-throughput SLA for our API because we need to process many transactions and accounts at the same time. These all present difficult, distinct technical challenges.
It was about scalability. We needed to meet an SLA of 2000 requests per second.
First, we set up a distributed load testing suite by deploying Locust to an ECS cluster in AWS. We then added CloudWatch metrics to measure the latency of specific Python modules in our API. This let us measure current throughput as well as the effects of each change.
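As a rough illustration of per-module latency measurement, here is a minimal Python sketch. It is not the team's actual instrumentation: the `timed` decorator, the in-memory `LATENCIES_MS` store, the metric name, and the `normalize_name` function are all hypothetical, and the samples are kept in memory rather than published to CloudWatch (in a real deployment they might be sent with something like boto3's `put_metric_data`).

```python
import math
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: metric name -> list of latencies in milliseconds.
# A real setup would publish these samples to a metrics backend instead.
LATENCIES_MS = defaultdict(list)


def timed(metric_name):
    """Record the wall-clock latency of every call under metric_name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the wrapped function raises.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                LATENCIES_MS[metric_name].append(elapsed_ms)
        return wrapper
    return decorator


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


@timed("screening.normalize_name")  # hypothetical metric name
def normalize_name(name):
    """Stand-in for a module whose latency we want to track."""
    return " ".join(name.lower().split())
```

Comparing a percentile such as p95 for the same metric before and after a refactor gives a simple view of whether the change helped.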
Next, we started making changes. Testing with only one API instance, we looked for bottlenecks in the API request handler. We identified a few, refactored those functions, ran acceptance tests to confirm the functionality remained the same, and then ran the load tests to measure the improvements. We applied the same process to the gunicorn application server, testing different worker types and different numbers of workers and threads. We continued until we had addressed all of the easy or obvious improvements.
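For readers unfamiliar with gunicorn tuning, the knobs mentioned here live in a Python config file. The sketch below shows the kind of settings one might vary between load-test runs. The specific values are illustrative assumptions, not the configuration the team settled on.

```python
# gunicorn.conf.py: illustrative values only, not the team's actual settings.
import multiprocessing

# Address the application server listens on (assumed for this example).
bind = "0.0.0.0:8000"

# Worker type to compare across runs, e.g. "sync", "gthread",
# or "gevent" (the latter requires the gevent package).
worker_class = "gthread"

# (2 x cores) + 1 is the starting point suggested in the gunicorn docs;
# load tests decide whether to move up or down from there.
workers = multiprocessing.cpu_count() * 2 + 1

# Threads per worker; only used by threaded worker types such as gthread.
threads = 4
```

Changing one setting at a time and rerunning the same load test keeps the comparison between runs meaningful.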
We then started scaling out the number of API instances. In doing so, new challenges arose. We needed to scale out other parts of our stack as well, including our message queues and our Elasticsearch cluster.
The challenge came down to a lot of investigation and exploration. We had the freedom to think about the problem and then find and select tools and approaches that helped us solve it.
Single sign-on (SSO) and authentication come to mind. Our customers want to use single sign-on to access our front end. So the challenge was, how do we enable that? I needed to read up on what our options were. This led me to read a lot about AWS Cognito user pools, identity pools, and different identity provider configurations. We ended up going with SAML for SSO and designing a standalone UI with separate URLs for each customer.
For API authentication, we needed a scheme more secure than plain API keys. We went with a public-secret key pair, where the secret is never sent over the wire. We felt this was the safest option. We can issue a separate key pair for each integration, which reduces security risk and provides more granular support.
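One common way to authenticate with a secret that never crosses the wire is HMAC request signing: the client sends its public key ID plus a signature computed from the secret, and the server recomputes the signature from its own copy. The interview does not say which scheme Enigma used, so the sketch below is only an assumption about the general approach. The function names, the signed fields, and the five-minute clock-skew window are all hypothetical.

```python
import hashlib
import hmac
import time


def sign_request(secret_key, method, path, body, timestamp):
    """Return a hex HMAC-SHA256 signature over the request's key fields.

    secret_key and body are bytes; the secret itself is never transmitted.
    """
    body_hash = hashlib.sha256(body).hexdigest()
    message = "\n".join([method.upper(), path, str(timestamp), body_hash])
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_request(secret_key, method, path, body, timestamp, signature,
                   now=None, max_skew_seconds=300):
    """Server-side check: reject stale timestamps, then compare signatures."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > max_skew_seconds:
        return False  # too old or too far in the future; limits replay
    expected = sign_request(secret_key, method, path, body, timestamp)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, signature)
```

In this sketch the server would look up the secret by the public key ID sent with the request, which is also what makes it possible to issue and revoke keys per integration.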
With the challenges I’ve mentioned, it's not like there were always best practices that you just adopt and follow. There were many options. It was about doing a survey of what is possible — exploring and understanding all of those options. I had to pick the option that best met our needs and then determine the path to implement it.
All of our challenges require a lot of problem solving and an investigatory sensibility. You have to be able to holistically examine problems and options for solutions.
The work we take on is incredibly diverse. Our team owns the entire stack for our platform. An engineer might spend one day working in the Python application itself. The next day, they might be working on a data workflow in Airflow. The day after that, they might be deploying new AWS infrastructure using Terraform and CI. There’s a big range.
We’re not sales engineers. This isn’t a role where you take a pre-built solution and deploy it. We build the system. In my case, we built the API, the platform, and the architecture for our customers. If you join the team, you have ownership over the platform and the roadmap ahead. Likewise, if there are problems, we fix them. This is a team for people who want to be actively building software.
People who enjoy solving problems do well here. We look for people who are not afraid to ask questions, to probe into systems, to test their assumptions. Someone who’s curious to look into things and understand systems well enough to make better recommendations will thrive here. It’s a lot of responsibility for those who want it.
If you’re interested in learning more about career opportunities with the Applied Technologies team, check out our current openings.