When you first start out using Pandas, it's often best to just get your feet wet and deal with problems as they come up. Then the years pass and the amazing things you've built with it start to accumulate, but you have a vague inkling that you keep making the same kinds of mistakes and that your code runs really slowly for what seem like pretty simple operations. This is when it's time to dig into the inner workings of Pandas and take your code to the next level. As with any library, the best way to optimize your code is to understand what's going on underneath the syntax.
It can be hard to know where to start, though. There are tools out there that can help boost productivity—but what exactly are these tools, and where can you find them?
In the spirit of learning (and sharing!), I recently pulled together some of the Pandas tips and tricks I’ve come across over the years. Some of these methods I’ve learned at conferences; others I’ve picked up in books or from colleagues. After running a few tutorials at Enigma, including a session with our Software Craftsmanship Guild (an internal club that promotes the learning and practice of software engineering skills), I realized this information was worth sharing more broadly.
So, here is The Enigma Guide to Avoiding an Actual Pandas Pandemonium, which digs into coding best practices, common silent failures, how to speed up your runtime, and ways to lower your memory footprint. This is a bunch of suggestions for optimizing your Pandas code, conveniently packaged together in one place.
Have thoughts on the tutorial, or tips you want to share? Let us know!