I recently encountered a situation where I wanted to consolidate or group rows per group value into a Python list. There seem to be various solutions in pandas (a few resources at the bottom), but how can you do this in Polars? There are probably multiple ways to do it in Polars as well…
Category: Data Engineering

How to Calculate Percent of Total in Polars
Have you ever needed to calculate the percent of total for your data? I bet you’ve encountered situations like that many times. It’s a common analysis, and knowing how to do it will benefit you as a data professional. In this blog post, I’ll demonstrate how to do that in Polars. Here’s the link…

Conditional Logic or If-Else in Polars
One of the most common patterns in data is the need for conditional logic or if-else statements. You may want to assign a specific value when a certain condition is met. This is easily done in pandas, using numpy.where() or the where() method on a DataFrame or Series. But can you do it in Polars? The short answer is yes, you can do the same…

Write Better Code with Pipe in Polars
Have you encountered situations where you’re applying so many transformations on a Polars dataframe that your code is hard to follow? It’s a double-edged sword: you can do a lot in Polars, but it can also create a mess. In this post, I’ll introduce a Polars functionality, pipe(), that helps make your code…

How to Add Custom Functionality in Polars
Although Polars is one of the best Python libraries to work with data, it’s still new and lacks some functionality you may find in pandas, for example. But did you know there is a way to add your own custom functionality/method to Polars without having to go through the process and complexity of contributing to…

How to Use SQL in Polars
Introduction: There are a few ways you can use SQL in Polars. One option is to use other libraries such as DuckDB and pandas. Another option is to run SQL directly, without using other libraries. I’ll be demonstrating the latter in this blog post. Please refer to this post on how to use DuckDB…

Convert DataFrame to Series in Polars
There are basically two ways to do this. Some may think that df.select(‘Your Column’) alone would work, but it doesn’t: it returns a DataFrame object, not a Series object. Source code: GitHub repo

Pandas vs Polars – Speed Comparison
Pandas and Polars: Pandas is probably the most popular library for data manipulation and analysis. If you work in data, I’m sure you have heard about it, or you may use it on a daily basis. Pandas is very useful and versatile in various data tasks; however, the main issue that’s being talked about is…

Polars with DuckDB – Using SQL in Polars
Polars and DuckDB: Polars and DuckDB seem to be the same kind of tool: we can use both for data analytics, data science, and data engineering tasks, with great performance. I’m not going into which one is better than the other, but I’d like to show you a good way to…

Read CSV Files with Polars in Python
Read a CSV file using read_csv(): If you know some pandas syntax already, then this will look very familiar to you. Polars has the same syntax to read CSV files. You can look at the Polars documentation for read_csv() for some of the parameter options. For example, in the code above, notice that the date…