Learn how to inspect and optimize query plans in Python Polars
Category: Data Engineering

Learn Python Polars with Polars Cookbook
Polars has been steadily gaining popularity since its release. You may have already used it, or you may be considering giving it a try. This blog post covers what Polars Cookbook is, who it is for, what Polars is in the first place, and why you might benefit from…

DuckDB vs Polars – Which One Is Faster?
This article presents an unofficial benchmark of DuckDB and Polars.

Upsert and Merge with Delta Lake Tables in Python Polars
Learn how to implement upsert and merge with Delta Lake Tables in Python Polars

How to Convert String to Date or Datetime in Polars
We’ve all been there. You get your data and start analyzing it, then realize a date column is actually stored as a string. Whether that’s because of an issue in the data source or because your program didn’t read it correctly, it’s useful to know how to convert a string to a date or datetime. I’ll be…

Aggregations Over Multiple Columns in Polars
Aggregations such as sum and mean are frequently used in data science and analysis. There are cases where you might want to aggregate values across columns instead of down rows. That is, if you have two columns A and B, you want to aggregate values horizontally (across columns), not vertically (down rows). Polars allows this type of…

DuckDB with Polars, Pandas, and Arrow
One of DuckDB’s features is its integration with other data libraries such as pandas. DuckDB makes converting to and from other dataframe and table formats seamless. This flexibility lets users integrate DuckDB into their data pipelines with ease. In this post, I’ll walk you through how…

Read from and Write to Amazon S3 in Polars
How do you work with Amazon S3 in Polars? Amazon S3 is one of the most common object stores for data projects. Because Polars is a fairly new technology, there are not many resources that explain how to work with S3. In this post, I’ll walk you through reading from and writing to…

Handling Missing Values in Polars
Checking for and filling missing values is an important part of data science and analytics projects. The popular dataframe library pandas, for example, provides methods such as fillna(). Polars has built-in methods and expressions for working with missing values as well. This post covers ways to check for missing values as well as ways you can…

LazyFrame vs DataFrame in Polars – Performance Comparison
One of the features in Polars is LazyFrame. Polars is fast as is, and LazyFrame gives you even more optimizations. But you may wonder, “How is it different from the typical DataFrame or EagerFrame?” or “What is LazyFrame in the first place?” What is LazyFrame in Polars? In order to understand LazyFrame, it’s good to…