Stuff by Yuki

Author: Yuki

DuckDB with Polars, Pandas, and Arrow

One of DuckDB’s features is its integration with other data libraries such as pandas. DuckDB makes converting to and from other dataframes and table formats seamless. This flexibility lets users bring DuckDB into their data pipelines with ease. In this post, I’ll walk you through how…

Read from and Write to Amazon S3 in Polars

How do you work with Amazon S3 in Polars? Amazon S3 is one of the most common object stores for data projects. Because Polars is a fairly new technology, there are not many resources that explain how to work with S3. In this post, I’ll walk you through reading from and writing to…

Handling Missing Values in Polars

Checking and filling missing values is an important part of data science and analytics projects. The popular dataframe library pandas provides methods such as fillna() for this. Polars has built-in methods and expressions for working with missing values as well. This post covers ways to check for missing values as well as ways you can…

LazyFrame vs DataFrame in Polars – Performance Comparison

One of the features in Polars is LazyFrame. Polars is fast as is, and LazyFrame gives you even more optimizations. But you may wonder, “How is it different from the typical DataFrame or EagerFrame?” or “What is LazyFrame in the first place?” What is LazyFrame in Polars? In order to understand LazyFrame, it’s good to…

Group Rows into List in Polars

I recently encountered a situation where I wanted to consolidate or group rows per group value into a Python list. There seem to be various solutions in pandas (a few resources at the bottom), but how can you do this in Polars? There are probably multiple ways you can do it in Polars as well…

How to Calculate Percent of Total in Polars

Have you ever needed to calculate the percent of total for your data? I bet you’ve encountered situations like that many times. It’s a common analysis, and knowing how to do it will benefit you as a data professional. In this blog post, I’ll demonstrate how to do that in Polars. Here’s the link…

Conditional Logic or If-Else in Polars

One of the most common patterns in data is the need for conditional logic or if-else statements. You may want to assign a specific value when a certain condition is met. This is easily done in pandas, using numpy.where() or DataFrame.where(). But can you do it in Polars? The short answer is yes, you can do the same…

Write Better Code with Pipe in Polars

Have you encountered situations where you’re applying so many transformations to a Polars dataframe that your code is hard to follow? It’s a double-edged sword: you can do a lot in Polars, but it can also create a mess. In this post, I’ll introduce a Polars functionality, pipe(), that helps make your code…

How to Add Custom Functionality in Polars

Although Polars is one of the best Python libraries for working with data, it’s still new and lacks some functionality you may find in pandas, for example. But did you know there is a way to add your own custom functionality/method to Polars without having to go through the process and complexity of contributing to…

How to Use SQL in Polars

Introduction: There are a few ways you can use SQL in Polars. One option is to use other libraries such as DuckDB and pandas. Another option is to run SQL without using other libraries. I’ll be demonstrating the latter in this blog post. Please refer to this post on how to use DuckDB…

©2025 Stuff by Yuki