Stuff by Yuki

DuckDB with Polars, Pandas, and Arrow

Posted on June 26, 2023
Image by Artem Bryzgalov on Unsplash

One of DuckDB's most useful features is its integration with other data libraries such as pandas, Polars, and PyArrow. DuckDB can query their dataframes and tables directly and convert its results back into them. That flexibility makes it easy to drop DuckDB into an existing data pipeline.

In this post, I’ll walk you through how to work with pandas, polars, and pyarrow in DuckDB.

You can find the full code in my GitHub repo.

DuckDB with Polars

Execute SQL on Polars in DuckDB – Polars to DuckDB

You can simply run a SQL query that refers to the dataframe by its Python variable name.

import polars as pl
import duckdb

data = {'ID': [1,2,3,4,5], 'Name': ['Microsoft', 'Apple', 'Netflix', 'Spotify', 'Intel']}

# duckdb on polars dataframe
pl_df = pl.DataFrame(data)
rel = duckdb.sql('select * from pl_df')
print('\nDuckDB relation from Polars df: \n', rel, type(rel))
"""
DuckDB relation from Polars df: 
 ┌───────┬───────────┐
│  ID   │   Name    │
│ int64 │  varchar  │
├───────┼───────────┤
│     1 │ Microsoft │
│     2 │ Apple     │
│     3 │ Netflix   │
│     4 │ Spotify   │
│     5 │ Intel     │
└───────┴───────────┘
<class 'duckdb.DuckDBPyRelation'>
"""

DuckDB to Polars

To convert a DuckDB relation object to a Polars dataframe, use .pl().

# duckdb to polars
pl_df_from_duckdb = rel.pl()
print('\nPolars df from DuckDB: \n', type(pl_df_from_duckdb))
"""
Polars df from DuckDB: 
<class 'polars.internals.dataframe.frame.DataFrame'>
"""

DuckDB with Pandas

Execute SQL on Pandas in DuckDB – Pandas to DuckDB

import pandas as pd
import duckdb

data = {'ID': [1,2,3,4,5], 'Name': ['Microsoft', 'Apple', 'Netflix', 'Spotify', 'Intel']}

# duckdb on pandas dataframe - pandas to duckdb
df = pd.DataFrame(data)
rel = duckdb.sql('select * from df')
print('\nDuckDB relation from Pandas df: \n', rel, type(rel))
"""
DuckDB relation from Pandas df: 
 ┌───────┬───────────┐
│  ID   │   Name    │
│ int64 │  varchar  │
├───────┼───────────┤
│     1 │ Microsoft │
│     2 │ Apple     │
│     3 │ Netflix   │
│     4 │ Spotify   │
│     5 │ Intel     │
└───────┴───────────┘
<class 'duckdb.DuckDBPyRelation'>
"""

DuckDB to Pandas

For pandas, you use .df().

# duckdb to pandas
df_from_duckdb = rel.df()
print('\nPandas df from DuckDB: \n', type(df_from_duckdb))

"""
Pandas df from DuckDB: 
<class 'pandas.core.frame.DataFrame'>
"""

DuckDB with Arrow

Execute SQL on Arrow in DuckDB – Arrow to DuckDB

import pyarrow as pa
import duckdb

data = {'ID': [1,2,3,4,5], 'Name': ['Microsoft', 'Apple', 'Netflix', 'Spotify', 'Intel']}

# duckdb on arrow table - arrow to duckdb
arrow = pa.Table.from_pydict(data)
rel = duckdb.sql('select * from arrow')
print('\nDuckDB relation from Arrow table: \n', rel, type(rel))


"""
DuckDB relation from Arrow table: 
 ┌───────┬───────────┐
│  ID   │   Name    │
│ int64 │  varchar  │
├───────┼───────────┤
│     1 │ Microsoft │
│     2 │ Apple     │
│     3 │ Netflix   │
│     4 │ Spotify   │
│     5 │ Intel     │
└───────┴───────────┘
<class 'duckdb.DuckDBPyRelation'>
"""

DuckDB to Arrow

For pyarrow, you use .arrow().

# duckdb to arrow
arrow_from_duckdb = rel.arrow()
print('\nArrow table from DuckDB: \n', type(arrow_from_duckdb))
"""
Arrow table from DuckDB: 
<class 'pyarrow.lib.Table'>
"""

Summary

As you just saw, it is super easy to use DuckDB in conjunction with pandas, polars, and pyarrow. I hope this post helps you get started using DuckDB with other data libraries!

References

  • https://duckdb.org/docs/guides/python/polars.html
  • https://duckdb.org/docs/guides/python/sql_on_pandas
  • https://duckdb.org/docs/guides/python/sql_on_arrow
