Menu
Stuff by Yuki
  • Home
  • Python
  • Power BI
  • Tableau
  • Projects
    • Makeover Monday
    • Workout Wednesday
  • About
  • Contact
Stuff by Yuki

How to Convert String to Date or Datetime in Polars

Posted on September 8, 2023
Image by Chris Lawton on Unsplash

We’ve all been there. You get your data and start analyzing it then you realize a date column is actually set to string data type. Whether that’s because the issue in the data source or your program didn’t read it right, it’s beneficial to know how to convert string to date or datetime. I’ll be covering how to do that in Python Polars in this blog post!

There are a few ways to convert string to date or datetime in Polars (precisely it’s data type of Utf8).

  1. str.to_date()
  2. str.strptime()
  3. try_parse_dates or use_pyarrow when reading data (read_csv)

I’m not going to cover “use_pyarrow” parameter option as it does the same thing as “try_parse_dates”.

All my code is available in my GitHub repo.

Convert string to date

You can use all the options mentioned above when converting string to date. Note that “try_parse_dates” parses the string value to date or datetime automatically depending on the original format. e.g. if the format is something like 2023-01-01 then it parses as date, but if it’s something like 2023-01-01 00:00:00 then it parses as datetime.

With these methods, you’d need to specify the format the string date value represents. Here’s a resource you can refer to when getting the correct date/datetime/time format: https://docs.rs/chrono/latest/chrono/format/strftime/index.html

Importing polars library and check data types before converting to date/datetime:

Copy Copied Use a different Browser

import polars as pl

DATES_FILE_NAME = 'dates.csv'
DATETIMES_FILE_NAME = 'datetimes.csv'

dates_df = pl.read_csv(DATES_FILE_NAME)
datetimes_df = pl.read_csv(DATETIMES_FILE_NAME)
print(dates_df, '\n', datetimes_df)  # check original data type, which is str
"""
shape: (100, 1)
┌────────────┐
│ date       │
│ ---        │
│ str        │
╞════════════╡
│ 2023-06-11 │
│ 2023-03-24 │
│ 2023-04-22 │
│ 2023-07-07 │
│ …          │
│ 2023-06-30 │
│ 2023-03-04 │
│ 2023-02-02 │
│ 2023-03-30 │
└────────────┘ 
 shape: (100, 1)
┌─────────────────────┐
│ datetime            │
│ ---                 │
│ str                 │
╞═════════════════════╡
│ 2023-06-11 20:55:57 │
│ 2023-03-24 23:48:28 │
│ 2023-04-22 16:16:38 │
│ 2023-07-07 00:08:36 │
│ …                   │
│ 2023-06-30 02:12:02 │
│ 2023-03-04 22:12:50 │
│ 2023-02-02 21:12:45 │
│ 2023-03-30 02:57:21 │
└─────────────────────┘
"""

Converting string to date using “to_date” and “strptime”:

Copy Copied Use a different Browser

# convert string to date
# 1. to_date and strptime
a = (
    pl.read_csv(DATES_FILE_NAME)
    .with_columns(
        pl.col('date').str.to_date(format='%Y-%m-%d').alias('to_date'),
        pl.col('date').str.strptime(pl.Date, format='%Y-%m-%d').alias('strptime'),
    )
)
print(a)  # check data type
"""
shape: (100, 3)
┌────────────┬────────────┬────────────┐
│ date       ┆ to_date    ┆ strptime   │
│ ---        ┆ ---        ┆ ---        │
│ str        ┆ date       ┆ date       │
╞════════════╪════════════╪════════════╡
│ 2023-06-11 ┆ 2023-06-11 ┆ 2023-06-11 │
│ 2023-03-24 ┆ 2023-03-24 ┆ 2023-03-24 │
│ 2023-04-22 ┆ 2023-04-22 ┆ 2023-04-22 │
│ 2023-07-07 ┆ 2023-07-07 ┆ 2023-07-07 │
│ …          ┆ …          ┆ …          │
│ 2023-06-30 ┆ 2023-06-30 ┆ 2023-06-30 │
│ 2023-03-04 ┆ 2023-03-04 ┆ 2023-03-04 │
│ 2023-02-02 ┆ 2023-02-02 ┆ 2023-02-02 │
│ 2023-03-30 ┆ 2023-03-30 ┆ 2023-03-30 │
└────────────┴────────────┴────────────┘
"""

Notice the output data type is “date” instead of “str”.

Let’s see how we can do the same using “try_parse_dates”:

Copy Copied Use a different Browser

# 2. try_parse_dates when reading
b = (
    pl.read_csv(DATES_FILE_NAME, try_parse_dates=True)
)
print(b)  # check data type
"""
shape: (100, 1)
┌────────────┐
│ date       │
│ ---        │
│ date       │
╞════════════╡
│ 2023-06-11 │
│ 2023-03-24 │
│ 2023-04-22 │
│ 2023-07-07 │
│ …          │
│ 2023-06-30 │
│ 2023-03-04 │
│ 2023-02-02 │
│ 2023-03-30 │
└────────────┘
"""

Convert string to datetime or time

Using str.strptime(), you can also convert string to datetime or time:

Copy Copied Use a different Browser

# convert string to datetime / time
# 1. strptime - datetime and time
c = (
    pl.read_csv(DATETIMES_FILE_NAME)
    .with_columns(
        pl.col('datetime').str.strptime(pl.Datetime, format='%Y-%m-%d %H:%M:%S').alias('strptime_datetime'),
        pl.col('datetime').str.strptime(pl.Datetime, format='%Y-%m-%d %H:%M:%S').alias('strptime_time'),
    )
)
print(c)
"""
shape: (100, 3)
┌─────────────────────┬─────────────────────┬─────────────────────┐
│ datetime            ┆ strptime_datetime   ┆ strptime_time       │
│ ---                 ┆ ---                 ┆ ---                 │
│ str                 ┆ datetime[μs]        ┆ datetime[μs]        │
╞═════════════════════╪═════════════════════╪═════════════════════╡
│ 2023-06-11 20:55:57 ┆ 2023-06-11 20:55:57 ┆ 2023-06-11 20:55:57 │
│ 2023-03-24 23:48:28 ┆ 2023-03-24 23:48:28 ┆ 2023-03-24 23:48:28 │
│ 2023-04-22 16:16:38 ┆ 2023-04-22 16:16:38 ┆ 2023-04-22 16:16:38 │
│ 2023-07-07 00:08:36 ┆ 2023-07-07 00:08:36 ┆ 2023-07-07 00:08:36 │
│ …                   ┆ …                   ┆ …                   │
│ 2023-06-30 02:12:02 ┆ 2023-06-30 02:12:02 ┆ 2023-06-30 02:12:02 │
│ 2023-03-04 22:12:50 ┆ 2023-03-04 22:12:50 ┆ 2023-03-04 22:12:50 │
│ 2023-02-02 21:12:45 ┆ 2023-02-02 21:12:45 ┆ 2023-02-02 21:12:45 │
│ 2023-03-30 02:57:21 ┆ 2023-03-30 02:57:21 ┆ 2023-03-30 02:57:21 │
└─────────────────────┴─────────────────────┴─────────────────────┘
"""

You can convert string to datetime with “try_parse_dates” parameter option only if the string value is in datetime format. You can’t convert date in string date type to datetime with “try_parse_dates”:

Copy Copied Use a different Browser

# 2. try_parse_dates
d = (
    pl.read_csv(DATETIMES_FILE_NAME, try_parse_dates=True)
)
print(d)
"""
shape: (100, 1)
┌─────────────────────┐
│ datetime            │
│ ---                 │
│ datetime[μs]        │
╞═════════════════════╡
│ 2023-06-11 20:55:57 │
│ 2023-03-24 23:48:28 │
│ 2023-04-22 16:16:38 │
│ 2023-07-07 00:08:36 │
│ …                   │
│ 2023-06-30 02:12:02 │
│ 2023-03-04 22:12:50 │
│ 2023-02-02 21:12:45 │
│ 2023-03-30 02:57:21 │
└─────────────────────┘
"""

Summary

Dealing with date or datetime values stored as string is a common problem we encounter. The good thing is that there are multiple ways to convert string to date or datetime in Polars. And that gives you enough flexibility to choose what to use depending on your specific case.

GitHub repo

References

  • https://pola-rs.github.io/polars-book/user-guide/transformations/time-series/parsing/#parsing-dates-from-a-file
  • https://pola-rs.github.io/polars/py-polars/html/reference/expressions/api/polars.Expr.str.to_date.html
  • https://pola-rs.github.io/polars/py-polars/html/reference/expressions/api/polars.Expr.str.strptime.html
  • https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.read_csv.html

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recent Posts

  • A Complete Guide to Git Integration in Power BI
  • How to Convert String to Date or Datetime in Polars
  • Aggregations Over Multiple Columns in Polars
  • DuckDB with Polars, Pandas, and Arrow
  • Read from and Write to Amazon S3 in Polars

Popular Posts

  • A Running Total Calculation with Quick Measure in Power BI
  • How To Copy And Paste Report Page in Power BI
  • How to Fill Dates Between Start Date and End Date in Power BI (Power Query)
  • A Complete Guide to Git Integration in Power BI
  • Year-Over-Year Calculation: Time Intelligence in Power BI

connect with me

  • LinkedIn
  • Twitter
  • Github
©2023 Stuff by Yuki | Powered by SuperbThemes & WordPress