This function parses date columns in a data frame using various date formats. It standardizes the output to a specified format and provides verbose feedback on parsing results.
Usage
autoparse_dates(
data,
date_cols,
output_format = "%Y-%m-%d",
additional_format = NULL,
verbose = TRUE
)Arguments
- data
A data frame containing columns with date values to parse.
- date_cols
A character vector specifying the names of columns with date values.
- output_format
A character string specifying the desired output format for the parsed dates. Default is
"%Y-%m-%d".- additional_format
A character string specifying any additional formats that are not included in the default formats (see
available_date_formats). Default is NULL.- verbose
Logical. If
TRUE, prints messages about parsing success or failure for each column. Default isTRUE.
Details
This function supports a wide range of date formats, including:
Basic formats:
"dmY","mdY","Ymd","Ydm","dmy","mdy","ymd","ydm"Formats with time:
"dmY HMS","mdY HMS","Ymd HMS","Ydm HMS","dmy HMS","mdy HMS","ymd HMS","ydm HMS"Additional formats:
"Y-m-d","Y-m-d H:M:S","Y-m-d H:M","d-m-Y H:M:S","d.m.Y H:M:S","d.m.Y H:M","d-m-Y H:M","d.m.Y"ISO 8601 format: e.g.,
"2021-03-20T00:01:00.513+01:00"
Examples
# Define a sample data frame
df <- data.frame(
`dmY` = c("03-10-2023", "11-09-2022", "25-12-2021", "15-08-2020"),
`mdY` = c("10-03-2023", "09-11-2022", "12-25-2021", "08-15-2020"),
`Ymd` = c("2023-10-03", "2022-09-11", "2021-12-25", "2020-08-15"),
`Ydm` = c("2023-03-10", "2022-11-09", "2021-25-12", "2020-15-08"),
`dmy` = c("03-10-23", "11-09-22", "25-12-21", "15-08-20"),
`mdy` = c("10-03-23", "09-11-22", "12-25-21", "08-15-20"),
`ymd` = c("23-10-03", "22-09-11", "21-12-25", "20-08-15"),
`ydm` = c("23-03-10", "22-11-09", "21-25-12", "20-15-08"),
`dmY_HMS` = c(
"03-10-2023 14:30:00", "11-09-2022 05:45:12",
"25-12-2021 23:59:59", "15-08-2020 00:00:00"
),
`mdY_HMS` = c(
"10-03-2023 14:30:00", "09-11-2022 05:45:12",
"12-25-2021 23:59:59", "08-15-2020 00:00:00"
),
`Ymd_HMS` = c(
"2023-10-03 14:30:00", "2022-09-11 05:45:12",
"2021-12-25 23:59:59", "2020-08-15 00:00:00"
),
`Ydm_HMS` = c(
"2023-03-10 14:30:00", "2022-11-09 05:45:12",
"2021-25-12 23:59:59", "2020-15-08 00:00:00"
),
`dmy_HMS` = c(
"03-10-23 14:30:00", "11-09-22 05:45:12",
"25-12-21 23:59:59", "15-08-20 00:00:00"
),
`mdy_HMS` = c(
"10-03-23 14:30:00", "09-11-22 05:45:12",
"12-25-21 23:59:59", "08-15-20 00:00:00"
),
`ymd_HMS` = c(
"23-10-03 14:30:00", "22-09-11 05:45:12",
"21-12-25 23:59:59", "20-08-15 00:00:00"
),
`ydm_HMS` = c(
"23-03-10 14:30:00", "22-11-09 05:45:12",
"21-25-12 23:59:59", "20-15-08 00:00:00"
),
`Y-m-d` = c("2023-10-03", "2022-09-11", "2021-12-25", "2020-08-15"),
`Y-m-d_HMS` = c(
"2023-10-03 14:30:00", "2022-09-11 05:45:12",
"2021-12-25 23:59:59", "2020-08-15 00:00:00"
),
`Y-m-d_HM` = c(
"2023-10-03 14:30", "2022-09-11 05:45",
"2021-12-25 23:59", "2020-08-15 00:00"
),
`d-m-Y_HMS` = c(
"03-10-2023 14:30:00", "11-09-2022 05:45:12",
"25-12-2021 23:59:59", "15-08-2020 00:00:00"
),
`d.m.Y_HMS` = c(
"03.10.2023 14:30:00", "11.09.2022 05:45:12",
"25.12.2021 23:59:59", "15.08.2020 00:00:00"
),
`d.m.Y_HM` = c(
"03.10.2023 14:30", "11.09.2022 05:45",
"25.12.2021 23:59", "15.08.2020 00:00"
),
`d-m-Y_HM` = c(
"03-10-2023 14:30", "11-09-2022 05:45",
"25-12-2021 23:59", "15-08-2020 00:00"
),
`d.m.Y` = c("03.10.2023", "11.09.2022", "25.12.2021", "15.08.2020"),
`iso8601` = c(
"2021-03-20T00:01:00.513+01:00",
"2022-11-05T23:15:59.123+01:00",
"2023-06-15T12:30:45.789Z",
"2020-01-01T00:00:00.000-05:00"
),
`mixed_formats` = c(
"2023-10-03", "11.09.2022",
"25-12-21 23:59", "2020-08-15T00:00:00Z"
),
stringsAsFactors = FALSE
)
# Parse the date columns
parsed_df <- autoparse_dates(df,
date_cols = colnames(df),
output_format = "%Y-%m-%d"
)
#> ✔ All columns have been successfully parsed to the given format
