r - Rename multiple variables at once using dplyr - Stack Overflow


I am trying to rename multiple variables at once using dplyr. Here is what the data looks like:

ID   Satisfaction_Baseline   Satisfaction_FollowUp   Satisfaction_Exit
1               1                      3                     4
2               5                      5                     5
3               5                      3                     4
4               5                      3                     2
5               5                      3                     5           

There are a lot of variables that end in '_Baseline', '_FollowUp' and '_Exit'. I would like a shorthand way to rename them all at once so the data looks more like this:

ID   Satisfaction_1          Satisfaction_2          Satisfaction_3
1               1                      3                     4
2               5                      5                     5
3               5                      3                     4
4               5                      3                     2
5               5                      3                     5  

I tried this:

rename(D_wide,  Satisfaction_1 = Satisfaction_Baseline)

This works, but is there a succinct way to write this so that every variable ending in '_Baseline', '_FollowUp' or '_Exit' ends in '_1', '_2' or '_3' respectively? Thank you!

Asked Feb 1 at 2:57 by Reap409
  • Even better, make your data tidy by removing the information about time point from the column names and storing it as values in your data. You can do this by pivoting your data into long format with, for example, columns named ID, <variable name>, Timepoint and Value. I believe your life will be easier in the long run. – Limey, Feb 1 at 9:26
  • I see. It was in long format originally; I made it wide so some variables would be easier to recode (but you may be right, it might save me some grief down the road). Maybe I'll save the step from long to wide for the end. Thank you! – Reap409, Feb 2 at 4:03
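
The long format suggested in the first comment could be sketched with tidyr's pivot_longer. This is only an illustration: it assumes every column other than ID is named <variable>_<timepoint> with exactly one underscore, and it reuses the question's sample data. The column names Variable, Timepoint and Value follow the comment's wording.

```r
library(tidyr)

# Sample data from the question
df <- data.frame(
  ID = 1:5,
  Satisfaction_Baseline = c(1L, 5L, 5L, 5L, 5L),
  Satisfaction_FollowUp = c(3L, 5L, 3L, 3L, 3L),
  Satisfaction_Exit = c(4L, 5L, 4L, 2L, 5L)
)

# Split each column name at the underscore: the first piece becomes
# Variable, the second becomes Timepoint, and the cell goes to Value.
long <- pivot_longer(
  df,
  cols = -ID,
  names_to = c("Variable", "Timepoint"),
  names_sep = "_",
  values_to = "Value"
)

long
```

In this shape the time point is ordinary data, so recoding Baseline, FollowUp and Exit to 1, 2 and 3 becomes a change to the Timepoint column rather than a renaming problem.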

3 Answers

1) dplyr Since there are only a small number of keywords (Baseline, FollowUp and Exit), we can write the function passed as the second argument of rename_with as a magrittr pipeline of three sub calls. The $ anchors each keyword to the end of the name; if the keywords do not appear elsewhere in the column names, as is the situation here, the $ signs could optionally be omitted.

library(dplyr)

df %>%
  rename_with(. %>%
    sub("Baseline$", 1, .) %>%
    sub("FollowUp$", 2, .) %>%
    sub("Exit$", 3, .)
)

giving

  ID Satisfaction_1 Satisfaction_2 Satisfaction_3
1  1              1              3              4
2  2              5              5              5
3  3              5              3              4
4  4              5              3              2
5  5              5              3              5
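
To see what the $ anchor protects against, here is a small base R sketch. The column name Exit_Reason is hypothetical and does not appear in the question; it only stands in for a name that contains a keyword somewhere other than the end.

```r
# "Exit_Reason" is a made-up column name used only for illustration
nms <- c("Satisfaction_Exit", "Exit_Reason")

anchored   <- sub("Exit$", 3, nms)  # replaces Exit only at the end of a name
unanchored <- sub("Exit", 3, nms)   # replaces the first Exit found anywhere

anchored
unanchored
```

The anchored call leaves Exit_Reason alone, while the unanchored call turns it into 3_Reason.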

2) purrr Alternatively, the same code works with set_names in place of rename_with. In this case dplyr is not used.

library(purrr)

df %>%
  set_names(. %>%
    sub("Baseline$", 1, .) %>%
    sub("FollowUp$", 2, .) %>%
    sub("Exit$", 3, .)
)

3) Base R To write df only once at the beginning, and so keep the left-to-right flow of the pipe, wrap it in a list with a single component named data. We can then refer to data as many times as needed in the following line. Note that the _ placeholder requires R 4.2 or later.

df |>
  list(data = _) |>
  with(setNames(data, names(data) |>
    sub("Baseline$", 1, x = _) |> 
    sub("FollowUp$", 2, x = _) |>
    sub("Exit$", 3, x = _)
  ))

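4) Reduce This Base R variation uses a list L whose names are the keywords and whose values are their replacements. Reduce then applies sub once per keyword, feeding the result of each call into the next. Unlike the versions above, the patterns here are not anchored with $.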

L <- list(Baseline = 1, FollowUp = 2, Exit = 3)

df |>
  list(data = _) |>
  with(setNames(data, 
    Reduce(\(x, nm) sub(nm, L[[nm]], x), names(L), init = names(data))))
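
To make the repeated application visible, the same fold can be run with accumulate = TRUE, which keeps every intermediate vector of names. This is a sketch on a plain character vector (nms stands in for names(data)) rather than part of the original answer.

```r
L <- list(Baseline = 1, FollowUp = 2, Exit = 3)
nms <- c("ID", "Satisfaction_Baseline", "Satisfaction_FollowUp", "Satisfaction_Exit")

# Third argument is the initial value; accumulate = TRUE returns a list with
# the initial names followed by the names after each successive sub call.
steps <- Reduce(\(x, nm) sub(nm, L[[nm]], x), names(L), nms, accumulate = TRUE)

steps
```

The first element is the untouched input and the last element is what the non-accumulating call in (4) passes to setNames.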

5) gsubfn Create a list L as in (4) whose names are the keywords and whose values are to replace them. Then use setNames with new names computed using gsubfn. Pass gsubfn a pattern that matches substrings not containing underscore and ending at the end of the string. It will look for matches to that pattern and any that equal a name in L will be replaced by the corresponding value in L.

library(gsubfn)

L <- list(Baseline = 1, FollowUp = 2, Exit = 3)
df |>
  list(data = _) |>
  with(setNames(data, gsubfn("[^_]+$", L, names(data))))
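
For readers who want to see what the pattern picks out without installing gsubfn, the matching and lookup can be approximated in base R. This is a sketch of the idea on a plain character vector, not a description of how gsubfn is implemented.

```r
L <- list(Baseline = 1, FollowUp = 2, Exit = 3)
nms <- c("ID", "Satisfaction_Baseline", "Satisfaction_FollowUp", "Satisfaction_Exit")

# The trailing run of non-underscore characters in each name
suffix <- regmatches(nms, regexpr("[^_]+$", nms))

# Only suffixes that are names in L get replaced; "ID" is left untouched
hit <- suffix %in% names(L)
new_nms <- nms
new_nms[hit] <- paste0(sub("[^_]+$", "", nms[hit]), unlist(L[suffix[hit]]))

suffix
new_nms
```

Note that the pattern also matches the whole of ID, since it contains no underscore, but ID is not a name in L and so it is not changed.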

Note

The input in reproducible form:

df <- data.frame(
  ID = 1:5,
  Satisfaction_Baseline = rep(c(1L, 5L), c(1L, 4L)),
  Satisfaction_FollowUp = c(3L, 5L, 3L, 3L, 3L),
  Satisfaction_Exit = c(4L, 5L, 4L, 2L, 5L)
)

rename_with() lets you apply a function to the column names, and str_replace_all() lets you substitute multiple pattern-replacement pairs:

D_wide |> rename_with(~ stringr::str_replace_all(.x, c(
  "_Baseline" = "_1",
  "_FollowUp" = "_2",
  "_Exit"     = "_3"
)))
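
Because the question mentions many variables sharing these suffixes, it may help to check the lookup on a plain character vector first. In this sketch the Mood_ names are hypothetical, and the patterns are additionally anchored with $ so that only suffixes are replaced.

```r
library(stringr)

# "Mood_*" columns are made up to stand in for the other variables
nms <- c("ID", "Satisfaction_Baseline", "Satisfaction_Exit",
         "Mood_Baseline", "Mood_FollowUp", "Mood_Exit")

# Named vector: names are regex patterns, values are replacements
lookup <- c("_Baseline$" = "_1", "_FollowUp$" = "_2", "_Exit$" = "_3")

renamed <- str_replace_all(nms, lookup)
renamed
```

The same lookup vector can then be passed inside rename_with as in the call above.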

If Satisfaction is the first part of every column name, we can use rename_with to number the columns by position:

library(dplyr)

df <- tibble(
  Satisfaction_Baseline = 1:3,
  Satisfaction_FollowUp = 1:3,
  Satisfaction_Exit = 1:3
)

# .x holds the current column names, so seq_along(.x) gives 1, 2, 3.
# Every column is renamed, so this assumes there is no ID column.
df |>
  rename_with(~ paste0("Satisfaction_", seq_along(.x)))

  Satisfaction_1 Satisfaction_2 Satisfaction_3
           <int>          <int>          <int>
1              1              1              1
2              2              2              2
3              3              3              3