r - Gathering column values into a single column with variable name - Stack Overflow

admin2025-04-18  2

I have this simple data frame,

var_1 <- c(10, 5, 6, 0)
var_2 <- c(0, 0, 3, 0)
var_3 <- c(2, 0, 9, 0)
df <- data.frame(var_1, var_2, var_3)
df
> df
  var_1 var_2 var_3
1    10     0     2
2     5     0     0
3     6     3     9
4     0     0     0

I want to create a column gathering information from the columns named with var_. In the first row, I have 10 under var_1 and 2 under var_3. So the entry in the resulting column should be `var_1 (10), var_3(2) and so on. The rows with all zeros should be recorded as NA. The resulting data frame should look like this following:

  var_1 var_2 var_3                   res
1    10     0     2 var_1 (10), var_3 (2)
2     5     0     0             var_1 (5)
3     6     3     9             Var_2 (3)
4     0     0     0                  <NA>

I can somehow gather the variable names but cannot get the values with it. Any idea how to achieve that res column?

I have this simple data frame,

var_1 <- c(10, 5, 6, 0)
var_2 <- c(0, 0, 3, 0)
var_3 <- c(2, 0, 9, 0)
df <- data.frame(var_1, var_2, var_3)
df
> df
  var_1 var_2 var_3
1    10     0     2
2     5     0     0
3     6     3     9
4     0     0     0

I want to create a column gathering information from the columns named with var_. In the first row, I have 10 under var_1 and 2 under var_3. So the entry in the resulting column should be `var_1 (10), var_3(2) and so on. The rows with all zeros should be recorded as NA. The resulting data frame should look like this following:

  var_1 var_2 var_3                   res
1    10     0     2 var_1 (10), var_3 (2)
2     5     0     0             var_1 (5)
3     6     3     9             Var_2 (3)
4     0     0     0                  <NA>

I can somehow gather the variable names but cannot get the values with it. Any idea how to achieve that res column?

Share edited Feb 3 at 19:55 CommunityBot 11 silver badge asked Jan 30 at 14:21 JontroPothonJontroPothon 6004 silver badges10 bronze badges 1
  • 2 Why does row 3 only have var_2? Is that intended? – Andre Wildberg Commented Jan 30 at 15:26
Add a comment  | 

5 Answers 5

Reset to default 3
res <- mapply(sprintf, names(df), df, fmt = "%s (%d)")
res[df == 0] <- NA

res <- apply(res, 1, \(x) paste(na.omit(x), collapse = ", "))
res[res == ""] <- NA

df$res <- res
df
#  var_1 var_2 var_3                             res
#1    10     0     2           var_1 (10), var_3 (2)
#2     5     0     0                       var_1 (5)
#3     6     3     9 var_1 (6), var_2 (3), var_3 (9)
#4     0     0     0                            <NA>

You can try

idx <- which(df > 0, TRUE)
v <- sprintf("%s (%s)", names(df)[idx[, "col"]], df[idx])
df$res <- c(by(v, factor(idx[, "row"], levels = 1:nrow(df)), toString))

which gives

> df
  var_1 var_2 var_3                             res
1    10     0     2           var_1 (10), var_3 (2)
2     5     0     0                       var_1 (5)
3     6     3     9 var_1 (6), var_2 (3), var_3 (9)
4     0     0     0                            <NA>

Apply the indicated anonymous function to each row. It sets y to the non-zero elements in the current row and if that has nonzero length returns the indicated sprintf and otherwise NA.

If there can be other columns not to be included then replace the first argument of apply with pick(starts_with("var_")) .

library(dplyr)

df %>%
  mutate(res = apply(., 1, function(x, y = x[x != 0])
    if (length(y)) toString(sprintf("%s (%d)", names(y), y)) else NA
  ))

giving

  var_1 var_2 var_3                             res
1    10     0     2           var_1 (10), var_3 (2)
2     5     0     0                       var_1 (5)
3     6     3     9 var_1 (6), var_2 (3), var_3 (9)
4     0     0     0                            <NA>

Similar but different. Base R

transform(df, res = {
  n = names(df)
  apply(df, 1, \(x) {
    i = x > 0 
    if(any(i)) toString(sprintf('%s (%i)', n[i], x[i])) else NA 
    }) 
  })
  var_1 var_2 var_3                             res
1    10     0     2           var_1 (10), var_3 (2)
2     5     0     0                       var_1 (5)
3     6     3     9 var_1 (6), var_2 (3), var_3 (9)
4     0     0     0                            <NA>

Keep it as a list column, instead of converting to collapsed character column:

df$res <- apply(df[, grep("^var", colnames(df), value = TRUE) ], 1, function(i) i[ i != 0 ])
#df
#   var_1 var_2 var_3     res
# 1    10     0     2   10, 2
# 2     5     0     0       5
# 3     6     3     9 6, 3, 9
# 4     0     0     0   

df$res[[ 1 ]]
# var_1 var_3 
#    10     2 

sapply(df$res, "[", "var_1")
# var_1 var_1 var_1  <NA> 
#    10     5     6    NA
转载请注明原文地址:http://anycun.com/QandA/1744913634a89400.html