string - How do I replace values from a matrix to another list when certain words/conditions are met in R programming?


I have a matrix of raw data, call it supermarketproducts, with several details of supermarket products, such as the name, brand and description. I used the readxl package to import the raw data from an Excel sheet.

I want to create a list of these products, broken down by their respective properties or functions.

For example, I tried to create an empty list H of product properties such as "Heavy-duty", "Floral", "Quick-drying", etc.

If an item from the matrix, let's say Detergent A, mentions the word floral anywhere in its description, I want the name 'Detergent A' to be copied into the "Floral" sublist of my list H.

I faced two issues. Firstly, I am not sure how to write a for loop that passes through only the third column, the 'description', of my matrix. Secondly, even when I run my for loop through the entire matrix, I am unable to copy the names from the matrix into my new H list properly.

I am a complete beginner in R, so please bear with me. I appreciate any help. Thanks.

H=list('Heavy-duty'=character(),
       'Floral'=character(),
       'Quick-drying'=character()
)

for (i in 1:dim(supermarketproducts)[1]) {
  for (j in 1:dim(supermarketproducts)[2]) {
   if (supermarketproducts[i,j]=='Floral'){
     H[i] <- supermarketproducts[i,j]
   }
  }
}

As requested, this is the dput of my supermarketproducts.

structure(list(Name = c("Scentclean", "Fluent", "Detergentwash", 
"Washtime", "Simplysuds", "Surftide"), Brand = c("X", "Y", "Z", 
"A", "Brand", "C"), Description = c("Say goodbye to tough stains and hello to fresh, clean laundry", 
"There are two things that let you know your clothes are clean: they smell good and they look good, with a nice floral scent", 
"Thanks to a formula that dissolves completely every time, clothes come out looking bright and fresh, even when washed in cold water.", 
"Combining concentrated detergents, powerful stain removers and color protectors into one convenient laundry pac, providing you the ease of drop-in-and-done. Along with the refreshing, invigorating scent of Spring Meadow.", 
"Say goodbye to tough stains and hello to fresh, clean laundry for your delicates", 
"Containing 10 concentrated cleaning actives, the heavy-duty cleaning agent gets between fibers to clean hidden dirt you didn’t even know was there. Available in Tide’s beloved Original scent that infuses your laundry with floral and fruity notes."
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-6L))

Ultimately, I hope to obtain a list H like this:

[Heavy-duty] "Surftide"
[Floral] "Surftide"     "Fluent"
[Quick-drying] 

asked Jan 31 at 7:20 by Darren K, edited Jan 31 at 9:43 by ThomasIsCoding
  • Let's start at the beginning. You almost certainly don't have a matrix. readxl creates a tibble (a data.frame). Please give us a sample of your data by providing the output of dput(supermarketproducts). If that is too large, give us dput(head(supermarketproducts)). Then, show us the expected output because that is unclear from your description. I suspect the split function could be useful to you. – Roland Commented Jan 31 at 7:28
  • I appreciate the really prompt replies. I have updated my question with the tibble data and the output I intended. Is it because the functions I am trying to use in the loops do not apply to a tibble? – Darren K Commented Jan 31 at 7:46

3 Answers


This will give you your expected output, based on your test data, assuming the test data is in an object named df.

library(tidyverse)

h <- c("Heavy-duty", "Floral", "Quick-drying")
answer <- lapply(
  h,
  function(term) {
    df %>% 
      filter(str_detect(Description, fixed(term, ignore_case = TRUE))) %>% 
      pull(Name)
  }
)
names(answer) <- h
answer
$`Heavy-duty`
[1] "Surftide"

$Floral
[1] "Fluent"   "Surftide"

$`Quick-drying`
character(0)

You need the ignore_case = TRUE because the strings you need to detect do not match the case used in your Descriptions.
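
To see the effect of case on matching, here is a minimal sketch. It uses base R's grepl in place of stringr's str_detect so that it runs without any packages; the vector desc and the result names are made up for illustration, with text shortened from the question's data.

```r
# Two descriptions, shortened from the question's data (illustrative only).
desc <- c("with a nice floral scent", "fresh, clean laundry")

# Case-sensitive literal match: "Floral" with a capital F never occurs.
sensitive <- grepl("Floral", desc, fixed = TRUE)
sensitive
# [1] FALSE FALSE

# Case-insensitive match: the lowercase "floral" is found.
insensitive <- grepl("Floral", desc, ignore.case = TRUE)
insensitive
# [1]  TRUE FALSE
```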

R is a vectorised language (meaning that it is designed to work with objects of length greater than one by default). That means "if I'm thinking of using a for loop, there's probably a better way" is a useful maxim. That's the case here. We can examine every element of Description in a single function call. We loop over the various search terms using lapply.
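
As a rough illustration of what "vectorised" means here, the sketch below tests a whole Description column in one call, which also covers the "only the third column" part of the question. The data frame products is a shortened stand-in for the question's data, and the names is_floral and floral_names are assumptions for this example.

```r
# A small stand-in for the question's data (descriptions shortened).
products <- data.frame(
  Name = c("Scentclean", "Fluent", "Surftide"),
  Description = c(
    "fresh, clean laundry",
    "a nice floral scent",
    "heavy-duty cleaning agent with floral and fruity notes"
  ),
  stringsAsFactors = FALSE
)

# One call checks every description; no loop over rows or columns is needed.
is_floral <- grepl("floral", products$Description, ignore.case = TRUE)
is_floral
# [1] FALSE  TRUE  TRUE

# Logical indexing then picks the names of the matching rows.
floral_names <- products$Name[is_floral]
floral_names
# [1] "Fluent"   "Surftide"
```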

I believe that lapply usually has many advantages over for loops, not least that it uses forced rather than lazy evaluation and it removes the need for pre-initialisation of the return value.
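
For comparison, here is a sketch of a working for loop next to its lapply equivalent, in base R. The shortened data frame products and the names terms, H_loop and H_apply are assumptions for illustration, not part of the question. Note that the loop has to create its result list up front and fill it by name with [[ ]], while lapply builds the list itself.

```r
# A small stand-in for the question's data (descriptions shortened).
products <- data.frame(
  Name = c("Scentclean", "Fluent", "Surftide"),
  Description = c(
    "fresh, clean laundry",
    "a nice floral scent",
    "heavy-duty cleaning agent with floral and fruity notes"
  ),
  stringsAsFactors = FALSE
)
terms <- c("Heavy-duty", "Floral", "Quick-drying")

# for loop: the result list must exist before the loop starts.
H_loop <- setNames(vector("list", length(terms)), terms)
for (term in terms) {
  hits <- grepl(term, products$Description, ignore.case = TRUE)
  H_loop[[term]] <- products$Name[hits]  # assign by name, not by row index
}

# lapply: no pre-initialisation; names come from the named input vector.
H_apply <- lapply(setNames(terms, terms), function(term) {
  products$Name[grepl(term, products$Description, ignore.case = TRUE)]
})

H_apply$Floral
# [1] "Fluent"   "Surftide"
```

Both versions loop over the search terms only; the descriptions are handled by a single vectorised grepl call in each iteration.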

There are plenty of other Q&As on SO that go into more detail about the differences between lapply (and its siblings) and for.

I do not really understand why you call supermarket a matrix, since you have probably read it in with a read function from the {tidyverse}. Hence, the class is

> class(supermarket)
[1] "tbl_df"     "tbl"        "data.frame"

The whole {tidyverse} is not needed just to read the file. That said, it seems you are not trying to use the {tidyverse} to solve your problem. Here is how your approach can be realised in base R:

lapply(setNames(c("Heavy-duty", "Floral", "Quick-drying"), c("Heavy-duty", "Floral", "Quick-drying")), 
       \(x) supermarket$Name[grep(x, supermarket$Description, ignore.case = TRUE)])

giving

$`Heavy-duty`
[1] "Surftide"

$Floral
[1] "Fluent"   "Surftide"

$`Quick-drying`
character(0)

The

setNames(c("Heavy-duty", "Floral", "Quick-drying"), c("Heavy-duty", "Floral", "Quick-drying"))

is just a trick to avoid cluttering our environment with variables we do not need more than once. If the vector is needed again later, we can change to

H = c("Heavy-duty", "Floral", "Quick-drying")
names(H) = H 
# lapply(H, \(x) .. )
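
The reason for naming the vector with itself is that lapply carries the names of its input over to its result. A minimal sketch, using nchar as a placeholder function and the made-up names unnamed_names and named_names:

```r
H <- c("Heavy-duty", "Floral", "Quick-drying")

# Without names on the input, lapply returns an unnamed list.
unnamed_names <- names(lapply(H, nchar))
unnamed_names
# NULL

# With the vector named by itself, each result is labelled by its search term.
named_names <- names(lapply(setNames(H, H), nchar))
named_names
# [1] "Heavy-duty"   "Floral"       "Quick-drying"
```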
  • Option 1

A possible solution is to use outer + grepl (but it is not as time- or memory-efficient as @Fride's or @Limey's solutions)

with(
    supermarket,
    `dimnames<-`(
        outer(H, Description, Vectorize(grepl), ignore.case = TRUE),
        list(H, Name)
    )
)

which gives

             Scentclean Fluent Detergentwash Washtime Simplysuds Surftide
Heavy-duty        FALSE  FALSE         FALSE    FALSE      FALSE     TRUE
Floral            FALSE   TRUE         FALSE    FALSE      FALSE     TRUE
Quick-drying      FALSE  FALSE         FALSE    FALSE      FALSE    FALSE
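
If the list form from the question is wanted rather than a logical matrix, each row can be turned into the vector of column names that are TRUE. This is a sketch on a reduced data set; the shortened Description texts and the names m and hits are assumptions for illustration.

```r
H <- c("Heavy-duty", "Floral", "Quick-drying")
Name <- c("Scentclean", "Fluent", "Surftide")
Description <- c(
  "fresh, clean laundry",
  "a nice floral scent",
  "heavy-duty cleaning agent with floral and fruity notes"
)

# Same construction as in Option 1: one row per term, one column per product.
m <- outer(H, Description, Vectorize(grepl), ignore.case = TRUE)
dimnames(m) <- list(H, Name)

# For each term, keep the product names whose entry in that row is TRUE.
hits <- lapply(setNames(H, H), function(term) colnames(m)[m[term, ]])
hits$Floral
# [1] "Fluent"   "Surftide"
```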

  • Option 2

Another implementation, similar to the other existing solutions, is

lapply(
    setNames(H, H),
    \(v) with(supermarket, Name[grepl(v, Description, ignore.case = TRUE)])
)

gives

$`Heavy-duty`
[1] "Surftide"

$Floral
[1] "Fluent"   "Surftide"

$`Quick-drying`
character(0)

data

supermarket <- structure(list(Name = c(
    "Scentclean", "Fluent", "Detergentwash",
    "Washtime", "Simplysuds", "Surftide"
), Brand = c(
    "X", "Y", "Z",
    "A", "Brand", "C"
), Description = c(
    "Say goodbye to tough stains and hello to fresh, clean laundry",
    "There are two things that let you know your clothes are clean: they smell good and they look good, with a nice floral scent",
    "Thanks to a formula that dissolves completely every time, clothes come out looking bright and fresh, even when washed in cold water.",
    "Combining concentrated detergents, powerful stain removers and color protectors into one convenient laundry pac, providing you the ease of drop-in-and-done. Along with the refreshing, invigorating scent of Spring Meadow.",
    "Say goodbye to tough stains and hello to fresh, clean laundry for your delicates",
    "Containing 10 concentrated cleaning actives, the heavy-duty cleaning agent gets between fibers to clean hidden dirt you didn’t even know was there. Available in Tide’s beloved Original scent that infuses your laundry with floral and fruity notes."
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(
    NA,
    -6L
))

H <- c("Heavy-duty", "Floral", "Quick-drying")