
My dataset looks something like this. I want to delete a row only if all of its values are NA, not if it has at least one value.

      [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]   NA   NA   NA
[5,]    4    8   NA

In this example they were able to delete exactly what I want, but when I try it the exact same way, it doesn't work.

I've already tried their example:

data[rowSums(is.na(data)) != ncol(data),]

But the number of rows in my data does not change, unlike in their result:

       [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1   NA    4
[3,]    4    6    7
[4,]    4    8   NA

My NAs are not characters. If I ask for their class:

class(NA)
[1] "logical"

Do you know another way to do this, please?

______UPDATE_____

Maybe I said it wrong.

This is my problem, and it is why their code is not working:

mymat[rowSums(is.na(mymat)) != ncol(mymat), ]

It is because I have 3 columns with information, but everything after them is NA, like this:

Date         Product    Code   protein   fat
2016-01-01     aaa      0001      NA     NA
2016-01-01     bbb      0003      NA     NA
2016-02-01     ccc      0032      NA     NA

So the row is not entirely NA, only the part after the 3rd column. But I want to remove the entire row (columns 1:5).

Thank you!

Ana Raquel

3 Answers


Check whether this works with the updated explanation. It subsets the data.frame so that the information columns are ignored when checking for NA. I added some extra rows that contain a mix of numbers and NA.

df1 <- data.frame(Date=c("2016-01-01", "2016-01-01", "2016-02-01", "2016-03-01", "2016-03-01"),
              Product=c("aaa", "bbb", "ccc", "ddd", "eee"),
              Code=c("0001", "0003", "0032", "0005", "0007"),
              protein=c(NA, NA, NA, 5, NA),
              fat=c(NA, NA, NA, NA, 4))

# place any columns you do not want to check for NA in names.info
names.info <- c("Date", "Product", "Code")
names.check <- setdiff(names(df1), names.info)

# drop = FALSE keeps a data.frame even if only one column is checked
df1[rowSums(is.na(df1[, names.check, drop = FALSE])) != length(names.check), ]

        Date Product Code protein fat
4 2016-03-01     ddd 0005       5  NA
5 2016-03-01     eee 0007      NA   4
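
The same check can be wrapped in a small helper. This is only a sketch: the function name drop_rows_all_na and its argument names are made up for illustration and do not come from any package. It assumes the information columns are identified by name, and it uses drop = FALSE so the subset stays a data.frame even when a single column is checked.

```r
# Hypothetical helper (name and arguments are illustrative only).
# Removes rows in which every column outside info_cols is NA.
drop_rows_all_na <- function(df, info_cols) {
  check_cols <- setdiff(names(df), info_cols)
  # drop = FALSE keeps a data.frame even if check_cols has length 1
  all_na <- rowSums(is.na(df[, check_cols, drop = FALSE])) == length(check_cols)
  df[!all_na, , drop = FALSE]
}

df1 <- data.frame(Date    = c("2016-01-01", "2016-01-01", "2016-02-01", "2016-03-01", "2016-03-01"),
                  Product = c("aaa", "bbb", "ccc", "ddd", "eee"),
                  Code    = c("0001", "0003", "0032", "0005", "0007"),
                  protein = c(NA, NA, NA, 5, NA),
                  fat     = c(NA, NA, NA, NA, 4))

res <- drop_rows_all_na(df1, info_cols = c("Date", "Product", "Code"))
print(res)
```

With the example data this should keep only the ddd and eee rows, the same two rows as the one-liner above.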
manotheshark
  • Thank you for your help. However, I think I didn't explain myself very well. I have now updated my question; can you help me, please? – Ana Raquel Jan 27 '17 at 08:40
  • @AnaRaquel see updated answer as this will ignore the information columns and will work regardless of the number of data columns used – manotheshark Jan 27 '17 at 17:42

First, I would coerce the matrix to a data frame, because this is the typical ("tidy") format to store variables and observations. Then you could use the remove_empty_rows() function from the sjmisc-package:

library(sjmisc)

df <- data.frame(
  a = c(1, 1, 4, NA, 4),
  b = c(2, NA, 6, NA, 8),
  c = c(3, 4, 7, NA, NA)
)

# get row numbers of empty rows
empty_rows(df)

## [1] 4

# remove empty rows
remove_empty_rows(df)

##  A tibble: 4 × 3
##        a     b     c
##  * <dbl> <dbl> <dbl>
##  1     1     2     3
##  2     1    NA     4
##  3     4     6     7
##  4     4     8    NA

There are also functions for columns: empty_cols() and remove_empty_cols().

If you just want to keep complete cases (rows), use complete.cases():

df[complete.cases(df), ]

##   a b c
## 1 1 2 3
## 3 4 6 7
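
If you would rather not add a package, roughly the same result can be sketched in base R. This is an alternative to the sjmisc functions, not how they are implemented; the variable names empty_idx and kept are just for illustration.

```r
df <- data.frame(
  a = c(1, 1, 4, NA, 4),
  b = c(2, NA, 6, NA, 8),
  c = c(3, 4, 7, NA, NA)
)

# positions of rows in which every value is NA
empty_idx <- unname(which(rowSums(is.na(df)) == ncol(df)))

# keep rows that have at least one non-NA value
kept <- df[rowSums(!is.na(df)) > 0, ]

print(empty_idx)
print(kept)
```

Note that the base R result keeps the original row names, so the last remaining row is still labelled 5 rather than being renumbered.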
Daniel

You need to remove the as.integer() call.

mymat <- matrix(c(1:3, NA, 4:6, NA, rep(NA, 4)), ncol = 3)

Which translates to

     [,1] [,2] [,3]
[1,]    1    4   NA
[2,]    2    5   NA
[3,]    3    6   NA
[4,]   NA   NA   NA


mymat[as.integer(rowSums(is.na(mymat)) != ncol(mymat)), ]

Gives you

     [,1] [,2] [,3]
[1,]    1    4   NA
[2,]    1    4   NA
[3,]    1    4   NA

But you want

mymat[rowSums(is.na(mymat)) != ncol(mymat), ]

To get

     [,1] [,2] [,3]
[1,]    1    4   NA
[2,]    2    5   NA
[3,]    3    6   NA
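
The difference comes from how R treats the two kinds of index. A short sketch (the variable name keep is only for illustration): a logical index keeps the rows where the value is TRUE, while as.integer() turns it into the positions 1 1 1 0, which selects row 1 three times and nothing for the 0.

```r
mymat <- matrix(c(1:3, NA, 4:6, NA, rep(NA, 4)), ncol = 3)

# logical vector: is the row NOT entirely NA?
keep <- rowSums(is.na(mymat)) != ncol(mymat)

# coerced to integer, TRUE/FALSE become 1/0, i.e. row positions
as_int <- as.integer(keep)

wrong <- mymat[as_int, ]   # row 1 repeated; index 0 selects nothing
right <- mymat[keep, ]     # rows where keep is TRUE

# if integer positions are really needed, which() gives the matching rows
same <- identical(mymat[which(keep), ], right)

print(wrong)
print(right)
print(same)
```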

Cheers, Marc

Marc Flury