Classic case of `sum` returning NA because it doesn't sum NAs [closed]


I am trying to use sum in a function, but the results are NA, which I think may be due to integer overflow. But the class of the numbers I am using is numeric.

The function is most simply

sum((columnA-columnB)^2)

A value from columnA is 0.1376146 and from columnB is 0.272

Is it the different number of decimal places? I know how to change what is displayed, but I'm not sure that will change what R uses for sum.


Answers:


Following Joshua Ulrich's comment, before saying that you have some overflow problem, you should answer these questions:

  1. How many elements are you summing? R can handle a BIG number of entries
  2. How big are the values in your vectors? Again, R can handle quite big numbers
  3. Are you summing integers or floats? If you are summing floating-point numbers (class numeric in R), you can't have an integer overflow (floats are not integers). A real integer overflow in sum returns NA together with a warning.
  4. Do you have NAs in your data? If you sum anything with NAs present, the result will be NA, unless you handle it properly.
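
The four questions above can be checked directly in the console. This is only a sketch: the vectors below are made-up data that reuse the two values from the question plus one missing entry, so replace them with your own columns.

```r
# Hypothetical data: the two values from the question, plus an NA.
columnA <- c(0.1376146, 0.5, NA)
columnB <- c(0.272, 0.25, 0.1)

nElements <- length(columnA)             # 1. how many elements are summed
valueRange <- range(columnA, na.rm = TRUE)  # 2. how big the values are
colClass <- class(columnA)               # 3. "numeric" means double, not integer
hasNA <- anyNA(columnA)                  # 4. TRUE if at least one NA is present
nMissing <- sum(is.na(columnA))          #    and how many there are

# The sum from the question is NA here only because of the missing entry.
result <- sum((columnA - columnB)^2)

# For contrast, a genuine integer overflow: the result is also NA,
# but R emits a warning (suppressed here so the script runs quietly).
overflow <- suppressWarnings(sum(c(.Machine$integer.max, 1L)))
```

If `hasNA` is TRUE and no overflow warning was printed, the NA almost certainly comes from missing data rather than from the size or precision of the numbers.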

That said, some solutions:

  • Use sum(..., na.rm = TRUE) to ignore NAs from your object (this is the simple solution; TRUE is safer than T, which can be reassigned)
  • Sum only non-NA entries: sum(yourVector[!is.na(yourVector)]) (the not so simple one)
  • If you are summing a column from a data frame, subset the data frame before summing: sum(subset(yourDataFrame, !is.na(columnToSum))$columnToSum) (this is like using a cannon to kill a mosquito)
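
Applied to the sum of squared differences from the question, the three options might look like the sketch below. The data frame and its contents are invented for illustration; only the names columnA and columnB come from the question.

```r
# Hypothetical data frame with a missing value in each column.
yourDataFrame <- data.frame(
  columnA = c(0.1376146, 0.5, NA, 0.9),
  columnB = c(0.272, 0.25, 0.1, NA)
)

# Element-wise squared differences; NA wherever either input is NA.
squaredDiff <- (yourDataFrame$columnA - yourDataFrame$columnB)^2

# Option 1: let sum() drop the NAs.
total1 <- sum(squaredDiff, na.rm = TRUE)

# Option 2: drop the NAs yourself before summing.
total2 <- sum(squaredDiff[!is.na(squaredDiff)])

# Option 3: keep only rows where both columns are present, then sum.
completeRows <- subset(yourDataFrame, !is.na(columnA) & !is.na(columnB))
total3 <- sum((completeRows$columnA - completeRows$columnB)^2)
```

All three totals agree, since each one ends up summing over the same two complete rows.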