Transform NA values based on first registration and nearest values












12















I already made a similar question but now I want just to restrict the new values of NA.



I have some data like this:



Date 1   Date 2    Date 3    Date 4    Date 5   Date 6
A NA 0.1 0.2 NA 0.3 0.2
B 0.1 NA NA 0.3 0.2 0.1
C NA NA NA NA 0.3 NA
D 0.1 0.2 0.3 NA 0.1 NA
E NA NA 0.1 0.2 0.1 0.3


I would like to change the NA values of my data based on the first date a value is registered. So for example for A, the first registration is Date 2. Then I want that before that registration the values of NA in A are 0, and after the first registration the values of NA become the mean of the nearest values (mean of date 3 and 5).



In case the last value is an NA, transform it into the last registered value (as in C and D). In the case of E all NA values will become 0.



Get something like this:



Date 1   Date 2    Date 3    Date 4    Date 5   Date 6 
A 0 0.1 0.2 0.25 0.3 0.2
B 0.1 0.2 0.2 0.3 0.2 0.1
C 0 0 0 0 0.3 0.3
D 0.1 0.2 0.3 0.2 0.1 0.1
E 0 0 0.1 0.2 0.1 0.3


Can you help me? I'm not sure how to do it in R.










share|improve this question





























    12















    I already made a similar question but now I want just to restrict the new values of NA.



    I have some data like this:



    Date 1   Date 2    Date 3    Date 4    Date 5   Date 6
    A NA 0.1 0.2 NA 0.3 0.2
    B 0.1 NA NA 0.3 0.2 0.1
    C NA NA NA NA 0.3 NA
    D 0.1 0.2 0.3 NA 0.1 NA
    E NA NA 0.1 0.2 0.1 0.3


    I would like to change the NA values of my data based on the first date a value is registered. So for example for A, the first registration is Date 2. Then I want that before that registration the values of NA in A are 0, and after the first registration the values of NA become the mean of the nearest values (mean of date 3 and 5).



    In case the last value is an NA, transform it into the last registered value (as in C and D). In the case of E all NA values will become 0.



    Get something like this:



    Date 1   Date 2    Date 3    Date 4    Date 5   Date 6 
    A 0 0.1 0.2 0.25 0.3 0.2
    B 0.1 0.2 0.2 0.3 0.2 0.1
    C 0 0 0 0 0.3 0.3
    D 0.1 0.2 0.3 0.2 0.1 0.1
    E 0 0 0.1 0.2 0.1 0.3


    Can you help me? I'm not sure how to do it in R.










    share|improve this question



























      12












      12








      12


      1






      I already made a similar question but now I want just to restrict the new values of NA.



      I have some data like this:



      Date 1   Date 2    Date 3    Date 4    Date 5   Date 6
      A NA 0.1 0.2 NA 0.3 0.2
      B 0.1 NA NA 0.3 0.2 0.1
      C NA NA NA NA 0.3 NA
      D 0.1 0.2 0.3 NA 0.1 NA
      E NA NA 0.1 0.2 0.1 0.3


      I would like to change the NA values of my data based on the first date a value is registered. So for example for A, the first registration is Date 2. Then I want that before that registration the values of NA in A are 0, and after the first registration the values of NA become the mean of the nearest values (mean of date 3 and 5).



      In case the last value is an NA, transform it into the last registered value (as in C and D). In the case of E all NA values will become 0.



      Get something like this:



      Date 1   Date 2    Date 3    Date 4    Date 5   Date 6 
      A 0 0.1 0.2 0.25 0.3 0.2
      B 0.1 0.2 0.2 0.3 0.2 0.1
      C 0 0 0 0 0.3 0.3
      D 0.1 0.2 0.3 0.2 0.1 0.1
      E 0 0 0.1 0.2 0.1 0.3


      Can you help me? I'm not sure how to do it in R.










      share|improve this question
















      I already made a similar question but now I want just to restrict the new values of NA.



      I have some data like this:



      Date 1   Date 2    Date 3    Date 4    Date 5   Date 6
      A NA 0.1 0.2 NA 0.3 0.2
      B 0.1 NA NA 0.3 0.2 0.1
      C NA NA NA NA 0.3 NA
      D 0.1 0.2 0.3 NA 0.1 NA
      E NA NA 0.1 0.2 0.1 0.3


      I would like to change the NA values of my data based on the first date a value is registered. So for example for A, the first registration is Date 2. Then I want that before that registration the values of NA in A are 0, and after the first registration the values of NA become the mean of the nearest values (mean of date 3 and 5).



      In case the last value is an NA, transform it into the last registered value (as in C and D). In the case of E all NA values will become 0.



      Get something like this:



      Date 1   Date 2    Date 3    Date 4    Date 5   Date 6 
      A 0 0.1 0.2 0.25 0.3 0.2
      B 0.1 0.2 0.2 0.3 0.2 0.1
      C 0 0 0 0 0.3 0.3
      D 0.1 0.2 0.3 0.2 0.1 0.1
      E 0 0 0.1 0.2 0.1 0.3


      Can you help me? I'm not sure how to do it in R.







      r na






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Jan 10 at 15:52







      user195366

















      asked Jan 10 at 14:19









      user195366user195366

      1187




      1187
























          4 Answers
          4






          active

          oldest

          votes


















          8














          Here is a way using na.approx from the zoo package and apply with MARGIN = 1 (so this is probably not very efficient but get's the job done).



          library(zoo)
          df1 <- as.data.frame(t(apply(dat, 1, na.approx, method = "constant", f = .5, na.rm = FALSE)))


          This results in



          df1
          # V1 V2 V3 V4 V5
          #A NA 0.1 0.2 0.25 0.3
          #B 0.1 0.2 0.2 0.30 0.2
          #C NA NA NA NA 0.3
          #E NA NA 0.1 0.20 0.1


          Replace NAs and rename columns.



          df1[is.na(df1)] <- 0
          names(df1) <- names(dat)
          df1
          # Date_1 Date_2 Date_3 Date_4 Date_5
          #A 0.0 0.1 0.2 0.25 0.3
          #B 0.1 0.2 0.2 0.30 0.2
          #C 0.0 0.0 0.0 0.00 0.3
          #E 0.0 0.0 0.1 0.20 0.1




          explanation



          Given a vector



          x <- c(0.1, NA, NA, 0.3, 0.2)
          na.approx(x)


          returns x with linear interpolated values



          #[1] 0.1000000 0.1666667 0.2333333 0.3000000 0.2000000


          But OP asked for constant values so we need the argument method = "constant" from the approx function.



          na.approx(x, method = "constant") 
          # [1] 0.1 0.1 0.1 0.3 0.2


          But this is still not what OP asked for because it carries the last observation forward while you want the mean for the closest non-NA values. Therefore we need the argument f (also from approx)



          na.approx(x, method = "constant", f = .5)
          # [1] 0.1 0.2 0.2 0.3 0.2 # looks good


          From ?approx




          f : for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.




          Lastly, if we don't want to replace the NAs at the beginning and end of each row we need na.rm = FALSE.



          From ?na.approx




          na.rm : logical. If the result of the (spline) interpolation still results in NAs, should these be removed?




          data



          dat <- structure(list(Date_1 = c(NA, 0.1, NA, NA), Date_2 = c(0.1, NA, 
          NA, NA), Date_3 = c(0.2, NA, NA, 0.1), Date_4 = c(NA, 0.3, NA,
          0.2), Date_5 = c(0.3, 0.2, 0.3, 0.1)), .Names = c("Date_1", "Date_2",
          "Date_3", "Date_4", "Date_5"), class = "data.frame", row.names = c("A",
          "B", "C", "E"))


          EDIT



          If there are NAs in the last column we can replace these with the last non-NAs before we apply na.approx as shown above.



          dat$Date_6[is.na(dat$Date_6)] <- dat[cbind(1:nrow(dat),
          max.col(!is.na(dat), ties.method = "last"))][is.na(dat$Date_6)]





          share|improve this answer





















          • 1





            Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

            – Sotos
            Jan 10 at 15:09








          • 3





            @Sotos added some explaination. Best

            – markus
            Jan 10 at 15:26











          • Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

            – Sotos
            Jan 10 at 15:29






          • 3





            @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

            – markus
            Jan 10 at 15:32













          • Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

            – user195366
            Jan 10 at 15:33



















          3














          This is another possible answer, using na.locf from the zoo package.
          Edit: apply is actually not required; This solution fills in the last observed value if this value is missing.



          # create the dataframe
          Date1 <- c(NA,.1,NA,NA)
          Date2 <- c(.1, NA,NA,NA)
          Date3 <- c(.2,NA,NA,.1)
          Date4 <- c(NA,.3,NA,.2)
          Date5 <- c(.3,.2,.3,.1)
          Date6 <- c(.1,NA,NA,NA)
          df <- as.data.frame(cbind(Date1,Date2,Date3,Date4,Date5,Date6))
          rownames(df) <- c('A','B','C','D')

          > df
          Date1 Date2 Date3 Date4 Date5 Date6
          A NA 0.1 0.2 NA 0.3 0.1
          B 0.1 NA NA 0.3 0.2 NA
          C NA NA NA NA 0.3 NA
          D NA NA 0.1 0.2 0.1 NA



          # Load library
          library(zoo)
          df2 <- t(na.locf(t(df),na.rm = F)) # fill last observation carried forward
          df3 <- t(na.locf(t(df),na.rm = F, fromLast = T)) # last obs carried backward

          df4 <- (df2 + df3)/2 # mean of both dataframes

          df4 <- t(na.locf(t(df4),na.rm = F)) # fill last observation carried forward
          df4[is.na(df4)] <- 0 # NA values are 0

          Date1 Date2 Date3 Date4 Date5 Date6
          A 0.0 0.1 0.2 0.25 0.3 0.1
          B 0.1 0.2 0.2 0.30 0.2 0.2
          C 0.0 0.0 0.0 0.00 0.3 0.3
          D 0.0 0.0 0.1 0.20 0.1 0.1





          share|improve this answer

































            3














            Here's another option with base R + rollmean from zoo (clearly easy to rewrite in base R for this case with window size k = 2).



            t(apply(df, 1, function(x) {
            means <- c(0, rollmean(na.omit(x), 2), tail(na.omit(x), 1))
            replace(x, is.na(x), means[1 + cumsum(!is.na(x))[is.na(x)]])
            }))
            # Date1 Date2 Date3 Date4 Date5 Date6
            # A 0.0 0.1 0.2 0.25 0.3 0.2
            # B 0.1 0.2 0.2 0.30 0.2 0.1
            # C 0.0 0.0 0.0 0.00 0.3 0.3
            # D 0.1 0.2 0.3 0.20 0.1 0.1
            # E 0.0 0.0 0.1 0.20 0.1 0.3




            Explanation. Suppose that x is the first row of df:



            #   Date1 Date2 Date3 Date4 Date5 Date6
            # A NA 0.1 0.2 NA 0.3 0.2


            Then



            means
            # [1] 0.00 0.15 0.25 0.25 0.20


            is a vector of 0, rolling means of two the following non-NA elements, and the last non-NA element. Then all we need to do is to replace those elements of x that are is.na(x). We will replace them by the elements of means at indices 1 + cumsum(!is.na(x))[is.na(x)]. That's the trickier part. Here



            cumsum(!is.na(x))
            # [1] 0 1 2 2 3 4


            Meaning that the first element of x has seen 0 non-NA elements, while, say, the last one has seen 4 non-NA elements so far. Then



            cumsum(!is.na(x))[is.na(x)]
            # [1] 0 2


            is about those NA elements in x that we want to replace. Notice that then



            1 + cumsum(!is.na(x))[is.na(x)]
            # [1] 1 3


            corresponds to the elements of means that we want to use for replacement.






            share|improve this answer































              0














              I am finding the function below too complicated but it works, so here it goes.



              fun <- function(x){
              if(anyNA(x)){
              inx <- which(!is.na(x))
              if(inx[1] > 1) x[seq_len(inx[1] - 1)] <- 0
              prev <- inx[1]
              for(i in inx[-1]){
              if(i - prev > 1){
              m <- mean(c(x[i], x[prev]))
              while(prev < i){
              x[prev] <- m
              prev <- prev + 1
              }
              }
              prev <- i
              }
              }
              x
              }

              res <- t(apply(df1, 1, fun))
              res <- as.data.frame(res)
              res
              # Date.1 Date.2 Date.3 Date.4 Date.5
              #A 0.0 0.1 0.25 0.25 0.3
              #B 0.2 0.2 0.20 0.30 0.2
              #C 0.0 0.0 0.00 0.00 0.3
              #E 0.0 0.0 0.10 0.20 0.1


              Data.



              df1 <- read.table(text = "
              Date.1 Date.2 Date.3 Date.4 Date.5
              A NA 0.1 0.2 NA 0.3
              B 0.1 NA NA 0.3 0.2
              C NA NA NA NA 0.3
              E NA NA 0.1 0.2 0.1
              ", header = TRUE)





              share|improve this answer























                Your Answer






                StackExchange.ifUsing("editor", function () {
                StackExchange.using("externalEditor", function () {
                StackExchange.using("snippets", function () {
                StackExchange.snippets.init();
                });
                });
                }, "code-snippets");

                StackExchange.ready(function() {
                var channelOptions = {
                tags: "".split(" "),
                id: "1"
                };
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function() {
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled) {
                StackExchange.using("snippets", function() {
                createEditor();
                });
                }
                else {
                createEditor();
                }
                });

                function createEditor() {
                StackExchange.prepareEditor({
                heartbeatType: 'answer',
                autoActivateHeartbeat: false,
                convertImagesToLinks: true,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: 10,
                bindNavPrevention: true,
                postfix: "",
                imageUploader: {
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                },
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                });


                }
                });














                draft saved

                draft discarded


















                StackExchange.ready(
                function () {
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54130682%2ftransform-na-values-based-on-first-registration-and-nearest-values%23new-answer', 'question_page');
                }
                );

                Post as a guest















                Required, but never shown

























                4 Answers
                4






                active

                oldest

                votes








                4 Answers
                4






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                8














                Here is a way using na.approx from the zoo package and apply with MARGIN = 1 (so this is probably not very efficient but get's the job done).



                library(zoo)
                df1 <- as.data.frame(t(apply(dat, 1, na.approx, method = "constant", f = .5, na.rm = FALSE)))


                This results in



                df1
                # V1 V2 V3 V4 V5
                #A NA 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C NA NA NA NA 0.3
                #E NA NA 0.1 0.20 0.1


                Replace NAs and rename columns.



                df1[is.na(df1)] <- 0
                names(df1) <- names(dat)
                df1
                # Date_1 Date_2 Date_3 Date_4 Date_5
                #A 0.0 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C 0.0 0.0 0.0 0.00 0.3
                #E 0.0 0.0 0.1 0.20 0.1




                explanation



                Given a vector



                x <- c(0.1, NA, NA, 0.3, 0.2)
                na.approx(x)


                returns x with linear interpolated values



                #[1] 0.1000000 0.1666667 0.2333333 0.3000000 0.2000000


                But OP asked for constant values so we need the argument method = "constant" from the approx function.



                na.approx(x, method = "constant") 
                # [1] 0.1 0.1 0.1 0.3 0.2


                But this is still not what OP asked for because it carries the last observation forward while you want the mean for the closest non-NA values. Therefore we need the argument f (also from approx)



                na.approx(x, method = "constant", f = .5)
                # [1] 0.1 0.2 0.2 0.3 0.2 # looks good


                From ?approx




                f : for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.




                Lastly, if we don't want to replace the NAs at the beginning and end of each row we need na.rm = FALSE.



                From ?na.approx




                na.rm : logical. If the result of the (spline) interpolation still results in NAs, should these be removed?




                data



                dat <- structure(list(Date_1 = c(NA, 0.1, NA, NA), Date_2 = c(0.1, NA, 
                NA, NA), Date_3 = c(0.2, NA, NA, 0.1), Date_4 = c(NA, 0.3, NA,
                0.2), Date_5 = c(0.3, 0.2, 0.3, 0.1)), .Names = c("Date_1", "Date_2",
                "Date_3", "Date_4", "Date_5"), class = "data.frame", row.names = c("A",
                "B", "C", "E"))


                EDIT



                If there are NAs in the last column we can replace these with the last non-NAs before we apply na.approx as shown above.



                dat$Date_6[is.na(dat$Date_6)] <- dat[cbind(1:nrow(dat),
                max.col(!is.na(dat), ties.method = "last"))][is.na(dat$Date_6)]





                share|improve this answer





















                • 1





                  Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                  – Sotos
                  Jan 10 at 15:09








                • 3





                  @Sotos added some explaination. Best

                  – markus
                  Jan 10 at 15:26











                • Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                  – Sotos
                  Jan 10 at 15:29






                • 3





                  @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                  – markus
                  Jan 10 at 15:32













                • Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                  – user195366
                  Jan 10 at 15:33
















                8














                Here is a way using na.approx from the zoo package and apply with MARGIN = 1 (so this is probably not very efficient but get's the job done).



                library(zoo)
                df1 <- as.data.frame(t(apply(dat, 1, na.approx, method = "constant", f = .5, na.rm = FALSE)))


                This results in



                df1
                # V1 V2 V3 V4 V5
                #A NA 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C NA NA NA NA 0.3
                #E NA NA 0.1 0.20 0.1


                Replace NAs and rename columns.



                df1[is.na(df1)] <- 0
                names(df1) <- names(dat)
                df1
                # Date_1 Date_2 Date_3 Date_4 Date_5
                #A 0.0 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C 0.0 0.0 0.0 0.00 0.3
                #E 0.0 0.0 0.1 0.20 0.1




                explanation



                Given a vector



                x <- c(0.1, NA, NA, 0.3, 0.2)
                na.approx(x)


                returns x with linear interpolated values



                #[1] 0.1000000 0.1666667 0.2333333 0.3000000 0.2000000


                But OP asked for constant values so we need the argument method = "constant" from the approx function.



                na.approx(x, method = "constant") 
                # [1] 0.1 0.1 0.1 0.3 0.2


                But this is still not what OP asked for because it carries the last observation forward while you want the mean for the closest non-NA values. Therefore we need the argument f (also from approx)



                na.approx(x, method = "constant", f = .5)
                # [1] 0.1 0.2 0.2 0.3 0.2 # looks good


                From ?approx




                f : for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.




                Lastly, if we don't want to replace the NAs at the beginning and end of each row we need na.rm = FALSE.



                From ?na.approx




                na.rm : logical. If the result of the (spline) interpolation still results in NAs, should these be removed?




                data



                dat <- structure(list(Date_1 = c(NA, 0.1, NA, NA), Date_2 = c(0.1, NA, 
                NA, NA), Date_3 = c(0.2, NA, NA, 0.1), Date_4 = c(NA, 0.3, NA,
                0.2), Date_5 = c(0.3, 0.2, 0.3, 0.1)), .Names = c("Date_1", "Date_2",
                "Date_3", "Date_4", "Date_5"), class = "data.frame", row.names = c("A",
                "B", "C", "E"))


                EDIT



                If there are NAs in the last column we can replace these with the last non-NAs before we apply na.approx as shown above.



                dat$Date_6[is.na(dat$Date_6)] <- dat[cbind(1:nrow(dat),
                max.col(!is.na(dat), ties.method = "last"))][is.na(dat$Date_6)]





                share|improve this answer





















                • 1





                  Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                  – Sotos
                  Jan 10 at 15:09








                • 3





                  @Sotos added some explaination. Best

                  – markus
                  Jan 10 at 15:26











                • Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                  – Sotos
                  Jan 10 at 15:29






                • 3





                  @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                  – markus
                  Jan 10 at 15:32













                • Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                  – user195366
                  Jan 10 at 15:33














                8












                8








                8







                Here is a way using na.approx from the zoo package and apply with MARGIN = 1 (so this is probably not very efficient but get's the job done).



                library(zoo)
                df1 <- as.data.frame(t(apply(dat, 1, na.approx, method = "constant", f = .5, na.rm = FALSE)))


                This results in



                df1
                # V1 V2 V3 V4 V5
                #A NA 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C NA NA NA NA 0.3
                #E NA NA 0.1 0.20 0.1


                Replace NAs and rename columns.



                df1[is.na(df1)] <- 0
                names(df1) <- names(dat)
                df1
                # Date_1 Date_2 Date_3 Date_4 Date_5
                #A 0.0 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C 0.0 0.0 0.0 0.00 0.3
                #E 0.0 0.0 0.1 0.20 0.1




                explanation



                Given a vector



                x <- c(0.1, NA, NA, 0.3, 0.2)
                na.approx(x)


                returns x with linear interpolated values



                #[1] 0.1000000 0.1666667 0.2333333 0.3000000 0.2000000


                But OP asked for constant values so we need the argument method = "constant" from the approx function.



                na.approx(x, method = "constant") 
                # [1] 0.1 0.1 0.1 0.3 0.2


                But this is still not what OP asked for because it carries the last observation forward while you want the mean for the closest non-NA values. Therefore we need the argument f (also from approx)



                na.approx(x, method = "constant", f = .5)
                # [1] 0.1 0.2 0.2 0.3 0.2 # looks good


                From ?approx




                f : for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.




                Lastly, if we don't want to replace the NAs at the beginning and end of each row we need na.rm = FALSE.



                From ?na.approx




                na.rm : logical. If the result of the (spline) interpolation still results in NAs, should these be removed?




                data



                dat <- structure(list(Date_1 = c(NA, 0.1, NA, NA), Date_2 = c(0.1, NA, 
                NA, NA), Date_3 = c(0.2, NA, NA, 0.1), Date_4 = c(NA, 0.3, NA,
                0.2), Date_5 = c(0.3, 0.2, 0.3, 0.1)), .Names = c("Date_1", "Date_2",
                "Date_3", "Date_4", "Date_5"), class = "data.frame", row.names = c("A",
                "B", "C", "E"))


                EDIT



                If there are NAs in the last column we can replace these with the last non-NAs before we apply na.approx as shown above.



                dat$Date_6[is.na(dat$Date_6)] <- dat[cbind(1:nrow(dat),
                max.col(!is.na(dat), ties.method = "last"))][is.na(dat$Date_6)]





                share|improve this answer















                Here is a way using na.approx from the zoo package and apply with MARGIN = 1 (so this is probably not very efficient but get's the job done).



                library(zoo)
                df1 <- as.data.frame(t(apply(dat, 1, na.approx, method = "constant", f = .5, na.rm = FALSE)))


                This results in



                df1
                # V1 V2 V3 V4 V5
                #A NA 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C NA NA NA NA 0.3
                #E NA NA 0.1 0.20 0.1


                Replace NAs and rename columns.



                df1[is.na(df1)] <- 0
                names(df1) <- names(dat)
                df1
                # Date_1 Date_2 Date_3 Date_4 Date_5
                #A 0.0 0.1 0.2 0.25 0.3
                #B 0.1 0.2 0.2 0.30 0.2
                #C 0.0 0.0 0.0 0.00 0.3
                #E 0.0 0.0 0.1 0.20 0.1




                explanation



                Given a vector



                x <- c(0.1, NA, NA, 0.3, 0.2)
                na.approx(x)


                returns x with linear interpolated values



                #[1] 0.1000000 0.1666667 0.2333333 0.3000000 0.2000000


                But OP asked for constant values so we need the argument method = "constant" from the approx function.



                na.approx(x, method = "constant") 
                # [1] 0.1 0.1 0.1 0.3 0.2


                But this is still not what OP asked for because it carries the last observation forward while you want the mean for the closest non-NA values. Therefore we need the argument f (also from approx)



                na.approx(x, method = "constant", f = .5)
                # [1] 0.1 0.2 0.2 0.3 0.2 # looks good


                From ?approx




                f : for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.




                Lastly, if we don't want to replace the NAs at the beginning and end of each row we need na.rm = FALSE.



                From ?na.approx




                na.rm : logical. If the result of the (spline) interpolation still results in NAs, should these be removed?




                data



                dat <- structure(list(Date_1 = c(NA, 0.1, NA, NA), Date_2 = c(0.1, NA, 
                NA, NA), Date_3 = c(0.2, NA, NA, 0.1), Date_4 = c(NA, 0.3, NA,
                0.2), Date_5 = c(0.3, 0.2, 0.3, 0.1)), .Names = c("Date_1", "Date_2",
                "Date_3", "Date_4", "Date_5"), class = "data.frame", row.names = c("A",
                "B", "C", "E"))


                EDIT



                If there are NAs in the last column we can replace these with the last non-NAs before we apply na.approx as shown above.



                dat$Date_6[is.na(dat$Date_6)] <- dat[cbind(1:nrow(dat),
                max.col(!is.na(dat), ties.method = "last"))][is.na(dat$Date_6)]






                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Jan 10 at 21:34

























                answered Jan 10 at 15:06









                markusmarkus

                11.9k1234




                11.9k1234








                • 1





                  Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                  – Sotos
                  Jan 10 at 15:09








                • 3





                  @Sotos added some explaination. Best

                  – markus
                  Jan 10 at 15:26











                • Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                  – Sotos
                  Jan 10 at 15:29






                • 3





                  @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                  – markus
                  Jan 10 at 15:32













                • Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                  – user195366
                  Jan 10 at 15:33














                • 1





                  Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                  – Sotos
                  Jan 10 at 15:09








                • 3





                  @Sotos added some explaination. Best

                  – markus
                  Jan 10 at 15:26











                • Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                  – Sotos
                  Jan 10 at 15:29






                • 3





                  @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                  – markus
                  Jan 10 at 15:32













                • Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                  – user195366
                  Jan 10 at 15:33








                1




                1





                Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                – Sotos
                Jan 10 at 15:09







                Nice! That is the route I took but couldn't quite get the na.approx to work. Can you please explain the arguments?(method and f). FYI na.approx is from zoo package, not Base R

                – Sotos
                Jan 10 at 15:09






                3




                3





                @Sotos added some explaination. Best

                – markus
                Jan 10 at 15:26





                @Sotos added some explaination. Best

                – markus
                Jan 10 at 15:26













                Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                – Sotos
                Jan 10 at 15:29





                Great Thanks! So f = .5 is not continuous? It's just to tell it to take the mean? i.e. We can not specify (f = .6)?

                – Sotos
                Jan 10 at 15:29




                3




                3





                @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                – markus
                Jan 10 at 15:32







                @Sotos f could be a number between 0 (default) and 1. 0 means last observation carried forward, 1 would be next observation carried backwards, for any number in between is will result in y0 * (1 - f) + y1 * f, see ?approx

                – markus
                Jan 10 at 15:32















                Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                – user195366
                Jan 10 at 15:33





                Great! I have one question, if the data has a NA in the last column, how do I transform that last NA as the last registered value?

                – user195366
                Jan 10 at 15:33













                3














                This is another possible answer, using na.locf from the zoo package.
                Edit: apply is actually not required; This solution fills in the last observed value if this value is missing.



                # create the dataframe
                Date1 <- c(NA,.1,NA,NA)
                Date2 <- c(.1, NA,NA,NA)
                Date3 <- c(.2,NA,NA,.1)
                Date4 <- c(NA,.3,NA,.2)
                Date5 <- c(.3,.2,.3,.1)
                Date6 <- c(.1,NA,NA,NA)
                df <- as.data.frame(cbind(Date1,Date2,Date3,Date4,Date5,Date6))
                rownames(df) <- c('A','B','C','D')

                > df
                Date1 Date2 Date3 Date4 Date5 Date6
                A NA 0.1 0.2 NA 0.3 0.1
                B 0.1 NA NA 0.3 0.2 NA
                C NA NA NA NA 0.3 NA
                D NA NA 0.1 0.2 0.1 NA



                # Load library
                library(zoo)
                df2 <- t(na.locf(t(df),na.rm = F)) # fill last observation carried forward
                df3 <- t(na.locf(t(df),na.rm = F, fromLast = T)) # last obs carried backward

                df4 <- (df2 + df3)/2 # mean of both dataframes

                df4 <- t(na.locf(t(df4),na.rm = F)) # fill last observation carried forward
                df4[is.na(df4)] <- 0 # NA values are 0

                Date1 Date2 Date3 Date4 Date5 Date6
                A 0.0 0.1 0.2 0.25 0.3 0.1
                B 0.1 0.2 0.2 0.30 0.2 0.2
                C 0.0 0.0 0.0 0.00 0.3 0.3
                D 0.0 0.0 0.1 0.20 0.1 0.1





                share|improve this answer






























                  3














                  This is another possible answer, using na.locf from the zoo package.
                  Edit: apply is actually not required; This solution fills in the last observed value if this value is missing.



                  # create the dataframe
                  Date1 <- c(NA,.1,NA,NA)
                  Date2 <- c(.1, NA,NA,NA)
                  Date3 <- c(.2,NA,NA,.1)
                  Date4 <- c(NA,.3,NA,.2)
                  Date5 <- c(.3,.2,.3,.1)
                  Date6 <- c(.1,NA,NA,NA)
                  df <- as.data.frame(cbind(Date1,Date2,Date3,Date4,Date5,Date6))
                  rownames(df) <- c('A','B','C','D')

                  > df
                  Date1 Date2 Date3 Date4 Date5 Date6
                  A NA 0.1 0.2 NA 0.3 0.1
                  B 0.1 NA NA 0.3 0.2 NA
                  C NA NA NA NA 0.3 NA
                  D NA NA 0.1 0.2 0.1 NA



                  # Load library
                  library(zoo)
                  df2 <- t(na.locf(t(df),na.rm = F)) # fill last observation carried forward
                  df3 <- t(na.locf(t(df),na.rm = F, fromLast = T)) # last obs carried backward

                  df4 <- (df2 + df3)/2 # mean of both dataframes

                  df4 <- t(na.locf(t(df4),na.rm = F)) # fill last observation carried forward
                  df4[is.na(df4)] <- 0 # NA values are 0

                  Date1 Date2 Date3 Date4 Date5 Date6
                  A 0.0 0.1 0.2 0.25 0.3 0.1
                  B 0.1 0.2 0.2 0.30 0.2 0.2
                  C 0.0 0.0 0.0 0.00 0.3 0.3
                  D 0.0 0.0 0.1 0.20 0.1 0.1





                  share|improve this answer




























                    3












                    3








                    3







                    This is another possible answer, using na.locf from the zoo package.
                    Edit: apply is actually not required; This solution fills in the last observed value if this value is missing.



                    # create the dataframe
                    Date1 <- c(NA,.1,NA,NA)
                    Date2 <- c(.1, NA,NA,NA)
                    Date3 <- c(.2,NA,NA,.1)
                    Date4 <- c(NA,.3,NA,.2)
                    Date5 <- c(.3,.2,.3,.1)
                    Date6 <- c(.1,NA,NA,NA)
                    df <- as.data.frame(cbind(Date1,Date2,Date3,Date4,Date5,Date6))
                    rownames(df) <- c('A','B','C','D')

                    > df
                    Date1 Date2 Date3 Date4 Date5 Date6
                    A NA 0.1 0.2 NA 0.3 0.1
                    B 0.1 NA NA 0.3 0.2 NA
                    C NA NA NA NA 0.3 NA
                    D NA NA 0.1 0.2 0.1 NA



                    # Load library
                    library(zoo)
                    df2 <- t(na.locf(t(df),na.rm = F)) # fill last observation carried forward
                    df3 <- t(na.locf(t(df),na.rm = F, fromLast = T)) # last obs carried backward

                    df4 <- (df2 + df3)/2 # mean of both dataframes

                    df4 <- t(na.locf(t(df4),na.rm = F)) # fill last observation carried forward
                    df4[is.na(df4)] <- 0 # NA values are 0

                    Date1 Date2 Date3 Date4 Date5 Date6
                    A 0.0 0.1 0.2 0.25 0.3 0.1
                    B 0.1 0.2 0.2 0.30 0.2 0.2
                    C 0.0 0.0 0.0 0.00 0.3 0.3
                    D 0.0 0.0 0.1 0.20 0.1 0.1





                    share|improve this answer















                    This is another possible answer, using na.locf from the zoo package.
                    Edit: apply is actually not required; This solution fills in the last observed value if this value is missing.



                    # create the dataframe
                    Date1 <- c(NA,.1,NA,NA)
                    Date2 <- c(.1, NA,NA,NA)
                    Date3 <- c(.2,NA,NA,.1)
                    Date4 <- c(NA,.3,NA,.2)
                    Date5 <- c(.3,.2,.3,.1)
                    Date6 <- c(.1,NA,NA,NA)
                    df <- as.data.frame(cbind(Date1,Date2,Date3,Date4,Date5,Date6))
                    rownames(df) <- c('A','B','C','D')

                    > df
                    Date1 Date2 Date3 Date4 Date5 Date6
                    A NA 0.1 0.2 NA 0.3 0.1
                    B 0.1 NA NA 0.3 0.2 NA
                    C NA NA NA NA 0.3 NA
                    D NA NA 0.1 0.2 0.1 NA



                    # Load library
                    library(zoo)
                    df2 <- t(na.locf(t(df),na.rm = F)) # fill last observation carried forward
                    df3 <- t(na.locf(t(df),na.rm = F, fromLast = T)) # last obs carried backward

                    df4 <- (df2 + df3)/2 # mean of both dataframes

                    df4 <- t(na.locf(t(df4),na.rm = F)) # fill last observation carried forward
                    df4[is.na(df4)] <- 0 # NA values are 0

                    Date1 Date2 Date3 Date4 Date5 Date6
                    A 0.0 0.1 0.2 0.25 0.3 0.1
                    B 0.1 0.2 0.2 0.30 0.2 0.2
                    C 0.0 0.0 0.0 0.00 0.3 0.3
                    D 0.0 0.0 0.1 0.20 0.1 0.1






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Jan 10 at 15:49

























                    answered Jan 10 at 15:08









                    NiekNiek

                    937517




                    937517























                        3














                        Here's another option with base R + rollmean from zoo (clearly easy to rewrite in base R for this case with window size k = 2).



                        t(apply(df, 1, function(x) {
                        means <- c(0, rollmean(na.omit(x), 2), tail(na.omit(x), 1))
                        replace(x, is.na(x), means[1 + cumsum(!is.na(x))[is.na(x)]])
                        }))
                        # Date1 Date2 Date3 Date4 Date5 Date6
                        # A 0.0 0.1 0.2 0.25 0.3 0.2
                        # B 0.1 0.2 0.2 0.30 0.2 0.1
                        # C 0.0 0.0 0.0 0.00 0.3 0.3
                        # D 0.1 0.2 0.3 0.20 0.1 0.1
                        # E 0.0 0.0 0.1 0.20 0.1 0.3




                        Explanation. Suppose that x is the first row of df:



                        #   Date1 Date2 Date3 Date4 Date5 Date6
                        # A NA 0.1 0.2 NA 0.3 0.2


                        Then



                        means
                        # [1] 0.00 0.15 0.25 0.25 0.20


                        is a vector of 0, rolling means of two the following non-NA elements, and the last non-NA element. Then all we need to do is to replace those elements of x that are is.na(x). We will replace them by the elements of means at indices 1 + cumsum(!is.na(x))[is.na(x)]. That's the trickier part. Here



                        cumsum(!is.na(x))
                        # [1] 0 1 2 2 3 4


                        Meaning that the first element of x has seen 0 non-NA elements, while, say, the last one has seen 4 non-NA elements so far. Then



                        cumsum(!is.na(x))[is.na(x)]
                        # [1] 0 2


                        is about those NA elements in x that we want to replace. Notice that then



                        1 + cumsum(!is.na(x))[is.na(x)]
                        # [1] 1 3


                        corresponds to the elements of means that we want to use for replacement.






                        share|improve this answer




























                          3














                          Here's another option with base R + rollmean from zoo (clearly easy to rewrite in base R for this case with window size k = 2).



                          t(apply(df, 1, function(x) {
                          means <- c(0, rollmean(na.omit(x), 2), tail(na.omit(x), 1))
                          replace(x, is.na(x), means[1 + cumsum(!is.na(x))[is.na(x)]])
                          }))
                          # Date1 Date2 Date3 Date4 Date5 Date6
                          # A 0.0 0.1 0.2 0.25 0.3 0.2
                          # B 0.1 0.2 0.2 0.30 0.2 0.1
                          # C 0.0 0.0 0.0 0.00 0.3 0.3
                          # D 0.1 0.2 0.3 0.20 0.1 0.1
                          # E 0.0 0.0 0.1 0.20 0.1 0.3




                          Explanation. Suppose that x is the first row of df:



                          #   Date1 Date2 Date3 Date4 Date5 Date6
                          # A NA 0.1 0.2 NA 0.3 0.2


                          Then



                          means
                          # [1] 0.00 0.15 0.25 0.25 0.20


                          is a vector of 0, rolling means of two the following non-NA elements, and the last non-NA element. Then all we need to do is to replace those elements of x that are is.na(x). We will replace them by the elements of means at indices 1 + cumsum(!is.na(x))[is.na(x)]. That's the trickier part. Here



                          cumsum(!is.na(x))
                          # [1] 0 1 2 2 3 4


                          Meaning that the first element of x has seen 0 non-NA elements, while, say, the last one has seen 4 non-NA elements so far. Then



                          cumsum(!is.na(x))[is.na(x)]
                          # [1] 0 2


                          is about those NA elements in x that we want to replace. Notice that then



                          1 + cumsum(!is.na(x))[is.na(x)]
                          # [1] 1 3


                          corresponds to the elements of means that we want to use for replacement.






                          share|improve this answer


























                            3












                            3








                            3







                            Here's another option with base R + rollmean from zoo (clearly easy to rewrite in base R for this case with window size k = 2).



                            t(apply(df, 1, function(x) {
                            means <- c(0, rollmean(na.omit(x), 2), tail(na.omit(x), 1))
                            replace(x, is.na(x), means[1 + cumsum(!is.na(x))[is.na(x)]])
                            }))
                            # Date1 Date2 Date3 Date4 Date5 Date6
                            # A 0.0 0.1 0.2 0.25 0.3 0.2
                            # B 0.1 0.2 0.2 0.30 0.2 0.1
                            # C 0.0 0.0 0.0 0.00 0.3 0.3
                            # D 0.1 0.2 0.3 0.20 0.1 0.1
                            # E 0.0 0.0 0.1 0.20 0.1 0.3




                            Explanation. Suppose that x is the first row of df:



                            #   Date1 Date2 Date3 Date4 Date5 Date6
                            # A NA 0.1 0.2 NA 0.3 0.2


                            Then



                            means
                            # [1] 0.00 0.15 0.25 0.25 0.20


                            is a vector of 0, rolling means of two the following non-NA elements, and the last non-NA element. Then all we need to do is to replace those elements of x that are is.na(x). We will replace them by the elements of means at indices 1 + cumsum(!is.na(x))[is.na(x)]. That's the trickier part. Here



                            cumsum(!is.na(x))
                            # [1] 0 1 2 2 3 4


                            Meaning that the first element of x has seen 0 non-NA elements, while, say, the last one has seen 4 non-NA elements so far. Then



                            cumsum(!is.na(x))[is.na(x)]
                            # [1] 0 2


                            is about those NA elements in x that we want to replace. Notice that then



                            1 + cumsum(!is.na(x))[is.na(x)]
                            # [1] 1 3


                            corresponds to the elements of means that we want to use for replacement.






                            share|improve this answer













                            Here's another option with base R + rollmean from zoo (clearly easy to rewrite in base R for this case with window size k = 2).



                            t(apply(df, 1, function(x) {
                            means <- c(0, rollmean(na.omit(x), 2), tail(na.omit(x), 1))
                            replace(x, is.na(x), means[1 + cumsum(!is.na(x))[is.na(x)]])
                            }))
                            # Date1 Date2 Date3 Date4 Date5 Date6
                            # A 0.0 0.1 0.2 0.25 0.3 0.2
                            # B 0.1 0.2 0.2 0.30 0.2 0.1
                            # C 0.0 0.0 0.0 0.00 0.3 0.3
                            # D 0.1 0.2 0.3 0.20 0.1 0.1
                            # E 0.0 0.0 0.1 0.20 0.1 0.3




                            Explanation. Suppose that x is the first row of df:



                            #   Date1 Date2 Date3 Date4 Date5 Date6
                            # A NA 0.1 0.2 NA 0.3 0.2


                            Then



                            means
                            # [1] 0.00 0.15 0.25 0.25 0.20


                            is a vector of 0, rolling means of two the following non-NA elements, and the last non-NA element. Then all we need to do is to replace those elements of x that are is.na(x). We will replace them by the elements of means at indices 1 + cumsum(!is.na(x))[is.na(x)]. That's the trickier part. Here



                            cumsum(!is.na(x))
                            # [1] 0 1 2 2 3 4


                            Meaning that the first element of x has seen 0 non-NA elements, while, say, the last one has seen 4 non-NA elements so far. Then



                            cumsum(!is.na(x))[is.na(x)]
                            # [1] 0 2


                            is about those NA elements in x that we want to replace. Notice that then



                            1 + cumsum(!is.na(x))[is.na(x)]
                            # [1] 1 3


                            corresponds to the elements of means that we want to use for replacement.







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Jan 10 at 15:57









                            Julius VainoraJulius Vainora

                            36.7k76481




                            36.7k76481























                                0














                                I am finding the function below too complicated but it works, so here it goes.



                                fun <- function(x){
                                if(anyNA(x)){
                                inx <- which(!is.na(x))
                                if(inx[1] > 1) x[seq_len(inx[1] - 1)] <- 0
                                prev <- inx[1]
                                for(i in inx[-1]){
                                if(i - prev > 1){
                                m <- mean(c(x[i], x[prev]))
                                while(prev < i){
                                x[prev] <- m
                                prev <- prev + 1
                                }
                                }
                                prev <- i
                                }
                                }
                                x
                                }

                                res <- t(apply(df1, 1, fun))
                                res <- as.data.frame(res)
                                res
                                # Date.1 Date.2 Date.3 Date.4 Date.5
                                #A 0.0 0.1 0.25 0.25 0.3
                                #B 0.2 0.2 0.20 0.30 0.2
                                #C 0.0 0.0 0.00 0.00 0.3
                                #E 0.0 0.0 0.10 0.20 0.1


                                Data.



                                df1 <- read.table(text = "
                                Date.1 Date.2 Date.3 Date.4 Date.5
                                A NA 0.1 0.2 NA 0.3
                                B 0.1 NA NA 0.3 0.2
                                C NA NA NA NA 0.3
                                E NA NA 0.1 0.2 0.1
                                ", header = TRUE)





                                share|improve this answer




























                                  0














                                  I am finding the function below too complicated but it works, so here it goes.



                                  fun <- function(x){
                                  if(anyNA(x)){
                                  inx <- which(!is.na(x))
                                  if(inx[1] > 1) x[seq_len(inx[1] - 1)] <- 0
                                  prev <- inx[1]
                                  for(i in inx[-1]){
                                  if(i - prev > 1){
                                  m <- mean(c(x[i], x[prev]))
                                  while(prev < i){
                                  x[prev] <- m
                                  prev <- prev + 1
                                  }
                                  }
                                  prev <- i
                                  }
                                  }
                                  x
                                  }

                                  res <- t(apply(df1, 1, fun))
                                  res <- as.data.frame(res)
                                  res
                                  # Date.1 Date.2 Date.3 Date.4 Date.5
                                  #A 0.0 0.1 0.25 0.25 0.3
                                  #B 0.2 0.2 0.20 0.30 0.2
                                  #C 0.0 0.0 0.00 0.00 0.3
                                  #E 0.0 0.0 0.10 0.20 0.1


                                  Data.



                                  df1 <- read.table(text = "
                                  Date.1 Date.2 Date.3 Date.4 Date.5
                                  A NA 0.1 0.2 NA 0.3
                                  B 0.1 NA NA 0.3 0.2
                                  C NA NA NA NA 0.3
                                  E NA NA 0.1 0.2 0.1
                                  ", header = TRUE)





                                  share|improve this answer


























                                    0












                                    0








                                    0







                                    I am finding the function below too complicated but it works, so here it goes.



                                    fun <- function(x){
                                    if(anyNA(x)){
                                    inx <- which(!is.na(x))
                                    if(inx[1] > 1) x[seq_len(inx[1] - 1)] <- 0
                                    prev <- inx[1]
                                    for(i in inx[-1]){
                                    if(i - prev > 1){
                                    m <- mean(c(x[i], x[prev]))
                                    while(prev < i){
                                    x[prev] <- m
                                    prev <- prev + 1
                                    }
                                    }
                                    prev <- i
                                    }
                                    }
                                    x
                                    }

                                    res <- t(apply(df1, 1, fun))
                                    res <- as.data.frame(res)
                                    res
                                    # Date.1 Date.2 Date.3 Date.4 Date.5
                                    #A 0.0 0.1 0.25 0.25 0.3
                                    #B 0.2 0.2 0.20 0.30 0.2
                                    #C 0.0 0.0 0.00 0.00 0.3
                                    #E 0.0 0.0 0.10 0.20 0.1


                                    Data.



                                    df1 <- read.table(text = "
                                    Date.1 Date.2 Date.3 Date.4 Date.5
                                    A NA 0.1 0.2 NA 0.3
                                    B 0.1 NA NA 0.3 0.2
                                    C NA NA NA NA 0.3
                                    E NA NA 0.1 0.2 0.1
                                    ", header = TRUE)





                                    share|improve this answer













                                    I am finding the function below too complicated but it works, so here it goes.



                                    fun <- function(x){
                                    if(anyNA(x)){
                                    inx <- which(!is.na(x))
                                    if(inx[1] > 1) x[seq_len(inx[1] - 1)] <- 0
                                    prev <- inx[1]
                                    for(i in inx[-1]){
                                    if(i - prev > 1){
                                    m <- mean(c(x[i], x[prev]))
                                    while(prev < i){
                                    x[prev] <- m
                                    prev <- prev + 1
                                    }
                                    }
                                    prev <- i
                                    }
                                    }
                                    x
                                    }

                                    res <- t(apply(df1, 1, fun))
                                    res <- as.data.frame(res)
                                    res
                                    # Date.1 Date.2 Date.3 Date.4 Date.5
                                    #A 0.0 0.1 0.25 0.25 0.3
                                    #B 0.2 0.2 0.20 0.30 0.2
                                    #C 0.0 0.0 0.00 0.00 0.3
                                    #E 0.0 0.0 0.10 0.20 0.1


                                    Data.



                                    df1 <- read.table(text = "
                                    Date.1 Date.2 Date.3 Date.4 Date.5
                                    A NA 0.1 0.2 NA 0.3
                                    B 0.1 NA NA 0.3 0.2
                                    C NA NA NA NA 0.3
                                    E NA NA 0.1 0.2 0.1
                                    ", header = TRUE)






                                    share|improve this answer












                                    share|improve this answer



                                    share|improve this answer










                                    answered Jan 10 at 15:09









                                    Rui BarradasRui Barradas

                                    16.7k51730




                                    16.7k51730






























                                        draft saved

                                        draft discarded




















































                                        Thanks for contributing an answer to Stack Overflow!


                                        • Please be sure to answer the question. Provide details and share your research!

                                        But avoid



                                        • Asking for help, clarification, or responding to other answers.

                                        • Making statements based on opinion; back them up with references or personal experience.


                                        To learn more, see our tips on writing great answers.




                                        draft saved


                                        draft discarded














                                        StackExchange.ready(
                                        function () {
                                        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54130682%2ftransform-na-values-based-on-first-registration-and-nearest-values%23new-answer', 'question_page');
                                        }
                                        );

                                        Post as a guest















                                        Required, but never shown





















































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown

































                                        Required, but never shown














                                        Required, but never shown












                                        Required, but never shown







                                        Required, but never shown







                                        Popular posts from this blog

                                        The term 'EXEC' is not recognized as the name of a cmdlet Powershell

                                        NPM command prompt closes immediately [closed]

                                        Error binding properties and functions in emscripten