sort                  package:base                  R Documentation

_S_o_r_t_i_n_g _o_r _O_r_d_e_r_i_n_g _V_e_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     Sort (or order) a numeric or complex vector (partially) into
     ascending (or descending) order.

_U_s_a_g_e:

     sort(x, partial = NULL, na.last = NA, decreasing = FALSE,
          method = c("shell", "quick"), index.return = FALSE)
     is.unsorted(x, na.rm = FALSE)

_A_r_g_u_m_e_n_t_s:

       x: a numeric or complex vector.

 partial: a vector of indices for partial sorting.

 na.last: for controlling the treatment of `NA's. If `TRUE', missing
          values in the data are put last; if `FALSE', they are put
          first; if `NA', they are removed.

decreasing: logical.  Should the sort be increasing or decreasing?

  method: character specifying the algorithm used.

index.return: logical indicating if the ordering index vector should be
          returned as well; this is only available for the default
          `na.last = NA'.

   na.rm: logical.  Should missing values be removed?

_D_e_t_a_i_l_s:

     If `partial' is not `NULL', it is taken to contain indices of
     elements of `x' which are to be placed in their correct positions
     by partial sorting.  After the sort, the values specified in
     `partial' are in their correct position in the sorted array.  Any
     values smaller than these values are guaranteed to have a smaller
     index in the sorted array and any values which are greater are
     guaranteed to have a bigger index in the sorted array.

     The sort order for character vectors will depend on the collating
     sequence of the locale in use: see `Comparison'.

     `is.unsorted' returns a logical indicating if `x' is sorted
     increasingly, i.e. `is.unsorted(x)' is true if `any(x != sort(x))'
     (and there are no `NA's).

     `method = "shell"' uses Shellsort and was the only method before R
     version 1.5.x (although improved there to an O(n^{4/3}) variant of
     Sedgewick (1996)).

     Method `"quick"' uses Singleton's Quicksort implementation and is
     only available when `x' is numeric, the sort is increasing and
     `partial' is `NULL'.  It is normally somewhat faster than
     Shellsort (perhaps twice as fast on vectors of length a million)
     but has poor performance in the rare worst case. (Peto's
     modification using a pseudo-random midpoint is used to make the
     worst case rarer.)

_V_a_l_u_e:

     For `sort' the sorted vector unless `index.return' is true, when
     the result is a list with components named `x' and `ix' containing
     the sorted numbers and the ordering index vector.  In the latter
     case, if `method == "quick"' ties may be reversed in the ordering,
     unlike `sort.list', as quicksort is not stable.

_R_e_f_e_r_e_n_c_e_s:

     Sedgewick, R. (1986) A new upper bound for Shell sort. J.
     Algorithms 7, 159-173.

     Singleton, R. C. (1969) An efficient algorithm for sorting with
     minimal storage: Algorithm 347. Communications of the ACM 12,
     185-187.

_S_e_e _A_l_s_o:

     `order', `rank'.

_E_x_a_m_p_l_e_s:

     data(swiss)
     x <- swiss$Education[1:25]
     x; sort(x); sort(x, partial = c(10, 15))
     median # shows you another example for `partial'

     stopifnot(!is.unsorted(sort(x)),
               !is.unsorted(LETTERS),
                is.unsorted(c(NA,1:3,2), na.rm = TRUE))

     ## illustrate `stable' sorting (of ties):
     sort(c(10:3,2:12), method = "sh", index=TRUE) # is stable
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
     sort(c(10:3,2:12), method = "qu", index=TRUE) # is not
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19
     ##        ^^^^^


     ## Small speed comparison simulation:
     N <- 2000
     Sim <- 20
     rep <- 50 # << adjust to your CPU
     c1 <- c2 <- numeric(Sim)
     for(is in 1:Sim){
       x <- rnorm(N)
       gc()  ## sort should not have to pay for gc
       c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
       c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
       stopifnot(sort(x, meth = "s") == sort(x, meth = "q"))
     }
     100 * rbind(ShellSort = c1, QuickSort = c2)
     cat("Speedup factor of quick sort():\n")
     summary({qq <- c1 / c2; qq[is.finite(qq)]})

     ## A larger test
     x <- rnorm(1e6)
     gc()
     system.time(x1 <- sort(x, method = "shell"))
     gc()
     system.time(x2 <- sort(x, method = "quick"))
     stopifnot(identical(x1, x2))

