I have the following object variable1:

GRanges object with 25605 ranges and 2 metadata columns:
              seqnames             ranges strand   |  totgenes   density
                 <Rle>          <IRanges>  <Rle>   | <integer> <numeric>
      [1]         chr1 [3000001, 3100000]      *   |         2       0.2
      [2]         chr1 [3100001, 3200000]      *   |         1       0.1
      [3]         chr1 [3200001, 3300000]      *   |         1       0.1
      [4]         chr1 [3300001, 3400000]      *   |         1       0.1
      [5]         chr1 [3400001, 3500000]      *   |         2       0.2
      ...          ...                ...    ... ...       ...       ...
  [25601] chrUn_random [1600001, 1700000]      *   |         0         0
  [25602] chrUn_random [1900001, 2000000]      *   |         0         0
  [25603] chrUn_random [2100001, 2200000]      *   |         0         0
  [25604] chrUn_random [2400001, 2500000]      *   |         0         0
  [25605] chrUn_random [5900001, 5900358]      *   |         0         0

and I want to randomly sample 100 rows from this object. For that, I use:

sample(variable1, 100)

However, I'd like to sample using prob= according to the density column. I can do this:

sample(sort(unique(variable1$density)), prob=table(sort(variable1$density))/25605, replace=TRUE, size=100)

but that way, I only get the density values. I'd like to get the whole rows.


1 Answer

This ought to do it:

sam <- sample(variable1, size = 100, replace = TRUE, prob = variable1$density)

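Metadata columns in GRanges objects (the columns to the right of the | separator in the printed output above) can be accessed using $, just as in a data frame. Because sample() subsets the object itself, the result is a GRanges object containing the whole rows, not just the density values.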
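
If it helps to see which rows were drawn, the same idea can be written by sampling row indices and then subsetting. This is only a sketch on a made-up five-row object: the name toy and its values are invented for illustration and stand in for variable1. sample() treats prob as weights, so the densities do not need to sum to one, and rows with a density of 0 are never drawn.

```r
library(GenomicRanges)  # Bioconductor package providing GRanges() and IRanges()

# Toy stand-in for variable1 (hypothetical values): five 100 kb bins on chr1.
toy <- GRanges(
  seqnames = rep("chr1", 5),
  ranges   = IRanges(start = seq(3000001, by = 100000, length.out = 5),
                     width = 100000),
  totgenes = c(2L, 1L, 1L, 0L, 2L),
  density  = c(0.2, 0.1, 0.1, 0, 0.2)
)

set.seed(1)  # only to make the draw reproducible

# Draw row indices with replacement, weighting each row by its density.
idx <- sample(seq_along(toy), size = 10, replace = TRUE, prob = toy$density)

# Subset by index: each element keeps its range and both metadata columns.
picked <- toy[idx]
picked
```

Keeping idx around makes it easy to check how often each row was chosen, for example with table(idx). With replace = TRUE the same row can appear more than once in the result.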