I have a simple table with paragraph numbering:

> table <- data.frame(id=c(1,2,3,4,5,6,7,8,9,10), paragraph=c("1.1.1.1","1","2","1.1","100","1.2","10","1.1.1","1.1.2","1.10"))
> print(table)

id paragraph
1   1.1.1.1
2         1
3         2
4       1.1
5       100
6       1.2
7        10
8     1.1.1
9     1.1.2
10     1.10

I would like to sort it like this:

id paragraph
2         1
4       1.1
8     1.1.1
1   1.1.1.1
9     1.1.2
6       1.2
10     1.10
3         2
7        10
5       100

I could probably split the values on "." into separate data.frame columns and then order by multiple columns. The issue is that I don't know in advance how many dots a value will contain, since the depth can vary from one dataset to the next.

Kirill

1 Answer

Here's one option:

sp <- strsplit(as.character(table$paragraph), "\\.")
ro <- sapply(sp, function(x) sum(as.numeric(x) * 100^(max(lengths(sp)) + 0:(1 - length(x)))))
table[order(ro), ]
#    id paragraph
# 2   2         1
# 4   4       1.1
# 8   8     1.1.1
# 1   1   1.1.1.1
# 9   9     1.1.2
# 6   6       1.2
# 10 10      1.10
# 3   3         2
# 7   7        10
# 5   5       100

Since the level structure cannot be ignored, I first split the paragraph numbers on the dots and store the pieces in sp. Then, to translate each paragraph number into a single number while preserving the order, I multiply the section number by 100^n (where n is the maximum depth, max(lengths(sp))), the subsection number by 100^(n-1), and so on, and sum the results; ro is the vector of these sums. Base 100 works as long as every component after the first is below 100, which should suffice in practice; for larger components you can use a larger base.
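If you would rather not depend on a base such as 100, a variant of the same idea is to pad the split components to a common depth and order on the resulting columns directly. This is only a minimal sketch, assuming every component is a non-negative integer; the names tab, parts, depth, padded and sorted are made up for the illustration, and tab is used instead of table to avoid masking base::table.

```r
# Example data from the question (10 rows, including "1.10").
paragraphs <- c("1.1.1.1", "1", "2", "1.1", "100", "1.2", "10",
                "1.1.1", "1.1.2", "1.10")
tab <- data.frame(id = seq_along(paragraphs), paragraph = paragraphs)

# Split each paragraph number on the dots and convert the pieces to numbers.
parts <- lapply(strsplit(as.character(tab$paragraph), ".", fixed = TRUE),
                as.numeric)

# Pad every vector to the maximum depth. Missing levels become -1
# (an assumption: real components are never negative), so a parent
# such as "1.1" sorts before its children such as "1.1.1".
depth <- max(lengths(parts))
padded <- do.call(rbind, lapply(parts, function(x) {
  c(x, rep(-1, depth - length(x)))
}))

# Order by level 1, then level 2, and so on, however many levels exist.
sorted <- tab[do.call(order, as.data.frame(padded)), ]
print(sorted)
```

Because each level is compared in its own column, the size of the individual numbers does not matter, at the cost of building an intermediate matrix.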

Julius Vainora