Find how many numbers meet the constraints in a range

Question

Given 2 integers l and r, calculate how many numbers in [l, r] that meet these constraints

1) The number should be divisible by 7

2) The number contains at least three digit 7

3) The number contains more digit 7 than digit 4

777, 774746 meet those constraints, 7771, 77, 747474 are not.

Using brute force can easily find the answer but when the range is very large then it might take a lot of time.

I think dynamic programming can help to solve this problem, but i can't think of the solution

Can someone give me some guide?

What have you got so far? I mean what do you mean by brute force? I'm afraid there's not much to improve from that unless you're doing something really extreme. The decimal representation is unfortunately not very friendly to computers :/ — Zdeněk Jelínek, Oct 28 '15 at 13:51
I mean test every number in [l, r] one by one, if i have to check the numbers in [777, 77777777777777] then it might take a lot of time, so i am trying to find a way to optimize it — ffstar, Oct 28 '15 at 13:58

score 2 · Answer 1 · edited Jun 20 '20 at 09:12

Taking from the original brute-force version:

Iterate with i over numbers between [l,r]
    Use modulo to check if i is divisible by 7
    Use modulo and division to get counts of digits in i
    digit_count(7) >= 3
    digit_count(7) > digit_count(4)

here's some ideas I came up with...

1. Use only multiples of 7, implicitly fulfilling the first criterion:

This one is really simple. We can improve this to only use i that is divisible by 7. If I give you a number x and ask you to generate me numbers divisible by n until you reach y, then you'd best do:

for (auto i = x + x % n; i < y; i += n)

So for the case of multiples of 7 between l and r, all you need to do is run the loop for (auto i = l + l % 7; i < r; i += 7) This will give you 7x speed-up from the brute-force version.

2. Remember the digit counts

There's no need to execute numerous divisions and modulos to get you the count of digits of each number you go through. Since you know by how much you increment, you also know what digits change to what. This way, you'd only need to split into digits the starting number (e.g. l % 7 + l).

Now what we'd be storing isn't the count of digits but actually something very much resembling BCDs - an array of digits that represent the number we are currently working with. You'd then get something like std::vector<int> expressing the array of [7, 7, 2, 4, 5, 7] for number 772457. Now all you need to do is to use the BCD arithmetic inside the array every time you increment the loop counter, going [7, 7, 2, 4, 5, 7] + 6 = [7, 7, 2, 4, 6, 3].

The other thing we'd need to store are two ints - sevens and fours. In the intitialization phase, once you "disintegrate" the first number into the array, you'd just go through it and increment sevens for each 7, fours for each 4. And you'd just keep this numbers up to date: with each update of the array, you'd decrement fours for each 4 you took away and increment it for each 4 you created in the array. And the same for number 7. You can then compare sevens >= 3 && sevens > fours and know the result fast.

Funny thing is, that this gives you no theoretical improvement in the complexity and it might not work, but I suspect it should... It's quite a lot of code so I'm not going to provide it. You might end up working with the BCD array inverted or starting with the r end of iteration range so you don't need to resize the array. And maybe you can come up with many more improvements and tweaks. However, I have strong doubt that the solution can be made asymptotically less complex this way.

3. More thoughts

Now this wasn't dynamic programming at all. Or was it? If you think about it, I have a gut feeling that this idea of an array of numbers as BCD can now be converted to a problem where you look for permutations containing a given combination. You can make a graph out of it and search it. And that's where you'd go dynamic. I'm afraid, however, that this would make for quite a longer post...

But I already got the first doubt about that and that's the check for divisibility by 7 which would then be applied to all the numbers that are found in the graph (the graph would only support criterions 2 and 3 by its nature and yield all numbers containing the combinations). So in the end, it boils down to sizes of ranges that should be supported by the alrgorithm and the ratio of numbers fulfilling the first criterion and the numbers fulfilling the second and third ones in those ranges.

EDIT: I have since found that my idea of computing the count of numbers fitting the criteria is incorrect. Some small comparisons table:

|   range    |      numbers f/ c2     | c2_groups | c2_total | c1_total |
|   0 - 1k   |         777            |         1 |        1 |     ~143 |
|  1k - 10k  | _777, 7_77, 77_7, 777_ |         4 |       40 |    ~1286 |
| 10k - 100k |   __777, _7_77, ...    |        10 |     1000 |   ~12857 |

Where numbers f/ c2 are numbers fulfilling criterion 2, c2_groups is count of possible combinations of any digit and 7s in the number, cx_total is total count of numbers fulfilling criterion x in the range.

Having that, it looks like it's quite questionable whether it would be efficient to filter by the number of digits criteria first. I suppose that would require some mathematical analysis that would take longer than implement the solution...

Space search

With having state equivalent to method #2, it is possible to do DFS in the numbers range. Instead of incrementing by 7, it would store a digits vector and increment values in it based on an offset that would be movable, e.g.

increment [1, 0, 7, _] -> [1, 0, 8, _]
                 ^               ^

This is what the algorithm will be doing in the core loop. You can then check whether the current digits vector setup can fulfill the criteria - e.g. [0, p, _, _] can fulfill them, while [0, 0, p, _] cannot (p is the element that is being pointed to). This way, you will keep incrementing the highest possible digit, skipping a lot of numbers. Every time there is a possibility to fulfill the requirements, you will increment the offset and repeat the process:

push [7, 7, _, _] -> [7, 7, 0, _]
         ^                  ^

Once you're at the least significant digit position, you'll also start checking the divisibility by 7 of each candidate. You can try either converting the digits to int and using modulo or using some sort of divisibility algorithm (these use digits so that's a pleasant coincidence).

This way, you'll get a number that passes all criteria and return it. Now you might come to a situation where you exhaust all the digits in given digit range. In that case, you need to move the offset one place back:

pop [7, 7, 7, 9] -> [7, 7, 7, _]
              ^            ^

Now, you'd use increment, see that [7, 7, 8, _] can fulfill the criteria and push again. Then run through 0, 1, 2, ... sequence until you come to 7, see that 7787 is ok with both 2nd and 3rd criteria but fails division by 7. And so on...

You'll also need to check whether you're not already over the r limit. I guess that can be done in quite a sane manner by splitting r to digits as well and comparing it from the most significant digit.

Given that we have no math analysis for this, and that this is still going through quite a lot of numbers (especially in case that 7 is the least significant digit), I wonder whether this is really worth implementing. But it's not something super-complex either. Good luck!

Thanks for the detail information, i also tried your first idea before, but i think there is still some way to optimize it. — ffstar, Oct 28 '15 at 14:44
Edited the post, graph with proper number-generation is definitely a way to go... I'll get back to you with more info later — Zdeněk Jelínek, Oct 28 '15 at 14:55
sorry for asking but how do you know there are only 100 numbers that have three or more 7 digits around that 130k numbers? — ffstar, Oct 28 '15 at 15:02
Well the 130k is 900k / 7 which is (1M - 100k) / 7. I think this is the approximate count of multiples in given range size due to definition of division. Approximate because there is rounding and because range size is not sufficient enough to determine it (e.g there's 2 multiples of 7 in 40-50 range and only 1 multiple in 50-60 range). — Zdeněk Jelínek, Oct 28 '15 at 15:12
And the 100 is result of the old partial permutations trick: numbers between 100000 and 999999 have 6 digits: _ _ _ _ _ _. Your rules require at least three of them to be sevens - you end up with 7 7 7 _ _ _ (the order doesn't matter). So there is 10 * 10 * 10 combinations that fulfill the 2nd criterion. However, there is a possibility of having same count of fours that isn't allowed so the actual solution is 7 7 7 *4 _ _ which gets you to the number 9 * 10 * 10 = 900 permutations and I was wrong in the original post... :) — Zdeněk Jelínek, Oct 28 '15 at 15:16
So, I have realized that the idea of partial permutations is wrong and that it's not really that easy to compute and generalize... — Zdeněk Jelínek, Oct 28 '15 at 16:50

score 1 · Answer 2 · edited May 23 '17 at 11:51

1

For 1: if(yourint % 7 == 0)

For 2: check this link split int into digits, check if digit equals 7 and count to 3.

For 3: expend link 2 with an if a digit equals 7 or 4 than counter++

At the end you should check your counters (7 an 4) which one is the highest.

edited May 23 '17 at 11:51

Community

1
1

answered Oct 28 '15 at 13:55

Pim

393
3
18

This is what i did, but if the range is very large then i have to wait for a long time to get the answer. – ffstar Oct 28 '15 at 13:59

Find how many numbers meet the constraints in a range

2 Answers2

1. Use only multiples of 7, implicitly fulfilling the first criterion:

2. Remember the digit counts

3. More thoughts