The Wikipedia article on Zipf's law includes a few descriptions of the distribution, including some ways to calculate it:
https://en.wikipedia.org/wiki/Zipf%27s_law
The first equation under the Theoretical review section might help. Using this, we can write a short Python script to associate the total number of viewers for each of 36 videos:
N_elements = 1000000
video_exponent = 1
distribution_sum = 0
total_viewers = 0
# First, add up the relative number of viewers across all 36 movie ranks
for k_rank in range(1,36):
sum = 0
for n in range(1, N_elements):
sum = sum + 1/(n**video_exponent)
distribution_sum = distribution_sum + (1/(k_rank**video_exponent))/sum
# Next, distribute the number of viewers so that the total comes to 1,000,000
print("Movie Rank | # of Viewers")
for k_rank in range(1,36):
sum = 0
for n in range(1, N_elements):
sum = sum + 1/(n**video_exponent)
viewers_at_k_rank = round((N_elements/(k_rank**video_exponent))/(sum * distribution_sum))
print(k_rank, end="|")
print(viewers_at_k_rank)
total_viewers = total_viewers + viewers_at_k_rank
print("\nSum of all viewers accounted for so far, to make sure we're at 1,000,000")
print(total_viewers)
The result adds up to 1,000,002 viewers, but that's not a big deal. Why isn't it a big deal, you ask? It seems that while many different things follow a general Zipfian distribution, they tend to be slightly different based on what type of thing it is. The video_exponent variable can be adjusted so that the Zipfian distribution simulated above can more closely match actual video statistics. The difference is usually much greater than 2 in 1,000,000.
You can get an idea of what the real-world video_exponent is by finding some real ranked videos, and adjusting video_exponent and N_elements until the code matches the real numbers. Then, reset N_elements back to 1,000,000, and you'll have a realistic video-watching data set.