Correctly setting random seeds for repeatability

Question

The method for setting random seeds using the Fortran 90 subroutine random_seed is quite straightforward.

call random_seed( put=seed )

But I can't find any information about guidelines for setting the seed (which is absolutely necessary when you want repeatability). Folklore I've heard in the past suggested that scalar seeds should be large. E.g. 123456789 is a better seed than 123. The only support for this I can find on the web is that it is suggested for the ifort extension function ran() that one use a "large, odd integer value"

I understand this might be implementation specific and am using gfortran 4.8.5 but am also interested in ifort and (if possible) general guidelines that are independent of implementation. Here's some example code:

# for compactness, assume seed size of 4, but it will depend on 
# the implementation (e.g. for my version of gfortran 4.8.5 it is 12)

seed1(1:4) = [ 123456789, 987654321, 456789123, 7891234567 ]
seed2(1:4) =   123456789
seed3(1:4) = [         1,         2,         3,          4 ]

I'd guess that seed1 is fine, but pretty verbose if you're setting it manually (as I am) as seed length can be 12 or 33 or whatever. And I'm not even sure that it's fine because I haven't been able to find any guidelines at all about setting these seeds. I.e. these seeds should be negative for all I know, or 3 digit even numbers, etc. although I guess you'd hope the implementation would warn you about that (?).

seed2 and seed3 are obviously more convenient to set and for all I know are just as good. @Ross suggests that seed2 is in fact fine in his answer here: Random number generator (RNG/PRNG) that returns updated value of seed

So my question in summary is just: How can I correctly set the seed? Are any or all of seed1 to seed3 acceptable?

*can someone make an "ifort" tag synonym for intel-fortran?* No. This was discussed a while ago, see https://meta.stackoverflow.com/questions/284685/intel-fortran-compiler-tags/284707. — High Performance Mark, Aug 17 '18 at 11:17
For a real application, you should be 'priming' your RNG by grabbing and discarding the first few numbers (maybe 100 to be safe). I can write up an answer if you'd like. — Ross, Aug 17 '18 at 15:12

score 5 · Accepted Answer · answered Aug 17 '18 at 11:40

Guidelines for setting the seed depend on the PRNG algorithm that RANDOM_NUMBER uses, but in general the more "entropy" you provide the better.

If you have a single scalar value, you can use some simple PRNG to expand that to the full seed array required by RANDOM_SEED. See e.g. the function lcg in the example code at https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gfortran/RANDOM_005fSEED.html

Current versions of GFortran have some protections against bad seeds, and it should be relatively immune towards "dumb" seeding (e.g. all values of seed(:) identical, or all values small or even zero), but for portability to other compilers following something like I suggested above might still be a good idea.

Thanks! I was guessing there was a simple way to expand the scalar and was hoping someone would show how ;-) — JohnE, Aug 17 '18 at 11:45

Ross · Answer 2 · 2018-08-17T17:49:48.563

What you provide to random_seed( put=... ) is used to determine the starting state of the generator, which (as janneb states) should have as much entropy as reasonably possible. You could construct some relatively sophisticated method of generating this entropy - grabbing from the system somehow is a good choice for this. The code janneb links is a good example.

However, I typically like to be able to reproduce a single run from a given seed if necessary. This is useful for debugging and regression testing. Then, for production runs, the code can pull a single seed 'randomly' somehow. Therefore, I want to get good RNG from a single 'seed'. In my experience, this is easily achieved by providing this single seed then letting the generator add entropy by generating numbers. Consider the following example:

program main
   implicit none
   integer, parameter :: wp = selected_real_kind(15,307)
   integer, parameter :: n_discard = 100

   integer :: state_size, i
   integer, allocatable, dimension(:) :: state
   real(wp) :: ran, oldran

   call random_seed( size=state_size )
   write(*,*) '-- state size is: ', state_size

   allocate(state(state_size))

   ! -- Simple method of initializing seed from single scalar
   state = 20180815
   call random_seed( put=state )

   ! -- 'Prime' the generator by pulling the first few numbers
   ! -- In reality, these would be discarded but I will print them for demonstration
   ran = 0.5_wp
   do i=1,n_discard
      oldran = ran
      call random_number(ran)
      write(*,'(a,i3,2es26.18)') 'iter, ran, diff: ', i, ran, ran-oldran
   enddo

   ! Now the RNG is 'ready'
end program main

Here, I give a single seed, and then generate a random number 100 times. Typically, I would discard these initial, potentially corrupted, numbers. In this example, I'm printing them to see whether they look non-random. Running with PGI 15.10:

enet-mach5% pgfortran --version

pgfortran 15.10-0 64-bit target on x86-64 Linux -tp sandybridge 
The Portland Group - PGI Compilers and Tools
Copyright (c) 2015, NVIDIA CORPORATION.  All rights reserved.
enet-mach5% pgfortran main.f90 && ./a.out
 -- state size is:            34
iter, ran, diff:   1  8.114813341476008191E-01  3.114813341476008191E-01
iter, ran, diff:   2  8.114813341476008191E-01  0.000000000000000000E+00
iter, ran, diff:   3  8.114813341476008191E-01  0.000000000000000000E+00
iter, ran, diff:   4  8.114813341476008191E-01  0.000000000000000000E+00
iter, ran, diff:   5  8.114813341476008191E-01  0.000000000000000000E+00
iter, ran, diff:   6  2.172220012214012286E-01 -5.942593329261995905E-01
iter, ran, diff:   7  2.172220012214012286E-01  0.000000000000000000E+00
iter, ran, diff:   8  2.172220012214012286E-01  0.000000000000000000E+00
iter, ran, diff:   9  2.172220012214012286E-01  0.000000000000000000E+00
iter, ran, diff:  10  2.172220012214012286E-01  0.000000000000000000E+00
iter, ran, diff:  11  6.229626682952016381E-01  4.057406670738004095E-01
iter, ran, diff:  12  6.229626682952016381E-01  0.000000000000000000E+00
iter, ran, diff:  13  6.229626682952016381E-01  0.000000000000000000E+00
iter, ran, diff:  14  6.229626682952016381E-01  0.000000000000000000E+00
iter, ran, diff:  15  6.229626682952016381E-01  0.000000000000000000E+00
iter, ran, diff:  16  2.870333536900204763E-02 -5.942593329261995905E-01
iter, ran, diff:  17  2.870333536900204763E-02  0.000000000000000000E+00
iter, ran, diff:  18  4.344440024428024572E-01  4.057406670738004095E-01
iter, ran, diff:  19  4.344440024428024572E-01  0.000000000000000000E+00
iter, ran, diff:  20  4.344440024428024572E-01  0.000000000000000000E+00
iter, ran, diff:  21  8.401846695166028667E-01  4.057406670738004095E-01
iter, ran, diff:  22  8.401846695166028667E-01  0.000000000000000000E+00
iter, ran, diff:  23  6.516660036642036857E-01 -1.885186658523991809E-01
iter, ran, diff:  24  6.516660036642036857E-01  0.000000000000000000E+00
iter, ran, diff:  25  6.516660036642036857E-01  0.000000000000000000E+00
iter, ran, diff:  26  5.740667073800409526E-02 -5.942593329261995905E-01
iter, ran, diff:  27  5.740667073800409526E-02  0.000000000000000000E+00
iter, ran, diff:  28  2.746286719594053238E-01  2.172220012214012286E-01
iter, ran, diff:  29  2.746286719594053238E-01  0.000000000000000000E+00
iter, ran, diff:  30  2.746286719594053238E-01  0.000000000000000000E+00
iter, ran, diff:  31  6.803693390332057334E-01  4.057406670738004095E-01
iter, ran, diff:  32  6.803693390332057334E-01  0.000000000000000000E+00
iter, ran, diff:  33  3.033320073284073715E-01 -3.770373317047983619E-01
iter, ran, diff:  34  3.033320073284073715E-01  0.000000000000000000E+00
iter, ran, diff:  35  7.090726744022077810E-01  4.057406670738004095E-01
iter, ran, diff:  36  1.148133414760081905E-01 -5.942593329261995905E-01
iter, ran, diff:  37  1.148133414760081905E-01  0.000000000000000000E+00
iter, ran, diff:  38  1.435166768450102381E-01  2.870333536900204763E-02
iter, ran, diff:  39  1.435166768450102381E-01  0.000000000000000000E+00
iter, ran, diff:  40  3.607386780664114667E-01  2.172220012214012286E-01
iter, ran, diff:  41  7.664793451402118762E-01  4.057406670738004095E-01
iter, ran, diff:  42  7.664793451402118762E-01  0.000000000000000000E+00
iter, ran, diff:  43  2.009233475830143334E-01 -5.655559975571975428E-01
iter, ran, diff:  44  2.009233475830143334E-01  0.000000000000000000E+00
iter, ran, diff:  45  6.353673500258167905E-01  4.344440024428024572E-01
iter, ran, diff:  46  4.110801709961720007E-02 -5.942593329261995905E-01
iter, ran, diff:  47  4.110801709961720007E-02  0.000000000000000000E+00
iter, ran, diff:  48  8.812926866162200668E-01  8.401846695166028667E-01
iter, ran, diff:  49  8.812926866162200668E-01  0.000000000000000000E+00
iter, ran, diff:  50  9.386993573542241620E-01  5.740667073800409526E-02
iter, ran, diff:  51  3.444400244280245715E-01 -5.942593329261995905E-01
iter, ran, diff:  52  7.501806915018249811E-01  4.057406670738004095E-01
iter, ran, diff:  53  9.961060280922282573E-01  2.459253365904032762E-01
iter, ran, diff:  54  9.961060280922282573E-01  0.000000000000000000E+00
iter, ran, diff:  55  8.221603419923440015E-02 -9.138899938929938571E-01
iter, ran, diff:  56  4.879567012730348097E-01  4.057406670738004095E-01
iter, ran, diff:  57  1.109193695682364478E-01 -3.770373317047983619E-01
iter, ran, diff:  58  7.625853732324401335E-01  6.516660036642036857E-01
iter, ran, diff:  59  7.625853732324401335E-01  0.000000000000000000E+00
iter, ran, diff:  60  2.831393817822487335E-01 -4.794459914501914000E-01
iter, ran, diff:  61  6.888800488560491431E-01  4.057406670738004095E-01
iter, ran, diff:  62  7.462867195940532383E-01  5.740667073800409526E-02
iter, ran, diff:  63  8.036933903320573336E-01  5.740667073800409526E-02
iter, ran, diff:  64  8.036933903320573336E-01  0.000000000000000000E+00
iter, ran, diff:  65  1.644320683984688003E-01 -6.392613219335885333E-01
iter, ran, diff:  66  5.701727354722692098E-01  4.057406670738004095E-01
iter, ran, diff:  67  6.849860769482774003E-01  1.148133414760081905E-01
iter, ran, diff:  68  1.481334147600819051E-01 -5.368526621881954952E-01
iter, ran, diff:  69  5.538740818338823146E-01  4.057406670738004095E-01
iter, ran, diff:  70  1.605380964906970576E-01 -3.933359853431852571E-01
iter, ran, diff:  71  5.662787635644974671E-01  4.057406670738004095E-01
iter, ran, diff:  72  7.672021111475118005E-01  2.009233475830143334E-01
iter, ran, diff:  73  6.360901160331167148E-01 -1.311119951143950857E-01
iter, ran, diff:  74  6.647934514021187624E-01  2.870333536900204763E-02
iter, ran, diff:  75  9.231234697231371911E-01  2.583300183210184287E-01
iter, ran, diff:  76  3.288641367969376006E-01 -5.942593329261995905E-01
iter, ran, diff:  77  5.034149292976053403E-02 -2.785226438671770666E-01
iter, ran, diff:  78  3.249701648891658579E-01  2.746286719594053238E-01
iter, ran, diff:  79  4.110801709961720007E-01  8.611000610700614288E-02
iter, ran, diff:  80  7.268168600551945246E-01  3.157366890590225239E-01
iter, ran, diff:  81  1.325575271289949342E-01 -5.942593329261995905E-01
iter, ran, diff:  82  2.147735613282293343E-01  8.221603419923440015E-02
iter, ran, diff:  83  8.951429003614350677E-01  6.803693390332057334E-01
iter, ran, diff:  84  9.606624794444940107E-02 -7.990766524169856666E-01
iter, ran, diff:  85  8.749502748152764298E-01  7.788840268708270287E-01
iter, ran, diff:  86  6.864316089628772488E-01 -1.885186658523991809E-01
iter, ran, diff:  87  3.753116578189263919E-01 -3.111199511439508569E-01
iter, ran, diff:  88  4.614216639259325348E-01  8.611000610700614288E-02
iter, ran, diff:  89  8.632683590919612016E-01  4.018466951660286668E-01
iter, ran, diff:  90  5.110403908483931446E-01 -3.522279682435680570E-01
iter, ran, diff:  91  3.512250603649960112E-01 -1.598153304833971333E-01
iter, ran, diff:  92  2.984351275420635830E-01 -5.278993282293242828E-02
iter, ran, diff:  93  7.902858007228701354E-01  4.918506731808065524E-01
iter, ran, diff:  94  9.136098520217217356E-01  1.233240512988516002E-01
iter, ran, diff:  95  8.360105557375590024E-01 -7.759929628416273317E-02
iter, ran, diff:  96  7.623052313611680120E-01 -7.370532437639099044E-02
iter, ran, diff:  97  2.525198759725810760E-02 -7.370532437639099044E-01
iter, ran, diff:  98  9.228433278518650695E-01  8.975913402546069619E-01
iter, ran, diff:  99  1.283834133499510699E-01 -7.944599145019139996E-01
iter, ran, diff: 100  7.311534560989940701E-01  6.027700427490430002E-01

8 of the first 10 numbers generated are the repeated! This is a good illustration of why some generators require a high-entropy state in the first place. However, after 'some' time, the numbers start to look reasonable.

For my applications, 100 or so random numbers is a very small cost, so whenever I seed a generator, I prime them in this manner. I didn't observe this obviously bad behavior on ifort 16.0, gfortran 4.8, or gfortran 8.1. Non-repeating numbers is a pretty low bar, though. So I would prime for all compilers, not just ones I have observed bad behavior for.

From the comments, some compilers attempt to eliminate bad behavior by processing the input state in some way to yield the actual internal state. Gfortran uses an "xor cipher". The operation is reversed on a get.

The argument _isn't_ (necessarily) the starting state of the generator. The seed need merely be computed (by the processor) from the argument. — francescalus, Aug 17 '18 at 16:53
Hm, how would they maintain the same cycle on put/get? I suppose get could return a processed state and put could undo that processing? — Ross, Aug 17 '18 at 16:58
Yes, explicitly `get` needn't return the same value that was immediately prior `put`. — francescalus, Aug 17 '18 at 16:58
I guess a simple way that’s true is if any internal state isn’t a default-kind integer, the memory is cast somehow. — Ross, Aug 17 '18 at 16:59
@Ross, at least with gfortran, the developers tried to be defensive in the case of a poor set of seeds. The details are hazy, but ISTR that gfortran scrambles the supplied seeds with some sort of hash algorithm. — Steve, Aug 17 '18 at 17:00
In current Gfortran, the seed given with put= is encrypted with a "xor cipher" to generate the actual internal state. With get= the state is decrypted before returning. Of course, the determined programmer can still shoot himself in the foot, but it takes a bit of effort. — janneb, Aug 17 '18 at 17:10
That makes sense. I definitely get the same set of numbers if I get/put every cycle. Also, a 'gotten' state matches the one 'put' there, if no numbers have been generated. I'll update the wording in my answer slightly. — Ross, Aug 17 '18 at 17:14
Also, in current GFortran the last seed value is a bit special, as it has to be clamped into a small interval. So with get= you will most likely not get the same value back that you put=. It DOES fulfill the requirements of the standard, but it might be a bit unintuitive. — janneb, Aug 17 '18 at 17:23
When I checked, gfortran 8.1.0 and 4.8 gave the same array for get that was entered with put. — Ross, Aug 17 '18 at 17:52
So you guys don't typically do the 'priming', then? What if you want a single, (potentially) reproducible seed? — Ross, Aug 17 '18 at 17:53
@Ross, see the `else` branch in the gfortran documentation that janneb linked to. The LCG generator uses a single seed. You seed the LCG, grab N integers (12 or 33 for gfortran) from the LCG, and then use those N integers as the seeds to `random_seed`. If you change your single seed to a new value, you'll grab N different integers. — Steve, Aug 17 '18 at 18:14
@francescalus, How will F2018 help? Ross appears to want to use a single seed. `random_init` does not solve that problem. — Steve, Aug 17 '18 at 18:31
@Steve, my comment wasn't useful, agreed. My view was, if I want "repeatable" I'll benefit from a simple statement such as from `random_init`; if I want "reproducible" I'll store the seed as metadata and scalar isn't a concern. — francescalus, Aug 17 '18 at 18:41
@Ross: Try the example at https://pastebin.com/URWtR9p4 ; with a newer GFortran you should see that the get= seed differs from the put= one in the last element of the array. — janneb, Aug 17 '18 at 19:17
FWIW, in this discussion with "current GFortran" or "newer GFortran" I refer to GCC >= 7, at that time improvements to the PRNG were made. — janneb, Aug 17 '18 at 19:23

Correctly setting random seeds for repeatability

2 Answers2

Linked

Related