The article below
the hyperlinks is about a new experimental field in Science: Randomics.
Randomics can be applied on many problems, here it is applied on the prime
number distribution. For the basic tools and concepts please see the hyperlinks
below.
SATOCONOR.COM
SATOCONOR.COM Journal
of RANDOMICS
The Study of Patterns and Randomness: Randomics13
Are the prime numbers randomly distributed? Part 3
By Johan Gerard van der Galiën (M.Sc.) and Martin Winer
For comments: johan.van.der.galien@satoconor.com
Version 2.2 November 25, 2006
Download extract randomness
primes (PRSample1.dat 128 Kb)
READ ALSO:
The Supporting Paper: ‘Fundamentals of Randomics’ by J.G. van der Galiën
Abstract:
This paper will NOT
demonstrate that the prime numbers are randomly distributed but we DO
demonstrate that one can extract chaos (information entropy or so called
randomness) from it. The method of successive XOR has been demonstrated in a
supporting paper to extract information entropy from any source or file that
has a True Entropy <> 0. Applying this method on the natural numbers
directly yields extracts with all zero or one bits, consequently in contrast to
the prime number distribution the natural number distribution contains 0
A bit stream where 1’s
represented ODD primes and 0’s represented ODD composite number, was prepared
by using the deterministic method of trial divisions. The (bit count – 1) + bit
count is the value of the natural number a bit represents. The novel method of
successive XOR extractions were applied on pairs of neighboring bits. 5
successive XOR extractions reduces the bias until the Goodness-of-Fit test, one
degree of freedom, Chi-bits and the first order entropy were accordingly to a
random file. The obtained fifth extract file passed all tests of the ENT test
suite and is not compressible anymore with State-of-the-Art algorithms.
A second file, which was a bit
stream where 1’s represented ANY prime number and 0’s ANY composite number, was
prepared using the deterministic sieve of Eratosthenis. The bit count is here
the natural number a bit represents. Novel Winer-Galien bitwise manipulator
algorithms were used, which is based on sieving with Dusart’s or the exact
probability density functions of primes and composites, to reduce the (0’s/1’s)
bias as close as possible to unity.
Three remarkable, but yet
unexplained, and different wavy curves with ever increasing wavelength were
discovered in the cumulative first order entropy of the sieved bit stream (with
either Round(), Floor() or Ceiling() functions used in the algorithm) versus
sample size spectra with a fixed number of bytes as increment. So the three
different wavy curves must an artifact from the sieving algorithm. Nevertheless
there is a trend of increasing entropy for the local minimums and maximums of
the curves. Because of the trend this must mean that the file produced by
sieving becomes more random in the tail. So ultimately one should get a random
file produced this way from the prime number distribution that will pass
standard randomness tests.
1. Introduction
There has been a protracted discussion about the ‘randomness’ of the prime distribution. “God may not play dice with the universe, but something strange is going on with the prime numbers." P. Erdös, referring to the famous quote of Einstein.1 To settle this problem once and for all it was decided to do a measurement of the information entropy, extract randomness from compressed prime number distributions and plot information entropy spectra versus the natural number line.2 Getting insight in the “randomness” of the prime numbers is one of the biggest challenges in science. Many unknown things in science come down to this counter intuitive and strange distribution property in the most ordered set we have: The one of the natural numbers. It is the small black dot in the white field of the famous Ying and Yang symbol. If we understand why there will always remain a fraction of chaos in “pure” order and order in “pure” chaos, we will understand something new about the Laws of Nature, all formed during that brief moment of the Big Bang, and this snap shot of the primordial hiss is our handle to tackle the problem of determinism versus indeterminism. I remind you that Zuse’s cellular automata emulation of the Universe theorema has never been disproven.3 And the chaos in the prime number distribution is believed by us to be of the purest form because for the rest the natural number set has the highest ordering. At least we got a 128 Kb analytical pure sample of this primordial hiss for further research, obtained as will be presented in this paper and downloadable from this site.
Entropy can easily be calculated for π(x) by summing the entropy contribution (Sp) of each individual prime (p) under x:
Sp
= Log2(p) / (p – 1) + Log2(p / (p – 1)) (1.1.)4
Problem with 1.1. is that
as x goes to infinity Sp, with has as unit Numbers per Number, also
goes to infinity. With the method presented in this paper it is possible to
derive finite Shannon information entropy in bits per bit for the prime number
distribution, extract analytical pure randomness from the raw chaos and do a H
measurement on the obtained bit stream (the extract) by compression with
State-of-the-Art algorithms. In the rest of the paper when we talk about
entropy we mean the
There are many good approximations for π(x). The last mentioned below is used in this paper. The oldest one is the Gauss conjecture also known as Li(x):
π(x) ≈ Li(x) = 0∫x 1 / Log(t) (1.2.)6
Log(x) is the same as Ln(x) from your calculator, base e logarithm.
This is actually a Cumulative Distribution Function (CDF) since it is a discrete formula (x is an element of the natural numbers and > 0). The probability that a certain x is a prime number is the derivative of 1.2. In other words the Probability Density Function (PDF):
Density ≈ Li’(x) = 1 / Log(x) (1.3.)
There are better approximations, like the one from Pierre Dusart.7 The CDF range is given as:
(x /
Log(x))(1 + 0.992 / Log(x)) ≤ π(x) ≤ (x / Log(x))(1 + 1.2762 / Log(x)) (1.4.)
The two terms can be differentiated to give the range of the approximate density:
F(x) = (x /
Log(x))(1 + C / Log(x)) (1.5.)
(fg)’ = f’g
+ fg’
f(g)’ =
f’(g)g’
(1 /
Log(x))(1 + C / Log(x)) + (x / Log(x))(1 + C / Log(x))’
(1 + C /
Log(x))’ = (C / Log(x))’ = (- C / Log(x)2)(1 / x)
F’(x) = (1
/ Log(x)) + (C / Log(x)2) – (x / Log(x))(C / Log(x)2x)
=>
(1 /
Log(x)) + (C / Log(x)2) + (C / Log(x)3) (1.6.)
The mean calculated from the two values of 1.6. (C = 0.992 and 1.2762) was used as the best approximation of π(x) density in this paper.
2. Materials and Methods
A
4,194,303 bytes raw file was prepared, in which each bit coded for either an
odd composite (0) or an odd prime (1), with an implementation of the trial
division algorithm to identify the primes. This file was subjected to
successive XOR extractions. The idea of these extractions was to obtain an
about 50/50 distribution of the zero and one bits (Goodness-of-Fit test, one
degree of freedom Chi-bits ≈ 0, 5% probability Chi-bits = 0 and 95% =
3.84) in the extracted file. In other words an H ≈ 1 bits per bit. The
change in Chi-bits and First Order Entropy before and after was monitored for
each extraction. After the extraction the H was measured on the obtained
extract file by means of compression with PAsQDa -6e option.8 H in
bits per bit is actually about the same as the compression ratio from
State-of-the-Art algorithms.9 The randomness of the extract files
was checked by the ENT Test Suite from the Fourmilab (
A 49,807,360 bytes Raw File was prepared in which each bit coded for either a composite (0) or a prime (1), with an implementation of the sieve of Eratosthenis. From this file 120 samples of 400,000 bytes were taken and subjected to the Winer-Galien bitwise manipulator algorithm with 1.6. or exact densities to see if this would remove the (0’s/1’s) bias in the bits towards unity. Counts of the obtained bits were collected and for each sample the first order entropy of the resulting bits was calculated. From this spectra were plotted. Also the cumulative first order entropy was plotted for 1.6.. The method was calibrated with a 10.9 Mb random file from the first hexadecimal digits of PI and the spectra from sieving were compared to the spectrum of the raw file obtained the same way as from the sieving results.
2.1. Winer-Galien’s bitwise manipulator algorithms
Read bits(n)
if bit = 1
Output = 1
if bit = 0
Y = Y + 1
if(Y>Round(1-Result(1.6.(n))/Result(1.6.(n)))
Y= 0
Output = 0
Loop until End Of Sample
Algorithm 1: Bitwise manipulator with Dusart’s densities.
Read bits(n)
if bit = 1
Output = 1
Primes = Primes + 1
if bit = 0
Y = Y + 1
if(Y>Round(1-(Primes/n))/(Primes/n)
Y= 0
Output = 0
Loop until End Of Sample
Algorithm 2: Bitwise manipulator with exact densities.
The rational behind these algorithms is that the density of the composites divided by the density of the primes gives the mean local number of composites per prime number (ratio of the local bias). The Algorithms try to reduce the bias of the whole source file to 1 by giving a zero bit as output when the counter of read zero bits in the source has just passed the ratio of the local bias in the source. The counter is then reset to zero and the process is repeated over and over again until the end of the source file. Algorithm 1 was also tested with Floor() (Graph 6) and Ceiling() (Graph 7) instead of Round() (Graph 5). If not specifically mentioned graphs of sieving processes are made with the Round() function.
3. Results
|
Item |
First order entropy (bits per bit) |
True entropy (H) (bits per bit) PAsQDa -6e |
|
Raw File |
0.523445 |
0.45 |
|
|
|
|
XOR Extract 1 |
0.755539 |
0.75 |
|
XOR Extract 2 |
0.930146 |
0.93 |
|
XOR Extract 3 |
0.994130 |
0.99 |
|
XOR Extract 4 |
0.999972 |
1.00 |
|
XOR Extract 5 |
0.999999 |
1.00 |
Table 1: The Entropy for the different (extract) files.
On all the produced XOR extract files the ENT Test Suite was applied. None of the files passed these tests, except for the fifth XOR extract, which actually passes all of these tests. Since this file is only 128 Kb it is too small to conduct the more advanced tests like DIEHARD and NIST. We can only give you the ENT test results in Log 1.
Entropy
= 0.999999 bits per bit.
Optimum
compression would reduce the size
of
this 1048568 bit file by 0 percent.
Chi
square distribution for 1048568 samples is 0.76, and randomly
would
exceed this value 50.00 percent of the times.
Arithmetic
mean value of data bits is 0.4996 (0.5 = random).
Serial
correlation coefficient is -0.000283 (totally uncorrelated = 0.0).
Entropy
= 7.998706 bits per byte.
Optimum
compression would reduce the size
of
this 131071 byte file by 0 percent.
Chi
square distribution for 131071 samples is 235.71, and randomly
would
exceed this value 75.00 percent of the times.
Arithmetic
mean value of data bytes is 127.5844 (127.5 = random).
Serial
correlation coefficient is -0.002982 (totally uncorrelated = 0.0).
Log 1: The fifth successive XOR extract from the Raw File, ENT Test Suite results.
|
File |
First order entropy (bits per bit) |
Chi-bits |
True entropy PasQDa –6e (bits per bit) |
|
Raw Eratosthenis |
0.300468 |
358210314 |
0.21 |
|
Dusart’s 1.6. densities (Algorithm 1) |
0.999990 |
596 |
0.91 |
|
Exact densities (Algorithm 2) |
0.999998 |
108 |
0.91 |
Table 2: The sieving results for the whole Eratosthenis raw file.
|
File |
Bias (0 bits / 1 bits) |
Size (bytes) |
|
Raw Erastosthenis |
17.74 |
49,807,360 |
|
Dusart’s 1.6. densities (Algorithm 1) |
1.008 |
5,334,555 |
|
Exact densities (Algorithm 2) |
1.003 |
5,323,084 |
Table 2: (continued)

Graph 1: The curve of first order entropy (Y) versus the sample number (X) for the 48 Mb raw file made with the sieve of Eratosthenis, composites are zero bits and prime numbers are one bits. The samples where 400,000 bytes (resolution). There is no sieving in this graph.

Graph 2: The same spectrum as Graph 1 but then zoomed in five times into the tail. Sample size 80,000 bytes (resolution). There is no sieving in this graph.

Graph 3: The curve of first order entropy (Y) versus the sample number (X) for sieving with 1.6. Dusart density. The samples where 400,000 bytes large (resolution).

Graph 4: The curve of first order entropy (Y) versus the sample number (X) for sieving with exact density. The samples where 400,000 bytes large (resolution). The samples refer to X * 400,000 bytes as input.

Graph 5: The cumulative first order entropy (Y) versus sample number (X) for sieving with 1.6. Dusart’s density. The samples had an increment of 400,000 bytes large. The samples refer to X * 400,000 bytes as input.

Graph 6: The cumulative first order entropy (Y) versus sample number (X) for sieving with 1.6. Dusart’s density and the Floor() function. The samples had an increment of 400,000 bytes large. The samples refer to X * 400,000 bytes as input.

Graph 7: The cumulative first order entropy (Y) versus sample number (X) for sieving with 1.6. Dusart’s density and the Ceiling() function. The samples had an increment of 400,000 bytes large. The samples refer to X * 400,000 bytes as input.

Graph 8: Calibrating the Dusart sieving algorithm with a blank file (10.9 Mb of the hexadecimal digits of PI) divided in to 28 samples (X) of 400,000 bytes from with the cumulative first order entropy (Y) was recorded and scatter plotted. The data points were connected with straight lines. The samples refer to X * 400,000 bytes as input.
4. Discussion
From Table 1 can be seen that the two different kinds of entropy are always equal when rounded to the same significant digits for the XOR extractions. Meaning that the files are equally chaotic on a lower and a higher order.
Five successive XOR extractions give the highest entropy and according to the ENT Test Suite this is the only extracted file that can be called random.
H is only accurate to 2 - 3 significant digits because the compression table is also stored in the compressed file. The peculiar situation can occur with a true random file as source that the compressed file is actually a (little) larger than the source file. In case of the fifth XOR extraction the source file is about 128 Kb and the “compressed” file is about 129 Kb. This gives 131.303 / 131,071 = 1,001770033 ≈ 1.00 bits per bit. (Information Entropy cannot exceed 1 bits per bit or 8 bits per byte.)
From Table 2 can be seen that there is an enormous reduction of the bias. In a way that it almost passes the Chi-bits randomness test. Since this test must be passed further randomness tests are useless. Nonetheless the high true entropy indicates that the files are pretty random and have only a small amount of compressible patterns.
Graph 1 shows a spectrum that is consistent with the fact that the one bits (primes) become more scarce as one goes up the natural number line. The spectrum is more or less a hyperbolic function although some fine structure can be seen in the tail. Therefore we zoomed in five times into this tail (Graph 2). This spectrum looks like random noise superposed on a declining trend.
The 1.6. (Graph 3) and exact density (Graph 4) spectra are very similar. So there should be no flaws in the way the formula 1.6. is derived as a mathematical differential of Dusart’s formula. Graph 4 shows a little bit more fine structure and should be regarded as the most accurate, because it uses exact densities. One can only say that Dusart’s CDF 1.4. is a very good approximation.
Round() (Graph 5), Floor() (Graph 6) and Ceiling() (Graph 7) al give a different spectrum. The spreading is very small though about 0.0002, 0.003 and 0.0001 bits per bit meaning that the sieving must be very good. These cumulative graphs are shown because the trends are more clear.
In order to calibrate the Dusart 1.6. algorithm sieving done on a 10.9 Mb random file made from the first 23,068,670 hexadecimal digits of PI. This file served as a blank sample and gave a spectrum like in Graph 8. The graph is almost a perfect hyperbolic curve, there is of course a little randomness or spreading left in the first order entropy of 28 400,000 bytes samples. Decreasing the number of samples and increasing the sample size would of course filter this out. This smoothness of the curve is amplified by the fact that the curve is cumulative. But it would be reaching to far to say that because the blank sample gives an almost smooth curve the wavy curves in Graph 5, 6 and 7 are not an artefact of the sieving algorithms but a Number Theoretic phenomenon.
5. Conclusions
-o0o-
Notes & References:
1) Mackenzie D. ‘Homage to an itinerant master’ Science 275 759 (1997)
2) Anonymous ‘Number theory and entropy’
http://www.maths.ex.ac.uk/~mwatkins/zeta/NTentropy.htm
3) Zuse K. ‘Rechnender Raum’ (1969) See also: Anonymous ‘Calculating space’ Wikipedia, the free encyclopedia
http://en.wikipedia.org/wiki/Calculating_Space
4) Anonymous ‘Prime numbers and entropy’
http://www.7stones.com/Homepage/Publisher/Primes.html
5) Van der Galiën J.G. ‘State-of-the-art compressors as tools for true entropy estimations’ SATOCONOR.COM 4.4. (2005)
6,7) Caldwell C.K. ‘How many primes are there?’ Prime Pages
http://primes.utm.edu/howmany.shtml
8) Mahoney M. ‘The paq data compression programs’
http://www.cs.fit.edu/~mmahoney/compression/
9) Meinsma G. ‘Data compression and information theory’
http://wwwhome.math.utwente.nl/~meinsmag/onzin/shannon.pdf
10) Walker P. ‘Ent: A pseudo random number sequence test program’
http://www.fourmilab.ch/random/
11)
Van der Galiën ‘Are the prime numbers randomly distributed? Part
12)
Van der Galiën ‘Last digit primes never 5: Are the prime numbers randomly
distributed? Part
13)
Van der Galiën ‘Fundamentals of randomics: Supporting paper for: Are the prime numbers
randomly distributed? Part