Testing RNG Uniformity: Visual Checks, Statistical Tests, and Sample Sizes
Uniformity is the cornerstone of quality random number generation. A uniform random number generator produces each possible outcome with equal probability—whether generating integers, floating-point numbers, or selecting from discrete options. Testing for uniformity ensures your random number generator behaves correctly, which is critical for fair simulations, unbiased sampling, and reliable statistical analysis.
When a random number generator fails the uniformity test, bias creeps into your results. This bias can be subtle—perhaps certain numbers appear 0.1% more often than others—but over millions of trials, even small biases compound into significant errors. Whether you're building a game, running scientific simulations, or generating random samples, verifying uniformity protects against these insidious failures.
Generate random numbers and test their properties using our Random Number Generator tool, then apply these testing methods to verify uniformity.
Understanding Uniformity
Uniformity means that every possible outcome has an equal probability of occurring. For a uniform distribution over integers 1 through 10, each number should appear about 10% of the time in a large sample. For a continuous uniform distribution over [0, 1), intervals of equal width should have equal probability.
Discrete Uniformity
For discrete outcomes (like rolling a die or selecting from a list):
Expected frequency: If generating N numbers from k possible outcomes, each outcome should appear approximately N/k times.
Example: Generating 10,000 random integers from 1 to 10:
- Expected frequency: 10,000 / 10 = 1,000 occurrences per number
- Observed frequencies should cluster around 1,000
Continuous Uniformity
For continuous distributions (like floating-point numbers in [0, 1)):
Equal probability intervals: Any interval [a, b] within [0, 1) should contain proportion (b - a) of all generated values.
Example: For uniformly distributed values in [0, 1):
- Interval [0, 0.5) should contain ~50% of values
- Interval [0.5, 1.0) should contain ~50% of values
- Interval [0, 0.1) should contain ~10% of values
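These interval checks can be sketched in a few lines of Python, using the stdlib random module as a stand-in for whatever generator you are testing:

```python
import random

random.seed(42)
values = [random.random() for _ in range(100_000)]

def interval_proportion(values, lo, hi):
    """Fraction of values falling in [lo, hi)."""
    return sum(lo <= v < hi for v in values) / len(values)

# The proportion in each interval should be close to its width.
for lo, hi in [(0.0, 0.5), (0.5, 1.0), (0.0, 0.1)]:
    print(f"[{lo}, {hi}) -> {interval_proportion(values, lo, hi):.3f}"
          f" (expected {hi - lo:.2f})")
```

With 100,000 values the sampling error of each proportion is well under 1%, so deviations much larger than that are a warning sign.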
Visual Inspection Methods
Before diving into statistical tests, visual inspection provides quick insights into RNG behavior and can reveal obvious problems.
Histograms
Histograms display the frequency distribution of generated values, making it easy to spot imbalances.
Procedure:
- Generate a large sample (10,000+ values)
- Divide the range into bins (10-50 bins typically)
- Count occurrences in each bin
- Plot histogram with expected frequency line
Interpreting Histograms:
- Uniform distribution: Bars should have similar heights with random variation
- Bias indicators: Systematic patterns, repeated spikes, or gaps
- Smoothness: Uniform distributions should appear smooth, not jagged
Example: Testing a Die Simulator. Generate 60,000 rolls and create a histogram with 6 bins. Each bin should contain approximately 10,000 occurrences. If one bin consistently shows 11,000+ occurrences while another shows fewer than 9,000, you've detected bias.
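A minimal text-based version of this die check, again using Python's stdlib random module as the generator under test:

```python
import random
from collections import Counter

random.seed(1)
rolls = [random.randint(1, 6) for _ in range(60_000)]
counts = Counter(rolls)

# Each face should land near 60,000 / 6 = 10,000 times.
for face in range(1, 7):
    bar = "#" * (counts[face] // 250)  # crude text histogram
    print(f"{face}: {counts[face]:5d} {bar}")
```

For a fair simulator, the binomial standard deviation per face is about 91 here, so counts more than a few hundred away from 10,000 merit investigation.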
Quantile-Quantile (Q-Q) Plots
Q-Q plots compare the distribution of your sample against the expected uniform distribution.
Procedure:
- Generate sample values
- Sort sample values
- Compare sorted sample quantiles against theoretical uniform quantiles
- Plot points—should form a straight diagonal line
Interpreting Q-Q Plots:
- Straight diagonal line: Indicates uniform distribution
- Curved line: Suggests skewness or bias
- S-shaped curve: Indicates systematic deviation from uniformity
Practical Tip: Use statistical software (R's qqplot, or Python with scipy/statsmodels plus matplotlib) to generate Q-Q plots from your samples.
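Without a plotting library, the Q-Q comparison reduces to sorting the sample and pairing it with the theoretical uniform quantiles (i - 0.5)/n; a sketch:

```python
import random

random.seed(7)
n = 10_000
sample = sorted(random.random() for _ in range(n))

# Theoretical uniform quantile for the i-th order statistic.
theoretical = [(i - 0.5) / n for i in range(1, n + 1)]

# On a Q-Q plot these pairs should lie near the diagonal y = x;
# here we just measure the worst vertical distance from it.
max_dev = max(abs(s - t) for s, t in zip(sample, theoretical))
print(f"max deviation from diagonal: {max_dev:.4f}")
```

Plotting `sample` against `theoretical` gives the Q-Q plot itself; the scalar `max_dev` is a quick numeric proxy for how far the points stray from the diagonal.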
Runs Plots
Runs plots display consecutive values, revealing patterns that histograms might miss.
Procedure:
- Generate sequence of values
- Plot values in order (or pairs of consecutive values)
- Look for clustering, cycles, or patterns
What to Look For:
- Random scatter: Good uniformity
- Clustering: Values grouping together suggests correlation
- Cycles: Repeating patterns indicate periodicity or poor seeding
- Gaps: Missing regions suggest range problems
Example: Testing 2D Uniformity. Plot pairs of consecutive values (x_i, x_{i+1}) as points in 2D space, where i is the index position. A uniform generator should scatter points evenly across the unit square; clusters or bands indicate correlation between consecutive values.
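A plot-free version of this check computes the lag-1 serial correlation and the 2D cell counts directly; a sketch using Python's stdlib (the 4x4 grid size is an arbitrary choice for illustration):

```python
import random
from math import sqrt

random.seed(3)
xs = [random.random() for _ in range(50_000)]
a, b = xs[:-1], xs[1:]  # consecutive pairs (x_i, x_{i+1})

# Lag-1 serial correlation; a good uniform generator gives ~0.
n = len(a)
ma, mb = sum(a) / n, sum(b) / n
cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
var_a = sum((x - ma) ** 2 for x in a) / n
var_b = sum((y - mb) ** 2 for y in b) / n
r = cov / sqrt(var_a * var_b)
print(f"lag-1 correlation: {r:+.4f}")

# 2D coverage check: split the unit square into a 4x4 grid and count
# pairs per cell; each cell should hold ~1/16 of all pairs.
grid = [[0] * 4 for _ in range(4)]
for x, y in zip(a, b):
    grid[min(int(x * 4), 3)][min(int(y * 4), 3)] += 1
```

Wildly uneven cell counts, or a correlation far from zero, are the numeric equivalents of the clusters and bands you would spot on the plot.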
Statistical Tests for Uniformity
Visual inspection reveals obvious problems, but statistical tests provide quantitative measures of uniformity with defined significance levels.
Chi-Square Goodness-of-Fit Test
The chi-square test is the most common method for testing discrete uniformity.
Procedure:
- Generate N values from k possible outcomes
- Count observed frequency O_i for each outcome i
- Calculate expected frequency E_i = N / k
- Compute chi-square statistic: χ² = Σ((O_i - E_i)² / E_i)
- Compare to chi-square distribution with (k-1) degrees of freedom
Example Calculation: Testing 10,000 integers from 1 to 10:
- Expected per number: 1,000
- Observed: [998, 1002, 995, 1008, 1001, 997, 1003, 999, 1004, 993]
- χ² = (998-1000)²/1000 + (1002-1000)²/1000 + ... = 0.182
- Critical value (α=0.05, df=9): 16.92
- Since 0.182 < 16.92, we fail to reject uniformity (good result)
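The full calculation is easy to script; this sketch hard-codes the df = 9, α = 0.05 critical value (16.92) from a standard chi-square table rather than computing a p-value:

```python
import random
from collections import Counter

def chi_square_uniform(values, k):
    """Chi-square statistic against a uniform distribution over 1..k."""
    n = len(values)
    expected = n / k
    counts = Counter(values)
    return sum((counts[i] - expected) ** 2 / expected for i in range(1, k + 1))

random.seed(0)
sample = [random.randint(1, 10) for _ in range(10_000)]
chi2 = chi_square_uniform(sample, 10)

CRITICAL_9DF = 16.92  # alpha = 0.05, df = k - 1 = 9
print(f"chi-square = {chi2:.3f}, pass = {chi2 < CRITICAL_9DF}")
```

In practice you would use a library routine (e.g. scipy.stats.chisquare) to get an exact p-value instead of a tabulated critical value.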
Requirements:
- Expected frequency ≥ 5 per bin (preferably ≥ 10)
- Independent observations
- Large sample size (typically 1,000+)
Limitations:
- Less sensitive to local deviations than global tests
- Requires sufficient sample size per bin
- Can miss subtle patterns
Kolmogorov-Smirnov (KS) Test
The KS test compares the empirical cumulative distribution function (CDF) to the theoretical uniform CDF.
Procedure:
- Generate N values and sort them
- Compute empirical CDF F_n(x)
- Compare to theoretical uniform CDF F(x) = x
- Calculate maximum difference: D = max|F_n(x) - F(x)|
- Compare to KS distribution critical values
Advantages:
- Works for continuous distributions
- No binning required
- Sensitive to deviations across the entire range
Limitations:
- Less powerful for detecting tail deviations
- Requires knowledge of the theoretical distribution
Example: Testing [0, 1) Uniformity
Generate 10,000 values, sort them, and compute the maximum difference between empirical and theoretical CDFs. Large differences indicate non-uniformity.
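A hand-rolled version of the KS statistic; this sketch uses the asymptotic α = 0.05 critical value ≈ 1.358/√n, an approximation that is reasonable for large n:

```python
import math
import random

def ks_statistic(values):
    """One-sample KS statistic against the uniform CDF F(x) = x on [0, 1)."""
    xs = sorted(values)
    n = len(xs)
    # At each sorted value, check the gap to the empirical CDF just
    # before and just after the step at that point.
    return max(max(x - i / n, (i + 1) / n - x) for i, x in enumerate(xs))

random.seed(11)
n = 10_000
d = ks_statistic([random.random() for _ in range(n)])

critical = 1.358 / math.sqrt(n)  # asymptotic alpha = 0.05 cutoff
print(f"D = {d:.4f}, critical = {critical:.4f}, pass = {d < critical}")
```

For exact p-values, scipy.stats.kstest(sample, "uniform") does the same comparison without the asymptotic approximation.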
Anderson-Darling Test
The Anderson-Darling test is similar to KS but gives more weight to tail deviations.
Advantages:
- More sensitive to tail behavior than KS
- Better for detecting deviations at distribution extremes
- Commonly used in statistical quality control
When to Use:
- When tail behavior is critical
- When you suspect problems at distribution boundaries
- When you need higher sensitivity than KS test
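scipy.stats.anderson covers only a few named distributions (normal, exponential, and others), not the uniform, but the A² statistic for a fully specified distribution is short enough to write by hand. This sketch uses the asymptotic α = 0.05 critical value 2.492 for the fully specified case:

```python
import math
import random

def anderson_darling_uniform(values):
    """A-squared statistic against the uniform distribution on (0, 1).

    Assumes all values lie strictly inside (0, 1) so the logs are defined.
    """
    u = sorted(values)
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n

random.seed(5)
a2 = anderson_darling_uniform([random.random() for _ in range(10_000)])

# Values above ~2.492 reject uniformity at alpha = 0.05.
print(f"A^2 = {a2:.3f}, pass = {a2 < 2.492}")
```

The (2i-1) weighting is what concentrates sensitivity near 0 and 1, which is why this test catches boundary problems that KS misses.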
Practical Testing Procedure
A systematic approach ensures thorough testing:
Step 1: Generate Large Sample
Sample Size Guidelines:
- Coarse checks: 1,000-10,000 values
- Thorough testing: 100,000-1,000,000 values
- Research-grade: 1,000,000+ values
Rule of Thumb: For chi-square with k bins, ensure expected frequency ≥ 5 per bin. If testing 10 bins, generate at least 50 values, but 1,000+ provides better power.
Step 2: Visual Inspection
Start with visual methods:
- Create histogram with appropriate binning
- Generate Q-Q plot
- Create runs plot for sequence analysis
Action: If visual inspection reveals obvious problems, fix the RNG before proceeding to statistical tests.
Step 3: Statistical Testing
Apply multiple tests:
- Chi-square for discrete uniformity
- KS or Anderson-Darling for continuous uniformity
- Run tests on multiple samples with different seeds
Best Practice: Use multiple seeds to ensure test results aren't seed-specific. A good RNG should pass tests across various seeds.
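The multi-seed loop can be sketched by rerunning the chi-square check with different seeds, again using a fixed critical value (df = 9, α = 0.05):

```python
import random
from collections import Counter

def chi2_passes(seed, n=10_000, k=10, critical=16.92):
    """One chi-square uniformity check with a given seed (df = k - 1 = 9)."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, k) for _ in range(n))
    expected = n / k
    chi2 = sum((counts[i] - expected) ** 2 / expected
               for i in range(1, k + 1))
    return chi2 < critical

results = [chi2_passes(seed) for seed in range(10)]
print(f"{sum(results)}/10 seeds passed")
# At alpha = 0.05, expect roughly 1 in 20 runs to fail even for a good RNG.
```

A single failing seed is not alarming; consistent failures across seeds are.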
Step 4: Interpret Results
P-values:
- p < 0.05: Reject uniformity (evidence of bias)
- p ≥ 0.05: Fail to reject uniformity (consistent with uniform distribution)
Important: Failing to reject doesn't prove uniformity—it means no evidence of non-uniformity was found. Multiple tests with consistent results increase confidence.
Step 5: Document and Report
Record:
- Sample size
- Number of bins (if applicable)
- Test statistics and p-values
- Seeds used
- Any visual observations
Common Pitfalls
Pitfall 1: Insufficient Sample Size Small samples lack statistical power to detect subtle biases. Use thousands to hundreds of thousands of values for meaningful tests.
Pitfall 2: Poor Binning Choices Uneven bin sizes or too few bins distort chi-square results. Use equal-sized bins and ensure adequate expected frequencies.
Pitfall 3: Testing Single Seed One seed might produce a "lucky" sequence that passes tests. Test multiple seeds to ensure consistency.
Pitfall 4: Over-Interpreting Visual Results Visual patterns can be misleading. Small random variations look like patterns. Always supplement visual inspection with statistical tests.
Pitfall 5: Ignoring Multiple Testing Running many tests increases the chance of false positives. Use Bonferroni correction or focus on a few key tests.
Worked Example: Testing a Custom RNG
Scenario: Testing a custom RNG that generates integers from 1 to 20.
Step 1: Generate 100,000 values.
Step 2: Visual inspection - histogram shows roughly equal bars (good sign).
Step 3: Chi-square test:
- Expected per number: 5,000
- Observed frequencies range from roughly 4,860 to 5,140 (deviations consistent with the binomial standard deviation of ≈ 69 per bin)
- χ² = 18.4
- Critical value (α=0.05, df=19): 30.14
- p-value ≈ 0.50 (fail to reject uniformity)
Step 4: Repeat with 10 different seeds - all pass chi-square test.
Conclusion: RNG appears uniform for this use case.
Conclusion
Testing RNG uniformity is essential for ensuring reliable random number generation. Visual methods provide quick insights, while statistical tests offer quantitative assessments. A combination of both approaches, applied to large samples across multiple seeds, provides confidence in RNG quality.
Remember that no test proves perfection—they only detect deviations from uniformity. Multiple tests with consistent results increase confidence, but good RNGs should pass tests across various conditions and seeds.
For practical random number generation, use our Random Number Generator, which employs high-quality algorithms designed to pass standard uniformity tests. Then apply these testing methods to verify the results meet your specific requirements.
For more on RNG quality, explore our articles on seeding and repeatability, common RNG mistakes, and true vs pseudo-randomness.
FAQs
How many samples do I need for reliable testing?
For chi-square tests, aim for at least 5 expected observations per bin (preferably 10+). For 10 bins, that means 50-100+ samples minimum, but 1,000+ provides better statistical power. For thorough testing, use 100,000+ samples.
What if my RNG fails uniformity tests?
First, verify your test implementation is correct. If tests are valid, the RNG likely has bias. Consider using a different algorithm (modern PRNGs like PCG or xoshiro), check seeding practices, or investigate range conversion issues (modulo bias).
Can visual inspection replace statistical tests?
No. Visual inspection can reveal obvious problems but lacks the rigor of statistical tests. Use visual methods for initial screening, then follow up with statistical tests for quantitative assessment.
How often should I test my RNG?
Test during development and when changing RNG implementations. For production systems, periodic testing (monthly or quarterly) helps catch degradation or implementation issues. Test after system updates that might affect RNG behavior.
Is perfect uniformity achievable?
Theoretical perfect uniformity requires infinite samples. In practice, good RNGs produce distributions that are statistically indistinguishable from uniform given appropriate sample sizes. Focus on RNGs that pass standard statistical tests rather than seeking perfect uniformity.
