Formula Forge

Common RNG Mistakes: Modulo Bias, Seeding Pitfalls, and Range Bugs

Random number generation seems straightforward until subtle bugs introduce bias, predictability, or incorrect ranges. These mistakes can corrupt simulations, create unfair games, compromise security, or produce invalid statistical results. Understanding common RNG errors—and how to avoid them—prevents costly bugs and ensures reliable random number generation.

The most dangerous RNG mistakes are those that appear to work correctly. Code might run without errors, generate numbers that "look random," and pass cursory inspection. Yet hidden biases can systematically skew results, making simulations unreliable, games unfair, or statistical tests invalid. Learning to recognize and fix these issues is essential for anyone working with random numbers.

Generate random numbers correctly using our Random Number Generator, then review your code against these common mistakes to ensure reliability.

Mistake 1: Modulo Bias

Modulo bias is one of the most common and insidious RNG errors. It occurs when using the modulo operator (%) to map a larger range to a smaller range, creating unequal probabilities for different outcomes.

The Problem

Incorrect Implementation:

import random

# Trying to generate random integer from 1 to 10
bad_random = random.randint(0, 2**31 - 1) % 10 + 1

Why This Fails: The RNG produces values 0 to 2³¹-1, and we want 1-10:

  • There are 2³¹ = 2,147,483,648 possible RNG values
  • 2³¹ ÷ 10 = 214,748,364 remainder 8
  • Remainders 0-7 occur 214,748,365 times each, so outputs 1-8 do too
  • Remainders 8-9 occur 214,748,364 times each, so outputs 9-10 do too

Result: Values 1-8 are slightly more likely than 9-10, creating bias.

The Fix

Option 1: Use Library Functions

import random

# Correct: Use built-in range function
correct_random = random.randint(1, 10)

Option 2: Rejection Sampling

import random

def uniform_range(min_val, max_val):
    """Generate uniform integer in [min_val, max_val] without bias."""
    range_size = max_val - min_val + 1
    max_valid = (2**31 // range_size) * range_size
    
    while True:
        candidate = random.randint(0, 2**31 - 1)
        if candidate < max_valid:
            return min_val + (candidate % range_size)
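A quick empirical check of the rejection-sampling approach; the snippet restates the function above so it runs standalone:

```python
import random
from collections import Counter

def uniform_range(min_val, max_val):
    """Generate uniform integer in [min_val, max_val] without bias."""
    range_size = max_val - min_val + 1
    # Largest multiple of range_size that fits below 2**31
    max_valid = (2**31 // range_size) * range_size
    while True:
        candidate = random.randint(0, 2**31 - 1)
        if candidate < max_valid:  # reject candidates in the biased tail
            return min_val + (candidate % range_size)

counts = Counter(uniform_range(1, 10) for _ in range(100000))
for value in range(1, 11):
    print(f"{value}: {counts[value]}")  # each count should be near 10,000
```

Rejection is cheap here: only 8 of the 2³¹ candidate values fall in the biased tail, so retries are vanishingly rare.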

Option 3: Use Appropriate Range

import random

# If RNG generates 0-99, and you want 0-9:
# 100 is divisible by 10, so modulo works correctly
value = random.randint(0, 99) % 10  # No bias here

Detecting Modulo Bias

Test Method:

import random
from collections import Counter

# With a 2**31 range the per-value bias is under one part per billion,
# far too small to detect empirically. Exaggerate it with a tiny range:
# 16 possible values mapped onto 10 outcomes
values = [random.randint(0, 15) % 10 for _ in range(100000)]
counts = Counter(values)

# Check if counts are roughly equal
expected = 100000 / 10
for value, count in sorted(counts.items()):
    deviation = abs(count - expected) / expected
    print(f"{value}: {count} (deviation: {deviation:.2%})")
Large deviations indicate bias.

Mistake 2: Predictable Seeding

Seeding PRNGs with predictable values makes sequences guessable, which is problematic for security and can create patterns in simulations.

The Problem

Predictable Seeds:

import random
import time
import os

# Bad: Current time is predictable
random.seed(int(time.time()))

# Bad: Process ID is predictable
random.seed(os.getpid())

# Bad: Sequential seeds
for i in range(10):
    random.seed(i)  # Predictable pattern

Security Implications: If an attacker knows approximate time or can guess the seed, they can predict the sequence:

  • Game cheats can predict outcomes
  • Security tokens become guessable
  • Simulations become reproducible by attackers
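The attack is trivial to sketch: anyone who can guess the seed reproduces the entire sequence. The timestamp below is an arbitrary example value:

```python
import random

# Victim seeds with a known timestamp
random.seed(1700000000)
victim_values = [random.randint(0, 9999) for _ in range(5)]

# Attacker replays the guessed seed and gets identical output
random.seed(1700000000)
attacker_values = [random.randint(0, 9999) for _ in range(5)]

print(victim_values == attacker_values)  # True: fully predictable
```

If the attacker only knows the approximate time, a few thousand candidate seeds around it are enough to brute-force the match.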

The Fix

For Reproducibility (Research/Testing):

import random

# Fixed, documented seed
random.seed(42)  # Document why this seed was chosen

For Production Uniqueness:

import random
import secrets

# Use cryptographically secure random for seed
random.seed(secrets.randbits(64))

For Production with Some Entropy:

import random
import time
import os

# Combine multiple sources (still guessable; not suitable for security)
seed = int(time.time()) ^ (os.getpid() << 16) ^ (id(object()) & 0xFFFF)
random.seed(seed)

Critical: Never use predictable seeds for security applications. Use CSPRNGs (secrets, crypto.getRandomValues()) instead.

Mistake 3: Off-by-One Range Errors

Confusing inclusive vs. exclusive bounds creates out-of-range values or missing endpoints.

The Problem

Common Confusions:

Issue 1: Exclusive vs Inclusive

# RNG generates [0, 1) - exclusive of 1
value = random.random()  # 0 <= value < 1

# Mistake: Assuming inclusive
if value == 1.0:  # This never happens!
    print("Got maximum value")

Issue 2: Array Indexing

import random

items = ['a', 'b', 'c', 'd', 'e']

# Wrong: randint includes both endpoints, so this can return 5
# (an out-of-range index for a 5-element list)
bad_index = random.randint(0, len(items))  # IndexError possible

# Correct: Use randint with inclusive bounds
good_index = random.randint(0, len(items) - 1)

# Also fine: random() is in [0, 1), so truncation never reaches len(items)
ok_index = int(random.random() * len(items))  # always 0-4

Issue 3: Range Boundaries

import random

# Want: Random integer from 5 to 10 inclusive
# Wrong: randrange excludes the upper bound, so this generates 5-9
wrong = random.randrange(5, 10)

# Correct: randint includes both bounds, so this generates 5-10
correct = random.randint(5, 10)

The Fix

Always Verify Boundaries:

import random

def test_range(min_val, max_val, n_samples=10000):
    """Verify random range includes both endpoints."""
    values = [random.randint(min_val, max_val) for _ in range(n_samples)]
    assert min(values) == min_val, "Minimum not reached"
    assert max(values) == max_val, "Maximum not reached"
    print(f"Range [{min_val}, {max_val}] verified")

test_range(5, 10)

For Floating-Point Ranges:

import random

# Want: the upper endpoint 1.0 to be reachable
# random.random() returns [0, 1), so 1.0 never occurs on its own:
value = 1.0 - random.random()  # (0, 1]: upper endpoint now reachable

# random.uniform(0, 1) is documented as 0 <= value <= 1, but whether
# the endpoint 1.0 actually occurs depends on floating-point rounding
value = random.uniform(0, 1)
# Check documentation for inclusive/exclusive behavior

Mistake 4: Assuming Independence Where None Exists

Transformations can introduce correlation even when the underlying RNG is independent.

The Problem

Shuffling with Bias:

import random

def bad_shuffle(items):
    """Naive shuffle creates bias."""
    for i in range(len(items)):
        j = random.randint(0, len(items) - 1)  # Can pick same index!
        items[i], items[j] = items[j], items[i]
    return items

# This creates bias: the loop produces n**n equally likely swap sequences
# but only n! permutations, and n**n is not divisible by n! for n > 2
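The bias is easy to observe for a 3-element list: the naive loop produces 3³ = 27 equally likely swap sequences but only 3! = 6 permutations, and 27 is not divisible by 6. A quick empirical count (restating the naive shuffle so the snippet runs standalone):

```python
import random
from collections import Counter

def naive_shuffle(items):
    """Biased shuffle: each position swaps with any position, including itself."""
    for i in range(len(items)):
        j = random.randint(0, len(items) - 1)
        items[i], items[j] = items[j], items[i]
    return items

# Count how often each permutation of [0, 1, 2] appears
counts = Counter(tuple(naive_shuffle([0, 1, 2])) for _ in range(60000))
for perm, count in sorted(counts.items()):
    print(perm, count)  # a fair shuffle would give ~10,000 each
```

Some permutations appear with probability 4/27 and others 5/27, so the counts split into two clearly separated groups instead of clustering around 10,000.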

Correlated Transformations:

import random

# These are correlated, not independent
x = random.gauss(0, 1)
y = x + random.gauss(0, 1)  # y depends on x

# If you need independent samples:
x = random.gauss(0, 1)
y = random.gauss(0, 1)  # Truly independent

The Fix

Use Proven Algorithms:

import random

# Correct: Fisher-Yates shuffle (built into random.shuffle)
items = [1, 2, 3, 4, 5]
random.shuffle(items)  # Uniformly random permutation

For Custom Shuffling:

import random

def fisher_yates_shuffle(items):
    """Fisher-Yates shuffle algorithm."""
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)  # j in [0, i] inclusive
        items[i], items[j] = items[j], items[i]
    return items

Ensure Independence:

import random

# Independent samples
sample1 = [random.gauss(0, 1) for _ in range(100)]
sample2 = [random.gauss(0, 1) for _ in range(100)]

# Verify independence (should have correlation near 0)
import numpy as np
correlation = np.corrcoef(sample1, sample2)[0, 1]
print(f"Correlation: {correlation:.4f}")  # Should be near 0

Mistake 5: Insufficient Samples in Tests

Testing RNG quality requires large samples. Small samples can pass bad generators or fail good ones by chance.

The Problem

Inadequate Testing:

import random
from collections import Counter

# Testing with only 100 samples
values = [random.randint(1, 10) for _ in range(100)]
counts = Counter(values)

# Might pass or fail by chance, not due to actual quality
for value, count in counts.items():
    if count < 5 or count > 15:  # Arbitrary thresholds
        print(f"Suspicious: {value} appears {count} times")

Statistical Tests Need Power:

  • Chi-square tests require expected frequency ≥ 5 per bin
  • Small samples lack power to detect subtle biases
  • Random variation can mask or create false patterns
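To see why 100 samples and ad-hoc thresholds are unreliable, simulate many such tests against a perfectly fair generator and count how often the 5/15 thresholds above raise a false alarm:

```python
import random
from collections import Counter

false_alarms = 0
for _ in range(1000):
    counts = Counter(random.randint(1, 10) for _ in range(100))
    # Same arbitrary thresholds as the inadequate test above
    if any(counts[v] < 5 or counts[v] > 15 for v in range(1, 11)):
        false_alarms += 1
print(f"False alarms: {false_alarms}/1000")  # typically hundreds
```

A large fraction of runs flag a generator that has nothing wrong with it; the thresholds are detecting sampling noise, not bias.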

The Fix

Use Appropriate Sample Sizes:

import random
from collections import Counter

# Test with adequate sample size
n_samples = 100000
values = [random.randint(1, 10) for _ in range(n_samples)]
counts = Counter(values)

# Expected frequency
expected = n_samples / 10

# Chi-square test
chi_square = sum((count - expected)**2 / expected for count in counts.values())
print(f"Chi-square statistic: {chi_square:.2f}")

# Critical value for 9 degrees of freedom, α=0.05: 16.92
if chi_square > 16.92:
    print("Reject uniformity - possible bias detected")
else:
    print("Consistent with uniform distribution")

Multiple Test Runs:

import random
from collections import Counter

def test_uniformity(n_samples, n_bins=10, critical=16.92):
    """Chi-square uniformity test on randint(1, n_bins); True if it passes."""
    counts = Counter(random.randint(1, n_bins) for _ in range(n_samples))
    expected = n_samples / n_bins
    chi_square = sum((counts[v] - expected)**2 / expected
                     for v in range(1, n_bins + 1))
    return chi_square <= critical  # critical value for 9 df at alpha = 0.05

def test_rng_quality(n_samples=100000, n_tests=10):
    """Run multiple tests to verify consistency."""
    passed = sum(test_uniformity(n_samples) for _ in range(n_tests))
    print(f"Passed {passed}/{n_tests} tests")
    # At alpha = 0.05, about 1 run in 20 fails by chance, so tolerate
    # a single failure rather than demanding a perfect score
    return passed >= n_tests - 1

Mistake 6: Using PRNGs for Security Tasks

Standard PRNGs are predictable and vulnerable. Never use them for security applications.

The Problem

Security Anti-Patterns:

import random

# Never do this for security:
session_token = ''.join(random.choice('abcdefghijklmnopqrstuvwxyz') 
                        for _ in range(32))

# Never do this:
encryption_key = random.getrandbits(256)

# Never do this:
password = ''.join(chr(random.randint(33, 126)) for _ in range(16))

Why This Fails:

  • PRNGs are predictable given enough output
  • Seeds can be guessed or inferred
  • Vulnerable to state recovery attacks

The Fix

Use CSPRNGs:

import secrets

# Correct: Cryptographically secure
session_token = secrets.token_urlsafe(32)
encryption_key = secrets.token_bytes(32)
password = secrets.token_hex(16)

Python Secrets Module:

import secrets

# Random integer in range
value = secrets.randbelow(100)

# Random choice from sequence
item = secrets.choice(['a', 'b', 'c'])

# Random bytes
token = secrets.token_bytes(16)

# URL-safe token
url_token = secrets.token_urlsafe(16)
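As a direct replacement for the password anti-pattern earlier (which drew printable ASCII codes 33-126 from random), the same character set built with the standard string module and sampled with secrets:

```python
import secrets
import string

# Secure equivalent of the random.randint(33, 126) password loop:
# letters + digits + punctuation covers exactly printable ASCII 33-126
alphabet = string.ascii_letters + string.digits + string.punctuation
password = ''.join(secrets.choice(alphabet) for _ in range(16))
print(len(password))  # 16
```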

Quick Checklist

Before deploying RNG code, verify:

  • [ ] No modulo bias: Using library functions or proper rejection sampling
  • [ ] Correct ranges: Inclusive/exclusive bounds verified, endpoints tested
  • [ ] Appropriate seeds: Fixed for reproducibility, secure for production
  • [ ] Independence: No accidental correlation between random draws
  • [ ] Adequate testing: Large samples (10,000+) for quality verification
  • [ ] Security: Using CSPRNGs for passwords, tokens, keys
  • [ ] Documentation: Seed choices and RNG algorithms documented

Conclusion

RNG mistakes are easy to make and hard to detect. Modulo bias, predictable seeds, range errors, and independence assumptions can silently corrupt results. Understanding these common pitfalls—and how to avoid them—prevents bugs and ensures reliable random number generation.

Always use library functions when possible (they handle edge cases correctly), test with adequate sample sizes, and choose appropriate RNGs for your application (PRNGs for statistics, CSPRNGs for security). When in doubt, consult RNG literature or use established libraries rather than implementing custom solutions.

For reliable random number generation, use our Random Number Generator, which implements best practices to avoid these common mistakes. Then review your code against this checklist to ensure correctness.

For more on RNG best practices, explore our articles on testing RNG uniformity and seeding and repeatability.

FAQs

Why is modulo bias bad?

Modulo bias creates unequal probabilities for different outcomes. Even small biases compound over millions of trials, corrupting simulations, making games unfair, or invalidating statistical tests. Some outcomes become systematically more likely than others.

How do I know if my RNG has bias?

Test with large samples (10,000+ values). Use chi-square tests for discrete outcomes, Kolmogorov-Smirnov tests for continuous. Compare observed frequencies to expected. Consistent deviations indicate bias.

Can I fix modulo bias after the fact?

No. Once values are generated with bias, you can't remove it. You must fix the generation method. Use library functions or proper rejection sampling from the start.

Is seed predictability always a problem?

For security applications, yes—predictable seeds make sequences guessable. For research and testing, predictable seeds are desirable (fixed seeds enable reproducibility). Choose based on your application's needs.

What's the difference between PRNGs and CSPRNGs?

PRNGs are fast and reproducible but predictable—good for simulations and games. CSPRNGs are designed to resist prediction attacks—essential for security applications like password generation. Never use PRNGs for security.
