Clash Series: Checking Number is Prime

clash of code
algorithms
python
Author

Salman Faris

Published

March 5, 2022

This is the first of hopefully many more posts on Clash of Code tricks I learn along the way. I call this the Clash Series and in today’s series, we look at how to write an efficient script to check for prime numbers, and how to write it fast python for those fastest mode clashes.

prime number
An integer p is a prime number (or simply prime) if p > 1 and it is not divisible by any integer except 1 and itself. If p is not prime, we say that it is a composite.

Equivalently, we say p is prime if p > 1 and whenever it decomposes into p = ab, then either a=1, b=p or a=p, b=1. Thus, if p is not prime, then there is a decomposition of p into a product of positive integers a, b that are not necessarily 1 or p; hence, the name composite.

The first few prime numbers are 2, 3, 5, 7, 11 and so on, while the first few composite numbers are 1, 4, 6, 8, 9 and so on.

Now here is what we are interested in.

The problem

problem statement
Given an integer n \in \mathbb{Z}, how to enumerate all prime numbers less than n?

A naive enumeration algorithm can be achieved in the following way: For any given integer n,

  1. If n < 2, then n cannot be prime.
  2. If n = 2, then n is prime.
  3. Loop over all integers k \in [3, n). If there exists k such that k \equiv 0 \bmod n, then n is not prime. Otherwise n is prime.

A python code for this naive implementation is as follows.

def is_prime_naive(n: int) -> bool:
    # 1 and any negative integer are not prime.
    if n < 2:
        return False
    # 2 is prime.
    if n == 2:
        return True
    for k in range(2, n):
        # If n is divisible by k for any k < n, n is not prime.
        if n % k == 0:
            return False
    return True

The time complexity for this algorithm is \mathcal{O}(n) where n is the given integer. This is already good in most cases, but can we do better?

Well any integer n > 2 is always divisible by 2 if it is even, and therefore cannot be prime. So we can add a step to check if n is even or not before doing the looping step. As a result, our algorithm running time is cut in half as we now only have to loop over only odd numbers less than n.

def is_prime_better(n: int) -> bool:
    # 1 and any negative integer are not prime.
    if n < 2:
        return False
    # 2 is prime.
    if n == 2:
        return True
    # NEW: If n is even and n is not 2, then it is not prime.
    if n % 2 == 0:
        return False
    for k in range(3, n, 2):
        # NEW: If n is divisible by odd k for any k < n, n is not prime.
        if n % k == 0:
            return False
    return True

The problem, however, is that the theoretical time complexity of this algorithm is still \mathcal{O}(n/2) = \mathcal{O}(n). So good for small n but still problematic for super large n.

Efficient algorithm (sqrt trick)

So what can we do to make this code faster? Enter… the square root (sqrt) trick.

First, without loss of generality, let’s assume that n is a positive integer since if n < 2, then it’s taken care by the n < 2 check. We now make the following claim.

Lemma 1 (Sqrt Decomposition) Let n > 1 be a positive integer. If n = ab is a composite integer such that 0 < a \leqslant b, then a \leqslant \sqrt{n} \leqslant b.

Proof. First observe that n = ab is a perfect square if and only if a = b = \sqrt{n}, in which case we are done.

So suppose n = ab is not a perfect square. Then a and b cannot be \sqrt{n} and moreover a < b. We now claim that a < \sqrt{n} < b. Suppose a < \sqrt{n} but also b < \sqrt{n}, then ab < n which is a contradiction. Similarly, if both a > \sqrt{n} and b > \sqrt{n}, then ab > n, also a contradiction. Thus, since a < b, it has to be that a < \sqrt{n} < b. \blacksquare

So what’s the point of Lemma 1? Well the main takeaway is the following: if the number n is composite, then we will at least find one of its divisors before or equal its integer square root \lfloor \sqrt{n} \rfloor.

Theorem 1 Let n be a composite positive integer. Then there exists a prime divisor p of n such that p \leqslant \lfloor \sqrt{n} \rfloor.

Proof. By Lemma Lemma 1, we know that if n = kb is a composite integer with 0 < k \leqslant b, then k \leqslant \sqrt{n}, with equality iff n is a perfect square. Equivalently, this means that k \leqslant \lfloor \sqrt{n} \rfloor. If k is prime, then we are done. Otherwise, there exists a prime p that divides k (hence, divides n) that would also satisfy p \leqslant \lfloor \sqrt{n} \rfloor as desired. \blacksquare

Note: for a linear traversal check (like we are doing, starting from 3, 4, 5, …), we know that we will encounter any prime divisor p of any composite divisor k of n first. So the last part of the proof is unnecessary for our need, but we put it there for completeness.

So the idea of using sqrt decomposition speeds up our algorithm tremendously, especially for large n. In fact, even for small n like n=1080, its integer square root is 32 which is already two order of magnitudes lower.

We now add the sqrt trick to our python implementation.

def is_prime_sqrt_trick(n: int) -> bool:
    # 1 and any negative integer are not prime.
    if n < 2:
        return False
    # 2 is prime.
    if n == 2:
        return True
    # If n is even and n is not 2, then it is not prime.
    if n % 2 == 0:
        return False
    # NEW: loop until int(sqrt(n)) + 1.
    # The + 1 is to handle if n is perfect square.
    for k in range(3, int(n**.5)+1, 2):
        # If n is divisible by odd k for any k < n, n is not prime.
        if n % k == 0:
            return False
    return True

The sqrt trick improves time complexity from \mathcal{O}(n) to \mathcal{O}(\sqrt{n}) which is massive! To give you an idea of this speedup, we will run the naive and sqrt algorithms to check if n = 2{,}147{,}462{,}143 is prime. Note that this is a prime number on the order of 2^{31}.

The sqrt trick improves time complexity from \mathcal{O}(n) to \mathcal{O}(\sqrt{n}) which is massive!

speedup comparison ⚡️
import time

n = 2_147_462_143

for f in [is_prime_naive, is_prime_sqrt_trick]:
    start = time.time()
    f(n)
    print(f"Took total of {(time.time()-start):.10f} seconds using {f.__name__}")
Took total of 118.3165051937 seconds using is_prime_naive
Took total of 0.0010521412 seconds using is_prime_sqrt_trick

Efficient writing

So the impementation is super efficient \mathcal{O}(\sqrt{n}), but how do we make writing the code efficient for something like Clash of Code?

Well first note that when writing in a clash, you don’t care about the 80 char PEP format or readability. So it’s finally time to write one-liner 100 chars fugly code.

The trick I use is to use all to replace the for-loop and replace the False checks with True statement by “bundling” their evaluation using and. This gives the following one-liner.

def is_prime_sqrt_short(n: int) -> bool:
    return n == 2 or (n > 2 and n % 2 != 0 and all(n % k != 0 for k in range(3, int(n**.5)+1, 2)))

In fact, since the modulo operator % in python returns a positive integer, we can save writing one character for each != by writing > instead.

def is_prime_sqrt_super_short(n: int) -> bool:
    return n == 2 or (n > 2 and n % 2 > 0 and all(n % k > 0 for k in range(3, int(n**.5)+1, 2)))