What did you do with your old College Algebra text book after the semester was over? Tear all the pages out? Throw it away? Burn it? Hopefully after reading this article you'll wish you had kept it!

As both a technologist and a College Algebra teacher, I'm always looking for and pleased when I find a single problem that is excellent for teaching both programmers and college algebra students at the same time. For programmers, demonstrating concepts such as Test Driven Development (TDD), and for college algebra students, the concepts being taught in class. I found such a question when reviewing an application to a Data Science boot camp. I've modified the problem slightly for clarity but the intent is the same.

Write a method that accepts a positive integer as a parameter and returns the sum of all positive integers up to but not including that parameter that are divisible by 3 or 5.

The academy hosting the boot camp was asking applicants to write a program, in the language of their choice, to solve the problem posed by the question as part of a "Test your skills" section.

If we use an example to illustrate, where the parameter is 1000, the total we're looking for is:

```
3 + 5 + 6 + 9 + 10 + 12 + 15 + 18 + 20 + 21 + ... + 995 + 996 + 999
```

I choose Java as my language and proceeded using Test Driven Development. I first created a test for an input of 2, saw the test run red, then did the simplest thing possible, i.e., change the method that calculates the sum to return 0, so that the test runs green. I slowly added tests and constantly re-factor everything to build up a suite of tests like the ones shown here:

```
private SumCalculator sumCalculator;
@Before
public void setUp() throws Exception {
sumCalculator = new SumCalculator();
}
@Test
public void test2() {
assertEquals("sum up to 2 is 0", 0, sumCalculator.sumUpTo(2));
}
@Test
public void test3() {
assertEquals("sum up to 3 is 0", 0, sumCalculator.sumUpTo(3));
}
@Test
public void test4() {
assertEquals("sum up to 4 is 3", 3, sumCalculator.sumUpTo(4));
}
```

The `SumCalculator`

class contains a method `sumUpTo`

, that
takes an integer as input and returns the sum of all positive integers, up to but not including
the input, that are divisible by 3 or 5.

When finished the entire set of tests were as follows and all tests ran green:

Test Case Input | Expected Result |
---|---|

2 | 0 |

3 | 0 |

4 | 3 |

5 | 3 |

6 | 8 |

7 | 14 |

9 | 14 |

10 | 23 |

11 | 33 |

15 | 45 |

16 | 60 |

31 | 225 |

1000 | 233,168 |

The final method looked like this:

```
public int sumUpTo(int max) {
int sum = 0;
for (int i = 3; i < max; ++i ) {
if (i % 3 == 0 || i % 5 == 0) {
sum += i;
}
}
return sum;
}
```

The approach was to form a loop, iterating over the positive integers from 3 up to but not including `max`

. We start
at 3 since 1 and 2 will never be evenly divisible by either 3 or 5.
A variable to hold the `sum`

is created and initialized, and if a number is encountered that is evenly divisible
by either 3 or 5, we add that number to the `sum`

.

However this approach has problems. First is the potential for
integer overflow. If the incoming value for `max`

is large enough, the value accumulated
by `sum`

will overflow its integer buffer. I can fix that by simply changing `sum`

and
the return type to `long`

.

```
public long sumUpTo(int max) {
long sum = 0;
for (int i = 3; i < max; ++i ) {
if (i % 3 == 0 || i % 5 == 0) {
sum += i;
}
}
return sum;
}
```

The second problem is that the method is horribly inefficient for large numbers. A
quick timing test passing in 1000000000 for the `max`

takes, on average,
5900 milliseconds for the method to complete.

For curiosity sake I timed the method again by passing in an even larger number, `Integer.MAX_VALUE`

, which
of course in Java is 2^{31}-1, or 2,147,483,647. This
time the method took over 12 seconds to complete on average. Not good. Obviously the
brute force approach I've applied by using a loop over the positive integers, while it
produces the correct answer, is so slow that it makes the method unusable
for sufficiently large numbers.

So how can the method be improved so that it's faster? And not just faster, but fast enough so that it's usable.

This is where College Algebra comes in. Let me introduce you to **arithmetic
sequences and series**.

An **arithmetic sequence** is a sequence in which each term after the first differs
from its predecessor by a fixed constant `d`

, called the **common difference**.
So the common difference is found by `d = a`

.
_{n+1} - a_{n}

The **n ^{th} term** of an arithmetic sequence is given by

`a`_{n} = a_{1} + (n - 1)d

where `a`_{1}

is the first term of the sequence and `d`

is the common difference.
The sum of the first n terms of an arithmetic sequence is called a **finite arithmetic series**
and is denoted by `S`

and can be found by the following formula:
_{n}

S_{n} =

n(a_{1} + a_{n})
2

where a_{1} is the first term of the sequence, a_{n} is the n^{th} term, and n
is the number of terms in the sequence.

The positive integers divisible by 3, up to a given number, form a finite arithmetic sequence, as do the positive integers divisible by 5. Thus, the sum of the first n terms of the positive integers that are divisible by 3 form a finite arithmetic series. Likewise, the sum of the first n terms of the positive integers that are divisible by 5 also form a finite arithmetic series.

That means we can use the above information to solve the problem, and hopefully in an efficient way.

But we have a major re-factoring job to do and the benefits of doing Test Driven Development will pay off nicely. It is invaluable to have a suite of tests in place that will ensure that our re-factor is correct.

So, to find the sum of the first n terms of an arithmetic sequence, I need the
first term, a_{1}, the last term, a_{n}, and the number of terms, n.

Let's focus only on the sequence of the numbers that are divisible by 3 up to but not including 1000. That series looks like this:

```
3 + 6 + 9 + 12 + ... + 999
```

The above shows that a_{1} = 3, the common difference is 3, and a_{n} = 999.

How many terms are we adding in the example? If we look at the formula `a`

and solve it algebraically for n, we end up with
_{n} = a_{1} + (n-1)d

n =

a_{n} - a_{1} + d
d

Plugging in the numbers from the example yields

n =

999 - 3 + 3
3

= 333
The above formula can be reduced since a_{1} = d = 3, so we end up with a neat formula to find
the number of terms in a finite arithmetic sequence:

n =

a_{n}
d

We'll use this later.

Also in our example it's pretty easy to see that the n^{th} term is 999, but in general,
to find the n^{th} term, take the passed in parameter, subtract 1,
divide by 3, truncate any fractional part, then multiply that result by 3.

Here's what that looks like in Java code, where d = 3:

```
int an = (max - 1) / d * d;
```

Since division and multiplication are of the same precedence, Java will work left to right, first performing the
division, i.e., integer division of which will truncate the fractional part, then multiply that result
by `d`

to achieve the largest integer multiple of `d`

that is less than `max`

.
Now we're moving. Back to our formula for finding the sum of an arithmetic series, `S`

:
_{n}

S_{n} =

n(a_{1} + a_{n})
2

and substituting `a`

for n results in
_{n}/d

S_{n} =

a_{n}(a_{1} + a_{n})
2d

And since a_{1} = d

S_{n} =

a_{n}(d + a_{n})
2d

This is the formula that will be implemented in our code.

To show it in use, the sum we saw earlier

```
3 + 6 + 9 + 12 + ... + 999
```

is equal to

S_{n} =

999(3 + 999)
2(3)

= 166833
Now we have to repeat for the finite arithmetic series formed by the integers divisible by 5.

```
5 + 10 + 15 + 20 + ... + 995
```

S_{n} =

995(5 + 995)
2(5)

= 99500
So we should be able to add 166833 and 99500 to get the final answer, right? Wrong.

In each sequence we have counted the multiples of 15, thereby counting them twice, so we need to subtract those numbers once to even things out.

```
15 + 30 + 45 + 60 + ... + 990
```

S_{n} =

990(15 + 990)
2(15)

= 33165
Finally, when the parameter is 1000, we should get 166833 + 99500 - 33165 = 233168.

So, in the Java code, create a private method `series`

to calculate the sum of a finite arithmetic
series, and call that method from `sumUpTo`

for each of the series we need to find:

```
public long sumUpTo(int max) {
return series(max, 3) + series(max, 5) - series(max, 15);
}
private long series(int max, int d) {
long an = (max - 1)/d * d;
return an * (d + an) / (2*d);
}
```

To show that my algebra is sound and my new code accurately produces the correct answer, I need to verify it by running all of the tests that I constructed at the beginning of the exercise. Doing so, all tests run green so I'm confident the refactor is correct and the code is calculating correctly.

Now for the important part: how much did we improve the speed?

Re-running the timing tests specified previously and comparing to our first implementation, we get the following results:

max | Using a loop | Using a series |
---|---|---|

100,000 | 7 ms | < 1 ms |

10,000,000 | 61 ms | < 1 ms |

1,000,000,000 | 5900 ms | < 1 ms |

Integer.MAX_VALUE | 12200 ms | < 1 ms |

Quite a drastic and amazing improvement! The re-factored
code using arithmetic series concepts runs in less than 1 ms for any integer value passed
in, including `Integer.MAX_VALUE`

! How much time could be saved
in an application that uses the method millions of times a day? It's a huge win!

With this exercise I not only got to experience the benefits of TDD by using it to drive out a solution but also create a suite of tests that protect the code from re-factors. I also got the satisfaction of applying my College Algebra knowledge to re-factor inefficient code into an amazingly fast solution. It's very fun to see an elegant algorithm completely outperform such a brute force solution.

If you were a member of the Data Science academy responsible for evaluating applications, would you favor an applicant who turned in the brute force loop or the arithmetic series implementation? As for me, I know which I prefer.

So, next time you encounter an arithmetic problem that you're tempted to implement a brute force solution, pull out your old College Algebra text book and see if you can find a more elegant solution.

More