Why do I need reduce?
Honestly, you don't really. reduce was a built-in in Python 2 and was moved into the functools module in Python 3.
Most things you would use reduce for can be done more easily with other built-ins.
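As a rough illustration of that point, here are a few common reductions written with dedicated functions instead of reduce. The list and variable names are only examples, and math.prod assumes Python 3.8 or newer.

```python
import math

my_list = [1, 2, 3, 4, 5]

# Each line replaces a reduce() call with a purpose-built function.
total = sum(my_list)          # instead of reduce(operator.add, my_list)
product = math.prod(my_list)  # instead of reduce(operator.mul, my_list); Python 3.8+
largest = max(my_list)        # instead of reduce(lambda a, b: a if a > b else b, my_list)

print(total, product, largest)  # 15 120 5
```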
What does reduce do?
reduce works the same way as most "reduce" functions in Computer Science. It takes a function of two arguments and a sequence, and "reduces" the sequence to a single value. With addition, for example, it adds the first number to the second, then adds each subsequent number to the intermediate total.
Example 1: Implementing a "sum" function using reduce
from functools import reduce
import operator as o
my_list = [1,2,3,4,5]
reduce(o.add, my_list)
>>> 15
The above example works the same way as running sum(my_list).
What the above example is doing:
o.add is the function form of the + operator; o.add(1, 2) is the same as 1 + 2. Because o.add takes two arguments, reduce calls it with the current subtotal and the next value, and the result becomes the new subtotal.
It works the same way as:
total = 1
next_numbers = [2, 3, 4, 5]
for next_number in next_numbers:
    total = total + next_number
Basically, reduce always uses a function with two parameters: one receives the intermediate total and the other receives the next value.
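To make the mechanics concrete, here is a minimal sketch of what reduce does internally. my_reduce and _MISSING are hypothetical names made up for this illustration; the real functools.reduce is implemented in C and may differ in details.

```python
from functools import reduce
import operator

_MISSING = object()  # sentinel meaning "no initial value was given"

def my_reduce(function, iterable, initial=_MISSING):
    """Simplified, hypothetical stand-in for functools.reduce."""
    iterator = iter(iterable)
    if initial is _MISSING:
        # Without an initial value, the first item seeds the total.
        try:
            total = next(iterator)
        except StopIteration:
            raise TypeError("my_reduce() of empty iterable with no initial value") from None
    else:
        total = initial
    for next_value in iterator:
        # First argument: intermediate total. Second argument: next value.
        total = function(total, next_value)
    return total

print(my_reduce(operator.add, [1, 2, 3, 4, 5]))  # 15
```

The optional initial argument mirrors the third parameter of functools.reduce, which is what lets it handle an empty sequence.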
Example 2: Implementing a "factorial" function using reduce
from functools import reduce
import operator as o
my_list = [1,2,3,4,5]
reduce(o.mul, my_list)
>>> 120
As you can see, it's similar to the loop we mentioned before, except that the addition operator is replaced with a multiplication operator:
total = 1
next_numbers = [2, 3, 4, 5]
for next_number in next_numbers:
    total = total * next_number
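The example above multiplies a fixed list. A small sketch of a general factorial built the same way could pass a range and an initial value of 1 to reduce, so that 0! also works. The function name factorial is just for illustration; in practice math.factorial already does this.

```python
from functools import reduce
import math
import operator

def factorial(n):
    # Multiply 1 * 2 * ... * n. The initial value 1 covers n == 0,
    # where range(1, 1) is empty and reduce would otherwise raise TypeError.
    return reduce(operator.mul, range(1, n + 1), 1)

print(factorial(5))  # 120
print(factorial(0))  # 1
```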
Which is faster? reduce or the loops?
import timeit
from functools import reduce
import operator as o
def factorial(values):
    total = values[0]
    next_numbers = values[1:]
    for next_number in next_numbers:
        total = total * next_number
    return total
timeit.timeit('factorial([1,2,3,4,5])', setup="from __main__ import factorial", number=1000)
>>> 0.0005066430021543056
timeit.timeit('reduce(o.mul, [1,2,3,4,5])', setup="from functools import reduce\nimport operator as o", number=1000)
>>> 0.000527281008544378
From the above, it appears that our own factorial implementation has similar performance to using reduce.
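A single run of 1000 iterations can be noisy, so one way to make the comparison a bit more robust is to repeat the measurement and take the best time. The sketch below assumes a helper named loop_product (a made-up name equivalent to the factorial function above). No timings are shown because they depend on the machine.

```python
import timeit
from functools import reduce
import operator

def loop_product(values):
    # Same loop as the hand-written factorial above.
    total = values[0]
    for next_number in values[1:]:
        total = total * next_number
    return total

values = [1, 2, 3, 4, 5]

# Repeat each measurement 5 times; the minimum is the least disturbed run.
loop_times = timeit.repeat(lambda: loop_product(values), number=10_000, repeat=5)
reduce_times = timeit.repeat(lambda: reduce(operator.mul, values), number=10_000, repeat=5)

print(f"loop:   {min(loop_times):.6f} s")
print(f"reduce: {min(reduce_times):.6f} s")
```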
Conclusion
There are many situations where it's useful to reduce numbers to a single value. However, those situations are often better served by functions other than reduce from functools.
Understanding how reduce works, by carrying an intermediate total, is useful since this pattern comes up often in Computer Science, and is actually how Apache Spark aggregates data.
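To give a feel for why this pattern suits distributed aggregation, here is a plain-Python sketch of the general idea: reduce each chunk separately, then reduce the partial results. This is not Spark's API. chunked_reduce is a hypothetical helper, and it assumes a non-empty list and a function where grouping does not matter (such as addition or multiplication).

```python
from functools import reduce
import operator

def chunked_reduce(function, values, chunk_size):
    # Split the data into chunks, as if each lived on a different worker.
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    # Reduce each chunk on its own to get one partial result per chunk.
    partials = [reduce(function, chunk) for chunk in chunks]
    # Combine the partial results with the same function.
    return reduce(function, partials)

print(chunked_reduce(operator.add, [1, 2, 3, 4, 5], 2))  # 15
```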
As we saw above, a user-defined function and the equivalent reduce call perform similarly. In that case, opt for whichever version gives you the cleaner, more understandable code.