Transducers have the following shape (custom code in "..."):
(fn [rf]
  (fn ([] ...)
      ([result] ...)
      ([result input] ...)))
Many of the core sequence functions (like map, filter, etc) take operation-specific arguments (a predicate, function, count, etc) and return a transducer of this shape closing over those arguments. In some cases, like cat, the core function is itself a transducer: it takes no operation-specific arguments and is applied to rf directly.
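As a small illustration of the two cases above, the sketch below uses only clojure.core functions. The var names (inc-xf, incremented, concatenated) are chosen for this example and do not come from the text.

```clojure
;; (map inc) takes an operation-specific argument and returns a transducer.
(def inc-xf (map inc))

;; The transducer is handed to a transducible process such as into.
(def incremented (into [] inc-xf [1 2 3]))

;; cat is used as a transducer directly; it is not called with arguments first.
(def concatenated (into [] cat [[1 2] [3] []]))
```

Here incremented evaluates to [2 3 4] and concatenated to [1 2 3].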
The inner function is defined with 3 arities used for different purposes:
- Init (arity 0) - should call the init arity on the nested transform rf, which will eventually call out to the transducing process.
- Step (arity 2) - this is a standard reduction function but it is expected to call the rf step arity 0 or more times as appropriate in the transducer. For example, filter will choose (based on the predicate) whether to call rf or not. map will always call it exactly once. cat may call it many times depending on the inputs.
- Completion (arity 1) - some processes will not end, but for those that do (like transduce), the completion arity is used to produce a final value and/or flush state. This arity must call the rf completion arity exactly once.
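The three arities can be sketched with a hand-written, stateless filtering transducer. The name my-filter is hypothetical and is used only for illustration; this is not the clojure.core implementation of filter.

```clojure
(defn my-filter
  "Illustrative sketch: returns a transducer that keeps inputs satisfying pred."
  [pred]
  (fn [rf]
    (fn
      ;; Init (arity 0): delegate to the nested rf.
      ([] (rf))
      ;; Completion (arity 1): no state to flush, so call rf's completion once.
      ([result] (rf result))
      ;; Step (arity 2): call rf zero or one times, depending on pred.
      ([result input]
       (if (pred input)
         (rf result input)
         result)))))

(def evens (transduce (my-filter even?) conj [] [1 2 3 4]))
(def even-sum (transduce (my-filter even?) + [1 2 3 4]))
```

With these inputs evens is [2 4] and even-sum is 6. In the second call no initial value is supplied, so transduce obtains one by calling (+), which returns 0.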
An example use of completion is partition-all, which must flush any remaining elements at the end of the input. The completing function can be used to convert a reducing function to a transducing function by adding a default completion arity.
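Both points can be seen in a short sketch that uses only clojure.core. The var names are illustrative.

```clojure
;; partition-all buffers inputs; the trailing [5] is only emitted
;; when the completion arity flushes the buffer.
(def chunks (transduce (partition-all 2) conj [] [1 2 3 4 5]))

;; completing wraps + with a completion arity; here the optional second
;; argument str is applied to the final accumulated value.
(def total-as-string (transduce (map inc) (completing + str) 0 [1 2 3]))
```

chunks evaluates to [[1 2] [3 4] [5]], and total-as-string evaluates to "9". When completing is called with a single argument, the completion arity defaults to identity.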
Early termination
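Clojure has a mechanism for specifying early termination of a reduce: reduced wraps a value to signal that the reduction should stop, reduced? tests whether a value was created with reduced, and deref (or @) retrieves the value inside a reduced.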
A process that uses transducers must check for and stop when the step function returns a reduced value (more on that in Creating Transducible Processes). Additionally, a transducer step function that uses a nested reduce must check for and convey reduced values when they are encountered. (See the implementation of cat for an example.)
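The following sketch shows early termination with clojure.core's reduced, reduced?, and deref, first in a plain reduce and then through transducers. The var names are chosen for this example.

```clojure
;; A plain reduce stops as soon as the step function returns a reduced value.
(def first-over-10
  (reduce (fn [acc x] (if (> x 10) (reduced x) acc))
          nil
          [1 5 20 30]))

;; reduced wraps a value; reduced? detects the wrapper; @ unwraps it.
(def wrapped (reduced :done))
(def wrapped? (reduced? wrapped))
(def unwrapped @wrapped)

;; take signals termination, so even an infinite input can be consumed.
(def first-three (transduce (take 3) conj (range)))

;; cat runs a nested reduce over each input collection and conveys the
;; reduced value produced by take, so the third collection is never visited.
(def flattened-prefix (into [] (comp cat (take 3)) [[1 2] [3 4] [5 6]]))
```

Here first-over-10 is 20, first-three is [0 1 2], and flattened-prefix is [1 2 3].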
Transducers with reduction state
Some transducers (such as take, partition-all, etc) require state during the reduction process. This state is created each time the transducible process applies the transducer. For example, consider the dedupe transducer that collapses a series of duplicate values into a single value. This transducer must remember the previous value to determine whether the current value should be passed on:
(defn dedupe []
  (fn [xf]
    (let [prev (volatile! ::none)]
      (fn
        ([] (xf))
        ([result] (xf result))
        ([result input]
          (let [prior @prev]
            (vreset! prev input)
            (if (= prior input)
              result
              (xf result input))))))))
In dedupe, prev is a stateful container that stores the previous value during the reduction. The prev value is a volatile for performance, but it could also be an atom. The prev value will not be initialized until the transducing process starts (in a call to transduce for example). The stateful interactions are therefore contained within the context of the transducible process.
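One way to observe that state is created per application is to reuse a single transducer value across two separate processes. The sketch below relies on the dedupe that ships in clojure.core (Clojure 1.7 and later); the var names are illustrative.

```clojure
;; One transducer value, created once.
(def dedupe-xf (dedupe))

;; Each call to into applies the transducer afresh, creating new state.
(def first-run  (into [] dedupe-xf [1 1 2 2 3 1]))
(def second-run (into [] dedupe-xf [1 1 2]))
```

first-run is [1 2 3 1] and second-run is [1 2]. If the remembered value leaked from the first run into the second, the leading 1 of the second input would have been dropped.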
In the completion step, a transducer with reduction state should flush state prior to calling the nested transformer’s completion function, unless it has previously seen a reduced value from the nested step in which case pending state should be discarded.
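A minimal sketch of this rule follows, using a hypothetical pair-up transducer that groups inputs into pairs and flushes a leftover element on completion. The name, the ::none sentinel, and the design are assumptions made for illustration, loosely modeled on how partition-all behaves rather than copied from it.

```clojure
(defn pair-up
  "Illustrative sketch: emits [a b] pairs, flushing a trailing [a] on completion."
  []
  (fn [rf]
    (let [pending (volatile! ::none)]
      (fn
        ([] (rf))
        ([result]
         (let [p @pending
               result (if (= p ::none)
                        result
                        (do (vreset! pending ::none)
                            ;; Flush first. The flush may itself return a
                            ;; reduced value, so unwrap before completing.
                            (unreduced (rf result [p]))))]
           ;; Then call the nested completion exactly once.
           (rf result)))
        ([result input]
         (let [p @pending]
           (if (= p ::none)
             (do (vreset! pending input)
                 result)
             (do
               ;; Clear state before calling rf, so nothing is left pending
               ;; if the nested step returns a reduced value.
               (vreset! pending ::none)
               (rf result [p input])))))))))

(def flushed (into [] (pair-up) [1 2 3 4 5]))
(def cut-short (into [] (comp (pair-up) (take 1)) [1 2 3]))
```

flushed is [[1 2] [3 4] [5]], with the trailing [5] emitted by the completion arity. cut-short is [[1 2]]: take returns a reduced value after the first pair, the process stops, and completion finds nothing pending to flush.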