While I was researching how to do level-order traversals of a binary tree in Haskell, I came across a library called tree-traversals which introduced a fancy Applicative instance called Phases. It took me a lot of effort to understand how it works. Although I still have some unresolved issues, I want to share my journey.
Note: I was planning to post this article on Reddit. But I gave up because it was too long so here might be a better place.
See the discussion.
Note: This article is written in a beginner-friendly way. Experts may find it tedious.
Note: The author is not a native English speaker and is glad to accept corrections, refinements and suggestions.
The Phases Applicative is capable of reordering effects in the chain of the underlying Applicative, without changing the function application semantics.
To demonstrate this, let's import it first.
ghci> import Control.Applicative.Phases
Then we define two effectful computations.
ghci> let foo = putStrLn "foo" *> pure (3 :: Int)
ghci> let bar = putStrLn "bar" *> pure (4 :: Int)
Tip: If you have trouble reading these
*>expressions, you can interpret them monadically. Which is to saylet foo = putStrLn "foo" *> pure (3 :: Int)is equivalent to
let foo = putStrLn "foo" >> return (3 :: Int)or even
let foo = do { putStrLn "foo"; return (3 :: Int) }(This is only a trick for readability and is definitely not to say every Applicative is a Monad, although in the case of
IOit's true.)
Let's chain these computations.
ghci> pure (,) <*> foo <*> bar
foo
bar
(3,4)
(One could use <$> or even liftA2 for conciseness. But I prefer pure and <*> because they feel more "primitive".)
To adapt the Phases Applicative, add now
ghci> runPhasesForwards (pure (,) <*> now foo <*> now bar)
foo
bar
(3,4)
Which just yields the same result. Boring.
Use delay to postpone one effect.
ghci> runPhasesForwards (pure (,) <*> delay (now foo) <*> now bar)
bar
foo
(3,4)
Use delay twice to postpone one effect ... by two turns.
ghci> let baz = putStrLn "baz" *> pure (5 :: Int)
ghci> runPhasesForwards (pure (,,) <*> delay (delay (now foo)) <*> delay (now bar) <*> now baz)
baz
bar
foo
(3,4,5)
Note although the order of the effects changes, the associated function application does not. Which can be seen in the above computation yielding the value (3,4,5), which corresponds to the values "extracted" in the order of foo,bar,baz.
It works like magic.
The definition of Phases is
data Phases f a where
Lift :: f a -> Phases f a
(:<*>) :: f (a -> b) -> Phases f a -> Phases f bTip: if you are unfamiliar with the above syntax, learn about the
GADTsextension first.A side effect of
GADTsis that it enables the use of existential quantification in data constructors. The definition of(:<*>)exploits this. It "hides" the typeainside the constructor.If we use the
ExistentialQuantificationextension instead, the above definition would bedata Phases f a = Lift (f a) | forall x. f (x -> a) :<*> Phases f xIn this definition, the constructor
(:<*>)"hides" the typex. (Notexdoes not appear to the left hand side of=, thus is not a part of the type signature.)
This definition resembles that of a Free Applicative a lot because the (:<*>) constructor captures the <*> operation instead of actually performing it. So I read the Free Applicative paper.
Tip: A free something, explained in a non Category Theory way, is a wrapper data type that satisfies something but does not require the underlying type to satisfy it. Instead of performing the actual computation, it captures the arguments into corresponding data constructors, accumulating a chain-like structure, which allows the computation to be interpreted in different ways later. If you are interested, try Free Monads first.
To my surprise, there are two kinds of Free Applicatives: the left-parenthesised and the right-parenthesised.
The left-parenthesised version has the following definition:
data FreeAL f a = PureL a
| forall b. FreeAL f (b -> a) :*: f bWhich corresponds to the canonical form for chaining Applicatives:
pure f <*> x1 <*> x2 <*> ... <*> xnThe right-parenthesised version has the following definition:
data FreeA f a = Pure a
| forall b. f (b -> a) :$: FreeA f bWhich corresponds to right-parenthesised canonical form:
fn <*> (... (f2 <*> (f1 <*> (pure x))))This form is rarely seen in the real world.
The paper claimed these two definitions were isomorphic and it took me some time to figure out why.
To turn a left-parenthesised form into a right-parenthesised one:
(pure f <*> x) <*> y
=
fmap (&) x <*> (fmap (&) y <*> (pure (\fy -> \fx -> f fx fy)))where (&) is the reverse function application operator. (Provided in Data.Function)
i.e.
(&) x f = f xTo turn a right-parenthesised form into a left-parenthesised one:
u <*> (v <*> pure x)
=
pure (\fu -> \fv -> fu (fv x)) <*> u <*> vThis isomorphism / duality should not be very surprising for two reasons.
- It's analogous to how to implement one of
foldl/foldrvia the other. - It can be seen as a generalization of the interchange law of Applicatives. (
u <*> pure y = pure ($ y) <*> u)
The definition of Phases is almost a definition of the right-parenthesised Free Applicative.
data Phases f a = Lift (f a)
| forall x. f (x -> a) :<*> Phases f x
-- vs --
data FreeA f a = Pure a
| forall b. f (b -> a) :$: FreeA f bBut it isn't. Because the Lift constructor can capture any effectful applicative computation while the Pure constructor can only capture a plain value.
I have been playing with Haskell for several years. But until this time I learn that Applicative does not restrict the order of its effects.
Take IO as an example.
The following expression
pure (,) <*> getChar <*> getCharwhen executed, reads two characters from stdin and wraps them in a pair.
Suppose the user inputs ab.
You might get ('a', 'b').
You might also get ('b', 'a') ! This will happen if the second getChar is executed first.
This indetermination will drive you crazy.
So most Applicatives including IO are implemented in a way that you can assume a left-to-right order on effects. This guarantees the latter behavior will never occur.
Not specifying of the order of effects is a feature, not a bug™. Some libraries exploit it. Backwards from the transformers package runs effects in reverse order. Concurrently from the async package runs effects in parallel.
It's a pity that most Applicative tutorials don't point out this issue. One exception is the tutorial from wikibooks, which includes a dedicated section.
Other references include this Reddit post and this Stack Overflow question.
Guess what this expression will output when executed?
add <*> (multiply <*> pure 10)where
add = putStrLn "add" *> pure (+ (1 :: Int))
multiply = putStrLn "multiply" *> pure (* (2 :: Int))The answer is
add
multiply
Which is counterintuitive. At the same time, an astute reader will realize the effects in the chain of Applicatives is assocative. This is to say if one can combine effects with <>, the final effect of (effect1 <> effect2) <> effect3 is the same as that of effect1 <> (effect2 <> effect3). Plus there is pure, a way to wrap a value into an Applicative with no effect (I guess that's why it's called pure), which corresponds to mempty. So we can conclude that it forms a special instance of Monoid!
Being a Monoid means, for any "order-sensitive" Applicative like IO, the order of the associated effects depends only on the order in which the computation appears in the expression, not on the order in which the computation is performed. If we want to shuffle the effects, we need to move the computations around without changing the semantics of the overall computation. And the "overall computation" in the chain of Applicatives is ... a function application.
The function application operator is the most notorious operator in the world.
- It's not associative:
f a b ≠ f (a b) - It's not commutative:
f x ≠ x f - It doesn't even have a visible notation! Suppose it has one,
!, then a Haskell programzipWith (,) [1..10] ['a'..'z']would have to be written aszipWith ! (,) ! [1..10] ! ['a'..'z']. (We don't treat($)as a valid candidate because a function application operator should be left-associative.)
But we can fix that.
- We know that
f x y ≠ f (x y). But we can combinexandyfirst then apply the result tofbyf x y = (uncurry f) ((,) x y). - We know that
f (g x) ≠ (f g) x. But we can composefandgfirst then applyxto the composition byf (g x) = ((.) f g) x. - We know that
f x ≠ x f. But we can swap them byf x = ($ x) f. - We know that
f x y ≠ f y x. But we can swapxandybyf x y = (flip f) y x. - We know that
(f x) (g y) ≠ (f g) (x y). But we can use(f x) (g y) = (\(x', y') -> f x' (g y')) ((,) x y). This will change the order of occurrence of terms fromf,x,g,ytof,g,x,y.
These techniques are useless, until you find they can be applied in the Applicative world.
By the way, the function application operator in the Applicative world has a visible notation, that is, <*>.
Now if you stare at the definition of the Applicative instance for Phases, it should look less like a bunch of abstract nonsense to you.
instance Applicative f => Applicative (Phases f) where
pure = now . pure
Lift mf <*> Lift ma = Lift $ mf <*> ma
Lift mf <*> (mh :<*> ty) = liftA2 (.) mf mh :<*> ty
(mg :<*> tx) <*> Lift ma = liftA2 flip mg ma :<*> tx
(mg :<*> tx) <*> (mh :<*> ty) = liftA2 (\g h ~(x,y) -> g x (h y)) mg mh :<*> liftA2 (,) tx tyIt "lifts" the above shuffling techniques to the Applicative level!
We have divided our analysis of Applicatives into two parts - the function application part and the effect part. The former is done. Here comes the latter.
I struggled for a long time trying to understand what happens to the effects when two Phases are chained with a <*>. Then I came up with the following data type.
data EffectList a = Singleton a
| Cons a (EffectList a)
deriving ShowHere I model effects as a. And a should be a Monoid.
For example:
- String
"1"represents "some effect that outputs 1 to stdout when executed". - String
"12"represents "some (composed) effect that outputs 1 then outputs 2 to stdout when executed". - String
""represents "no effect".
This data type definition is very similar to that of Phases,
data Phases f a = Lift (f a)
| forall x. f (x -> a) :<*> Phases f xexcept it replaces the f (x -> a) part.
Then I invent an operator <+> (I just make up the notation), defined as
infixl 4 <+>
(<+>) :: Monoid a => EffectList a -> EffectList a -> EffectList a
(Singleton xs) <+> (Singleton ys) = Singleton (xs <> ys)
(Singleton xs) <+> (Cons ys m) = Cons (xs <> ys) m
(Cons xs m) <+> (Singleton ys) = Cons (xs <> ys) m
(Cons xs m) <+> (Cons ys n) = Cons (xs <> ys) (m <+> n)I define it by referencing the definition of <*> for Phases.
For example, one case of <*> is
(mg :<*> tx) <*> Lift ma = liftA2 flip mg ma :<*> txIf we focus on the effect part and blur the others, we will get
(mg :<*> tx) <*> (Lift ma) = (mg ... ma) :<*> txWhich corresponds to
(Cons xs m) <+> (Singleton ys) = Cons (xs <> ys) mUsing the same method, we can derive the other cases.
Before experimenting with this operator, let's introduce two functions.
now :: Monoid a => a -> EffectList a
now = Singleton
delay :: Monoid a => EffectList a -> EffectList a
delay = Cons memptyWhich are defined in the same way as those with the same name in Phases. I list them below for your information.
-- | schedule an action to run in the current phase
now :: f a -> Phases f a
now = Lift
-- | delay all actions by a phase
delay :: Applicative f => Phases f a -> Phases f a
delay ta = pure id :<*> taLet's try it.
ghci> now "1" <+> now "2" <+> now "3" <+> now "4"
Singleton "1234"
ghci> now "1" <+> now "2" <+> delay (now "3") <+> now "4"
Cons "124" (Singleton "3")
ghci> delay (now "1") <+> now "2" <+> delay (delay (now "3")) <+> now "4"
Cons "24" (Cons "1" (Singleton "3"))
We can interpret the results in this way: "effects" with the same "nested level" are concatenated together.
For instance, in the third example:
- We get
"24"first becausenow "2"andnow "4"are nested under zerodelays. - We get
"1"next becausenow "1"is nested under onedelay. - We get
"3"last becausenow "3"is nested under twodelays.
To further explain what the <+> operator does, let's list the (desugared) operands and the result in the third example in parallel.
Cons "" (Singleton "1")
Singleton "2"
Cons "" (Cons "" (Singleton "3"))
Singleton "4"
----------------------------------------------
Cons "24" (Cons "1" (Singleton "3"))
Can you recognize this computation?
It's zipWith (<>)! (To be more precise, it's zipWith4 (<>).)
But not exactly. Here are the differences:
zipWithaccepts normal lists while<+>acceptsEffectLists, which is defined as non-empty lists.zipWithhandles ragged lists by dropping the extra parts, while<+>keeps them.- To use
zipWith, you use it likeliftA2. To use<+>, you use it like<*>(i.e. interleave the operands with it)
For the sake of completeness, here is the definition of run, the counterpart to runPhasesForward. It fuses the effects stored in a EffectList.
run :: Monoid a => EffectList a -> a
run (Singleton xs) = xs
run (Cons xs m) = xs <> run mYou can use it like this.
ghci> run $ Cons "24" (Cons "1" (Singleton "3"))
"2413"
With the aid of EffectList, it would be natural to understand how Phases works. Effects in the same phase will be concatenated in the same cell while effects in the later phases will be concatenated in the next ones.
- Evgeniy Malov provides an implementation for level order traversals of a binary tree. It contains a
zipWith'function, which keeps the extra parts of ragged lists too! Seems thePhasesApplicative really captures some essentials. - The Applicative paper also mentions
zipWithas an instance of Applicative.zappis similar to<+>.
- I still consider my method (divide an Applicative into the function application part and the effect part and analyze them one by one) a little complicated. If you have a more concise or a more clever explanation, please let me know.
- Does
EffectListhave a canonical name? Is there an isomorphic structure mentioned in a paper?