## Who’s Afraid of the Big Ol’ unsafePerformIO?

*Note. I wrote this post a long time ago. Now I know more about Haskell, so I’ve reworked and republished it on Computationally Endowed, my new corner of the internet. You can find the new version of this post at its new location: Random Code, Permutations, and unsafePerformIO.*

A couple of months ago while in New York I got a call from a friend in need of some Haskell help. He was working on a project for a Comp. Bio course and he needed to implement the Knuth Shuffle, but he was having a bit of touble using Haskell arrays. His problem involved some type error that I couldn’t tackle over the phone, so it had to wait until later.

Once I got home, I got onto Wikipedia and looked up Knuth Shuffle. Seeing it was an inherently imperative algorithm, I wasn’t totally surprised my friend hit a few bumps while trying to translate it to Haskell. I decided to go ahead and implement my own version so I’d have a better understanding of what he was doing. I called him later that night to help him fix the problem, but by that time he had figured it out.

The implementation I wrote sat around on my hard drive untouched for a couple months until I rediscovered it last week. For your viewing pleasure, here it is:

module Permutation ( permute ) where import Data.Array.MArray import Data.Array.IO import System.Random permute :: [a] -> IO [a] permute xs = let len = length xs in do a <- newListArray (0, len - 1) xs mapM_ (\i -> randomRIO (i, len - 1) >>= swap a i) [0..len-1] getElems a swap :: (Ix i) => IOArray i a -> i -> i -> IO () swap a i j = do e1 <- readArray a j e2 <- readArray a i writeArray a i e1 writeArray a j e2

Looking this code over, a question came to mind: can I pull the
`permute`

function out of the `IO`

monad by using
`unsafePerformIO`

? The documentation
page for `unsafePerformIO`

lays down some cursory guidelines for its
use:

This is the “back door” into the `IO`

monad, allowing
`IO`

computation to be performed at any time. For this to be safe,
the `IO`

computation should be free of side effects and independent
of its environment.

If the I/O computation wrapped in `unsafePerformIO`

performs side
effects, then the relative order in which those side effects take place
(relative to the main I/O trunk, or other calls to unsafePerformIO) is
indeterminate.

The second part of this disclaimer points out the essential problem with
allowing mutation in a lazy language: without any guarantee of when an
expression will be evaluated, one does not know when or in what order side
effects will take place. Hence the `IO`

monad, which requires the
programmer sequence computations that may produce side effects or may depend on some mutable state.

But what if you don’t care when a side effect takes place? What if you don’t
care when a computation is performed relative to other side effects?
`unsafePerformIO`

in essence allows you to say exactly that about
a computation; it removes that computation from the `IO`

monad, thus
absolving the programmer of the task of specifying when, relative to other
computations, the computation should be performed. However the documentation
for this function tells us it is only “safe” if, i.e., it is necessary that,
“the `IO`

computation … be free of side effects and independent of
its environment”. This certainly seems like a sufficient condition; if I can say
that about a computation, then it certainly doesn’t matter when or in what
relative order the computation is performed. Is it necessary? No, and I
belive the `permute`

function above proves it.

`permute`

does mutate global state: the state of Haskell’s random
number generator (stored in `theStdGen`

) is modified by each call
to `randomRIO`

. It is dependent on its environment: the value in
`theStdGen`

determines the values produced each time
`randomRIO`

is called. Despite this it would be absurd to claim
the state of Haskell’s random number generator matters to me each time I want
to permute a list, or that it matters when I modify the state of the random
number generator relative to other computations. Therefore, I can define
`permute`

as follows:

permute :: [a] -> [a] permute l = unsafePerformIO $ let len = length l in do a <- newListArray (0, len - 1) l mapM_ (\i -> randomRIO (i, len - 1) >>= swap a i) [0..len-1] getElems a

A few more issues need to be addressed. The documentation page for
`unsafePerformIO`

mentions three precautions one must take when
compiling a module using this function. All of them address unintended behavior
that arises due to optimizations performed by GHC (under the assumption of referential transparency).The `NOINLINE`

pragma is essential. It doesn’t seem as though common subexpression elimination
or let-lifting will be an issue in the module. However, uses of permute could
fall victim to either of these optimizations. As an example, consider the
following expression:

[permute [1..4], permute [1..4]]

If this expression is in some compiled module, GHC could potentially (almost certainly) identify the common subexpressions and transform the expression it into the following:

let e = permute [1..4] in [e, e]

Obviously, this wouldn’t be a good thing, since instead of getting two (likely)
distinct permutations, one would *always* get the same permutation. To
avoid this, use the `-fno-cse`

option
when compiling such a module.

### The exception

A possible objection one might raise to all of this is that it is possible to
set the state of Haskell’s random number generator by using the function
`setStdGen`

. After the state of the generator is set, the generator
ceases to be “random” and becomes observably deterministic and so it would
matter when this action is performed relative to calls to `permute`

.
Such an objection I belive is right, or at least tenable. So don’t do it!