
So, do you mean like you have some big array, and you want to do something like this? (Below is not a real programming language.)

  for i in 0 to arr.len() {
      new_val = f(arr[i]);
      log("Changing {arr[i]} to {new_val}.\n");
      arr[i] = new_val;
  }
I haven't used Haskell in a long time, but here's a fairly pure way you might do it in that language, which I arrived at after tinkering in the GHCi REPL for a bit. In Haskell, since you want to separate IO from pure logic as much as possible, a function that would otherwise do logging instead returns a tuple of the pure value and the log to print at the end. Because threading those tuples through by hand is annoying and would require rewriting a lot of code, there's a monad called the Writer monad which does it for you. When the computation you want to log is done, you call the `runWriter` function, which gives you back the tuple.

You shouldn't use Text or String as the log type, because using the Writer involves appending a lot of strings, which is really inefficient. You should use a Text Builder, because it's efficient to append Builder types together, and because they become Text at the end, which is the string type you're supposed to use for Unicode text in Haskell.

So, this is it:

  import qualified Data.Text.Lazy as T
  import qualified Data.Text.Lazy.Builder as B
  import qualified Data.Text.Lazy.IO as TIO
  import Control.Monad.Writer
  
  mapWithLog :: (Traversable t, Show a, Show b) => (a -> b) -> t a -> Writer B.Builder (t b)
  mapWithLog f = mapM helper
    where
      helper x = do 
        let x' = f x
        tell (make x <> B.fromString " becomes " <> make x' <> B.fromString ". ")
        pure x'
      make x = B.fromString (show x)

  theActualIOFunction list = do
    let (newList, logBuilder) = runWriter (mapWithLog negate list)
    let log = B.toLazyText logBuilder
    TIO.putStrLn log
    -- do something with the new list...
So "theActualIOFunction [1,2,3]" would print:

  1 becomes -1. 2 becomes -2. 3 becomes -3.
And then it does something with the new list, which has been negated now.


Does this imply that the logging doesn't happen until all the items have been processed though? If I'm processing a list of 10M items, I have to store up 10M*${num log statements} messages until the whole thing is done?


Alternatively, the Writer can be replaced with "IO", then the messages would be printed during the processing.
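A minimal sketch of that swap, assuming the same shape as `mapWithLog` above; the name `mapWithLogIO` is made up for illustration:

```haskell
-- Hypothetical IO counterpart of mapWithLog: each message is printed
-- as soon as its element has been processed, instead of being collected.
mapWithLogIO :: (Traversable t, Show a, Show b) => (a -> b) -> t a -> IO (t b)
mapWithLogIO f = mapM helper
  where
    helper x = do
      let x' = f x
      putStrLn (show x ++ " becomes " ++ show x' ++ ".")
      pure x'
```

The trade-off is that the function is no longer pure: its type now says `IO`, so callers have to run it from IO code.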

The computation code becomes effectful, but the effects are visible in types and are limited by them, and effects can be implemented both with pure and impure code (e.g. using another effect).

The effect can also be abstract, making the processing code kinda pure.
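One way this might look, as a sketch: the logging action is passed in as a parameter, so the same processing code can be run with printing or with pure collection. The names `mapWithLogM`, `runInIO` and `runPure` are hypothetical, and the pure version uses the Monad instance for pairs from base as a stand-in for Writer (note that the log comes first in the pair).

```haskell
-- The logging action is a parameter, so this function does not know
-- whether messages are printed or collected.
mapWithLogM :: (Traversable t, Monad m, Show a, Show b)
            => (String -> m ()) -> (a -> b) -> t a -> m (t b)
mapWithLogM logMsg f = mapM helper
  where
    helper x = do
      let x' = f x
      logMsg (show x ++ " becomes " ++ show x' ++ ".")
      pure x'

-- Impure interpretation: print each message immediately.
runInIO :: [Int] -> IO [Int]
runInIO = mapWithLogM putStrLn negate

-- Pure interpretation: collect the messages using base's Monad instance
-- for pairs, which behaves like a small Writer with the log first.
runPure :: [Int] -> ([String], [Int])
runPure = mapWithLogM (\msg -> ([msg], ())) negate
```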

In a language with unrestricted side effects you can do the same by passing a Writer object to the function. In pure languages the difference is that the object can't be changed observably, so its operations return a new one instead. Conceptually IO is the same with the object being "world", so a computation of type "IO Int" is "World -> (World, Int)". Obviously, the actual IO type is opaque to prevent non-linear use of the world (or you can make the world cloneable). In an impure language you can also perform side effects directly, which is similar to having a global singleton effect. A pure language doesn't have that, and requires explicit passing.
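A toy model of that "World -> (World, Int)" idea, purely for illustration. The real IO type is opaque and is not defined like this; `World`, `Action`, `writeLine` and `andThen` are all made-up names, and the "world" here is nothing more than the output written so far.

```haskell
-- Toy model only: the "world" is just the list of lines written so far.
newtype World = World [String]

type Action a = World -> (World, a)

-- Writing a line returns a new world rather than mutating the old one.
writeLine :: String -> Action ()
writeLine s (World out) = (World (out ++ [s]), ())

-- Sequencing threads the world from one action into the next.
andThen :: Action a -> (a -> Action b) -> Action b
andThen act next w0 =
  let (w1, x) = act w0
  in next x w1

-- Two writes followed by returning a value.
example :: Action Int
example =
  writeLine "start" `andThen` \_ ->
  writeLine "done"  `andThen` \_ ->
  \w -> (w, 42)
```

Running `example (World [])` gives back a world containing both lines, in order, together with 42.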


Yes, it does imply that, except that Haskell is lazy, so you'll be holding onto a thunk until the IO function is evaluated. You won't have a list of 10 million messages in memory before you start printing. Even then, lists are lazy too, so you won't ever have all entries of the list in memory at once, because list entries are also thunks: once you're done printing an entry you throw it away, evaluate a thunk to create the next cons cell in the list, and then evaluate another thunk to get the item that cell points to and print it. Everything is implicitly interleaved.

In the case above, where I constructed a really long string, it depends on the type of string you use. I used lazy Text, which is internally a lazy list of strict chunks of text, so that won't ever have to be in memory all at once to print it, but if I had used the strict version of Text, then it would have just been a really long string that had to be evaluated and loaded into memory all at once before being printed.
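To make the lazy versus strict distinction concrete, here is a small sketch using the text package. The function names are hypothetical; `B.toLazyText`, `TL.toStrict` and `TL.toChunks` are the library functions.

```haskell
import qualified Data.Text as TS
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B

-- Build a log out of many small pieces.
logFor :: [Int] -> B.Builder
logFor = foldMap (\n -> B.fromString (show n ++ " processed. "))

-- Lazy Text: a lazy sequence of strict chunks that a consumer can walk
-- through one chunk at a time.
lazyLog :: [Int] -> TL.Text
lazyLog = B.toLazyText . logFor

-- Strict Text: one contiguous value, so the entire log has to be
-- materialised before any of it can be used.
strictLog :: [Int] -> TS.Text
strictLog = TL.toStrict . lazyLog

-- Number of strict chunks the lazy log is made of.
chunkCount :: [Int] -> Int
chunkCount = length . TL.toChunks . lazyLog
```

For a long input, `chunkCount` comes out greater than one, which is what lets a printer such as `TIO.putStrLn` work through the log piece by piece.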


Sorry, I lack a lot of context for Haskell and its terms (my experience with FP is limited largely to forays into Lisp / Clojure), but if I'm understanding right, you're saying because the collection is being lazily evaluated, the whole process up to the point of re-combining the items back into their final collection will be happening in a parallel manner, so as long as the IO is ordered to occur before that final collection, it will occur while other items are still being processed? So if the program were running and the system crashed half way through, we'd still have logs for everything that was processed up to the point it crashed (modulo anything that was inflight at the time of the crash)?

What happens if there are multiple steps with logging at each point? Say perhaps a program where we want to:

1) Read records from a file

2) Apply some transformations and log

3) Use the resulting transformations as keys to look up data from a database and log that interaction

4) Use the result from the database to transform the data further if the lookup returned a result, or drop the result otherwise (and log)

5) Write the result of the final transform to a different file

and do all of the above while reporting progress information to the user.

And to be very clear, I'm genuinely curious and looking to learn so if I'm asking too much from your personal time, or your own understanding, or the answer is "that's a task that FP just isn't well suited for" those answers are acceptable to me.


> And to be very clear, I'm genuinely curious and looking to learn so if I'm asking too much from your personal time, or your own understanding, or the answer is "that's a task that FP just isn't well suited for" those answers are acceptable to me.

No, that's okay, just be aware that I'm not an expert in Haskell and so I'm not going to be 100% sure about answering questions about Haskell's evaluation system.

File IO in Haskell is also lazy if you use the standard functions like `readFile`, unless you use a library for it. So it delays the action of reading in a file as a string until you're actually using it. In this case that would be when you do some transformations, which are themselves lazy and delayed until you use them, and that would be when you're logging them or writing them to a file. Only when the results are demanded do the transformations actually run on the text you read from the file, and only then is a chunk of text actually read from the file, like I said.
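A small sketch of that behaviour using only the Prelude's `readFile` and `writeFile`; `transformFile` is a made-up name, and upper-casing is just a placeholder transformation.

```haskell
import Data.Char (toUpper)

-- readFile is lazy: the contents are read on demand, driven here by
-- writeFile consuming the transformed string.
-- The source and destination must be different files, because the
-- source is still being read while the destination is written.
transformFile :: FilePath -> FilePath -> IO ()
transformFile src dst = do
  contents <- readFile src
  writeFile dst (map toUpper contents)
```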

As for adding a progress bar for the user, there's a question on StackOverflow that asks exactly how to do this, since IO being lazy in Haskell is kind of unintuitive.

https://stackoverflow.com/questions/6668716/haskell-lazy-byt...

The answers include making your own versions of the standard library IO functions that have a progress bar, using a library that handles the progress bar part for you, and reading the file and writing the file in some predefined number of bytes so you can calculate the progress yourself.
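The last of those options might look roughly like this. It is only a sketch, not code from the linked answers: `copyWithProgress` and its `report` callback are hypothetical, and it uses strict `Data.ByteString` reads of a fixed size so that nothing is lazy.

```haskell
import qualified Data.ByteString as BS
import System.IO

-- Copy src to dst in fixed-size chunks, calling `report` with
-- (bytes copied so far, total bytes) after each chunk.
copyWithProgress :: Int -> (Integer -> Integer -> IO ()) -> FilePath -> FilePath -> IO ()
copyWithProgress chunkSize report src dst =
  withBinaryFile src ReadMode $ \hIn ->
    withBinaryFile dst WriteMode $ \hOut -> do
      total <- hFileSize hIn
      let loop done = do
            chunk <- BS.hGet hIn chunkSize
            if BS.null chunk
              then pure ()  -- an empty chunk means end of file
              else do
                BS.hPut hOut chunk
                let done' = done + fromIntegral (BS.length chunk)
                report done' total
                loop done'
      loop 0
```

A caller could pass something like `\done total -> putStrLn (show done ++ "/" ++ show total)` as the callback to print progress after every chunk.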

But, like the other commenter said, you can also just do things in IO functions directly.


It's entirely up to you. You can just write Haskell with IO everywhere, and you'll basically be working in a typical modern language but with a better type system. Main is IO, after all.

> if the program were running and the system crashed half way through, we'd still have logs for everything that was processed up to the point it crashed

Design choice. This one is all IO and would export logs after every step:

  forM_ entries $ \entry -> do
      (result, logs) <- process entry
      export logs
      handle result
Remember, if you can do things, you can log things. So you're not going to encounter a situation where you were able to fire off an action, but could not log it 'because purity'.
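For anyone who wants to load the loop above into GHCi, here is a self-contained version. The three helpers are hypothetical stand-ins for `process`, `export` and `handle`, with placeholder bodies.

```haskell
import Control.Monad (forM_)

-- Hypothetical stand-in for `process`: returns a result plus its log lines.
processEntry :: Int -> IO (Int, [String])
processEntry entry = pure (entry * 2, ["processed " ++ show entry])

-- Hypothetical stand-in for `export`: here it just prints the log lines.
exportLogs :: [String] -> IO ()
exportLogs = mapM_ putStrLn

-- Hypothetical stand-in for `handle`.
handleResult :: Int -> IO ()
handleResult result = putStrLn ("result: " ++ show result)

runAll :: [Int] -> IO ()
runAll entries =
  forM_ entries $ \entry -> do
    (result, logs) <- processEntry entry
    exportLogs logs      -- this entry's logs go out before the next entry starts
    handleResult result
```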


Now repeat it for every function where you want to log.

Now repeat this for every location where you want to log something because you're debugging.


For debugging purposes, there's Debug.Trace, which does IO and subverts the type system to do so.
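A minimal example of what that looks like; `double` is just an illustrative function, while `trace` is the real function from Debug.Trace.

```haskell
import Debug.Trace (trace)

-- A pure function with a debugging trace: the type has no IO in it,
-- yet a message is written to stderr when the result is evaluated.
double :: Int -> Int
double x = trace ("double called with " ++ show x) (x * 2)
```

Because of laziness, the message only shows up when (and if) the result is actually forced, so the order of trace output can be surprising.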

But with Haskell, I tend to do less debugging anyway, and spend more time getting the types right to begin with. When there's a function that doesn't work but still type checks, I feed it different inputs in GHCi and reread the code until I figure out why, and this is easy because almost all functions are pure and have no side effects and no reliance on global state. This is probably a sign that I don't write enough tests from the start, so I end up doing it like this.

But, I agree that doing things in a pure functional manner like this can make Haskell feel clunkier to program, even as other things feel easier and more graceful. Logging is one of those things where you wonder if the juice is worth the squeeze when it comes to doing everything in a pure functional way. Like I said, I haven't used it in a long time, and it's partly because of stuff like this, and partly because there's usually a language with a better set of libraries for the task.


> Logging is one of those things where you wonder if the juice is worth the squeeze

Yeah, because it's often not just for debugging purposes. Often you want to trace the call and its transformations through the system, and across systems, including externally provided parameters like correlation ids.

Carrying the entire world with you is bulky and heavy :)



