So, do you mean like you have some big array, and you want to do something like ...

tpmoney · 2025-04-12T00:42:23 1744418543

Does this imply that the logging doesn't happen until all the items have been processed though? If I'm processing a list of 10M items, I have to store up 10M*${num log statements} messages until the whole thing is done?

siknad · 2025-04-12T16:58:53 1744477133

Alternatively, the Writer can be replaced with "IO", then the messages would be printed during the processing.

The computation code becomes effectful, but the effects are visible in types and are limited by them, and effects can be implemented both with pure and impure code (e.g. using another effect).

The effect can also be abstract, making the processing code kinda pure.
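A minimal sketch of what an abstract logging effect could look like, using only base. The names `MonadLog`, `logMsg`, `Collect`, `double` and `processAll` are made up for illustration, and `Collect` is a hand-rolled stand-in for a Writer, not the one from a library:

```haskell
-- Hypothetical logging effect: any monad that can accept a log line.
class Monad m => MonadLog m where
  logMsg :: String -> m ()

-- Interpretation 1: print each message immediately.
instance MonadLog IO where
  logMsg = putStrLn

-- Interpretation 2: a tiny hand-rolled Writer that accumulates messages purely.
newtype Collect a = Collect { runCollect :: (a, [String]) }

instance Functor Collect where
  fmap f (Collect (a, w)) = Collect (f a, w)

instance Applicative Collect where
  pure a = Collect (a, [])
  Collect (f, w1) <*> Collect (a, w2) = Collect (f a, w1 ++ w2)

instance Monad Collect where
  Collect (a, w1) >>= k =
    let Collect (b, w2) = k a
    in Collect (b, w1 ++ w2)

instance MonadLog Collect where
  logMsg s = Collect ((), [s])

-- The processing code only knows about the abstract effect.
double :: MonadLog m => Int -> m Int
double x = do
  logMsg ("doubling " ++ show x)
  pure (x * 2)

processAll :: MonadLog m => [Int] -> m [Int]
processAll = mapM double

main :: IO ()
main = do
  -- Run in IO: messages are printed as each item is processed.
  ys <- processAll [1, 2, 3]
  print ys
  -- Run purely: messages come back as data next to the result.
  print (runCollect (processAll [1, 2, 3]))
```

The same `processAll` is used both ways; only the choice of `m` at the call site decides whether logging is printed right away or collected.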

In a language with unrestricted side effects you can do the same by passing a Writer object to the function. In pure languages the difference is that the object can't be changed observably, so its operations return a new one instead. Conceptually IO is the same with the object being "world", so a computation of type "IO Int" is "World -> (World, Int)". Obviously, the actual IO type is opaque to prevent non-linear use of the world (or you can make the world cloneable). In an impure language you can also perform side effects directly; it is similar to having a global singleton effect. A pure language doesn't have that, and requires explicit passing.
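To make the "World -> (World, Int)" picture concrete, here is a toy model. It is only an illustration of the idea, not how the real IO type is implemented; `World`, `Action`, `printLine`, `andThen` and `answer` are all invented names, and the "world" is just the list of lines printed so far:

```haskell
-- Toy world: the lines that have been "printed" so far.
newtype World = World [String]

-- Toy IO-like action: takes a world, returns the new world and a result.
type Action a = World -> (World, a)

-- "Printing" returns a new world instead of mutating the old one.
printLine :: String -> Action ()
printLine s (World out) = (World (out ++ [s]), ())

-- Sequencing threads the world from one action into the next.
andThen :: Action a -> (a -> Action b) -> Action b
andThen act k w0 =
  let (w1, a) = act w0
  in k a w1

answer :: Action Int
answer =
  printLine "computing" `andThen` \_ ->
  printLine "done" `andThen` \_ ->
  \w -> (w, 42)

main :: IO ()
main = do
  let (World out, n) = answer (World [])
  mapM_ putStrLn out
  print n
```

Nothing here stops you from reusing an old `World` value twice, which is the non-linear use that an opaque IO type rules out.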

trealira · 2025-04-12T01:08:55 1744420135

Yes, it does imply that, except that Haskell is lazy, so you'll be holding onto a thunk until the IO function is evaluated. You won't have a list of 10 million messages in memory before you start printing. Even then, lists are lazy too, so you won't ever have all entries of the list in memory at once, because list entries are also thunks. Once you're done printing an entry, you throw it away and evaluate a thunk to create the next cons cell in the list, then evaluate another thunk to get the item that the next cell points to and print it. Everything is implicitly interleaved.
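A small sketch of that interleaving with a plain lazy list, assuming nothing beyond base (`messages` is a made-up name). The list of messages is infinite, so it can never be fully in memory, yet printing a prefix of it works fine because each cell is only built when the printer asks for it:

```haskell
-- One log message per item, over an endless supply of items.
messages :: [String]
messages = map (\i -> "processed item " ++ show i) [1 :: Integer ..]

main :: IO ()
main =
  -- Each message is constructed only when putStrLn demands it.
  mapM_ putStrLn (take 3 messages)
```

Whether a Writer-based pipeline streams this nicely in practice depends on which Writer and which monoid you use, so treat this as the idea rather than a guarantee.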

In the case above, where I constructed a really long string, it depends on the type of string you use. I used lazy Text, which is internally a lazy list of strict chunks of text, so that won't ever have to be in memory all at once to print it, but if I had used the strict version of Text, then it would have just been a really long string that had to be evaluated and loaded into memory all at once before being printed.
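A rough sketch of the lazy-versus-strict difference, assuming the `text` package that ships with GHC. `logText` is an invented helper; `fromChunks`, `toChunks` and `toStrict` are the functions from `Data.Text.Lazy`:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO

-- A lazy Text built from a lazily generated list of strict chunks.
logText :: Int -> TL.Text
logText n = TL.fromChunks [T.pack ("item " ++ show i ++ "\n") | i <- [1 .. n]]

main :: IO ()
main = do
  -- Lazy Text: chunks can be written out as they are produced.
  TLIO.putStr (logText 3)
  -- Strict Text: the whole value has to exist before its length is known.
  print (T.length (TL.toStrict (logText 3)))
```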

tpmoney · 2025-04-12T13:55:12 1744466112

Sorry, I lack a lot of context for Haskell and its terms (my experience with FP is limited largely to forays into Lisp / Clojure), but if I'm understanding right, you're saying because the collection is being lazily evaluated, the whole process up to the point of re-combining the items back into their final collection will be happening in a parallel manner, so as long as the IO is ordered to occur before that final collection, it will occur while other items are still being processed? So if the program were running and the system crashed half way through, we'd still have logs for everything that was processed up to the point it crashed (modulo anything that was inflight at the time of the crash)?

What happens if there are multiple steps with logging at each point? Say perhaps a program where we want to:

1) Read records from a file

2) Apply some transformations and log

3) Use the resulting transformations as keys to look up data from a database and log that interaction

4) Use the result from the database to transform the data further if the lookup returned a result, or drop the result otherwise (and log)

5) Write the result of the final transform to a different file

and do all of the above while reporting progress information to the user.

And to be very clear, I'm genuinely curious and looking to learn so if I'm asking too much from your personal time, or your own understanding, or the answer is "that's a task that FP just isn't well suited for" those answers are acceptable to me.

trealira · 2025-04-13T20:48:30 1744577310

> And to be very clear, I'm genuinely curious and looking to learn so if I'm asking too much from your personal time, or your own understanding, or the answer is "that's a task that FP just isn't well suited for" those answers are acceptable to me.

No, that's okay, just be aware that I'm not an expert in Haskell and so I'm not going to be 100% sure about answering questions about Haskell's evaluation system.

The standard file-reading IO in Haskell (readFile, getContents and the like) is also lazy, unless you use a library for it. So it delays the action of reading in a file as a string until you're actually using it, and in this case that would be when you do some lazy transformations that are also delayed until you use them, and that would be when you're writing them to a file. When you log the transformations, only then do you start actually doing the transformations on the text you read from the file, and only then do you actually read a chunk of text from the file, like I said.
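A tiny sketch of lazy IO using `interact` from the Prelude, which reads standard input lazily. `shout` is a made-up name. Run interactively in a terminal, this would typically echo each line in upper case as you type it, rather than waiting for end of input:

```haskell
import Data.Char (toUpper)

-- A pure, lazy transformation over the whole input.
shout :: String -> String
shout = unlines . map (map toUpper) . lines

main :: IO ()
main =
  -- Input is read only as far as the output being written demands it.
  interact shout
```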

As for adding a progress bar for the user, there's a question on Stack Overflow that asks exactly how to do this, since IO being lazy in Haskell is kind of unintuitive.

https://stackoverflow.com/questions/6668716/haskell-lazy-byt...

The answers include making your own versions of the standard library IO functions that have a progress bar, using a library that handles the progress bar part for you, and reading the file and writing the file in some predefined number of bytes so you can calculate the progress yourself.

But, like the other commenter said, you can also just do things in IO functions directly.
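Here is a rough sketch of the five steps from the question done directly in IO. Everything in it is an assumption for illustration: `fakeDb` is an in-memory list standing in for a database, the records are a hard-coded list instead of a file, and `logLine`, `processRecord` and `runPipeline` are invented names. Logs and progress go to stderr as each record is processed:

```haskell
import Data.Char (toUpper)
import Data.Maybe (catMaybes)
import System.IO (hPutStrLn, stderr)

-- Hypothetical stand-in for a database: an in-memory association list.
fakeDb :: [(String, Int)]
fakeDb = [("ALICE", 30), ("CAROL", 41)]

-- Log to stderr so log lines do not mix with the real output.
logLine :: String -> IO ()
logLine = hPutStrLn stderr

-- Steps 2 to 4 for a single record; Nothing means the record is dropped.
processRecord :: Int -> (Int, String) -> IO (Maybe String)
processRecord total (i, raw) = do
  let key = map toUpper raw                      -- step 2: transform
  logLine ("transformed " ++ raw ++ " -> " ++ key)
  let found = lookup key fakeDb                  -- step 3: "database" lookup
  logLine ("lookup " ++ key ++ ": " ++ show found)
  logLine ("progress " ++ show i ++ "/" ++ show total)
  case found of                                  -- step 4: transform or drop
    Nothing  -> logLine ("dropped " ++ key) >> pure Nothing
    Just age -> pure (Just (key ++ "," ++ show age))

runPipeline :: [String] -> IO [String]
runPipeline records = do
  let total = length records
  results <- mapM (processRecord total) (zip [1 ..] records)
  pure (catMaybes results)

main :: IO ()
main = do
  -- Step 1 would be a file read and step 5 a file write; both are replaced
  -- by in-memory data here so the sketch runs without touching the disk.
  out <- runPipeline ["alice", "bob", "carol"]
  mapM_ putStrLn out
```

This version keeps all results in memory until the end; for a really large input you would probably write each result out inside the loop or reach for a streaming library.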

mrkeen · 2025-04-13T12:26:13 1744547173

It's entirely up to you. You can just write Haskell with IO everywhere, and you'll basically be working in a typical modern language but with a better type system. Main is IO, after all.

> if the program were running and the system crashed half way through, we'd still have logs for everything that was processed up to the point it crashed

Design choice. This one is all IO and would export logs after every step:

  forM_ entries $ \entry -> do
      (result, logs) <- process entry
      export logs
      handle result

Remember, if you can do things, you can log things. So you're not going to encounter a situation where you were able to fire off an action, but could not log it 'because purity'.
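The loop above can be fleshed out into something runnable. The bodies of `process`, `export` and `handle` below are placeholders I made up (squaring a number, writing to stderr, printing); only the shape of the loop comes from the snippet:

```haskell
import Control.Monad (forM_)
import System.IO (hPutStrLn, stderr)

-- Hypothetical processing step: a result plus the log lines it produced.
process :: Int -> IO (Int, [String])
process n = pure (n * n, ["squared " ++ show n])

-- Hypothetical log sink: write each line to stderr straight away.
export :: [String] -> IO ()
export = mapM_ (hPutStrLn stderr)

-- Hypothetical consumer of the result.
handle :: Int -> IO ()
handle = print

main :: IO ()
main = forM_ [1, 2, 3] $ \entry -> do
  (result, logs) <- process entry
  export logs    -- this entry's logs are written before the next entry starts
  handle result
```

If the program dies partway through, the logs for every entry that finished are already out.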

troupo · 2025-04-12T07:26:21 1744442781

Now repeat it for every function where you want to log.

Now repeat this for every location where you want to log something because you're debugging.

trealira · 2025-04-12T14:13:03 1744467183

For debugging purposes, there's Debug.Trace, which does IO and subverts the type system to do so.
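A minimal sketch of `trace` from `Debug.Trace` (`collatzStep` is just an example function I picked). The message goes to stderr when the expression is evaluated, and the function's type stays pure:

```haskell
import Debug.Trace (trace)

-- A pure function with a temporary debugging message attached.
-- The signature is still Int -> Int; no IO shows up in the type.
collatzStep :: Int -> Int
collatzStep n =
  trace ("collatzStep " ++ show n) $
    if even n then n `div` 2 else 3 * n + 1

main :: IO ()
main = print (collatzStep 6)
```

Because of laziness, trace messages appear in evaluation order, which is not always the order you would guess from reading the code.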

But with Haskell, I tend to do less debugging anyway, and spend more time getting the types right to begin with; when there's a function that doesn't work but still type checks, I feed it different inputs in GHCi and reread the code until I figure out why, and this is easy because almost all functions are pure and have no side effects and no reliance on global state. This is probably a sign that I don't write enough tests from the start, so I end up doing it like this.

But, I agree that doing things in a pure functional manner like this can make Haskell feel clunkier to program, even as other things feel easier and more graceful. Logging is one of those things where you wonder if the juice is worth the squeeze when it comes to doing everything in a pure functional way. Like I said, I haven't used it in a long time, and it's partly because of stuff like this, and partly because there's usually a language with a better set of libraries for the task.

troupo · 2025-04-12T18:56:19 1744484179

> Logging is one of those things where you wonder if the juice is worth the squeeze

Yeah, because it's often not just for debugging purposes. Often you want to trace a call and its transformations through the system, and across systems, including externally provided parameters like correlation IDs.

Carrying the entire world with you is bulky and heavy :)
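For reference, a sketch of what carrying a correlation ID can look like when it is threaded by a small reader-style wrapper rather than by hand. This uses only base; `Ctx`, `App`, `logMsg`, `validate` and `handleRequest` are all invented for the example, and a real codebase would more likely use a library type for the wrapper. Whether this counts as light or heavy is a matter of taste:

```haskell
import System.IO (hPutStrLn, stderr)

-- Hypothetical request context carried through the call chain.
newtype Ctx = Ctx { correlationId :: String }

-- A computation that can read the context and perform IO.
newtype App a = App { runApp :: Ctx -> IO a }

instance Functor App where
  fmap f (App g) = App (fmap f . g)

instance Applicative App where
  pure a = App (\_ -> pure a)
  App f <*> App g = App (\c -> f c <*> g c)

instance Monad App where
  App g >>= k = App (\c -> g c >>= \a -> runApp (k a) c)

-- Every log line is tagged with the correlation ID from the context.
logMsg :: String -> App ()
logMsg msg =
  App (\c -> hPutStrLn stderr ("[" ++ correlationId c ++ "] " ++ msg))

-- Neither function mentions the context explicitly.
validate :: Int -> App Int
validate n = do
  logMsg ("validating " ++ show n)
  pure (abs n)

handleRequest :: Int -> App Int
handleRequest n = do
  logMsg "request received"
  m <- validate n
  logMsg "request done"
  pure (m + 1)

main :: IO ()
main = runApp (handleRequest (-4)) (Ctx "req-123") >>= print
```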