Hacker News

That's exactly the problem: in the software I have in mind, the conversions are actually very slow, and I can't easily change the content of the functions that process the data. They are very mathematical, and it would take a long time to rewrite everything.

For example, it's not my case, but it's like having to convert between two image representations (a matrix multiply for each pixel) every time.

I'm scared that this kind of 'automatic conversion' slowness will be extremely difficult to debug and to monitor.





Why would it be difficult to monitor the slowness? Wouldn’t a million function calls to the from_F_to_K function be very noticeable when profiling?
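For what it's worth, a conversion hot spot like that usually jumps out in a CPU profile. A minimal sketch in Go (the function name and the Fahrenheit-to-Kelvin interpretation of from_F_to_K are my assumptions, purely for illustration):

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// fromFToK is a stand-in for an expensive conversion function
// (here, a hypothetical Fahrenheit-to-Kelvin conversion).
func fromFToK(f float64) float64 {
	return (f-32)*5/9 + 273.15
}

func main() {
	// Write a CPU profile; inspect it with `go tool pprof cpu.prof`.
	out, err := os.Create("cpu.prof")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	pprof.StartCPUProfile(out)
	defer pprof.StopCPUProfile()

	// A million conversions: fromFToK should dominate the profile.
	sum := 0.0
	for i := 0; i < 1_000_000; i++ {
		sum += fromFToK(float64(i))
	}
	fmt.Println("checksum:", sum)
}
```

With the profile in hand, `go tool pprof` (or any sampling profiler in another language) would show the conversion function at the top of the flat list, which is what makes this kind of slowness fairly monitorable.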

In your case about swapping between image representations: let's say you're doing an FFT to transform between the real and reciprocal representations of an image. You probably have to do that transformation in order to do the work you need in reciprocal space. There's no getting around it. Or am I misunderstanding?

Please don’t take my response as criticism, I’m genuinely interested here, and enjoying the discussion.


I have many functions written by many scientists in a single piece of software over many years. Some expect one data format, others another; it's not always the same function that is called, but all the functions could have been written against a single data format. However, each scientist chose the data format when writing their functions, based on the application at hand at that moment and on how the chosen data structure might accelerate their algorithms.

When I tried to refactor using types, this kind of problem became obvious, and the refactoring forced more conversions than intended.

So I'm really curious, because apart from rewriting everything, I don't see how to avoid this problem. It's more natural for some applications to use data format 1 and for others data format 2, and forcing one over the other would make the application slow.

The problem arises only in 'hybrid' pipelines, when new scientists need to use some existing functions, some of them written for the first data format and the rest for the second.

As a simple example, you can write rotations in software in many ways: some will use matrix multiplication, some Euler angles, some quaternions, some geometric algebra. Which one works best depends on the application at hand, since it maps better to the mental model of that application. For example, geometric algebra is far better for thinking about a problem, but sometimes Euler angles are what a physical sensor outputs. So some scientists will use the first, and others the second. (Of course, these kinds of conversions are quite trivial and we don't care that much, but suppose each conversion is very expensive for one reason or another.)

I didn't find it a criticism :)


If I understood the problem correctly, you should try calculating each format of the data once and reusing it. Something like:

    type ID struct {
        AsString   string
        AsInt      int
        AsWhatever Whatever
    }

    func NewID() ID {
        return ID{
            AsString:   calculateAsString(),
            AsInt:      calculateAsInt(),
            AsWhatever: calculateAsWhatever(),
        }
    }
This does assume every representation will always be used; if that's not the case, it's a matter of using some kind of only-once executor, like Go's sync.Once.
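A lazy variant of the same idea with sync.Once, so each representation is only computed on first access (all names here are illustrative, not from the original problem):

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// ID lazily computes an alternate representation at most once.
type ID struct {
	raw int

	onceString sync.Once
	asString   string
}

// AsString runs the (imagined expensive) conversion exactly once,
// no matter how many times it is called.
func (id *ID) AsString() string {
	id.onceString.Do(func() {
		id.asString = strconv.Itoa(id.raw)
	})
	return id.asString
}

func main() {
	id := &ID{raw: 42}
	fmt.Println(id.AsString()) // conversion happens here
	fmt.Println(id.AsString()) // cached, no second conversion
}
```

One sync.Once field per representation keeps unused formats free, at the cost of one extra field each.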

But the data very often changes in place as functions are called on it.

I agree that would be a good solution, even though my data is huge, but it assumes the data doesn't change, or doesn't change much.
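One way to keep the caching idea while the data mutates is to invalidate the cached representation on every write, e.g. with a freshness flag. A sketch (all names and the doubling "conversion" are illustrative; this only pays off when there are many reads between writes):

```go
package main

import "fmt"

// Data caches an expensive alternate representation and
// invalidates it whenever the primary representation changes.
type Data struct {
	primary []float64
	cached  []float64 // alternate representation, valid only when fresh
	fresh   bool
}

func (d *Data) Set(i int, v float64) {
	d.primary[i] = v
	d.fresh = false // any in-place write invalidates the cache
}

// Alternate recomputes the cached representation only when stale.
func (d *Data) Alternate() []float64 {
	if !d.fresh {
		d.cached = make([]float64, len(d.primary))
		for i, v := range d.primary {
			d.cached[i] = v * 2 // stand-in for the expensive conversion
		}
		d.fresh = true
	}
	return d.cached
}

func main() {
	d := &Data{primary: []float64{1, 2, 3}}
	fmt.Println(d.Alternate()) // converts: [2 4 6]
	fmt.Println(d.Alternate()) // cached, no reconversion
	d.Set(0, 10)
	fmt.Println(d.Alternate()) // reconverts after the write: [20 4 6]
}
```

If writes are far more frequent than reads, this degenerates to converting on every read, which is the original problem again, so it's worth measuring the read/write ratio first.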



