It's a good article (perhaps a little slow to get going), and an interesting topic.
I suppose today, with the surges of interest in ML-derived content, the concept of "color" of bits is more relevant than ever.
There's the quote "Machine learning is money laundering for bias" (consider, eg, that if you ask an image generator model to create a "woman," it will very likely be a white woman), and I suppose the same is true for copyright, and other things. Basically it adds a layer of plausible deniability, and some could argue changes the "color of the bits."
As the article discusses, the law often _does_ care about the color/provenance of bits, though CS people are more prone to take a "data is just data" approach.
It's a great illustration of what a pure and an impure (effectful) function is.
The author complains that getting some bits as a result of function f() is seen by the legal system differently than when these bits are obtained via another function g(), even though f() ≡ g().
The problem is, of course, that f() and g() are not pure functions. They produce effects as part of being computed, such as accessing various sources of information. The colour thus obtained is one of the effects: accessing a medium while breaking a license agreement that forbids doing so is not the same as obtaining the very same bits from a random number generator by blind chance. (Note that usually an RNG is also a stateful thing.)
It is easy to provide an analogy of colour in another pair of impure functions computing the same result.
Assume a predicate (theorem) P, which we assume is true. Let f(P) be true iff P is true, but producing a detailed proof of the truth of P while being computed (say, as a log output). Let g(P) be true iff P is true, but providing no explanation whatsoever. They return the very same one-bit result: true, and this is the correct result, by construction. But their true values definitely have different colours, don't they?
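A minimal Python sketch of the f/g analogy above, under stated assumptions: the names `two_is_even` and `proof_log` are hypothetical, and the "proof" is only a toy trace standing in for a real derivation. It just shows two functions that return the same bit while differing in their effects.

```python
from typing import Callable, List


def f(p: Callable[[], bool], log: List[str]) -> bool:
    """Evaluate p and, as a side effect, record how the answer was reached."""
    result = p()
    # The "proof" here is only a toy trace, standing in for a real derivation.
    log.append(f"evaluated {p.__name__}() and got {result}")
    return result


def g(p: Callable[[], bool]) -> bool:
    """Evaluate p and return the bare bit, with no explanation."""
    return p()


def two_is_even() -> bool:
    # Hypothetical stand-in for the theorem P.
    return 2 % 2 == 0


proof_log: List[str] = []
# Both calls yield the same one-bit value; only f leaves a trace behind.
same_bit = f(two_is_even, proof_log) == g(two_is_even)
```

Comparing the return values alone cannot tell f from g; the difference lives entirely in `proof_log`, which is the point of the analogy.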
I take it somewhat differently: the function by itself may be pure, but executing it is a real action in a physical universe - an event with causes and effects.
The colour of a thing isn't obtained from any state changes - rather, it's computed by going back down the causal chain leading to the thing in question. A pirated video file has a different colour than the bit-for-bit identical file that fell out of an RNG, for a very simple reason: in the first case, you actually got a pirated file. In the latter, you didn't. The provenance of data is what determines the colour.
I feel viewing colour as effects of an impure function implies that the colour is a thing that gets created due to some events, and exists on its own. But it doesn't. Rather, the colour of a thing is a pure function of how that specific thing came to be, and also of the "colour palette" at the time of interest. That is, the same bytes can bear the colour "illegal" today, but be of "perfectly OK" colour tomorrow, due to a law change you aren't even aware of.
Curiously, while the article implies that the CS POV is grounded in hard reality and the legal POV is focused on make-believe constructed universe abstractions, I'm starting to think the opposite is the case. The legal POV is really about discovering how things came to be or happen. Looking at the causal network. That's as real as it gets. CS POV, in contrast, purposefully discards provenance and gives everything value semantics. That is clearly a fantasy view (though a very useful one for certain tasks).
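To make the "pure function of provenance and palette" reading concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Artifact` type, the provenance labels, and the two palettes are hypothetical and not a model of any actual law.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Artifact:
    data: bytes        # the bits themselves
    provenance: str    # hypothetical label for how the bits came to be


def colour(artifact: Artifact, palette: Mapping[str, str]) -> str:
    """Pure function: depends only on provenance and the palette, never on data."""
    return palette.get(artifact.provenance, "unknown")


bits = b"\x00\x01\x02"
pirated = Artifact(bits, "copied-without-licence")
lucky = Artifact(bits, "random-number-generator")

# Hypothetical palettes: the mapping changes over time, the bytes do not.
palette_today = {"copied-without-licence": "illegal", "random-number-generator": "ok"}
palette_tomorrow = {"copied-without-licence": "ok", "random-number-generator": "ok"}
```

Here `pirated.data == lucky.data`, yet `colour(pirated, palette_today)` and `colour(lucky, palette_today)` differ, and swapping in `palette_tomorrow` changes the answer without touching a single byte. A value-semantics view that keeps only `data` has thrown away exactly the field that `colour` reads.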
I think they are distinguishing between photographs of real situations (with children) and realistic renders. The result may be essentially the same, but the "bit colour" (the source) taints the first.
The author seems to argue that if there are no children involved there's no problem, but notes that others argue the images are harmful to view regardless of how they are produced.
I think it's fair to say that 20 years ago, discussions around CSAM were a lot less developed than they are today in a variety of ways.
There were a lot more people much more ready to publicly make (IMHO terrible) arguments about how a jpg is just a stream of bits, which is a number, and you can't just arbitrarily ban specific large integers. Or about how nobody is harmed if streams of bits get passed around, it's just data, ones and zeroes, that's not harming anyone. Or that the data is encoded, so it's meaningless without a decoder, a different decoder could decode that bitstream into literally anything else.
Fundamentally these arguments were always pretty nasty and meritless. It's images/video of kids being raped, the continued existence and trading of which can be argued to do continuing harm to the victims. The accessibility of this stuff also goes some way to normalise the behaviour and create bubbles in which there is demand for new material, so more kids get abused.
Ideas of children being unable to consent were well established in law and society though, it was only 20 years ago, so that bit about consent does stick out considerably. But again in certain corners of the internet * there were people who would argue about it, you hope theoretically.
( * and not that obscure, pretty sure you could find this sort of stuff in the comments on slashdot back in that sort of time-period)
> It's images/video of kids being raped, the continued existence and trading of which can be argued to do continuing harm to the victims.
I don't think that's what the author is saying. They say that IF any real-world kids are involved in the "production" of the video, the image is illegal (has a bad "colour"). Nobody disagrees with that, especially not the author of that blog post. This is contrasted with drawn or rendered images that were created in a computer graphics program.
> Or about how nobody is harmed if streams of bits get passed around, it's just data, ones and zeroes, that's not harming anyone. Or that the data is encoded, so it's meaningless without a decoder, a different decoder could decode that bitstream into literally anything else.
That's exactly what the author is arguing against - he says that this is the naive interpretation of a computer person, one that makes no sense to a lawyer (in his words: color is not a function of bits).
I think CSAM is too sensitive a topic to use as an example (today; maybe 20 years ago it was different). So my version of that example for 2023: watching and distributing (adult) rape videos is morally abhorrent and often illegal. But porn websites are full of staged rape videos. It's not about the content of the video, but its source (colour).
...that was my explanation of the thought process of the author. As he mentions, not everyone agrees with that interpretation in that context.
Sure, and I’m not trying to put words in the author’s mouth here. I wanted to give some ‘historical’ context for those who may not have been around back then.
But I think it’s either a sincere mistake or a very poor attitude to bring up consent affecting the colour in relation to CSAM, which they explicitly do about a third of the way into the article.
The internet was just colonised by bland NPCs and everyone interesting is now on the dark web. The reason why you don't see them here today is the same reason why you didn't see them in the NYT in 2003, you can't make money off ads when you're out of the Overton window.
It's hard to express how much I disagree with that position, given the context.
Contrarianism isn't interesting purely for its own sake, especially in relation to child abuse. And the internet is full of voices who are knee-jerk contrarian for the pure sake of it. Most of what they have to say is meaningless noise.
It sounds like you're dismissing the arguments purely by what you think it would mean to consider them seriously. "If these arguments are valid, then these conclusions are valid. These conclusions should not be valid. Therefore the arguments are invalid."
Why is it untenable that yes, data is data and it should move freely, and no, children should not be raped and filmed?
Well, because of the effects stated in the post you’re responding to.
It’s a source of continuing harm to the person who has been abused, and it creates bubbles of acceptance which go on to encourage creation of more material.
We have moved past the somewhat myopic “it’s just ones and zeroes” view of this in general, as a society. Actions have consequences, even actions as simple as exchanging a specific large integer with another computer.
Consider loli — should that be illegal? No real children are involved, but nonetheless loli depicts the sexualization of children. In Canada loli is prohibited, in the US it is not (which I personally think is the right call, but the matter is very controversial).
"But when it comes to child pornography, I think maybe Colour should make a difference - if we're going to ban it at all, it should matter where it came from. Whether any children were actually involved, who did or didn't give consent, in short: what Colour the bits are. The other side takes the opposite tack: child pornography is dangerous by its very existence, and it doesn't matter where it came from. They're claiming that whether some bits are child pornography or not, and if so, whether they're illegal or not, should be entirely determined by (strictly a function of) the bits themselves."
I see what you mean about the "who did or didn't give consent" line. I don't see how "consent" could be a mitigating factor in determining whether or not something qualifies as CSAM.
Photographs or movies involving, for example, nude children, which do not meet the bright-line child pornography standard, are nonetheless in a legal gray area and may be considered to be exploitation of those children if the parents did not give consent. Think Thora Birch at the end of American Beauty.
It's unfortunate - I was going to send this around because it's an excellent thing to think about, but then there's that bit that comes out of nowhere and if I send that to someone they'll think certain things about me.
That bit is very relevant to the concept of the colour of the bits (in this particular case, computer images) the author is explaining, and is only a single sentence of a quite long blog post. Why are you bothered by it so much that you're afraid to even send the post to someone else?
Colour is used as a metaphor for the relationships lawyers want to create and the structuralist view that how you create logic matters as much or more than the content of the logic. Mushed together into one concept.
Then the author teases out the tangled truth. Logic and binary code can't support inherent relationships. Ones and zeros don't come with a 'colour'.
The ethical problems in the article conflate a symbolic representation generated by computer monitors from code with the code itself being the problem. Like blaming the wind for carrying a dirty limerick into your ear.
Unfortunately the best way to deal with symbols is to burn them down. Detect and delete dirty pictures. You could make the Colour of generating unethical symbols illegal. Detect and prosecute after the fact like we do with most crimes. But not the binary code itself being illegal, just the symbolic representation of CP.
Also, DRM failing in the article's examples, is a feature not a bug. It gives a temporary structure to digital rights and doesn't constrain code re-use (like a software patent would) or try to lock down ownership of logic.
This is the article that opened my eyes to the irremediable distance between Law and Computer Science. After all, Law comes from the Humanities, Computer Science from the Sciences. Each one has a specific mindset and it's not easy to change from one to the other, let alone reconcile them in the real world.