For calculating the XOR of 1 to n there is a closed form solution, so no need to...

sdenton4 · 2025-07-03T17:14:36 1751562876

Nice!

My offhand solution not using xor is to subtract from the sum of 1 to n, which has a closed form solution. The closed form roughly halves the execution time, as we only have to iterate over the range once.

Good to know there's a similar speedup available on the xor path...

tomtomtom777 · 2025-07-03T11:26:15 1751541975

Fascinating. It can see it work but I still can't really wrap my head around where the magic cycle length of 4 comes from.

danbruc · 2025-07-03T14:59:54 1751554794

Combining two consecutive integers starting with an even one yields one.

  xxxxxxx0 ^ xxxxxxx1 = 00000001

If we start at a number divisible by four and do this twice, we get one twice.

  xxxxxx00 ^ xxxxxx01 = 00000001
  xxxxxx10 ^ xxxxxx11 = 00000001

And combining the two of course yields zero and we are right back at the start.

betasilly · 2025-07-03T13:04:10 1751547850

Another interesting fact is that each time you make the xor of four consecutive numbers, beginning with an even number, the result is zero. Example in J.

  xor =: (16 + 2b0110) b.
  f =: 3 : 'xor/ y + i. 4'
  f"0 ] 2 * 1 + i. 100

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Summing a hundred millions: +/ f"0 ] 2 * i 100000000 gives zero (it takes a few seconds). So it seems the stated property holds for every even n.

danbruc · 2025-07-03T15:08:56 1751555336

Yes, because XORing two consecutive integers only differing in the least significant bit yields one.

  xxxxxxx0 ^ xxxxxxx1 = 00000001

Doing this twice with four consecutive numbers then also cancels the remaining one. That also means that you do not have to use consecutive numbers, you can use two arbitrary pairs

  2m ^ 2m + 1 ^ 2n ^ 2n + 1

and for example

  16 ^ 17 ^ 42 ^ 43

should be zero.

NickPollard · 2025-07-03T12:18:45 1751545125

There are essentially two bits of information in the 'state' of this iterated algorithm: a) Are all the non-lowest bits zero, or are they the value of the latest N b) the value of the lowest bit

So the cycle of (N, 1, N+3, 0) corresponds to (A) and (B) being: (0,0), (0,1), (1,1), (1, 0) - i.e. the 4 possible combinations of these states.

HappyPanacea · 2025-07-03T12:36:27 1751546187

If we generalize the problem to base k (they are k-1 duplicate of each number except the missing number, find missing one using base k-wise addition) then we can see the cycle is the smallest number such the base k-wise addition from 1 to the number is zero and it is power of k will form a cycle. I'm not sure if all such numbers are power of k if they exists or if there is an upper bound on them. For example in base 4 there appears to be no such cycle.

HappyPanacea · 2025-07-03T15:54:13 1751558053

I made an arithmetical mistake in base 4, so I was wrong. I also wrote they are instead of there are.

I think the following is true: For even k the cycle is k^2 long and for odd k is k long. Why? because units' place of generalized xor from 1 to k-1 is (k^2-k)/2 and therefore zero mod k if k is odd, if k is even then if we repeat it twice we get zero. For the second digit, k times the same digit will always give zero. Thus for odd k we have a zero when n is divisible by k and for even k we have a zero when n is divisible by 2k and the smallest power of k divisible by 2k is k^2 so it must be the cycle length.

Thorrez · 2025-07-03T10:54:04 1751540044

In your array-based equation, you say n+1, but in your explanation you say n+3. Is that a mistake?

danbruc · 2025-07-03T12:30:41 1751545841

No, that is correct, those two n represent slightly different things. n + 3 is the value after n XOR n + 1 XOR n + 2, so the n in the array index expression is n + 2 from the explaination and n + 3 results from (n + 2) + 1. I thought about how I could make this less confusing but it just became more confusing in my mind, so just used n in both cases.

Thorrez · 2025-07-04T04:07:40 1751602060

I'm now seeing that they're different. However, this sounds a bit off to me:

>n + 3 is the value after n XOR n + 1 XOR n + 2, so the n in the array index expression is n + 2 from the explaination and n + 3 results from (n + 2) + 1.

The reason I think it's off is that array index expression the start of the sum is 1, but in the explanation the start of the sum is n. So I don't think it's as simple as the ending being different by 2.

danbruc · 2025-07-04T06:54:41 1751612081

In the array expression the array values and the index depend on n and both vary, in the explanation the n is fixed. Let us do an example starting at say 8.

  n =  8   [  8, 1,  9, 0 ][ 8 % 4] =  8 = n
  n =  9   [  9, 1, 10, 0 ][ 9 % 4] =  1 
  n = 10   [ 10, 1, 11, 0 ][10 % 4] = 11 = n + 1
  n = 11   [ 11, 1, 12, 0 ][11 % 4] =  0

  n =  8   n                         = 8               =  8 = n
           n ^ n + 1                 = 8 ^ 9           =  1
           n ^ n + 1 ^ n + 2         = 8 ^ 9 ^ 10      = 11 = n + 3
           n ^ n + 1 ^ n + 2 ^ n + 3 = 8 ^ 9 ^ 10 ^ 11 =  0

So in the explanation we get 11 = n + 3 still with reference to the starting value n = 8, in the array expression on the other hand we have moved on to n = 10 when we pull 11 = n + 1 out of the array.

danbruc · 2025-07-04T10:36:51 1751625411

Amendment to the parent comment. Note that the arrays contain values like 9 and 10 that are not the XOR of 1 to n for any n but they will also never be accessed because when they appear in the array, then the index used will point to a different element.

Also because we end up back at zero when ever n % 4 == 3, there is some flexibility about the starting point. I wrote 1 to n because that is what the article used, but it would be mathematically cleaner to start at zero which actually changes nothing because XORing with zero does nothing. And we do not have to start at zero at all, we can start at any number divisible by four and less than n because the running XOR sum will become zero just before each multiple of four. So XORing together 0...n, 1...n, 4...n, 8...n or generally 4k...n will give the same result. The explanation part looked at one cycle starting at 4k and ending at 4k + 3 with the running XOR sum being back at zero. Maybe this would have been the less confusing explanation, just using 4k instead of using n again with the constraint that it is divisible by four.

Thorrez · 2025-07-05T07:07:59 1751699279

It makes sense to me now. Thanks for taking the time to explain it!

skullt · 2025-07-03T12:31:31 1751545891

There's a bit of a trick in that solution: n is assumed to have the lower two bits clear so for an arbitrary n the array would really be:

[(n & ~3), 1, (n & ~3) + 3, 0][n % 4]

where the (n & ~3) makes sure those lower 2 bits are cleared. But note that we only ever can look at the first element when n % 4 == 0. In that case, (n & ~3) == n already. And further, we only ever can look at the third element when n % 4 == 2. In that case (n & ~3) == n - 2, so (n & ~3) + 3 == n + 1. Hence the array can be simplified to the one given in the other comment.