[ /cjdns/xorcmp ]

I asked people for suggestions of what kind of information they might want to appear on the new FC00. larsg responded that it might be useful if we had a way of highlighting nodes which are keyspace neighbours. In cjdns, similar to the Kademlia protocol, this is accomplished by sorting nodes using a xor metric to sort nodes into k-buckets.

As such, keyspace neighbours may be far apart on the actual network graph. Despite this, they keep track of each other, since when a nearby node wants to communicate with a node they are unaware of, they ask a node which has a similar key.

Therefore, in order to understand what kind of decisions are being made by cjdns, we have to look at how it computes xor-distance.

Where to look?

larsg pointed me in the right direction:

The following function can be found in cjdns/dht/Address.c.

int Address_xorcmp(uint32_t target,
                   uint32_t negativeIfCloser,
                   uint32_t positiveIfCloser)
{
    if (negativeIfCloser == positiveIfCloser) {
        return 0;
    }
    uint32_t ref = Endian_bigEndianToHost32(target);
    return ((Endian_bigEndianToHost32(negativeIfCloser) ^ ref)
               < (Endian_bigEndianToHost32(positiveIfCloser) ^ ref)) ? -1 : 1;
}

Making sense of things

Address_xorcmp is a function which takes three arguments: a target node's address and two nodes' addresses, which are to be compared for closeness to that target address.

Each address is a uint32_t. That's a 32 bit unsigned long int.

Before performing further computation, this function checks to see if the two nodes to be compared are equal. If they are, a zero is returned. I'm assuming this means they are duplicate entries of the same node.

If they are not equal, then the function continues by declaring and instantiating another 32 bit unsigned long int ('ref'), which is the product of the following function called on the target:

Endian_bigEndianToHost32(x)

This is actually a Macro which conditionally (depending on whether you are on linux, darwin, or freebsd) includes the appropriate libraries and creates a function based upon your system's Endianness.

On Linux, it uses byteswap.h, which (on Ubuntu) is found in /usr/include/byteswap.h (via the command locate byteswap).

As far as I can tell, at least on my system, this ends up invoking

static inline uint32_t Endian_byteSwap32_manual(uint32_t input)
{
    input = Endian_rotateAndMask(0x00FF00FF,  8);
    return Endian_rotateAndMask(0x0000FFFF, 16);
}

Endian_rotateAndMask is yet another macro:

#define Endian_rotateAndMask(mask, rotateBits) \
    ((input >> rotateBits) & mask) | ((input & mask) << rotateBits)

This is done to all the unsigned long ints in question, though in the of the target int, it makes sense to store it in ref since it will be xor'd against the other two arguments.

The results of both xor'd ints are compared against each other. If the second argument's xor'd value is less than the third argument's xor'd value, a value of -1 is returned. Otherwise (in which case the other would have been greater, since we earlier ruled out equality), a value of 1 is returned. Thusly we can sort all nodes by their value relative to a single target node.

Implementing it in Javascript

We still need a little more info:

we don't know yet what data is being passed into the comparison function, aside from the fact that it's a uint32_t.

Also, we're going to need to check our math pretty carefully, since Javascript doesn't natively support operations on unsigned integers.

A simple C program can print out the value of the greatest 32 bit unsigned lont int:

#include <stdio.h>
#include <stdint.h>

int main(void){
  printf("%ju\n",(uintmax_t)(UINT32_MAX));
  printf("%ju\n", (uintmax_t)(UINT32_MAX)^ 0xFFFFFFFF);
}

// prints
// 4294967295
// 0

That value is going to come in handy, so we can start by declaring it:

var uint32_MAX=4294967295;

Fortunately, Javascript only has one numerical type, and it's bigger than what we need. To ensure that all numbers we are treating as uint32_ts are confined to this range, we can use the unsigned right shift operator.

var confine=function(n){
  return n>>>0;
};

We can test that it's working fairly easily, by passing a number with more bits:

console.log("%s\n%s"
  ,confine((uint32_MAX<<1)+1)
  ,uint32_MAX
);

Run this and you'll see that the results are in fact equivalent.

Next we can try to implement the rotateAndMask macro as a function. It's fairly difficult to read, if you don't look at the definition of the macro (and understand the C preprocessor).

C's macro system replaces source text prior to compilation. Thusly, while it seems like this function only takes two arguments, it actually expands to some inline code which is invisibly capturing other variables from the environment. The behaviour is almost lambda-like, except that the definition of the inline code is in another file, making it difficult to decypher.

It makes more sense, for our purposes, to implement it as a ternary function.

var rotateAndMask=function(input,mask,rotateBits){
  return ((input >> rotateBits) & mask) | ((input & mask) << rotateBits);
};

NOTE: The problem here is that Javascript's bitwise operators won't treat these variables as unsigned 32 bit integers. I realized this after implementing the rest of the functions, and fixed the problem. I'm going to leave this faulty implementation and the notes about it for reference.

Swapping bytes

Then we need to make sure to substitute that third invisible variable each time the function is called. We could implement it inline, and have it capture the variable as in the C code, but at this point I'd rather just pass it manually. I don't know for sure that I won't end up reusing the function, and the extra input serves as a reminder of what it's actually doing.

var byteSwap32_manual=byteSwap=function(input){ // input is uint32_t 
  input=rotateAndMask(input,0x00FF00FF,8);
  return rotateAndMask(input,0x0000FFFF,16);
};

With that done, we can translate the function Address_xorcmp, which takes three uint32_ts and indicates which of the latter two is closer to the first.

var xorcmp=function(t,a,b){
  if(a===b)
    return 0;
  var ref=byteSwap(t);
  return ((byteSwap(a)^ref) < byteSwap(b)^ref)? -1: 1;
};

NOTE: this article isn't done. I'm still hacking on this code to try to get it to work. Right now there's a problem somewhere in byteSwap.

Checking the Math

I wrote this simple C program as a reference

#include <stdio.h>
#include <stdint.h>

uint32_t rotateAndMask(uint32_t input,uint32_t mask,uint32_t rotateBits){
  return ((input >> rotateBits) & mask) | ((input & mask) << rotateBits);
};

uint32_t byteSwap(uint32_t input){
  input = rotateAndMask(input,0x00FF00FF, 8);
  return  rotateAndMask(input,0x0000FFFF, 16);
}

int xorcmp(uint32_t t,uint32_t a,uint32_t b){
  if(a==b)
    return 0;
  uint32_t ref=byteSwap(t);
  return ((byteSwap(a)^ref)<(byteSwap(b)^ref))?-1:1;
}

int main (void){
  printf("%ju\n", byteSwap(0xFFFFFFFF));
  printf("%ju\n", byteSwap(0xFFFFFFFD));
  printf("%ju\n", byteSwap(0xFFFFFFFE));
  printf("%ju\n", byteSwap(0xFFFFFFFC));

  printf("%i\n", xorcmp(0xFFFFFFFF,0xFFFFFFFE,0xFFFFFFFD));
  printf("%i\n", xorcmp(0xFFFFFFFF,0xFFFFFFFD,0xFFFFFFFE));
  printf("%i\n", xorcmp(0xFFFFFFFF,0xFFFFFFFE,0xFFFFFFFE));

  return 0;
}

This is an equivalent program in javascript

var uint={};
uint.max=4294967295

uint.confine=function(n){
  return n>>>0;
};

// simulate >>
uint.rightshift=function(a,b){
  return a>>>b;  
};

// simulate <<
uint.leftshift=function(a,b){
  return uint.confine(a<<b);  
};

var rotateAndMask=function(input,mask,rotate){
  return uint.confine((uint.rightshift(input,rotate)&mask)|
    (uint.leftshift((input&mask),rotate)));
};

var byteSwap=function(input){
  input=rotateAndMask(input,0x00FF00FF,8);
  return rotateAndMask(input,0x0000FFFF,16);
};

var xorcmp=function(t,a,b){
  if(a===b)
    return 0;
  var ref=byteSwap(t);
  return ((byteSwap(a)^ref)<(byteSwap(b)^ref))?-1:1;
};

console.log(byteSwap(0xFFFFFFFF));
console.log(byteSwap(0xFFFFFFFD));
console.log(byteSwap(0xFFFFFFFE));
console.log(byteSwap(0xFFFFFFFC));

console.log(xorcmp(0xFFFFFFFF,0xFFFFFFFE,0xFFFFFFFD));
console.log(xorcmp(0xFFFFFFFF,0xFFFFFFFD,0xFFFFFFFE));
console.log(xorcmp(0xFFFFFFFF,0xFFFFFFFE,0xFFFFFFFE));

So everything seems to work, but there's still one piece missing. I need a function which converts a cjdns ipv6 into its uint32_t components.

Converting keys to uint32_ts

Once again, there's a bit of indirection to overcome in tracking down this algorithm's components.

cjdns/crypto/AddressCalc.c contains the directive #include "crypto_hash_sha512.h", which links source files from NaCl, so we need to understand what it's doing.

Otherwise, the process seems fairly self contained.

int AddressCalc_addressForPublicKey(uint8_t addressOut[16], const uint8_t key[32])
{
    uint8_t hash[crypto_hash_sha512_BYTES];
    crypto_hash_sha512(hash, key, 32);
    crypto_hash_sha512(hash, hash, crypto_hash_sha512_BYTES);
    Bits_memcpyConst(addressOut, hash, 16);
    return hash[0] == 0xFC;
}

int AddressCalc_validKey(const uint8_t key[32])
{
    uint8_t hash[crypto_hash_sha512_BYTES];
    crypto_hash_sha512(hash, key, 32);
    crypto_hash_sha512(hash, hash, crypto_hash_sha512_BYTES);
    return hash[0] == 0xFC;
}

int AddressCalc_validAddress(const uint8_t address[16])
{
    return address[0] == 0xFC;
}

Update

I had a chance to talk to Arceliar about some details I was having trouble figuring out on my own. He cleared them up in no time, and I was able to finish writing the functions necessary to compare two cjdns ipv6s' xor distance relative to a third address.

<@Arceliar> the Address_xorcmp function is a little weird
<@Arceliar> addresses are actually 128 bits, split into 4 uint32_t
<@Arceliar> gets parsed into order: 3, 4, 1, 2
<@Arceliar> because cjd felt like rotating things by 64 bits for some reason
<@Arceliar> and because there's not a uint128_t available to compare the whole address at once
<@Arceliar> not sure why 2x uint64_t isn't used instead... so you only have to check 2 things instead of 4...

The code:

var uintXor=function(a,b){
  return byteSwap(a)^byteSwap(b);
};

// make sure to zero-pad the fc first
var fcParse=function(fc){
  var result=[];
  fc.replace(/:/g,"")
    .replace(/[a-f0-9]{8}/g,function(uint){
      result.push(uint);
      return "";
    });
  return result;
};

var fcQuarterToUint=function(q){
  var bytes=[];
  q.replace(/[0-9a-f]{2}/g,function(hexByte){
    bytes.push(hexByte);
    return "";
  });
  return bytes
    .map(function(bb){
      return (bb.charCodeAt(0)<<8)|(bb.charCodeAt(1));
    })
    .reduce(function(a,b){
      return (a<<8)|b;
    });
};

var fcXor=function(A,B){
  return A.map(function(x,i){
    return uintXor(x,B[i]);
  });
};

var fcUintSwap=function(U){
  return U.slice(2).concat(U.slice(0,2));
};

var fcCompare=function(T,A,B){
  var T2=fcParse(T).map(fcQuarterToUint)
    ,A2=fcParse(A).map(fcQuarterToUint)
    ,B2=fcParse(B).map(fcQuarterToUint);
  var A3=fcUintSwap(fcXor(T2,A2));
  var B3=fcUintSwap(fcXor(T2,B2));
  var temp={};
  temp[T]={};
  temp[T][A]=A3;
  temp[T][B]=B3;
  return temp;
};

console.log(fcCompare(
  "fc6a:30c9:53a1:2c6b:ccbf:1261:2aef:45d3"
  ,"fcbf:8145:9f55:0202:908f:bcce:c01e:caf2"
  ,"fcbe:5f12:67d8:77ea:e4d8:aecc:2b4f:0a4b"
));

This outputs an object that looks like this:

{ 'fc6a:30c9:53a1:2c6b:ccbf:1261:2aef:45d3': 
 { 'fcbf:8145:9f55:0202:908f:bcce:c01e:caf2': [ 1414861147, 17961300, 205917460, 1346700804 ],
   'fcbe:5f12:67d8:77ea:e4d8:aecc:2b4f:0a4b': [ 1381043735, 1358954515, 184813076, 50334724 ] } }

Each element in each array is a 32 bit integer. The elements are ranked in terms of their significance, meaning that if the first element of the first array is larger than the first element of the second array, then by the xor metric, the first node is closer to the target node.

I haven't decided exactly how I'm going to use this for FC00. It probably won't be in exactly this format, as each of these objects takes up a fairly large amount of memory. It's more likely it'll be a compressed version. Arceliar suggested taking the log, and maybe shifting the results together. It's either that or use bignum.js. In any case, I'm glad to have finally finished this!