package chase.kd;
import java.util.*;
public class KDTree {
	public static enum Direction { LEFT, RIGHT }
	
	private static final Direction kdCompare(KDNode P, KDNode R) {
		if(P.coordinate[R.axis] < R.coordinate[R.axis])
			return Direction.LEFT;
		return Direction.RIGHT;
	}
	
	private static final void kdInsert(KDNode P, KDNode R) {
		KDNode T = null;
		KDNode F = null;
		Direction Q = null;
		int depth = 0;
		if(R == null) {
			R = P;
			P.axis = depth % P.coordinate.length;//0
		} else {
			T = R;
			while(T != null && !P.equals(T)) {
				F = T;
				Q = kdCompare(P,T);
				T = getChild(T,Q);
				depth++;
			}
			//P is not already in the tree
			if(T == null && F != null && Q != null) {
				setChild(F,Q,P);
				P.axis = (depth-1) % P.coordinate.length;
			}
		}
	}
	
	private static final KDNode getChild(KDNode P, Direction Q) {
		if(Q.equals(Direction.LEFT)) {
			return P.leftChild;
		}
		return P.rightChild;
	}
	
	//@_@ not sure if this will work...
	private static final void setChild(KDNode P, Direction Q, KDNode N) {
		if(Q.equals(Direction.LEFT)) {
			P.leftChild = N;
			P.childType = Direction.LEFT;
			return;
		}
		P.rightChild = N;
		P.childType = Direction.RIGHT;
	}
	
	private static final KDNode findFather(KDNode P,KDNode R, KDNode F) {
		if(P.equals(R)) return F;
		return findFather(P,getChild(R,kdCompare(P,R)),R);
	}
	
	private static final KDNode findDMinimum(KDNode P, int D) {
		KDNode L, H;
		if(P == null) return null;
		
		if(P.axis == D) {
			if(P.leftChild == null) return P;
			P = P.leftChild;
		}
		L = findDMinimum(P.leftChild,D);
		H = findDMinimum(P.rightChild,D);
		if(H != null && H.coordinate[D] <= P.coordinate[D]) {
			P = H;
			P.childType = Direction.RIGHT;
		}
		if(L != null && L.coordinate[D] <= P.coordinate[D]) {
			P = L;
			P.childType = Direction.LEFT;
		}
		
		return P;
	}
	
	private static final void kdDelete(KDNode P, KDNode R) {
		KDNode F,N;
		N = kdDelete1(P);
		F = findFather(P,R,null);
		if(F == null) R = N;
		else setChild(F,P.childType,N);
	}
	
	/**
	 * Delete node P and return a pointer to the
	 * root of the resulting subtree.
	 */
	private static final KDNode kdDelete1(KDNode P) {
		KDNode F, R;
		int D;
		if(P.leftChild == null && P.rightChild == null) {
			return null; //feeling leafy?
		}
		D = P.axis;
		
		if(P.rightChild == null) {
			setChild(P,Direction.RIGHT,P.leftChild);
			P.leftChild = null;
		}
		R = findDMinimum(P.rightChild,D);
		F = findFather(R,P.rightChild,P);
		
		//this won't work
		if(R.childType.equals(Direction.LEFT)) {
			F.leftChild = kdDelete1(R);
		} else {
			F.rightChild = kdDelete1(R);
		}
		
		setChild(R,Direction.LEFT,P.leftChild);
		setChild(R,Direction.RIGHT,P.rightChild);
		R.axis = P.axis;
		
		return R;
	}
}
class KDNode {
	protected double[] coordinate;
	protected int depth; //how deep is this node?
	protected int axis; //coordinate on which it discriminates
	KDNode leftChild, rightChild;
	KDTree.Direction childType;
	
	/** Coordinate Equality */
	public boolean equals(KDNode n) {
		return Arrays.equals(coordinate, n.coordinate);
	}
}
All I need to do now is get the NearestNode? functions done, which is by far the hardest part. Also reading up on this, it turns out I implimented on my own what was one of the most suggested methods of removing nodes. Ergo mark it ignored and remove it on rebuild. Also notice, it uses an array that expands like an arraylist. :3
Dn't worry, I just checked out Foundations of Multidimensional and Metric Data Structures (2006) by Hanan Samet from my campus library. Which goes over in detail K-d Trees, K-d Tries, PR K-d Trees, Sliding Midpoint K-d Trees, Bucket PR K-d Trees, Path-Compressed PR K-d Trees, BD-Trees, (Balanced Box Decomposition)BBD-Trees all sorts of search trees, quad-trees, and Conjunction trees. Thats only the first half of the first section. Personally something called LSD Tree caught my eye later on in the book. All in all its a very leafy subject.
Basically, since I can't beat all y'all, i'll just have to out-source.. hem out resource you. :) --Chase-san
Not sure whether this makes any difference to your work, but I started on my own multi-dimension cluster-building tree. It will keep itself balanced with each add, all operations will be O(log n). Not sure how it will handle deletes yet ... or if it will. I don't plan on deleting scans once this is implemented. -- Simonton
True, the more dimensions a KDTree has the higher the constant. Otherwise with just about 3, its always about 12 to 14 node scans to find the nearest neighbor, most thing with 6 and less dimensions are okay, but from there is is pretty much exponential in checks it had to do. 1: low, 2: low, 3: low, 4: low, 5: ~25, 6: ~35, 7: ~50, 8: ~75, 9: ~180, 10: ~250, 11: ~340, 12: ~400, 13: ~500... etc. Out of anywhere from 2000 to 10000 total nodes. --Chase-san
But in any case, its still alot faster then O(n). --Chase-san
There fixed some bad parts in the code *whistles innocently*, okay yes, i'll get the NN search working soon. (Just gotta finish a little reading on it) --Chase-san
Okay, its all shoved into only one class now and only uses an array of an array of doubles when it needs to, otherwise its all KDNode arrays, also I added the ability to store alternate data with your Tree, and made the KDNodes public, so you can get at it :)... Alas, I have yet to impliment the radius, nearest-neighbor, and n nearest neighbors methods. But nearest neightbor is next on my list! I swear! --Chase-san
Daw!! Major logical flaw. My tree structure is broken. Well ... actually it's working just fine - without the self-balancing part. It's not even twice as fast as a linear search, though. But now that I think harder about the getting it to self-balance on each add, it's definitely not going to work. Back to the drawing board ... -- Simonton
      1000 iterations; history size=26,000 (about the # of tick waves in a fast learning challenge); cluster size=35; dimensions=5
        O(n) method: 1624
        O(log n) method: 1120
      1000 iterations; history size=370,000 (about the # of tick waves in a regular challenge); cluster size=35; dimensions=5
        O(n) method: 19168
        O(log n) method: 3033
      1000 iterations, history size=55,000 (history size I could increase to without skipping, theoretically); cluster size=35; dimensions=5
        O(n) method: 3084
        O(log n) method: 1652
The second one really shows the speed bonus of using a KDTree, i'm about to start work on the NN function, so everyone can try this in a few minutes without having to make thier own. --Chase-san
It has the nearest function now, tell me if there are any errors! I would do more but I have to get to class. --Chase-san
Unbalanced Tree -------+------- | | | +---+-*-+ | S = node you are looking for the closer to | | | * | * = normal points | | * | | * | | | | |S | | | *| | ---+---+------- Balanced Tree ------+-------- | | | | | *| | | S = node you are looking for the closer to | | * | * = normal points | +*---+---| |---*--| | | | S | | * | ------+--------
Right now I am turning some completely inreadable pseudocode into Java, so I might be awhile. The main reason I am doing this is to make sure I have it right.
Example:
   1  direction procedure KDCOMPARE(P,R)
   2  value pointer node P,R
   3  return(if COORD(P,DISC(R)) < COORD(R,DISC(R)) then 'LEFT'
   4         else 'RIGHT'
   5         endif)
I changed to
   public static enum Direction { LEFT, RIGHT };
   private static final Direction KDCOMPARE(KDNode P, KDNode R) {
       if(P.coordinate[R.axis] < R.coordinate[R.axis])
           return Direction.LEFT;
       return Direction.RIGHT;
   }
There are a few alot worse then that, almost completely incomprehensible, not the structure but the 'functions' they throw in to make thier code smaller, it requires alot of lookup. (normally if I read through the chapter I can find the reference of what it does). --Chase-san
For what it's worth, searching for "java kdtree" in Google turns up a fair number of things, including some GPLed code. You might want to take a peek at those for a reference of how to implement some parts of it. --David Alves
But thats boring. ^_^ Here is my finished set of converted code. Replacing what there was there. --Chase-san
Looks like I gave up too early on the KDTree stuff. I had a major inefficiency in my search algorithm. Here are the timings I get now. 7 dimensions, 26,000 scans, cluster size = 50. Also, this is with squaring the distances, which seems to make it run a lot faster.
    O(n) method: 2172
    O(log n) method: 1024
But here's the bonus move ... 52,000 scans:
    O(n) method: 3945
    O(log n) method: 1260
104,000
    O(n) method: 8436
    O(log n) method: 1512
Are you ready for this .... 400,000 scans.  This is more than you'll get in a targeting challenge, all for less than you WERE spending on 35 rounds.
    O(n) method: 70801
    O(log n) method: 1756
-- Simonton
So you got it working? --Chase-san
Well, yeah. This is the code I had before, just changed the order of the tree traversal when searching for nodes to seek out the closest nodes more quickly, allowing more of the tree to be skipped. I still have my eye out for a tree like this that can keep itself balanced with each add, though. I am currently pulling clusters of size 500 & doing more analysis with those, and my timings of cluster sizes that big says that using the tree would be slower, so I'm not actually using this in a bot yet. -- Simonton
Hey Simonton how exactly are you able to find the K-nearest neighbours to a node in O(log n) aka the price of an insertion. The fastest i've gotten my tree is O(n ** (1 - 1/d)) care to shed some light on this -- Gorded