[Home]Chase-san/KDTree

Robo Home | Chase-san | Changes | Preferences | AllPages

Showing revision 43
Here is the start of my KDTree implementation; this is free code.
package chase.kd;

import java.util.*;

public class KDTree {
	public static enum Direction { LEFT, RIGHT }
	
	private static final Direction kdCompare(KDNode P, KDNode R) {
		if(P.coordinate[R.axis] < R.coordinate[R.axis])
			return Direction.LEFT;
		return Direction.RIGHT;
	}
	
	/**
	 * Insert node P into the tree rooted at R. Returns the
	 * (possibly new) root, since Java cannot reassign the
	 * caller's reference through a parameter.
	 */
	private static final KDNode kdInsert(KDNode P, KDNode R) {
		KDNode T, F = null;
		Direction Q = null;
		int depth = 0;
		if(R == null) {
			P.axis = 0; //the root discriminates on the first coordinate
			return P;
		}
		T = R;
		while(T != null && !P.equals(T)) {
			F = T;
			Q = kdCompare(P,T);
			T = getChild(T,Q);
			depth++;
		}
		//P is not already in the tree
		if(T == null && F != null && Q != null) {
			setChild(F,Q,P);
			P.axis = depth % P.coordinate.length;
		}
		return R;
	}
	
	private static final KDNode getChild(KDNode P, Direction Q) {
		if(Q.equals(Direction.LEFT)) {
			return P.leftChild;
		}
		return P.rightChild;
	}
	
	private static final void setChild(KDNode P, Direction Q, KDNode N) {
		//record on the child N (not on the father P) which side it hangs from
		if(Q.equals(Direction.LEFT)) {
			P.leftChild = N;
		} else {
			P.rightChild = N;
		}
		if(N != null) N.childType = Q;
	}
	
	private static final KDNode findFather(KDNode P,KDNode R, KDNode F) {
		if(R == null) return null; //P is not in this subtree
		if(P.equals(R)) return F;
		return findFather(P,getChild(R,kdCompare(P,R)),R);
	}
	
	/**
	 * Find the node with the minimum value on dimension D
	 * in the subtree rooted at P.
	 */
	private static final KDNode findDMinimum(KDNode P, int D) {
		KDNode L, H;
		if(P == null) return null;
		
		if(P.axis == D) {
			//nothing smaller on D can be in the right subtree
			if(P.leftChild == null) return P;
			P = P.leftChild;
		}
		L = findDMinimum(P.leftChild,D);
		H = findDMinimum(P.rightChild,D);
		if(H != null && H.coordinate[D] <= P.coordinate[D]) P = H;
		if(L != null && L.coordinate[D] <= P.coordinate[D]) P = L;
		
		return P;
	}
	
	/**
	 * Delete node P from the tree rooted at R and return
	 * the (possibly new) root of the tree.
	 */
	private static final KDNode kdDelete(KDNode P, KDNode R) {
		KDNode F,N;
		N = kdDelete1(P);
		F = findFather(P,R,null);
		if(F == null) return N; //P was the root
		setChild(F, F.leftChild == P ? Direction.LEFT : Direction.RIGHT, N);
		return R;
	}
	
	/**
	 * Delete node P and return a pointer to the
	 * root of the resulting subtree.
	 */
	private static final KDNode kdDelete1(KDNode P) {
		KDNode F, R;
		int D;
		if(P.leftChild == null && P.rightChild == null) {
			return null; //P is a leaf
		}
		D = P.axis;
		
		if(P.rightChild == null) {
			//move the left subtree over so the replacement
			//can always be drawn from the right subtree
			setChild(P,Direction.RIGHT,P.leftChild);
			P.leftChild = null;
		}
		R = findDMinimum(P.rightChild,D);
		F = findFather(R,P.rightChild,P);
		
		//compare by identity; childType may be stale here
		if(F.leftChild == R) {
			F.leftChild = kdDelete1(R);
		} else {
			F.rightChild = kdDelete1(R);
		}
		
		setChild(R,Direction.LEFT,P.leftChild);
		setChild(R,Direction.RIGHT,P.rightChild);
		R.axis = P.axis;
		
		return R;
	}
}

class KDNode {
	protected double[] coordinate;
	protected int depth; //how deep this node is in the tree
	protected int axis; //coordinate axis on which it discriminates
	KDNode leftChild, rightChild;
	KDTree.Direction childType; //which side of its father this node hangs from
	
	/** Coordinate equality (overrides Object.equals so it is actually used). */
	@Override
	public boolean equals(Object o) {
		if(!(o instanceof KDNode)) return false;
		return Arrays.equals(coordinate, ((KDNode)o).coordinate);
	}
	
	@Override
	public int hashCode() {
		return Arrays.hashCode(coordinate);
	}
}

All I need to do now is write the NearestNode? functions, which is by far the hardest part. Also, reading up on this, it turns out I independently implemented what is one of the most commonly suggested methods of removing nodes: mark the node as ignored and remove it on rebuild. Also notice that it uses an array that expands like an ArrayList. :3
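The mark-ignored-and-rebuild removal strategy mentioned above could be sketched roughly like this (a minimal sketch; the class, field names, and rebuild threshold are my own illustration, not part of the tree code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "lazy" deletion: mark entries ignored, rebuild when too many accumulate.
class LazyKdPoints {
	private List<double[]> points = new ArrayList<>();
	private List<Boolean> ignored = new ArrayList<>();
	private int ignoredCount = 0;

	void add(double[] p) {
		points.add(p);
		ignored.add(false);
	}

	void remove(int index) {
		if (!ignored.get(index)) {
			ignored.set(index, true);
			ignoredCount++;
		}
		// Rebuild once more than half the entries are dead (illustrative threshold).
		if (ignoredCount * 2 > points.size()) rebuild();
	}

	private void rebuild() {
		List<double[]> live = new ArrayList<>();
		for (int i = 0; i < points.size(); i++)
			if (!ignored.get(i)) live.add(points.get(i));
		points = live;
		ignored = new ArrayList<>();
		for (int i = 0; i < points.size(); i++) ignored.add(false);
		ignoredCount = 0;
		// A real tree would re-insert the live points here.
	}

	int size() { return points.size() - ignoredCount; }
}
```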


Comments:

Yes, I do plan to change it later from using a double array of doubles to using the actual node lists. But currently that's just there for internal rebuilds. --Chase-san

Don't worry, I just checked out Foundations of Multidimensional and Metric Data Structures (2006) by Hanan Samet from my campus library, which goes over in detail K-d Trees, K-d Tries, PR K-d Trees, Sliding Midpoint K-d Trees, Bucket PR K-d Trees, Path-Compressed PR K-d Trees, BD-Trees, BBD-Trees (Balanced Box Decomposition), all sorts of search trees, quad-trees, and conjunction trees. That's only the first half of the first section. Personally, something called the LSD Tree caught my eye later on in the book. All in all it's a very leafy subject.

Basically, since I can't beat all y'all, I'll just have to out-source... ahem, out-resource you. :) --Chase-san

Not sure whether this makes any difference to your work, but I started on my own multi-dimension cluster-building tree. It will keep itself balanced with each add, and all operations will be O(log n). Not sure how it will handle deletes yet ... or if it will. I don't plan on deleting scans once this is implemented. -- Simonton

True, the more dimensions a KDTree has, the higher the constant. With just about 3 dimensions it's always about 12 to 14 node scans to find the nearest neighbor; most things with 6 or fewer dimensions are okay, but beyond that the number of checks grows pretty much exponentially. 1: low, 2: low, 3: low, 4: low, 5: ~25, 6: ~35, 7: ~50, 8: ~75, 9: ~180, 10: ~250, 11: ~340, 12: ~400, 13: ~500... etc. Out of anywhere from 2,000 to 10,000 total nodes. --Chase-san

But in any case, it's still a lot faster than O(n). --Chase-san

There, fixed some bad parts in the code *whistles innocently*. Okay, yes, I'll get the NN search working soon. (Just gotta finish a little reading on it.) --Chase-san

Okay, it's all shoved into only one class now and only uses an array of arrays of doubles when it needs to; otherwise it's all KDNode arrays. I also added the ability to store alternate data with your tree, and made the KDNodes public so you can get at them :)... Alas, I have yet to implement the radius, nearest-neighbor, and n-nearest-neighbors methods. But nearest neighbor is next on my list! I swear! --Chase-san

Daw!! Major logical flaw. My tree structure is broken. Well ... actually it's working just fine - without the self-balancing part. It's not even twice as fast as a linear search, though. But now that I think harder about getting it to self-balance on each add, it's definitely not going to work. Back to the drawing board ... -- Simonton

      1000 iterations; history size=26,000 (about the # of tick waves in a fast learning challenge); cluster size=35; dimensions=5
        O(n) method: 1624
        O(log n) method: 1120
      1000 iterations; history size=370,000 (about the # of tick waves in a regular challenge); cluster size=35; dimensions=5
        O(n) method: 19168
        O(log n) method: 3033
      1000 iterations; history size=55,000 (history size I could increase to without skipping, theoretically); cluster size=35; dimensions=5
        O(n) method: 3084
        O(log n) method: 1652

The second one really shows the speed bonus of using a KDTree. I'm about to start work on the NN function, so everyone can try this in a few minutes without having to make their own. --Chase-san

It has the nearest function now, tell me if there are any errors! I would do more but I have to get to class. --Chase-san

 Unbalanced Tree
  -------+------- 
 |       |       |
 +---+-*-+       |  S = the query node (find its nearest neighbor)
 |   |   |   *   |  * = normal points
 |   |   *       |
 |   *   |       |
 |   |   |S      |
 |   |  *|       |
  ---+---+-------

 Balanced Tree
  ------+-------- 
 |      |    |   |
 |     *|    |   |  S = the query node (find its nearest neighbor)
 |      |    *   |  * = normal points
 |      +*---+---|
 |---*--|        |
 |      | S      |
 |      *        |
  ------+--------
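The pruning these diagrams illustrate can be sketched as follows. This is a minimal, standalone nearest-neighbour search (my own simplified node class, not the KDTree code above); a subtree on the far side of a splitting plane is only visited when the plane is closer than the best match found so far:

```java
// Simplified k-d tree nearest-neighbour search with plane-distance pruning.
class NNSketch {
	static class Node {
		double[] point;
		int axis;
		Node left, right;
		Node(double[] p, int a) { point = p; axis = a; }
	}

	static double[] best;
	static double bestDist; // squared distance, avoids sqrt

	static double sqDist(double[] a, double[] b) {
		double s = 0;
		for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
		return s;
	}

	static void search(Node n, double[] q) {
		if (n == null) return;
		double d = sqDist(n.point, q);
		if (d < bestDist) { bestDist = d; best = n.point; }
		double diff = q[n.axis] - n.point[n.axis];
		Node near = diff < 0 ? n.left : n.right;
		Node far  = diff < 0 ? n.right : n.left;
		search(near, q);
		// Only cross the splitting plane if it could hide a closer point.
		if (diff * diff < bestDist) search(far, q);
	}

	static double[] nearest(Node root, double[] q) {
		best = null;
		bestDist = Double.POSITIVE_INFINITY;
		search(root, q);
		return best;
	}
}
```

In an unbalanced tree the splitting planes hug the data badly, so the `diff * diff < bestDist` test rejects fewer subtrees and more of the tree is scanned.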

Right now I am turning some completely unreadable pseudocode into Java, so I might be a while. The main reason I am doing this is to make sure I have it right.

Example:
   1  direction procedure KDCOMPARE(P,R)
   2  value pointer node P,R
   3  return(if COORD(P,DISC(R)) < COORD(R,DISC(R)) then 'LEFT'
   4         else 'RIGHT'
   5         endif)

I changed to

   public static enum Direction { LEFT, RIGHT };

   private static final Direction KDCOMPARE(KDNode P, KDNode R) {
       if(P.coordinate[R.axis] < R.coordinate[R.axis])
           return Direction.LEFT;
       return Direction.RIGHT;
   }

There are a few that are a lot worse than that, almost completely incomprehensible: not the structure, but the 'functions' they throw in to make their code smaller, which require a lot of lookups. (Normally if I read through the chapter I can find the reference for what a function does.) --Chase-san

For what it's worth, searching for "java kdtree" in Google turns up a fair number of things, including some GPLed code. You might want to take a peek at those for a reference of how to implement some parts of it. --David Alves

But that's boring. ^_^ Here is my finished set of converted code, replacing what was there. --Chase-san

Looks like I gave up too early on the KDTree stuff. I had a major inefficiency in my search algorithm. Here are the timings I get now. 7 dimensions, 26,000 scans, cluster size = 50. Also, this is with squaring the distances, which seems to make it run a lot faster.

    O(n) method: 2172
    O(log n) method: 1024
But here's the bonus move ... 52,000 scans:
    O(n) method: 3945
    O(log n) method: 1260
104,000 scans:
    O(n) method: 8436
    O(log n) method: 1512
Are you ready for this .... 400,000 scans. This is more than you'll get in a targeting challenge, all for less than you WERE spending on 35 rounds.
    O(n) method: 70801
    O(log n) method: 1756
-- Simonton
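The squared-distance trick Simonton mentions works because sqrt is monotonic: comparing squared distances gives the same ordering as comparing true distances, so the sqrt call can be skipped entirely. A minimal illustration (class and method names are my own):

```java
// Comparing squared distances gives the same ordering as comparing true
// distances, because sqrt is monotonic -- so the sqrt call can be skipped.
class SquaredDistance {
	static double sq(double[] a, double[] b) {
		double s = 0;
		for (int i = 0; i < a.length; i++) {
			double d = a[i] - b[i];
			s += d * d;
		}
		return s;
	}

	// true if a is closer to q than b is, without any sqrt
	static boolean closer(double[] q, double[] a, double[] b) {
		return sq(q, a) < sq(q, b);
	}
}
```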

So you got it working? --Chase-san

Well, yeah. This is the code I had before; I just changed the order of the tree traversal when searching, to seek out the closest nodes more quickly, allowing more of the tree to be skipped. I still have my eye out for a tree like this that can keep itself balanced with each add, though. I am currently pulling clusters of size 500 and doing more analysis with those, and my timings of cluster sizes that big say that using the tree would be slower, so I'm not actually using this in a bot yet. -- Simonton

Hey Simonton, how exactly are you able to find the k nearest neighbours to a node in O(log n), aka the price of an insertion? The fastest I've gotten my tree is O(n^(1 - 1/d)); care to shed some light on this? -- Gorded

I am starting work on mine again. I am trying to make a generic 'util'-looking one, as per the java.util package. I mean I am trying to make the interface more user friendly, ergo getNextNearestNeighbor? and similar. --Chase-san

Mine's pretty darn generic: you just extend KDTree.Node, which is an abstract class, and define the distance function and what variables need to be set to replace it with another node. All functions return an instance of your new class, and k-nearest-neighbour returns a heap with the furthest away at the top. And it actually removes nodes properly instead of marking them for deletion upon a rebuild. The only thing not done is balancing. I'll put up the code if you want, when I have the rest of the movement and gun code done. Hopefully by the end of the week. -- Gorded

That would be interesting to see, as removing nodes from a KD-Tree is tricky (you have to switch all the axes in the sub-tree). --Chase-san

Huh! Switching all the axes in the subtree of the removed node would require a total rebuild of that tree, or a lot of recursive switching of nodes that makes my head hurt even thinking about it. There is a much easier solution that only requires log(n) time; it's quite simple and elegant. Can you think of it, or do you need a hint? :D -- Gorded

I know of one that takes O(k log n): taking all the nodes below the removed node and reinserting them with the add function. But as for a flat O(log n), no. --Chase-san

It's quite simple, and you'll kick yourself for not thinking of it. Traverse the tree until you find the node to be removed, and replace it with either the node with the minimum value on that axis in the right subtree, or the node with the maximum value on that axis in the left subtree. Then remove the node that is now duplicated. -- Gorded
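The replacement scheme described above might look roughly like this in Java (a minimal sketch with my own node class, not Gorded's actual code; it draws the replacement from the right subtree and recursively deletes the duplicate):

```java
// Sketch of k-d tree deletion by replacement: swap in the node with the
// minimum value (on the removed node's axis) from the right subtree, then
// recursively delete that duplicate further down.
class DeleteSketch {
	static class Node {
		double[] p;
		int axis;
		Node left, right;
		Node(double[] p, int axis) { this.p = p; this.axis = axis; }
	}

	// Node with the smallest coordinate on dimension d in this subtree.
	static Node findMin(Node n, int d) {
		if (n == null) return null;
		Node best = n;
		Node l = findMin(n.left, d);
		if (l != null && l.p[d] < best.p[d]) best = l;
		// If n splits on d, nothing smaller can hide in its right subtree.
		if (n.axis != d) {
			Node r = findMin(n.right, d);
			if (r != null && r.p[d] < best.p[d]) best = r;
		}
		return best;
	}

	// Delete the node holding point q; returns the new subtree root.
	static Node delete(Node n, double[] q) {
		if (n == null) return null;
		if (java.util.Arrays.equals(n.p, q)) {
			if (n.right != null) {
				Node min = findMin(n.right, n.axis);
				n.p = min.p;
				n.right = delete(n.right, min.p);
			} else if (n.left != null) {
				// No right subtree: pull the min from the left and move it over.
				Node min = findMin(n.left, n.axis);
				n.p = min.p;
				n.right = delete(n.left, min.p);
				n.left = null;
			} else {
				return null; // leaf
			}
		} else if (q[n.axis] < n.p[n.axis]) {
			n.left = delete(n.left, q);
		} else {
			n.right = delete(n.right, q);
		}
		return n;
	}
}
```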

That actually might work. --Chase-san

It does my friend... it does :D -- Gorded

However, those nodes may already have subtrees, or there may not be a node with the same axis in either subtree (if you have a high number of dimensions). --Chase-san

You're misunderstanding a bit of the algorithm. 1) As for whether the replacement nodes have subtrees, that matters not, because once the node has replaced the node to be removed, we then remove the duplicate of it further down the tree. 2) You're not looking for a node that is split on the same axis; you're looking for a node that has the greatest/least (depending on whether you choose from the left or right subtree) value in the axis that the node to be removed is split on. Hope that clears it up; if not I'll post the code when I finish Yoda, along with a bunch more useful snippets. Anyway, I'm off to bed, it's 5am. -- Gorded

Okay about the subtrees, but there's still the axis problem, as there may not be a node with the same axis further down the tree; say you remove a node with only two nodes attached to it, and no subtrees at all. (Unless you mean the entire tree, in which case the use of the word subtree was just plain confusing.) --Chase-san

In my implementation, each leaf node has an ArrayList that holds up to X items, at which point it becomes an internal node with two leaf nodes. To remove a data point, I just remove it from the ArrayList; I never remove nodes. Since the total amount of data I store in the tree is either increasing or staying constant, there isn't much benefit to removing nodes: it wouldn't happen often and I don't think it would offer significant memory savings. --David Alves
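The bucket-leaf scheme David describes might be sketched like this (the class name, capacity, and mean-based split are my own illustration, not his actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a bucket k-d tree node: a leaf holds points in a list and splits
// into two children once it exceeds a fixed capacity.
class BucketNode {
	static final int CAPACITY = 4; // illustrative only; real trees tune this
	final int axis;
	double split;
	BucketNode left, right;             // null while this node is a leaf
	List<double[]> bucket = new ArrayList<>();

	BucketNode(int axis) { this.axis = axis; }

	void add(double[] p, int dims) {
		if (bucket == null) {           // internal node: route to a child
			(p[axis] < split ? left : right).add(p, dims);
			return;
		}
		bucket.add(p);
		if (bucket.size() > CAPACITY) splitLeaf(dims);
	}

	// Removing a point never removes a node, matching the scheme above.
	boolean remove(double[] p) {
		if (bucket != null)
			return bucket.removeIf(q -> java.util.Arrays.equals(p, q));
		return (p[axis] < split ? left : right).remove(p);
	}

	private void splitLeaf(int dims) {
		// Split at the mean for brevity; a median split balances better.
		double sum = 0;
		for (double[] p : bucket) sum += p[axis];
		split = sum / bucket.size();
		boolean anyLeft = false, anyRight = false;
		for (double[] p : bucket)
			if (p[axis] < split) anyLeft = true; else anyRight = true;
		if (!anyLeft || !anyRight) return; // all equal: stay an oversized leaf
		left = new BucketNode((axis + 1) % dims);
		right = new BucketNode((axis + 1) % dims);
		for (double[] p : bucket)
			(p[axis] < split ? left : right).add(p, dims);
		bucket = null;                  // now an internal node
	}
}
```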

Has a heart attack from a stack overflow! ... I don't even know HOW it can recurse that deep when it only has 25 nodes to search through..... --Chase-san


Edited January 27, 2008 3:01 EST by Chase-san