Parallel Kd-tree


To solve the circle collision detection problem in a timely maner, the use of space partitioning data structures is needed. The trivial algorithm that consist of testing all the points with each other has a complexity of O(n^2), with n in the tens of thousands the algorithm explodes and becomes to slow using todays CPU’s. To reduce the execution time we reduce the number of comparaisons using a space partitioning data structures, there is a large catalog of data structures available, each with there cons and pros.

Today we will present how to use the K-d tree data structure to solve the 2D circle collision detection problem in a parallel way, taking advantage of todays CPU’s mutiple cores.

Building the K-d tree

To build the tree in a sequential way we can use the following algorithm from Wikipedia:

function kdtree (list of points pointList, int depth)
{
    // Select axis based on depth so that axis cycles through all valid values
    var int axis := depth mod k;

    // Sort point list and choose median as pivot element
    select median by axis from pointList;

    // Create node and construct subtree
    node.location := median;
    node.leftChild := kdtree(points in pointList before median, depth+1);
    node.rightChild := kdtree(points in pointList after median, depth+1);
    return node;
}

Building the K-d tree in a parallele way

The parallel build algorithm consists of creating a new job or thread for one subtree, because subtrees are independant we can do this in a lockfree maner. The number of threads or jobs created is ((2^n) / 2) with n equal to the height of the tree, so to limit the number of threads or jobs created, we use a condition that switches to the sequential algorithm after a certain depth.

function kdtree_parallel(list of points pointList, int max_depth)
{
    tree_node root;
    kd_tree_parallel(pointList, root, max_depth, 0);
    wait_for_all_threads();
    return root;
}

function kdtree_parallel_internal (list of points pointList, tree_node node, int max_depth, int depth)
{
    // Select axis based on depth so that axis cycles through all valid values
    var int axis := depth mod k;

    // Sort point list and choose median as pivot element
    select median by axis from pointList;

    // Create node and construct subtree
    node.location := median;
    
    if(depth < max_depth)
    {
        // Split to the left creating a new thread.
        new thread(kdtree_parallel_internal(points in pointList before median, node.leftChild, depth+1));
        kdtree_parallel_internal((points in pointList after median, node.rightChild, depth+1));
        return;
    }

    // Continue with the sequential algorithm.
    node.leftChild := kdtree(points in pointList before median, depth+1);
    node.rightChild := kdtree(points in pointList after median, depth+1);
    return node;
}

Example n = 4:
n = 0: 0 splits to the left and creates 4.
n = 1: 0 creates 1 and 4 creates 5.
n = 2: 0 creates 3, 1 creates 2, 4 creates 7, 5 creates 6.
n = 3: We have created a total of 8 threads, (0, 1, 2, 3, 4, 5, 6, 7) switch to the sequential algorithm and continue building the tree.

            0  
           / \
          /   \
         /     \
        /       \
       /         \
      4           0
     / \         / \
    /   \       /   \
   5     4     1     0
  / \   / \   / \   / \
 6   5 7   4 2   1 3   0

Range search

To find which circles are colliding with a given circle we just have to do a range search with the max distance being the diameter of the circle, we can solve the collision directly inside the range search or store the colliding circles for later resolution.

The following algorithm does a range search and returns a list of all the nodes in range of a point. A similar algorithm can be used to find the nearest neighbour.

function range_search(Point point, tree_node node, list of points list, int max_distance, int depth)
{
    if(node == NULL) return;

    if(distance(node, point) < max_distance)
    {
        // Do something with node and point because they are in range.
        // You can replace the list.add() with a do_something(point, node).
        list.add(node);
    }

    var int axis := depth mod k;
    if(axis == 0) // x axis
    {
        if(point.x - max_distance < node.location.x) range_search(point, node.leftChild, list, depth + 1);
        if(point.x + max_distance > node.location.x) range_search(point, node.rightChild, list, depth + 1);
    }
    else
    {
        if(point.y - max_distance < node.location.y) range_search(point, node.leftChild, list, depth + 1);
        if(point.y + max_distance > node.location.y) range_search(point, node.rightChild, list, depth + 1);
    }
}

You can query the tree in a lockfree parallel way if the do_something() function does not modify the tree.

Demo

Using this algorithm I was able to create a simple DEM simulator that can run about 300K particles at 30fps stable. The tree building and the range search are done in parallel using a ryzen 3600, using 12 threads with a thread pool.

Comments