Identifying Player Drawn Symbols

Published January 02, 2020 by HappyCoder
Do you see issues with this article? Let us know.
Advertisement

In my current virtual reality game project I want players to be able to craft their own spells. I decided to do this through using a visual programming language. Users draw individual symbols that each perform different actions. The symbols can be connected together into custom spells that allow for much more freedom and flexibility for solving puzzles in the game world. This offered a difficult technical challenge. How do I identify a symbol from a messy imprecise drawing?

Symbol Identification

A symbol can have any number of lines, curves and corners. A poorly drawn straight line could have some curve to it. A curve could be too straight. Distances aren't precise and players often start or end a symbol with a stray line. Trying to separate out individual lines and curves for a drawn symbol is a bad approach. Trust me, I tried. Instead you should measure how similar the drawn symbol is to each of the possible symbols. The closest match is the chosen symbol. This creates a new problem, how do you measure the "distance" between two symbols? Simple, you figure out a way to convert a symbol into a point since measuring the distance between two points is trivial. Once you have a way to convert any piece of data into a point it can be compared to other points and a simple nearest neighbor search is all it takes to classify that data.

Converting a curve to a point

The complex details of a symbol won’t fit into a simple 2D or 3D point. Instead you will need to use an arbitrary number of dimensions to represent the curve. The approach I took was to simply put all the points from the line drawn by the user into a single array. Measuring the distance between two arrays of points is as simple as summing up the distance squared for each corresponding point.

private float Similarity(Vector3[] a, Vector3[] b)
{
    float result = 0.0f;
    for (int i = 0; i < a.Length; ++i)
    {
         result += (a[i] - b[i]).sqrMagnitude;
    }

    return result;
}

This of course creates a new problem. The arrays need to be the same length. Since the user is free to draw a symbol with as many or as few points as they want you need a way to convert that to a fixed number of points. I simply redistributed a fixed number of points along the curve the user drew.

public class DistanceStepper
{
	private float startIndexLength = 0.0f;
	private int index = 0;
	public Vector3[] points;
	public float[] lengths;
	public float distance = 0.0f;
	public DistanceStepper(Vector3[] points)
	{
		this.points = points;
		lengths = new float[points.Length - 1];
		for (int i = 0; i < lengths.Length; ++i)
		{
			lengths[i] = Vector3.Distance(points[i], points[i + 1]);
			distance += lengths[i];
		}
	}

	public Vector3 PointAtDistance(float distance)
	{
		if (startIndexLength > distance)
		{
			startIndexLength = 0.0f;
			index = 0;
		}

		while (index < points.Length - 2 &amp;amp;amp;&amp;amp;amp; startIndexLength + lengths[index] < distance)
		{
			startIndexLength += lengths[index];
			++index;
		}

		return Vector3.Lerp(points[index], points[index + 1], (distance - startIndexLength) / lengths[index]);
	}
}

public Vector3[] UseNPoints(Vector3[] points, int number)
{
    DistanceStepper distanceStepper = new DistanceStepper(points);
    float distance = 0.0f;

    float distanceStep = distanceStepper.distance / (number - 1);
    Vector3[] result = new Vector3[number];

    for (int i = 0; i < number; ++i)
    {
		result[i] = distanceStepper.PointAtDistance(distance);
		distance += distanceStep;
	}

	return result;
}

You also need to pass all reference symbols through this same process so they have the same number of points, evenly distributed along the length of the curve.

With this alone you will be able to identify symbols but it will only work if the user draws the symbol in the same location size and orientation of the reference symbol. To fix that issue we need to normalize the symbol.

Normalizing a Symbol

Normalizing a symbol means removing any differences in position, size, and orientation between two line drawings.

Normalizing Position

To normalize the position in a symbol simply pick a feature on the curve and move all the points to be relative to that feature. I picked the center of mass as that feature but simply picking the first point in the curve would also work.

public Vector3 FindCenterOfMass(Vector3[] points)
{
    Vector3 midpoint = Vector3.zero;
    float midpointWeight = 0.0f;
    for (int i = 0; i < points.Length - 1; ++i)
    {
        float distance = Vector3.Distance(points[i], points[i + 1]);
        midpoint += (distance * 0.5f) * (points[i] + points[i + 1]);
        midpointWeight += distance;
    }
    return midpoint / midpointWeight;
}

public Vector3[] NormalizePosition(Vector3[] points)
{
    Vector3 center = FindCenterOfMass(points);
    return points.Select(point => point - center).ToArray();
}

Normalize Size

To normalize the size you need to pick two points in the drawing and scale the drawing so that distance is 1. Any two features will work but you want to be sure the two points won’t ever be close together since small changes in that length could mean massive differences in size between two normalized symbols. I picked the distance between the center of mass of the symbol and the point furthest from the center of mass as my reference. Since I normalize the location of the points first the origin is the center of mass so I can just use the length of each vector as its distance from the center of mass.

public Vector3[] NormalizeSize(Vector3[] points)
{
	float sizeSqrd = points.Select(point => point.sqrMagnitude).Max();
	float scale = 1.0f / Mathf.Sqrt(sizeSqrd);
	return points.Select(point => point * scale).ToArray();
}

Normalize Orientation

This is by far the hardest thing to normalize. I would only try to normalize the orientation if you absolutely have to. If you are dealing with drawing symbols on a 2D screen with a very clear up direction you could just expect the player to always draw the symbol in the same orientation. If you do need to unrotate a symbol you will need to identify a single vector to use as a reference if you are working in 2D. You will also need an additional up vector if you are working in 3D. For my primary vector I used the direction from the first point to the center of mass. Since I am working in 3D I needed a second vector and used the normal of the plane that best fit the points using a least squares regression. I don’t include my implementation for the second vector in this article to keep it simple.

Once you have your basis vectors you need to rotate the symbol so that the first basis vector points in the positive X direction and the second vector points roughly up.

public Vector3[] NormalizeOrientation(Vector3[] points, Vector3 firstBasis, Vector3 secondBasis)
{
	Quaternion rotation = Quaternion.Inverse(Quaternion.LookRotation(firstBasis, secondBasis));
	return points.Select(point => rotation * point).ToArray();
}

Keep in mind that if you chose to unrotate your points, or depending on what you chose as the basis vectors, you may limit what symbols will be considered unique. Take for example the letters ‘d’ and ‘p’. If you were to unrotate these symbols they would end up being nearly identical not allowed the computer to differentiate them. If you simply chose not to unrotate in this case they would stay distinct. The features you pick as the basis vectors could also effect the accuracy of the algorithm. In my example, using the direction from the first point to the center of mass means that if there is a symbol that has a starting point very close to the center of mass the unrotate step could yield wildly different results with only small errors in the drawing. Just be careful to pick two basis vectors that will tend to be long and roughly perpendicular.

Complete Normalization

Once you can normalize position size and orientation put it all together. Note that I used the global forward vector for the second basis when unrotating. This is to keep the below example simple.

public Vector3[] Normalize(Vector3[] points, int resolution)
{
	Vector3[] result = UseNPoints(points, resolution);
	result = NormalizePosition(result);
	result = NormalizeSize(result);
	
	// Only unrotate so that the line going from the first point the center of mass is horizontal
	result = NormalizeOrientation(result, -points[0], Vector3.forward);
	return result;
}

Putting it All Together

The implementation of normalize is what makes or breaks this algorithm. The rest of the implementation is trivial. You simply store a list of normalized symbols for reference and find the closest reference symbol when trying to identify one drawn by the user.

public class DistanceIdentifier
{
    private List<KeyValuePair<string, Vector3[]>> symbolList = new List<KeyValuePair<string, Vector3[]>>();
    
    // The number of points touse in each symbol
    private int resolution = 40;
    
    public void AddSymbol(string name, Vector3[] points)
    {
        // Be sure to Normalize the reference symbol before adding it
        symbolList.Add(new KeyValuePair<string, Vector3[]>(name, Normalize(points, resolution)));
    }

    // Measures the distance between two symbols
    private float Similarity(Vector3[] a, Vector3[] b)
    {
        float result = 0.0f;
        for (int i = 0; i < a.Length; ++i)
        {
            result += (a[i] - b[i]).sqrMagnitude;
        }
        
        return result;
    }

    public string Identify(Vector3[] points)
    {
        string result = null;
        float similarity = float.PositiveInfinity;
        Vector3[] normalized = Normalize(points, resolution);

        for (int i = 0; i < symbolList.Count(); ++i)
        {
            float test = Similarity(symbolList[i].Value, normalized);

            if (test < similarity)
            {
                similarity = test;
                result = symbolList[i].Key;
            }
        }
        
        return result;
    }
}

Where to Go From Here

This algorithm is a simplified version of what I am using in my game here are a few bugs you may encounter or features you may want and how to deal with it.

Identifying a Symbol Drawn From Either Endpoint

The algorithm detailed above would only identify a symbol if the player only starts the symbol from the same endpoint as the reference. A simple fix would be to add each symbol the the reference twice. Once with the points in the correct order and once with the points in reverse order.

Determine the Position, Size, and Orientation of the Symbol

As in my case, you may want to know the position size and orientation of the user drawn symbol. To do this you save the reference point used to normalize the position, the scale used to normalize size, and the rotation used to normalize orientation. Combine the three pieces of data into a transform and return that information with the identified symbol. You may also want to store the transforms for normalizing the reference symbols. You can concatenate the inverse of the reference symbol transform with the transform from the drawn symbol to get a transformation that can map the reference symbol into the 3D space where the user drew the symbol to render a cleaned up version of the symbol.

Machine Learning

I originally tried using neural networks to identify symbols. The normalize step was the same but the normalized data was analyzed using a neural network. What I discovered is that you definitely should not use neural networks. They are slower, less accurate, and are a pain to train. Every time I wanted to add a new symbol or modify an existing one I would have to retrain the network. That doesn’t mean you can’t use machine learning though. To have the computer “learn” simply add more variations of the same symbol as drawn by the players. This will help the computer better pick the correct symbol as it has a large database of symbols to compare to. As your data set grows you may start to see a slowdown when identifying symbols. This leads me to the next point.

Handling Large Data Sets

In my game, I only have around 30 symbols to check against. There is no need to optimize this since there is a tiny amount of data compared to what it would take to cause a noticeable slowdown. I would recommend not worrying about optimizing the search until you know you need it. If you do, then you will need to use some sort of spatial index for calculating the nearest neighbor. If you are familiar with spatial indices then you may be thinking a k-d tree is a good fit. After all, it is used in 2D and 3D nearest neighbor calculations with great success. Unfortunately the k-d suffers from the curse of dimensionality. Once you start working with a large number of dimensions a k-d tree does about as good as a linear search making it about as good as no index at all. There are solutions to this problem but you usually either need to limit the radius of the search for the nearest neighbor, possibly resulting in no match, or sacrificing accuracy by not finding the actual nearest neighbor but instead finding a near enough neighbor. This is an area of ongoing research and is out of scope for this article. If you do need a spatial index, try searching for nearest neighbor algorithms for a high number of dimensions online.

Cancel Save
1 Likes 0 Comments

Comments

Nobody has left a comment. You can be the first!
You must log in to join the conversation.
Don't have a GameDev.net account? Sign up!

Featured Tutorial

Identifying drawings can be difficult. This articles covers a basic algorithm that can classify symbols drawn by a player that is both fast and accurate.

Advertisement
Advertisement
Advertisement