Amazon has (somewhat) recently added some new services under the Artificial Intelligence offerings, one of them being a Machine Learning service. I wanted to play around with their predictive analysis service so I decided to make a really simple proof of concept.
Predictive analysis, in a nutshell, means looking through a large dataset of input values, each paired with a known outcome. That outcome may be a true-or-false condition (Binary Classification), a numerical value (Regression), or one of several labels (Multiclass Classification). This data is used to generate a model that captures the correlation between the input variables and the outcome; the model can then be fed new input values to predict what their outcome will be. The catch, of course, is that you need to have this large set of training data to work with.
Since I didn’t have any data available, I wanted to see what I could generate on my own. I decided to try making a model that could guess the name of a color from its hex value. The end result would look something like the following (once integrated into Slack):
So to generate the training data, I took a list of about 500 common web hex colors and normalized all of the names to a common set (e.g. red, orange, yellow, etc., instead of brick, tangerine, gold, etc.). Then I split each hex value into data that AWS would be able to find correlations in. I started with the simplest option, which was just RGB values, with each channel being an integer between 0 and 255.
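The hex-to-RGB split described above can be sketched in a few lines. This is a minimal illustration rather than the code actually used to build the dataset; the function name `hexToRgb` and the property names are assumptions.

```javascript
// Hypothetical helper: split a six-digit hex color into integer channels (0-255).
function hexToRgb(hex) {
  const match = /^#?([0-9a-f]{6})$/i.exec(String(hex).trim());
  if (!match) {
    throw new Error(`Not a six-digit hex color: ${hex}`);
  }
  const value = parseInt(match[1], 16);
  return {
    red: (value >> 16) & 0xff,  // top byte
    green: (value >> 8) & 0xff, // middle byte
    blue: value & 0xff,         // bottom byte
  };
}
```

For example, `hexToRgb('#FF8000')` gives 255 for red, 128 for green, and 0 for blue.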
I organized this data in a Google Doc and exported it as a CSV. This was then uploaded to the AWS Machine Learning service, and a Multiclass Classification model was created. It consisted of three numerical inputs (the Red, Green, and Blue values) and a categorical target attribute, which is essentially the color label.
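To make the shape of that training file concrete, here is a rough sketch of how such a CSV could be produced. The column names, the header row, and the sample rows are all assumptions for illustration; they are not taken from the original dataset.

```javascript
// Assumed column names: three numeric inputs plus the categorical target.
const COLUMNS = ['red', 'green', 'blue', 'color'];

// Turn an array of row objects into CSV text with a header line.
function toCsv(rows) {
  const lines = rows.map((row) => COLUMNS.map((name) => row[name]).join(','));
  return [COLUMNS.join(','), ...lines].join('\n');
}

// Illustrative rows only.
const sampleCsv = toCsv([
  { red: 255, green: 0, blue: 0, color: 'red' },
  { red: 255, green: 165, blue: 0, color: 'orange' },
]);
```

Here `sampleCsv` holds a header line followed by one line per color, with the label in the last column as the value the model learns to predict.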
The first results weren’t too accurate; in fact, blatantly wrong guesses happened fairly often. I decided to change my training data to something that might make correlations “easier” to find. So I took the same set of color values and instead used HSV (hue, saturation, and value) values to train the model. The results were much better, returning accurate guesses more frequently. Some color pairs, such as white & gray and orange & brown, still prove a little tricky to guess.
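For reference, the standard RGB-to-HSV conversion looks roughly like this. The post doesn't say which scale was used for the training data, so the ranges here (hue in degrees from 0 to 360, saturation and value from 0 to 1) are an assumption, as is the function name.

```javascript
// Convert 0-255 RGB channels to HSV.
// Assumed output ranges: hue 0-360 degrees, saturation and value 0-1.
function rgbToHsv(red, green, blue) {
  const r = red / 255;
  const g = green / 255;
  const b = blue / 255;
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;

  let hue = 0; // grays have no meaningful hue; 0 by convention
  if (delta !== 0) {
    if (max === r) hue = ((g - b) / delta) % 6;
    else if (max === g) hue = (b - r) / delta + 2;
    else hue = (r - g) / delta + 4;
    hue *= 60;
    if (hue < 0) hue += 360;
  }

  const saturation = max === 0 ? 0 : delta / max;
  return { hue, saturation, value: max };
}
```

With this representation, all shades of one color family share a similar hue, which is one plausible reason the model found the labels easier to separate.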
Even though there’s no real practical value to this, I still wanted a way to interface with it outside of the AWS console. I figured that this would be a good excuse to try out Lex, which is intended to be used for Alexa-like chatbots.
It was pretty straightforward to set up an intent to capture the user’s input. In this case it’s just asking for the hex value. Lex then sends this value to a Node.js Lambda function, which converts the hex value to HSV values. The Lambda function then sends these values to the Machine Learning endpoint to make a realtime prediction. Lastly, the Lambda function captures the result and adds some language based on the confidence of the outcome before returning it to Lex to be presented to the user. (For example, 98% confidence or higher may result in “I am certain that is…” and less than 80% confidence may return “If I had to guess I’d say…”)
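The confidence wording step might look something like the sketch below. It assumes the input is the `Prediction` object from an Amazon ML realtime prediction for a multiclass model, with a `predictedLabel` and a `predictedScores` map of label to score. The function name and the middle-tier wording are made up for illustration; only the two quoted phrases and their thresholds come from the description above. The network call to the prediction endpoint is left out.

```javascript
// Hypothetical helper: wrap the predicted label in wording that reflects confidence.
// `prediction` is assumed to look like:
//   { predictedLabel: 'blue', predictedScores: { blue: 0.99, purple: 0.01 } }
function describePrediction(prediction) {
  const label = prediction.predictedLabel;
  const confidence = prediction.predictedScores[label];

  if (confidence >= 0.98) {
    return `I am certain that is ${label}.`;
  }
  if (confidence < 0.8) {
    return `If I had to guess I'd say ${label}.`;
  }
  // Wording for the middle band is an assumption; the post doesn't give it.
  return `I think that is ${label}.`;
}
```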
I added more intents that were, for the most part, just informational. I routed all of these intents through the same lambda function, and handled them based on the passed currentIntent.name property.
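Routing several intents through one function can be sketched as a lookup keyed on `currentIntent.name`. The intent names, slot name, and reply text below are hypothetical, and the response uses the `Close` dialog action shape from the Lex (V1) Lambda response format; the real function would be wired up as the Lambda handler and would run the color prediction instead of echoing the slot.

```javascript
// Build a Lex response that ends the conversation with a plain-text message.
function close(content) {
  return {
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: { contentType: 'PlainText', content },
    },
  };
}

// Hypothetical intent names and replies, for illustration only.
const intentHandlers = new Map([
  ['GuessColor', (event) => close(`You asked about ${event.currentIntent.slots.HexColor}.`)],
  ['AboutBot', () => close('I guess color names from hex values.')],
]);

// Dispatch on the intent name that Lex passes in the event.
function route(event) {
  const handler = intentHandlers.get(event.currentIntent.name);
  return handler ? handler(event) : close("Sorry, I don't know how to help with that.");
}
```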
One issue I ran into with the intent was getting Lex to actually recognize the hex color as a slot. There’s no “hex color” slot type, and you can’t define your own based on a regular expression, so I was a bit stuck. (A custom slot type didn’t seem to want to learn based on sample inputs.) Luckily, through trial and error, I discovered that the built-in AMAZON.StreetAddress type actually worked quite well for identifying hex values.
Lex then needed to be hooked up to something; otherwise you’re stuck with a bot that is only accessible through the AWS console. Amazon has three integrations that work pretty much out of the box: Facebook Messenger, Slack, and SMS through Twilio.
I first set up a Slack bot, as I figured that would be the easiest way to share it with other people at work. I then set up an SMS text bot as well, since I already had a Twilio account set up from my Google Spreadsheet Image Generator thing.
You can try out the result yourself. If you have the rights, you can add the bot to your own Slack team with the button below:
Otherwise, you can chat by texting +1 (412) 693-6060.
You can learn more about Amazon’s Machine Learning service here and building conversational chatbots with Lex here.