Artificial Intelligence (AI) is mind-blowing to me 🤯. Even though I understand the world of Software Development, every time I try a product touched by AI, I can only nod and say, ‘wow, this feels like real magic’. Of course, I know there’s no magic in technology. Everything can be explained and, of course, can be learned.
AI is one of the things I want to explore this year. I don’t have big ambitions. This isn’t about becoming an AI master (an AI Engineer, ML Engineer, and so on). I just want to enjoy playing with AI through the tools I usually use as a web developer.
Learning Outcome
Every time I want to learn something new, I always set a target. The goal is to stay motivated and focused throughout the learning process. My target for learning AI this time is at least to create a web application that involves computation processes using AI technology. Here are the specifications for the application.
- Application Overview: Doll name detector application
- Interface: HTTP API (RPC)
- User Input: File -> Doll Photo
- System Output: Doll Name
- Tools to be Used: JavaScript, Node.js.
Yes! My target is to create an application (an RPC API, because I’m too lazy to build a Front-End lol) that can detect the names of my child’s dolls. She often buys dolls and immediately gives them their own names. 😆
So, from the above target, there are two main things I need to achieve.
- Must be able to create my own Machine Learning model. The Machine Learning model must learn the names of my child’s dolls.
- Must be able to use the Machine Learning model in Node.js (I believe I can because there is ONNX runtime web or Tensorflow.js).
Let’s start the adventure!
Target 1: Create My Own Machine Learning Model
My adventure in the AI world begins with the Belajar Dasar AI class at Dicoding. The material looked like a good fit and not overwhelming. The class has 4 core topics: Introduction to Artificial Intelligence, Data for AI, Introduction to Machine Learning, and Deep Learning for Everyone.
To pass this class, I had to answer some multiple-choice questions. Fortunately, I scored 100 on every quiz 😅. Usually, Dicoding classes require a submitted assignment to pass; unfortunately, this class doesn’t have one.
Here is my certificate after completing the Basic AI Learning class.
Summary of Learning Outcomes from the Basic AI Learning Class
Let me tell you some interesting things I got from this class.
Introduction to Artificial Intelligence
In this section, we are introduced to AI, from its definition to examples of popular products. I gained some new insights; for example, I now understand the relationship between Artificial Intelligence, Machine Learning, Deep Learning, and Generative AI. Each is a deeper subset of the previous one, and together they form a whole.
In this section, I also realized that using AI involves stages: collecting data, cleaning it, creating and training a model on that data, and finally benefiting from the AI model we have created. Although the technical details aren’t covered in depth in this class, in my opinion, for a web developer like me, an “Oh, I see” level of understanding is enough (at least for now). 😋
Data for AI
This module discusses the data AI can process. Basically, AI can process almost any data, as long as that data can be stored on a computer. So, if the data is still in conventional form, for example on paper, AI cannot process it directly. It must be digitized first, either by photographing it or retyping it into a document.
There’s not much new for me in this section because, as a web developer, I’m used to working with data. Although JavaScript is loosely typed, I know the data types computers usually process because I have dealt with databases.
There is a quote that I think is cool and worth remembering: “Garbage in, garbage out.” When we feed in “garbage” data, the result will also be “garbage.” So the quality of what AI produces depends on the data we provide as input.
Introduction to Machine Learning
In this module 🧠, I had to work hard, processing sentence by sentence, because there were many machine learning terms I didn’t know. In essence, this module explains the types of machine learning based on how they learn. Three types are covered: supervised learning, unsupervised learning, and reinforcement learning.
In short, supervised learning is Machine Learning that learns from “labeled” data. For example, an image of a 🐱 is labeled with the text “Cat,” just like a 🐔 image labeled “Chicken.” The machine learns from what we give it and can eventually distinguish a Cat from a Chicken.
Supervised learning is similar to how I taught my child the names of animals when she was 1 year old: "This is a Horse," "This is a Giraffe." In my opinion, supervised learning is a good fit whenever you want a "detector" built from data (structured or unstructured).
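To make that concrete, here’s a toy sketch (my own, not from the class) of the supervised idea in JavaScript: a tiny 1-nearest-neighbor “detector” that copies the label of the closest labeled example. The animals and feature values are completely made up.

```javascript
// Toy supervised learning: each training example is labeled,
// just like the "Cat" / "Chicken" images in the module.
const trainingData = [
  { features: [4, 30], label: 'Cat' },     // [legs, size] - made-up numbers
  { features: [2, 2], label: 'Chicken' },
  { features: [4, 45], label: 'Cat' },
  { features: [2, 3], label: 'Chicken' },
];

function distance(a, b) {
  return Math.hypot(...a.map((v, i) => v - b[i]));
}

// Predict by copying the label of the closest training example.
function predictLabel(features) {
  let best = null;
  let bestDist = Infinity;
  for (const example of trainingData) {
    const d = distance(example.features, features);
    if (d < bestDist) {
      bestDist = d;
      best = example.label;
    }
  }
  return best;
}

console.log(predictLabel([4, 40]));  // close to the Cat examples
console.log(predictLabel([2, 2.5])); // close to the Chicken examples
```

Real supervised models generalize far beyond copying the nearest example, but the principle is the same: labeled data in, a label predictor out.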
Unsupervised learning is Machine Learning that learns from unlabeled data. Usually, this type of Machine Learning is used to group data, not to determine what the data is. For example, given the input [🍎🍇🧀🌶️🍆🍌], the machine will group it and produce output like [🍇🍆, 🌶️🍎, 🧀🍌]. So, the machine independently looks for patterns that can group the input; in this case, it grouped by color.
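Out of curiosity, that grouping idea can be sketched in plain JavaScript with a tiny k-means-style loop. This is my own toy illustration, not how the class teaches it; the items and hue values are made up, using hue as a stand-in for "color".

```javascript
// Toy unsupervised learning: group unlabeled items by a single
// "color" feature (hue, 0-360). No labels are ever given; the
// groups emerge from the data itself.
const items = [
  { name: 'apple', hue: 0 },
  { name: 'chili', hue: 5 },
  { name: 'grape', hue: 280 },
  { name: 'eggplant', hue: 270 },
  { name: 'cheese', hue: 55 },
  { name: 'banana', hue: 50 },
];

// A very small k-means on one dimension (k clusters, few iterations).
function cluster(points, k, iterations = 10) {
  let centroids = points.slice(0, k).map((p) => p.hue);
  let groups = [];
  for (let it = 0; it < iterations; it += 1) {
    // Assign each point to its nearest centroid.
    groups = centroids.map(() => []);
    for (const p of points) {
      let best = 0;
      centroids.forEach((c, i) => {
        if (Math.abs(p.hue - c) < Math.abs(p.hue - centroids[best])) best = i;
      });
      groups[best].push(p);
    }
    // Move each centroid to the mean of its group.
    centroids = groups.map((g, i) =>
      g.length ? g.reduce((s, p) => s + p.hue, 0) / g.length : centroids[i],
    );
  }
  return groups;
}

const groups = cluster(items, 3);
console.log(groups.map((g) => g.map((p) => p.name)));
// → [['apple','chili'], ['cheese','banana'], ['grape','eggplant']]
```

The reds, yellows, and purples end up together, mirroring the [🍇🍆, 🌶️🍎, 🧀🍌] grouping above.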
Finally, reinforcement learning is the most interesting, in my opinion. This type of machine learning learns through a reward-and-punishment approach. The machine must act correctly to earn a reward, and a wrong action brings punishment. The machine's goal is to collect as many rewards as possible; if it gets punished, the process repeats until the machine fully understands its objective.
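As a toy sketch of that reward idea (again my own illustration, not from the class), here is a tiny two-armed bandit in JavaScript: the agent tries actions, collects rewards, and gradually learns which action pays off more. The probabilities and parameters are made up.

```javascript
// Toy reinforcement learning: a two-armed bandit.
// The agent is never told which action is better; it finds out
// by acting and observing rewards.
const rewardProbability = [0.2, 0.8]; // hidden from the agent
const estimates = [0, 0]; // the agent's learned value of each action
const counts = [0, 0];

function pull(action) {
  // Reward (1) or no reward (0), depending on the hidden probability.
  return Math.random() < rewardProbability[action] ? 1 : 0;
}

for (let step = 0; step < 2000; step += 1) {
  // Explore a random action 10% of the time; otherwise exploit
  // the action currently believed to be best.
  const action =
    Math.random() < 0.1
      ? Math.floor(Math.random() * 2)
      : estimates[1] > estimates[0] ? 1 : 0;
  const reward = pull(action);
  counts[action] += 1;
  // Update the running average reward estimate for that action.
  estimates[action] += (reward - estimates[action]) / counts[action];
}

console.log('learned estimates:', estimates);
```

After enough steps, the agent's estimate for the second action clearly beats the first, so it keeps choosing it: learning purely from rewards.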
If you can't imagine how it works, I once watched a good video that exemplifies the implementation of reinforcement learning below.
I got a lot of new knowledge from this module, both directly from the material and from digging deeper into other sources. The module also explains how to maintain AI models so that they keep producing relevant predictions and don’t become outdated over time.
Deep Learning for Everyone
The Deep Learning for Everyone module is, in my opinion, just as heavy as the previous one. I realize this is due to my 🤏🏻 brain capacity. Still, I kept reading slowly, understanding each sentence. So what did I learn from this module?
Deep learning is a subset of Machine Learning. In essence, AI built with Deep Learning works in a way inspired by the human brain. Here, the meaning of Artificial Neural Network, Multi-Layer Perceptron, and other Deep Learning terms is explained. Honestly, while writing this article, I had already forgotten part of it because the material goes very deep, and I myself refuse to dive that far (again, because I just want to use AI, not build it hehe).
The most interesting thing in this module, and perhaps in the whole class, is that I discovered the Teachable Machine platform. It is a platform for creating AI models in a very beginner-friendly way; even non-developers, I’m sure, can use it. Even so, the models it produces are still powerful. What makes me happiest, of course, is that the platform is free. 😛
After finding this platform, creating my own machine learning model became much easier 😂. Even though the approach is very instant, it is quite sufficient for my current needs. 😅
Create a Model Using Teachable Machine
Before I create the AI model, let me introduce my child’s three dolls that will be used for the experiment.
The doll on the left is named Gogo, short for Gosig Golden. The one in the middle is Lili, taken from the name Lilieplut. Meanwhile, the one on the right is named Apikat, gibberish derived from the animal name Meerkat.
It’s time to go to the Teachable Machine platform.
To create a model using Teachable Machine, visit the website at teachablemachine.withgoogle.com. Like other Google products, the interface is cool and very user-friendly.
Teachable Machine provides three base models: image project, audio project, and pose project.
This time, I chose the image project (and the standard image model) because it suits our needs.
The model training process is super easy. Just provide a dataset (images) for each class. Here, I created three classes, namely Gogo, Lili, and Apikat. Make sure each class has an adequate dataset.
After that, click the Train Model button to start training.
Yay, now we can export the model. The export results are available in various formats. I chose the Tensorflow.js format because it can be used in Node.js.
The model will be downloaded in .zip format, and when extracted, the contents will be like this:
- my-tf-model
- metadata.json
- model.json
- weights.bin
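For the curious: metadata.json is where Teachable Machine stores the class labels we will need later. Its exact fields vary between versions, but it looks roughly like this (trimmed and illustrative, based on my export):

```json
{
  "modelName": "my-tf-model",
  "labels": ["Gogo", "Lili", "Apikat"],
  "imageSize": 224
}
```

The labels array is the part our Node.js code will read to map prediction indexes back to doll names.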
The first target is complete! Next, we will use this model in Node.js! 🚀
Target 2: Use the Machine Learning Model in Node.js
To keep the article short, let’s skip the HTTP server creation part; that’s already my day job. 😬 Instead, I’ll provide a ready-made starter project, which can be downloaded from the link: starter project.
In this project, there is already an HTTP server (Hapi Framework), a POST /predict endpoint, a machine learning model (from Teachable Machine), and a unit test for the three dolls (with data different from the training dataset). What's missing is just the integration of the Tensorflow.js model with the application.
Running the starter project is quite simple.
- Install the required dependencies: npm install
- Run the HTTP server: npm start or npm run dev (for auto-reload)
- Run the unit test: npm test
Make sure the unit test fails because we will fix it now. Let’s go!
First, install the @tensorflow/tfjs-node package because we will use Tensorflow.js in Node.js. It's easy; type in the terminal:
npm install @tensorflow/tfjs-node
Next, load the machine learning model located at /models/model.json and its classification labels in /models/metadata.json.
Create a new function named loadModel to abstract this entire process. Put this function in a new file located at /src/ml.js.
// file: /src/ml.js
import tf from '@tensorflow/tfjs-node';
import * as path from 'path';
import * as fs from 'fs';

async function loadModel() {
  // Wrap the local file path in an IO handler that tf can read like a URL.
  const modelUrl = tf.io.fileSystem(
    path.join(process.cwd(), 'models', 'model.json'),
  );

  // Read the classification labels generated by Teachable Machine.
  const metadata = JSON.parse(
    fs.readFileSync(
      path.join(process.cwd(), 'models', 'metadata.json'),
      { encoding: 'utf-8' },
    ),
  );

  const model = await tf.loadLayersModel(modelUrl);

  // Attach the labels so predictions can be mapped back to class names.
  model.classes = metadata.labels;

  return model;
}
The process of loading the model is done with the tf.loadLayersModel function. This function takes one argument: the url of the model to be read. If our model were stored in cloud storage with a public URL, we could provide that URL directly. But because our model is local, we need to convert the local path into a URL; you can see this process in the initialization of the modelUrl variable.
Still in the same file, we need to create the transformImage function, which transforms the image from a Buffer (user input) into a Tensor (needed by the model as input).
// file: /src/ml.js
function transformImage(image) {
  return tf.node
    .decodeImage(image, 3) // decode the image Buffer into an RGB tensor
    .expandDims() // add a batch dimension
    .resizeNearestNeighbor([224, 224]) // resize to the model's input size
    .div(tf.scalar(127)) // scale the pixel values...
    .sub(tf.scalar(1)); // ...into roughly the [-1, 1] range
}
The image transformation process is done with the tf.node.decodeImage function. In addition, we need to perform several normalizations so that the model can process the input well and produce accurate predictions. One reference on normalization can be seen at: Data augmented.
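To see what the div and sub steps do numerically, here is the same arithmetic applied to single pixel values in plain JavaScript. Each channel value (0 to 255) is mapped into roughly the [-1, 1] range the model expects:

```javascript
// The same normalization as .div(tf.scalar(127)).sub(tf.scalar(1)),
// applied to individual pixel values for illustration.
const normalize = (pixel) => pixel / 127 - 1;

console.log(normalize(0));   // -1 (black)
console.log(normalize(127)); // 0  (mid gray)
console.log(normalize(254)); // 1  (near white)
```

Without this step, the model would receive raw 0-255 values, far outside the range it was trained on, and the predictions would suffer.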
Next, create the predict function, which is used to abstract the input prediction process. This function takes two arguments: model and image (in Tensor form).
// file: /src/ml.js
import { indexOfMaxNumber } from './utils.js';

// ... other code

async function predict(model, image) {
  const result = await model.predict(image).data();
  const index = indexOfMaxNumber(result);
  return model.classes[index];
}
The prediction process is done through the model using model.predict(). Then, to get the prediction data, use the data() function. It returns an array where each element is the match score of a class. Here, we get the index of the highest score using the indexOfMaxNumber() function and have predict return the classification label at that index.
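The indexOfMaxNumber() helper ships with the starter project and isn't shown in this article. A minimal version (my own sketch, assuming it lives in and is exported from /src/utils.js; the starter project's implementation may differ) could look like this:

```javascript
// file: /src/utils.js (sketch; the starter project's version may differ)
// Returns the index of the largest number in an array-like of numbers,
// e.g. the TypedArray returned by tensor.data().
function indexOfMaxNumber(numbers) {
  let maxIndex = 0;
  for (let i = 1; i < numbers.length; i += 1) {
    if (numbers[i] > numbers[maxIndex]) {
      maxIndex = i;
    }
  }
  return maxIndex;
}

console.log(indexOfMaxNumber([0.1, 0.85, 0.05])); // 1
```

So if the scores are [0.1, 0.85, 0.05] and the labels are ['Gogo', 'Lili', 'Apikat'], predict returns 'Lili'.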
Finally, in this file, we need to export the three functions we have created so that they can be used in other JavaScript files.
// file: /src/ml.js
// ... other code
export { transformImage, loadModel, predict };
The creation of functions related to Machine Learning is complete! 😌 Now, let’s use what we have created.
First, open the file /src/http.js and call the loadModel function at the top of the createServer function to load and get the ML model.
// file: /src/http.js
import { loadModel, predict, transformImage } from './ml.js';

async function createServer() {
  const model = await loadModel();
  // ... other code
}
Next, in the handler function, transform the image from the request, run predict, and return the result as an HTTP response.
// file: /src/http.js
import { loadModel, predict, transformImage } from './ml.js';

async function createServer() {
  // ... other code

  server.route({
    // ... other code
    handler: async (request) => {
      const { image } = request.payload;
      const processedImage = transformImage(image);
      const result = await predict(model, processedImage);
      return { result };
    },
    // ... other code
  });

  return server;
}
Everything looks good! Now, try testing by running npm test.
BOOM! Even though the dataset isn’t that big, the predictions are quite accurate. Remember, I didn’t use the same images as during training, so this test reflects production-like input. Teachable Machine is indeed cool! 👏🏻
Let’s mark all the targets as done! 😁
- (DONE) Must be able to create my own Machine Learning model. The Machine Learning model must learn the names of my child’s dolls.
- (DONE) Must be able to use the Machine Learning model in Node.js (I’m sure I can because there is ONNX runtime web or Tensorflow.js).
Final Words
Through this exploration, I not only achieved my initial target but also gained a broader understanding of the world of AI. Who knows, maybe in the future I will keep exploring and dive even deeper into this fascinating world. Learning is an endless journey, and we must be ready for the new challenges ahead. Let’s continue to learn and grow together with technology that is always changing and developing rapidly. 🚀