Normally, the smart devices in our homes depend on internet availability and third-party services to work correctly. Our data lives in the cloud, and we experience latency while our commands are sent off, processed, and the required action is finally performed.
This project tries to solve that problem by making the most of the microcontrollers we commonly use: it implements a system capable of processing voice commands in real time and controlling a desk lamp, all locally, without the need for the internet.
For this application, two different tools are used: Edge Impulse and Cyberon.
Hardware
To carry out this test, you need the following hardware components:
- RAK19001 Dual IO Base Board
- RAK4631 (nRF52840) Core Module or a RAK11200 WisBlock Core ESP32 Module
- RAK18000 PDM Stereo Microphone Module or RAK18031 WisBlock Audio PDM Microphone Module
- RAK13001 WisBlock Relay IO Module
- Conventional Desk Lamp
📝 NOTE: This is also compatible with the ESP32 and RP2040 cores.
Software
- Edge Impulse
- Visual Studio Code for PlatformIO
- Arduino IDE
WisBlock Assembly
To put together the different modules, you need to connect them to their respective slots on the base board.
Using Edge Impulse
Edge Impulse is an open and free Tiny Machine Learning development platform where you create your dataset, design your model, test it, and deploy it back to your microcontroller.
To start, you need to create a dataset. For this guide, 10-second sound recordings of a repeating keyword were captured with a PC headset and uploaded, then split into 1-second windows. The keywords used for the model are the following:
- Hey RAKstar
- Lights On
- Lights Off
The impulse was created to identify the voice by adding an audio (MFCC) processing block and a classification (Keras) learning block.
- Set the Window size to 1000 ms (the same length as our recordings).
- Set the Window increase to 100 ms.
- Set the Frequency to 16 kHz (model sampling frequency).
In the Audio Processing Block (MFCC), keep the default parameters and click Save parameters.
In the Neural Network learning block, set the following:
- Number of training cycles: 100
- Learning rate: 0.005
- Enable Data augmentation
After defining the Neural network architecture (the default one for audio recognition is used here), click Start training.
In this window, you can see the model's performance with the chosen training parameters. The results are quite decent. Keep in mind that it is not always good for a model to be 100% accurate; this could mean the model is overfitted and will perform badly on new data.
📝 NOTE: This Edge Impulse project is public, so you can clone and modify it.
Code
Select Arduino library for your deployment.
Depending on your model's complexity, resource requirements, and the development board you are using, consider enabling the EON Compiler and choosing between the quantized (int8) and unoptimized (float32) versions of the model.
Click Build and save the ZIP file with your trained model and unzip it.
Download the code from the GitHub repository.
Open the folder called Voice-lamp-Edge-Impulse with Visual Studio Code.
Drag and drop the library folder of your Edge Impulse model to the lib subfolder of the project.
📝 NOTE: Make sure the library included in the code is named correctly for your model. In this case, it is WisSound_inferencing.h because the Edge Impulse project is called WisSound.
Connect your WisBlock board to the computer through a USB cable.
Compile the code and flash your board.
Testing
To control the AC lamp, the relay module simply interrupts the line from the AC source to the lamp. This means the project can be repurposed to control any appliance within the power range of the relay.
After powering the board, the voice-controlled lamp is ready and running. I created a custom 3D-printed enclosure for this project, but you can buy yours in our store.
📝 NOTE: The WisBlock board will turn on the blue LED when it hears the trigger word “Hey RAKstar”, and will turn it off and turn on the green LED when it hears Lights On. The opposite will occur when it hears Lights Off.
Using Cyberon
Cyberon is a professional tool, and there is no need to create the model yourself because RAKwireless supplies one with better quality and robustness. Cyberon can also create custom models for you for a fee.
To test the project using Cyberon, you only need to flash the core with a simple Arduino code that includes the trained model. You also need a Cyberon-certified core, which you can find in the RAK store (Voice processing variant).
And the same hardware is used:
- A base board
- A microphone
- A relay module
Code
First, install the RAKwireless Audio Library.
Download the code from the GitHub repository.
Open the Voice-lamp-Cyberon folder with the Arduino IDE.
This example code considers the trigger word Hey RAKstar and the command words Lights On and Lights Off. Using these keywords is as easy as defining a simple logic in the code, as shown below:
if (nID == 2002) {         // the ID of the command corresponds to "Lights On"
  digitalWrite(LED_GREEN, HIGH);
  digitalWrite(RELAY_PIN, HIGH);
} else if (nID == 2003) {  // the ID of the command corresponds to "Lights Off"
  digitalWrite(LED_GREEN, LOW);
  digitalWrite(RELAY_PIN, LOW);
}
Connect your WisBlock board to your PC through USB, select the right board and COM port in the Arduino IDE, and click Upload.
Testing
There is no need to change anything in the wiring or hardware setup. After the code is uploaded, the project is ready to be tested, this time using Cyberon.
đź“ť NOTE: The WisBlock board will turn on the blue LED when hears the trigger word Hey RAKstar. It will keep it on and turn on the green LED when it hears Lights On. To be able to say the trigger word again, you need to wait for the blue LED to turn off after +-3 seconds. |
As you can see, everything works accurately, with no false triggers and without having to dig deep into machine learning algorithms.
Thus, it is a great option for creating stable and reliable voice-controlled applications.