November 10 2020


The Challenge

Two decades ago, machine learning tasks involving images and videos were carried out mostly by PhD scientists and researchers at blue-chip companies, because they demanded heavy computational resources. Telling a dog from a cat was a major problem that many scientists worked on. Today the same problem is a beginner-level task, as image and video classification has matured rapidly over the last 10 years.

For humans, object detection is a trivial task. We learn from birth to tell apart animals, plants, people, and other objects, so identifying cats and dogs is simple. The same cannot be said for machines: they lack human intuition and self-learning ability, so they need many images to train on. The edge that learning models have over humans is scale: a machine can process many images at once and detect objects and patterns that an average person might miss.

Countless domains, from agriculture to space travel, use object detection to great effect. For many years, CCTV cameras were an eye without a brain: a user had to go through the footage frame by frame to extract details. Nowadays, thanks to Artificial Intelligence, CCTV footage can be analyzed almost instantly, and even in real time. Detecting spoilt crops, detecting obstacles for a self-driving car, and automatically recognizing employees on camera are some of the use cases where object detection has helped immensely.

Gaming is another massive industry, and in parallel with Artificial Intelligence it has evolved enormously over the past 10 to 15 years: from static keyboard controls, we have moved to modern joysticks with built-in gestures. The next evolution of gaming is motion-based control, which breaks down the barrier of needing a controller to control the characters in a game.

The whole idea of project Morpheus is to let gamers control first-person games through real-time physical body movements. The concept is not totally new: products like Microsoft Kinect and PlayStation Move had already implemented it. But these products required external devices such as specialized sensors and cameras, and came with a heavy price tag. In addition, products like Microsoft Kinect are designed to work only with specific game consoles such as Xbox.

After a thorough analysis of gamers’ psychology and the need for the next evolution in gameplay, our AI team took up the task of building a motion-based game control application for personal computers that uses no external devices or sensors except the computer’s built-in webcam. Although the solution sounds straightforward, the complexity peaked when we had to track users’ body movements with high precision and interface with the operating system to map the detected movements into virtual key presses.
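To make the mapping problem concrete, here is a minimal sketch of turning per-frame action labels into key press and release events. It is an illustration only, not the Morpheus implementation: the action names, the key bindings, the `ActionKeyMapper` class, and the `hold_frames` threshold are all assumptions, and the sketch stops at producing events rather than calling any operating system API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical action labels and key bindings, chosen only for illustration.
KEY_BINDINGS = {"punch": "j", "kick": "k", "jump": "space"}


@dataclass
class ActionKeyMapper:
    """Turns noisy per-frame action labels into key press/release events."""

    hold_frames: int = 3              # frames a label must persist before it counts
    _candidate: Optional[str] = None  # label currently being counted
    _streak: int = 0                  # consecutive frames showing the candidate
    _active: Optional[str] = None     # label that is currently in effect

    def update(self, action: Optional[str]) -> List[Tuple[str, str]]:
        """Consume one frame's label (None = nothing detected); return key events."""
        if action == self._candidate:
            self._streak += 1
        else:
            self._candidate = action
            self._streak = 1

        events: List[Tuple[str, str]] = []
        # Switch only once the new label has been stable for hold_frames frames,
        # so a single misclassified frame does not trigger a key press.
        if self._streak >= self.hold_frames and action != self._active:
            if self._active in KEY_BINDINGS:
                events.append(("release", KEY_BINDINGS[self._active]))
            if action in KEY_BINDINGS:
                events.append(("press", KEY_BINDINGS[action]))
            self._active = action
        return events


if __name__ == "__main__":
    mapper = ActionKeyMapper(hold_frames=2)
    for label in ["punch", "punch", None, None, "kick", "kick"]:
        print(label, mapper.update(label))
```

Requiring a label to persist for a few frames trades a little latency for stability, which matters when a detector flickers between classes. In a real application the returned events would be handed to a platform-specific input layer.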

The Solution

As with any AI project, the first barrier for project Morpheus was data. Very few labeled open-source datasets were available for tracking and detecting game control postures such as punches and kicks. So we at Rootcode AI took the initiative to create an action detection dataset internally by recording videos of our own Rootcoders performing the postures and movements; these video streams were then split into frames and labeled. The next step was to build the model, for which we used a combination of custom-trained object detection and pose tracking models. Finally, to interface with the OS and input key presses, our AI team developed several in-house tools using C and C++. We are also in the process of releasing a research paper on the dataset and the model we trained for this task.
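The frame-splitting step can be sketched as follows. Consecutive video frames are nearly identical, so one common approach is to keep only a few frames per second for labeling. This is an assumption about how such a step might look, not a description of the tooling used for the dataset; the function name `sample_frame_indices` and the frame rates in the example are hypothetical.

```python
from typing import List


def sample_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> List[int]:
    """Return the indices of the frames to keep when thinning a clip to target_fps."""
    if total_frames < 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("total_frames must be >= 0 and frame rates must be positive")

    # Never step by less than one frame: a clip cannot be upsampled by selection.
    step = max(source_fps / target_fps, 1.0)

    indices: List[int] = []
    i = 0
    while True:
        index = int(i * step)  # multiply rather than accumulate to avoid drift
        if index >= total_frames:
            break
        indices.append(index)
        i += 1
    return indices


if __name__ == "__main__":
    # One second of 30 fps video thinned to 5 frames for labeling.
    print(sample_frame_indices(total_frames=30, source_fps=30, target_fps=5))
```

The selected indices can then be used with any video reader to export the frames that go to the labeling stage.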


Although currently in the research stage, this application could revolutionize the future of gameplay as a whole, allowing gamers to control any first-person game through their physical movement without requiring an external peripheral. It would also lead gamers to be more active while playing instead of sitting idle. Our vision for the second phase of this project is to incorporate VR technology.