
Clay AIR Convictions: Why We Believe in The Intersection of Computer Vision, Machine Learning and Hardware Agnosticism (1/2)


Thomas Amilien

Hand tracking and gesture recognition are among the most advanced forms of human-machine interface (HMI) software, made possible by breakthroughs in computer vision and machine learning. Simply put, it’s the software behind the camera lens that lets users interact with digital content naturally, control devices at a distance, and navigate displays without physical contact.

At Clay AIR, we prioritize three aspects of our solutions:

  • Computer Vision that leverages a device’s cameras
  • Machine Learning that fuels continuous advancement
  • Hardware Agnosticism that allows our solutions to be integrated into any interface with a camera

Computer Vision: leverage cameras as they become ubiquitous

The process of hand tracking and gesture recognition begins by using the camera lens as an input to capture a real-time view of the physical world. Images are processed using computer vision, which seeks out familiar objects that it has been trained to identify.

With cameras becoming ubiquitous in laptops, mobile devices, and cars, camera-based interaction through intuitive user interfaces is becoming increasingly commonplace. Hand, eye, body and facial tracking are key to HMI development and are easy to integrate alongside inputs like voice control.

Leveraging existing cameras offers an ideal form factor: no additional hardware is required, which makes integration seamless and low-cost, as we will discuss below.

Machine Learning: making interactions smart 

Machine learning interprets and optimizes inputs and adapts as the user’s session continues, enabling smoother interactions with devices like smart displays, holograms, robots and more. 

Machines can’t learn on their own. They are given large amounts of data from past results so that they can make more accurate decisions in the future. For example, identifying a gesture requires a machine learning model that has been trained on hundreds of thousands of images. The trained model is then deployed in hand tracking and gesture recognition software, where it picks out predefined features from the incoming real-time camera feed.
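To make the train-then-recognize idea concrete, here is a deliberately tiny sketch. It is not Clay AIR's model: it swaps in a simple nearest-centroid classifier, and the feature vectors (one "how extended is this finger" value per finger), the gesture labels and the function names are all hypothetical. A real system would learn from camera images rather than six hand-written samples.

```python
from math import dist

# Hypothetical training data: each sample is a 5-value feature vector
# (0.0 = finger curled, 1.0 = finger straight) paired with a gesture label.
# A production model would be trained on a far larger set of real images.
TRAINING_SAMPLES = [
    ([0.0, 0.1, 0.0, 0.1, 0.0], "fist"),
    ([0.1, 0.0, 0.1, 0.0, 0.1], "fist"),
    ([1.0, 0.9, 1.0, 0.9, 1.0], "open_palm"),
    ([0.9, 1.0, 0.9, 1.0, 0.9], "open_palm"),
    ([0.1, 1.0, 0.0, 0.1, 0.0], "point"),
    ([0.0, 0.9, 0.1, 0.0, 0.1], "point"),
]


def train(samples):
    """Training step: average the feature vectors seen for each label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(centroids, features):
    """Recognition step: pick the label whose average is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(TRAINING_SAMPLES)

# Features that would be extracted from a new, unseen camera frame.
print(classify(centroids, [0.05, 0.95, 0.05, 0.05, 0.05]))
```

The split between `train` and `classify` mirrors the text: learning happens ahead of time on past examples, and only the lightweight recognition step runs against the live feed.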

The input data from camera feeds arrives as pixels, and the more pixels a camera provides (that is, the higher its quality), the easier it is to identify objects. Thus, hardware selection plays a significant role in the effectiveness of machine learning-based solutions.
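A quick back-of-the-envelope calculation shows why pixel count matters. The scenario is an assumption made for illustration: a hand that spans 20% of the frame width and 30% of its height, viewed through two common resolutions. The helper name `hand_pixels` is hypothetical.

```python
def hand_pixels(frame_width, frame_height, width_fraction, height_fraction):
    """Approximate number of pixels covering a hand that spans the given
    fractions of the frame's width and height."""
    hand_width = round(frame_width * width_fraction)
    hand_height = round(frame_height * height_fraction)
    return hand_width * hand_height


# Assumed scenario: the hand fills 20% of the width and 30% of the height.
RESOLUTIONS = {"VGA": (640, 480), "Full HD": (1920, 1080)}

for name, (width, height) in RESOLUTIONS.items():
    print(name, hand_pixels(width, height, 0.20, 0.30))
```

Under these assumptions the same hand is described by 18,432 pixels at VGA and 124,416 pixels at Full HD, so the model has 6.75 times as much detail to work with at the higher resolution.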

Hardware Agnostic: compatibility and easy integration across devices  

Being hardware agnostic is key to Clay AIR’s products, because it lets them adapt to any use case. Wherever there is a camera, whether RGB, monochrome, ToF (time-of-flight), IR/NIR or fisheye, hand tracking and gesture recognition can be integrated seamlessly without modifying a device’s existing hardware. AR/VR headsets, laptops, tablets, smart TVs, robots, and smartphones are all compatible, regardless of the OS.
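One common way to keep tracking software independent of the camera type is to convert every frame into a single internal format before any recognition runs. The sketch below illustrates that general idea only and is not a description of Clay AIR's implementation. The `pixel_format` tags and the function name are hypothetical, frames are modeled as plain nested lists, and depth data from a ToF sensor would need its own conversion, which is not shown.

```python
def to_grayscale(frame, pixel_format):
    """Convert a frame (a list of pixel rows) to 8-bit grayscale values.

    pixel_format is a hypothetical tag:
      "rgb"  - each pixel is an (r, g, b) tuple of 0-255 values
      "mono" - each pixel is a single 0-255 intensity (monochrome or IR)
    """
    if pixel_format == "mono":
        # Already a single channel: copy the rows unchanged.
        return [list(row) for row in frame]
    if pixel_format == "rgb":
        # Weighted sum using the ITU-R BT.601 luma coefficients.
        return [
            [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in frame
        ]
    raise ValueError(f"unsupported pixel format: {pixel_format}")


# Two different sensors, one common format for the downstream tracker.
rgb_frame = [[(255, 255, 255), (255, 0, 0)], [(0, 0, 0), (0, 0, 255)]]
ir_frame = [[200, 10], [0, 90]]

print(to_grayscale(rgb_frame, "rgb"))
print(to_grayscale(ir_frame, "mono"))
```

Because everything after this step sees the same kind of frame, supporting a new sensor in this sketch means adding one conversion branch rather than changing the recognition code.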

For example, major car manufacturers are implementing hand tracking and gesture recognition on pre-existing cameras so that drivers can control their dashboard without having to look away from the road to find a button.

Hand tracking and gesture recognition can also be integrated into game consoles, manufacturing processes, kiosks, augmented reality and virtual reality headsets, and many more devices via the onboard camera, including simple webcams.

In such a diverse and constantly changing hardware ecosystem, it would be far too expensive to customize a solution for each design. As a software product, our solution is easy to integrate into your existing device, with no need for expensive structural changes.

As a result of putting hardware agnosticism, machine learning and computer vision at the core of our products, we are able to deliver agile solutions that can adapt to your situation and use cases.

If you would like to learn more about computer vision, machine learning and hardware agnosticism, and how they deliver hand tracking and gesture recognition, our team is here to answer your questions.