---
post_type: "project"
title: Intention Recognition
blurb: "A Suslib system that let people control software with gestures and the way they handled real objects."
chapter: "HCI Research · Suslib · 2018–2020"
org: suslib
year: [2019]
tags: ["AI", "computer-vision", "gesture recognition", "HCI"]
stack: ["OpenPose", "dlib", "YOLOv3", "LSTM", "TensorFlow", "TensorRT", "WebRTC", "Jetson Nano", "Raspberry Pi"]
link: https://suslib.com/core/intention-recognition
---

If Knowledge Recognition was about a collection understanding itself, Intention Recognition was about how a person reaches it. We wanted people to control a digital system with gestures and the way they handled objects — pick up a book, wave a hand, flip through pages, and the system responds. No keyboard, no special sensor.

Underneath, it used OpenPose for skeleton tracking, dlib for facial gestures, YOLOv3 for object detection, and LSTMs in TensorFlow to read sequences of movement. I designed the system and API; our ML engineer trained the models on over 100k labeled clips, and I helped curate the dataset. The part I cared about most was combining gestures with object handling: not just waving at a camera, but how you treat things in the real world. We added an observer mode so the system could notice a new gesture and adapt over time, and in demos people liked teaching it something on the spot. It ran on cheap hardware: ordinary webcams streamed over WebRTC, with models running on Jetson Nanos (optimized with TensorRT) and Raspberry Pis.
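To make the idea of combining gestures with object handling concrete, here is a minimal sketch in plain Python. It is not the code we shipped: the `Detection` type, the `INTENTS` table, the gesture labels, and the thresholds are all hypothetical. It assumes a skeleton tracker has already produced a wrist position, an object detector has produced labeled boxes, and a sequence model has produced a gesture label.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Detection:
    """One object-detector output (hypothetical shape, for illustration)."""
    label: str
    box: Box
    score: float


def held_object(wrist: Point, detections: List[Detection],
                min_score: float = 0.5, margin: float = 20.0) -> Optional[str]:
    """Return the label of the most confident detection near the wrist.

    A detection counts as "near" when the wrist lies inside its box
    after the box is expanded by `margin` pixels on every side.
    """
    x, y = wrist
    best: Optional[Detection] = None
    for det in detections:
        if det.score < min_score:
            continue  # ignore low-confidence detections
        x1, y1, x2, y2 = det.box
        inside = (x1 - margin <= x <= x2 + margin
                  and y1 - margin <= y <= y2 + margin)
        if inside and (best is None or det.score > best.score):
            best = det
    return best.label if best else None


# Hypothetical mapping from (gesture, held object) to an intent.
# A key with None as the object applies when no object-specific rule matches.
INTENTS = {
    ("flip", "book"): "next_page",
    ("lift", "book"): "open_item",
    ("wave", None): "wake",
}


def infer_intent(gesture: str, wrist: Point,
                 detections: List[Detection]) -> Optional[str]:
    """Combine a gesture label with the object in hand to pick an intent."""
    obj = held_object(wrist, detections)
    # Prefer the object-specific rule, then fall back to the gesture alone.
    return INTENTS.get((gesture, obj)) or INTENTS.get((gesture, None))


if __name__ == "__main__":
    scene = [Detection("book", (100, 100, 200, 200), 0.9)]
    print(infer_intent("flip", (150, 150), scene))
    print(infer_intent("wave", (500, 500), scene))
```

The point of the lookup is that the same gesture can mean different things depending on what is in your hand, which is why a flip with nothing nearby maps to no intent at all in this sketch.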

Intention Recognition was shown in libraries, exhibitions, and design festivals. It was early, hands-on work on a question I still carry — letting software meet people where they are, instead of asking them to adapt to it.

`with` [Martijn de Heer](https://suslib.com), [Homayoun Moradi](https://suslib.com)
