OBS Gestures: How to Use Your Webcam to Control Your Stream

Written on November 7, 2022

In this video, you will learn how to use gestures from your webcam to control OBS. You can switch between scenes, show or hide sources, and a whole lot more. Gesture control is a form of computer vision: your camera feed is analyzed to detect gestures, and each detected gesture sends a command to OBS. The setup is simple enough because my friends over at Roboflow have published open-source code you can use to configure your project.

  1. What is OBS and what can it be used for
  2. How to set up OBS for the first time
  3. How to use Gestures from your webcam to control OBS

What is OBS?

OBS, or Open Broadcaster Software, is powerful open-source software used by content creators all over the world for live streaming and recording video. OBS is free to download and use, and it has a wide range of features that make it a good fit for anyone from beginner content creators to professional broadcasters. OBS can be used to stream games, videos, or even your desktop.

OBS becomes even more flexible when you pair it with computer vision. OBS does not detect gestures on its own, but an external computer vision model can analyze your webcam’s video feed in real time and send commands to OBS based on what it sees. Those commands can switch scenes or reposition sources such as your webcam and overlay graphics, which helps make your live streams more professional and engaging for your viewers. Gesture control is a powerful technique that every content creator should learn how to use.

Getting OBS Set Up

Setting up OBS for the first time can be a little daunting, but with this guide, you’ll have it up and running in no time!

1. Download OBS. OBS is available for free on Windows, Mac, and Linux. You can find the download links on the OBS website.

2. Install OBS. After downloading OBS, just run the installer and follow the instructions.

3. Open OBS. Once OBS is installed, open it up by clicking on the OBS icon on your desktop or in your Start Menu.

4. Add your sources. The first thing you’ll want to do is add your sources. To do this, click the “+” button in the “Sources” dock at the bottom of the main window. Here, you can add any of the devices or software that you want to stream or record from. For example, you can add your webcam, your microphone, and games or videos that you want to stream.

5. Configure your settings. Next, you’ll want to configure your settings. Stream-wide options such as resolution and frame rate are under “Settings” > “Video”, and bitrate is under “Settings” > “Output”. Each source also has its own properties, which you can open by right-clicking the source and choosing “Properties”, so you can tune each one to get the best results for your stream or recording.

6. Start streaming! Once everything is configured correctly, you’re ready to start streaming! Just click on the “Start Streaming” button and OBS will start broadcasting live to your viewers.

Gesture Control of OBS

In this tutorial, we will use gestures from your webcam to control OBS. As described above, a computer vision model watches your camera feed for gestures and sends a command to OBS when it finds one. Roboflow has published open-source code you can use as the starting point for your project.

To get started, you will need:

  • OBS installed on your computer
  • A webcam connected to your computer

In the example code, which you can download from the links below, you will find some sample HTML and JavaScript code. This code is set up to take the virtual webcam output from OBS, run a computer vision model on it, and send commands to OBS over OBS WebSockets. If you are using OBS 28 or newer, OBS WebSockets is included by default. If you have OBS 27 or an older version, you will need to install OBS WebSockets separately.
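To make the "send commands to OBS" step concrete, here is a minimal sketch of what a scene-switch command looks like in the obs-websocket 5.x protocol that ships with OBS 28. The helper name, the request ID format, and the scene name "Scene 2" are assumptions for illustration; the Roboflow sample may use a client library that builds this message for you.

```javascript
// Sketch of an obs-websocket 5.x request message (the protocol bundled with OBS 28).
// buildSetSceneRequest and the "gesture-N" request IDs are hypothetical names.
let nextRequestId = 0;

function buildSetSceneRequest(sceneName) {
  nextRequestId += 1;
  return {
    op: 6, // opcode 6 means "Request" in obs-websocket 5.x
    d: {
      requestType: "SetCurrentProgramScene",
      requestId: `gesture-${nextRequestId}`, // any unique string; echoed back in the response
      requestData: { sceneName },
    },
  };
}

// The message travels as JSON text over a socket that has already completed
// the Hello/Identify handshake, for example:
//   socket.send(JSON.stringify(buildSetSceneRequest("Scene 2")));
const example = buildSetSceneRequest("Scene 2"); // "Scene 2" is an assumed scene name
console.log(JSON.stringify(example));
```

OBS 27 with the older 4.x plugin uses a different message format, so this sketch only applies to the 5.x protocol.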

The code provided for this by Roboflow will need some customization to run properly. Roboflow has provided a detailed tutorial video for this, so if I breeze over a detail you are unsure of, refer to the Roboflow video linked below.

Now the first thing you need to do is create a gesture control model. The model is trained on a dataset of labeled images, and the computer vision code uses it to decide when a gesture appears in your video. You can build the dataset from your own video or use data available in the Roboflow Universe. Cloning data from the Roboflow Universe saves time at the start, but ultimately you will get the best results by training the model with your own hands in your own environment.

To do this, record a video of yourself making the gestures you want to use. Then upload the video to Roboflow and annotate its frames to train the model. This is how computer vision works. Pretty cool, right?

Once you have annotated your images and published your dataset, you can get an API key to run the model. This API key and the name of your model are necessary to set up the gesture controls for OBS.

Now that you have your CV model and your API key, open up your JavaScript file and put those into the code. You will also need to enter your OBS WebSocket connection information, which in OBS 28 is found under “Tools” > “WebSocket Server Settings”.
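As a rough sketch of the values you are filling in, the settings could be grouped like this. Every name and value here is a placeholder I made up for illustration, and the Roboflow sample may organize its settings differently. The one value taken from OBS itself is port 4455, the default for the WebSocket server bundled with OBS 28.

```javascript
// Hypothetical settings object; the field names are not from the Roboflow sample.
const config = {
  roboflow: {
    apiKey: "YOUR_ROBOFLOW_API_KEY", // from your Roboflow account
    model: "YOUR_MODEL_NAME",        // the name of your published model
    version: 1,                      // assumed model version number
  },
  obs: {
    host: "localhost",
    port: 4455,   // default port of the WebSocket server bundled with OBS 28
    password: "", // leave empty only if authentication is disabled in OBS
  },
};

// True when a value is still empty or still an unedited "YOUR_..." placeholder.
function isPlaceholder(value) {
  return typeof value !== "string" || value === "" || value.startsWith("YOUR_");
}

// Returns the names of settings that still need to be filled in.
function findMissingSettings(cfg) {
  const missing = [];
  if (isPlaceholder(cfg.roboflow.apiKey)) missing.push("roboflow.apiKey");
  if (isPlaceholder(cfg.roboflow.model)) missing.push("roboflow.model");
  if (!Number.isInteger(cfg.obs.port)) missing.push("obs.port");
  return missing;
}

// Builds the address the browser connects to.
function obsSocketUrl(cfg) {
  return `ws://${cfg.obs.host}:${cfg.obs.port}`;
}

console.log(findMissingSettings(config), obsSocketUrl(config));
```

Checking for leftover placeholders before connecting is a small design choice that turns a confusing connection error into an obvious "you forgot to paste your key" message.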

The code is set up to switch between two scenes in OBS and to move a source around the scene. If you want to customize this code, you may want to read the Roboflow blog post, which digs into the details further.
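If you do customize the scene switching, the core idea can be sketched as a lookup from gesture to scene, plus a confidence threshold and a short cooldown so one held gesture does not flip scenes repeatedly. This is my own illustration, not the Roboflow implementation: the gesture names, scene names, threshold, and cooldown are all assumed values.

```javascript
// Hypothetical gesture class names and scene names; replace them with your own.
const GESTURE_TO_SCENE = new Map([
  ["thumbs-up", "Camera"],
  ["peace", "Screen Share"],
]);
const MIN_CONFIDENCE = 0.8; // assumed threshold; tune it for your model
const COOLDOWN_MS = 2000;   // assumed minimum time between scene switches

function createSceneSwitcher(initialScene) {
  let currentScene = initialScene;
  let lastSwitchAt = -Infinity;

  // Returns the scene to switch to, or null when nothing should change.
  return function decide(predictions, nowMs) {
    let best = null;
    for (const p of predictions) {
      if (p.confidence < MIN_CONFIDENCE) continue;    // too uncertain
      if (!GESTURE_TO_SCENE.has(p.class)) continue;   // not a mapped gesture
      if (best === null || p.confidence > best.confidence) best = p;
    }
    if (best === null) return null;

    const target = GESTURE_TO_SCENE.get(best.class);
    if (target === currentScene) return null;             // already on that scene
    if (nowMs - lastSwitchAt < COOLDOWN_MS) return null;   // still cooling down

    currentScene = target;
    lastSwitchAt = nowMs;
    return target;
  };
}

const decide = createSceneSwitcher("Camera");
console.log(decide([{ class: "peace", confidence: 0.9 }], Date.now()));
```

Each time the model returns predictions for a frame, you would call decide with them and send a scene-change command to OBS only when it returns a scene name.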

Once you have customized the code, you can open the index.html file in the Google Chrome web browser. This allows you to select the virtual webcam output from OBS as the video source. If everything is set up properly, you should see boxes around the gestures you make in the video, indicating that the computer vision model is recognizing them.
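For the curious, drawing those boxes can be sketched as follows. This assumes each prediction carries a center-based x and y plus a width and height in pixels, along with a class and a confidence, which is how I understand Roboflow reports boxes; check the predictions in your own console before relying on it. The function names are hypothetical.

```javascript
// Assumption: prediction.x and prediction.y are the CENTER of the box, in pixels.
// Canvas drawing wants the top-left corner, so shift by half the size.
function toCanvasRect(prediction) {
  return {
    left: prediction.x - prediction.width / 2,
    top: prediction.y - prediction.height / 2,
    width: prediction.width,
    height: prediction.height,
  };
}

// ctx is a CanvasRenderingContext2D from a canvas laid over the video element.
function drawPredictions(ctx, predictions) {
  for (const p of predictions) {
    const r = toCanvasRect(p);
    ctx.strokeRect(r.left, r.top, r.width, r.height);
    const label = `${p.class} ${Math.round(p.confidence * 100)}%`;
    ctx.fillText(label, r.left, r.top - 4); // label just above the box
  }
}

console.log(toCanvasRect({ x: 100, y: 80, width: 40, height: 20 }));
```

If your boxes appear shifted down and to the right of your hand, the usual cause is treating the center coordinates as a corner.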

If it is not working, you can open Chrome’s developer console and inspect the webpage to see whether the connections to both Roboflow and OBS are working.
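On the OBS side, the code the WebSocket closes with is often the quickest clue. The sketch below turns a few close codes into plain-language hints you can print to the console. The helper is hypothetical and the list is deliberately short: 1000 and 1006 are standard WebSocket codes, and 4009 is, to my understanding, the obs-websocket 5.x code for a failed authentication.

```javascript
// Hypothetical helper: maps a WebSocket close code to a troubleshooting hint.
function describeClose(code) {
  switch (code) {
    case 1000:
      return "Connection closed normally.";
    case 1006:
      return "Connection dropped or never opened. Check that OBS is running, " +
        "the WebSocket server is enabled, and the host and port are correct.";
    case 4009:
      return "OBS rejected the password. Compare it with the one shown in the " +
        "OBS WebSocket server settings.";
    default:
      return `Connection closed with code ${code}. Look it up in the obs-websocket protocol documentation.`;
  }
}

// In the browser you would attach it to your socket, for example:
//   socket.addEventListener("close", (event) => console.warn(describeClose(event.code)));
console.log(describeClose(1006));
```

For the Roboflow side, the Network tab of the same developer tools shows whether the model loaded and whether your API key was accepted.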


Gesture controls are ideal for video production with OBS because they let you control the software using simple gestures in front of your webcam. This eliminates the need to reach for a keyboard or mouse, which can be cumbersome and time-consuming when you are trying to change settings during a live stream or recording. Gesture controls also make it quick and easy to switch between different scenes and sources.
