The Desk Buddy Ecosystem

Every piece of the system communicates through a single MQTT broker. When you send a command from Desk Buddy Web, it travels as a JSON message through the broker to the ESP32 on your desk. Requests that need intelligence (object detection, motion planning, language) are routed to the Desk Buddy AI server.

%%{init: {
  'theme': 'base',
  'themeVariables': {
    'primaryColor': '#f0efed',
    'primaryTextColor': '#111110',
    'primaryBorderColor': '#b8b4ae',
    'lineColor': '#7a756e',
    'secondaryColor': '#f5f5f3',
    'tertiaryColor': '#f0efed',
    'edgeLabelBackground': '#ffffff',
    'clusterBkg': '#fafaf9',
    'clusterBorder': '#e4e2de',
    'titleColor': '#111110',
    'fontFamily': 'Share Tech Mono, monospace'
  }
}}%%
flowchart TB
    subgraph cloud["Cloud"]
        Web["Desk Buddy Web\nBrowser control panel\nFree & paid tiers"]
        Broker["MQTT Broker\nDedicated account per user\nReal-time message routing"]
    end

    subgraph device["Hardware & Intelligence"]
        ESP["ESP32-S3 Cam\nPhysical Desk Buddy\nServo control · Camera · Firmware"]
        AI["Desk Buddy AI Server\nCustom actuation models\nLLM · Vision · Object detection"]
    end

    Web <-->|"MQTT JSON\nCommands & status"| Broker
    Broker <-->|"Commands · Photos\nStatus updates"| ESP
    Broker <-->|"Intelligent requests\nrouted when needed"| AI

    classDef broker fill:#d4631a,stroke:#d4631a,color:#fff
    classDef node fill:#1c1c1c,stroke:#3d3a35,color:#f5f5f3
    classDef hub fill:#d4631a,stroke:#a34c12,color:#fff

    class Broker hub
    class Web,ESP,AI node
      

Controlled by JSON over MQTT

Every action is a JSON message published to your device's command topic. The firmware replies with in_progress then completed (or failed). Every message includes a unique action_id so you can correlate requests and responses.
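
The correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not the firmware's or the web app's actual logic: the `ActionTracker` class is hypothetical, and the reply shape (a JSON object carrying `action_id` and `status`) is an assumption based on the fields named in this section.

```python
import json

# Statuses after which no further reply is expected for an action.
TERMINAL = {"completed", "failed"}


class ActionTracker:
    """Tracks the latest known status of each command by its action_id."""

    def __init__(self):
        self.status = {}  # action_id -> latest status string

    def sent(self, command):
        # Register a command at the moment it is published.
        self.status[command["action_id"]] = "sent"

    def on_reply(self, payload):
        # payload: raw JSON text of a reply (assumed shape, see lead-in).
        reply = json.loads(payload)
        action_id = reply.get("action_id")
        if action_id not in self.status:
            return None  # a reply to a command this client did not send
        self.status[action_id] = reply.get("status", "unknown")
        return action_id

    def is_done(self, action_id):
        return self.status.get(action_id) in TERMINAL


tracker = ActionTracker()
tracker.sent({"action": "servo", "action_id": "20"})
tracker.on_reply('{"action_id": "20", "status": "in_progress"}')
tracker.on_reply('{"action_id": "20", "status": "completed"}')
print(tracker.is_done("20"))
```

In a real client, `on_reply` would be called from the MQTT message callback for whichever topic carries the firmware's replies.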

Topics

CMD   esp32_5/test        Send commands to the arm
HB    esp32_5/HEARTBEAT   Live servo angles & telemetry

Available Actions

  • baseRotate — spin the 360° continuous base
  • servo — move elbow, wrist, or twist
  • gripper — grab, drop, soft hold, or set angle
  • controlik — inverse kinematics by distance
  • perch — move to stored rest position
  • detect_object — vision: find objects by phrase
  • detect_color — vision: identify dominant colors
  • photo — capture image from onboard camera
  • calibrate — save hover & perch calibrations
  • ota_update — remote firmware update over Wi-Fi

Example — Move the Elbow

// command
{
  "action":    "servo",
  "action_id": "20",
  "servoName": "ELBOW",
  "position":  135,
  "speed":     10,
  "sender":    "ai_server"
}

Example — Base Rotate

// command
{
  "action":      "baseRotate",
  "action_id":   "69",
  "controlType": "MAGNET",
  "direction":   "RIGHT",
  "speed":       "slow",
  "value":       3,
  "sender":      "ai_server"
}
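
If you build commands in code, a small helper can catch mistakes before anything is published. The sketch below mirrors the elbow example and is only an illustration: `servo_command` is a hypothetical helper, the `WRIST` and `TWIST` names are guesses by analogy with `ELBOW`, the 0 to 180 bound comes from the joint range of the hobby servos, and using a UUID as `action_id` assumes the firmware accepts any unique string.

```python
import json
import uuid

# Only "ELBOW" appears in the examples; the other two names are assumptions.
SERVO_NAMES = {"ELBOW", "WRIST", "TWIST"}


def servo_command(servo_name, position, speed=10, sender="my_script"):
    """Build a servo command dict shaped like the elbow example."""
    if servo_name not in SERVO_NAMES:
        raise ValueError(f"unknown servo: {servo_name!r}")
    if not 0 <= position <= 180:
        raise ValueError("position must be within 0-180 degrees")
    return {
        "action": "servo",
        "action_id": uuid.uuid4().hex,  # unique per message
        "servoName": servo_name,
        "position": position,
        "speed": speed,
        "sender": sender,
    }


cmd = servo_command("ELBOW", 135)
print(json.dumps(cmd, indent=2))
```

The resulting dict can be passed through `json.dumps` and published to the command topic like any hand-written message.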

Hardware Details

Desk Buddy is built entirely from off-the-shelf components. No custom PCBs, no proprietary parts — everything can be sourced independently.

ESP32-S3 Cam

The brains of Desk Buddy. Handles Wi-Fi, MQTT, servo control, and onboard camera capture — all in one compact module.

3 Servo Joints

Standard hobby servos for elbow, wrist, and twist. Each moves 0–180° with configurable speed.

360° Continuous Base

A continuous rotation servo drives the base. Controlled by magnet count, encoder ticks, or by homing to one of 12 origin magnets.
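
As a rough worked example of magnet-count control: if the 12 magnets are evenly spaced around the base (an assumption, the spacing is not stated here), each magnet step corresponds to 30° of rotation. The helper names below are hypothetical.

```python
MAGNET_COUNT = 12  # from the text: 12 origin magnets
DEGREES_PER_MAGNET = 360 / MAGNET_COUNT  # 30.0 if evenly spaced (assumption)


def angle_for_magnets(count):
    """Rotation in degrees covered by `count` magnet steps."""
    return count * DEGREES_PER_MAGNET


def magnets_for_angle(angle_deg):
    """Nearest whole number of magnet steps for a requested rotation."""
    return round(angle_deg / DEGREES_PER_MAGNET)


print(angle_for_magnets(3))    # 90.0
print(magnets_for_angle(100))  # 3
```

Under that assumption, a magnet-count move of 3 would turn the base a quarter turn.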

Gripper

Servo-driven gripper with GRAB, DROP, and SOFTHOLD presets, plus raw angle positioning.

Onboard Camera

The S3 Cam module captures images on demand for object detection, color detection, and AI vision tasks.

Common Power & Frame

Frame, brackets, and power supply are all generic and available from standard hobby and electronics suppliers worldwide.

4 Degrees of Freedom

Base rotation + elbow + wrist + twist — plus a gripper end effector.


How MQTT Works

Instead of HTTP's request/response model, MQTT uses a broker as a central hub. Clients publish messages to a topic, and the broker pushes each message to every client subscribed to that topic, so nothing has to poll for updates. For Desk Buddy, the web app publishes a JSON command, the broker delivers it to your ESP32, and the ESP32 publishes its reply back, all in under a second.
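
To make the publish/subscribe idea concrete, here is a toy in-memory stand-in for a broker. It is not how Mosquitto is implemented: `TinyBroker` is a hypothetical class with exact-match topics and no networking, the `esp32` function only pretends to be the firmware, and the `replies` topic name is invented for the example.

```python
import json
from collections import defaultdict


class TinyBroker:
    """In-memory stand-in for a broker: exact-match topics, no network."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Push the message to every subscriber of this topic.
        for callback in self.subscribers[topic]:
            callback(topic, payload)


broker = TinyBroker()
replies = []


def esp32(topic, payload):
    # Pretend firmware: acknowledge the command on a (hypothetical) reply topic.
    command = json.loads(payload)
    broker.publish("replies", json.dumps(
        {"action_id": command["action_id"], "status": "completed"}))


broker.subscribe("esp32_5/test", esp32)
broker.subscribe("replies", lambda topic, payload: replies.append(json.loads(payload)))

broker.publish("esp32_5/test", json.dumps({"action": "perch", "action_id": "1"}))
print(replies)
```

The publisher never addresses the ESP32 directly. It only names a topic, and the broker decides who receives the message.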

Your Mosquitto Broker

Hosted for you

No broker to install. Just point your ESP32 at the host and use your credentials.

TLS encrypted

All traffic is encrypted in transit. Your commands and telemetry stay private.

Command flow

1. Publish JSON command

From the web app, workflow, or your own MQTT client.

2. Broker routes it

Mosquitto delivers the message to your ESP32 subscriber.

3. ESP32 executes & replies

The ESP32 publishes status: "in_progress", then status: "completed" (or "failed"), matched to the command via action_id.

Connect your own client

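# python (paho-mqtt 2.x; on 1.x use mqtt.Client() with no arguments)
import json

import paho.mqtt.client as mqtt

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.username_pw_set("your_username", "your_password")
# For a TLS listener, call client.tls_set() here and connect to the TLS port
# from your account details (commonly 8883) instead of 1883.
client.connect("your_broker_host", 1883)
client.loop_start()  # run the network loop in the background

# Publish one servo command to the command topic.
info = client.publish("esp32_5/test", json.dumps({
    "action":    "servo",
    "action_id": "my-001",
    "servoName": "ELBOW",
    "position":  90,
    "speed":     10,
    "sender":    "my_script"
}))
info.wait_for_publish()  # block until the message has been sent

client.disconnect()
client.loop_stop()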

Any MQTT client works — same host, same credentials:

  • MQTT Explorer — GUI desktop client, great for testing
  • Node-RED — visual flow editor, connect to other services
  • Home Assistant — add Desk Buddy as a smart home device
  • paho-mqtt — Python library for scripts and automation
  • MQTT.js — JavaScript/Node client for web integrations

The AI Server

Desk Buddy AI is our own server that brings intelligence to the arm. It combines models we built ourselves with best-in-class open-source AI.

Linear Actuation Models

Homegrown models trained specifically for Desk Buddy's geometry. Given a target position, they output the servo angles needed to get there — no generic IK solver required.

Visual Detection

Open-source vision models identify objects and colors in images captured by the onboard camera. Results feed back to the firmware as pick coordinates.

LLM Integration

An open-source LLM lets you describe tasks in plain language. The AI translates your instructions into a sequence of MQTT actions sent to the arm.
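
The last step of that translation, turning an ordered plan into publishable messages, might look roughly like the sketch below. It does not show how the AI server actually works: `plan_to_messages` is a hypothetical helper, the plan is hand-written rather than produced by an LLM, the `plan-001` style of `action_id` is an arbitrary choice, and sending `perch` with no extra fields is an assumption.

```python
import json

# Action names listed under "Available Actions".
AVAILABLE_ACTIONS = {
    "baseRotate", "servo", "gripper", "controlik", "perch",
    "detect_object", "detect_color", "photo", "calibrate", "ota_update",
}


def plan_to_messages(steps, sender="ai_server", prefix="plan"):
    """Turn an ordered list of step dicts into publishable JSON strings."""
    messages = []
    for index, step in enumerate(steps, start=1):
        if step.get("action") not in AVAILABLE_ACTIONS:
            raise ValueError(f"unsupported action: {step.get('action')!r}")
        # Stamp each step with a sequential action_id and the sender name.
        message = dict(step, action_id=f"{prefix}-{index:03d}", sender=sender)
        messages.append(json.dumps(message))
    return messages


# A hand-written plan standing in for LLM output.
plan = [
    {"action": "servo", "servoName": "ELBOW", "position": 90, "speed": 10},
    {"action": "perch"},
]
for message in plan_to_messages(plan):
    print(message)
```

Rejecting unknown action names before publishing keeps a malformed plan from reaching the arm. A fuller version would also wait for each step's completed status before sending the next.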

All via MQTT

The AI server communicates with the arm the same way everything else does — JSON over MQTT. No special integration required on the firmware side.