Initial commit: MediaPipe landmarks demo
HTML demos for face, hand, gesture, and posture tracking using MediaPipe. Includes Python CLI tools for processing video files. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
.gitignore (vendored, new file, +3 lines)
@@ -0,0 +1,3 @@
.venv/
__pycache__/
*.pyc
Training_handshape_B.md (new file, +179 lines)
@@ -0,0 +1,179 @@
Let’s add a custom gesture for the **ASL letter “B”** (flat hand, fingers together, thumb folded across the palm) using MediaPipe **Gesture Recognizer (Model Maker)**.

# Plan (what you’ll build)

* A custom model with a new class label, e.g. `ASL_B`, plus the required `none` class.
* A small, labeled image dataset (Model Maker will extract hand landmarks for you).
* A trained `.task` file you can drop into your Python/JS app and allowlist.

---

# 1) Pick labels

Use:

* `ASL_B` ← your new gesture
* `none` ← anything that’s not one of your target gestures (mandatory)

Folder layout:

```
dataset/
  ASL_B/
    ...images...
  none/
    ...images...
```
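
If you prefer to scaffold that layout from Python rather than by hand, here is a minimal sketch (folder names match the layout above):

```python
from pathlib import Path

# Create the dataset skeleton used throughout this guide
for label in ("ASL_B", "none"):
    Path("dataset", label).mkdir(parents=True, exist_ok=True)
```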

---

# 2) Collect the right data (what to capture)

Target handshape for **B**:

* **Fingers**: index–pinky fully extended and **pressed together**
* **Thumb**: folded across palm (not sticking out to the side)
* **Palm**: facing camera (front) and also a few angles

Suggested minimums (per label):

| Bucket | Shots |
| ----- | ----- |
| Distances: close (\~40–60 cm), medium (\~80–120 cm) | 80 |
| View angles: front, \~30°, \~60° yaw | 80 |
| Rotations: slight roll/tilt | 40 |
| Lighting: bright, dim, backlit | 40 |
| Backgrounds: plain wall, cluttered office/outdoor | 40 |
| Hands: left & right (both) | included across all |
| Skin tones / several people | as many as practical |

Do **at least \~300–500** `ASL_B` images to start.
For **`none`**, include: open palm (“High-Five”), slightly spread fingers, thumbs-up, fist, pointing, random objects/background frames, other ASL letters—especially **Open\_Palm** look-alikes so the model learns “not B”.

Quick ways to get images:

* Record short clips on laptop/phone and extract frames (e.g., 2 fps).
* Ask 3–5 colleagues to contribute a short 10–20s clip each.

Frame extraction example:

```bash
# Extract 2 frames/sec from a video into dataset/ASL_B/
ffmpeg -i b_sign.mov -vf fps=2 dataset/ASL_B/b_%05d.jpg

# Do the same for negatives into dataset/none/
```
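
Before training, it is worth confirming each label folder actually holds enough usable frames. A minimal counting sketch (assumes the `dataset/` layout above; the extension list is an assumption):

```python
from pathlib import Path

# Count candidate images per label folder. Treat these as an upper bound:
# Model Maker will still drop frames in which it cannot detect a hand.
DATA_DIR = Path("dataset")
EXTENSIONS = {".jpg", ".jpeg", ".png"}

for label_dir in sorted(p for p in DATA_DIR.iterdir() if p.is_dir()):
    count = sum(1 for f in label_dir.rglob("*") if f.suffix.lower() in EXTENSIONS)
    print(f"{label_dir.name}: {count} images")
```

Aim for the per-bucket minimums in the table above before moving on.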

---

# 3) Train with Model Maker (Python)

Create and activate a venv, then:

```bash
pip install --upgrade pip
pip install mediapipe-model-maker
```

Training script (save as `train_asl_b.py` and run it):

```python
from mediapipe_model_maker import gesture_recognizer as gr

DATA_DIR = "dataset"
EXPORT_DIR = "exported_model"

# Load & auto-preprocess (runs hand detection, keeps images with a detected hand)
data = gr.Dataset.from_folder(
    dirname=DATA_DIR,
    hparams=gr.HandDataPreprocessingParams(  # you can tweak these if needed
        min_detection_confidence=0.5
    )
)

# Split
train_data, rest = data.split(0.8)
validation_data, test_data = rest.split(0.5)

# Hyperparameters (start small; bump epochs if needed)
hparams = gr.HParams(
    export_dir=EXPORT_DIR,
    epochs=12,
    batch_size=16,
    learning_rate=0.001,
)

# Optional model head size & dropout
options = gr.GestureRecognizerOptions(
    hparams=hparams,
    model_options=gr.ModelOptions(layer_widths=[128, 64], dropout_rate=0.1)
)

model = gr.GestureRecognizer.create(
    train_data=train_data,
    validation_data=validation_data,
    options=options
)

# Evaluate
loss, acc = model.evaluate(test_data, batch_size=1)
print(f"Test loss={loss:.4f}, acc={acc:.4f}")

# Export .task
model.export_model()  # writes exported_model/gesture_recognizer.task
print("Exported:", EXPORT_DIR + "/gesture_recognizer.task")
```

Tips:

* If many `ASL_B` images get dropped at load time (no hand detected), back up the camera a little or ensure the whole hand is visible.
* If `none` is weak, add more “near-miss” negatives: open palm with fingers slightly apart, thumb slightly out, partial occlusions.

---

# 4) Plug it into your app

**Python (Tasks API example):** note that `LIVE_STREAM` mode requires a `result_callback`; results are delivered asynchronously through that callback.

```python
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
GestureRecognizer = mp.tasks.vision.GestureRecognizer
GestureRecognizerOptions = mp.tasks.vision.GestureRecognizerOptions
GestureRecognizerResult = mp.tasks.vision.GestureRecognizerResult
VisionRunningMode = mp.tasks.vision.RunningMode
ClassifierOptions = mp.tasks.components.processors.ClassifierOptions

def on_result(result: GestureRecognizerResult, image: mp.Image, timestamp_ms: int):
    # Called asynchronously for each frame submitted with recognize_async()
    if result.gestures:
        top = result.gestures[0][0]
        print(f"{top.category_name}: {top.score:.2f}")

options = GestureRecognizerOptions(
    base_options=BaseOptions(model_asset_path="exported_model/gesture_recognizer.task"),
    running_mode=VisionRunningMode.LIVE_STREAM,
    result_callback=on_result,        # required in LIVE_STREAM mode
    custom_gesture_classifier_options=ClassifierOptions(
        score_threshold=0.6,          # tighten until false positives drop
        category_allowlist=["ASL_B"]  # only report your class
    ),
)
recognizer = GestureRecognizer.create_from_options(options)
```
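
To drive the live-stream recognizer from a webcam, wrap each frame as an `mp.Image` and push it with `recognize_async()` together with a monotonically increasing timestamp; results arrive in the `result_callback` configured above. A minimal OpenCV sketch (assumes the `recognizer` object from the snippet above and a default camera at index 0):

```python
import time
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)  # assumption: default webcam at index 0
while cap.isOpened():
    ok, frame_bgr = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV delivers BGR
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
    # Timestamps must increase monotonically in LIVE_STREAM mode
    recognizer.recognize_async(mp_image, int(time.monotonic() * 1000))
    cv2.imshow("webcam", frame_bgr)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```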

**Web (JS):** the web Tasks API uses `runningMode: "VIDEO"` for webcam streams (there is no `LIVE_STREAM` value in JS); call `recognizeForVideo()` once per frame.

```js
const recognizer = await GestureRecognizer.createFromOptions(fileset, {
  baseOptions: { modelAssetPath: "exported_model/gesture_recognizer.task" },
  runningMode: "VIDEO",
  customGesturesClassifierOptions: {
    scoreThreshold: 0.6,
    categoryAllowlist: ["ASL_B"]
  }
});
```

---

# 5) Troubleshooting & tuning

* **False positives with Open Palm:** Add more `none` examples where fingers are together but **thumb is visible** to the side. The model needs to see “almost B but not B.”
* **Left vs right hand:** Include both in training. If you only trained on right hands, left hands may underperform.
* **Distance issues:** If far-away hands fail, capture more medium/far shots. Landmarks get noisier when the hand is small in frame.
* **Thresholds:** Raise `score_threshold` to reduce spurious detections; lower it if you miss true B’s.
* **Good test accuracy, shaky live results:** Collect more data from the exact camera, lighting, and distance you’ll actually use; the evaluation sketch after this list helps quantify the gap.
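
To put numbers on those symptoms before going live, run the exported model over a held-out image folder in `IMAGE` mode and count how often `ASL_B` fires. A minimal evaluation sketch (the `holdout/ASL_B/` and `holdout/none/` folders are assumptions mirroring the training layout):

```python
from pathlib import Path
import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
GestureRecognizer = mp.tasks.vision.GestureRecognizer
GestureRecognizerOptions = mp.tasks.vision.GestureRecognizerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

options = GestureRecognizerOptions(
    base_options=BaseOptions(model_asset_path="exported_model/gesture_recognizer.task"),
    running_mode=VisionRunningMode.IMAGE,
)

with GestureRecognizer.create_from_options(options) as recognizer:
    for label_dir in ("ASL_B", "none"):  # assumed held-out layout
        hits = total = 0
        for img_path in Path("holdout", label_dir).glob("*.jpg"):
            result = recognizer.recognize(mp.Image.create_from_file(str(img_path)))
            total += 1
            if result.gestures and result.gestures[0][0].category_name == "ASL_B":
                hits += 1
        print(f"{label_dir}: ASL_B predicted on {hits}/{total} images")
```

On `none` that ratio is your false-positive rate; on `ASL_B` it approximates recall, which tells you which way to move `score_threshold`.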

---
face.html (new file, +435 lines)
@@ -0,0 +1,435 @@
|
|||||||
|
<!-- face.html • Single-file MediaPipe Face Landmarker demo -->
|
||||||
|
<!-- Copyright 2023 The MediaPipe Authors.
|
||||||
|
Licensed under the Apache License, Version 2.0 -->
|
||||||
|
<!doctype html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
<meta http-equiv="Cache-control" content="no-cache, no-store, must-revalidate" />
|
||||||
|
<meta http-equiv="Pragma" content="no-cache" />
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no" />
|
||||||
|
<title>Face Landmarker</title>
|
||||||
|
|
||||||
|
<!-- Material Components (styles only for the raised button) -->
|
||||||
|
<link href="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.css" rel="stylesheet" />
|
||||||
|
<script src="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.js"></script>
|
||||||
|
|
||||||
|
<style>
|
||||||
|
/* Inlined CSS from your snippet (with minor cleanups) */
|
||||||
|
|
||||||
|
body {
|
||||||
|
font-family: helvetica, arial, sans-serif;
|
||||||
|
margin: 2em;
|
||||||
|
color: #3d3d3d;
|
||||||
|
--mdc-theme-primary: #007f8b;
|
||||||
|
--mdc-theme-on-primary: #f1f3f4;
|
||||||
|
}
|
||||||
|
|
||||||
|
h1 {
|
||||||
|
font-style: italic;
|
||||||
|
color: #007f8b;
|
||||||
|
}
|
||||||
|
|
||||||
|
h2 {
|
||||||
|
clear: both;
|
||||||
|
}
|
||||||
|
|
||||||
|
em { font-weight: bold; }
|
||||||
|
|
||||||
|
video {
|
||||||
|
clear: both;
|
||||||
|
display: block;
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
section {
|
||||||
|
opacity: 1;
|
||||||
|
transition: opacity 500ms ease-in-out;
|
||||||
|
}
|
||||||
|
|
||||||
|
.removed { display: none; }
|
||||||
|
.invisible { opacity: 0.2; }
|
||||||
|
|
||||||
|
.note {
|
||||||
|
font-style: italic;
|
||||||
|
font-size: 130%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView,
|
||||||
|
.detectOnClick,
|
||||||
|
.blend-shapes {
|
||||||
|
position: relative;
|
||||||
|
float: left;
|
||||||
|
width: 48%;
|
||||||
|
margin: 2% 1%;
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView p,
|
||||||
|
.detectOnClick p {
|
||||||
|
position: absolute;
|
||||||
|
padding: 5px;
|
||||||
|
background-color: #007f8b;
|
||||||
|
color: #fff;
|
||||||
|
border: 1px dashed rgba(255, 255, 255, 0.7);
|
||||||
|
z-index: 2;
|
||||||
|
font-size: 12px;
|
||||||
|
margin: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.highlighter {
|
||||||
|
background: rgba(0, 255, 0, 0.25);
|
||||||
|
border: 1px dashed #fff;
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
}
|
||||||
|
|
||||||
|
.canvas {
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
pointer-events: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.output_canvas {
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
.detectOnClick { z-index: 0; }
|
||||||
|
.detectOnClick img { width: 100%; }
|
||||||
|
|
||||||
|
.blend-shapes-item {
|
||||||
|
display: flex;
|
||||||
|
align-items: center;
|
||||||
|
height: 20px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.blend-shapes-label {
|
||||||
|
display: flex;
|
||||||
|
width: 120px;
|
||||||
|
justify-content: flex-end;
|
||||||
|
align-items: center;
|
||||||
|
margin-right: 4px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.blend-shapes-value {
|
||||||
|
display: flex;
|
||||||
|
height: 16px;
|
||||||
|
align-items: center;
|
||||||
|
background-color: #007f8b;
|
||||||
|
color: #fff;
|
||||||
|
padding: 0 6px;
|
||||||
|
border-radius: 2px;
|
||||||
|
white-space: nowrap;
|
||||||
|
overflow: hidden;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Ensure video/canvas overlap correctly inside the container */
|
||||||
|
#liveView > div {
|
||||||
|
position: relative;
|
||||||
|
display: inline-block;
|
||||||
|
}
|
||||||
|
#webcam {
|
||||||
|
position: absolute; left: 0; top: 0;
|
||||||
|
}
|
||||||
|
#output_canvas {
|
||||||
|
position: absolute; left: 0; top: 0;
|
||||||
|
}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<h1>Face landmark detection using the MediaPipe FaceLandmarker task</h1>
|
||||||
|
|
||||||
|
<section id="demos" class="invisible">
|
||||||
|
<h2>Demo: Webcam continuous face landmarks detection</h2>
|
||||||
|
<p>
|
||||||
|
Hold your face in front of your webcam to get real-time face landmarker detection.<br />
|
||||||
|
Click <b>enable webcam</b> below and grant access to the webcam if prompted.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<div id="liveView" class="videoView">
|
||||||
|
<button id="webcamButton" class="mdc-button mdc-button--raised">
|
||||||
|
<span class="mdc-button__ripple"></span>
|
||||||
|
<span class="mdc-button__label">ENABLE WEBCAM</span>
|
||||||
|
</button>
|
||||||
|
<div>
|
||||||
|
<video id="webcam" autoplay playsinline></video>
|
||||||
|
<canvas class="output_canvas" id="output_canvas"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="blend-shapes">
|
||||||
|
<ul class="blend-shapes-list" id="video-blend-shapes"></ul>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
<script type="module">
|
||||||
|
// Inlined JS (converted to plain JS; removed TS types)
|
||||||
|
|
||||||
|
import vision from "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.3";
|
||||||
|
const { FaceLandmarker, FilesetResolver, DrawingUtils } = vision;
|
||||||
|
|
||||||
|
const demosSection = document.getElementById("demos");
|
||||||
|
const imageBlendShapes = document.getElementById("image-blend-shapes");
|
||||||
|
const videoBlendShapes = document.getElementById("video-blend-shapes");
|
||||||
|
|
||||||
|
let faceLandmarker;
|
||||||
|
let runningMode = "IMAGE"; // "IMAGE" | "VIDEO"
|
||||||
|
let enableWebcamButton;
|
||||||
|
let webcamRunning = false;
|
||||||
|
const videoWidth = 480;
|
||||||
|
|
||||||
|
async function createFaceLandmarker() {
|
||||||
|
const filesetResolver = await FilesetResolver.forVisionTasks(
|
||||||
|
"https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.3/wasm"
|
||||||
|
);
|
||||||
|
faceLandmarker = await FaceLandmarker.createFromOptions(filesetResolver, {
|
||||||
|
baseOptions: {
|
||||||
|
modelAssetPath:
|
||||||
|
"https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task",
|
||||||
|
delegate: "GPU",
|
||||||
|
},
|
||||||
|
outputFaceBlendshapes: true,
|
||||||
|
runningMode,
|
||||||
|
numFaces: 1,
|
||||||
|
});
|
||||||
|
demosSection.classList.remove("invisible");
|
||||||
|
}
|
||||||
|
createFaceLandmarker();
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 1: Click image to detect landmarks
|
||||||
|
********************************************************************/
|
||||||
|
const imageContainers = document.getElementsByClassName("detectOnClick");
|
||||||
|
for (let imageContainer of imageContainers) {
|
||||||
|
imageContainer.children[0].addEventListener("click", handleClick);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function handleClick(event) {
|
||||||
|
if (!faceLandmarker) {
|
||||||
|
console.log("Wait for faceLandmarker to load before clicking!");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (runningMode === "VIDEO") {
|
||||||
|
runningMode = "IMAGE";
|
||||||
|
await faceLandmarker.setOptions({ runningMode });
|
||||||
|
}
|
||||||
|
|
||||||
|
const parent = event.target.parentNode;
|
||||||
|
const allCanvas = parent.getElementsByClassName("canvas");
|
||||||
|
for (let i = allCanvas.length - 1; i >= 0; i--) {
|
||||||
|
const n = allCanvas[i];
|
||||||
|
n.parentNode.removeChild(n);
|
||||||
|
}
|
||||||
|
|
||||||
|
const faceLandmarkerResult = faceLandmarker.detect(event.target);
|
||||||
|
|
||||||
|
const canvas = document.createElement("canvas");
|
||||||
|
canvas.setAttribute("class", "canvas");
|
||||||
|
canvas.setAttribute("width", event.target.naturalWidth + "px");
|
||||||
|
canvas.setAttribute("height", event.target.naturalHeight + "px");
|
||||||
|
canvas.style.left = "0px";
|
||||||
|
canvas.style.top = "0px";
|
||||||
|
canvas.style.width = `${event.target.width}px`;
|
||||||
|
canvas.style.height = `${event.target.height}px`;
|
||||||
|
|
||||||
|
parent.appendChild(canvas);
|
||||||
|
const ctx = canvas.getContext("2d");
|
||||||
|
const drawingUtils = new DrawingUtils(ctx);
|
||||||
|
|
||||||
|
for (const landmarks of faceLandmarkerResult.faceLandmarks) {
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_TESSELATION,
|
||||||
|
{ color: "#C0C0C070", lineWidth: 1 }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_EYE,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_EYEBROW,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_EYE,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_EYEBROW,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_FACE_OVAL,
|
||||||
|
{ color: "#E0E0E0" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LIPS,
|
||||||
|
{ color: "#E0E0E0" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_IRIS,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_IRIS,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
}
|
||||||
|
drawBlendShapes(imageBlendShapes, faceLandmarkerResult.faceBlendshapes);
|
||||||
|
}
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 2: Webcam stream detection
|
||||||
|
********************************************************************/
|
||||||
|
const video = document.getElementById("webcam");
|
||||||
|
const canvasElement = document.getElementById("output_canvas");
|
||||||
|
const canvasCtx = canvasElement.getContext("2d");
|
||||||
|
|
||||||
|
function hasGetUserMedia() {
|
||||||
|
return !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (hasGetUserMedia()) {
|
||||||
|
enableWebcamButton = document.getElementById("webcamButton");
|
||||||
|
enableWebcamButton.addEventListener("click", enableCam);
|
||||||
|
} else {
|
||||||
|
console.warn("getUserMedia() is not supported by your browser");
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableCam() {
|
||||||
|
if (!faceLandmarker) {
|
||||||
|
console.log("Wait! faceLandmarker not loaded yet.");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
webcamRunning = !webcamRunning;
|
||||||
|
enableWebcamButton.innerText = webcamRunning
|
||||||
|
? "DISABLE PREDICTIONS"
|
||||||
|
: "ENABLE PREDICTIONS";
|
||||||
|
|
||||||
|
const constraints = { video: true };
|
||||||
|
|
||||||
|
navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
|
||||||
|
video.srcObject = stream;
|
||||||
|
video.addEventListener("loadeddata", predictWebcam);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
let lastVideoTime = -1;
|
||||||
|
let results;
|
||||||
|
const drawingUtils = new DrawingUtils(canvasCtx);
|
||||||
|
|
||||||
|
async function predictWebcam() {
|
||||||
|
const ratio = video.videoHeight / video.videoWidth;
|
||||||
|
video.style.width = videoWidth + "px";
|
||||||
|
video.style.height = videoWidth * ratio + "px";
|
||||||
|
canvasElement.style.width = videoWidth + "px";
|
||||||
|
canvasElement.style.height = videoWidth * ratio + "px";
|
||||||
|
canvasElement.width = video.videoWidth;
|
||||||
|
canvasElement.height = video.videoHeight;
|
||||||
|
|
||||||
|
if (runningMode === "IMAGE") {
|
||||||
|
runningMode = "VIDEO";
|
||||||
|
await faceLandmarker.setOptions({ runningMode });
|
||||||
|
}
|
||||||
|
|
||||||
|
const startTimeMs = performance.now();
|
||||||
|
if (lastVideoTime !== video.currentTime) {
|
||||||
|
lastVideoTime = video.currentTime;
|
||||||
|
results = faceLandmarker.detectForVideo(video, startTimeMs);
|
||||||
|
}
|
||||||
|
|
||||||
|
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
|
||||||
|
if (results && results.faceLandmarks) {
|
||||||
|
for (const landmarks of results.faceLandmarks) {
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_TESSELATION,
|
||||||
|
{ color: "#C0C0C070", lineWidth: 1 }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_EYE,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_EYEBROW,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_EYE,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_EYEBROW,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_FACE_OVAL,
|
||||||
|
{ color: "#E0E0E0" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LIPS,
|
||||||
|
{ color: "#E0E0E0" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_RIGHT_IRIS,
|
||||||
|
{ color: "#FF3030" }
|
||||||
|
);
|
||||||
|
drawingUtils.drawConnectors(
|
||||||
|
landmarks,
|
||||||
|
FaceLandmarker.FACE_LANDMARKS_LEFT_IRIS,
|
||||||
|
{ color: "#30FF30" }
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
drawBlendShapes(videoBlendShapes, (results && results.faceBlendshapes) || []);
|
||||||
|
|
||||||
|
if (webcamRunning === true) {
|
||||||
|
window.requestAnimationFrame(predictWebcam);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function drawBlendShapes(el, blendShapes) {
|
||||||
|
if (!blendShapes || !blendShapes.length) {
|
||||||
|
el.innerHTML = "";
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
let htmlMaker = "";
|
||||||
|
blendShapes[0].categories.forEach((shape) => {
|
||||||
|
const label = shape.displayName || shape.categoryName;
|
||||||
|
const pct = Math.max(0, Math.min(1, Number(shape.score) || 0));
|
||||||
|
htmlMaker += `
|
||||||
|
<li class="blend-shapes-item">
|
||||||
|
<span class="blend-shapes-label">${label}</span>
|
||||||
|
<span class="blend-shapes-value" style="width: calc(${pct * 100}% - 120px)">${pct.toFixed(4)}</span>
|
||||||
|
</li>
|
||||||
|
`;
|
||||||
|
});
|
||||||
|
|
||||||
|
el.innerHTML = htmlMaker;
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
face_landmarker.task (new binary file, not shown)
fingers_positions.sh (new executable file, +1 line)
@@ -0,0 +1 @@
python hand_landmarker_cli.py --image hand.png --model hand_landmarker.task --out annotated.png
gesture.html (new file, +290 lines)
@@ -0,0 +1,290 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||||
|
<title>MediaPipe Hand Gesture Recognizer — Single File Demo</title>
|
||||||
|
<!-- Material Components (for button styling) -->
|
||||||
|
<link href="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.css" rel="stylesheet" />
|
||||||
|
<script src="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.js"></script>
|
||||||
|
|
||||||
|
<style>
|
||||||
|
/* Inlined from the CodePen CSS (Sass directives removed) */
|
||||||
|
body {
|
||||||
|
font-family: Roboto, system-ui, -apple-system, Segoe UI, Helvetica, Arial, sans-serif;
|
||||||
|
margin: 2em;
|
||||||
|
color: #3d3d3d;
|
||||||
|
--mdc-theme-primary: #007f8b;
|
||||||
|
--mdc-theme-on-primary: #f1f3f4;
|
||||||
|
}
|
||||||
|
|
||||||
|
h1 { color: #007f8b; }
|
||||||
|
h2 { clear: both; }
|
||||||
|
|
||||||
|
video {
|
||||||
|
clear: both;
|
||||||
|
display: block;
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
height: 280px;
|
||||||
|
}
|
||||||
|
|
||||||
|
section { opacity: 1; transition: opacity 500ms ease-in-out; }
|
||||||
|
.removed { display: none; }
|
||||||
|
.invisible { opacity: 0.2; }
|
||||||
|
|
||||||
|
.detectOnClick {
|
||||||
|
position: relative;
|
||||||
|
float: left;
|
||||||
|
width: 48%;
|
||||||
|
margin: 2% 1%;
|
||||||
|
cursor: pointer;
|
||||||
|
z-index: 0;
|
||||||
|
font-size: calc(8px + 1.2vw);
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView {
|
||||||
|
position: absolute;
|
||||||
|
float: left;
|
||||||
|
width: 48%;
|
||||||
|
margin: 2% 1%;
|
||||||
|
cursor: pointer;
|
||||||
|
min-height: 500px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView p,
|
||||||
|
.detectOnClick p {
|
||||||
|
padding-top: 5px;
|
||||||
|
padding-bottom: 5px;
|
||||||
|
background-color: #007f8b;
|
||||||
|
color: #fff;
|
||||||
|
border: 1px dashed rgba(255, 255, 255, 0.7);
|
||||||
|
z-index: 2;
|
||||||
|
margin: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.highlighter { background: rgba(0, 255, 0, 0.25); border: 1px dashed #fff; z-index: 1; position: absolute; }
|
||||||
|
.canvas { z-index: 1; position: absolute; pointer-events: none; }
|
||||||
|
|
||||||
|
.output_canvas {
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
.detectOnClick img { width: 45vw; }
|
||||||
|
|
||||||
|
.output { display: none; width: 100%; font-size: calc(8px + 1.2vw); }
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<section id="demos" class="invisible">
|
||||||
|
<h2><br>Demo: Webcam continuous hand gesture detection</h2>
|
||||||
|
<p>Use your hand to make gestures in front of the camera to get gesture classification. <br />Click <b>enable webcam</b> below and grant access to the webcam if prompted.</p>
|
||||||
|
<PRE>
|
||||||
|
Gesture Label Description
|
||||||
|
Closed_Fist Hand fully closed into a fist
|
||||||
|
Open_Palm Flat open hand
|
||||||
|
Pointing_Up Index finger extended upward, others closed
|
||||||
|
Thumb_Down Thumb extended downward
|
||||||
|
Thumb_Up Thumb extended upward
|
||||||
|
Victory Index and middle finger extended in a “V”
|
||||||
|
ILoveYou Thumb, index, and pinky extended (ASL “I love you”)
|
||||||
|
None No recognized gesture / below confidence threshold
|
||||||
|
</PRE>
|
||||||
|
|
||||||
|
<div id="liveView" class="videoView">
|
||||||
|
<button id="webcamButton" class="mdc-button mdc-button--raised">
|
||||||
|
<span class="mdc-button__ripple"></span>
|
||||||
|
<span class="mdc-button__label">ENABLE WEBCAM</span>
|
||||||
|
</button>
|
||||||
|
<div style="position: relative;">
|
||||||
|
<video id="webcam" autoplay playsinline></video>
|
||||||
|
<canvas class="output_canvas" id="output_canvas" width="1280" height="720" style="position: absolute; left: 0; top: 0;"></canvas>
|
||||||
|
<p id="gesture_output" class="output"></p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
<script type="module">
|
||||||
|
import { GestureRecognizer, FilesetResolver, DrawingUtils } from "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.3";
|
||||||
|
|
||||||
|
const demosSection = document.getElementById("demos");
|
||||||
|
/** @type {GestureRecognizer} */
|
||||||
|
let gestureRecognizer;
|
||||||
|
let runningMode = "IMAGE";
|
||||||
|
/** @type {HTMLButtonElement} */
|
||||||
|
let enableWebcamButton;
|
||||||
|
let webcamRunning = false;
|
||||||
|
const videoHeight = "360px";
|
||||||
|
const videoWidth = "480px";
|
||||||
|
|
||||||
|
// Load the WASM and model, then reveal the demos section
|
||||||
|
const createGestureRecognizer = async () => {
|
||||||
|
const vision = await FilesetResolver.forVisionTasks(
|
||||||
|
"https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.3/wasm"
|
||||||
|
);
|
||||||
|
gestureRecognizer = await GestureRecognizer.createFromOptions(vision, {
|
||||||
|
baseOptions: {
|
||||||
|
modelAssetPath: "https://storage.googleapis.com/mediapipe-models/gesture_recognizer/gesture_recognizer/float16/1/gesture_recognizer.task",
|
||||||
|
delegate: "GPU"
|
||||||
|
},
|
||||||
|
runningMode
|
||||||
|
});
|
||||||
|
demosSection.classList.remove("invisible");
|
||||||
|
};
|
||||||
|
createGestureRecognizer();
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 1: Detect hand gestures in images
|
||||||
|
********************************************************************/
|
||||||
|
const imageContainers = document.getElementsByClassName("detectOnClick");
|
||||||
|
for (let i = 0; i < imageContainers.length; i++) {
|
||||||
|
const img = imageContainers[i].children[0];
|
||||||
|
img.addEventListener("click", handleClick);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function handleClick(event) {
|
||||||
|
if (!gestureRecognizer) {
|
||||||
|
alert("Please wait for gestureRecognizer to load");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (runningMode === "VIDEO") {
|
||||||
|
runningMode = "IMAGE";
|
||||||
|
await gestureRecognizer.setOptions({ runningMode: "IMAGE" });
|
||||||
|
}
|
||||||
|
|
||||||
|
const parent = event.target.parentNode;
|
||||||
|
|
||||||
|
// Remove previous overlays
|
||||||
|
const allCanvas = parent.getElementsByClassName("canvas");
|
||||||
|
for (let i = allCanvas.length - 1; i >= 0; i--) {
|
||||||
|
const n = allCanvas[i];
|
||||||
|
n.parentNode.removeChild(n);
|
||||||
|
}
|
||||||
|
|
||||||
|
const results = gestureRecognizer.recognize(event.target);
|
||||||
|
console.log(results);
|
||||||
|
|
||||||
|
if (results.gestures && results.gestures.length > 0) {
|
||||||
|
const p = parent.querySelector(".classification");
|
||||||
|
p.classList.remove("removed");
|
||||||
|
|
||||||
|
const categoryName = results.gestures[0][0].categoryName;
|
||||||
|
const categoryScore = (results.gestures[0][0].score * 100).toFixed(2);
|
||||||
|
const handedness = results.handednesses[0][0].displayName;
|
||||||
|
|
||||||
|
p.innerText = `GestureRecognizer: ${categoryName}\n Confidence: ${categoryScore}%\n Handedness: ${handedness}`;
|
||||||
|
p.style.left = "0px";
|
||||||
|
p.style.top = event.target.height + "px";
|
||||||
|
p.style.width = event.target.width - 10 + "px";
|
||||||
|
|
||||||
|
const canvas = document.createElement("canvas");
|
||||||
|
canvas.setAttribute("class", "canvas");
|
||||||
|
canvas.setAttribute("width", event.target.naturalWidth + "px");
|
||||||
|
canvas.setAttribute("height", event.target.naturalHeight + "px");
|
||||||
|
canvas.style.left = "0px";
|
||||||
|
canvas.style.top = "0px";
|
||||||
|
canvas.style.width = event.target.width + "px";
|
||||||
|
canvas.style.height = event.target.height + "px";
|
||||||
|
|
||||||
|
parent.appendChild(canvas);
|
||||||
|
const canvasCtx = canvas.getContext("2d");
|
||||||
|
const drawingUtils = new DrawingUtils(canvasCtx);
|
||||||
|
if (results.landmarks) {
|
||||||
|
for (const landmarks of results.landmarks) {
|
||||||
|
drawingUtils.drawConnectors(landmarks, GestureRecognizer.HAND_CONNECTIONS, { lineWidth: 5 });
|
||||||
|
drawingUtils.drawLandmarks(landmarks, { lineWidth: 1 });
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 2: Continuously grab image from webcam stream and detect it.
|
||||||
|
********************************************************************/
|
||||||
|
const video = document.getElementById("webcam");
|
||||||
|
const canvasElement = document.getElementById("output_canvas");
|
||||||
|
const canvasCtx = canvasElement.getContext("2d");
|
||||||
|
const gestureOutput = document.getElementById("gesture_output");
|
||||||
|
|
||||||
|
function hasGetUserMedia() {
|
||||||
|
return !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (hasGetUserMedia()) {
|
||||||
|
enableWebcamButton = document.getElementById("webcamButton");
|
||||||
|
enableWebcamButton.addEventListener("click", enableCam);
|
||||||
|
} else {
|
||||||
|
console.warn("getUserMedia() is not supported by your browser");
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableCam() {
|
||||||
|
if (!gestureRecognizer) {
|
||||||
|
alert("Please wait for gestureRecognizer to load");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
webcamRunning = !webcamRunning;
|
||||||
|
enableWebcamButton.innerText = webcamRunning ? "DISABLE PREDICTIONS" : "ENABLE PREDICTIONS";
|
||||||
|
|
||||||
|
const constraints = { video: true };
|
||||||
|
navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
|
||||||
|
video.srcObject = stream;
|
||||||
|
video.addEventListener("loadeddata", predictWebcam);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
let lastVideoTime = -1;
|
||||||
|
let results;
|
||||||
|
async function predictWebcam() {
|
||||||
|
const webcamElement = document.getElementById("webcam");
|
||||||
|
|
||||||
|
if (runningMode === "IMAGE") {
|
||||||
|
runningMode = "VIDEO";
|
||||||
|
await gestureRecognizer.setOptions({ runningMode: "VIDEO" });
|
||||||
|
}
|
||||||
|
|
||||||
|
const nowInMs = Date.now();
|
||||||
|
if (video.currentTime !== lastVideoTime) {
|
||||||
|
lastVideoTime = video.currentTime;
|
||||||
|
results = gestureRecognizer.recognizeForVideo(video, nowInMs);
|
||||||
|
}
|
||||||
|
|
||||||
|
canvasCtx.save();
|
||||||
|
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
const drawingUtils = new DrawingUtils(canvasCtx);
|
||||||
|
|
||||||
|
canvasElement.style.height = videoHeight;
|
||||||
|
webcamElement.style.height = videoHeight;
|
||||||
|
canvasElement.style.width = videoWidth;
|
||||||
|
webcamElement.style.width = videoWidth;
|
||||||
|
|
||||||
|
if (results && results.landmarks) {
|
||||||
|
for (const landmarks of results.landmarks) {
|
||||||
|
drawingUtils.drawConnectors(landmarks, GestureRecognizer.HAND_CONNECTIONS, { lineWidth: 5 });
|
||||||
|
drawingUtils.drawLandmarks(landmarks, { lineWidth: 2 });
|
||||||
|
}
|
||||||
|
}
|
||||||
|
canvasCtx.restore();
|
||||||
|
|
||||||
|
if (results && results.gestures && results.gestures.length > 0) {
|
||||||
|
gestureOutput.style.display = "block";
|
||||||
|
gestureOutput.style.width = videoWidth;
|
||||||
|
const categoryName = results.gestures[0][0].categoryName;
|
||||||
|
const categoryScore = (results.gestures[0][0].score * 100).toFixed(2);
|
||||||
|
const handedness = results.handednesses[0][0].displayName;
|
||||||
|
gestureOutput.innerText = `GestureRecognizer: ${categoryName}\n Confidence: ${categoryScore} %\n Handedness: ${handedness}`;
|
||||||
|
} else {
|
||||||
|
gestureOutput.style.display = "none";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (webcamRunning === true) {
|
||||||
|
window.requestAnimationFrame(predictWebcam);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
gesture.sh (new executable file, +5 lines)
@@ -0,0 +1,5 @@
export GLOG_minloglevel=2
export TF_CPP_MIN_LOG_LEVEL=3
python recognize_gesture.py --image ily.png --model gesture_recognizer.task 2>/dev/null
gesture_recognizer.task (new binary file, not shown)
hand_landmarker.task (new binary file, not shown)
hand_landmarker_cli.py (new executable file, +125 lines)
@@ -0,0 +1,125 @@
#!/usr/bin/env python3
"""
Hand Landmarks on a static image using MediaPipe Tasks.

Usage:
    python hand_landmarker_cli.py --image hand.png --model hand_landmarker.task --max_hands 2 --out annotated.png

What it does:
    • Loads the MediaPipe Hand Landmarker model (.task file)
    • Runs landmark detection on a single image
    • Prints handedness and 21 landmark coords for each detected hand
    • Saves an annotated image with landmarks and connections
"""

import argparse
import sys
from pathlib import Path

import cv2
import numpy as np
import mediapipe as mp

# MediaPipe Tasks API aliases
BaseOptions = mp.tasks.BaseOptions
HandLandmarker = mp.tasks.vision.HandLandmarker
HandLandmarkerOptions = mp.tasks.vision.HandLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

# Landmark connection topology (same as mp.solutions.hands.HAND_CONNECTIONS, copied to avoid extra dependency)
HAND_CONNECTIONS = [
    (0, 1), (1, 2), (2, 3), (3, 4),          # Thumb
    (0, 5), (5, 6), (6, 7), (7, 8),          # Index
    (5, 9), (9, 10), (10, 11), (11, 12),     # Middle
    (9, 13), (13, 14), (14, 15), (15, 16),   # Ring
    (13, 17), (17, 18), (18, 19), (19, 20),  # Pinky
    (0, 17)                                  # Palm base to pinky base
]


def draw_landmarks(image_bgr: np.ndarray, landmarks_norm: list):
    """
    Draws landmarks and connections on a BGR image.
    `landmarks_norm` is a list of normalized (x,y,z) MediaPipe landmarks (0..1).
    """
    h, w = image_bgr.shape[:2]

    # Convert normalized to pixel coords
    pts = []
    for lm in landmarks_norm:
        x = int(lm.x * w)
        y = int(lm.y * h)
        pts.append((x, y))

    # Draw connections
    for a, b in HAND_CONNECTIONS:
        if 0 <= a < len(pts) and 0 <= b < len(pts):
            cv2.line(image_bgr, pts[a], pts[b], (0, 255, 0), 2, cv2.LINE_AA)

    # Draw keypoints
    for i, (x, y) in enumerate(pts):
        cv2.circle(image_bgr, (x, y), 3, (255, 255, 255), -1, cv2.LINE_AA)
        cv2.circle(image_bgr, (x, y), 2, (0, 0, 255), -1, cv2.LINE_AA)


def main():
    ap = argparse.ArgumentParser(description="MediaPipe Hand Landmarker (static image)")
    ap.add_argument("--image", required=True, help="Path to an input image (e.g., hand.jpg)")
    ap.add_argument("--model", default="hand_landmarker.task", help="Path to MediaPipe .task model")
    ap.add_argument("--max_hands", type=int, default=2, help="Maximum hands to detect")
    ap.add_argument("--out", default="annotated.png", help="Output path for annotated image")
    args = ap.parse_args()

    img_path = Path(args.image)
    if not img_path.exists():
        print(f"[ERROR] Image not found: {img_path}", file=sys.stderr)
        sys.exit(1)

    model_path = Path(args.model)
    if not model_path.exists():
        print(f"[ERROR] Model not found: {model_path}", file=sys.stderr)
        print("Download the model bundle (.task) and point --model to it.", file=sys.stderr)
        sys.exit(2)

    # Load image for MP and for drawing
    mp_image = mp.Image.create_from_file(str(img_path))
    image_bgr = cv2.imread(str(img_path))
    if image_bgr is None:
        print(f"[ERROR] Could not read image with OpenCV: {img_path}", file=sys.stderr)
        sys.exit(3)

    # Configure and run the landmarker
    options = HandLandmarkerOptions(
        base_options=BaseOptions(model_asset_path=str(model_path)),
        running_mode=VisionRunningMode.IMAGE,
        num_hands=args.max_hands,
        min_hand_detection_confidence=0.5,
        min_hand_presence_confidence=0.5,
        min_tracking_confidence=0.5
    )

    with HandLandmarker.create_from_options(options) as landmarker:
        result = landmarker.detect(mp_image)

    # Print results
    if not result.hand_landmarks:
        print("No hands detected.")
    else:
        for i, (handedness, lms, world_lms) in enumerate(
            zip(result.handedness, result.hand_landmarks, result.hand_world_landmarks)
        ):
            label = handedness[0].category_name if handedness else "Unknown"
            score = handedness[0].score if handedness else 0.0
            print(f"\nHand #{i+1}: {label} (score {score:.3f})")
            for idx, lm in enumerate(lms):
                print(f"  L{idx:02d}: x={lm.x:.3f} y={lm.y:.3f} z={lm.z:.3f}")

            # Draw
            draw_landmarks(image_bgr, lms)
            # Put label
            cv2.putText(image_bgr, f"{label}", (10, 30 + i*30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2, cv2.LINE_AA)

    # Save annotated image
    cv2.imwrite(str(args.out), image_bgr)
    print(f"\nSaved annotated image to: {args.out}")


if __name__ == "__main__":
    main()
holistic.html (new file, +262 lines)
@@ -0,0 +1,262 @@
|
|||||||
|
<!doctype html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
<title>MediaPipe Holistic — Main Output Only</title>
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||||
|
<link href="https://fonts.googleapis.com/css2?family=Titillium+Web:wght@400;600&display=swap" rel="stylesheet">
|
||||||
|
<style>
|
||||||
|
@keyframes spin { 0% {transform: rotate(0)} 100% {transform: rotate(360deg)} }
|
||||||
|
.abs { position: absolute; }
|
||||||
|
a { color: white; text-decoration: none; } a:hover { color: lightblue; }
|
||||||
|
body {
|
||||||
|
margin: 0; color: white; font-family: 'Titillium Web', sans-serif;
|
||||||
|
position: absolute; inset: 0; overflow: hidden; background: #000;
|
||||||
|
}
|
||||||
|
.container {
|
||||||
|
position: absolute; inset: 0; background-color: #596e73; height: 100%;
|
||||||
|
}
|
||||||
|
.canvas-container {
|
||||||
|
display: flex; height: 100%; width: 100%;
|
||||||
|
justify-content: center; align-items: center;
|
||||||
|
}
|
||||||
|
.output_canvas { max-width: 100%; display: block; position: relative; }
|
||||||
|
/* Hide ALL video elements so only the processed canvas is visible */
|
||||||
|
video { display: none !important; }
|
||||||
|
.control-panel { position: absolute; left: 10px; top: 10px; z-index: 6; }
|
||||||
|
.loading {
|
||||||
|
display: flex; position: absolute; inset: 0; align-items: center; justify-content: center;
|
||||||
|
backface-visibility: hidden; opacity: 1; transition: opacity 1s; z-index: 10;
|
||||||
|
}
|
||||||
|
.loading .spinner {
|
||||||
|
position: absolute; width: 120px; height: 120px; animation: spin 1s linear infinite;
|
||||||
|
border: 32px solid #bebebe; border-top: 32px solid #3498db; border-radius: 50%;
|
||||||
|
}
|
||||||
|
.loading .message { font-size: x-large; }
|
||||||
|
.loaded .loading { opacity: 0; }
|
||||||
|
.logo { bottom: 10px; right: 20px; }
|
||||||
|
.logo .title { color: white; font-size: 28px; }
|
||||||
|
.shoutout { left: 0; right: 0; bottom: 40px; text-align: center; font-size: 24px; position: absolute; z-index: 4; }
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="container">
|
||||||
|
<!-- Hidden capture element kept for MediaPipe pipeline -->
|
||||||
|
<video class="input_video" playsinline></video>
|
||||||
|
|
||||||
|
<div class="canvas-container">
|
||||||
|
<canvas class="output_canvas" width="1280" height="720"></canvas>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Loading spinner -->
|
||||||
|
<div class="loading">
|
||||||
|
<div class="spinner"></div>
|
||||||
|
<div class="message">Loading</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Logo/link -->
|
||||||
|
<a class="abs logo" href="https://mediapipe.dev" target="_blank" rel="noreferrer">
|
||||||
|
<div style="display:flex;align-items:center;bottom:0;right:10px;">
|
||||||
|
<img class="logo" alt="" style="height:50px"
|
||||||
|
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGMAAQAABQABJtqz7QAAAABJRU5ErkJggg==" />
|
||||||
|
<span class="title" style="margin-left:8px">MediaPipe</span>
|
||||||
|
</div>
|
||||||
|
</a>
|
||||||
|
|
||||||
|
<!-- Info link -->
|
||||||
|
<div class="shoutout">
|
||||||
|
<div><a href="https://solutions.mediapipe.dev/holistic" target="_blank" rel="noreferrer">Click here for more info</a></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Control panel container -->
|
||||||
|
<div class="control-panel"></div>
|
||||||
|
|
||||||
|
<!-- MediaPipe libs (globals: mpHolistic, drawingUtils, controlsNS, etc.) -->
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/holistic/holistic.js"></script>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js"></script>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js"></script>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js"></script>
|
||||||
|
|
||||||
|
<!-- Device detector is ESM; we import it and run the app -->
|
||||||
|
<script type="module">
|
||||||
|
import DeviceDetector from "https://cdn.skypack.dev/device-detector-js@2.2.10";
|
||||||
|
|
||||||
|
function testSupport(supportedDevices) {
|
||||||
|
const dd = new DeviceDetector();
|
||||||
|
const d = dd.parse(navigator.userAgent);
|
||||||
|
let ok = false;
|
||||||
|
for (const dev of supportedDevices) {
|
||||||
|
if (dev.client && !(new RegExp(`^${dev.client}$`)).test(d.client.name)) continue;
|
||||||
|
if (dev.os && !(new RegExp(`^${dev.os}$`)).test(d.os.name)) continue;
|
||||||
|
ok = true; break;
|
||||||
|
}
|
||||||
|
if (!ok) alert(`This demo, running on ${d.client.name}/${d.os.name}, is not well supported at this time, continue at your own risk.`);
|
||||||
|
}
|
||||||
|
testSupport([{ client: 'Chrome' }]);
|
||||||
|
|
||||||
|
const controlsNS = window;
|
||||||
|
const mpHolistic = window;
|
||||||
|
const drawingUtils = window;
|
||||||
|
|
||||||
|
const videoElement = document.getElementsByClassName('input_video')[0];
|
||||||
|
const canvasElement = document.getElementsByClassName('output_canvas')[0];
|
||||||
|
const controlsElement = document.getElementsByClassName('control-panel')[0];
|
||||||
|
const canvasCtx = canvasElement.getContext('2d');
|
||||||
|
|
||||||
|
const fpsControl = new controlsNS.FPS();
|
||||||
|
const spinner = document.querySelector('.loading');
|
||||||
|
spinner.ontransitionend = () => { spinner.style.display = 'none'; };
|
||||||
|
|
||||||
|
function removeElements(landmarks, elements) {
|
||||||
|
if (!landmarks) return;
|
||||||
|
for (const e of elements) delete landmarks[e];
|
||||||
|
}
|
||||||
|
function removeLandmarks(results) {
|
||||||
|
if (results.poseLandmarks) {
|
||||||
|
removeElements(results.poseLandmarks, [0,1,2,3,4,5,6,7,8,9,10,15,16,17,18,19,20,21,22]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
function connect(ctx, connectors) {
|
||||||
|
const c = ctx.canvas;
|
||||||
|
for (const [from, to] of connectors) {
|
||||||
|
if (!from || !to) continue;
|
||||||
|
if (from.visibility && to.visibility && (from.visibility < 0.1 || to.visibility < 0.1)) continue;
|
||||||
|
ctx.beginPath();
|
||||||
|
ctx.moveTo(from.x * c.width, from.y * c.height);
|
||||||
|
ctx.lineTo(to.x * c.width, to.y * c.height);
|
||||||
|
ctx.stroke();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
let activeEffect = 'mask';
|
||||||
|
|
||||||
|
function onResults(results) {
|
||||||
|
document.body.classList.add('loaded');
|
||||||
|
removeLandmarks(results);
|
||||||
|
fpsControl.tick();
|
||||||
|
|
||||||
|
canvasCtx.save();
|
||||||
|
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
|
||||||
|
if (results.segmentationMask) {
|
||||||
|
canvasCtx.drawImage(results.segmentationMask, 0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
if (activeEffect === 'mask' || activeEffect === 'both') {
|
||||||
|
canvasCtx.globalCompositeOperation = 'source-in';
|
||||||
|
canvasCtx.fillStyle = '#00FF007F';
|
||||||
|
canvasCtx.fillRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
} else {
|
||||||
|
canvasCtx.globalCompositeOperation = 'source-out';
|
||||||
|
canvasCtx.fillStyle = '#0000FF7F';
|
||||||
|
canvasCtx.fillRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
}
|
||||||
|
canvasCtx.globalCompositeOperation = 'destination-atop';
|
||||||
|
canvasCtx.drawImage(results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
canvasCtx.globalCompositeOperation = 'source-over';
|
||||||
|
} else {
|
||||||
|
canvasCtx.drawImage(results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
}
|
||||||
|
|
||||||
|
canvasCtx.lineWidth = 5;
|
||||||
|
if (results.poseLandmarks) {
|
||||||
|
if (results.rightHandLandmarks) {
|
||||||
|
canvasCtx.strokeStyle = 'white';
|
||||||
|
connect(canvasCtx, [[
|
||||||
|
results.poseLandmarks[mpHolistic.POSE_LANDMARKS.RIGHT_ELBOW],
|
||||||
|
results.rightHandLandmarks[0]
|
||||||
|
]]);
|
||||||
|
}
|
||||||
|
if (results.leftHandLandmarks) {
|
||||||
|
canvasCtx.strokeStyle = 'white';
|
||||||
|
connect(canvasCtx, [[
|
||||||
|
results.poseLandmarks[mpHolistic.POSE_LANDMARKS.LEFT_ELBOW],
|
||||||
|
results.leftHandLandmarks[0]
|
||||||
|
]]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.poseLandmarks, mpHolistic.POSE_CONNECTIONS, { color: 'white' });
|
||||||
|
drawingUtils.drawLandmarks(
|
||||||
|
canvasCtx,
|
||||||
|
Object.values(mpHolistic.POSE_LANDMARKS_LEFT).map(i => results.poseLandmarks?.[i]),
|
||||||
|
{ visibilityMin: 0.65, color: 'white', fillColor: 'rgb(255,138,0)' }
|
||||||
|
);
|
||||||
|
drawingUtils.drawLandmarks(
|
||||||
|
canvasCtx,
|
||||||
|
Object.values(mpHolistic.POSE_LANDMARKS_RIGHT).map(i => results.poseLandmarks?.[i]),
|
||||||
|
{ visibilityMin: 0.65, color: 'white', fillColor: 'rgb(0,217,231)' }
|
||||||
|
);
|
||||||
|
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.rightHandLandmarks, mpHolistic.HAND_CONNECTIONS, { color: 'white' });
|
||||||
|
drawingUtils.drawLandmarks(canvasCtx, results.rightHandLandmarks, {
|
||||||
|
color: 'white', fillColor: 'rgb(0,217,231)', lineWidth: 2,
|
||||||
|
radius: (data) => drawingUtils.lerp(data.from?.z ?? 0, -0.15, 0.1, 10, 1)
|
||||||
|
});
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.leftHandLandmarks, mpHolistic.HAND_CONNECTIONS, { color: 'white' });
|
||||||
|
drawingUtils.drawLandmarks(canvasCtx, results.leftHandLandmarks, {
|
||||||
|
color: 'white', fillColor: 'rgb(255,138,0)', lineWidth: 2,
|
||||||
|
radius: (data) => drawingUtils.lerp(data.from?.z ?? 0, -0.15, 0.1, 10, 1)
|
||||||
|
});
|
||||||
|
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_TESSELATION, { color: '#C0C0C070', lineWidth: 1 });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_RIGHT_EYE, { color: 'rgb(0,217,231)' });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_RIGHT_EYEBROW, { color: 'rgb(0,217,231)' });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_LEFT_EYE, { color: 'rgb(255,138,0)' });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_LEFT_EYEBROW, { color: 'rgb(255,138,0)' });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_FACE_OVAL, { color: '#E0E0E0', lineWidth: 5 });
|
||||||
|
drawingUtils.drawConnectors(canvasCtx, results.faceLandmarks, mpHolistic.FACEMESH_LIPS, { color: '#E0E0E0', lineWidth: 5 });
|
||||||
|
|
||||||
|
canvasCtx.restore();
|
||||||
|
}
|
||||||
|
|
||||||
|
const holistic = new mpHolistic.Holistic({
|
||||||
|
locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/holistic@${mpHolistic.VERSION}/${file}`
|
||||||
|
});
|
||||||
|
holistic.onResults(onResults);
|
||||||
|
|
||||||
|
new controlsNS.ControlPanel(controlsElement, {
|
||||||
|
selfieMode: true,
|
||||||
|
modelComplexity: 1,
|
||||||
|
smoothLandmarks: true,
|
||||||
|
enableSegmentation: false,
|
||||||
|
smoothSegmentation: true,
|
||||||
|
minDetectionConfidence: 0.5,
|
||||||
|
minTrackingConfidence: 0.5,
|
||||||
|
effect: 'background',
|
||||||
|
})
|
||||||
|
.add([
|
||||||
|
new controlsNS.StaticText({ title: 'MediaPipe Holistic' }),
|
||||||
|
fpsControl,
|
||||||
|
new controlsNS.Toggle({ title: 'Selfie Mode', field: 'selfieMode' }),
|
||||||
|
new controlsNS.SourcePicker({
|
||||||
|
onSourceChanged: () => { holistic.reset(); },
|
||||||
|
onFrame: async (input, size) => {
|
||||||
|
const aspect = size.height / size.width;
|
||||||
|
let width, height;
|
||||||
|
if (window.innerWidth > window.innerHeight) {
|
||||||
|
height = window.innerHeight; width = height / aspect;
|
||||||
|
} else {
|
||||||
|
width = window.innerWidth; height = width * aspect;
|
||||||
|
}
|
||||||
|
canvasElement.width = width;
|
||||||
|
canvasElement.height = height;
|
||||||
|
await holistic.send({ image: input });
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
new controlsNS.Slider({ title: 'Model Complexity', field: 'modelComplexity', discrete: ['Lite', 'Full', 'Heavy'] }),
|
||||||
|
new controlsNS.Toggle({ title: 'Smooth Landmarks', field: 'smoothLandmarks' }),
|
||||||
|
new controlsNS.Toggle({ title: 'Enable Segmentation', field: 'enableSegmentation' }),
|
||||||
|
new controlsNS.Toggle({ title: 'Smooth Segmentation', field: 'smoothSegmentation' }),
|
||||||
|
new controlsNS.Slider({ title: 'Min Detection Confidence', field: 'minDetectionConfidence', range: [0, 1], step: 0.01 }),
|
||||||
|
new controlsNS.Slider({ title: 'Min Tracking Confidence', field: 'minTrackingConfidence', range: [0, 1], step: 0.01 }),
|
||||||
|
new controlsNS.Slider({ title: 'Effect', field: 'effect', discrete: { background: 'Background', mask: 'Foreground' } }),
|
||||||
|
])
|
||||||
|
.on(x => {
|
||||||
|
const options = x;
|
||||||
|
videoElement.classList.toggle('selfie', !!options.selfieMode);
|
||||||
|
activeEffect = x['effect'];
|
||||||
|
holistic.setOptions(options);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
landmarks.png (new binary file, 977 KiB, not shown)
marker.html (new file, +268 lines)
@@ -0,0 +1,268 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8" />
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||||
|
<title>MediaPipe Hand Landmarker — Single File Demo</title>
|
||||||
|
|
||||||
|
<!-- Material Components (for the button styling) -->
|
||||||
|
<link href="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.css" rel="stylesheet">
|
||||||
|
<script src="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.js"></script>
|
||||||
|
|
||||||
|
<!-- Drawing utils (provides drawConnectors, drawLandmarks) -->
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
|
||||||
|
<!-- Hands (provides HAND_CONNECTIONS constant) -->
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/hands/hands.js" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<style>
|
||||||
|
/* Inline CSS from the CodePen, cleaned for single-file use */
|
||||||
|
body {
|
||||||
|
font-family: Roboto, Arial, sans-serif;
|
||||||
|
margin: 2em;
|
||||||
|
color: #3d3d3d;
|
||||||
|
--mdc-theme-primary: #007f8b;
|
||||||
|
--mdc-theme-on-primary: #f1f3f4;
|
||||||
|
}
|
||||||
|
|
||||||
|
h1 { color: #007f8b; }
|
||||||
|
h2 { clear: both; }
|
||||||
|
em { font-weight: bold; }
|
||||||
|
|
||||||
|
video {
|
||||||
|
clear: both;
|
||||||
|
display: block;
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
section {
|
||||||
|
opacity: 1;
|
||||||
|
transition: opacity 500ms ease-in-out;
|
||||||
|
}
|
||||||
|
|
||||||
|
.removed { display: none; }
|
||||||
|
.invisible { opacity: 0.2; }
|
||||||
|
|
||||||
|
.note {
|
||||||
|
font-style: italic;
|
||||||
|
font-size: 130%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView, .detectOnClick {
|
||||||
|
position: relative;
|
||||||
|
float: left;
|
||||||
|
width: 48%;
|
||||||
|
margin: 2% 1%;
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView p, .detectOnClick p {
|
||||||
|
position: absolute;
|
||||||
|
padding: 5px;
|
||||||
|
background-color: #007f8b;
|
||||||
|
color: #fff;
|
||||||
|
border: 1px dashed rgba(255, 255, 255, 0.7);
|
||||||
|
z-index: 2;
|
||||||
|
font-size: 12px;
|
||||||
|
margin: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.highlighter {
|
||||||
|
background: rgba(0, 255, 0, 0.25);
|
||||||
|
border: 1px dashed #fff;
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
}
|
||||||
|
|
||||||
|
.canvas, .output_canvas {
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
pointer-events: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.output_canvas {
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
.detectOnClick { z-index: 0; }
|
||||||
|
.detectOnClick img { width: 100%; }
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<section id="demos" class="invisible">
<h2>Demo: Webcam continuous hands landmarks detection</h2>
|
||||||
|
<p>Hold your hand in front of your webcam to get real-time hand landmarker detection.<br>Click <b>ENABLE WEBCAM</b> below and grant access to the webcam if prompted.</p>
|
||||||
|
|
||||||
|
<div id="liveView" class="videoView">
|
||||||
|
<button id="webcamButton" class="mdc-button mdc-button--raised">
|
||||||
|
<span class="mdc-button__ripple"></span>
|
||||||
|
<span class="mdc-button__label">ENABLE WEBCAM</span>
|
||||||
|
</button>
|
||||||
|
<div style="position: relative;">
|
||||||
|
<video id="webcam" style="position: absolute; left: 0; top: 0;" autoplay playsinline></video>
|
||||||
|
<canvas class="output_canvas" id="output_canvas" style="left: 0; top: 0;"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
<script type="module">
|
||||||
|
// Import the Tasks Vision ESM build
|
||||||
|
import { HandLandmarker, FilesetResolver } from "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0";
|
||||||
|
|
||||||
|
const demosSection = document.getElementById("demos");
|
||||||
|
|
||||||
|
let handLandmarker;
|
||||||
|
let runningMode = "IMAGE";
|
||||||
|
let enableWebcamButton;
|
||||||
|
let webcamRunning = false;
|
||||||
|
|
||||||
|
// Load the model and enable the demos section
|
||||||
|
const createHandLandmarker = async () => {
|
||||||
|
const vision = await FilesetResolver.forVisionTasks(
|
||||||
|
"https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0/wasm"
|
||||||
|
);
|
||||||
|
handLandmarker = await HandLandmarker.createFromOptions(vision, {
|
||||||
|
baseOptions: {
|
||||||
|
modelAssetPath: "https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task",
|
||||||
|
delegate: "GPU"
|
||||||
|
},
|
||||||
|
runningMode,
|
||||||
|
numHands: 2
|
||||||
|
});
|
||||||
|
demosSection.classList.remove("invisible");
|
||||||
|
};
|
||||||
|
createHandLandmarker();
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 1: Click images to run landmark detection
|
||||||
|
********************************************************************/
|
||||||
|
const imageContainers = document.getElementsByClassName("detectOnClick");
|
||||||
|
for (let i = 0; i < imageContainers.length; i++) {
|
||||||
|
const img = imageContainers[i].children[0];
|
||||||
|
img.addEventListener("click", handleClick);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function handleClick(event) {
|
||||||
|
if (!handLandmarker) {
|
||||||
|
console.log("Wait for handLandmarker to load before clicking!");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if (runningMode === "VIDEO") {
|
||||||
|
runningMode = "IMAGE";
|
||||||
|
await handLandmarker.setOptions({ runningMode: "IMAGE" });
|
||||||
|
}
|
||||||
|
|
||||||
|
const container = event.target.parentNode;
|
||||||
|
// Remove old overlays
|
||||||
|
const old = container.getElementsByClassName("canvas");
|
||||||
|
for (let i = old.length - 1; i >= 0; i--) {
|
||||||
|
old[i].parentNode.removeChild(old[i]);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run detection
|
||||||
|
const result = handLandmarker.detect(event.target);
|
||||||
|
|
||||||
|
// Create overlay canvas aligned to the image element
|
||||||
|
const canvas = document.createElement("canvas");
|
||||||
|
canvas.className = "canvas";
|
||||||
|
canvas.width = event.target.naturalWidth;
|
||||||
|
canvas.height = event.target.naturalHeight;
|
||||||
|
canvas.style.left = "0px";
|
||||||
|
canvas.style.top = "0px";
|
||||||
|
canvas.style.width = event.target.width + "px";
|
||||||
|
canvas.style.height = event.target.height + "px";
|
||||||
|
container.appendChild(canvas);
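// Note on the sizing above: canvas.width/height set the drawing resolution to the
// image's natural size, while style.width/height match its on-screen (CSS) size,
// so the normalized landmark coordinates drawn below line up with the displayed image.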
|
||||||
|
|
||||||
|
const ctx = canvas.getContext("2d");
|
||||||
|
if (result && result.landmarks) {
|
||||||
|
for (const landmarks of result.landmarks) {
|
||||||
|
// drawConnectors and drawLandmarks are provided by drawing_utils.js
|
||||||
|
// HAND_CONNECTIONS is provided by hands.js
|
||||||
|
drawConnectors(ctx, landmarks, HAND_CONNECTIONS, {
|
||||||
|
color: "#00FF00",
|
||||||
|
lineWidth: 5
|
||||||
|
});
|
||||||
|
drawLandmarks(ctx, landmarks, { color: "#FF0000", lineWidth: 1 });
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 2: Webcam stream detection
|
||||||
|
********************************************************************/
|
||||||
|
const video = document.getElementById("webcam");
|
||||||
|
const canvasElement = document.getElementById("output_canvas");
|
||||||
|
const canvasCtx = canvasElement.getContext("2d");
|
||||||
|
|
||||||
|
const hasGetUserMedia = () => !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
|
||||||
|
|
||||||
|
if (hasGetUserMedia()) {
|
||||||
|
enableWebcamButton = document.getElementById("webcamButton");
|
||||||
|
enableWebcamButton.addEventListener("click", enableCam);
|
||||||
|
} else {
|
||||||
|
console.warn("getUserMedia() is not supported by your browser");
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableCam() {
|
||||||
|
if (!handLandmarker) {
|
||||||
|
console.log("Wait! HandLandmarker not loaded yet.");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
webcamRunning = !webcamRunning;
|
||||||
|
enableWebcamButton.innerText = webcamRunning ? "DISABLE PREDICTIONS" : "ENABLE PREDICTIONS";
|
||||||
|
|
||||||
|
if (!webcamRunning) return;
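// Behavioral note: toggling off only stops scheduling new predictions; the
// getUserMedia stream itself is left running, and re-enabling requests a fresh stream.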
|
||||||
|
|
||||||
|
const constraints = { video: true };
|
||||||
|
navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
|
||||||
|
video.srcObject = stream;
|
||||||
|
video.addEventListener("loadeddata", predictWebcam);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
let lastVideoTime = -1;
|
||||||
|
let results;
|
||||||
|
|
||||||
|
async function predictWebcam() {
|
||||||
|
// Match canvas to the video size
|
||||||
|
canvasElement.style.width = video.videoWidth + "px";
|
||||||
|
canvasElement.style.height = video.videoHeight + "px";
|
||||||
|
canvasElement.width = video.videoWidth;
|
||||||
|
canvasElement.height = video.videoHeight;
|
||||||
|
|
||||||
|
// Switch to VIDEO mode for streaming
|
||||||
|
if (runningMode === "IMAGE") {
|
||||||
|
runningMode = "VIDEO";
|
||||||
|
await handLandmarker.setOptions({ runningMode: "VIDEO" });
|
||||||
|
}
|
||||||
|
|
||||||
|
const startTimeMs = performance.now();
|
||||||
|
if (lastVideoTime !== video.currentTime) {
|
||||||
|
lastVideoTime = video.currentTime;
|
||||||
|
results = handLandmarker.detectForVideo(video, startTimeMs);
|
||||||
|
}
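// The lastVideoTime guard above skips detection when requestAnimationFrame fires
// faster than the webcam delivers frames, so each new video frame is processed once.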
|
||||||
|
|
||||||
|
canvasCtx.save();
|
||||||
|
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
|
||||||
|
if (results && results.landmarks) {
|
||||||
|
for (const landmarks of results.landmarks) {
|
||||||
|
drawConnectors(canvasCtx, landmarks, HAND_CONNECTIONS, {
|
||||||
|
color: "#00FF00",
|
||||||
|
lineWidth: 5
|
||||||
|
});
|
||||||
|
drawLandmarks(canvasCtx, landmarks, { color: "#FF0000", lineWidth: 2 });
|
||||||
|
}
|
||||||
|
}
|
||||||
|
canvasCtx.restore();
|
||||||
|
|
||||||
|
if (webcamRunning) {
|
||||||
|
window.requestAnimationFrame(predictWebcam);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
2
more_info.txt
Normal file
@@ -0,0 +1,2 @@
https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker
https://ai.google.dev/edge/mediapipe/solutions/customization/gesture_recognizer
298
posture.html
Normal file
@@ -0,0 +1,298 @@
<!DOCTYPE html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8">
|
||||||
|
<meta http-equiv="Cache-control" content="no-cache, no-store, must-revalidate">
|
||||||
|
<meta http-equiv="Pragma" content="no-cache">
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
|
||||||
|
<title>Pose Landmarker — Single File Demo</title>
|
||||||
|
|
||||||
|
<!-- Material Components (for the button styling) -->
|
||||||
|
<link href="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.css" rel="stylesheet">
|
||||||
|
<script src="https://unpkg.com/material-components-web@latest/dist/material-components-web.min.js"></script>
|
||||||
|
|
||||||
|
<style>
|
||||||
|
/* Copyright 2023 The MediaPipe Authors.
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License. */
|
||||||
|
|
||||||
|
/* NOTE: The original CSS used `@use "@material";` which is a Sass directive.
|
||||||
|
That's not valid in plain CSS, so it's removed here. */
|
||||||
|
|
||||||
|
body {
|
||||||
|
font-family: Roboto, system-ui, -apple-system, Segoe UI, Arial, sans-serif;
|
||||||
|
margin: 2em;
|
||||||
|
color: #3d3d3d;
|
||||||
|
--mdc-theme-primary: #007f8b;
|
||||||
|
--mdc-theme-on-primary: #f1f3f4;
|
||||||
|
}
|
||||||
|
|
||||||
|
h1 { color: #007f8b; }
|
||||||
|
h2 { clear: both; }
|
||||||
|
|
||||||
|
em { font-weight: bold; }
|
||||||
|
|
||||||
|
video {
|
||||||
|
clear: both;
|
||||||
|
display: block;
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
section {
|
||||||
|
opacity: 1;
|
||||||
|
transition: opacity 500ms ease-in-out;
|
||||||
|
}
|
||||||
|
|
||||||
|
header, footer { clear: both; }
|
||||||
|
|
||||||
|
.removed { display: none; }
|
||||||
|
.invisible { opacity: 0.2; }
|
||||||
|
|
||||||
|
.note {
|
||||||
|
font-style: italic;
|
||||||
|
font-size: 130%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView, .detectOnClick {
|
||||||
|
position: relative;
|
||||||
|
float: left;
|
||||||
|
width: 48%;
|
||||||
|
margin: 2% 1%;
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
|
||||||
|
.videoView p, .detectOnClick p {
|
||||||
|
position: absolute;
|
||||||
|
padding: 5px;
|
||||||
|
background-color: #007f8b;
|
||||||
|
color: #fff;
|
||||||
|
border: 1px dashed rgba(255, 255, 255, 0.7);
|
||||||
|
z-index: 2;
|
||||||
|
font-size: 12px;
|
||||||
|
margin: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.highlighter {
|
||||||
|
background: rgba(0, 255, 0, 0.25);
|
||||||
|
border: 1px dashed #fff;
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
}
|
||||||
|
|
||||||
|
.canvas {
|
||||||
|
z-index: 1;
|
||||||
|
position: absolute;
|
||||||
|
pointer-events: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.output_canvas {
|
||||||
|
transform: rotateY(180deg);
|
||||||
|
-webkit-transform: rotateY(180deg);
|
||||||
|
-moz-transform: rotateY(180deg);
|
||||||
|
}
|
||||||
|
|
||||||
|
.detectOnClick { z-index: 0; }
|
||||||
|
.detectOnClick img { width: 100%; }
|
||||||
|
|
||||||
|
/* Simple layout fix for the video/canvas wrapper */
|
||||||
|
.video-wrapper {
|
||||||
|
position: relative;
|
||||||
|
width: 1280px;
|
||||||
|
max-width: 100%;
|
||||||
|
aspect-ratio: 16 / 9;
|
||||||
|
}
|
||||||
|
.video-wrapper video,
|
||||||
|
.video-wrapper canvas {
|
||||||
|
position: absolute;
|
||||||
|
top: 0;
|
||||||
|
left: 0;
|
||||||
|
width: 100%;
|
||||||
|
height: 100%;
|
||||||
|
}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body>
|
||||||
|
<h1>Pose detection using the MediaPipe PoseLandmarker task</h1>
|
||||||
|
|
||||||
|
<section id="demos" class="invisible">
|
||||||
|
<h2>Demo: Webcam continuous pose landmarks detection</h2>
|
||||||
|
<p>Stand in front of your webcam to get real-time pose landmarker detection.<br>Click <b>enable webcam</b> below and grant access to the webcam if prompted.</p>
|
||||||
|
|
||||||
|
<div id="liveView" class="videoView">
|
||||||
|
<button id="webcamButton" class="mdc-button mdc-button--raised">
|
||||||
|
<span class="mdc-button__ripple"></span>
|
||||||
|
<span class="mdc-button__label">ENABLE WEBCAM</span>
|
||||||
|
</button>
|
||||||
|
<div class="video-wrapper">
|
||||||
|
<video id="webcam" autoplay playsinline></video>
|
||||||
|
<canvas class="output_canvas" id="output_canvas" width="1280" height="720"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
<script type="module">
|
||||||
|
// Copyright 2023 The MediaPipe Authors.
|
||||||
|
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
|
||||||
|
import {
|
||||||
|
PoseLandmarker,
|
||||||
|
FilesetResolver,
|
||||||
|
DrawingUtils
|
||||||
|
} from "https://cdn.skypack.dev/@mediapipe/tasks-vision@0.10.0";
|
||||||
|
|
||||||
|
const demosSection = document.getElementById("demos");
|
||||||
|
|
||||||
|
let poseLandmarker = undefined;
|
||||||
|
let runningMode = "IMAGE";
|
||||||
|
let enableWebcamButton;
|
||||||
|
let webcamRunning = false;
|
||||||
|
const videoHeight = "360px";
|
||||||
|
const videoWidth = "480px";
|
||||||
|
|
||||||
|
// Load the Vision WASM and the Pose Landmarker model
|
||||||
|
const createPoseLandmarker = async () => {
|
||||||
|
const vision = await FilesetResolver.forVisionTasks(
|
||||||
|
"https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0/wasm"
|
||||||
|
);
|
||||||
|
poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
|
||||||
|
baseOptions: {
|
||||||
|
modelAssetPath: "https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_lite/float16/1/pose_landmarker_lite.task",
|
||||||
|
delegate: "GPU"
|
||||||
|
},
|
||||||
|
runningMode: runningMode,
|
||||||
|
numPoses: 2
|
||||||
|
});
|
||||||
|
demosSection.classList.remove("invisible");
|
||||||
|
};
|
||||||
|
createPoseLandmarker();
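// Note (not used in this demo): the options above load the "lite" pose model; the
// heavier pose_landmarker_full / pose_landmarker_heavy .task variants can be
// swapped in via modelAssetPath for better accuracy at higher compute cost.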
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 1: Click an image to detect pose and draw landmarks.
|
||||||
|
********************************************************************/
|
||||||
|
const imageContainers = document.getElementsByClassName("detectOnClick");
|
||||||
|
for (let i = 0; i < imageContainers.length; i++) {
|
||||||
|
imageContainers[i].children[0].addEventListener("click", handleClick);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function handleClick(event) {
|
||||||
|
if (!poseLandmarker) {
|
||||||
|
console.log("Wait for poseLandmarker to load before clicking!");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (runningMode === "VIDEO") {
|
||||||
|
runningMode = "IMAGE";
|
||||||
|
await poseLandmarker.setOptions({ runningMode: "IMAGE" });
|
||||||
|
}
|
||||||
|
|
||||||
|
// Remove old overlays
|
||||||
|
const allCanvas = event.target.parentNode.getElementsByClassName("canvas");
|
||||||
|
for (let i = allCanvas.length - 1; i >= 0; i--) {
|
||||||
|
const n = allCanvas[i];
|
||||||
|
n.parentNode.removeChild(n);
|
||||||
|
}
|
||||||
|
|
||||||
|
poseLandmarker.detect(event.target, (result) => {
|
||||||
|
const canvas = document.createElement("canvas");
|
||||||
|
canvas.setAttribute("class", "canvas");
|
||||||
|
canvas.setAttribute("width", event.target.naturalWidth + "px");
|
||||||
|
canvas.setAttribute("height", event.target.naturalHeight + "px");
|
||||||
|
canvas.style =
|
||||||
|
"left: 0px; top: 0px; width: " + event.target.width + "px; height: " + event.target.height + "px;";
|
||||||
|
|
||||||
|
event.target.parentNode.appendChild(canvas);
|
||||||
|
const canvasCtx = canvas.getContext("2d");
|
||||||
|
const drawingUtils = new DrawingUtils(canvasCtx);
|
||||||
|
|
||||||
|
for (const landmark of result.landmarks) {
|
||||||
|
drawingUtils.drawLandmarks(landmark, {
|
||||||
|
radius: (data) => DrawingUtils.lerp((data.from && data.from.z) ?? 0, -0.15, 0.1, 5, 1)
|
||||||
|
});
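// The radius callback above uses DrawingUtils.lerp to map each landmark's depth
// (z roughly in [-0.15, 0.1]) to a dot radius between 5 and 1, so points closer
// to the camera are drawn larger.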
|
||||||
|
drawingUtils.drawConnectors(landmark, PoseLandmarker.POSE_CONNECTIONS);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
/********************************************************************
|
||||||
|
// Demo 2: Live webcam pose detection.
|
||||||
|
********************************************************************/
|
||||||
|
const video = document.getElementById("webcam");
|
||||||
|
const canvasElement = document.getElementById("output_canvas");
|
||||||
|
const canvasCtx = canvasElement.getContext("2d");
|
||||||
|
const drawingUtils = new DrawingUtils(canvasCtx);
|
||||||
|
|
||||||
|
const hasGetUserMedia = () => !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
|
||||||
|
|
||||||
|
if (hasGetUserMedia()) {
|
||||||
|
enableWebcamButton = document.getElementById("webcamButton");
|
||||||
|
enableWebcamButton.addEventListener("click", enableCam);
|
||||||
|
} else {
|
||||||
|
console.warn("getUserMedia() is not supported by your browser");
|
||||||
|
}
|
||||||
|
|
||||||
|
function enableCam() {
|
||||||
|
if (!poseLandmarker) {
|
||||||
|
console.log("Wait! poseLandmarker not loaded yet.");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (webcamRunning === true) {
|
||||||
|
webcamRunning = false;
|
||||||
|
enableWebcamButton.innerText = "ENABLE PREDICTIONS";
|
||||||
|
} else {
|
||||||
|
webcamRunning = true;
|
||||||
|
enableWebcamButton.innerText = "DISABLE PREDICTIONS";
|
||||||
|
}
|
||||||
|
|
||||||
|
const constraints = { video: true };
|
||||||
|
|
||||||
|
navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
|
||||||
|
video.srcObject = stream;
|
||||||
|
video.addEventListener("loadeddata", predictWebcam);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
let lastVideoTime = -1;
|
||||||
|
async function predictWebcam() {
|
||||||
|
canvasElement.style.height = videoHeight;
|
||||||
|
video.style.height = videoHeight;
|
||||||
|
canvasElement.style.width = videoWidth;
|
||||||
|
video.style.width = videoWidth;
|
||||||
|
|
||||||
|
if (runningMode === "IMAGE") {
|
||||||
|
runningMode = "VIDEO";
|
||||||
|
await poseLandmarker.setOptions({ runningMode: "VIDEO" });
|
||||||
|
}
|
||||||
|
|
||||||
|
const startTimeMs = performance.now();
|
||||||
|
if (lastVideoTime !== video.currentTime) {
|
||||||
|
lastVideoTime = video.currentTime;
|
||||||
|
poseLandmarker.detectForVideo(video, startTimeMs, (result) => {
|
||||||
|
canvasCtx.save();
|
||||||
|
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||||
|
for (const landmark of result.landmarks) {
|
||||||
|
drawingUtils.drawLandmarks(landmark, {
|
||||||
|
radius: (data) => DrawingUtils.lerp((data.from && data.from.z) ?? 0, -0.15, 0.1, 5, 1)
|
||||||
|
});
|
||||||
|
drawingUtils.drawConnectors(landmark, PoseLandmarker.POSE_CONNECTIONS);
|
||||||
|
}
|
||||||
|
canvasCtx.restore();
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
if (webcamRunning === true) {
|
||||||
|
window.requestAnimationFrame(predictWebcam);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
151
process_mp4_facial.py
Executable file
@@ -0,0 +1,151 @@
#!/usr/bin/env python3
import cv2
|
||||||
|
import mediapipe as mp
|
||||||
|
from mediapipe.tasks import python
|
||||||
|
from mediapipe.tasks.python import vision
|
||||||
|
import numpy as np
|
||||||
|
from mediapipe.framework.formats import landmark_pb2
|
||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import csv
|
||||||
|
|
||||||
|
# --- NEW: Helper function to create the landmark-to-feature map ---
|
||||||
|
def create_landmark_map():
|
||||||
|
"""Creates a mapping from landmark index to facial feature name."""
|
||||||
|
landmark_map = {}
|
||||||
|
|
||||||
|
# Define the connection groups from MediaPipe's face_mesh solutions
|
||||||
|
connection_groups = {
|
||||||
|
'lips': mp.solutions.face_mesh.FACEMESH_LIPS,
|
||||||
|
'left_eye': mp.solutions.face_mesh.FACEMESH_LEFT_EYE,
|
||||||
|
'right_eye': mp.solutions.face_mesh.FACEMESH_RIGHT_EYE,
|
||||||
|
'left_eyebrow': mp.solutions.face_mesh.FACEMESH_LEFT_EYEBROW,
|
||||||
|
'right_eyebrow': mp.solutions.face_mesh.FACEMESH_RIGHT_EYEBROW,
|
||||||
|
'face_oval': mp.solutions.face_mesh.FACEMESH_FACE_OVAL,
|
||||||
|
'left_iris': mp.solutions.face_mesh.FACEMESH_LEFT_IRIS,
|
||||||
|
'right_iris': mp.solutions.face_mesh.FACEMESH_RIGHT_IRIS,
|
||||||
|
}
|
||||||
|
|
||||||
|
# Populate the map by iterating through the connection groups
|
||||||
|
for part_name, connections in connection_groups.items():
|
||||||
|
for connection in connections:
|
||||||
|
landmark_map[connection[0]] = part_name
|
||||||
|
landmark_map[connection[1]] = part_name
|
||||||
|
|
||||||
|
return landmark_map
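# Descriptive note: the map only labels landmark indices that appear in the
# connection groups above (lips, eyes, eyebrows, face oval, irises). Mesh points
# that belong only to the tessellation fall through to 'unknown' when the CSV
# rows are written below.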
|
||||||
|
|
||||||
|
# --- Helper Function to Draw Landmarks ---
|
||||||
|
def draw_landmarks_on_image(rgb_image, detection_result):
|
||||||
|
"""Draws face landmarks on a single image frame."""
|
||||||
|
face_landmarks_list = detection_result.face_landmarks
|
||||||
|
annotated_image = np.copy(rgb_image)
|
||||||
|
|
||||||
|
# Loop through the detected faces to visualize.
|
||||||
|
for face_landmarks in face_landmarks_list:
|
||||||
|
face_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
|
||||||
|
face_landmarks_proto.landmark.extend([
|
||||||
|
landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in face_landmarks
|
||||||
|
])
|
||||||
|
|
||||||
|
mp.solutions.drawing_utils.draw_landmarks(
|
||||||
|
image=annotated_image,
|
||||||
|
landmark_list=face_landmarks_proto,
|
||||||
|
connections=mp.solutions.face_mesh.FACEMESH_TESSELATION,
|
||||||
|
landmark_drawing_spec=None,
|
||||||
|
connection_drawing_spec=mp.solutions.drawing_styles
|
||||||
|
.get_default_face_mesh_tesselation_style())
|
||||||
|
mp.solutions.drawing_utils.draw_landmarks(
|
||||||
|
image=annotated_image,
|
||||||
|
landmark_list=face_landmarks_proto,
|
||||||
|
connections=mp.solutions.face_mesh.FACEMESH_CONTOURS,
|
||||||
|
landmark_drawing_spec=None,
|
||||||
|
connection_drawing_spec=mp.solutions.drawing_styles
|
||||||
|
.get_default_face_mesh_contours_style())
|
||||||
|
mp.solutions.drawing_utils.draw_landmarks(
|
||||||
|
image=annotated_image,
|
||||||
|
landmark_list=face_landmarks_proto,
|
||||||
|
connections=mp.solutions.face_mesh.FACEMESH_IRISES,
|
||||||
|
landmark_drawing_spec=None,
|
||||||
|
connection_drawing_spec=mp.solutions.drawing_styles
|
||||||
|
.get_default_face_mesh_iris_connections_style())
|
||||||
|
|
||||||
|
return annotated_image
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(description='Process a video to detect and draw face landmarks.')
|
||||||
|
parser.add_argument('input_video', help='The path to the input video file.')
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
input_video_path = args.input_video
|
||||||
|
base_name, extension = os.path.splitext(input_video_path)
|
||||||
|
output_video_path = f"{base_name}_annotated{extension}"
|
||||||
|
output_csv_path = f"{base_name}_landmarks.csv"
|
||||||
|
|
||||||
|
# --- NEW: Create the landmark map ---
|
||||||
|
landmark_to_part_map = create_landmark_map()
|
||||||
|
|
||||||
|
# --- Configuration & Setup ---
|
||||||
|
model_path = 'face_landmarker.task'
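# Assumption: face_landmarker.task sits next to this script. It is not part of this
# commit and must be downloaded separately from the MediaPipe Face Landmarker model
# page (storage.googleapis.com/mediapipe-models/face_landmarker/...).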
|
||||||
|
base_options = python.BaseOptions(model_asset_path=model_path)
|
||||||
|
options = vision.FaceLandmarkerOptions(base_options=base_options,
|
||||||
|
output_face_blendshapes=True,
|
||||||
|
output_facial_transformation_matrixes=True,
|
||||||
|
num_faces=1)
|
||||||
|
detector = vision.FaceLandmarker.create_from_options(options)
|
||||||
|
|
||||||
|
# --- Video and CSV Setup ---
|
||||||
|
cap = cv2.VideoCapture(input_video_path)
|
||||||
|
if not cap.isOpened():
|
||||||
|
print(f"Error: Could not open video file {input_video_path}")
|
||||||
|
return
|
||||||
|
|
||||||
|
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
|
||||||
|
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
|
||||||
|
fps = int(cap.get(cv2.CAP_PROP_FPS))
|
||||||
|
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
|
||||||
|
out = cv2.VideoWriter(output_video_path, fourcc, fps, (frame_width, frame_height))
|
||||||
|
|
||||||
|
# Open CSV file for writing
|
||||||
|
with open(output_csv_path, 'w', newline='') as csvfile:
|
||||||
|
csv_writer = csv.writer(csvfile)
|
||||||
|
# NEW: Write the updated header row
|
||||||
|
csv_writer.writerow(['frame', 'face', 'landmark_index', 'face_part', 'x', 'y', 'z'])
|
||||||
|
|
||||||
|
print(f"Processing video: {input_video_path} 📹")
|
||||||
|
frame_number = 0
|
||||||
|
while(cap.isOpened()):
|
||||||
|
ret, frame = cap.read()
|
||||||
|
if not ret:
|
||||||
|
break
|
||||||
|
|
||||||
|
frame_number += 1
|
||||||
|
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
|
||||||
|
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)
|
||||||
|
detection_result = detector.detect(mp_image)
|
||||||
|
|
||||||
|
# Write landmark data to CSV
|
||||||
|
if detection_result.face_landmarks:
|
||||||
|
for face_index, face_landmarks in enumerate(detection_result.face_landmarks):
|
||||||
|
for landmark_index, landmark in enumerate(face_landmarks):
|
||||||
|
# NEW: Look up the face part name from the map
|
||||||
|
face_part = landmark_to_part_map.get(landmark_index, 'unknown')
|
||||||
|
# NEW: Write the new column to the CSV row
|
||||||
|
csv_writer.writerow([frame_number, face_index, landmark_index, face_part, landmark.x, landmark.y, landmark.z])
|
||||||
|
|
||||||
|
# Draw landmarks on the frame for the video
|
||||||
|
annotated_frame = draw_landmarks_on_image(rgb_frame, detection_result)
|
||||||
|
bgr_annotated_frame = cv2.cvtColor(annotated_frame, cv2.COLOR_RGB2BGR)
|
||||||
|
out.write(bgr_annotated_frame)
|
||||||
|
|
||||||
|
# Release everything when the job is finished
|
||||||
|
cap.release()
|
||||||
|
out.release()
|
||||||
|
cv2.destroyAllWindows()
|
||||||
|
|
||||||
|
print(f"\n✅ Processing complete.")
|
||||||
|
print(f"Annotated video saved to: {output_video_path}")
|
||||||
|
print(f"Landmarks CSV saved to: {output_csv_path}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
214
process_mp4_holistic.py
Executable file
@@ -0,0 +1,214 @@
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
process_mp4_holistic.py
|
||||||
|
Process an MP4 with MediaPipe Holistic:
|
||||||
|
- Saves annotated video
|
||||||
|
- Exports CSV of face/pose/hand landmarks per frame
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python process_mp4_holistic.py /path/to/input.mp4
|
||||||
|
python process_mp4_holistic.py /path/to/input.mp4 --out-video out.mp4 --out-csv out.csv --show
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import csv
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
import cv2
|
||||||
|
import mediapipe as mp
|
||||||
|
|
||||||
|
mp_holistic = mp.solutions.holistic
|
||||||
|
mp_drawing = mp.solutions.drawing_utils
|
||||||
|
mp_styles = mp.solutions.drawing_styles
|
||||||
|
|
||||||
|
|
||||||
|
def parse_args():
|
||||||
|
p = argparse.ArgumentParser(description="Run MediaPipe Holistic on an MP4 and export annotated video + CSV landmarks.")
|
||||||
|
p.add_argument("input", help="Input .mp4 file")
|
||||||
|
p.add_argument("--out-video", help="Output annotated MP4 path (default: <input>_annotated.mp4)")
|
||||||
|
p.add_argument("--out-csv", help="Output CSV path for landmarks (default: <input>_landmarks.csv)")
|
||||||
|
p.add_argument("--model-complexity", type=int, default=1, choices=[0, 1, 2], help="Holistic model complexity")
|
||||||
|
p.add_argument("--no-smooth", action="store_true", help="Disable smoothing (smoothing is ON by default)")
|
||||||
|
p.add_argument("--refine-face", action="store_true", help="Refine face landmarks (iris, lips).")
|
||||||
|
p.add_argument("--show", action="store_true", help="Show preview window while processing")
|
||||||
|
return p.parse_args()
|
||||||
|
|
||||||
|
|
||||||
|
def open_video_writer(cap, out_path):
|
||||||
|
# Properties from input
|
||||||
|
fps = cap.get(cv2.CAP_PROP_FPS)
|
||||||
|
if fps is None or fps <= 0:
|
||||||
|
fps = 30.0 # sensible fallback
|
||||||
|
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
|
||||||
|
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
|
||||||
|
|
||||||
|
# Writer
|
||||||
|
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
|
||||||
|
writer = cv2.VideoWriter(out_path, fourcc, float(fps), (width, height))
|
||||||
|
if not writer.isOpened():
|
||||||
|
raise RuntimeError(f"Failed to open VideoWriter at {out_path}")
|
||||||
|
return writer, fps, (width, height)
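# Note: the 'mp4v' FourCC above writes MPEG-4 Part 2 video. If H.264 output is
# wanted and the local OpenCV/FFmpeg build supports it, 'avc1' is a common
# alternative FourCC (an option, not something this script requires).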
|
||||||
|
|
||||||
|
|
||||||
|
def write_landmarks_to_csv(writer, frame_idx, ts_ms, kind, landmarks, world_landmarks=None, handedness=None):
|
||||||
|
"""
|
||||||
|
landmarks: NormalizedLandmarkList (x,y,z, visibility?) -> face/hand have no visibility; pose has visibility.
|
||||||
|
world_landmarks: LandmarkList in meters (optional, pose_world_landmarks available).
|
||||||
|
handedness: "Left"|"Right"|None (we label hand sets by field name; not a confidence score here)
|
||||||
|
"""
|
||||||
|
if not landmarks:
|
||||||
|
return
|
||||||
|
|
||||||
|
# index by position; world coords may be absent or differ in length
|
||||||
|
wl = world_landmarks.landmark if world_landmarks and getattr(world_landmarks, "landmark", None) else None
|
||||||
|
|
||||||
|
for i, lm in enumerate(landmarks.landmark):
|
||||||
|
world_x = world_y = world_z = ""
|
||||||
|
if wl and i < len(wl):
|
||||||
|
world_x, world_y, world_z = wl[i].x, wl[i].y, wl[i].z
|
||||||
|
|
||||||
|
# Some landmark types (pose) include visibility; others (face/hands) don't
|
||||||
|
vis = getattr(lm, "visibility", "")
|
||||||
|
writer.writerow([
|
||||||
|
frame_idx,
|
||||||
|
int(ts_ms),
|
||||||
|
kind, # e.g., face, pose, left_hand, right_hand
|
||||||
|
i,
|
||||||
|
lm.x, lm.y, lm.z,
|
||||||
|
vis,
|
||||||
|
"", # presence not provided in Holistic landmarks
|
||||||
|
world_x, world_y, world_z,
|
||||||
|
handedness or ""
|
||||||
|
])
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
args = parse_args()
|
||||||
|
in_path = Path(args.input)
|
||||||
|
if not in_path.exists():
|
||||||
|
print(f"Input not found: {in_path}", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
out_video = Path(args.out_video) if args.out_video else in_path.with_name(in_path.stem + "_annotated.mp4")
|
||||||
|
out_csv = Path(args.out_csv) if args.out_csv else in_path.with_name(in_path.stem + "_landmarks.csv")
|
||||||
|
|
||||||
|
cap = cv2.VideoCapture(str(in_path))
|
||||||
|
if not cap.isOpened():
|
||||||
|
print(f"Could not open video: {in_path}", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
writer, fps, (w, h) = open_video_writer(cap, str(out_video))
|
||||||
|
|
||||||
|
# Prepare CSV
|
||||||
|
out_csv.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
csv_file = open(out_csv, "w", newline="", encoding="utf-8")
|
||||||
|
csv_writer = csv.writer(csv_file)
|
||||||
|
csv_writer.writerow([
|
||||||
|
"frame", "timestamp_ms", "type", "landmark_index",
|
||||||
|
"x", "y", "z", "visibility", "presence",
|
||||||
|
"world_x", "world_y", "world_z", "handedness"
|
||||||
|
])
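# Column notes: "type" is one of face / pose / left_hand / right_hand;
# "visibility" is filled only for pose landmarks; "presence" is left blank because
# Holistic does not expose it; world_x/y/z are metre-scale pose world landmarks and
# stay empty for face and hand rows; "handedness" is simply "Left" or "Right"
# taken from the result field name, not a classifier score.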
|
||||||
|
|
||||||
|
# Holistic configuration
|
||||||
|
holistic = mp_holistic.Holistic(
|
||||||
|
static_image_mode=False,
|
||||||
|
model_complexity=args.model_complexity,
|
||||||
|
smooth_landmarks=(not args.no_smooth),
|
||||||
|
refine_face_landmarks=args.refine_face,
|
||||||
|
enable_segmentation=False
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
frame_idx = 0
|
||||||
|
print(f"Processing: {in_path.name} -> {out_video.name}, {out_csv.name}")
|
||||||
|
while True:
|
||||||
|
ok, frame_bgr = cap.read()
|
||||||
|
if not ok:
|
||||||
|
break
|
||||||
|
|
||||||
|
# Timestamp (ms) based on frame index and fps
|
||||||
|
ts_ms = (frame_idx / fps) * 1000.0
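# Note: this timestamp assumes a constant frame rate. For variable-frame-rate
# files, cap.get(cv2.CAP_PROP_POS_MSEC) right after cap.read() would give the
# decoder's own timestamp instead (an alternative, not what this script does).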
|
||||||
|
|
||||||
|
# Convert to RGB for MediaPipe
|
||||||
|
image_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
|
||||||
|
image_rgb.flags.writeable = False
|
||||||
|
results = holistic.process(image_rgb)
|
||||||
|
image_rgb.flags.writeable = True
|
||||||
|
|
||||||
|
# Draw on a BGR copy for output
|
||||||
|
out_frame = frame_bgr
|
||||||
|
|
||||||
|
# Face
|
||||||
|
if results.face_landmarks:
|
||||||
|
mp_drawing.draw_landmarks(
|
||||||
|
out_frame,
|
||||||
|
results.face_landmarks,
|
||||||
|
mp_holistic.FACEMESH_TESSELATION,
|
||||||
|
landmark_drawing_spec=None,
|
||||||
|
connection_drawing_spec=mp_styles.get_default_face_mesh_tesselation_style(),
|
||||||
|
)
|
||||||
|
write_landmarks_to_csv(csv_writer, frame_idx, ts_ms, "face", results.face_landmarks)
|
||||||
|
|
||||||
|
# Pose
|
||||||
|
if results.pose_landmarks:
|
||||||
|
mp_drawing.draw_landmarks(
|
||||||
|
out_frame,
|
||||||
|
results.pose_landmarks,
|
||||||
|
mp_holistic.POSE_CONNECTIONS,
|
||||||
|
landmark_drawing_spec=mp_styles.get_default_pose_landmarks_style()
|
||||||
|
)
|
||||||
|
write_landmarks_to_csv(
|
||||||
|
csv_writer, frame_idx, ts_ms, "pose",
|
||||||
|
results.pose_landmarks,
|
||||||
|
world_landmarks=getattr(results, "pose_world_landmarks", None)
|
||||||
|
)
|
||||||
|
|
||||||
|
# Left hand
|
||||||
|
if results.left_hand_landmarks:
|
||||||
|
mp_drawing.draw_landmarks(
|
||||||
|
out_frame,
|
||||||
|
results.left_hand_landmarks,
|
||||||
|
mp_holistic.HAND_CONNECTIONS,
|
||||||
|
landmark_drawing_spec=mp_styles.get_default_hand_landmarks_style()
|
||||||
|
)
|
||||||
|
write_landmarks_to_csv(csv_writer, frame_idx, ts_ms, "left_hand", results.left_hand_landmarks, handedness="Left")
|
||||||
|
|
||||||
|
# Right hand
|
||||||
|
if results.right_hand_landmarks:
|
||||||
|
mp_drawing.draw_landmarks(
|
||||||
|
out_frame,
|
||||||
|
results.right_hand_landmarks,
|
||||||
|
mp_holistic.HAND_CONNECTIONS,
|
||||||
|
landmark_drawing_spec=mp_styles.get_default_hand_landmarks_style()
|
||||||
|
)
|
||||||
|
write_landmarks_to_csv(csv_writer, frame_idx, ts_ms, "right_hand", results.right_hand_landmarks, handedness="Right")
|
||||||
|
|
||||||
|
# Write frame
|
||||||
|
writer.write(out_frame)
|
||||||
|
|
||||||
|
# Optional preview
|
||||||
|
if args.show:
|
||||||
|
cv2.imshow("Holistic (annotated)", out_frame)
|
||||||
|
if cv2.waitKey(1) & 0xFF == 27: # ESC
|
||||||
|
break
|
||||||
|
|
||||||
|
# Lightweight progress
|
||||||
|
if frame_idx % 120 == 0:
|
||||||
|
print(f" frame {frame_idx}", end="\r", flush=True)
|
||||||
|
frame_idx += 1
|
||||||
|
|
||||||
|
print(f"\nDone.\n Video: {out_video}\n CSV: {out_csv}")
|
||||||
|
|
||||||
|
finally:
|
||||||
|
holistic.close()
|
||||||
|
writer.release()
|
||||||
|
cap.release()
|
||||||
|
csv_file.close()
|
||||||
|
if args.show:
|
||||||
|
cv2.destroyAllWindows()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
98
recognize_gesture.py
Executable file
@@ -0,0 +1,98 @@
#!/usr/bin/env python3
|
||||||
|
import argparse
|
||||||
|
import sys
|
||||||
|
import mediapipe as mp
|
||||||
|
|
||||||
|
BaseOptions = mp.tasks.BaseOptions
|
||||||
|
VisionRunningMode = mp.tasks.vision.RunningMode
|
||||||
|
GestureRecognizer = mp.tasks.vision.GestureRecognizer
|
||||||
|
GestureRecognizerOptions = mp.tasks.vision.GestureRecognizerOptions
|
||||||
|
|
||||||
|
def _first_category(item):
|
||||||
|
"""
|
||||||
|
Accepts either:
|
||||||
|
- a Classifications object with .categories
|
||||||
|
- a list of Category
|
||||||
|
- None / empty
|
||||||
|
Returns the first Category or None.
|
||||||
|
"""
|
||||||
|
if item is None:
|
||||||
|
return None
|
||||||
|
# Shape 1: Classifications with .categories
|
||||||
|
cats = getattr(item, "categories", None)
|
||||||
|
if isinstance(cats, list):
|
||||||
|
return cats[0] if cats else None
|
||||||
|
# Shape 2: already a list[Category]
|
||||||
|
if isinstance(item, list):
|
||||||
|
return item[0] if item else None
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _len_safe(x):
|
||||||
|
return len(x) if isinstance(x, list) else 0
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(description="Recognize hand gestures in a still image with MediaPipe.")
|
||||||
|
parser.add_argument("-i", "--image", default="hand.jpg", help="Path to input image (default: hand.jpg)")
|
||||||
|
parser.add_argument("-m", "--model", default="gesture_recognizer.task",
|
||||||
|
help="Path to gesture_recognizer .task model (default: gesture_recognizer.task)")
|
||||||
|
parser.add_argument("--num_hands", type=int, default=2, help="Max hands to detect")
|
||||||
|
args = parser.parse_args()
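# Example invocation (assumes a gesture_recognizer.task model has been downloaded
# into the working directory, either the canned-gestures model from the MediaPipe
# Gesture Recognizer page or a custom-trained one):
#   ./recognize_gesture.py -i hand.jpg -m gesture_recognizer.task --num_hands 2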
|
||||||
|
|
||||||
|
options = GestureRecognizerOptions(
|
||||||
|
base_options=BaseOptions(model_asset_path=args.model),
|
||||||
|
running_mode=VisionRunningMode.IMAGE,
|
||||||
|
num_hands=args.num_hands,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Load the image
|
||||||
|
try:
|
||||||
|
mp_image = mp.Image.create_from_file(args.image)
|
||||||
|
except Exception as e:
|
||||||
|
print(f"Failed to load image '{args.image}': {e}", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
with GestureRecognizer.create_from_options(options) as recognizer:
|
||||||
|
result = recognizer.recognize(mp_image)
|
||||||
|
|
||||||
|
if result is None:
|
||||||
|
print("No result returned.")
|
||||||
|
return
|
||||||
|
|
||||||
|
n = max(
|
||||||
|
_len_safe(getattr(result, "gestures", [])),
|
||||||
|
_len_safe(getattr(result, "handedness", [])),
|
||||||
|
_len_safe(getattr(result, "hand_landmarks", [])),
|
||||||
|
)
|
||||||
|
if n == 0:
|
||||||
|
print("No hands/gestures detected.")
|
||||||
|
return
|
||||||
|
|
||||||
|
for i in range(n):
|
||||||
|
handed = None
|
||||||
|
if _len_safe(getattr(result, "handedness", [])) > i:
|
||||||
|
cat = _first_category(result.handedness[i])
|
||||||
|
if cat:
|
||||||
|
handed = cat.category_name
|
||||||
|
|
||||||
|
top_gesture = None
|
||||||
|
score = None
|
||||||
|
if _len_safe(getattr(result, "gestures", [])) > i:
|
||||||
|
cat = _first_category(result.gestures[i])
|
||||||
|
if cat:
|
||||||
|
top_gesture = cat.category_name
|
||||||
|
score = cat.score
|
||||||
|
|
||||||
|
header = f"Hand #{i+1}" + (f" ({handed})" if handed else "")
|
||||||
|
print(header + ":")
|
||||||
|
if top_gesture:
|
||||||
|
print(f" Gesture: {top_gesture} (score={score:.3f})")
|
||||||
|
else:
|
||||||
|
print(" Gesture: none")
|
||||||
|
|
||||||
|
# If you want pixel landmark coordinates later:
|
||||||
|
# if _len_safe(getattr(result, "hand_landmarks", [])) > i:
|
||||||
|
# for j, lm in enumerate(result.hand_landmarks[i]):
|
||||||
|
# print(f" lm{j}: x={lm.x:.3f} y={lm.y:.3f} z={lm.z:.3f}")
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
2
server_holistic.sh
Executable file
@@ -0,0 +1,2 @@
echo "Go to: http://localhost:8001/holistic.html"
python -m http.server 8001
14
source_activate_venv.sh
Executable file
@@ -0,0 +1,14 @@
#!/bin/bash

# ALERT: source this script, don't run it directly:
#   source source_activate_venv.sh

if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  echo "This script must be sourced, not run directly."
  echo "source source_activate_venv.sh"
  exit 1
fi

# rest of your script here
echo "Script is being sourced. Continuing..."
source ./.venv/bin/activate