Tactile Language Learning

Overview

How can we make language learning more tactile for students to enhance pronunciation and writing skills?

Mandarin Chinese presents a unique learning challenge: characters encode both meaning and pronunciation, and handwriting stroke order shapes fluency. Most digital learning tools address these separately — or not at all.

This project creates a physical system where spoken Mandarin directly controls physical blocks that form characters. Speaking a character correctly moves the blocks; the blocks' movement traces the correct stroke order. Pronunciation and writing are learned simultaneously, through interaction.

The Challenge

Connecting voice input to physical output through machine learning.

The system scope was limited to five Mandarin characters: 一 (yī, "1"), 二 (èr, "2"), 三 (sān, "3"), 口 (kǒu, "mouth"), and 日 (rì, "sun"). Building even this narrow vocabulary required integrating audio capture, machine learning classification, and motor control.

01

Audio Input & Machine Learning

Audio captured through microphone is sent to Wekinator — an open-source machine learning tool — which classifies five voice gestures at 1,600 inputs via port 6448.

02

Motor Output

Processing receives the Wekinator classification and controls motors that move physical blocks, forming three-dimensional character visualizations.

Wekinator ML interface and motor circuit.

System process: audio input → ML classification → motor output → character formation.

Outcome

Physical blocks that respond to voice — forming characters in real time.

The completed prototype demonstrates the full loop: a student speaks a Mandarin character into the microphone; Wekinator classifies the pronunciation; the correct motor sequence fires; the blocks move to form the character in the correct stroke order.

The tactile feedback loop reinforces both pronunciation accuracy and character formation in a way that screen-based tools cannot replicate.

The character 口 (kǒu, "mouth") formed by the physical block system.

Results

A cyber-physical model for embodied language learning.

The project demonstrated that machine learning classification can be integrated into a physical educational tool within a 5-week prototype cycle. By linking voice input directly to motor-driven physical output, the system creates a new modality for language learning — one that makes the invisible (pronunciation accuracy) visible and the abstract (stroke order) physical.

Video and cover photography by Marco Da Re and Kevin Lee.