Translation from one language to another is often like swapping variable values; you need something in the middle. That’s why translators were invented, but the opportunity for them to interject obscene comments made communication between people of different languages uneasy. And since my understanding of other languages is a bit rusty, and my group needed a project with a biomedical bent, we decided to translate to American English from American Sign Language. That’s right, you should be really excited right now.
Pretty simple premise, if you ask me. In fact, if it’s on Wikipedia then everybody should already know about it. There are two ways to go about sign language translation, and both need to measure a person’s hand movements. This can be done visually or mechanically; because it’d be a bit lame to have to carry around the video cameras and computers necessary to do the processing, my group went with a glove-based system. By the by, ‘my group’ consisted of a few buds in CMU’s Biomedical Engineering Design Capstone class: Allen Ambulo, Andrew S.D. Tsai, Michelle Lin, Sherry Huang, and Eric Wideburg. We also had the awesome Professors Dr. Conrad Zapanta and Dr. James Antaki.
So, clearly, the idea of a sign-language-recognizing glove is not new, but two things have not been done, or at least we didn’t find evidence of them:
1) With the abundance of iPods and other media devices, why can’t this device also make noise? And if it could make noise that corresponds to whatever is being signed by the user, that’d be extra impressive.
2) Nobody likes to spell, so why do all currently made gloves mainly focus on finger spelling? Damn, you’d have some strong hands if all you were able to do was finger spell… Why not include good gesture recognition? Wiimotes do it, and 3-year-olds are better than me at playing Nintendo Wii.
Anyway, let’s talk about implementation. I won’t go into the significantly boring detail in this post; I’ll probably put up guides on specific aspects (the Recognition, the Sensors, etc.) of the project later on.
Like any good embedded system, this glove is merely a system of input and output, with some processing in the middle. Like a kind of mathematical system of equations sandwich. Yum. Input comes from the sensors or from user input. Output is the LCD screen and the tiny speaker I stole off one of those cards that sing at you (thanks Grandma for the birthday card! I really like it!).
There are two things to look for in sign language: the movement of the hands and the position of the fingers. Thus, we’ve got an accelerometer and flex sensors. While some might see the limitations of just these sensors, I had a few workarounds for this initial proof-of-concept version. For user input there’s a trackball, the same kind as those found on BlackBerrys, and I had an old ominous-looking LCD screen (red on black, ooooh~) lying around.
All this plugs into an Arduino Mega, because it has a lot of inputs/outputs, and looks badass when strapped to your wrist. Speech output is handled by a SparkFun-made SpeakJet Arduino shield; think of it as text-to-speech. It is capable of pronouncing a list of phonetic sounds, which you can configure it to say in the right order to make words, or gobbledegook. This pushes out an audio signal to the previously mentioned tiny speaker, and the results are mirrored onto the LCD screen.
That about does it for a hardware overview; software from here on out. The glove you see is Version 1 because so much time was spent getting the hardware together and reliable that the software is not as robust as a daily-use version would need to be. Don’t get me wrong, this thing can work fine and dandy, but there are some improvements I would like to make. Let’s start from the top of what V.1 is, and then I’ll discuss future improvements.
Sensor data comes in and, due in no small part to how the sensors are attached, is pretty free of movement artifacts. Because the data is reliable and consistent, the only filtering is a simple low-pass filter: an exponential moving average of the data. Mechanically, the sensors are attached to the glove with small metal brackets made from garden wire (this took forever…) or with button snaps like you’d find on clothing (hey, we are dealing with a glove here). This allowed the flex sensors to remain fixed at their base and slide through the brackets, but also allowed the stretchiness of the glove to act as the spring, and the user’s hand as the damper, in a simple spring-mass-damper system. Fancy words for, “I sewed things on to a stretchy glove. And the glove was on my hand at the time.”
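For the curious, here is a minimal sketch of what an exponential moving average does to a stream of ADC readings. It is written in Python for readability and is not the glove’s actual firmware; the function name `ema_filter`, the smoothing factor `alpha`, and the sample values are all assumptions made up for illustration.

```python
def ema_filter(samples, alpha=0.2):
    """Exponential moving average: y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha is a hypothetical smoothing factor between 0 and 1;
    smaller values smooth more but respond more slowly.
    """
    smoothed = []
    y = None
    for x in samples:
        # Seed the filter with the first sample, then blend in each new one.
        y = x if y is None else alpha * x + (1 - alpha) * y
        smoothed.append(y)
    return smoothed


# Made-up flex sensor readings with one spike (700) standing in for a glitch.
raw = [512, 515, 509, 700, 511, 513]
smoothed = ema_filter(raw, alpha=0.2)
```

The spike gets pulled most of the way back toward its neighbours, and the filter only ever needs to remember one previous value per sensor, which is handy when RAM is tight.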
After the sensor data comes in, is converted to digital values by the Arduino’s on-board ADC, and is slightly filtered, it gets formatted into a simple state matrix of eight values: 5 for the flex sensors and 3 for the accelerometer (one per axis). This state matrix gets run through a Naive Bayesian Classifier whenever the state has stabilized, i.e. the user has performed a gesture/letter and held that position for a specified amount of time. This delay signals the microprocessor to compute, from the current state of the sensor data, which of the gestures the Arduino knows about was most likely just performed. Because I have no idea what I’m doing when it comes to ASL, I configured the delay to be 2 seconds for myself. Gimme a break, I learned cello on weekends, not ASL.
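The “wait until the state has stabilized” step can be sketched roughly like this. Again, this is illustrative Python and not the firmware: the `HoldDetector` class, the `make_state` helper, the per-channel `tolerance`, and the window length are hypothetical. On real hardware the window would be the hold time multiplied by whatever the sample rate happens to be (2 seconds’ worth of samples, in my case).

```python
from collections import deque


def make_state(flex, accel):
    """Pack 5 flex readings and 3 accelerometer axes into one 8-value state."""
    return list(flex) + list(accel)


class HoldDetector:
    """Reports True once every channel has stayed within `tolerance`
    over the last `hold_samples` states (both values are assumptions)."""

    def __init__(self, hold_samples, tolerance):
        self.window = deque(maxlen=hold_samples)
        self.tolerance = tolerance

    def update(self, state):
        self.window.append(tuple(state))
        if len(self.window) < self.window.maxlen:
            return False  # not enough history yet
        # zip(*window) regroups the history by channel instead of by sample.
        for channel in zip(*self.window):
            if max(channel) - min(channel) > self.tolerance:
                return False  # this channel is still moving
        return True


# Made-up readings: the hand moves for three samples, then holds a pose.
detector = HoldDetector(hold_samples=4, tolerance=5)
readings = [100, 110, 120, 300, 301, 299, 300]
results = [detector.update(make_state([r] * 5, [r] * 3)) for r in readings]
```

Only the last entry of `results` is True: that is the first moment at which the whole window sits inside the tolerance band, and that is when the classifier would be asked for its best guess.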
After the classifier has done its duty, the Arduino takes the gesture it thinks has just been performed and looks it up in its dictionary; for us, this was the alphabet and like 10 words, due to memory constraints. Each recognizable gesture corresponds to an entry in this recognition dictionary, which translates the gesture into the requisite phonetic commands for the SpeakJet. These phonetics get sent to the SpeakJet chip, and a freakishly robotic voice then says the word. Hey, they included a volume dial, thankfully.
Some optimizations included using letter frequencies in the classifier (an ‘e’ is more likely to show up than a ‘z’), and the code was tuned for performance; it’s pretty slick how much a 16 MHz processor can do. While there were more optimizations that could be done (e.g. letter frequency based on the previous letter), they just were not worth it. The Bayesian Classifier is very limited in capability, but great for a proof-of-concept.
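To make the classifier idea concrete, here is a rough Python sketch of a naive Bayes classifier over the 8-value state, with letter frequencies used as class priors. It is a sketch under assumptions, not the glove’s code: the Gaussian likelihood with one shared `sigma`, the per-gesture mean vectors, and the approximate frequency numbers are all invented for illustration.

```python
import math


def log_gaussian(x, mean, sigma):
    """Log of the normal density N(x; mean, sigma^2)."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mean) ** 2 / (2 * sigma ** 2))


def classify(state, means, priors, sigma=20.0):
    """Return the label maximizing log prior + sum of per-channel log likelihoods.

    "Naive" means each of the 8 channels is treated as independent given
    the gesture, so the per-channel log likelihoods simply add up.
    """
    best_label, best_score = None, -math.inf
    for label, mean_vec in means.items():
        score = math.log(priors[label])
        for x, m in zip(state, mean_vec):
            score += log_gaussian(x, m, sigma)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical averaged sensor values per gesture (5 flex + 3 accelerometer).
means = {
    "e": [620, 610, 600, 590, 580, 340, 330, 400],
    "z": [300, 600, 610, 590, 585, 350, 500, 410],
}
# Approximate relative letter frequencies in English text, used as priors.
priors = {"e": 0.127, "z": 0.0007}

clear_z = classify(means["z"], means, priors)
halfway = [(a + b) / 2 for a, b in zip(means["e"], means["z"])]
ambiguous = classify(halfway, means, priors)
```

A reading that sits right on the stored ‘z’ values still comes out as ‘z’, since the sensor evidence swamps the prior. A reading exactly halfway between the two gestures comes out as ‘e’, because the likelihoods tie and the letter frequency breaks the tie.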
Thus, for Version 2, I’ve got Hidden Markov Models planned. The Arduino Mega will become a “training” unit (both for the user and for the HMM), and I want to miniaturize the glove down to an Arduino Pro Mini. HMMs are awesome for identifying things that can’t be observed directly; they are frequently used for speech recognition. But yeah, things to do, things to do. I’ll revise this article shortly, as its level of ‘snarky’ is probably too high.







You could definitely commercialize this.
this is amazing! i can’t wait for a video! i’m probably going to take a course in ASL when i go to college next year, and this would be awesome to have as a training tool. there you go, you have 2 things to accomplish: 1. a video – 2. make them for sale
great job!
Amazing! I would love to see a video of this working.
Hey! Beautifully done! I’m starting with Arduino now (really, i got mine not a week ago) and human interface of this kind has been my major goal for some time. I’m planning my graduation project in Product Design to be something like your glove, some kind of prosthesis or enhancer for people. Very cool glove. Cheers!
I’ve been working on a computer vision based approach to do hand pose extraction, inspired by sign language. I use it to make music.
http://code.google.com/p/sonic-gesture/
maybe you will like it.
Not bad at all!
However you can only measure the efficiency of your work when working with more than 10 words
Neat, I have 2 original powergloves in the basement, I should pull them out and see if they still work.
Great project.
Great project that opens the door to other possible applications. Will you be sharing the code?
when it’s not embarrassing to share, sure! haha, it’s pretty rudimentary right now, but i hope to make version 2.
Have you gotten around to cleaning up the code? I’m working on a similar gesture recognition project and would love to see how you did it.
I am in contact with TEC, a technological university in México.
Could you send me a full article on this application?
Thank you very much
Gabriel parrodi
http://WWW.MELEKTRONIKOS.COM
Haha, just heard about this in 18-200 today! Good job, guys! (Also I think it made HaD)
Your project is amazing!!! I hope you could make it open source/diy… This is very good for charity work. anybody could learn sign language and confirm it with this tool, and a mute person could communicate with anyone…
This is a great project thank you friends
now i’m surprised..
actually this is my project at my university, I’m gonna make something similar to what you guys did…
and I need a few clues, if you guys want to share them with me:
1. The output data: did you train on the data using a neural network or fuzzy logic, or maybe just put the raw data in your code?
2. Is it enough to use just 1 flex sensor on every finger? Because some ASL gestures could read as similar, I had planned to use 2 flex sensors on every finger.
sorry for the slow response:
1) we took raw data and stored averaged values over a few trials for use in the bayesian classifier. it’s a really simple way to probabilistically determine what something is doing; i’ll try to make a post about it soon.
2) each finger is basically two joints (the knuckle and the first joint), and the flex sensors give slightly different data if you’re flexing one, the other, or both, because the two joints are not exactly the same, the sensor flexes differently at different places, etc etc. While this is not a perfect solution, it was the best we could do. There are much more exotic options if people are up for it, but that would be an entirely different project (small strain gauges, fiber optics, emf/rf interference, etc etc)
what about the power supply? did you build your own power supply?
we are working on the similar project for our senior project now, wondering if you could help us out….
thanks
This is flippin’ awesome dude! I’m working on something similar and here you’ve done a proof of concept. Great job!
Hey, just placed a link to your project at my page.
Congrats on the project.
> we decided to translate from American English to American Sign Language
Shouldn’t that be “we decided to translate from American Sign Language to American English”?
Nice work, the project makes good sense to me (no need for the mute to have a translator around all the time).
thanks for the catch! project documentation is always the most valued, most rushed part of any project.
My brother is deaf, and this promises a tremendous improvement for the Deaf.
His wife has developed numerous ASL courses, and may be a significant source of development insight for you (without your having to learn ASL; it does take a while to learn, and regular practice to retain).
They reside in W Va.
If they or I can be of assistance (no charge), please let me know.
This could be worth a Nobel prize my friend.
I wish I had more time to work on this, but unfortunately for now it’s just a side project. Apparently similar devices were made in the ’90s but never caught on commercially. I believe it’s definitely viable nowadays as a multifunction device: ASL users could use this for communication when needed, or to train others in ASL, or as a more general communication device, such as a novel user interface for a computer.
Hi,
A very innovative project done very elegantly in the most efficient possible way. I have always wondered if those FSRs have to be good for something else. You have provided the best possible motivation for me to create my own version. ARM platforms these days are more scalable, with more RAM for real-time signal processing, and hence are more conducive to these types of applications.
Thanks for sharing this with the rest of us. I am truly inspired!
regards and best of luck,
Ananth
Check this out:
Jim Kramer’s Talking Glove (1990s)
http://www.stanford.edu/group/rrd/TTran/glove.html
amazing project!
How did you program the arduino board ? and interface the flex sensor?
Could the glove be adapted for animals? You know chimpanzees can perform ASL. In theory, they could “vocalize” their thoughts. Now that would be interesting!
Of course! I imagine you’re thinking of the movie “Congo” or something, but this could totally be used for animals if it was adapted and the data retrained. While my implementation was a bit naive, a more modern device could be easily transferable.