
One thing about Matlab that’s good: everyone uses it. Another thing: plots are pretty useful!

One bad thing: matrix manipulations are slow. Wait, really? Matlab, as in Matrix Laboratory? A laboratory of something that doesn’t do its job well is never a good thing…

Enter MMX. It brings fast, multi-threaded matrix multiplication to Matlab, similar in spirit to ndfun or mtimesx. Who doesn’t want fast matrix stuff in Matlab? I’m not 100% sure what the MMX stands for, but Yuval Tassa wrote it with my contributions to be a faster way of crunching some sweet numbers. I can let some pictures do the non-verbal talking:

aka squiggly lines

What is happening here is that many small matrix manipulations can be completed in parallel if the matrices are stacked: N-by-M matrices, with D pages. Two stacks of matrix pages can be processed at the same time, with one computational thread per page. What is happening around dimension 36 in the plot above is that some optimization libraries like to sub-divide a single matrix to be handled by multiple threads; for matrices this small, that threading overhead is terrible for performance. I would like to note that the above shows the natively compiled C code doing ridiculously well; it will pretty quickly reach the natural computational capacity of your computer (read: number of cores).
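To make the page-per-thread idea concrete, here is a rough C++ sketch (this is not MMX's actual source, and the function name is mine): each page of the stacked A and B arrays gets its own thread, and no single small multiply is ever split across threads.

```cpp
// One-thread-per-page multiply of stacked matrices, the idea behind MMX.
// A holds `pages` column-major n x m matrices back to back; B likewise m x k.
#include <vector>
#include <thread>
#include <cstddef>

// C[p] = A[p] * B[p] for every page p.
void page_multiply(const std::vector<double>& A, const std::vector<double>& B,
                   std::vector<double>& C,
                   std::size_t n, std::size_t m, std::size_t k, std::size_t pages)
{
    auto worker = [&](std::size_t p) {
        const double* a = &A[p * n * m];
        const double* b = &B[p * m * k];
        double* c = &C[p * n * k];
        for (std::size_t j = 0; j < k; ++j)
            for (std::size_t i = 0; i < n; ++i) {
                double s = 0.0;
                for (std::size_t t = 0; t < m; ++t)     // naive inner product
                    s += a[t * n + i] * b[j * m + t];   // column-major indexing
                c[j * n + i] = s;
            }
    };
    std::vector<std::thread> pool;
    for (std::size_t p = 0; p < pages; ++p)
        pool.emplace_back(worker, p);   // one computational thread per page
    for (auto& t : pool) t.join();
}
```

The point is the granularity: each small multiply stays on one core, so you never pay the sub-division overhead that causes that dip in the plot.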

How do we get the speedups over Matlab code or even compiled C? BLAS. Basic Linear Algebra Subprograms are a set of low-level routines that are a nerdy way of getting more power from your engine: highly optimized and endlessly benchmarked to provide efficient computations. Many libraries are available, but we went with Intel’s MKL (Math Kernel Library). While usually commercial, you can test out the headers and libraries for BLAS for free. It would be awesome to see how CUDA or OpenCL does with this in the future (maybe).

In summary, using BLAS and correctly allocating software threads provides a much faster way of computing many matrix manipulations. While you may marginally have to rethink some scripts to take true advantage of MMX, the improvements are obvious. Check it out and bug me if you have questions.

One of the interesting parts of robotic learning is defining what can be learned by programming the structure that the robot will operate in. I’m not even to the point of a robot that has hands or talks or whatever, but an input/output system that somehow “thinks”, and in this case, learns to think new things. However, implementation belies functional ability; how you make something, not just what you make it do, defines how it works.

One of the projects I’m working on at UW, with Mike Chung of the Neural Systems Lab, is a robot that uses human interaction to make block structures. It’s a start to something more interesting. Basically, given an input of blocks in a certain working area (the tabletop), how can the robot learn certain structures (the shapes the blocks create) and potentially create them on its own, or with the user? An example might be demonstrating a “square” shape with four blocks to teach the robot, and then, after putting down three blocks, allowing the robot to place the fourth to finish the square (or make a T shape, whatever). Of course, as an engineer, you could program it to recognize these things by hand, or allow the system to learn them itself, but what the system measures and how it does so determines how the robot functions: it simply can’t be very useful (i.e. approaching human complexity) if the underlying system is not complicated. But that’s a butthurt post for another day (of philosophical BS).

Instead, here’s a brief overview of making an input system for this kind of robot.

Our first version of this sort of framework (ugh, I hate that word) tracked color blobs (bright foam blocks) and let the system recognize a before and after state so it could learn some actions. This fell short in a bunch of different ways, but it was able to localize blobs in a 3D space around the table. A note on hardware: Kinect. Yeah, there’s nothing fancy going on, which led to some problems, but again, post for another day.

Version two: the real deal. Thanks to PCL and its development outside of ROS, it’s awesomely easy to do complicated things like grab a frame from the Kinect, do a spatial segmentation of a specific area, detect and remove the table’s plane, and do a Cartesian cluster extraction of the remaining points in the depth point cloud to get a block, all at about 12 frames per second on an average computer. By golly that’s great!
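For flavor, here is roughly what that cluster-extraction step does, minus PCL. This is a naive O(n²) sketch with made-up names, not PCL's EuclideanClusterExtraction (which uses a kd-tree), but the flood-fill idea is the same: points within a distance tolerance of each other get grouped, and each surviving group is a candidate block.

```cpp
// Toy Euclidean cluster extraction: group 3D points (table plane already
// removed) into clusters by distance. One cluster ~ one block.
#include <vector>
#include <cmath>
#include <cstddef>

struct Pt { double x, y, z; };

std::vector<std::vector<std::size_t>>
extract_clusters(const std::vector<Pt>& pts, double tol, std::size_t min_size)
{
    std::vector<bool> used(pts.size(), false);
    std::vector<std::vector<std::size_t>> clusters;
    for (std::size_t seed = 0; seed < pts.size(); ++seed) {
        if (used[seed]) continue;
        std::vector<std::size_t> cluster{seed};   // flood-fill from the seed
        used[seed] = true;
        for (std::size_t i = 0; i < cluster.size(); ++i) {
            const Pt& p = pts[cluster[i]];
            for (std::size_t j = 0; j < pts.size(); ++j) {
                if (used[j]) continue;
                double dx = p.x - pts[j].x, dy = p.y - pts[j].y, dz = p.z - pts[j].z;
                if (std::sqrt(dx*dx + dy*dy + dz*dz) <= tol) {
                    used[j] = true;
                    cluster.push_back(j);          // grow the cluster
                }
            }
        }
        if (cluster.size() >= min_size)
            clusters.push_back(cluster);           // reject stray noise points
    }
    return clusters;
}
```

The `min_size` threshold is what throws away sensor speckle, the same job PCL's `setMinClusterSize` does.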


Some obvious problems: what happens if the block moves? What about occlusion? What about resource contention? What about giving that data to another process?

1) Being academia, we can make all the assumptions we want => blocks don’t move after they’ve been placed. However, I’m not an academic. Blocks were detected if a certain cluster of points persisted for a time without moving, and a detected block had a certain stickiness in persisting, so that occlusion for a few frames wouldn’t lose sight of it. Hey, if I wanted a PhD in tracking blocks with CV, this would be a different kind of post.

2) A neat trick with PCL is using octrees. In fact, they can be built with two pointers to two different point clouds, so you can do comparisons between them, such as measuring differences. If a user’s arm moves within the frame, that movement will be picked up as a change and ignored by this technique. Again, we’re assuming that blocks don’t move.
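The octree comparison can be sketched without PCL too. Here is the same idea flattened to a single voxel level, with my own names (PCL's change detector does this hierarchically): hash cloud A's occupied voxels, then any point of cloud B that lands in a never-occupied voxel counts as change, like a moving arm.

```cpp
// Voxel-occupancy change detection between two point clouds.
#include <vector>
#include <unordered_set>
#include <cmath>
#include <cstdint>
#include <cstddef>

struct P3 { double x, y, z; };

static std::uint64_t voxel_key(const P3& p, double res)
{
    // pack the three voxel indices into one 64-bit key (21 bits each)
    auto idx = [res](double v) {
        return static_cast<std::uint64_t>(
            static_cast<std::int64_t>(std::floor(v / res)) & 0x1FFFFF);
    };
    return (idx(p.x) << 42) | (idx(p.y) << 21) | idx(p.z);
}

// Indices of points in `now` whose voxel was empty in `before`.
std::vector<std::size_t> changed_points(const std::vector<P3>& before,
                                        const std::vector<P3>& now, double res)
{
    std::unordered_set<std::uint64_t> occupied;
    for (const P3& p : before) occupied.insert(voxel_key(p, res));
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < now.size(); ++i)
        if (!occupied.count(voxel_key(now[i], res)))
            changed.push_back(i);   // new occupancy: movement or a new object
    return changed;
}
```

Anything flagged here that does not persist frame after frame (an arm passing through) gets ignored; anything that persists is a candidate new block.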

3) Everything is very nicely resource locked to prevent fighting over specific point clouds / Kinect input, but I think I can optimize this more. In the next version, of course…

4) Apache Thrift makes it super easy to have this code executed as a remote procedure call, and with data passing it connects easily to a client program. In our case, it was connected to ROS to do some reference frame transformations (don’t ask why), pushing the data to another ROS node which then called out commands to the MATLAB engine. A separate ROS node, of course, to have the Matlab stuff controlled in a separate process. Matlab, of course, because it’s academia, and there are some libraries common to the subject of Bayesian networks.

Anyway, here’s a vid of the single layer HMM working to discretize the location of the block (we’re basically just looking for up, down, left, right relative to the last block). The second layer will come soon! I’d like to think that the robot manipulation part of this will come soon too, but that’s a post for another day.

Note the jump when the block is placed to the left.


FYI, on most *Nix systems you can run the following to do a screen capture to a video. Bam, simple.

ffmpeg -f x11grab -s 800x600 -r 25 -i :0.0 -sameq ~/out.mpg


In the course of development of the Water-to-Wine-o’matic, I’ve come across a way to make wine into… better wine! Apologies for the terrible cell phone pictures.

It's already perfect; what products from Costco are not?

I’ll start with a brief, non-sommelier description of the situation. Actually, I’ll give an ENGINEER’s description: it all starts when grapes are fermented with yeast. While I’m sure there are some optimizations here, it probably can’t be done for cheap. After some more work I don’t care about, the wine is put into bottles with a few target characteristics. Each bottling is designed such that there’s a target time at which the wine has its best drinkability, because the wine changes over time in the bottle. From what I understand this is a very complicated process involving all the chemicals within the bottle; some bottles age much better, and longer, than others.

Can we speed this up? In short, no, not really. However, we can induce a chemical change to improve a wine’s flavor a bit. The target wine should taste a bit young, and you’ll need a French press or an awesome blender.

Vitamix watching jealously

Step 1) Pour servings of wine into french press (or blender).

Step 2) Pump french press to froth the wine (or hit the blender on high) for about 20-30 seconds. This isn’t a consistent number between wines, of course.

Step 3) ??? Let the wine chill out and the bubbles to dissipate, I guess.

Step 4) Serve and enjoy!

What’s going on here, in both the French press and the blender, is that the wine is getting hit with a ton more oxygen than by just letting it breathe in a glass. This induces some chemical changes that remove that new, young taste, giving you a slightly more mature wine. While I’d love to pass some samples through a spectrometer, I can only give a subjective observation based on my taste. It is noticeably not the same wine as before the process, and personally I’d prefer it after this hackery. I’m sure some would balk at the thought of this man-handling of their booze, but hey, not their kitchen.

Ah, the MSP430. What a widely used and smartly designed little processor. However, it’s not without its quirks. Its idiosyncrasies. Its problems. I spent about a week trying to figure out why a piece of code given to me to build off of wasn’t working when I attempted to run the board at a lower voltage. What’s that? A calibrated CPU frequency is not the same frequency at all times? Well I’m dumb… that is a problem.

Running an MSP430 at lower voltages changes the frequency of the internal DCO, and high frequency external crystals have their own entire set of problems. Not to mention clock settings between chips are not transferable: the DCO and BCS values for one chip don’t produce the same frequency on another, meaning that every chip needs to be calibrated to run at a custom frequency before running the actual code you’re trying to deploy. Oh bother.

Building off of this excellent piece of work here, I modified it for the series of chips used in this wireless power project. First off, it doesn’t overwrite the default calibrated settings of the board; using the setupClk() macro you can load your custom clock speed in your own application. Secondly, the assumptions: it needs a 32kHz watch crystal, and the timer settings need SMCLK as the timer clock and ACLK as the capture input, so you’ll have to find the right timer settings for your series of chip. Finally, to get this stuff working at a lower clock speed, you can set your programmer’s voltage to be lower and calibrate it that way, or, if it can’t go low enough (again, working on wireless power stuff), you’ll just have to guess. Ah, precision.

If this is sufficiently helpful, I can post other workarounds to various MSP430 problems I’ve encountered.

Download link: MSP430 Calibration

// Calibrate DCO settings - based on TI code example
// MSP430F20xx Demo - DCO Calibration Constants Programmer
//
//                MSP430F20xx
//              ---------------
//          /|\|            XIN|-
//           | |               | 32kHz
//           --|RST        XOUT|-
//             |               |
//             |           P1.1|--> LED
//             |           P1.4|--> SMCLK = target DCO
//
// Original code by A. Dannenberg, Texas Instruments Inc., May 2007
// Built with CCE Version: 3.2.0 and IAR Embedded Workbench Version: 3.42A

#include <msp430x22x4.h>

// The calibration relationship:
//   target frequency = delta count x divided ACLK (32768Hz / 8 = 4096Hz)
//   SMCLK runs at the DCO frequency being calibrated

#define DELTA_1MHZ    245    // 245 x 4096Hz = 1003520Hz, ~1.00MHz
#define DELTA_2MHZ    489    // 489 x 4096Hz = 2002944Hz, ~2.00MHz
#define DELTA_4MHZ    978    // 978 x 4096Hz = 4005888Hz, ~4.01MHz
#define DELTA_4237    1035   // 1035 x 4096Hz = 4239360Hz, ~4.24MHz
#define DELTA_8MHZ    1953   // 1953 x 4096Hz = 7999488Hz, ~8.00MHz
#define DELTA_12MHZ   2930   // 2930 x 4096Hz = 12001280Hz, ~12.00MHz
#define DELTA_16MHZ   3906   // 3906 x 4096Hz = 15998976Hz, ~16.00MHz

#define SMCLK_PIN     BIT1
#define ACLK_PIN      BIT0

#define INFO_A_START  (0x10C0)
#define INFO_A_CALIB  (0x10F6)
#define INFOA_ADDR_MBCS (0x0001 + INFO_A_CALIB)
#define INFOA_ADDR_MDCO (0x0000 + INFO_A_CALIB)

// Load the custom calibration constants in your own application
#define setupClk() do { \
    DCOCTL = *((unsigned char *)INFOA_ADDR_MDCO); \
    BCSCTL1 = *((unsigned char *)INFOA_ADDR_MBCS); \
  } while (0)

unsigned char CAL_DATA[10];             // Temp. storage for constants
volatile unsigned int i;
int j;
char *Flash_ptrA;                       // Segment A pointer

void Set_DCO(unsigned int Delta_L, unsigned int Delta_H);

int window = 0;                         // Allowed slack in the capture count

void main(void)
{
  unsigned int d_l, d_h;

  WDTCTL = WDTPW + WDTHOLD;             // Stop watchdog
  for (i = 0; i < 0xfffe; i++);         // Delay for XTAL stabilization
  P1OUT = 0x00;                         // Clear P1 output latches
  P1SEL = 0x10;                         // P1.4 SMCLK output
  P1DIR = 0x12;                         // P1.1,4 output

  P2DIR |= SMCLK_PIN + ACLK_PIN;        // Set P2.0,1 to be outputs
  P2SEL |= SMCLK_PIN + ACLK_PIN;        // Route ACLK/SMCLK out for measurement
  P2OUT = SMCLK_PIN + ACLK_PIN;

  j = 0;                                // Reset constant-storage pointer

  d_l = DELTA_4237 - window;            // Custom frequency goes first, so
  d_h = DELTA_4237 + window;            // setupClk() picks it up
  Set_DCO(d_l, d_h);                    // Set DCO and obtain constants
  CAL_DATA[j++] = DCOCTL;
  CAL_DATA[j++] = BCSCTL1;

  d_l = DELTA_16MHZ - window;
  d_h = DELTA_16MHZ + window;
  Set_DCO(d_l, d_h);                    // Set DCO and obtain constants
  CAL_DATA[j++] = DCOCTL;
  CAL_DATA[j++] = BCSCTL1;

  d_l = DELTA_12MHZ - window;
  d_h = DELTA_12MHZ + window;
  Set_DCO(d_l, d_h);                    // Set DCO and obtain constants
  CAL_DATA[j++] = DCOCTL;
  CAL_DATA[j++] = BCSCTL1;

  d_l = DELTA_8MHZ - window;
  d_h = DELTA_8MHZ + window;
  Set_DCO(d_l, d_h);                    // Set DCO and obtain constants
  CAL_DATA[j++] = DCOCTL;
  CAL_DATA[j++] = BCSCTL1;

  d_l = DELTA_1MHZ - window;
  d_h = DELTA_1MHZ + window;
  Set_DCO(d_l, d_h);                    // Set DCO and obtain constants
  CAL_DATA[j++] = DCOCTL;
  CAL_DATA[j++] = BCSCTL1;

  Flash_ptrA = (char *)INFO_A_START;    // Point to beginning of seg A
  FCTL2 = FWKEY + FSSEL0 + FN1;         // MCLK/3 for Flash Timing Generator
  FCTL1 = FWKEY + ERASE;                // Set Erase bit
  FCTL3 = FWKEY + LOCKA;                // Clear LOCK & LOCKA bits
  *Flash_ptrA = 0x00;                   // Dummy write to erase Flash seg A

  FCTL1 = FWKEY + WRT;                  // Set WRT bit for write operation
  Flash_ptrA = (char *)INFO_A_CALIB;    // Point to beginning of cal consts

  for (j = 0; j < 10; j++)
    *Flash_ptrA++ = CAL_DATA[j];        // Re-flash DCO calibration data

  FCTL1 = FWKEY;                        // Clear WRT bit
  FCTL3 = FWKEY + LOCKA + LOCK;         // Set LOCK & LOCKA bits

  while (1)
  {
    P1OUT ^= 0x02;                      // Toggle LED when done
    for (i = 0; i < 0x4000; i++);       // SW delay
  }
}

void Set_DCO(unsigned int Delta_L, unsigned int Delta_H) // Set DCO to selected frequency
{
  unsigned int Compare, Oldcapture = 0;

  // Timer clock needs to be SMCLK; timer capture input needs to be ACLK
  BCSCTL1 |= DIVA_3;                    // ACLK = LFXT1CLK/8 = 4096Hz
  TACCTL2 = CM_1 + CCIS_1 + CAP;        // CAP, ACLK input -- only for TimerA2, not TA0
  TACTL = TASSEL_2 + MC_2 + TACLR;      // SMCLK, cont-mode, clear

  while (1)
  {
    while (!(CCIFG & TACCTL2));         // Wait until capture occurred
    TACCTL2 &= ~CCIFG;                  // Capture occurred, clear flag
    Compare = TACCR2;                   // Current captured SMCLK count
    Compare = Compare - Oldcapture;     // SMCLK ticks per ACLK period
    Oldcapture = TACCR2;
    if (Delta_L <= Compare && Compare <= Delta_H)
      break;                            // Within the target window; done
    else if (Delta_H < Compare)
    {
      DCOCTL--;                         // DCO is too fast, slow it down
      if (DCOCTL == 0xFF)               // Did DCO roll under?
        if (BCSCTL1 & 0x0f)
          BCSCTL1--;                    // Select lower RSEL
    }
    else
    {
      DCOCTL++;                         // DCO is too slow, speed it up
      if (DCOCTL == 0x00)               // Did DCO roll over?
        if ((BCSCTL1 & 0x0f) != 0x0f)
          BCSCTL1++;                    // Select higher RSEL
    }
  }
  TACCTL2 = 0;                          // Stop TACCR2
  TACTL = 0;                            // Stop Timer_A
  BCSCTL1 &= ~DIVA_3;                   // Restore ACLK = LFXT1CLK
}


As is traditional for robot movement, and dancing to imitate robot movement, zero moment point is the control scheme to get that perfectly awkward, jerky look. It’s just another way of saying that the robot’s center of mass stays directly above the center of its contact area with the ground, meaning that if it stops moving at any instant, there is no net moment to tip it over; no falling. A fun experiment is to try to walk like this yourself… difficult, painful, and very slow.
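The static half of that statement is easy to put in code. Here is a hedged sketch (my own names, and a convex footprint assumed): project the center of mass onto the ground and run a standard ray-casting point-in-polygon test against the support polygon.

```cpp
// Static stability check for ZMP-style control: is the ground projection of
// the center of mass inside the support polygon (the footprint)?
#include <vector>
#include <cstddef>

struct P2 { double x, y; };

bool com_is_stable(const std::vector<P2>& support_polygon, P2 com)
{
    bool inside = false;
    std::size_t n = support_polygon.size();
    // classic ray-casting test: count edge crossings of a horizontal ray
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        const P2& a = support_polygon[i];
        const P2& b = support_polygon[j];
        if ((a.y > com.y) != (b.y > com.y) &&
            com.x < (b.x - a.x) * (com.y - a.y) / (b.y - a.y) + a.x)
            inside = !inside;
    }
    return inside;   // true: no tipping moment if motion stops right now
}
```

A ZMP controller, roughly, plans joint motion so this check never fails at any instant of the gait, which is exactly what gives it that stop-anywhere, jerky look.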

yeah, he's adorable

Ladies: he's single.

At the Neural Systems Lab where I work, there’s a good buddy of mine called the HOAP-2. He’s a shy fellow, but with just the right amount of hacking (read: a lot of hacking), I got him working much more fluidly. Of course, the limitations are still clear: the movement of walking is not the same as walking successfully.

The control system consists of a desktop computer running an old Red Hat install with RTLinux. Yeah, from before they sold themselves, so clearly this stuff was outdated. Taking a CSV file of a walking gait (I think from the original makers of the HOAP robot) that delineated a specific walking speed, I converted this gait to one that could be more useful. By representing each joint angle of the 25-DOF robot as a sum of sines fit to the original CSV file, each joint could be moved to the correct position from one input. I represented this value as a phase angle; the cyclic nature of a walking gait spoke to me as a circle. This circle could be looped as fast or slow as desired, giving the robot a range of speeds at which to walk.
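Here is a toy version of that sum-of-sines representation, with invented coefficients rather than the HOAP-2's actual gait data: one phase variable drives every joint, and scaling how fast the phase loops scales the walking speed.

```cpp
// Phase-driven gait: each joint is a short Fourier series in one phase phi.
#include <cmath>
#include <vector>

struct Harmonic { double amp, freq, phase; };   // amp * sin(freq * phi + phase)

// Angle of one joint at gait phase phi (radians); the real robot would
// evaluate one such series per joint, 25 times per tick.
double joint_angle(const std::vector<Harmonic>& terms, double phi)
{
    double angle = 0.0;
    for (const Harmonic& h : terms)
        angle += h.amp * std::sin(h.freq * phi + h.phase);
    return angle;
}

// Advance and wrap the phase; `speed` is gait cycles per second, so the same
// trajectory plays out faster or slower.
double step_phase(double phi, double dt, double speed)
{
    const double two_pi = 2.0 * std::acos(-1.0);
    return std::fmod(phi + two_pi * speed * dt, two_pi);
}
```

The nice property is that "walk faster" is just a bigger `speed`; the joint trajectories never need to be re-fit.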

He can do the twist.

Like I already mentioned, just sending joint angles to servos is hardly a “walk”. The contact forces of the robot hitting the table surface add a bit of non-linearity to the problem space, but it was able to walk successfully beyond a limited ZMP. A much better, but more technically complex, option would be model predictive control that actually knows what’s going on with the robot’s body AND how it interacts with the world, but that’s a project for another day.

He didn't want to hold hands; I didn't want him to fall.

While it’s recommended to not mix electronics and human bodies (most living bodies, for that matter), people still try. Having been commissioned to do just that, I’ve been working to bring together an ARM processor, a 1.3 megapixel camera, and awesome UV-reactive fluorescein.

Presentation: 5/5 Taste: 0/5

Proven to turn you into a Teenage Mutant Ninja Turtle.

The problem: glucose sensors work great, but under the assumption that a fresh sample is getting to the biosensor. You can have a continuous glucose monitor sticking in your side all day long, but how many hours can you guarantee accurate measurements? As such, this device by Invivomon, Inc. helps by measuring the sample quality. You must be gripping your seat with rampant expectation.


I made such a mess of my desk with the damn fluorescein.

Glass tube of UV reacting sample. Definitely cooler zoomed in like this.

This is where the fluorescein comes in: pump a certain amount in, and measure how much fluorescein comes out. Bam! You’ve got flow quality and flow rate, if you’re clever. If you’re a bit confused, a more complete breakdown is this. A catheter stuck into a patient’s vein lets fluorescein (which is harmless) into the body, and lets fluorescein and bodily fluid samples out. By measuring the UV reaction of this outflow, we can measure how much fluorescein is coming out, and thus how much sample. Similarly, by bleaching the fluorescein we can measure when the dip in UV response happens, and accurately predict flow rate. Such power!
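The flow-rate half is just plumbing arithmetic. A back-of-envelope sketch (illustrative numbers and names, not Invivomon's firmware): the bleached slug travels from the bleach point to the camera, so tube volume over transit time gives the flow rate.

```cpp
// Flow rate from the timing of the bleached-fluorescein dip.
#include <cmath>

// Flow rate in uL/s, given tube inner diameter (mm), distance from the
// bleach point to the sensor (mm), and the measured dip transit time (s).
// Handy fact: 1 mm^3 of tube volume is exactly 1 uL.
double flow_rate_ul_per_s(double diameter_mm, double distance_mm, double transit_s)
{
    const double pi = std::acos(-1.0);
    double radius_mm = diameter_mm / 2.0;
    double volume_mm3 = pi * radius_mm * radius_mm * distance_mm;  // cylinder
    return volume_mm3 / transit_s;
}
```

So a 1 mm bore tube with 10 mm from bleach point to camera, and a dip arriving 5 seconds after bleaching, works out to about 1.57 uL/s.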


No, it's secretly beautiful, you just don't understand!

Yeah, I know it's ugly, ok? It was just a test!

Invivomon provided the board: an ARM processor with what amounts to a cell phone camera, and attachments for samples to flow through some tiny tubes past a calibrated lens setup. I wrote some software to control the camera, LEDs, memory, simple UI, and the LCD screen, which arguably was the most fun. Yeah, font design! Hey, function before form; so what if it’s ugly: it works.

The character ‘E’ is very popular in the English language; in fact, it shows up about 12% of the time. This data was helpful to me in making the first version of the sign language glove, but I wanted to kick it up a notch (BAM!) by using letter frequency based on the previous letter. We already know how often certain letters start words (as you can see here), but after the first letter, what is the probability of the next letter being any other given letter?

The goal was to make it easier to letter-spell words more accurately; if you’re spelling something, there are fewer letters you actually have to consider if you know the likelihood of various characters showing up. So, I found a gigantic list of English words, butchered some C++, and let my computer do all the work. Sorry, ‘puter.

So in all its glory, here’s the data: a 26×26 table of letter frequencies in double array form. Double array, ohmygod, double array!

And because I love numbers, here’s ‘A’ for you:




The data runs A-Z from top to bottom and from left to right, and each column gives the probability of that column’s letter coming after that row’s letter: from the top left, the value 0.0510151 shows that there is a 5% chance of a ‘B’ coming after an ‘A’. Another example: the highest association value is that a ‘U’ has a 99% chance of showing up after a ‘Q’. If you didn’t already know this.

This could be helpful as a sort of transition matrix for a Hidden Markov Model, but I’m moving away from that for the sign language glove because I don’t think an Arduino can push through an HMM fast enough (at least not the hierarchical one I’m thinking about). If anyone wants to ‘donate’ me one of them Gumstix with a breakout board, I’d bottle a biiiiig hug and send it your way. Or do something more substantial, it’s all good. Otherwise it’ll just be a fancier classifier.

Line Driving Buggy

A line-following robot, basically. The emphasis is on speed and looking hot.

Sneak peek:

Vroom vroom.

Translation from one language to another is often like swapping variable values; you need something in the middle. That’s why translators were invented, but the opportunity for them to interject obscene comments made communication between people of different languages uneasy. And since my understanding of other languages is a bit rusty, and my group needed a project with a biomedical bent, we decided to translate to American English from American Sign Language. That’s right, you should be really excited right now.

there's no Nike symbol on there. Nope.

It currently doesn't work because it's missing a user's hand.

Pretty simple premise, if you ask me. In fact, if it’s on Wikipedia then everybody should already know about it. There are two options for sign language translation, but both need to measure a person’s hand movements. This can be done visually or mechanically; because it’d be a bit lame to have to carry around the video cameras and computers necessary to do the processing, my group went with a glove-based system. By the by, ‘my group’ consisted of a few buds in CMU’s Biomedical Engineering Design Capstone class: Allen Ambulo, Andrew S.D. Tsai, Michelle Lin, Sherry Huang, and Eric Wideburg. We also had the awesome professors Dr. Conrad Zapanta and Dr. James Antaki. So, clearly, the idea of a sign language recognizing glove is not new, but two things have not been done (at least we didn’t find evidence of them):

1) With the abundance of iPods and other media devices, why can’t this device also make noise? And if it could make noise that corresponds to whatever is being signed by the user, that’d be extra impressive.

2) Nobody likes to spell, so why do all currently made gloves mainly focus on finger spelling? Damn, you’d have some strong hands if all you were able to do was finger spell… Why not include good gesture recognition? Wiimotes do it, and 3-year-olds are better than me at playing Nintendo Wii.

Anyway, let’s talk about implementation. I won’t go into the significantly boring detail in this post; I’ll probably put up guides on specific aspects (the recognition, the sensors, etc.) of the project later on.

Like any good embedded system, this glove is merely a system of input and output, with some processing in the middle. Like a kind of mathematical system of equations sandwich. Yum. Input comes from the sensors or from user input. Output is the LCD screen and the tiny speaker I stole off one of those cards that sing at you (thanks Grandma for the birthday card! I really like it!).


robo hand!

Accelerometers, flex sensors, and my beautiful sewing.

roly poly!

Trackball for user input. Or to just click. Click click click click


There are two things to look for in sign language: the movement of the hands and the position of the fingers. Thus, we’ve got an accelerometer and flex sensors. While some might see the limitations of just these sensors, I had a few workarounds for this initial proof-of-concept version. The trackball is the same kind as those found on BlackBerrys, and I had an old ominous looking LCD screen (red on black, ooooh~) lying around.

All this plugs into an Arduino Mega, because it has a lot of inputs/outputs and looks badass when strapped to your wrist. The output is this SparkFun-made SpeakJet Arduino shield; think of it as text-to-speech. It is capable of pronouncing a list of phonemes, which you can configure to play in the right order to make words, or gobbledegook. This pushed an audio signal out to the previously mentioned tiny speaker, and mirrored the results onto the LCD screen.


limited on space, my desk is a mess

Temperature outside is directly proportional to the amount of solder vapor I inhale.

nothing shorted... yay!

There was some prototyping area on the speakjet, so all the connections were routed through here.

That about does it for a hardware overview; software from here on out. The reason the glove you see is Version 1 is because so much time was spent getting the hardware together and reliable, the software is not as robust as a daily use version. Don’t get me wrong, this thing can work fine and dandy, but there’s some improvements I would like to make. Let’s start from the top of what V.1 is, and then I’ll discuss future improvements.


Sensor data comes in and, due in no small part to how the sensors are attached, is pretty free of movement artifacts; the data is reliable and consistent enough that only a simple low-pass filter, an exponential moving average, is used. Mechanically, the sensors are attached to the glove through small metal brackets made with garden wire (this took forever…) or with button snaps like you’d find on clothing (hey, we are dealing with a glove here). This allowed the flex sensors to remain fixed at their base and slide through the brackets, but also allowed the stretchiness of the glove to act as the spring (and the user’s hand as the damper) in a simple spring-mass-damper system. Fancy words for, “I sewed things on to a stretchy glove. And the glove was on my hand at the time.”
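That low-pass filter is nearly a one-liner. A sketch (my own struct, not the glove's exact code): an exponential moving average, where alpha near 1 trusts new samples and alpha near 0 smooths harder.

```cpp
// Exponential moving average, cheap enough for an Arduino loop.
struct EmaFilter {
    double alpha;          // smoothing factor, 0 < alpha <= 1
    double state;          // current filtered value
    bool primed = false;   // first sample seeds the state directly

    double update(double sample) {
        if (!primed) { state = sample; primed = true; }
        else         { state = alpha * sample + (1.0 - alpha) * state; }
        return state;
    }
};
```

One of these per sensor channel (5 flex + 3 accelerometer axes) is all the filtering the glove needs.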

these took forever to put on

Future posts: making brackets. sewing. sewing with your off hand.

After the sensor data comes in, is converted to digital values through the Arduino’s on-board ADC, and is slightly filtered, it gets formatted into a simple state matrix: 5 values for the flex sensors, 3 for the axes of the accelerometer. This state matrix gets run through a Naive Bayes classifier whenever the state has stabilized, i.e. the user has performed a gesture/letter and held that position for a specified amount of time. This delay signals the microprocessor to compute the most likely gesture that has just been performed, based on the current state of the sensor data, out of the list of possible gestures that the Arduino knows about. Because I have no idea what I’m doing when it comes to ASL, I configured the delay to be 2 seconds for myself. Gimme a break, I learned cello on weekends, not ASL.
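The classifier itself is small enough to sketch. This is a generic Gaussian Naive Bayes over the 8-value state vector, with made-up structure names and the training step (estimating each gesture's per-sensor mean and variance from recorded examples) left out:

```cpp
// Gaussian Naive Bayes over the glove's 8-value state (5 flex, 3 accel):
// score each known gesture by summing per-sensor log-likelihoods, pick best.
#include <array>
#include <cmath>
#include <vector>
#include <cstddef>

struct Gesture {
    std::array<double, 8> mean;   // per-sensor mean from training examples
    std::array<double, 8> var;    // per-sensor variance (sensors independent)
    double prior;                 // e.g. a letter-frequency prior
};

std::size_t classify(const std::vector<Gesture>& gestures,
                     const std::array<double, 8>& state)
{
    std::size_t best = 0;
    double best_score = -1e300;
    for (std::size_t g = 0; g < gestures.size(); ++g) {
        double score = std::log(gestures[g].prior);
        for (std::size_t s = 0; s < 8; ++s) {
            double d = state[s] - gestures[g].mean[s];
            // log of a Gaussian pdf, dropping the shared constant term
            score += -0.5 * std::log(gestures[g].var[s])
                     - d * d / (2.0 * gestures[g].var[s]);
        }
        if (score > best_score) { best_score = score; best = g; }
    }
    return best;   // index of the most likely gesture
}
```

Working in log space keeps the products of eight small probabilities from underflowing, which matters on a microcontroller with no appetite for doubles anyway.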

After the classifier has done its duty, the Arduino takes the gesture it thinks was just performed and looks it up in its dictionary; for us, this was the alphabet and about 10 words, due to memory constraints. The recognizable gestures corresponded to entries in its recognition dictionary, which translated each gesture into the requisite phonetic commands for the SpeakJet. These phonetics get sent to the SpeakJet chip, and a freakishly robotic voice then says the word. Hey, they included a volume dial, thankfully.

So some optimizations included using letter frequencies in the classifier (an ‘e’ is more likely to show up than a ‘z’), and code was optimized for performance; it’s pretty slick how much a 16MHz processor can do. While there were more optimizations that could be done (e.g. letter frequency based on the previous letter), it just was not worth it. The Bayesian Classifier is very limited in capability, but great for a proof-of-concept.

Thus, for Version 2, I’ve got Hidden Markov Models planned: the Arduino Mega will be a “training” unit (both for the user and the HMM), and I want to miniaturize the final version to an Arduino Pro Mini. HMMs are awesome for identifying things that can’t be observed directly; they are frequently used for speech recognition. But yeah, things to do, things to do. I’ll revise this article shortly, as its level of ‘snarky’ is probably too high.