Thursday, July 26, 2012

The hashing trick for a dynamically changing vocabulary

I recently learned about the "hashing trick" in machine learning. It is typically used to handle a dynamically changing vocabulary in large-scale machine learning algorithms. With the hashing trick, we always have a "fixed" vocabulary. The only catch is that we don't know in advance which words it contains. Each word is hashed into a table of length M, and no matter how many new words come in, the table has a slot for all of them (distinct words may end up sharing a slot). It's amazing that Yahoo Research came up with the idea and implemented it in a widely used open source package called Vowpal Wabbit. Researchers who propose new online algorithms with a dynamically changing feature set often implement their algorithms in Vowpal Wabbit. I think there is also an effort to run Vowpal Wabbit on top of Hadoop.
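To make the idea concrete, here is a minimal sketch in Python. It is only an illustration, not how Vowpal Wabbit does it internally: the table size M, the choice of MD5 as the hash, and the function names bucket and hashed_features are all my own assumptions.

```python
import hashlib

M = 16  # table length (hypothetical; real systems use a much larger table)


def bucket(word, m=M):
    """Map a word to a slot in [0, m) using a hash that is stable across runs."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m


def hashed_features(words, m=M):
    """Return a length-m count vector; unseen words need no vocabulary update."""
    vec = [0] * m
    for w in words:
        vec[bucket(w, m)] += 1  # colliding words simply share a slot
    return vec


v1 = hashed_features(["the", "cat", "sat"])
v2 = hashed_features(["the", "cat", "sat", "zyzzyva"])  # a brand new word
print(len(v1), len(v2))  # both vectors have length M
```

The vector length never changes when a new word shows up, which is the whole point. The price is that two different words can land in the same slot, so a larger M means fewer collisions.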

It is indeed sad that a company like Yahoo, which has been the innovator behind so many cool ideas, is under the weather. Hopefully Marissa Mayer will turn things around.

For more information on Vowpal Wabbit visit

Tuesday, July 3, 2012

pdfLaTeX won't render images in the PDF output

If you are writing a conference paper in LaTeX, then this post might be useful to you.

Typically each document starts with a \documentclass line. If you have chosen the "draft" option, as in "\documentclass[10pt,a4paper,draft]{article}", then even a simple figure like the one captioned below will not work.

\caption{Pipeline: Fetch Instruction}
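The figure won't render in the output PDF if the "draft" option is selected in the \documentclass command. The option is passed on to the graphicx package, which then draws only an empty frame with the file name in place of the image.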

Make sure you specify "final" instead, or just leave the option out, like this: \documentclass[10pt,a4paper]{article}.