automatic speech recognition
In computer science, speech recognition is the translation of spoken words into text. An ASR system can be trained by having an individual speaker read sections of text into the system, which then analyzes that person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in more accurate transcription. This kind of system is called a "speaker dependent" system. A speaker-independent system is one that does not use training. Applications of speech recognition include voice user interfaces such as call routing, voice dialing, search, simple data entry, speech-to-text processing, etc. [Speech Recognition - Wikipedia]
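A toy sketch of the speaker-dependent idea: store per-speaker templates recorded during enrollment, then match new utterances against them with dynamic time warping (a classic pre-neural ASR technique, used here purely for illustration). The word templates and feature values below are hypothetical stand-ins for real acoustic features, not from the source.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences.

    Allows the utterance to be spoken faster or slower than the template
    by letting the alignment stretch or compress in time.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of: match both steps, skip a template step, skip an input step
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical speaker-dependent templates captured during enrollment
# (e.g., smoothed energy contours of the speaker saying each word).
templates = {"yes": [0.1, 0.8, 0.9, 0.3], "no": [0.7, 0.2, 0.1, 0.6]}

def recognize(utterance):
    """Return the template word whose DTW distance to the utterance is smallest."""
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))

print(recognize([0.1, 0.7, 0.9, 0.9, 0.2]))  # closest to the "yes" template
```

Because the templates come from one enrolled speaker, the matcher naturally adapts to that speaker's voice; a speaker-independent system would instead need models trained over many speakers.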
deep belief networks
Deep belief nets are probabilistic generative models that are composed of multiple layers of stochastic, latent variables. The latent variables typically have binary values and are often called hidden units or feature detectors. The top two layers have undirected, symmetric connections between them and form an associative memory. The lower layers receive top-down, directed connections from the layer above. The states of the units in the lowest layer represent a data vector. The two most significant properties of deep belief nets are: 1) There is an efficient, layer-by-layer procedure for learning the top-down, generative weights that determine how the variables in one layer depend on the variables in the layer above. 2) After learning, the values of the latent variables in every layer can be inferred by a single, bottom-up pass that starts with an observed data vector in the bottom layer and uses the generative weights in the reverse direction. Deep belief nets are learned one layer at a time by treating the values of the latent variables in one layer, when they are being inferred from data, as the data for training the next layer. This efficient, greedy learning can be followed by, or combined with, other learning procedures that fine-tune all of the weights to improve the generative or discriminative performance of the whole network. Discriminative fine-tuning can be performed by adding a final layer of variables that represent the desired outputs and backpropagating error derivatives. When networks with many hidden layers are applied to highly structured input data, such as images, backpropagation works much better if the feature detectors in the hidden layers are initialized by learning a deep belief net that models the structure in the input data (Hinton & Salakhutdinov, 2006). [Deep belief networks - Scholarpedia]
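The greedy layer-by-layer procedure can be sketched by stacking restricted Boltzmann machines, training each with one-step contrastive divergence (CD-1) and feeding its inferred hidden activities to the next layer as "data", as described above. A minimal sketch, assuming binary units throughout; the layer sizes, toy dataset, learning rate, and epoch count are illustrative choices, not from the source.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine with binary units, trained by CD-1."""
    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Up pass: infer hidden units from the data, then sample them
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Down-up pass: one step of Gibbs sampling (reconstruction)
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Contrastive-divergence update: data statistics minus model statistics
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - pv1) ** 2)  # reconstruction error

rng = np.random.default_rng(0)
# Toy binary data: 200 samples drawn from two prototype patterns
protos = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
data = protos[rng.integers(0, 2, size=200)]

# Greedy layer-wise training: the inferred hidden activities of one
# layer become the training data for the next layer.
layers, inp = [], data
for n_hid in (4, 2):
    rbm = RBM(inp.shape[1], n_hid, rng)
    for _ in range(500):
        err = rbm.cd1_step(inp)
    layers.append(rbm)
    inp = rbm.hidden_probs(inp)

# Property 2 above: a single bottom-up pass infers every layer's
# latent variables from an observed data vector.
codes = data
for rbm in layers:
    codes = rbm.hidden_probs(codes)
print(codes.shape)  # (200, 2)
```

In a full deep belief net the stacked weights would then be fine-tuned, e.g. discriminatively with backpropagation after adding an output layer, as the passage describes.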
My current project is a collaborative effort between the MLSP Group and the Air Force Research Laboratory in Dayton, OH. The primary objective of my current work is to build models that can be integrated into collaborative systems in a human-human-computer environment, leveraging multi-sensory cues to interact in a collaborative capacity.