Structural processing methods for speech signal analysis

dc.contributor.author Bhagath, Parabattina
dc.date.accessioned 2021-10-22T06:18:49Z
dc.date.available 2021-10-22T06:18:49Z
dc.date.issued 2020
dc.identifier.other ROLL NO.146101017
dc.identifier.uri http://gyan.iitg.ernet.in/handle/123456789/1949
dc.description Supervisor: P K Das en_US
dc.description.abstract Speech signal analysis is a crucial area of study that supports the development of methods for problems such as phoneme segmentation, speech recognition, and speaker verification. Various frameworks and techniques address these problems, of which Hidden Markov Models (HMMs) and deep learning are popular. These frameworks are efficient with large data sets where intensive training is possible. However, they become difficult to apply to under-resourced languages, since sufficient data cannot be provided for intensive training. To address the needs of these languages, methods are required that can find significant cues from a small amount of data. Structural processing methods approach signals differently from conventional signal processing methods: a signal is treated as an image rather than as a time series of samples at different time stamps. The need for these methods arises from the limitations of HMMs. In an HMM, each state depends on at most two neighboring states, which prevents the model from taking a holistic view of the entire signal. Recent developments in graph signal processing provide a way to analyze signals using graph data structures. These methods make it possible to combine temporal relations and frequency components when modeling signals. The thesis addresses the problems of speech characterization and segmentation while considering the above-mentioned issues. Features such as trajectories and tree structures are proposed and found to be useful for modeling speech signals, and these models can be used further for recognition. Three different features, based on trajectories, graph structures, and fractals, are proposed for the segmentation task. The experiments were conducted on Indian-accented spoken English vowels and words, and on TIMIT sentence data. Tree structures and trajectories were found to be useful in characterizing vowels and words, respectively. In the phoneme segmentation experiments, word data were collected from people belonging to different regions of India. The segmentation approaches were found to be appropriate for locating the boundaries of phonetic units in spoken words and sentences. The algorithms and the results obtained are discussed in the thesis. en_US
dc.language.iso en en_US
dc.relation.ispartofseries TH-2501;
dc.subject COMPUTER SCIENCE AND ENGINEERING en_US
dc.title Structural processing methods for speech signal analysis en_US
dc.type Thesis en_US

