Automatic Speech Recognition: The Development of the SPHINX System by Kai-Fu Lee

By Kai-Fu Lee

Speech recognition has a long history of being one of the difficult problems in Artificial Intelligence and Computer Science. As one goes from problem-solving tasks such as puzzles and chess to perceptual tasks such as speech and vision, the problem characteristics change dramatically: knowledge poor to knowledge rich; low data rates to high data rates; slow response time (minutes to hours) to instantaneous response time. These characteristics taken together increase the computational complexity of the problem by several orders of magnitude. Further, speech provides a challenging task domain which embodies many of the requirements of intelligent behavior: operate in real time; exploit vast amounts of knowledge; tolerate errorful, unexpected, unknown input; use symbols and abstractions; communicate in natural language; and learn from the environment. Voice input to computers offers a number of advantages. It provides a natural, fast, hands-free, eyes-free, location-free input medium. However, there are many as yet unsolved problems that prevent routine use of speech as an input device by non-experts. These include cost, real-time response, speaker independence, robustness to variations such as noise, microphone, speech rate and loudness, and the ability to handle non-grammatical speech. Satisfactory solutions to each of these problems can be expected within the next decade. Recognition of unrestricted spontaneous continuous speech appears unsolvable at present. However, by the addition of simple constraints, such as a clarification dialog to resolve ambiguity, we believe it will be possible to develop systems capable of accepting very large vocabulary continuous speech dictation.

Similar intelligence & semantics books

Artificial Intelligence in Education: Building Technology Rich Learning Contexts that Work

The nature of technology has changed since Artificial Intelligence in Education (AIED) was conceptualised as a research community and Interactive Learning Environments were initially developed. Technology is smaller, more mobile, networked, pervasive and often ubiquitous, as well as being provided by the standard desktop PC.

Towards a Unified Modeling and Knowledge-Representation based on Lattice Theory: Computational Intelligence and Soft Computing Applications

By ‘model’ we mean a mathematical description of a world aspect. With the proliferation of computers, a variety of modeling paradigms emerged under computational intelligence and soft computing. An advancing technology is currently fragmented due, as well, to the need to cope with different types of data in different application domains.

Parallel Processing for Artificial Intelligence (Machine Intelligence & Pattern Recognition) (v. 3)

This is the third volume in an informal series of books about parallel processing for artificial intelligence. It is based on the assumption that the computational demands of many AI tasks can be better served by parallel architectures than by the currently popular workstations. However, no assumption is made about the kind of parallelism to be used.

Exploring Computer Science with Scheme

A presentation of the central and basic concepts, techniques, and tools of computer science, with the emphasis on presenting a problem-solving approach and on providing a survey of all of the most important topics covered in degree programmes. Scheme is used throughout as the programming language, and the author stresses a functional programming approach: creating simple functions in order to obtain the desired programming goal.

Additional info for Automatic Speech Recognition: The Development of the SPHINX System

Sample text

In practice, this may be undesirable. For example, if we wanted to train a word model with 10 sequential states and 20 transitions, each with a distinct output pdf, we would have to estimate a tremendous number of parameters. Instead, we could use many states to model duration, and allow adjacent sets of transitions to share the same output pdf. In another example, since certain states in a model may be the same (for example, the /s/'s and /ih/'s in Mississippi), we want different states to share the same output pdfs.
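The saving from sharing output pdfs can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes discrete output pdfs over a codebook of 256 symbols and a tying into 3 shared pdfs, neither of which is given in the excerpt, and the function name `output_pdf_params` is hypothetical.

```python
def output_pdf_params(n_distinct_pdfs, codebook_size):
    """Free parameters needed for a set of discrete output pdfs.

    Each pdf over `codebook_size` symbols sums to one, so it has
    codebook_size - 1 free parameters.
    """
    return n_distinct_pdfs * (codebook_size - 1)


CODEBOOK_SIZE = 256  # assumed codebook size, not taken from the excerpt

# 20 transitions, each with its own output pdf
untied = output_pdf_params(20, CODEBOOK_SIZE)

# the same 20 transitions sharing 3 output pdfs (hypothetical tying)
tied = output_pdf_params(3, CODEBOOK_SIZE)

print(untied, tied)
```

The number of transitions stays the same in both cases; only the number of distinct pdfs that must be estimated from training data changes.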

Here we will describe an iterative procedure, the forward-backward algorithm, also known as the Baum-Welch algorithm. Earlier, we defined $\alpha_i(t)$, the probability that an HMM $M$ has generated $y_1^t$ and is in state $i$. We now define its counterpart, $\beta_i(t)$, the probability that $M$ is in state $i$ and will generate $y_{t+1}^T$. Like $\alpha$, $\beta$ can be computed with recursion on $t$:

\[
\beta_i(t) =
\begin{cases}
0 & i \neq S_F \wedge t = T \\
1 & i = S_F \wedge t = T \\
\sum_j a_{ij}\, b_{ij}(y_{t+1})\, \beta_j(t+1) & 0 \le t < T
\end{cases}
\tag{16}
\]

Let us now define $\gamma_{ij}(t)$, which is the probability of taking the transition from state $i$ to state $j$ at time $t$, given the observation sequence $y_1^T$:

\[
\gamma_{ij}(t) = P(X_t = i, X_{t+1} = j \mid y_1^T) = \frac{\alpha_i(t-1)\, a_{ij}\, b_{ij}(y_t)\, \beta_j(t)}{\alpha_{S_F}(T)}
\tag{17}
\]

$\alpha_{S_F}(T)$, also known as the alpha terminal, is the probability that $M$ generated $y_1^T$. Now, the expected number (or count) of transitions from state $i$ to $j$ given $y_1^T$ at any time is simply $\sum_{t=1}^{T} \gamma_{ij}(t)$, and the expected number of counts from state $i$ is $\sum_{t=1}^{T} \sum_k \gamma_{ik}(t)$.
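The backward recursion and the expected transition counts described above can be sketched in Python. This is a minimal illustration and not the book's implementation: it assumes a discrete HMM whose output probabilities are attached to transitions, a single start state and a single final state, and no numerical scaling. The names `forward_backward`, `a`, `b`, `start`, and `final` are hypothetical.

```python
def forward_backward(a, b, y, start, final):
    """Return (alpha, beta, counts) for one observation sequence.

    a[i][j]    : probability of the transition i -> j
    b[i][j][k] : probability of emitting symbol k on the transition i -> j
    y          : list of observed symbol indices; y[t - 1] is the t-th symbol
    counts[i][j] is the expected number of i -> j transitions.
    """
    n, T = len(a), len(y)

    # Forward pass: alpha[t][i] = P(first t symbols generated, in state i)
    alpha = [[0.0] * n for _ in range(T + 1)]
    alpha[0][start] = 1.0
    for t in range(1, T + 1):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * a[i][j] * b[i][j][y[t - 1]]
                              for i in range(n))

    # Backward pass: beta[t][i] = P(remaining symbols generated | in state i)
    beta = [[0.0] * n for _ in range(T + 1)]
    beta[T][final] = 1.0
    for t in range(T - 1, -1, -1):
        for i in range(n):
            beta[t][i] = sum(a[i][j] * b[i][j][y[t]] * beta[t + 1][j]
                             for j in range(n))

    # Expected counts: sum over t of the per-time transition posteriors
    total = alpha[T][final]  # the "alpha terminal"
    counts = [[0.0] * n for _ in range(n)]
    for t in range(1, T + 1):
        for i in range(n):
            for j in range(n):
                counts[i][j] += (alpha[t - 1][i] * a[i][j] * b[i][j][y[t - 1]]
                                 * beta[t][j] / total)
    return alpha, beta, counts
```

A useful sanity check on any such implementation is that the expected counts over all transitions sum to the length of the observation sequence, since exactly one transition is taken per observation in this model.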

If we represent probability $P$ with its logarithm, $\log_b P$, we could get more precision by setting $b$ closer to one. To multiply two numbers, we simply add their logarithms. Adding two numbers is more complicated. [...] 5 otherwise (30)

The number of possible values depends on the magnitude of $b$. [...] 0001, which resulted in a table size of 99,041. With the aid of this table,

\[
\log_b(P_1 + P_2) =
\begin{cases}
\log_b P_1 + T(\log_b P_2 - \log_b P_1) & \text{if } P_1 > P_2 \\
\log_b P_2 + T(\log_b P_1 - \log_b P_2) & \text{otherwise}
\end{cases}
\tag{31}
\]

This implements the addition of two probabilities as one integer add, one subtract, two compares, and one table lookup.
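The table-based addition of log probabilities can be sketched as follows. Several details are assumptions, because the excerpt truncates them: the base is taken to be b = 1.0001, logarithms are rounded to the nearest integer, the table holds the rounded values of log_b(1 + b^(-d)), and it stops at the first entry that rounds to zero. The table here is indexed by the non-negative difference d, whereas the text passes the negative difference to T. All function names are hypothetical.

```python
import math

BASE = 1.0001  # assumed logarithm base b; the excerpt truncates the actual value


def to_log(p, base=BASE):
    """Quantize log_base(p) to the nearest integer."""
    return math.floor(math.log(p) / math.log(base) + 0.5)


def build_add_table(base=BASE):
    """table[d] approximates log_base(1 + base**(-d)), rounded to an integer.

    The table ends at the first entry that would round to zero (assumed rule).
    """
    ln_base = math.log(base)
    table = []
    d = 0
    while True:
        value = math.floor(math.log1p(base ** -d) / ln_base + 0.5)
        if value == 0:
            break
        table.append(value)
        d += 1
    return table


def log_add(log_p1, log_p2, table):
    """Return the integer log of P1 + P2, given the integer logs of P1 and P2."""
    if log_p1 < log_p2:               # compare: put the larger value first
        log_p1, log_p2 = log_p2, log_p1
    diff = log_p1 - log_p2            # subtract
    if diff >= len(table):            # compare: smaller term is negligible
        return log_p1
    return log_p1 + table[diff]       # table lookup and integer add
```

For example, `log_add(to_log(0.3), to_log(0.2), build_add_table())` yields an integer whose antilog, `BASE ** result`, is close to 0.5. Under these assumptions the table should come out at 99,041 entries, the size quoted in the text.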
