Voice technology is being applied ever more widely, giving application developers a rare opportunity to add high-value functions to handheld, mobile and wireless personal devices. Voice features in today's personal handheld devices are mostly limited to voice dialing, but technologies already exist that support a much broader range of speech recognition and text-to-speech (TTS) applications. Developers who intend to add voice capabilities need to be familiar with all aspects of voice technology. These include not only processing and memory requirements, but also how a specific platform's architecture and support can ease development and shorten time to market.
Adding value through voice applications offers huge potential benefits. According to estimates from various market research firms, shipments of personal handheld devices are expected to grow at a compound annual rate of 20% over the next two years, with total global shipments reaching 700 million units by 2004. To tap this huge market with value-added voice applications, developers must turn to underlying technology that delivers high performance and low power consumption, along with support that helps them launch new products quickly.
Voice gives users a natural method of input and output. It is safer than other forms of I/O, especially when the user is driving. In most applications, voice is an ideal complement to the keypad and display, not a replacement for them. In a very noisy environment, for example, listening and speaking may not be practical, so users may have to rely on keypad input and reading the display. Similarly, users usually prefer to key in certain items, such as PINs and passwords, rather than say them aloud within earshot of others.
Voice dialing is the most widely used voice technology in personal wireless devices today. It lets users place calls without using their hands or eyes, which is particularly important when driving. Voice dialing includes name dialing, that is, calling a name stored in the address book, and digit dialing, that is, speaking the phone number. As shown in Figure 1, other potential voice applications include:
1. Voice email: browsing mailboxes, composing emails by voice input, and listening to emails read aloud.
2. Information retrieval: stock prices, headline news, flight information, weather forecasts and the like can be retrieved from the Internet by voice. For example, instead of going to a web site and typing a stock name or browsing a predefined list, users can say: "My stock quote, Texas Instruments."
3. Personal information management: users can schedule appointments by voice, view calendars, add contact information, and so on.
4. Voice browsing: using voice-enabled menus, users can surf the Internet, add voice bookmarks and listen to web content read aloud.
5. Voice navigation: a complete voice input/output driving system that provides navigation when the driver's hands and eyes are busy.
Voice technology issues
A voice system must meet certain basic usability requirements. Obviously, speech output must be clear enough for users to understand. Within a given application, automatic speech recognition (ASR) must also support natural speech. What counts as natural varies widely, from simple names and commands spoken one word at a time to continuous sentences drawn from a large vocabulary. In addition, each person speaks and pronounces words differently, so the system should be flexible enough to accept different speakers. The recognition engine must be accurate, or users will not use the technology.
Voice demands considerable processing and, depending on the vocabulary supported, may require large amounts of memory. Server-based applications also increase the use of wireless bandwidth. These factors affect other system considerations. The higher an application's MIPS and transmission requirements, the more power a given system consumes, which shortens battery life or forces more frequent recharging. Response time may also increase when the application has to use memory external to the processor.
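The link between processing load, radio use and battery life can be made concrete with a rough linear model. The sketch below is an illustration only: the function name, the linear current model and every coefficient in the usage lines are assumptions chosen for the arithmetic, not measurements of any real device.

```python
def battery_life_hours(capacity_mah, idle_ma, ma_per_mips, mips,
                       radio_ma, radio_duty):
    """Estimate run time in hours from a simple linear current model.

    All coefficients are hypothetical placeholders:
      idle_ma      baseline current with no voice processing
      ma_per_mips  extra current drawn per MIPS of sustained load
      radio_ma     current drawn while the radio is transmitting
      radio_duty   fraction of time (0..1) the radio is transmitting
    """
    if not 0.0 <= radio_duty <= 1.0:
        raise ValueError("radio_duty must be between 0 and 1")
    average_ma = idle_ma + ma_per_mips * mips + radio_ma * radio_duty
    return capacity_mah / average_ma


# Local recognition at a modest load, radio idle (placeholder numbers).
local_only = battery_life_hours(1000, 10, 0.5, 20, 100, 0.0)

# Heavier load plus server traffic 20% of the time (placeholder numbers).
with_server = battery_life_hours(1000, 10, 0.5, 60, 100, 0.2)
```

Under this model, raising either the MIPS load or the share of time spent transmitting raises the average current and shortens run time proportionally, which is the trade-off the paragraph describes.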
Trade-offs in the application can reduce system requirements by dropping functions that are unnecessary on a handheld device. A speaker-dependent system that recognizes only a few words of discrete speech requires far fewer resources than a speaker-independent system that recognizes a large vocabulary and continuous speech. Supporting additional languages increases processing requirements and can double the memory the application needs. Noise and interference robustness are important features, but they add complexity and memory requirements.
Obviously, developers want to give up as little basic application performance as possible when adding features such as speaker independence, continuous speech, larger vocabularies and additional language support. Some approaches help limit the performance penalty of speech technology, for example distributed speech recognition (DSR). DSR splits the recognition task: the handheld device converts the raw speech into spectral feature vectors, and the server performs the recognition itself. This method, like similar distributed TTS methods, depends on standardized processing methods and transmission protocols. Although these technologies are promising, developers still face limited resources for voice applications in personal handheld devices.
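The handheld half of the DSR split can be sketched as a small front end that turns raw samples into one feature vector per frame. This is a minimal illustration under stated assumptions, not a standardized DSR front end: the function names, the frame and hop sizes, the Hamming window and the equal-width band pooling are all choices made for this example, and a naive DFT is used only to keep the code free of dependencies.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames, dropping any trailing partial frame."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def band_log_energies(frame, num_bands):
    """Return log energies of equal-width frequency bands for one frame."""
    n = len(frame)
    # A Hamming window reduces spectral leakage at the frame edges.
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    half = n // 2
    power = []
    for k in range(half):  # naive DFT, O(n^2); fine for a sketch
        re = sum(x * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        power.append(re * re + im * im)
    width = half // num_bands
    # The small constant avoids log(0) for silent bands.
    return [math.log(sum(power[b * width:(b + 1) * width]) + 1e-10)
            for b in range(num_bands)]


def extract_features(samples, frame_len=200, hop=80, num_bands=8):
    """Handheld side of the split: one short feature vector per frame."""
    return [band_log_energies(frame, num_bands)
            for frame in frame_signal(samples, frame_len, hop)]


# 0.1 s of a 440 Hz tone sampled at 8 kHz stands in for microphone input.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
features = extract_features(tone)
```

Only `features` would be sent to the server in such a scheme. Each 200-sample frame is reduced to 8 numbers here, which is why the approach shifts work away from the handheld's recognizer while keeping the transmitted data compact.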
Choosing an appropriate platform for high-performance applications such as voice is therefore as important as carefully designing the application's functionality. Such a platform must offer powerful processing with a high level of power efficiency, not only in core operations but also in memory accesses. It should have enough MIPS to support multimedia, security and other complementary applications. Programmability is also important, so that new algorithms can be integrated. Finally, the platform must include a software architecture designed for modular application development, helping developers bring products to market quickly.
OMAP technology: an excellent voice platform
TI's OMAP platform provides an excellent solution for developing voice applications in personal handheld devices. The dual-core architecture of the OMAP1510 and OMAP5910 processors combines a power-efficient TMS320C55x digital signal processor (DSP) with a high-performance ARM9 RISC microprocessor. These OMAP processors can therefore deliver the computation-intensive signal processing that voice requires while also providing the general-purpose performance needed for system-level operations. The OMAP710 processor is a highly integrated single-chip solution with a DSP-based GSM/GPRS baseband for wireless communication processing and a dedicated TI-enhanced ARM925 processor that runs multimedia applications at low power. The OMAP1510, OMAP5910 and OMAP710 processors can all support low-end, ARM-based voice applications. They are also code compatible, allowing developers to reuse software applications across personal products for different markets. With their DSP processing capability, the OMAP1510 and OMAP5910 can handle more computation-intensive voice applications.
Dual-core hardware architecture
The dual-core hardware platforms of the OMAP1510 and OMAP5910 are designed to maximize system performance and minimize power consumption. In personal handheld devices, the combination of DSP and RISC cores gives these processors unparalleled advantages in performance and power consumption. The RISC core is well suited to control code such as the user interface, the OS and high-level applications. The DSP, on the other hand, is better suited to the real-time signal processing that voice applications require.
As shown in Figure 2, the OMAP1510 architecture includes on-chip cache memory for both processors, which reduces the average number of accesses to external memory and eliminates the power consumed by unnecessary external accesses. A memory management unit (MMU) for each core provides virtual-to-physical address translation. Low-power operating modes conserve power during periods when a processor is idle or lightly used.
The OMAP1510 architecture also contains two external memory interfaces and a single memory port. These three memory interfaces are completely independent of one another and can be accessed at the same time from either core or from a DMA unit. Each processor has its own peripheral interface, which supports both direct connections to peripherals and DMA connections from the processor's DMA unit. On-chip peripherals, including timers, general-purpose I/O, UARTs, watchdog timers and a color LCD controller, support the general requirements of an OS.
The OMAP5910 architecture provides system-on-chip functionality with 192 Kbytes of RAM and peripherals including a USB 1.1 host and client, an MMC/SD card interface, multichannel buffered serial ports, a real-time clock, GPIO, UARTs, an LCD interface, SPI, uWire and I2S. Like the OMAP1510, the OMAP5910 includes a built-in inter-processor communication mechanism that provides a transparent interface to the DSP and makes code development easier.
Designing voice applications for the OMAP platform
Through the OMAP Developer Network, TI is working with a number of leading third-party developers of voice technologies, including ASR, TTS, DSR and speaker verification. Each of these companies has its own strengths in the market and can bring those strengths to OMAP users. TI has also developed speech recognition software in-house that takes full advantage of the OMAP platform's dual-core architecture and is designed specifically for small-vocabulary speech recognition. The TI Embedded Speech Recognizer (TIESR) provides the following functions:
1. Speaker-independent command and control.
2. Speaker-independent continuous digit recognition.
3. Speaker-independent continuous speech recognition.
4. Speaker-dependent name dialing and command and control.
5. Dynamic grammar and vocabulary capability, which can improve noise immunity in applications such as voice browsing.
6. Optional speaker adaptation for enhanced performance.
Voice application examples
InfoPhone, developed by TI specifically for the wireless field, is a typical example of a voice application built on this embedded architecture. It is a voice-enabled Java application that retrieves useful information by voice. TI has developed three prototype voice-based information services for InfoPhone, which provide users with stock quotes, flight information and weather forecasts. Each service has a vocabulary of 50 words. Thanks to the dynamic vocabulary capability, the system can switch seamlessly between vocabularies. The application keeps keypad input active during speech, which provides flexibility when the environment interferes with speech or users need to enter information privately. Figure 3 illustrates the speech recognition architecture in the InfoPhone example.
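The two design points in this example, a separate small vocabulary per service and keypad input that stays active alongside speech, can be modeled in a few lines. The class below is a hypothetical toy, not InfoPhone or TIESR code: `VoiceMenu`, its method names and the sample words are all invented for illustration, and recognized speech is simulated as plain text.

```python
class VoiceMenu:
    """Toy model of per-service vocabularies with keypad fallback (hypothetical)."""

    def __init__(self, vocabularies):
        # Map each service name to its own small set of recognizable words.
        self.vocabularies = {
            service: frozenset(word.lower() for word in words)
            for service, words in vocabularies.items()
        }
        self.active_service = None

    def switch_service(self, service):
        """Make one service's vocabulary the only one spoken input is matched against."""
        if service not in self.vocabularies:
            raise KeyError(f"unknown service: {service}")
        self.active_service = service

    def handle_input(self, text, source="voice"):
        """Return the accepted text, or None if a spoken word is out of vocabulary."""
        if source == "keypad":
            return text  # keypad input is always accepted, even during speech
        if self.active_service is None:
            return None
        word = text.lower()
        return word if word in self.vocabularies[self.active_service] else None


# Placeholder vocabularies; a real service would hold about 50 words each.
menu = VoiceMenu({
    "stocks": ["quote", "Texas Instruments"],
    "weather": ["forecast", "today"],
})
menu.switch_service("stocks")
accepted = menu.handle_input("Quote")              # in the stocks vocabulary
rejected = menu.handle_input("forecast")           # not in the stocks vocabulary
typed_pin = menu.handle_input("1234", "keypad")    # private entry by keypad
```

Keeping only one small vocabulary active at a time limits what the recognizer has to distinguish at any moment, while the keypad path gives users a fallback that does not depend on the acoustic environment.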
Development support
TI's OMAP software and development support services help developers bring voice applications to market quickly. For DSP development, developers can use TI's eXpressDSP™ real-time DSP technology, which includes the DSP/BIOS real-time operating system (RTOS), the Code Composer Studio IDE and TI's algorithm standard, which promotes modular software development. Code Composer Studio for the OMAP platform integrates all host and target tools, including those for the ARM9 RISC core, in a unified environment for easy configuration and optimization. To simplify development further, the built-in inter-processor communication mechanism of the OMAP5910 and OMAP1510 processors is designed to eliminate the need to program the RISC and DSP cores separately, greatly reducing programming time and complexity.
In addition, TI has developed the Innovator Development Kit for the OMAP platform. The kit provides the hardware and key software of a personal system, making it easier to develop voice applications under real-world user conditions.