Innovative technology provides a practical system for voiceprint identification

The project “Voiceprint Identification and Automatic Recognition Technology Research” was completed by the Ministry of Public Security Material Evidence Identification Center and other units. Its main achievement is the integration of automatic voiceprint recognition into the VS99 voice workstation. The system can automatically analyze and judge speaker characteristics and display and measure spectrograms. Combined with expert appraisal, it determines the identity of the speaker and is suited to practical forensic work. The project has produced a practical speech workstation that combines the sound spectrograph used in current voiceprint appraisal work with an automatic speaker identification system, which greatly improves the accuracy of conclusions and provides a practical system for voiceprint identification.

◆ Innovative Technology:

1. Anti-noise processing

The effect of noise on test results cannot be ignored. For non-stationary noise, the researchers proposed an SS method that combines an HMM taking even-numbered-frame segment features as input with smoothing along the time direction, in order to improve the robustness of the Chinese continuous speech recognition system in noisy environments. The method obtains better recognition results.
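
The source does not say how the smoothing along the time direction is carried out. As a minimal sketch, assuming a plain moving average over neighbouring frames (the function name `smooth_over_time` and the `half_width` parameter are hypothetical, not taken from the project), the idea can be illustrated as follows:

```python
def smooth_over_time(features, half_width=1):
    """Average each feature dimension over a window of neighbouring frames.

    `features` is a list of frames; each frame is a list of feature values.
    Assumption: a simple moving average stands in for the unspecified
    smoothing used in the project.
    """
    smoothed = []
    for t in range(len(features)):
        # Clip the window at the start and end of the utterance.
        lo = max(0, t - half_width)
        hi = min(len(features), t + half_width + 1)
        window = features[lo:hi]
        # zip(*window) groups the values of one feature dimension across frames.
        smoothed.append([sum(column) / len(window) for column in zip(*window)])
    return smoothed
```

A frame whose features jump because of a short noise burst is pulled back toward its neighbours, while slowly varying features are left nearly unchanged.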

2. Voice endpoint detection

Endpoint detection helps avoid malfunctions and false recognition caused by noise. Accurately detecting the start of the speech signal is of great significance for improving the accuracy of the recognition system. The traditional voice endpoint detector (SAD) can easily miss voice activity. In addition, large interference signals may be mistaken for voice activity, causing false detections. To overcome these shortcomings, the researchers used a correlation-based speech activation detector: they defined an effective correlation function, found a method for determining the decision threshold, and devised methods to prevent missed and false detections.
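
The project's correlation function and threshold rule are not given in the source. The sketch below only illustrates the general idea, assuming the peak of the normalized autocorrelation as the correlation measure and a fixed threshold of 0.5. All function names, the frame length, the lag range, and the threshold are hypothetical choices.

```python
import math
import random

def max_normalized_autocorr(frame, min_lag, max_lag):
    """Peak of the autocorrelation, divided by frame energy, over a lag range."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, num / energy)
    return best

def detect_speech_frames(signal, frame_len=200, min_lag=20, max_lag=100, threshold=0.5):
    """Flag each non-overlapping frame whose correlation peak exceeds the threshold."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        flags.append(max_normalized_autocorr(frame, min_lag, max_lag) > threshold)
    return flags

def find_endpoints(flags):
    """Return (first, last) index of the flagged frames, or None if none is flagged."""
    active = [i for i, flag in enumerate(flags) if flag]
    return (active[0], active[-1]) if active else None

def make_demo_signal(seed=0, frame_len=200):
    """Synthetic test signal: quiet noise, a loud noise burst, a periodic tone, quiet noise."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 0.1) for _ in range(2 * frame_len)]
    signal += [rng.gauss(0.0, 5.0) for _ in range(frame_len)]  # loud interference
    signal += [math.sin(2 * math.pi * n / 40) + rng.gauss(0.0, 0.05)
               for n in range(3 * frame_len)]                  # stand-in for voiced speech
    signal += [rng.gauss(0.0, 0.1) for _ in range(2 * frame_len)]
    return signal
```

Because the measure is normalized by frame energy, the loud burst in the demo signal is not flagged even though an energy-only detector would fire on it. Only the periodic segment passes the threshold.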

3. Identification algorithm

The system uses an optimization algorithm based on the Gaussian mixture model (GMM).

(1) Improved GMM model training method

In experiments, the EM algorithm was found to have a significant defect: it can produce singular matrices. Maximum likelihood (ML) estimation, although its recognition rate is relatively low, does not produce singular matrices. The researchers therefore used the ML-estimated model as the initial model and then applied the EM algorithm, scaling the correction made at each step by a factor α. They called this the improved EM algorithm.
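
The exact update rule is not given in the source. One plausible reading, sketched here for a one-dimensional GMM, is that each parameter moves only a fraction α of the way from its current value to the standard EM estimate. The function names, the caller-supplied initial model (standing in for the ML initial model), and the variance floor are assumptions made for illustration.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One standard EM update for a 1-D GMM; returns new (weights, means, variances)."""
    k = len(weights)
    resp, sum_x, sum_xx = [0.0] * k, [0.0] * k, [0.0] * k
    for x in data:
        probs = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(probs) or 1e-300
        for j in range(k):
            r = probs[j] / total          # responsibility of component j for x
            resp[j] += r
            sum_x[j] += r * x
            sum_xx[j] += r * x * x
    resp = [max(r, 1e-12) for r in resp]  # avoid division by zero for empty components
    new_w = [r / len(data) for r in resp]
    new_m = [sum_x[j] / resp[j] for j in range(k)]
    new_v = [sum_xx[j] / resp[j] - new_m[j] ** 2 for j in range(k)]
    return new_w, new_m, new_v

def damped_em(data, weights, means, variances, alpha=0.5, iterations=60, var_floor=1e-6):
    """EM in which each step is scaled by alpha (assumed form of the correction)."""
    for _ in range(iterations):
        em_w, em_m, em_v = em_step(data, weights, means, variances)
        weights = [w + alpha * (e - w) for w, e in zip(weights, em_w)]
        means = [m + alpha * (e - m) for m, e in zip(means, em_m)]
        # The variance floor is an extra safeguard, not something the source describes.
        variances = [max(v + alpha * (e - v), var_floor) for v, e in zip(variances, em_v)]
    return weights, means, variances

def avg_log_likelihood(data, weights, means, variances):
    return sum(math.log(sum(w * gaussian_pdf(x, m, v)
                            for w, m, v in zip(weights, means, variances)) + 1e-300)
               for x in data) / len(data)

def sample_two_cluster_data(seed=1, n=200):
    """Synthetic data: two clusters centred at -2 and 3."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 0.5) for _ in range(n)] + [rng.gauss(3.0, 1.0) for _ in range(n)]
```

With α = 1 this reduces to ordinary EM. Smaller values take more cautious steps, and for α between 0 and 1 the mixture weights still sum to one because each update is a convex combination of two valid weight vectors.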

(2) GMM model optimization algorithm based on genetic algorithm

The researchers improved the traditional genetic algorithm and applied it to the optimization of GMM parameters, which greatly improved the quality of the optimized model.
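
The source does not describe how the genetic algorithm was improved, so the sketch below shows only a conventional one, applied to a two-component, one-dimensional GMM. The chromosome encoding, tournament selection, uniform crossover, Gaussian mutation, elitism, and all names and parameter values are illustrative assumptions.

```python
import math
import random

def log_likelihood(chrom, data):
    """Log-likelihood of data under the 2-component GMM encoded by chrom.

    chrom = [mean1, mean2, log_sd1, log_sd2, logit_weight1] (assumed encoding).
    """
    m1, m2, log_s1, log_s2, logit_w = chrom
    # Clamp encoded values so exp() stays in a safe numeric range.
    s1 = math.exp(max(-5.0, min(5.0, log_s1)))
    s2 = math.exp(max(-5.0, min(5.0, log_s2)))
    w = 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, logit_w))))
    norm = math.sqrt(2.0 * math.pi)
    total = 0.0
    for x in data:
        p1 = w * math.exp(-0.5 * ((x - m1) / s1) ** 2) / (s1 * norm)
        p2 = (1.0 - w) * math.exp(-0.5 * ((x - m2) / s2) ** 2) / (s2 * norm)
        total += math.log(p1 + p2 + 1e-300)
    return total

def evolve(data, pop_size=30, generations=40, mutation_sd=0.2, seed=0):
    """Run a plain elitist GA; returns (best chromosome, best fitness per generation)."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    population = [[rng.uniform(lo, hi), rng.uniform(lo, hi),
                   rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((log_likelihood(c, data), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        next_pop = [scored[0][1]]  # elitism: the best individual survives unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of three random individuals becomes a parent.
            mother = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            father = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            # Uniform crossover followed by Gaussian mutation of every gene.
            child = [rng.choice(genes) + rng.gauss(0.0, mutation_sd)
                     for genes in zip(mother, father)]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=lambda c: log_likelihood(c, data))
    return best, history

def demo_data(seed=1, n=60):
    """Synthetic data: two clusters centred at -2 and 3."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 0.5) for _ in range(n)] + [rng.gauss(3.0, 1.0) for _ in range(n)]
```

Because the best individual is always carried over, the best fitness recorded in `history` never decreases from one generation to the next.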

(3) Optimization of speaker recognition method for GMM

The researchers proposed a new optimized GMM-based speaker recognition scheme. The likelihood of each frame of an utterance under a given model is first transformed in a specific way, and the transformed values are then combined into the total likelihood of the syllable. This total score of the syllable for the model is denoted Sc. The speaker whose model yields the largest Sc is taken as the target speaker.
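
The decision rule can be sketched as follows. The source does not say what change is applied to each frame likelihood, so a floored logarithm is used here purely as a placeholder. The one-dimensional features, the model format, and all function names are likewise assumptions for illustration.

```python
import math

def gmm_frame_likelihood(x, model):
    """Likelihood of one 1-D frame feature under a GMM.

    `model` is a list of (weight, mean, variance) triples (assumed format).
    """
    return sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
               for w, m, v in model)

def floored_log(likelihood, floor=1e-10):
    """Placeholder per-frame transform: log-likelihood with a lower bound."""
    return math.log(max(likelihood, floor))

def total_score(frames, model, transform=floored_log):
    """Sc: sum of the transformed frame likelihoods for one speaker model."""
    return sum(transform(gmm_frame_likelihood(x, model)) for x in frames)

def identify_speaker(frames, models, transform=floored_log):
    """Return the speaker whose model gives the largest Sc, plus all scores."""
    scores = {name: total_score(frames, model, transform)
              for name, model in models.items()}
    return max(scores, key=scores.get), scores
```

For example, with `models = {"speaker_a": [(0.5, -1.0, 0.5), (0.5, 1.0, 0.5)], "speaker_b": [(0.5, 4.0, 0.5), (0.5, 6.0, 0.5)]}`, a set of frames clustered around -1 and 1 is attributed to `speaker_a`. A practical system would score multi-dimensional feature vectors rather than single values.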

◆ Social Benefits:

At present, the VS99 voice workstation, a research achievement of the national “Ninth Five-Year Plan” completed by the Material Evidence Identification Center of the Ministry of Public Security, has been widely adopted in China and has played an important role in actual case handling. This project builds on VS99 by adding an automatic identification function, further improving the efficiency of case handling and the accuracy of identification.

The automatic voiceprint identification system developed by this project has fully independent intellectual property rights and strong practicability, and it is well suited to the actual needs of public security work. It can screen a large number of suspects during an investigation, which helps provide investigative direction, narrow the scope of the investigation, and improve work efficiency. The system also displays speech spectrograms in real time, which makes it suitable for speech signal acquisition in mobile technology. Since 2002, 200 cases have been tested and identified in practice, including criminal, economic, civil, and public security cases. According to case feedback and court trial results, the positive judgment rate was 100%.