Author: avrtools™
Date: 2013/09/14
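Arduino Speech/Voice Recognition
DSP and other signal-processing projects built with Arduino and similar boards.
Finally, I would like to present my most impressive project so far, even though it is still far from finished.
The result is really striking for a small MPU with little memory and a slow clock.
On commands of one or two words, the Arduino gives better results than the Windows Vista
voice recognition (VR) system running on 1 GB / 2.2 GHz hardware, which relies on neural networks
or other very popular and scientifically acclaimed sound-domain algorithms.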
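(The rest of the introduction is not essential and is omitted here. In short, the method is a simple 2D cross-correlation.)
Basically, the heart of the voice recognition is similar to an image-matching program.
To build the image (a spectrogram), the Arduino continuously monitors the voice level from the microphone
and starts storing data once the level crosses the VOX threshold. (VOX is a circuit that detects whether voice is present.)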
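The VOX trigger described above can be sketched as a simple level comparison. This is only an illustration, not the code from the sketch: the function names, the DC bias value and the threshold are all assumptions.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: true when the sample deviates from the DC bias
// by more than the VOX threshold (both values are illustrative).
bool voxTriggered(int16_t sample, int16_t dcOffset, int16_t threshold) {
    return std::abs(sample - dcOffset) > threshold;
}

// Returns the index of the first sample that crosses the VOX threshold,
// or -1 if the whole buffer stays below it (nothing worth recording).
int firstVoicedSample(const int16_t* samples, int count,
                      int16_t dcOffset, int16_t threshold) {
    for (int i = 0; i < count; ++i) {
        if (voxTriggered(samples[i], dcOffset, threshold)) return i;
    }
    return -1;
}
```

With a 10-bit ADC biased near mid-scale, a call such as firstVoicedSample(buffer, 64, 512, 50) would tell the recorder where to start storing frames.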

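Once the X-axis array has been filled, the data is passed to the next step, FFT processing.
The FFT and filter stages work like a conveyor belt: a flag is raised when data is ready for a stage,
and lowered when that stage has finished processing.
The correlation/filter stage, which runs once per 64 cycles when the spectrogram image is complete, is slow. The rest of the conveyor runs faster than the ADC-FFT data arrives.
Most of the time is spent in the Edge Enhancement step (high-pass filtering, HPF).
I am still looking for ways to speed up this part, because all real-time processing is put on hold while it runs.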
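As a rough illustration of the correlation stage, here is a zero-lag 2D cross-correlation between a freshly captured spectrogram and a stored reference. The pixel type (int8_t), the flat array layout, and the threshold rule in matches() are assumptions made for this sketch; the original sketch may organize its data differently.

```cpp
#include <cstdint>

constexpr int kBands  = 16;                // frequency bands per frame (16 x 64 image)
constexpr int kFrames = 64;                // time frames per image
constexpr int kPixels = kBands * kFrames;  // 1024 values per spectrogram

// Zero-lag 2D cross-correlation: the sum of element-wise products of two
// images stored as flat arrays. A larger sum means a closer match.
int32_t correlate(const int8_t* image, const int8_t* reference) {
    int32_t sum = 0;
    for (int i = 0; i < kPixels; ++i) {
        sum += static_cast<int32_t>(image[i]) * reference[i];
    }
    return sum;
}

// Hypothetical decision rule: accept the word when the score reaches a threshold.
bool matches(const int8_t* image, const int8_t* reference, int32_t threshold) {
    return correlate(image, reference) >= threshold;
}
```

Because edge enhancement leaves both images with a zero overall sum, unrelated patterns tend to cancel out, while a matching pattern adds up to a large positive score.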

- 4 kHz sampling rate: 2 kHz voice frequency range;
- 64-point FFT subroutine, 62.5 Hz spectral resolution;
- 16 x 64 spectrogram image, around 1 second maximum length of the voice password;
- duration of the cross-correlation < 5 milliseconds;
- duration of the FFT + SQRT + compression < 4 milliseconds;
- duration of the Edge Enhancement (EE) ~ 35 milliseconds.

The main cycle time frame is 16 milliseconds;
it is defined by the sampling period times the FFT size: 0.25 ms x 64 = 16 milliseconds.
A super-cycle of 1.024 seconds (64 main cycles) is needed only because EE prevents all processing from completing in less than 16 milliseconds.
There are resources left to increase the sampling rate to 8 or even 12 kHz;
I just have not had time to test whether that is beneficial.
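The timing figures above follow directly from the sampling rate and the FFT size. The small helpers below (names are mine, not from the sketch) reproduce the arithmetic and make it easy to see what a higher sampling rate would change.

```cpp
// Time needed to collect one FFT input frame, in milliseconds.
constexpr double frameMs(double sampleRateHz, int fftSize) {
    return 1000.0 * fftSize / sampleRateHz;
}

// Width of one FFT bin (spectral resolution), in Hz.
constexpr double binWidthHz(double sampleRateHz, int fftSize) {
    return sampleRateHz / fftSize;
}

// Duration of a whole spectrogram image made of `frames` FFT frames, in milliseconds.
constexpr double imageMs(double sampleRateHz, int fftSize, int frames) {
    return frames * frameMs(sampleRateHz, fftSize);
}
```

For 4 kHz and a 64-point FFT this gives a 16 ms frame, 62.5 Hz resolution, and 1024 ms for 64 frames, which matches the "around 1 second" password length. At 8 kHz the frame would shrink to 8 ms, leaving even less time per cycle for the other stages.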
There is a command line interface built into the software,
which controls the “record” and debug “print” functions. Seven commands for now:

if (incomingByte == 'x') { ... }   // input ADC data
if (incomingByte == 'f') { ... }   // FFT output
if (incomingByte == 's') { ... }   // spectrogram, pre-filtered
if (incomingByte == 'g') { ... }   // spectrogram, post-filtered
if (incomingByte == 'r') { ... }   // record spectrogram to EEPROM
if (incomingByte == 'p') { ... }   // play spectrogram from EEPROM
if (incomingByte == 'm') { ... }   // free memory bytes
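One possible way to organize the same seven commands is a single decode step that maps the incoming byte to a named command. The enum and function below are hypothetical and only illustrate the idea; the original sketch uses the chain of if statements listed above.

```cpp
// Hypothetical names for the seven serial commands listed above.
enum class Command {
    DumpAdc,                  // 'x': input ADC data
    DumpFft,                  // 'f': FFT output
    DumpSpectrogramRaw,       // 's': spectrogram before filtering
    DumpSpectrogramFiltered,  // 'g': spectrogram after filtering
    RecordToEeprom,           // 'r': record spectrogram to EEPROM
    PlayFromEeprom,           // 'p': play spectrogram from EEPROM
    FreeMemory,               // 'm': report free memory bytes
    Unknown                   // any other byte is ignored
};

// Maps one byte received over the serial link to a command.
Command decodeCommand(char incomingByte) {
    switch (incomingByte) {
        case 'x': return Command::DumpAdc;
        case 'f': return Command::DumpFft;
        case 's': return Command::DumpSpectrogramRaw;
        case 'g': return Command::DumpSpectrogramFiltered;
        case 'r': return Command::RecordToEeprom;
        case 'p': return Command::PlayFromEeprom;
        case 'm': return Command::FreeMemory;
        default:  return Command::Unknown;
    }
}
```

On the board, the byte would typically come from Serial.read() after checking Serial.available().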

The software is written for the ATmega328P microcontroller (Arduino Uno or a similar board).
For other microcontrollers, all referenced registers have to be replaced with the appropriate names.
It compiles on the 022 IDE; there are some conflicts with the 1.0 IDE
that I have not yet tried to troubleshoot.
For a better understanding of the math background, have a look at my previous posts.
Link to download the sketch:
Voice_Recognition_24_01 : Voice_Recognition_24_01.pde (14.4KB)
The analog front end is the same one I used in my first project, the Color Organ.
There is not much that could be improved in this part, and I again used both inputs:
the microphone, for tests with my own voice, and the “line” input,
for single-tone tests generated by the computer during debugging.
The next picture shows the “s” command print-out in the serial monitor window
after I pronounced the word “Spectrogram”.
Due to the limited size of the window, the data is printed rotated by 90 degrees:
left to right is the frequency-band direction, and top to bottom is time.
Lower frequencies (60 Hz) are on the left side and higher ones (2 kHz) on the right.
The 3D images generated at the same time are shown at the correct viewing angle.
This is how the spectrogram looks after the “g” command is entered in the serial monitor
and the word is spoken right after that:

The next couple of images were created with a single-tone frequency (320 Hz),
just to show the “internal properties” of the filtering more clearly;
again the “s” and “g” commands were entered:
Because the tone sounds continuously, it shows filtering in one direction only,
so it is not the best tutorial on edge-enhancement theory (“home brew” lab limits).
At the same time, the last picture shows that each peak in the original spectrogram
becomes surrounded by smaller negative peaks, resulting in an overall sum of “0” over the 3x3 footprint,
and consequently over the whole map.
In electronics this goes under the name HPF (high-pass filter): the essence of the process is to remove the DC component
and attenuate low frequencies.
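The zero-sum behaviour described above is what a 3x3 high-pass kernel produces. The sketch below uses a common edge-enhancement kernel (+8 in the centre, -1 for the eight neighbours); the exact coefficients in the original sketch are not given here, so treat the kernel, the function name, and the border handling as assumptions.

```cpp
#include <cstdint>

// Applies a 3x3 zero-sum kernel (+8 centre, -1 for each of the 8 neighbours)
// to a w x h image stored row by row. Border pixels are left at zero.
void edgeEnhance(const int16_t* in, int16_t* out, int w, int h) {
    for (int i = 0; i < w * h; ++i) out[i] = 0;
    for (int y = 1; y < h - 1; ++y) {
        for (int x = 1; x < w - 1; ++x) {
            int32_t acc = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int k = (dx == 0 && dy == 0) ? 8 : -1;
                    acc += k * in[(y + dy) * w + (x + dx)];
                }
            }
            out[y * w + x] = static_cast<int16_t>(acc);
        }
    }
}
```

A single peak of height 10 in the middle of a 5 x 5 image becomes 80 at the centre and -10 at each of the eight neighbours, so the filtered map sums to zero: the DC component is gone.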
Excellent online book: http://www.dspguide.com/ch24/1.htm
RADIX-4 FFT (integer math).
Reference: http://coolarduino.wordpress.com/2012/03/24/radix-4-fft-integer-math/

Tweaking the FFT code that I published earlier in my series of blog posts, I hit a “stone wall”.
There is nothing left to improve in the “musical note recognition” version of the code
in order to make it faster.
At least, nothing without switching completely to assembly language, which I am trying to avoid for now.
I'm sure it's the fastest C algorithm. Looking around, it did not take long to find another option: replace the RADIX-2 algorithm with a higher-order radix (4 or 8) or a split-radix approach.
Putting split-radix aside (could it be my next adventure?), RADIX-4 looks promising, with theoretically a 1/4 reduction in the number of multiplications (which I believe is the “Achilles heel”).
After googling for a while, I could not find a fixed-point version in plain C or C++.
There is TI's “Autoscaling Radix-4 FFT for TMS320C6000™” application report, which I find useful,
but the problem is that it is bound to the hardware multiplier of TI microprocessors,
and any attempt to rewrite the code would probably make its performance even worse than RADIX-2.
Having “tweaking” experience with fix_fft source code from: http://www.jjj.de/
I decided to follow the same path as before, when I adapted fix_fft for Arduino: take their floating-point source, break it into pieces, and then put all the parts back together as fixed-point or integer math components.
And you know what? Thank God, I succeeded!
I decided not to re-assemble all of the parts,
which is why fft_size has to be a power of 4 (16, 64, 256, 1024, etc.).
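Since the size has to be a power of 4, a small guard can catch an invalid fft_size early. The helpers below are a hypothetical addition, not part of the published sketch.

```cpp
#include <cstdint>

// True when n is a power of 4: exactly one bit set, and that bit sits at an
// even position (bit 0, 2, 4, ...), which the 0x55555555 mask selects.
bool isPowerOfFour(uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0 && (n & 0x55555555u) != 0;
}

// Number of radix-4 stages needed for a valid size, i.e. log4(n).
int radix4Stages(uint32_t n) {
    int stages = 0;
    while (n > 1) {
        n >>= 2;   // each radix-4 stage divides the problem size by 4
        ++stages;
    }
    return stages;
}
```

For example, a 256-point transform needs 4 radix-4 stages, where a radix-2 version needs 8.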
Next, the software is “adjustable” for different levels of optimization.
The trade-off is always the same: accuracy against speed. I'd highlight 3 levels at this point:

1. No optimization, all math operations 15-bit.
The slowest version. Not tested at all.

2. Compromise version. Switches:
12-bit sine table,
regular multiplication (long) right-shifted >>12,
half-scaling in sum_dif_I (RSL) >>1.
Recorded measurement: 24 milliseconds with N = 256 fft_size.

3. Maximum optimization. Switches:
8-bit sine table,
macro assembler multiplication shortcut,
no scaling in the core.
Timing: 10.1 milliseconds!
Fastest. The best FFT code ever written for an 8-bit microprocessor.
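To make the “regular multiplication (long) right shifted >>12” switch of the compromise version concrete, here is a minimal fixed-point multiply. It assumes the 12-bit sine table stores 1.0 as 4096; the function name and that scaling convention are my assumptions, not taken from the sketch.

```cpp
#include <cstdint>

// Multiplies a sample by a sine-table entry stored with 12 fractional bits
// (assumed scaling: 4096 represents 1.0). The product is formed in 32 bits
// and shifted back down so the result fits 16-bit integer math again.
int16_t mulQ12(int16_t sample, int16_t sineQ12) {
    const int32_t product = static_cast<int32_t>(sample) * sineQ12;
    return static_cast<int16_t>(product >> 12);
}
```

The 32-bit intermediate is what makes this version slower on an 8-bit AVR, which is why the maximum-optimization level trades the table down to 8 bits and uses an assembler shortcut instead.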
Spectrum_Analyzer_RADIX_4_FFT_v3.ino : Spectrum_Analyzer_RADIX_4_FFT_v3.ino (16KB)
Here is a slightly modified copy,
in which I moved the sine table from RAM to flash memory using the PROGMEM utility.
For anyone curious about
how much slower PROGMEM access is compared to reading data from RAM, here is the answer:
10.16 milliseconds becomes 10.28, i.e. 120 usec slower.
Divided by 84 x 6 = 504 reads, each PROGMEM access costs about 0.24 usec, which is about 4 CPU cycles.
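The per-read cost can be checked with a few lines of arithmetic. The two run times and the read count come from the measurement above; the 16 MHz clock is an assumption (the stock Arduino Uno frequency).

```cpp
constexpr double kRamRunUs     = 10160.0;  // FFT run time with the sine table in RAM
constexpr double kProgmemRunUs = 10280.0;  // FFT run time with the sine table in flash
constexpr int    kTableReads   = 84 * 6;   // sine-table reads per transform
constexpr double kCpuMHz       = 16.0;     // assumption: stock Arduino Uno clock

// Extra time spent per table read when the table lives in flash, in microseconds.
constexpr double extraUsPerRead() {
    return (kProgmemRunUs - kRamRunUs) / kTableReads;
}

// The same cost expressed in CPU clock cycles.
constexpr double extraCyclesPerRead() {
    return extraUsPerRead() * kCpuMHz;
}
```

This gives roughly 0.24 usec, or a little under 4 cycles, per read.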

Spectrum_Analyzer_RADIX_4_FFT_v3pm.ino : Spectrum_Analyzer_RADIX_4_FFT_v3pm.ino (16.2KB)
Screenshot of the running application: a signal generator running on the computer
feeds an audio wave to the op-amp (OPA) and then to analog input 0.
Look for the hardware setup on the “Color Organ” blog post.

Link to the first version, based on RADIX-2 FFT: Arduino Color Organ
By the way, there is one more important thing
that I failed to emphasize in my short introductory paragraph: the code offers FLEXIBILITY over SNR.
The basic FFT algorithm has an intrinsic, built-in GAIN: G(in) = FFT_SIZE / 2, where (in) stands for intrinsic.
That is a perfect value for fft_size = 64 (Gain = 64 / 2 = 32)
and the Arduino (Atmel ATmega328) 10-bit ADC (max value = 1023).
The FFT output would be 32 x 1023 = 32736, which just fits in 15 bits + sign.
In other words, scaling in the algorithm core is not required at all!
That alone improves speed and lowers rounding noise significantly.
At the same time, G(in) grows too high with FFT_SIZE = 256,
when G = 256 / 2 = 128 and the FFT output would overflow 16-bit integer math.
But again, scaling doesn't have to be 100%, as long as there is a way to keep it in balance with the ADC data.

In this particular case, with a 10-bit ADC, we can keep the gain at or just below 32;
it's not necessary to make it exactly “1”.
For a 12-bit ADC the upper G limit would be 8, still not “1”.
To manipulate the gain, a division by 2 (>> 1) can be enabled in “sum_dif_I”
to prevent overflow with fft_size > 64. This right-shift “gain limiter” makes the gain grow only with the square root of the size,
according to a new formula: G(rsl) = SQRT(FFT_SIZE) / 4, where (rsl) stands for right-shift limiter.
1. G = 1 for fft_size = 16,
2. G = 2 for fft_size = 64,
3. G = 4 for fft_size = 256,
4. G = 8 for fft_size = 1024.
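The overflow reasoning above can be verified numerically. The helper names below are illustrative, and limitedGain() assumes fft_size is a power of 4 so that the square root is an exact integer.

```cpp
#include <cstdint>

// Intrinsic FFT gain without any scaling in the core: G(in) = FFT_SIZE / 2.
constexpr int intrinsicGain(int fftSize) { return fftSize / 2; }

// Largest possible FFT output for a full-scale ADC reading and a given gain.
constexpr int32_t peakOutput(int adcBits, int gain) {
    return ((static_cast<int32_t>(1) << adcBits) - 1) * gain;
}

// True when the value fits into a signed 16-bit integer (15 bits + sign).
constexpr bool fitsInt16(int32_t value) { return value <= 32767; }

// Gain with the right-shift limiter enabled: G(rsl) = sqrt(FFT_SIZE) / 4.
// Assumes fftSize is a power of 4, so the square root is a power of 2.
int limitedGain(int fftSize) {
    int root = 1;
    while (root * root < fftSize) root <<= 1;
    return root / 4;
}
```

With a 10-bit ADC, peakOutput(10, intrinsicGain(64)) is 32736 and still fits, while peakOutput(10, intrinsicGain(256)) is 130944 and does not. With the limiter, fft_size = 256 gives a gain of only 4.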

Summing up:
when using RADIX-4 with the Arduino ADC and FFT_SIZE <= 64,
keep the division by 2 (>> 1) in “sum_dif_I” commented out.
In any other circumstances (an external ADC with more than 10 bits, or fft_size > 64), uncomment it.
Color Organ / Spectrum Analyzer, Arduino project with FFT algorithm.

The basic idea was to create a color organ / spectrum analyzer on an Arduino board
while minimizing the number of external components such as analog filters, LED display drivers, etc.
After spending a lot of time searching the internet, I was able to find only two (!) projects
that implemented an FFT to solve this problem.

in a few days it will celebrate 6-th anniversary. 
The obstacle, at least for me, was compiling / adapting his software for the Arduino IDE platform,
as it is written in assembly and C.
So I moved on, and was lucky to discover an excellent chunk of code dating back to 1989!
They didn't have floating-point co-processors or “Deep Blue” around at that time,
so mathematical skills were in high demand.

The FFT algorithm could find application in a wide variety of projects,
for example musical note recognition, voice recognition, sound localization, etc.
It could be done on the Arduino alone, or in combination with a PC.
In this project all functionality is implemented on the Arduino.
Sampling, FFT processing and visualization of music are all done by a single Arduino Uno board!

At the same time, after each stage the data can be extracted over the serial link to a PC.
To use the data provided by the Arduino in a different application on the PC (like interactivity / processing),
you can just pull the data over the serial link, as was done for debugging purposes with the “f” command.

After FFT processing of the input data array x, an output array fx is generated
with 32 elements (“bins”), each representing a range of frequencies.
The width of a bin is D = 1 / T, where T is the time taken to sample the input array,
in our case T = 14.6 milliseconds: D = 1 / (14.6 * 10^-3) = ~ 70 Hz.

So, the value of fx[0] is the amplitude from the DC offset up to 35 Hz;
fx[1] is the amplitude in the range 35 <-> 105 Hz;
fx[2] is the amplitude in the range 105 <-> 175 Hz;
fx[3] is the amplitude in the range 175 <-> 245 Hz;
fx[4] is the amplitude in the range 245 <-> 315 Hz;
fx[31] is the amplitude in the range 2135 <-> 2170 Hz;
Upper limits could be extended up to 76 kHz.(*)

Summing up the first 10 fx bins (fx[1] to fx[10]) gives me the 35 <-> 735 Hz frequency range for the red LEDs,
bins 11 to 20 give 735 <-> 1435 Hz for the green LEDs,
and bins 21 to 31 give the 1435 <-> 2170 Hz range for the blue LEDs.
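The band split described above can be sketched as three sums over the fx array. The array element type (uint16_t), the struct, and the function names are assumptions for illustration; the bin width uses the rounded 70 Hz figure from the text.

```cpp
#include <cstdint>

constexpr int kBinWidthHz = 70;  // rounded bin width, D = 1 / T with T = 14.6 ms

// Lower edge of bin k in Hz; bin 0 starts at DC.
constexpr int binLowEdgeHz(int k) {
    return k == 0 ? 0 : k * kBinWidthHz - kBinWidthHz / 2;
}

// Sums the amplitudes of bins first..last (inclusive).
uint32_t sumBins(const uint16_t* fx, int first, int last) {
    uint32_t sum = 0;
    for (int k = first; k <= last; ++k) sum += fx[k];
    return sum;
}

struct BandLevels {
    uint32_t red;    // fx[1]..fx[10],  35 Hz and up
    uint32_t green;  // fx[11]..fx[20], 735 Hz and up
    uint32_t blue;   // fx[21]..fx[31], 1435 Hz and up
};

// Splits a 32-bin spectrum into the three colour bands; fx[0] (DC) is skipped.
BandLevels colorBands(const uint16_t* fx) {
    return { sumBins(fx, 1, 10), sumBins(fx, 11, 20), sumBins(fx, 21, 31) };
}
```

Skipping fx[0] keeps the DC offset of the input stage out of the red channel.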

Hardware part.

I considered two ways to build the display: PWM or a bar graph. In my opinion,
PWM is not well suited to LEDs, due to the nonlinearity of their current-brightness response.
PWM would probably be OK with incandescent lights.
The bar-graph design gives a better impression, as a higher sound volume lights up a bigger area.
Just imagine how beautiful fireworks are!
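A bar-graph display of this kind boils down to mapping a band level to a number of lit LEDs. The linear mapping, the full-scale value, and the function names below are assumptions for illustration; the original sketch may use different thresholds.

```cpp
#include <cstdint>

// Number of LEDs to light for a given level: linear between 0 and fullScale,
// clipped to ledCount. Assumes level * ledCount does not overflow 32 bits.
int ledsLit(int32_t level, int32_t fullScale, int ledCount) {
    if (level <= 0 || fullScale <= 0) return 0;
    const int32_t lit = level * ledCount / fullScale;
    return lit > ledCount ? ledCount : static_cast<int>(lit);
}

// Bit mask with the lowest `lit` bits set, one bit per LED in the bar.
uint8_t barMask(int lit) {
    return static_cast<uint8_t>((1u << lit) - 1u);
}
```

With 4 LEDs per colour, a half-scale level lights 2 of them (mask 0x03), and anything at or above full scale lights all 4 (mask 0x0F).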

I used Christmas LEDs left over from the last holiday.
This is why I put 3 of them in each string, and have to use a ULN2003 with a 12 V power source.
Basically, all you need is 12 LEDs (4 red, 4 green and 4 blue) and 12 resistors:
connect each LED + resistor between an output of the Arduino board and ground.
For sound input I used an MK136 kit (2 mics + NE5532 amplifier IC), powered from the Arduino board's +5 V.
Why a kit? It provides a board and the components.
Also, you can easily reconfigure the input circuitry for different sound sources:
1. signal from a 3.5 mm jack connector;
2. sound picked up by a microphone;
3. one channel with a mic and the other with a 3.5 mm jack, which is what I did.
Plus it has a pot to adjust sensitivity on the fly. Don't install the DC blocking capacitors at the outputs.

Sparkfun's breakout board for an electret microphone will work too:
Here is how to create a DC offset with just a couple of resistors + a cap.
The circuit requires a line-level input signal (~1 V). You can use a headphone output.

Link to download sketch: Music_color : music_color.pde(15.1KB)

Sine Tables: Sine_Tables : SineTables.txt(3.5KB)
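That's it for now; I will come back to answer your questions if you have any.
This program is free software and comes with no warranty against any risk or damage to person or property.
This program is distributed under the GNU free software distribution terms.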
Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA 
