
Volume 2, Issue 10, October 2017 International Journal of Innovative Science and Research Technology

ISSN No:-2456 2165

Intelligence Hands-Free Speech Based System on Android

Shaikh Shaheda Faiyaz Khan Arshiya Rakeeb


Department of Computer Engineering Department of Computer Engineering
Anjuman-I-Islam's Kalsekar Technical Campus Anjuman-I-Islam's Kalsekar Technical Campus
New Panvel, India New Panvel, India
shaheedashaikh23@gmail.com arshiyak307@gmail.com

Inamdar Mohsin Harun Prof. Ansari Mukhtar Amir


Department of Computer Engineering Department of Computer Engineering
Anjuman-I-Islam's Kalsekar Technical Campus Anjuman-I-Islam's Kalsekar Technical Campus
New Panvel, India New Panvel, India
inamdarm87@gmail.com mukhtar.amir44@gmail.com

Abstract: SMS and texting are important features of the mobile phone. Mobile phone usage is spreading rapidly across the world, and phones have gained many features through new techniques and developers. This paper describes an application built on Google libraries and APIs for Text-To-Speech and Speech-To-Text conversion. It also supports searching contacts by reading alphabetic and numeric input. The main goal of the project is to serve people who are not in a position to use a mobile phone for texting, surfing the web, dialing calls and similar communication features, so we regard it as an application that is useful for society. In other words, messaging can be based entirely on speech recognition. The application converts text into speech and speech into text, and lets the user pick contacts manually from the contact list or by speaking the name of a person. Multiple contacts can be selected to send a message to several people at a time. Previous speech recognition systems were difficult to use and had many drawbacks; with advances in technologies and techniques it is now possible to build the desired speech recognition system. The system relies on the Hidden Markov Model (HMM) algorithm, which makes it possible to obtain the desired output. The other technologies used in this paper are the Android system and speech recognition (SR) libraries, i.e. speech APIs.

Keywords: Speech-To-Text, Text-To-Speech Converter (Both Side), Contacts Selection with Numeric and Alphabets

I. INTRODUCTION

Nowadays Android tries to make applications more attractive for every category of people and to cover everyone in our society. Likewise, Android tries to improve its speech recognition system for the comfort of people who are physically disabled or who have little knowledge of the language, and to prevent accidents. In this application the user can access the services of the smartphone with SR (speech recognition) commands. The application is also designed to make conversation possible in a very short time, since speech can be processed faster than text. A sender can choose a recipient from the contact list or by speaking, in which case the contact is selected automatically from the user's list. The application offers several services: speech to text, text to speech, and selection of contacts by number, manually, or by the name of the recipient. Speech is a natural way of communicating; voice-based conversations are clear and easy to understand. In a text message system there may be misunderstandings between people because accents are hidden from the receiver. The sender should speak clearly so that the system can understand the speech. The system uses a different HMM for every word of a sentence, and many HMMs are used to make a conversation possible. The HMMs keep the states of each word distinct from one another so that the system can form a correct sentence.

The application we are going to build will use SR with the Google server, which uses the HMM model. The system works as follows. Initially, speech is taken as input and recorded by the microphone. When the user speaks, the sound fluctuates in the form of signals, and the fluctuation of the signal depends on the quality of the user's voice. The input speech is divided into sets of words, which are held in different sets

IJISRT17OC201 www.ijisrt.com 606



of frames. The recorded input is a fluctuating set of signals. The system then processes these words and converts them into the desired text. Speech is recognized using several components: feature extraction, an acoustic model, a dictionary, a speech recognition algorithm and a language model. The input speech is first converted into digital signals and divided into small intervals. These digital signals are then processed by the recognition algorithm.
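The division of the digitized speech into small intervals can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_into_frames`, the 16 kHz sample rate, and the 25 ms frame length with a 10 ms step are assumptions chosen because they are common in speech processing.

```python
def split_into_frames(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of equal length."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    # Only keep complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


# Assumed setup: 16 kHz audio, 25 ms frames (400 samples) every 10 ms (160 samples).
signal = [0.0] * 16000  # one second of silence standing in for recorded speech
frames = split_into_frames(signal, frame_len=400, hop=160)
```

Each frame can then be passed on to feature extraction independently of the others.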

II. LITERATURE SURVEY


III. OVERVIEW OF SYSTEM

For visually impaired people it is not easy to locate and operate a particular speech icon, so making this kind of application accessible to them is a challenge. The technologies and algorithms used for this application are: HMM (Hidden Markov Model), MFCC (Mel Frequency Cepstrum Coefficient), Android, the forward algorithm, the SMS manager class, JavaScript, an N-gram database and Artificial Intelligence. The process by which the system understands speech and turns it into text is popularly called speech recognition (SR). The different types of speech are as follows:
Connected Words: Separate utterances spoken together with a minimum pause between them are the input requirement of this system.
Continuous Speech: The speaker dictates to the computer in a natural flow. These are the most difficult recognizers to create.
Spontaneous Speech: The speaker's natural speech acts as the input for the system. It needs a careful speaker, otherwise it generates excessive errors.

A. Existing System

Speech recognition brings a tremendous change to classic keyboard input, making the manipulation of text easier than with the classic method. This application uses the Google API, which uses the hidden Markov model (HMM) method to recognize speech. In this application the speech is recorded, the user selects contacts from their contact list, and the recognized text is sent as an SMS to the specific person. The first software, developed in 1994, was dictation software based on discrete speech. Discrete speech works slowly and is not a natural means of communication, because it needs a pause after every spoken word. The second speech-based software was developed by IBM and was based on continuous speech. The continuous-speech system was very flexible and allowed natural conversation, but it was too expensive and needed costly PCs.

B. Proposed System

The application will use SR with the Google server, which uses the HMM method. Speech is recognized as follows. Initially, speech is input and the sound fluctuates, which can be represented by a set of signals. The signals generated depend on the quality of the sound: if the sound quality is high, the signal level is high. The speech is recorded by a recorder. After recording is done, the speech is divided into a set of frames or words, and every word and phrase is handled independently. Additional sounds that come with the speech are filtered with the MFCC model so that the system can understand it easily. Background voices and low-quality audio should all be filtered out before conversion into the desired text. The algorithm then converts the speech to text at the sender's side, and the converted text is sent to the receivers.

IV. SYSTEM ARCHITECTURE

A. Working of the System

When the application starts, it asks the user for a contact number. The user can provide the contact number in two ways: manually or by voice. Below the contact field there is a message field for the message to be sent to the receiver. The sender speaks a message, which is converted into text at the sender's side.

The message input is converted from speech to text, and vice versa, by the message converter. HMM is the most successful and most flexible approach to speech recognition; here it recognizes the speech that is then sent as an SMS.

An HMM relies on the Markov assumption: the next state depends only on the current state, not on the earlier ones. The HMM method is used for recognition of speech, i.e. it converts speech into text. It is a flexible and efficient method if used properly. It keeps the states of the model distinct from each other so that they form the desired pattern. MFCC is used for feature extraction. It filters the speech so that the system can understand it.
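Voice-based contact selection can be illustrated with a phonetic match between the recognized name and the contact list. The paper mentions a SoundX algorithm among its advantages but does not specify it, so the sketch below assumes standard American Soundex. The helper `match_contacts` and the sample contact data are hypothetical.

```python
# Soundex digit for each consonant; vowels, h, w and y carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character American Soundex code of a name."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


def match_contacts(spoken_name, contacts):
    """Hypothetical helper: contact names that sound like the spoken name."""
    target = soundex(spoken_name)
    return [name for name in contacts if soundex(name) == target]


# Made-up contact list for illustration only.
contacts = {"Robert": "555-0101", "Rupert": "555-0102", "Maria": "555-0103"}
candidates = match_contacts("Robbert", contacts)
```

When more than one contact matches, as with the two similar names here, the application would still need the user to confirm the intended recipient.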


Fig. 1 System Architecture

The dictionary database contains all the words of a particular language. When the user speaks, the system checks the words against the dictionary and extracts the matching word from the database. The N-gram database performs error correction; its data can be collected from web pages and Internet documents.
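The error-correction role of the N-gram database can be sketched with a small bigram model that picks the most likely word given the previous one. This is only an illustration under stated assumptions: the tiny corpus, the function names `train_bigrams` and `pick_word`, and the add-one smoothing are not taken from the paper.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in a list of sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def pick_word(previous, candidates, unigrams, bigrams, vocab_size):
    """Choose the candidate with the highest smoothed bigram score."""
    def score(word):
        # Add-one smoothing keeps unseen pairs from scoring exactly zero.
        return (bigrams[(previous, word)] + 1) / (unigrams[previous] + vocab_size)
    return max(candidates, key=score)


# Stand-in corpus; a real N-gram database would be built from web documents.
corpus = [
    "send a message to my friend",
    "send the message now",
    "i read a message",
    "the massage was relaxing",
]
unigrams, bigrams = train_bigrams(corpus)
best = pick_word("a", ["massage", "message"], unigrams, bigrams, len(unigrams))
```

Here the recognizer is assumed to be unsure between two similar-sounding words, and the word pair seen more often in the corpus wins.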

B. Working of Speech-To-Text Recognition

First, the speech is taken as input and analyzed by the speech analysis module with the help of the speech dictionary (the speech-to-text conversion database). It is then checked against the vocabulary database, which selects words and phrases according to the sound and accent of the user. Finally, all the speech is converted into text, which can be sent as a text message.

Fig 2: Speech-To-Text Recognition.


C. Text-To-Speech Recognition

First, the input is taken as text and analyzed by the text analysis module with the help of the text dictionary (the text-to-speech converter). It is then sent to the speech database, which selects the units of words spoken on the microphone, and passed on to the speech generation module. On the basis of this process the text is converted into speech.

Fig 3: Text-To-Speech Recognition

D. Hidden Markov Model

1) Why Using HMM For Speech Conversion?

HMMs have a rich mathematical structure that supports complex calculations, so they can form the theoretical basis for a wide range of applications. Used properly, the model works efficiently.

2) Hidden Markov Model elements are as follows

HMM elements can be categorized as follows:
1. Number of states, $N$.
2. Number of distinct observation symbols per state, $M$, with $V = \{V_1, V_2, \ldots, V_M\}$.
3. State transition probability distribution $A = \{a_{ij}\}$, where $a_{ij} = P[q_{t+1} = S_j \mid q_t = S_i]$, $1 \le i, j \le N$.
4. Observation symbol probability distribution in each state $j$, $B = \{b_j(k)\}$, where $b_j(k) = P[V_k \text{ at } t \mid q_t = S_j]$, $1 \le j \le N$, $1 \le k \le M$.
5. Initial state distribution $\pi = \{\pi_i\}$, where $\pi_i = P[q_1 = S_i]$, $1 \le i \le N$ [2].

For appropriate values of $N$, $M$, $A$, $B$ and $\pi$, the HMM works as a generator that gives an observation sequence
$O = O_1 O_2 O_3 \ldots O_T$

3) Types of HMM:

1. Context-Independent Phoneme HMM:
Number of states: d-state HMM for each phoneme (d is normally equal to 3)
Accuracy: not accurate in continuous speech recognition
Compact: a d-state HMM leads to fewer parameters to be calculated
General: yes, an HMM for a new word can be built from existing phoneme HMMs

2. Context-Dependent Triphone HMM:
Number of states: d-state HMM for each phoneme
Accuracy: accurate, as it uses the left-right phoneme relation
Compact: each phoneme has an immediate left-right relation, so more parameters need to be calculated
General: yes

3. Whole-Word HMM:
Number of states: no phoneme generation; a number of states can be assigned to model a word as a whole
Accuracy: accurate, given a large amount of training data; it works for small vocabularies
Compact: not compact; it requires many states as the vocabulary increases
General: no, this HMM cannot be used to build new words
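The HMM elements listed above can be put to work with the forward algorithm, which the paper names among its technologies. The sketch below computes the probability of an observation sequence. It assumes that `A[i][j]` is the probability of moving from state i to state j, that `B[j][k]` is the probability of emitting symbol k in state j, and that `pi[i]` is the initial probability of state i. The two-state, three-symbol model is a made-up toy example, not a model from the paper.

```python
def forward(pi, A, B, observations):
    """Return P(O | model) for a discrete HMM using the forward algorithm."""
    if not observations:
        raise ValueError("observations must not be empty")
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    # Termination: sum over all states at the final time step
    return sum(alpha)


# Toy model with N = 2 states and M = 3 observation symbols (illustrative values).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]
prob = forward(pi, A, B, [0, 1, 2])
```

The recursion costs on the order of N^2 * T operations, instead of enumerating all N^T possible state sequences.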


Fig 4: Working of HMM
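As an illustration of how an HMM recovers hidden states from observations, the Viterbi algorithm (also named in the paper) can be sketched as follows. The conventions are assumed: `A[i][j]` is the transition probability from state i to state j, `B[j][k]` the probability of symbol k in state j, and `pi[i]` the initial probability of state i. The toy model values are invented for the example.

```python
def viterbi(pi, A, B, observations):
    """Return the most likely state path and its probability."""
    if not observations:
        raise ValueError("observations must not be empty")
    n = len(pi)
    # delta[i]: probability of the best path that ends in state i
    delta = [pi[i] * B[i][observations[0]] for i in range(n)]
    backpointers = []
    for obs in observations[1:]:
        step, new_delta = [], []
        for j in range(n):
            # Best predecessor state for reaching state j at this step
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][obs])
        backpointers.append(step)
        delta = new_delta
    # Trace the best path backwards from the most likely final state
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy model with 2 states and 3 observation symbols (illustrative values).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]
path, path_prob = viterbi(pi, A, B, [0, 1, 2])
```

In a recognizer the states would correspond to parts of phonemes or words, so the recovered path maps the audio frames onto the spoken units.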

E. Mel Frequency Cepstral Coefficients (MFCC)

MFCC is a feature extraction method used in speech recognition. Before MFCC, LPC (linear predictive coding) was available, but it had many drawbacks that MFCC overcomes. MFCC works in the frequency domain, which is more accurate than the time domain, and it can be derived from the FFT of the signal.

Fig 5 shows the steps involved in MFCC feature extraction.

Fig 5: Working of MFCC
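A common form of the MFCC pipeline can be sketched for a single frame as follows: pre-emphasis, Hamming window, power spectrum, mel filterbank, logarithm, and discrete cosine transform. The exact steps in Fig 5 are not recoverable from the text, so this ordering, the 0.97 pre-emphasis factor, the 12 filters and the 6 coefficients are assumptions. A naive DFT stands in for the FFT to keep the example free of external libraries.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum from a naive DFT (an FFT gives the same values faster)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    points = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in points]
    filters = []
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        filters.append(row)
    return filters


def mfcc_frame(frame, sample_rate, n_filters=12, n_coeffs=6):
    """Return the first n_coeffs cepstral coefficients of one frame."""
    n = len(frame)
    # Pre-emphasis boosts the high frequencies.
    emphasised = [frame[0]] + [frame[t] - 0.97 * frame[t - 1] for t in range(1, n)]
    # The Hamming window reduces spectral leakage at the frame edges.
    windowed = [emphasised[t] * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
                for t in range(n)]
    spectrum = power_spectrum(windowed)
    energies = [sum(w * p for w, p in zip(row, spectrum))
                for row in mel_filterbank(n_filters, n, sample_rate)]
    log_energies = [math.log(e + 1e-10) for e in energies]  # small offset avoids log(0)
    # DCT-II decorrelates the log filterbank energies.
    return [sum(log_energies[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m in range(n_filters))
            for c in range(n_coeffs)]


# A 440 Hz tone sampled at 8 kHz stands in for one frame of speech.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate=8000)
```

The resulting coefficient vector for each frame is what an HMM-based recognizer would treat as its observation.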


F. Technologies, Method and Algorithms:

HMM [Hidden Markov Model]:
The most successful and most flexible approach to speech recognition. Here it recognizes the speech that is sent as an SMS. An HMM relies on the Markov assumption: the next state depends only on the current state, not on the earlier ones.

MFCC [Mel Frequency Cepstrum Coefficient]:
It extracts features and also selects the parametric representation.

N-gram Database:
It performs error correction; its data can be collected from web pages and Internet documents.

Android:
It is a complete, open and free platform.

Forward Algorithm:
It is used to compute the probability of an output (observation) sequence efficiently.

Viterbi Algorithm:
It is used to find the most likely sequence of states for the observations.

Baum-Welch Algorithm:
It is used to estimate the model parameters.

AI (Artificial Intelligence):
It is used to check the validity of the input speech for the phone.

SMS manager class:
It is provided by Android to handle the default SMS activity.

JavaScript:
JavaScript is used for the recording panel.

JavaApplet:
A pure Java applet is used for the record button.

Eclipse Workbench:
It is used for text recognition.

G. Advantage

It uses specialized technologies, so it is very fast and almost 100% correct. To make speech understandable it uses the SoundX algorithm, which makes use of NLP; SoundX selects the best possible matching words.
The user can select multiple contact numbers of the same person to reduce repetition.
It recognizes speech with more than 90% accuracy, and the recognition delay is less than 100 ns.
It gives voice guidance for the direction and destination of movement.
It gives alarm services and calling services; the phone number can be selected manually or by voice.
A timer for unread messages, notifications and alerts are provided: when a new message arrives, the timer reminds the user after a time period to read the unread message.
The user can monitor their voice signal level on a red signal bar.

H. Disadvantages

It cannot perform multiple language selection. The user should know one standard language.

This application supports only the English language.

V. ACKNOWLEDGEMENT

We are extremely thankful to our guide Prof. ANSARI MUKHTAR AMIR for his valuable guidance and for providing all the necessary facilities, which were indispensable in the completion of this project report. We are also thankful to the Department of Computers of Anjuman-I-Islam Kalsekar Campus, New Panvel for their valuable time, support, comments, suggestions and persuasion, and for the required facilities, Internet access and important books.

REFERENCES

[1]. Intelligence Hands-Free Speech based system on android. Institute of Electrical and Electronics Engineers (issued on: 11 April 2016).
[2]. Android Speech to Text Converter for SMS Application (IOSR Journal of Engineering, Mar. 2012, Vol. 2(3), pp. 420-423).
[3]. Android text messaging application for visually impaired people (IRACST Engineering Science and Technology: An International Journal (ESTIJ), ISSN: 2250-3498, Vol. 3, No. 1, February 2013).
[4]. A Review on Speech to Text Conversion Methods, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 4, Issue 7, July 2015.
[5]. International Journal of Innovative Research in Science, Engineering and Technology (An ISO 3297: 2007 Certified Organization), Vol. 4, Issue 7, July 2015.
[6]. Intelligent Hands Free Speech based SMS System on Android (The Master of IEEE Projects, Copyright 2015 LeMeniz Infotech).
[7]. International Journal of Innovative Research in Science, Engineering and Technology (An ISO 3297: 2007 Certified Organization), Vol. 4, Issue 7, July 2015.


[8]. Ibrahim Patel, Dr. Y. Srinivas Rao, Speech recognition using HMM with MFCC: an analysis using frequency spectral decomposition technique, SIPIJ, Dec 2010.
[9]. B. Raghavendhar Reddy, E. Mahender, Speech to text conversion using android platform, IJERA, Feb 2013.
[10]. International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, Volume 04, Issue 01, Jan 2017.
[11]. Intelligent Hands Free Speech based SMS System on Android, International Conference on Emerging Trends in Applications of Computing (ICETAC 2K17).
[12]. Text to Speech Conversion System using OCR, International Journal of Emerging Technology and Advanced Engineering, www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 5, Issue 1, January 2015).
