
I'm making an application that helps people by calling the police when they say "Help". However, I'm currently having a problem with the continuous voice recognition.

I tried to use the voice recognizer in Android Studio, but I don't know how to make it listen for the trigger word, i.e. "Help". Here is a tutorial explaining exactly what I want, but for some reason it doesn't work for me: https://betterprogramming.pub/implement-continuous-speech-recognition-on-android-1dd2f4b562fd. For reference, here is my current code:

public class MainActivity extends AppCompatActivity {
    private TextToSpeech myTTS;
    private SpeechRecognizer mySpeechRecognizer;

    private AppBarConfiguration appBarConfiguration;
    private ActivityMainBinding binding;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        binding = ActivityMainBinding.inflate(getLayoutInflater());
        setContentView(binding.getRoot());

        setSupportActionBar(binding.toolbar);

        NavController navController = Navigation.findNavController(this, R.id.nav_host_fragment_content_main);
        appBarConfiguration = new AppBarConfiguration.Builder(navController.getGraph()).build();
        NavigationUI.setupActionBarWithNavController(this, navController, appBarConfiguration);
        initializeTextToSpeach();
        initializeSpeechRecognizer();
    }

    private void initializeSpeechRecognizer() {
        if(SpeechRecognizer.isRecognitionAvailable(this)) {
            mySpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
            mySpeechRecognizer.setRecognitionListener(new RecognitionListener() {
                @Override
                public void onReadyForSpeech(Bundle bundle) {

                }

                @Override
                public void onBeginningOfSpeech() {

                }

                @Override
                public void onRmsChanged(float v) {

                }

                @Override
                public void onBufferReceived(byte[] bytes) {

                }

                @Override
                public void onEndOfSpeech() {

                }

                @Override
                public void onError(int i) {

                }

                @Override
                public void onResults(Bundle bundle) {
                    List<String> results = bundle.getStringArrayList(
                            SpeechRecognizer.RESULTS_RECOGNITION
                    );
                    // Guard against a null or empty result list before reading it
                    if (results != null && !results.isEmpty()) {
                        processResult(results.get(0));
                    }
                }

                @Override
                public void onPartialResults(Bundle bundle) {

                }

                @Override
                public void onEvent(int i, Bundle bundle) {

                }
            });
        }
    }

    private void processResult(String command) {
        // Use an explicit locale so the comparison behaves the same on every device
        command = command.toLowerCase(Locale.US);
        if (command.contains("help")) {
            Uri number = Uri.parse("tel:123");
            Intent intent = new Intent(Intent.ACTION_DIAL, number);
            startActivity(intent);
        }
    }

    private void initializeTextToSpeach() {
        myTTS = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int i) {
                if (myTTS.getEngines().size() == 0) {
                    Toast.makeText(MainActivity.this, "There is no text-to-speech engine on your device", Toast.LENGTH_LONG).show();
                    finish();
                } else {
                    myTTS.setLanguage(Locale.US);
                    speak("Hello, I'm ready");
                }
            }
        });
    }

    private void speak(String message) {
        if (Build.VERSION.SDK_INT >= 21) {
            myTTS.speak(message, TextToSpeech.QUEUE_FLUSH, null, null);
        }
        else {
            myTTS.speak(message, TextToSpeech.QUEUE_FLUSH, null);
        }
    }

    @Override
    protected void onPause() {
        super.onPause();
        myTTS.shutdown();
    }

}
Halil Ozel

2 Answers


The first thing I would like to draw your attention to is the microphone permission: it must both be declared in the manifest and requested from the system at runtime; without this, the microphone will not work. Check whether the following line is present in the manifest:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

And update the onCreate method like this:

private static final int RECORD_AUDIO_REQUEST_CODE = 983;

@Override
protected void onCreate(Bundle savedInstanceState) {

    ...

    if (ContextCompat.checkSelfPermission(this, android.Manifest.permission.RECORD_AUDIO) != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this, new String[]{android.Manifest.permission.RECORD_AUDIO}, RECORD_AUDIO_REQUEST_CODE);
    } else {
        initializeTextToSpeach();
        initializeSpeechRecognizer();
    }
}

And check the result of the permission request:

@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);

    if (requestCode == RECORD_AUDIO_REQUEST_CODE && grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
        initializeTextToSpeach();
        initializeSpeechRecognizer();
    }
}

Next, it seems that after registering the listener you missed a few steps needed to actually start recognition, namely creating a recognizer intent and calling startListening. Update the end of your initializeSpeechRecognizer method in a similar way:

private void initializeSpeechRecognizer() {
    ...

    Intent recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, this.getPackageName());
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3);
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true);
    }
    mySpeechRecognizer.startListening(recognizerIntent);
}
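Since the intent above asks for up to three hypotheses via EXTRA_MAX_RESULTS, it is also worth checking the trigger word against every hypothesis in onResults, not only the first. A minimal pure-Java sketch; the TriggerMatcher/containsTrigger names are my own, not part of any Android API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TriggerMatcher {

    // Returns true if any recognition hypothesis contains the trigger word,
    // compared case-insensitively using a fixed locale.
    public static boolean containsTrigger(List<String> hypotheses, String trigger) {
        if (hypotheses == null) {
            return false;
        }
        String needle = trigger.toLowerCase(Locale.US);
        for (String hypothesis : hypotheses) {
            if (hypothesis != null && hypothesis.toLowerCase(Locale.US).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("please hold", "please help");
        System.out.println(containsTrigger(results, "Help"));   // true
        System.out.println(containsTrigger(results, "police")); // false
    }
}
```

In onResults you would then pass the whole RESULTS_RECOGNITION list to this helper instead of calling processResult(results.get(0)).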

And the last thing I recommend is to log which error occurred in the onError(int i) method, so that in the future you can understand what went wrong:

@Override
public void onError(int i) {
    Log.e("SpeechRecognizer", "onError, code=" + i);
}
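To make that log readable, you can translate the code into the constant's name. Also note that SpeechRecognizer stops after every utterance, so to keep recognition continuous you typically call startListening again from both onResults and onError (at least for the benign codes 6 and 7). The numeric values below are the documented android.speech.SpeechRecognizer error constants; the SpeechErrors helper itself is my own:

```java
public class SpeechErrors {

    // Maps the documented android.speech.SpeechRecognizer error codes to their names.
    public static String name(int code) {
        switch (code) {
            case 1: return "ERROR_NETWORK_TIMEOUT";
            case 2: return "ERROR_NETWORK";
            case 3: return "ERROR_AUDIO";
            case 4: return "ERROR_SERVER";
            case 5: return "ERROR_CLIENT";
            case 6: return "ERROR_SPEECH_TIMEOUT";
            case 7: return "ERROR_NO_MATCH";
            case 8: return "ERROR_RECOGNIZER_BUSY";
            case 9: return "ERROR_INSUFFICIENT_PERMISSIONS";
            default: return "UNKNOWN(" + code + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(name(7)); // ERROR_NO_MATCH
    }
}
```

In onError this becomes `Log.e("SpeechRecognizer", "onError: " + SpeechErrors.name(i));`, followed by a restart of listening when the code is ERROR_SPEECH_TIMEOUT or ERROR_NO_MATCH.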

I also want to share that I once worked on an application with continuous speech recognition. Google's approach to voice recognition did not suit me due to the limitations of running it from a service. The vosk-api library (https://github.com/alphacep/vosk-api), an API for the Kaldi speech recognition toolkit, helped me here. The catch with this solution is that the application must bundle an additional language model of about 50 megabytes, and that is for a single language only. That is, of course, a big disadvantage of this approach, and the recognition accuracy is worse than Google's, but in my case it was enough.
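For orientation only, the basic shape of a Vosk integration on Android looks roughly like the sketch below. It is based on the classes used in the project's Android demo; check the repository for the exact, current API, and note that the model directory must be shipped with the app and unpacked first:

```java
// Sketch only; requires the org.vosk Android artifacts and an unpacked model directory.
Model model = new Model(modelDirPath);                   // the ~50 MB language model
Recognizer recognizer = new Recognizer(model, 16000.0f); // 16 kHz sample rate
SpeechService speechService = new SpeechService(recognizer, 16000.0f);
speechService.startListening(listener);                  // listener receives JSON results
```

Here modelDirPath and listener are placeholders you would supply yourself; the listener's callbacks hand you JSON strings containing partial and final transcripts.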

VladDz

Just two days ago I started developing an application that does something very similar to yours. It's very difficult to find information about this on the internet, but I think you can use the tool that I used. Here's the link: https://picovoice.ai/. It's really easy to work with, it has default wake words for detection, and you can also create your own keyword. I also highly recommend their GitHub repository, which is part of the same tool; remember that a code example makes things infinitely easier. If you need more help, don't hesitate to let me know.
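For what it's worth, the wake-word part of Picovoice (Porcupine) is typically driven through a manager class. The sketch below is from memory of their Android SDK and should be checked against the official demo before use; the access key and the .ppn keyword file both come from the Picovoice console:

```java
// Sketch only; requires the ai.picovoice.porcupine Android dependency.
// ACCESS_KEY and the keyword file path are placeholders from the Picovoice console.
PorcupineManager manager = new PorcupineManager.Builder()
        .setAccessKey(ACCESS_KEY)
        .setKeywordPath("help_android.ppn")   // your custom "Help" keyword
        .build(getApplicationContext(), keywordIndex -> {
            // Wake word detected; place the emergency-call logic here
        });
manager.start();
```

Unlike SpeechRecognizer, this runs fully on-device and keeps listening until you call stop, which is exactly the continuous behavior the question asks for.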

  • I tried it and created my own wake word and a Rhino speech-to-intent context. However, I had a problem adding them to the app. I tried their Android demo to see how it's supposed to work, but the app still crashes every time I click the start button, and I don't know why. – ihavenoideawhatimdoing Apr 10 '23 at 17:35