Use Text-to-Speech on Android to Read Out Incoming Messages

Apps with interfaces that make use of voice have a unique appeal. They tend to make their users feel that they are using something futuristic. Since its early days, Android has had very robust text-to-speech (TTS) functionality. This year, Google has added a lot of high-quality voices to its TTS engine and that's all the more reason for developers to use it in their apps.

In this tutorial, you'll learn how to create a simple app—with a minimalist user interface—that can receive text messages and read them to the user.

Prerequisites

Ensure that you have the Eclipse ADT Bundle set up. You can download it at the Android Developer website. For best results, you are also going to need a real Android device and some friends who can send you text messages.

1. Create a New Project

Start Eclipse, and create a new Android application. Call this application SMSReader. If you think you are going to publish this app to Google Play to share it with your friends, then make sure you use a unique package name. Set the Minimum Required SDK to Android 2.2 and set the Target SDK to Android 4.4.

This app will have one Activity. Select Create Activity and choose Empty Activity.

Name it MainActivity and click Finish.

2. Edit the Manifest

This app needs three permissions:

RECEIVE_SMS to know that the device has received an SMS
READ_SMS to read that SMS
READ_CONTACTS to map the phone number of the sender to a name (if possible)

Add the following lines to your AndroidManifest.xml.

<uses-permission android:name="android.permission.READ_SMS"/>
<uses-permission android:name="android.permission.RECEIVE_SMS"/>
<uses-permission android:name="android.permission.READ_CONTACTS"/>

This app is going to have only one screen orientation, portrait. Therefore, edit the activity tag and add the following attribute to it:

android:screenOrientation="portrait"

The manifest is now complete.

3. Edit strings.xml

It is not absolutely necessary, but storing all strings that the application uses in the res/values/strings.xml file is a good practice. Edit this file so that it has the following contents:

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="app_name">SMSReader</string>
    <string name="sms_label">Latest SMS</string>
    <string name="none">None</string>
    <string name="speech_toggle_on">START SPEAKING</string>
    <string name="speech_toggle_off">STOP SPEAKING</string>

    <string name="start_speaking">Okay! I will read your messages out loud for you now.</string>
    <string name="stop_speaking">Okay! I will stay silent now.</string>
</resources>

Most of these strings are used in the next step.

4. Edit the Layout

Edit res/layout/activity_main.xml to add the following:

a TextView to display the name of the person who sent the latest SMS
a TextView to display the contents of the latest SMS
a ToggleButton to toggle speech output on and off

After adding code for positioning and styling these elements, your file should have the following contents:

<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context="${packageName}.${activityClass}"
    android:background="#99CC00"
    >

    <TextView
        android:id="@+id/sms_sender"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignParentTop="true"
        android:layout_centerHorizontal="true"
        android:layout_marginTop="20dp"
        android:text="@string/sms_label"
        android:textColor="#ffffff"
        android:textAppearance="?android:attr/textAppearanceSmall" />

    <TextView
        android:id="@+id/sms_text"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_below="@+id/sms_sender"
        android:layout_centerHorizontal="true"
        android:layout_marginTop="10dp"
        android:text="@string/none"
        android:textAppearance="?android:attr/textAppearanceLarge" 
        android:textColor="#ffffff"
        />

    <ToggleButton
        android:id="@+id/speechToggle"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_centerHorizontal="true"
        android:layout_centerVertical="true"
        android:textOff="@string/speech_toggle_on"
        android:textOn="@string/speech_toggle_off"                 
        />
    
</RelativeLayout>

The layout for our application is now complete.

5. Create a Helper Class

We're now going to create a helper class for the TTS engine. Create a new Java class and call it Speaker.java. This class is used to avoid calling the TTS API directly from the Activity.

This class implements the OnInitListener interface so that it knows when the TTS engine is ready. We store this ready state in a boolean variable named ready. We use another boolean variable named allowed whose value is true only if the user has allowed the TTS engine to speak. We also add methods to get and set the value of this variable. At this point, Speaker.java should have the following contents:

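import android.content.Context;
import android.speech.tts.TextToSpeech;
import android.speech.tts.TextToSpeech.OnInitListener;

public class Speaker implements OnInitListener {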

    private TextToSpeech tts;
	
	private boolean ready = false;
	
	private boolean allowed = false;
	
	public Speaker(Context context){
		tts = new TextToSpeech(context, this);		
	}	
	
	public boolean isAllowed(){
		return allowed;
	}
	
	public void allow(boolean allowed){
		this.allowed = allowed;
	}
}

The OnInitListener interface has only one method, onInit. This method is called when the TTS engine has been initialized. The status parameter lets us know if the initialization was successful. Once we know that the initialization was successful, we set the language of the TTS engine. This is important to produce speech that is comprehensible. Add the following code:

@Override
public void onInit(int status) {
	if(status == TextToSpeech.SUCCESS){
		// Change this to match your
		// locale
		tts.setLanguage(Locale.US);
		ready = true;
	}else{
		ready = false;
	}
}
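
The setLanguage method returns a status code, so onInit could also confirm that the chosen locale is usable before setting ready to true. The following is a minimal plain-Java sketch of that check and is not part of the original app. LanguageCheck and isUsable are hypothetical names, and the two constants are local copies of TextToSpeech.LANG_MISSING_DATA and TextToSpeech.LANG_NOT_SUPPORTED so that the snippet runs without the Android SDK.

```java
// Hypothetical helper, not part of the original tutorial code.
final class LanguageCheck {

    // Local copies of TextToSpeech.LANG_MISSING_DATA and
    // TextToSpeech.LANG_NOT_SUPPORTED, so the sketch runs off-device.
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    // Returns true when the code returned by setLanguage indicates
    // that the engine can speak the requested locale.
    static boolean isUsable(int setLanguageResult) {
        return setLanguageResult != LANG_MISSING_DATA
                && setLanguageResult != LANG_NOT_SUPPORTED;
    }
}
```

Under this assumption, onInit could set ready to the value of LanguageCheck.isUsable(tts.setLanguage(Locale.US)), using the real TextToSpeech constants inside the app.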

Next, we add a method named speak, which uses the engine to read out any text that is passed to it. Before doing so, it checks if both the allowed and the ready values are true. The speech it generates is placed in the notification stream.

public void speak(String text){
	
	// Speak only if the TTS is ready
	// and the user has allowed speech
	
	if(ready && allowed) {
		HashMap<String, String> hash = new HashMap<String,String>();
		hash.put(TextToSpeech.Engine.KEY_PARAM_STREAM, 
				String.valueOf(AudioManager.STREAM_NOTIFICATION));
		tts.speak(text, TextToSpeech.QUEUE_ADD, hash);
	}
}

We then add a method that plays silence for a specified duration. Using this method, we can add pauses to the speech to make it sound a little clearer. Add the following code to the implementation:

public void pause(int duration){
	tts.playSilence(duration, TextToSpeech.QUEUE_ADD, null);
}

Finally, add a method to free up resources when the TTS engine is no longer needed.

// Free up resources
public void destroy(){
	tts.shutdown();
}

6. Edit the Activity Class

Edit MainActivity.java and declare all the views that we mentioned in the layout. Declare two integers, LONG_DURATION and SHORT_DURATION. These are merely values that are passed to the Speaker's pause method.

Also declare a CHECK_CODE integer. Its value isn't important. It is passed to the startActivityForResult method and then used to identify the result.

Lastly, declare a Speaker object and a BroadcastReceiver object.

At this point, your class should look like this:

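import android.app.Activity;
import android.content.BroadcastReceiver;
import android.widget.CompoundButton.OnCheckedChangeListener;
import android.widget.TextView;
import android.widget.ToggleButton;

public class MainActivity extends Activity {

	private static final int CHECK_CODE = 0x1;
	private static final int LONG_DURATION = 5000;
	private static final int SHORT_DURATION = 1200;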
	
	private Speaker speaker;	
	
	private ToggleButton toggle;
	private OnCheckedChangeListener toggleListener;
	
	private TextView smsText;
	private TextView smsSender;
	
	private BroadcastReceiver smsReceiver;
}

Add a method to check if a TTS engine is installed on the device. The check is performed by making use of the result of another Activity.

private void checkTTS(){
	Intent check = new Intent();
	check.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
	startActivityForResult(check, CHECK_CODE);
}

When the result of startActivityForResult arrives, the onActivityResult method is called. Therefore, we need to override it. In this method, if the result is positive, we initialize the Speaker object. If no TTS engine is installed, we redirect the user to install it.

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
	if(requestCode == CHECK_CODE){
		if(resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS){
			speaker = new Speaker(this);
		}else {
            Intent install = new Intent();
            install.setAction(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
            startActivity(install);
        }
	}
}

It is now time to create our BroadcastReceiver to deal with the messages that the device is receiving. Whenever there are new messages, its onReceive method is called. We parse the messages, which arrive as byte arrays, using the SmsMessage class. Once the message is parsed, we use methods like getDisplayMessageBody and getOriginatingAddress to extract meaningful information out of it.

With this information, we generate the text that the TTS engine should read out. We pause for LONG_DURATION before reading out a new SMS and for SHORT_DURATION between the utterances of the SMS sender's name and the SMS's body.

Add the following code to the implementation:

private void initializeSMSReceiver(){
	smsReceiver = new BroadcastReceiver(){
		@Override
		public void onReceive(Context context, Intent intent) {
			
			Bundle bundle = intent.getExtras();
			if(bundle!=null){
				Object[] pdus = (Object[])bundle.get("pdus");
				for(int i=0;i<pdus.length;i++){
					byte[] pdu = (byte[])pdus[i];
					SmsMessage message = SmsMessage.createFromPdu(pdu);
					String text = message.getDisplayMessageBody();
					String sender = getContactName(message.getOriginatingAddress());
					speaker.pause(LONG_DURATION);
					speaker.speak("You have a new message from " + sender + "!");
					speaker.pause(SHORT_DURATION);
					speaker.speak(text);
					smsSender.setText("Message from " + sender);
					smsText.setText(text);
				}
			}
			
		}			
	};		
}
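
Because every call uses QUEUE_ADD, the engine plays the pauses and utterances in the order in which they were queued. The plain-Java sketch below models that order for a single message without touching the Android API, assuming the engine is ready and speech is allowed. AnnouncementPlan and its plan method are hypothetical names introduced for illustration. The durations are the LONG_DURATION and SHORT_DURATION values from MainActivity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical off-device model of what onReceive queues for one SMS.
final class AnnouncementPlan {

    static final int LONG_DURATION = 5000;  // milliseconds, as in MainActivity
    static final int SHORT_DURATION = 1200; // milliseconds, as in MainActivity

    // Returns the queue entries in playback order. Pauses are written
    // as "<silence N ms>" and utterances as their text.
    static List<String> plan(String sender, String text) {
        List<String> queue = new ArrayList<String>();
        queue.add("<silence " + LONG_DURATION + " ms>");
        // Note the space after "from", so the name is read as its own word.
        queue.add("You have a new message from " + sender + "!");
        queue.add("<silence " + SHORT_DURATION + " ms>");
        queue.add(text);
        return queue;
    }
}
```

Calling AnnouncementPlan.plan with a sender name and a message body returns four entries: the long pause, the announcement, the short pause, and the message itself.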

We can only extract the phone number of the sender from the message. To map this number to a contact's name, we have to make use of the user's contacts. The following method queries the contacts data. If the phone number is not available in the user's contacts, then it simply returns the string unknown number:

private String getContactName(String phone){
	Uri uri = Uri.withAppendedPath(PhoneLookup.CONTENT_FILTER_URI, Uri.encode(phone));
	String projection[] = new String[]{ContactsContract.Data.DISPLAY_NAME};
	Cursor cursor = getContentResolver().query(uri, projection, null, null, null);
	// query can return null, so guard against it
	if(cursor == null){
		return "unknown number";
	}
	try {
		if(cursor.moveToFirst()){
			return cursor.getString(0);
		}else {
			return "unknown number";
		}
	} finally {
		// Always release the cursor when done
		cursor.close();
	}
}
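
The fallback behaviour can be tried without a device. The sketch below replaces the contacts provider with an in-memory map, which is purely an assumption made for illustration. ContactNames, add, and nameFor are hypothetical names and do not exist in the app.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the contacts lookup.
final class ContactNames {

    private final Map<String, String> namesByPhone = new HashMap<String, String>();

    // Registers a known contact.
    void add(String phone, String name) {
        namesByPhone.put(phone, name);
    }

    // Returns the contact's name, or "unknown number" when the
    // phone number has no matching contact.
    String nameFor(String phone) {
        String name = namesByPhone.get(phone);
        return name != null ? name : "unknown number";
    }
}
```

With this model, a number that was never added is announced as "unknown number", which matches what getContactName returns when the query finds no row.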

Before the BroadcastReceiver can be used, it has to be registered. In the following method, we create an IntentFilter for incoming text messages and then register our smsReceiver for it:

private void registerSMSReceiver() {	
	IntentFilter intentFilter = new IntentFilter("android.provider.Telephony.SMS_RECEIVED");
	registerReceiver(smsReceiver, intentFilter);
}

Next, we create the onCreate method. Here's where we initialize all the objects we declared. We initialize the toggleListener to set the value of allowed in the Speaker class.

After these initializations, we call the checkTTS, initializeSMSReceiver, and registerSMSReceiver methods.

@Override
protected void onCreate(Bundle savedInstanceState) {
	super.onCreate(savedInstanceState);
	setContentView(R.layout.activity_main);			
	
	toggle = (ToggleButton)findViewById(R.id.speechToggle);	
	smsText = (TextView)findViewById(R.id.sms_text);
	smsSender = (TextView)findViewById(R.id.sms_sender);

	toggleListener = new OnCheckedChangeListener() {			
		@Override
		public void onCheckedChanged(CompoundButton view, boolean isChecked) {
			if(isChecked){
				speaker.allow(true);
				speaker.speak(getString(R.string.start_speaking));
			}else{
				speaker.speak(getString(R.string.stop_speaking));
				speaker.allow(false);					
			}
		}
	};		
	toggle.setOnCheckedChangeListener(toggleListener);
	
	checkTTS();
	initializeSMSReceiver();
	registerSMSReceiver();
}
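
The order of the two calls in each branch of the listener matters, because speak stays silent unless speech is allowed at the moment it is called. A plain-Java sketch of that ordering follows. GatedSpeaker is a hypothetical stand-in for Speaker that records text instead of calling the TTS engine, and it assumes the engine is already ready.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Speaker: records what would be spoken.
final class GatedSpeaker {

    private boolean allowed = false;
    final List<String> spoken = new ArrayList<String>();

    void allow(boolean allowed) {
        this.allowed = allowed;
    }

    // Mirrors the gate in Speaker.speak, assuming the engine is ready.
    void speak(String text) {
        if (allowed) {
            spoken.add(text);
        }
    }

    // Uses the same call order as the toggle listener in onCreate.
    void onToggle(boolean isChecked) {
        if (isChecked) {
            allow(true); // enable first, then confirm
            speak("Okay! I will read your messages out loud for you now.");
        } else {
            speak("Okay! I will stay silent now."); // confirm while still allowed
            allow(false);
        }
    }
}
```

If the two calls in the else branch were swapped, the second confirmation would never be recorded, because speech would already be disallowed.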

Finally, in the onDestroy method of the activity, we unregister our receiver and shut down the TTS engine to free up resources.

@Override
protected void onDestroy() {	
	super.onDestroy();
	unregisterReceiver(smsReceiver);
	// speaker is null if the TTS check failed or has not returned yet
	if(speaker != null){
		speaker.destroy();
	}
}

7. Run and Test

The app is now ready to be tested. Compile and run it on a physical Android device. Tap the toggle button to enable voice and send yourself an SMS from another phone or ask one of your friends to do so. You should soon be able to hear your phone read out the SMS for you.

Conclusion

In this tutorial, you have learnt not only to use the text-to-speech API, but also to use broadcast receivers and to make sense of raw SMS data. You can now go on to further customize this app to your needs.