I'm trying to use the AWS Java SDK to develop a ColdFusion application that uses the Amazon Transcribe service. Unfortunately my knowledge of Java is pitiful (to say nothing of the SDK itself) and I'm having a heck of a time getting anything to happen.

The code below is intended to start a transcription job. It doesn't throw an error, but it also doesn't start the job. I can't even tell if it's sending any information to AWS.

For all I know this code is completely off base, but right now my two biggest questions are:

  1. Am I missing some obvious step to actually send the request to AWS?

  2. How would I access any response I do get back from AWS? A dump of the invokeRequest variable appears to just be the request data.

Thanks in advance for any advice.

(FWIW: CF version is 2016, java version 1.8.0_171, and the AWS SDK is 1.11.331)

<cfscript>
   /* Set up credentials */
   awsCredentials = createObject('java','com.amazonaws.auth.BasicAWSCredentials').init('#variables.AWSAccessKeyID#', '#variables.AWSSecretKey#');
   awsStaticCredentialsProvider = CreateObject('java','com.amazonaws.auth.AWSStaticCredentialsProvider').init(awsCredentials);

   /*  Create the Transcribe Service Object*/
   serviceObject = CreateObject('java', 'com.amazonaws.services.transcribe.AmazonTranscribeAsyncClientBuilder').standard().withCredentials(variables.awsStaticCredentialsProvider).withRegion(#variables.awsRegion#).build();


   /* Set up transcription job */
   MediaFileUri = CreateObject('java','com.amazonaws.services.transcribe.model.Media').init();
   MediaFileUri.setMediaFileUri('#variables.mediafilelocation#');
   invokeRequest = CreateObject('java','com.amazonaws.services.transcribe.model.StartTranscriptionJobRequest').init();
   invokeRequest.withLanguageCode('en-US');
   invokeRequest.withMedia(MediaFileUri);
   invokeRequest.withMediaFormat('wav');
   invokeRequest.withTranscriptionJobName('#variables.jobname#');

   /* Check results of request */


   /* Shut down client*/
   serviceObject.shutdown();
</cfscript>
  • I've never used that API but [this thread](https://stackoverflow.com/questions/47586265/speech-to-text-by-aws-service-using-java-api) gives me the impression there's a few missing pieces, like you must a) give the job a name b) submit it via client.startTranscriptionJob() c) since it's asynch, you must wait for the job to complete. – SOS May 17 '18 at 23:42
  • Thank you very much, @Ageax. That startTranscriptionJob piece was the missing link. The link you provided also led me down a path to finally figure out how to get and read the response. I will post the working code in case it helps someone down the line. – Rocky May 18 '18 at 21:54

1 Answer

Here is how I got this working. I'll start at the beginning, in case anyone reading this is as mystified as I was.

The first step is getting the AWS Java SDK. At some point I was under the impression that there was a separate SDK just for the Amazon Transcribe service, but that's not the case. Once you download the file, place the following jar files in the /{coldfusion}/lib/ folder (I'm not certain all of these are necessary, but this is what worked for me):

aws-java-sdk-xxxx.jar
httpclient-xxxx.jar
httpcore-xxxx.jar
jackson-annotations-xxxx.jar
jackson-core-xxxx.jar
jackson-databind-xxxx.jar
jackson-dataformat-cbor-xxxx.jar
joda-time-xxxx.jar

Restart the ColdFusion service.

The Transcribe service requires the media file to be in S3. I upload my files with ColdFusion's native S3 support to a bucket I've called "transcriptaudio". For example (note the colon separating the key ID from the secret):

<cffile
 action = "copy"
 source = "c:\temp\myfilename.wav"
 destination = "s3://#variables.AWSAccessKeyID#:#variables.AWSSecretKey#@transcriptaudio/">

The URL for the media will then be:

https://s3.{awsregion}.amazonaws.com/transcriptaudio/myfilename.wav

Then here is my code to start a transcription job:

<cfscript>
    /* Set up credentials */
    awsCredentials = CreateObject('java', 'com.amazonaws.auth.BasicAWSCredentials').init('#variables.AWSAccessKeyID#','#variables.AWSSecretKey#');
    variables.awsStaticCredentialsProvider = CreateObject('java','com.amazonaws.auth.AWSStaticCredentialsProvider').init(awsCredentials);

    /*  Create the Transcribe Service Object*/
    serviceObject = CreateObject('java', 'com.amazonaws.services.transcribe.AmazonTranscribeAsyncClientBuilder').standard().withCredentials(variables.awsStaticCredentialsProvider).withRegion(#variables.awsRegion#).build();

    /* Set up transcription job */
    MediaFileUri = CreateObject('java','com.amazonaws.services.transcribe.model.Media').init();
    MediaFileUri.setMediaFileUri('#variables.mediafileurlstring#');
    requestObject = CreateObject('java','com.amazonaws.services.transcribe.model.StartTranscriptionJobRequest').init();
    requestObject.withLanguageCode('en-US');
    requestObject.withMedia(MediaFileUri);
    requestObject.withMediaFormat('wav');
    requestObject.withTranscriptionJobName('#variables.jobName#');

    /* Send the request (this is the step that was missing from the question's code) */
    sendRequest = serviceObject.startTranscriptionJob(requestObject);

    /* Shut down client*/
    serviceObject.shutdown();
</cfscript>
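
To answer the second part of the question (how to read what AWS sends back), here is a minimal sketch that inspects the object returned by startTranscriptionJob(). It assumes it runs in the same request as the block above, so that sendRequest still holds the result. The getter names come from the SDK's StartTranscriptionJobResult and TranscriptionJob classes; the output formatting is only illustrative.

```cfml
<cfscript>
    /* Assumption: sendRequest is the result returned by startTranscriptionJob() above */
    startedJob = sendRequest.getTranscriptionJob();

    /* The job name echoes what was submitted; the status is normally IN_PROGRESS right after submission */
    writeOutput("Job name: " & startedJob.getTranscriptionJobName() & "<br>");
    writeOutput("Status: " & startedJob.getTranscriptionJobStatus() & "<br>");
</cfscript>
```

If the request is rejected (for example because the job name is already in use), the SDK throws an exception rather than returning a result, so wrapping the startTranscriptionJob() call in try/catch is probably worthwhile.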

As Ageax noted in the comments, the transcription happens asynchronously, so I have a separate CF page to get the transcript after it completes. The code below basically assumes the job is complete, but the transcriptionStatus variable lets me check that.

<cfscript>
    /* Set up credentials */
    awsCredentials = CreateObject('java','com.amazonaws.auth.BasicAWSCredentials').init('#variables.AWSAccessKeyID#','#variables.AWSSecretKey#');
    variables.awsStaticCredentialsProvider = CreateObject('java','com.amazonaws.auth.AWSStaticCredentialsProvider').init(awsCredentials);

    /*  Create the Transcribe Service Object*/
    serviceObject = CreateObject('java', 'com.amazonaws.services.transcribe.AmazonTranscribeAsyncClientBuilder').standard().withCredentials(variables.awsStaticCredentialsProvider).withRegion(#variables.awsRegion#).build();

    /* Set up results object */
    requestResultObject = CreateObject('java', 'com.amazonaws.services.transcribe.model.GetTranscriptionJobRequest').init();
    requestResultObject.withTranscriptionJobName('#variables.jobName#');

    /* Get the results */
    requestResult = serviceObject.getTranscriptionJob(requestResultObject);

    /* Parse result object into useful variables.
       Note: the transcript URI is only available once the status is COMPLETED. */
    transcriptionStatus = requestResult.TranscriptionJob.TranscriptionJobStatus.toString();
    transcriptURL = requestResult.TranscriptionJob.Transcript.TranscriptFileUri.toString();
</cfscript>
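
The block above reads the status once. If you would rather wait for the job inside a single request, a polling loop is one option. This is only a sketch: it assumes serviceObject and variables.jobName are set up as above, and maxAttempts and waitMilliseconds are hypothetical values I made up that you would tune to your audio length and your ColdFusion request timeout.

```cfml
<cfscript>
    /* Hypothetical limits; adjust to suit your files and request timeout */
    maxAttempts = 20;
    waitMilliseconds = 15000;

    transcriptionStatus = "";
    transcriptURL = "";

    statusRequest = CreateObject('java', 'com.amazonaws.services.transcribe.model.GetTranscriptionJobRequest').init();
    statusRequest.withTranscriptionJobName('#variables.jobName#');

    for (attempt = 1; attempt <= maxAttempts; attempt++) {
        /* Ask AWS for the current state of the job */
        job = serviceObject.getTranscriptionJob(statusRequest).getTranscriptionJob();
        transcriptionStatus = job.getTranscriptionJobStatus();

        if (transcriptionStatus == "COMPLETED") {
            /* The transcript URI is only populated once the job has completed */
            transcriptURL = job.getTranscript().getTranscriptFileUri();
            break;
        }
        if (transcriptionStatus == "FAILED") {
            writeOutput("Transcription failed: " & job.getFailureReason());
            break;
        }

        /* Still in progress: wait before asking again */
        sleep(waitMilliseconds);
    }

    /* Shut down client */
    serviceObject.shutdown();
</cfscript>
```

If the loop ends with transcriptURL still empty, the job either failed or had not finished within the allotted attempts. For long recordings, checking from a scheduled task is likely a better fit than holding a page request open.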

At this point I have the transcriptURL. Retrieving it with cfhttp returns a large JSON document, most of which is unnecessary, at least for my use. Here is my code for getting the actual text of the transcript. The service returns the transcript in an array, so in case a job can produce more than one transcript for some reason, I loop over the array. (And yes, I switch to CF tags here because I'm just more comfortable working in tags.)

<CFHTTP url="#variables.transcriptURL#" result="transcriptContentResponse">
<CFSET ResultsStruct = DeserializeJSON(variables.transcriptContentResponse.FileContent)>
<CFSET TranscriptsArray = ResultsStruct.Results.transcripts>

<CFLOOP Array = "#variables.TranscriptsArray#" index="ThisTranscript" >
    <cfoutput>  
         #ThisTranscript['transcript']#
    </cfoutput>
</CFLOOP>
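
For anyone wondering what the loop above is walking through, here is a small cfscript sketch of the same parsing step run against a hand-written, heavily abbreviated sample of the transcript file. The sample values ("example-job", "Hello world.") are invented for illustration; a real file carries a much longer items array with per-word timings and confidence scores.

```cfml
<cfscript>
    /* Invented, abbreviated sample of the JSON the transcript URL returns */
    sampleJson = '{"jobName":"example-job","results":{"transcripts":[{"transcript":"Hello world."}],"items":[{"start_time":"0.0","end_time":"0.4","type":"pronunciation","alternatives":[{"confidence":"1.0","content":"Hello"}]}]},"status":"COMPLETED"}';

    parsed = deserializeJSON(sampleJson);

    /* Collect the text of every entry in results.transcripts */
    transcriptTexts = [];
    for (thisTranscript in parsed.results.transcripts) {
        arrayAppend(transcriptTexts, thisTranscript.transcript);
    }

    /* Join the pieces with a space and output them */
    writeOutput(arrayToList(transcriptTexts, " "));
</cfscript>
```

Only results.transcripts is needed for the plain text; the items array can be ignored unless you want word-level timing.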
  • Great write up! "*This code basically assumes the job is complete*" Yeah, one thing I wasn't clear on is whether you actually could do that or if you must poll and wait for it to complete. – SOS May 19 '18 at 00:23