Wednesday, January 5, 2011

Speaker Identification – Expectations and Limitations

Speaker identification is the process of attributing each passage in an audio file or transcript to the person who spoke it.  It is usually done by one of the following methods.

1.    Reference Material/Agenda – For a meeting, conference, or symposium, if we have the minute-to-minute agenda along with the names of the speakers, we match each speaker in the recording to the names given in the agenda.  This is the simplest way to get accurate speaker names.  We therefore always encourage our clients to send us the agenda or draft of the event at the time of job confirmation so that their jobs get accurate speaker identification.

2.    Video Reference – Speaker identification becomes much easier if the client provides professionally recorded video files as reference, with the camera focused on the speakers.  In such cases, the transcriber can identify the speakers simply by viewing the video.

3.    Googling/Research skill based – In many cases the client provides no reference material or speaker names, which makes the task very challenging for the transcriber.  The transcriber tries to tell the voices apart and labels them Speaker1, Speaker2, and so on.  If the speakers introduce themselves while speaking, the transcriber uses search engines to look up the name on the internet, and then checks it against the content of the file to judge whether it is likely to be the same person.  The limitation of this method is a high chance of identifying the wrong person, because a search engine returns a wide variety of results for the same name and the internet is not a very reliable source.

4.    Voice differentiation – This is the most difficult method.  If there is no reference material or agenda and there are more than three or four speakers in the audio, it is very challenging to differentiate the speakers by the tone of their voices.  This is especially so in a discussion where the speakers’ voices overlap and one cannot make out what each speaker is saying.  Identifying speakers is almost impossible if the audio quality is bad.  In such cases, we rely on our listening skills to differentiate the voices as well as we can.  The limitation of this method is that it depends entirely on the listening skills of the transcriber, so there is a high chance of speakers being mixed up.

As our standard service at Cripton, speakers are identified as [Male] and [Female], or as Interviewer and Interviewee.  Beyond that, our transcribers and editors always strive to identify speakers to the best of their ability by applying one of the methods above.  These methods have their limitations, however: with more than three speakers, it becomes difficult to identify a particular speaker without a set agenda as a reference.

Hence, the only reliable way to get correct speaker identification is to provide proper reference material, such as the agenda or draft of the conference, meeting, or symposium, the speaker names, or video files.  It is also very important that the client uploads all reference material along with the audio job itself and not at a later time, because speaker identification is done at the first stage, while the transcriber is working on the document.  This will help ensure the delivery of a transcript with accurate speaker identification.

2 comments:

Unknown said...

Hello, I am a forensic linguist, phonetician and transcriber and I was interested in your comments about speaker identification. In my experience, unless the cases are very straightforward, this is a lot more difficult and complex than people think. Please be very circumspect with speaker attribution, and unless a full voice analysis has been performed I would recommend adding a caveat that attributions have been done to assist the reader and not as a proven identification.
If you would like any further information, please see my website www.audiolex-forensic.com

Cripton Transcription said...

Thank you for the comment, Cate.