Tedious Made Easier with Simon Says Transcription

This article first appeared on

Written by Mike Matzdorff
Published: 25 October 2018

When Mike Matzdorff hits a workflow roadblock, he goes looking for an alternative route. What he recently discovered was not only a new low-cost transcription service, but a job at the company as well!

I was working on the Lego Ninjago movie at Warner Bros. in Burbank, California, in early 2017. It was a massive crew, spanning two continents. I was hired for six days and stayed for six months.

I love animation and the creative environment that comes with it. Funny drawings, funny voices, lots of collaboration at many levels. My job on the movie was Assistant Editor.

Assistant editing is a crazy hard job: long hours, lots of brain power, and a myriad of things to keep close track and tight control of. On animation, it's even harder. Even though it was a tough gig, I realized how much I enjoyed it, because I kept coming back.

Ninjago was an Avid show. So what's this article doing here? Let me explain. The hardest job, within the hard job of assistant editing on an animated film, is transcribing, by hand, hours and hours of actors' dialog, scratch dialog (temporary stuff recorded by whoever was around) and the dialog of five or six actors improvising in a room. The record sessions were normally five to six hours long and usually had multiple microphones employed.

On this particular job here's how the workflow went:

  • Ingest media
  • Build session into a sequence
  • Subclip and autosync parts of the sequence and hand off to Assistants

The assistant would then:

  • Carefully listen to everything and type it, verbatim, onto a marker, placing a new marker for each line.
  • Subclip each line
  • Name each subclip with a code indicating sequence, actor, line number and some other stuff I have likely forgotten about.

It was a tedious, brutal and slow process. Really slow. What usually ended up happening, due to the compressed schedule, was that everyone would start working: editors cutting, assistants marking and subclipping. By the time the assistant work was done, the scene had already been cut, and possibly re-written.

We could have sent the files out for transcription, but that was another tedious and equally slow process. The manual transcripts would have come back as Word documents with no integration with the editing system, sound files or codebook: just a stack of paper. No time savings, no improvement in efficiency. We decided against it.

It was frustrating, but it was the defined process, the way "it's always been done." There really was not a better way to do it. But there was. Stand by.

I talked to a few people at the studio who could authorize some tests, and because I'm a Final Cut Pro X nerd, I knew there were some possibilities.

I started mucking around with auto-transcription software and devised the following insane process involving (in this order) Avid, Final Cut Pro X, Transcription Software, Producer's Best Friend, Final Cut Pro X, Text Wrangler, Avid. Jeez.

You can probably skip this next part, but if you're really interested, here are the steps it took to make it happen.

Process to get transcriptions into Avid:

  1. Import audio file into ******* for transcription
  2. Check and adjust transcription
  3. From ******, "Send to Final Cut Pro X" (File menu)
  4. FCPX will ask which Library? Select "DX" (or whatever the name is). (If you're working in FCPX, skip the rest.)
  5. Within FCPX make a new 24p project (sequence) with the same start timecode as the sound clip.
  6. Name sequence the same as the file name.
  7. Select the entire newly imported clip (X will do so) and add it to the new timeline (E will do so)
  8. Rename the event which contains the audio file and sequence to the current date
  9. With the timeline or project selected: File > Export XML (choose v1.5)
  10. Open the exported XML in PBF, use the NIN layout (timeline in, keywords, notes only) and add a keyword report.
  11. MS Excel will open the generated spreadsheet. Select the "Keywords" tab.
  12. Adjust the columns to this: keyword : timeline in : empty1 : empty2 : notes : empty3
  13. Set the destination track for the locator by adding "A1" to column "empty1"
  14. Set the color for the locator by adding "red" to column "empty2"
  15. Add "1" to every cell in column "empty3" (this adds a single-frame marker in Avid)
  16. Add the line number, take number and a dash (e.g. 1234_001-) to the beginning of the dialog field.
  17. Select all populated cells (that are not headers) and copy
  18. Open TextWrangler and open a new document
  19. Paste
  20. Command F to find & replace
  21. Find the curly apostrophe and replace all with the straight one (APOSTROPHE, Unicode: U+0027, UTF-8: 27). Curly double quotes can also cause text oddities in Avid. Check after import and adjust.
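As a rough illustration, steps 11 through 21 above boil down to a small text transformation: take each spreadsheet row, bolt on the track, color and duration fields, scrub the smart punctuation, and write out tab-delimited lines for Avid's Marker tool. Here is a minimal Python sketch of that idea; the exact column order and the `rows_to_marker_lines` helper are my assumptions for illustration, not part of any of the tools named above.

```python
# Sketch of the spreadsheet-to-Avid-marker conversion described above.
# Column order assumed: name, timeline-in TC, track (A1), color (red),
# comment (dialog), duration (1 frame).

# Characters that can cause text oddities in Avid, mapped to safe ASCII.
SMART_CHARS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes -> straight apostrophe
    "\u201c": '"', "\u201d": '"',   # curly double quotes -> straight quote
}

def clean(text: str) -> str:
    """Replace smart punctuation with plain ASCII equivalents."""
    for bad, good in SMART_CHARS.items():
        text = text.replace(bad, good)
    return text

def rows_to_marker_lines(rows):
    """rows: (marker_name, timecode, dialog) tuples from the spreadsheet.

    Returns tab-delimited lines ready to paste into a text file for import.
    The line/take prefix (step 16) is assumed to already be in `dialog`.
    """
    lines = []
    for name, tc, dialog in rows:
        fields = [clean(name), tc, "A1", "red", clean(dialog), "1"]
        lines.append("\t".join(fields))
    return lines
```

A row like `("MM", "01:00:05:12", "1234_001-It's a trap!")` comes out as one tab-separated line ending in the `A1`, `red` and `1` fields, which is the same shape the manual Excel edits produce.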

Now you're ready to import into Avid

In Avid:

  1. Subclip audio file from master DAY sequence
  2. Adjust sequence start TC to match the audio file's TC
  3. Open Marker tool and import text file made above
  4. Markers will populate A1
  5. Cut the "marker'd" subclip back into the master DAY sequence
  6. Subclip the "marker'd" section out AGAIN
  7. Begin to subclip every take out.

Easy, right?

Nevertheless, the above steps did offer time savings and I asked the studio exec in charge of Ninjago if we could do this. It would turn days of labor into hours. On what was a brutally short schedule, it would be a godsend. But there was a catch.

Unfortunately on Lego Ninjago, like most films, security is paramount and IT policies precluded us from uploading media to the cloud. So, the process, which required uploading media to the outside world, was a dealbreaker.

With some grumbling, we had to go forward. Large measures of time and piles of money were spent to finish the job. It was frustrating, but I was not deterred. I knew that AI software was the solution, but was there a better, easier, faster, more accurate option out there?

I did a Google search for auto-transcription for production and found a blog entry from Simon Says that caught my eye. I checked it out. Transcription from 90+ languages, translation into 50+ languages, a great price point, and a pretty ambitious menu of NLE export options. Since they offered free credit to new users, I did some tests.

What I found was: very accurate transcriptions, low price, handy features like bookmarks and notes, a familiar text editor and friendly, responsive customer service.

I messaged Simon Says and its founder Shamir Allibhai and started asking questions about what they had done and what the team was working on. I told them about my 28-step problem and suggested a way to drastically reduce the complexity. And me being me, I started asking for features, presenting ideas of what I'd improve (if I had any programming talent at all) and a few ideas out of left field.

After running a couple of small jobs through Simon Says, I was pleased. I had offered to help fix what was a problem for me (the extra 24 steps to get from Final Cut Pro X into Avid), and after a week of back and forth, Simon Says had made an export option that could be imported directly into Avid as locators.

Those 28 steps now took 4. The seamless path from ingest to transcription to importing transcripts into Final Cut Pro X, Adobe Premiere Pro, Avid Media Composer and others was taking shape.

I was impressed by the vision of Simon Says to alleviate the frustrating aspects of production and empower creatives. When Shamir asked me to join the company, I said yes! I'm excited about what features are coming, many based on user feedback and feature requests.

In the past month, we have worked on: exports to Final Cut Pro for Ranges, Markers, and Captions, JSON and Avid Text Markers; the enterprise application of Simon Says (that runs standalone/self-contained on Macs/PCs); an iOS app for pre-production meetings and interviews; and a few secret goodies to come over the next month or two.

If Ninjago was currently in production, the on-premise solution that we were looking for would have been perfect: auto-transcription software that runs on air-gapped computers with export options to all the major NLE programs. Time saved, money saved, and frustration alleviated.

With advancing AI technologies like speech recognition, we are making meaningful progress toward solving the mundane and frustrating aspects of production that take our attention away from the priority: the story.

Mike is a father, writer, filmmaker and aspiring voice actor. His credits include Fight Club, Analyze This, and Monk. He was the first assistant editor on the first studio film edited on Apple's Final Cut Pro X software and wrote a book on the experience loaded with real-world tips, tricks, and proven workflow techniques.

Currently Mike is editing for Warner Bros. Animation and writing a couple spec screenplays.

Mike has recently partnered with Simon Says. Other recent work includes: Young Justice (season 3), Batman: Long Halloween Part 1, and various new Looney Tunes Cartoons.

Follow Mike on Twitter at @fcpxfeatures.
