Solving the TV problem, part 2

AKA, Subs2srs

In the last post I talked about a way of cutting up podcast episodes or other target language audio, turning them into Anki flashcards, and efficiently transferring the audio into your brain.

This time I want to talk about a way of automating the same process and applying it to TV shows or movies.

Unsurprisingly, I’m not the first person to come up with this idea. Someone wrote a program just for turning video files and subtitles into Anki flashcards. It’s called Subs2srs.

I won’t spend much time explaining how it works, because the SourceForge site does a good job of that already. I’ll just talk about my experience using it.

[Note: My only complaint about Subs2srs is that it only runs on Windows. I tried setting up a Windows emulator on my Mac, but this turned out to be too much trouble, so I didn’t get to try Subs2srs until a friend graciously lent me her PC.]

Finding material

The hard part with Subs2srs is just finding the files to work with. You need a video file and a matching subtitles file in the same language as the audio. I haven’t found any strategy that I’m confident enough in to recommend. In general this takes some trial and error, and how you go about it will depend on your comfort with downloading things.

Perhaps the safest method for getting a video file would be borrowing a DVD from the library and ripping it onto your computer with a program like HandBrake.

For subtitles, there are several websites that share free subtitle files in various languages. How long it takes to find them depends on the specific movie or show and on the language. There’s a site called Kitsunekko that’s dedicated to subtitles for shows (and some movies) in Japanese, English, Chinese, and Korean.
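
To make "matching" a bit more concrete: a subtitle file is essentially a list of timed text cues, and those timings have to line up with the audio in your particular video file. The sketch below is only an illustration, not how Subs2srs itself works. It assumes the common .srt format, and the function name `parse_srt` and the sample lines are made up for this example.

```python
import re
from datetime import timedelta

# Matches an .srt timing line such as "00:01:02,500 --> 00:01:04,000"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start, end, text) cues from .srt-formatted text."""
    cues = []
    # Cues are separated from each other by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
                end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
                # Everything after the timing line is the subtitle text.
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break
    return cues


# Hypothetical sample content, not taken from any real subtitle file.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
いらっしゃい。

2
00:00:04,000 --> 00:00:06,250
ラーメンを一つ。
"""

for start, end, line in parse_srt(SAMPLE):
    print(start.total_seconds(), end.total_seconds(), line)
```

If the first few cues start noticeably before or after the corresponding lines are spoken in your video, the subtitle file was probably timed for a different release of the show.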

So far I’ve only tried Subs2srs with Japanese, as resources are relatively plentiful. I’ve tried various anime as well as the TV show 深夜食堂 (Midnight Diner). After some time, which depends on my number of new cards per day and the length of the show, I can watch the original show/movie with pretty much total comprehension.
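
As a rough back-of-the-envelope sketch of that dependence: the number of days needed just to see every card once is the deck size divided by the number of new cards per day, rounded up. The deck size below is a made-up figure for illustration, and this ignores the extra time spent on reviews.

```python
import math


def days_to_introduce(total_cards, new_cards_per_day):
    """Days needed to see every card in a deck once (reviews not included)."""
    if new_cards_per_day <= 0:
        raise ValueError("new_cards_per_day must be positive")
    return math.ceil(total_cards / new_cards_per_day)


# Hypothetical deck of 400 cards at 5 new cards per day.
print(days_to_introduce(400, 5))  # 80
```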

There are tradeoffs when it comes to choosing the content. In general, it’s good to choose content that you’re interested in, but it’s also worth considering how usable the language is. I enjoyed doing the reps for Fullmetal Alchemist and Attack on Titan, but I have rarely had the chance to use most of the language from those shows in daily life. Midnight Diner is a little better, though some accents are hard to understand, and there’s a lot of slang that I still don’t have enough experience or context to get the hang of. This probably just means I need more practice or some exposure to this kind of speech in real conversation.

Shortcut: use a pre-made deck

Maybe you don’t have access to a Windows PC. Maybe you can’t be bothered to make your own Subs2srs deck. Or maybe you just don’t know where to start. Luckily, some deck makers have been thoughtful enough to share what they’ve made. The biggest repository of Subs2srs decks I’ve found is http://japanesedecks.blogspot.com. It’s unfortunate that it’s limited to Japanese, but if you’re learning Japanese, there’s a lot here to start with. AnkiWeb sometimes has some Subs2srs decks as well, but these tend to get taken down.

Another option is to find someone who has already created their own decks and reach out to them (I know at least one such person…). They may be willing to share their decks with you. It stands to reason that someone who is passionate and nerdy enough about language learning to make their own decks would also be excited to find someone else who might benefit from their work.

Case study: Tampopo (1985) – extracting subtitles from a .mkv

One movie I fell in love with recently is Tampopo, a “Ramen Western” bizarre comedy about two truck drivers, a single mother, and a motley gang of other characters on a quest to make the perfect bowl of ramen. I don’t know how everyday or usable the language is in this movie either, but I decided that I wanted to learn the lines anyway, just for fun.

The only hitch was I couldn’t find the subtitles file anywhere. The movie itself was an .mkv, which came with embedded subtitles. But Subs2srs needs a separate subtitles file to make the cards.

This is where another program came in handy, the descriptively titled MKVExtractGUI-2, also Windows-specific. I was able to use this to get a separate subtitles file out of the Tampopo .mkv. As far as I can tell, it doesn’t convert anything: it just pulls out the subtitle track in whatever format it was stored in. In this case the track was image-based, so the subtitles are tiny images instead of text. This comes with the downside that I can’t copy and paste the subtitles into a dictionary or Google Translate when there’s something I don’t understand. But it’s not a big deal: Subs2srs also lets you add a native language subtitles file, which can go on the back of the card alongside the target language subtitles.
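
Whether an extracted track comes out as text or as images generally depends on how the subtitles were stored inside the .mkv. Each track in a Matroska file carries a codec ID, which tools such as mkvinfo should report. The sketch below is a rough illustration that classifies a handful of well-known subtitle codec IDs. The function name `subtitle_kind` is made up for this example, and the list of IDs is not exhaustive.

```python
# A few common image-based subtitle codec IDs from the Matroska format.
IMAGE_CODEC_IDS = {"S_VOBSUB", "S_HDMV/PGS", "S_DVBSUB"}


def subtitle_kind(codec_id):
    """Classify a Matroska subtitle codec ID as 'text', 'image', or 'unknown'."""
    codec_id = codec_id.strip().upper()
    # Text-based formats (SRT, SSA/ASS, WebVTT) share the S_TEXT/ prefix.
    if codec_id.startswith("S_TEXT/"):
        return "text"
    if codec_id in IMAGE_CODEC_IDS:
        return "image"
    return "unknown"


for codec in ["S_TEXT/UTF8", "S_TEXT/ASS", "S_VOBSUB", "S_HDMV/PGS"]:
    print(codec, "->", subtitle_kind(codec))
```

A text track can be copied and pasted into a dictionary. An image track would first need a separate optical character recognition step to become text.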

Here’s an example of what one of my cards looks like. On the front of the card I see the still image and hear the audio, and on the back I see the subtitle.

I’ve only been reviewing these cards for about a week, at only five cards a day (I’m also doing Midnight Diner, and trying not to get overwhelmed). I’ll add more news here once I’ve made more progress.

More languages

As I said, I’ve only tried Subs2srs so far with Japanese. I’m curious to see what it’s like in other languages, like German, Chinese, and Korean, and someday maybe even Thai or Taiwanese Hokkien. I’m also interested in starting to compile a list of movies or shows particularly suited to Subs2srs, or for which it’s easy to find video and subtitle files. If you do give Subs2srs a try, I would love to hear about your experience.

Update June 9, 2017: I’ve been doing the Midnight Diner and Tampopo decks for a couple weeks now. I also created a German deck from the TV show Deutschland ’83 a couple days ago. Here are my impressions:

Midnight Diner – There’s quite a lot of slang and domain-specific vocabulary, e.g. stuff related to the occupation of the main characters of the episode. Most characters speak pretty fast, and some seem to have strong accents, but the subtitles are pretty precise and the audio quality is clear, which makes up for this.

Tampopo – There’s an even more diverse array of accents, slang, and silly, exaggeratedly pompous language (like when the “ramen master” is explaining how to properly show respect for the different ramen components). The audio quality isn’t always very clear, and the subtitles sometimes elide parts of the speech. In some cases this makes it impossible for me to really learn the line. I think this is fine, though. I can toss the cards that are too much of a pain to use. The theatricality of the speech has made it fun to learn so far.

Deutschland ’83 – This is a recent thriller about an East German youth who gets blackmailed by his aunt into working as a spy in West Germany during a nuclear crisis. Given the subject matter, the language is a little bit advanced, but the entertainment factor helps make up for the difficulty. It’s been tough to find German subtitles for German movies and TV. I found one website with lots of German subtitles, but I haven’t yet figured out if it has subs for German shows or if it’s all aimed at foreign-language material.

Credits

Nihongo Shark is another site that has a thorough discussion of Subs2srs and how to use it to learn from anime. It’s interesting that the majority of Subs2srs users seem to be focused on Japanese. Then again, Anki itself, which isn’t tied to any particular language, gets its name from Japanese (暗記). Coincidence? The site’s creator at one point uploaded some of his own anime decks to AnkiWeb, but they’ve since disappeared.

Learn Any Language has a page that explains how to use Subs2srs, and contains some links to other users who have tried it with various languages.
