Introduction of Japanese subtitles in Netflix – Netflix TechBlog – Medium

Netflix has provided Japanese subtitles since the launch of its service in Japan in September 2015. In this post, I explain the technical work that went into offering Japanese subtitles: the subtitle source file specification, the conversion model from subtitle source files to Netflix delivery subtitles, the Netflix delivery model for Japanese subtitles, and so on. I also touch on the W3C subtitle standard Timed Text Markup Language 2 (TTML 2) and its adoption.

In late 2014, Netflix was building the technical features needed for the launch in Japan planned for September 2015. At the time, I was well aware that subtitle quality was a problem for another company's streaming service in the Japanese market. To maintain Netflix's high quality standard, we set out to implement every "essential" feature of Japanese subtitles, as expected of a high-quality video service in Japan. These features came on top of the following existing requirements:

  • Subtitles are delivered separately from the video (i.e., burned-in subtitles are not allowed)
  • Subtitle source files are delivered to Netflix in text format

Essential features of Japanese subtitles


Market research and advice from experts on the Japanese language and media pointed to five essential features of Japanese subtitles: ruby, emphasis dots (bouten), vertical writing, italics, and tate-chu-yoko (numbers set horizontally within vertical subtitles). Implementing these features added a major challenge on top of the existing requirements.

Ruby

Ruby explains a specific word. For example, it conveys the meaning of unfamiliar words, loanwords, or slang, or gives the reading of unusual or less well-known kanji. It may also explain the cultural background of a translation so that viewers can understand and enjoy the content more deeply. Ruby is usually set in a smaller font size than the subtitle text. In a one-line subtitle, or on the first line of a two-line subtitle, ruby is placed above the characters. If ruby occurs on the second line of a two-line subtitle, it is placed below the characters. Ruby is never placed between the two lines of a subtitle, because that would make it hard to tell which line it annotates. The ruby example in Figure 1 was applied to the subtitle for the line "All he ever amounted to was chitlins."

Figure 1: Example of ruby

By attaching the transliteration as ruby to the translation of the word "chitlins" *, viewers can associate the spoken keyword with the translated word. As mentioned above, ruby is never placed between the two lines of a subtitle. Figure 2 shows the correct ruby placement for a two-line subtitle. If a subtitle needs three lines, place ruby above the characters on the first and second lines and below the characters on the third line.

Figure 2: Correct placement of ruby in a two-line subtitle
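The placement rule above can be written as a small function. This is only a sketch of the rule as stated in this post; the function name `ruby_position` and the 0-based line indexing are assumptions made for illustration, not part of any Netflix tool.

```python
def ruby_position(line_index: int, line_count: int) -> str:
    """Return "above" or "below" for ruby on one line of a subtitle.

    line_index is 0-based; line_count is the number of lines in the subtitle.
    """
    if line_count < 1 or not 0 <= line_index < line_count:
        raise ValueError("line_index must refer to a line of the subtitle")
    # In a multi-line subtitle, only the last line carries ruby below the text.
    # For two-line subtitles this keeps ruby out of the gap between the lines.
    if line_count > 1 and line_index == line_count - 1:
        return "below"
    return "above"
```

For a two-line subtitle, `ruby_position(0, 2)` gives "above" and `ruby_position(1, 2)` gives "below", matching Figure 2.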


Emphasis dots (bouten)

Emphasis dots are placed above or below a word or phrase to emphasize it, playing the same role as italics in English. They help convey the meaning of the word and make the translation richer and more powerful. The example in Figure 3 was applied to the subtitle for the line "I need someone to talk to."

In the subtitle shown in the figure above, the emphasis dots are placed on the characters that translate the word "talk". Emphasizing that word conveys that, in this scene, the speaker needs someone who can provide information that only a specific person knows.

Vertical subtitles

Vertical subtitles are mainly used to avoid overlapping with text that appears on screen in the video. They are the counterpart of displaying subtitles at the top of the screen in English. Figure 4 shows an example.

Figure 4: Simultaneously displaying credits on the screen and vertical subtitles

Tate-chu-yoko

In Japanese typography, vertically written text often contains short runs of horizontally set numerals and Latin letters. This is called tate-chu-yoko (horizontal-in-vertical). Setting half-width characters side by side instead of stacking them vertically makes them easier to read and fits more characters into one subtitle line. The example in Figure 5 is the subtitle for the line "It's as if we are still 23 years old". In this example, the half-width numerals "23" are set side by side within the vertical line.

Figure 5: Vertical subtitle including figures arranged side by side
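As a rough illustration, a tool could locate the half-width numeral runs that are candidates for side-by-side setting before marking them up. The sketch below is hypothetical: the function name `find_tate_chu_yoko_runs` and the default limit of two digits per run are assumptions, not rules taken from this post.

```python
import re


def find_tate_chu_yoko_runs(text: str, max_len: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) spans of short half-width digit runs in text.

    max_len is an assumed limit; longer runs are left to be set vertically.
    """
    runs = []
    # [0-9] matches only half-width (ASCII) digits, not full-width ones.
    for match in re.finditer(r"[0-9]+", text):
        if len(match.group()) <= max_len:
            runs.append((match.start(), match.end()))
    return runs
```

For example, `find_tate_chu_yoko_runs("まだ23歳みたいだ")` returns `[(2, 4)]`, the span covering "23".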


Italics

As with italics in English, italics are used for narration, off-screen dialogue, and forced subtitles. In Japanese subtitles, however, the slant direction differs between horizontal and vertical subtitles, and the slant angle is not necessarily constant. Figures 6 and 7 show examples.

Figure 6: Horizontal subtitle italics
Figure 7: Vertical subtitle italics

Sourcing Japanese subtitles

Subtitle assets in the entertainment industry come in one of two forms: structured text or binary files, or rendered images. For the Netflix content ingestion system, we have always required the former. There are several reasons. First, subtitle capabilities differ between clients, so we need to generate a variety of client assets from a single source. Second, text subtitle sources are future-proof: as new device capabilities come to market, they can be applied without difficulty to the large back catalog of subtitle assets that Netflix holds. For example, when subtitles are displayed over HDR content on an HDR device, it is advisable to specify a luminance gain so that white text is not rendered at the peak white of specular highlights. With a text subtitle source, it is easy to generate subtitles for client profiles that support luminance gain. With an image-based subtitle source, by contrast, applying similar processing to client assets is very difficult. Finally, text is far superior to opaque image assets in terms of searchability for analytics and natural language processing.
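To make the first two points concrete, here is a minimal sketch of deriving client-specific cues from one text source. Everything in it is hypothetical: the dictionary layout, the `hdr` and `luminance_gain` profile keys, and the gain value of 0.5 are invented for illustration and do not describe Netflix's actual pipeline.

```python
def build_client_cue(source_cue: dict, profile: dict) -> dict:
    """Derive a client-specific cue from a single text-based source cue."""
    cue = dict(source_cue)  # shallow copy so the source cue stays untouched
    if profile.get("hdr"):
        # Hypothetical field: ask HDR clients to render text below peak white.
        cue["luminance_gain"] = profile.get("luminance_gain", 1.0)
    return cue


source = {"begin": 1.0, "end": 3.5, "text": "こんにちは", "color": "white"}
sdr_cue = build_client_cue(source, {"hdr": False})
hdr_cue = build_client_cue(source, {"hdr": True, "luminance_gain": 0.5})
```

The same source cue yields both assets, which is the kind of processing that is hard to apply once the subtitle has been rendered into an image.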

With text-format subtitle sources as a mandatory condition, we examined the options available for Japanese and chose Videotron Lambda (also called the LambdaCap format) as the only acceptable model for Japanese subtitles. The analysis showed that the LambdaCap format has the following properties:

  • It is reasonably open, so Netflix can build its own tools and workflows around it.
  • It is currently the subtitle format most widely supported by Japanese subtitling tools. This was especially important for enabling existing Japanese subtitling companies to create subtitles for Netflix.
  • It is the most common archive format for existing Japanese subtitles. Being able to ingest existing assets without conversion was an important point in favor of LambdaCap.
  • It supports the essential features of Japanese subtitles described above.
  • It is widely used in the industry to create image-based subtitle files for burn-in. In other words, it has been thoroughly tested.

Although we chose Videotron Lambda for the launch of Netflix in Japan, it was not a good option for the long run. It is not an industry standard format, and its specification contains ambiguities. While the LambdaCap format supports the essential features of Japanese subtitles, it lacks some of the basic features supported by web platform standards such as TTML 1, including color, font information, and various layout and composition primitives. We also decided not to use the LambdaCap format as the delivery model to playback devices in the Netflix ecosystem. Meanwhile, the Timed Text Working Group (TTWG) was working on TTML 2, the second version of the TTML standard. One of the goals of TTML 2 was to support subtitles in many languages, with Japanese subtitles as a leading case. Netflix therefore worked with the TTWG on the standardization of TTML 2, helped complete the specification based on its accumulated experience, and carried out the implementation work described later. As a result, TTML 2 has become the canonical representation of all source formats in Netflix's closed caption processing pipeline.

Mapping Japanese subtitle features to TTML 2

Table 1 summarizes the mapping between the essential Japanese subtitle features described above and the constructs provided by TTML 2. It also shows usage statistics for these features in Netflix's Japanese subtitle catalog and the preferred mode of use in the Netflix ecosystem. Features that are currently unused or rarely used are expected to become widely used in the future. † The following sections describe each feature in detail, particularly the supported values.

Table 1: Summary of the mapping between Japanese subtitle features and TTML 2 style constructs


tts:ruby

This style attribute specifies the structural aspects of ruby content, including how the base text that the ruby annotates and the ruby annotation itself are marked up. The range of values associated with tts:ruby maps to the corresponding HTML markup elements. As shown in the TTML sample, "container" marks up the span that contains both the base text and the ruby, while "base" and "text" mark up the base text and the ruby respectively. The rendering of this sample should match Figure 8. <Style xml: id = "s3 (19459083)]