Japanese BERT
Learn about the Japanese BERT model along with its different variants.
The Japanese BERT model is pre-trained on Japanese Wikipedia text with whole word masking (WWM). The Japanese text is first tokenized with MeCab, a morphological analyzer for Japanese. The MeCab output is then passed to the WordPiece tokenizer to obtain the subwords. Instead of using the WordPiece tokenizer and splitting the text into ...
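The two-stage tokenization described above (MeCab first, then WordPiece) can be illustrated with a small toy sketch. This is not the model's actual tokenizer: the MeCab segmentation is hard-coded as a hypothetical word list, the vocabulary is a tiny made-up set, and `wordpiece` is an illustrative helper that implements the usual greedy longest-match-first splitting.

```python
# Toy sketch of the two-stage tokenization: morphological segmentation
# (hard-coded here in place of a real MeCab call) followed by WordPiece.

def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first WordPiece split of a single word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no vocabulary entry matches at this position
        pieces.append(piece)
        start = end
    return pieces

# Assumption: a MeCab-style segmentation of "自然言語処理を学ぶ".
# A real analyzer and dictionary may segment this sentence differently.
words = ["自然", "言語", "処理", "を", "学ぶ"]

# Assumption: a tiny hypothetical vocabulary, not the model's real one.
vocab = {"自然", "言語", "処", "##理", "を", "学", "##ぶ"}

tokens = [piece for word in words for piece in wordpiece(word, vocab)]
print(tokens)
```

Words that are in the vocabulary stay whole, while words that are not are split into a leading piece and `##`-prefixed continuation pieces. Because the `##` pieces record which subwords belong to the same word, WWM can mask all pieces of a word together.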