In my tutorial Localization With I18n for the Building Your Startup With PHP series, I created sample Spanish code by cutting and pasting text strings into Google Translate. I began to wonder if I could integrate the Google Translate API with the Yii Framework's I18n resource extraction script to automate translation for a number of countries. I posted a feature request at the Yii Forum and then decided to see if I could build the feature myself.
In this tutorial, I'll walk you through my extensions to the Yii I18n extract script which do exactly this. And I'll demonstrate translating my startup application, Meeting Planner, into a handful of languages.
Keep in mind, Google Translate isn't perfect and it doesn't address issues related to time and date formats and currencies. But for a quick and affordable (free) way to build default translations for your web application into 50+ languages, this is an ideal solution.
Here's one of the more noticeable errors I ran into in testing; luckily, these are rare:
'{nFormatted} TB' => '{nFormatted} tuberculosis',
If you need a more professional approach, a friend pointed me to a paid service for managing localization within apps, Transifex. I haven't checked it out myself but it looks intriguing.
Working With Google Translate
What Languages Does It Support?
Google Translate offers translation services for 64 languages, including Swedish but sadly not Swedish Chef:
Here's a sampling of Google's supported languages—see the full list here:
Talking to the Google Translate API
I found two Composer libraries for working with the Google Translate API in PHP:
- Levan Velijanashvili's Google Translate Library
- Travis Tillotson's Google Translation Client
I found Velijanashvili's first so it's what I used in this tutorial. It leverages Google Translate through its free RESTful web interface so you do not need an API key. However, if you have a large library of resources or plan to translate many languages, you'll likely want to integrate Tillotson's as it is fully integrated with Google Translate's paid service via keys.
For this tutorial, I'm building on the Building Your Startup With PHP series codebase. To install Velijanashvili's Google Translate Library, just type:
composer require stichoza/google-translate-php
Here's some sample code to translate from English to Spanish:
use Stichoza\Google\GoogleTranslate;

echo GoogleTranslate::staticTranslate('hello world', "en", "es") . "\n";
It should output:
hola mundo
Extending Yii2's I18n Message/Extract Script
How Yii2's I18n Support Works Today
At this point, you may want to review my Localization With I18n tutorial, which explains how to extract the message strings that need translating.
You can use Yii's Gii code generator to generate models and CRUD code which automatically integrates I18n support. Every string in the code is replaced by a function call such as Yii::t('category', 'text string to translate').
Yii offers a console command message/extract which finds all these function calls in your application and creates a directory tree of files by language and category for translations of all of these strings.
Here's an example string file for German:
<?php
/**
 * Message translations.
 *
 * This file is automatically generated by 'yii translate' command.
 * It contains the localizable messages extracted from source code.
 * You may modify this file by translating the extracted messages.
 *
 * Each array element represents the translation (value) of a message (key).
 * If the value is empty, the message is considered as not translated.
 * Messages that no longer need translation will have their translations
 * enclosed between a pair of '@@' marks.
 *
 * Message string can be used with plural forms format. Check i18n section
 * of the guide for details.
 *
 * NOTE: this file must be saved in UTF-8 encoding.
 */
return [
    'Get started with Yii' => 'Machen Sie sich mit Yii begonnen',
    'Heading' => 'Überschrift',
    'My Yii Application' => 'Meine Yii-Anwendung',
    'Yii Documentation' => 'Yii Dokumentation',
    'Yii Extensions' => 'Yü -Erweiterungen',
    'Yii Forum' => 'Yii Forum',
    'Are you sure you want to delete this item?' => 'Sind Sie sicher, Sie wollen diesen Inhalt löschen ?',
    'Congratulations!' => 'Herzlichen Glückwunsch!',
    'Create' => 'schaffen',
    'Create {modelClass}' => 'schaffen {modelClass}',
    'Created At' => 'Erstellt am',
    'Delete' => 'löschen',
    'ID' => 'Identifikation',
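To make the role of these files concrete, here is a minimal sketch of how a lookup against such an array behaves. It is a simplified stand-in, not Yii's implementation: the function name lookupMessage() and the $catalog variable are hypothetical, only plain {name} placeholders are substituted, and it assumes PHP 7 or later.

```php
<?php
// Simplified stand-in for a message lookup; NOT Yii's actual code.
// $catalog mimics the array returned by a message file such as the German one above.
function lookupMessage(array $catalog, string $message, array $params = []): string
{
    // An empty value means "not translated", so fall back to the source string.
    $translation = $catalog[$message] ?? '';
    if ($translation === '') {
        $translation = $message;
    }

    // Substitute plain {name} placeholders. ICU forms such as
    // {n, plural, ...} are deliberately left untouched in this sketch.
    $replace = [];
    foreach ($params as $name => $value) {
        $replace['{' . $name . '}'] = (string) $value;
    }
    return strtr($translation, $replace);
}

$de = [
    'Create {modelClass}' => 'schaffen {modelClass}',
    'Heading' => 'Überschrift',
    'Update' => '', // extracted but not yet translated
];

echo lookupMessage($de, 'Create {modelClass}', ['modelClass' => 'Meeting']), "\n"; // schaffen Meeting
echo lookupMessage($de, 'Update'), "\n"; // Update
```

The placeholder survives in the translated value, which is why the tokens in curly braces must never be altered by the translation step.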
Here's an example of the directory paths:
Extending Message/Extract for Google Translate
I chose the approach of creating a replacement script called translate/google_extract which would call Google Translate whenever it needed to translate a string.
Preventing Broken Code From Translating Tokens
Because I18n integrates parameter tokens in curly braces for variable values, I ran into some problems right away. For example, here are some I18n strings which include tokens and nested tokens:
'Create {modelClass}'
'Registered at {0, date, MMMM dd, YYYY HH:mm} from {1}'
'{0, date, MMMM dd, YYYY HH:mm}'
'{nFormatted} {n, plural, =1{gibibyte} other{gibibytes}}'
The Google Translate API does not have a parameter for ignoring tokens such as these in this form. But we can't translate these because they correspond to variable names and format strings in code.
It did not appear to me that a regular expression could solve this where translatable strings and tokens were present together. It's likely that readers may have a more efficient solution than I found for solving this problem—if one is clear to you, please post it in the comments.
I chose to scan the strings by character and track the nesting of curly braces. I'll be the first to admit there may be a better way. Here's my function parse_safe_translate():
/*
 * parses a string into an array
 * splitting by any curly bracket segments
 * including nested curly brackets
 */
public function parse_safe_translate($s)
{
    $debug = false;
    $result = array();
    $start = 0;
    $nest = 0;
    $ptr_first_curly = 0;
    $total_len = strlen($s);
    for ($i = 0; $i < $total_len; $i++) {
        if ($s[$i] == '{') {
            // found left curly
            if ($nest == 0) {
                // it was the first one, nothing is nested yet
                $ptr_first_curly = $i;
            }
            // increment nesting
            $nest += 1;
        } elseif ($s[$i] == '}') {
            // found right curly
            // reduce nesting
            $nest -= 1;
            if ($nest == 0) {
                // end of nesting
                if ($ptr_first_curly - $start >= 0) {
                    // push string leading up to first left curly
                    $prefix = substr($s, $start, $ptr_first_curly - $start);
                    if (strlen($prefix) > 0) {
                        array_push($result, $prefix);
                    }
                }
                // push (possibly nested) curly string
                $suffix = substr($s, $ptr_first_curly, $i - $ptr_first_curly + 1);
                if (strlen($suffix) > 0) {
                    array_push($result, $suffix);
                }
                if ($debug) {
                    echo '|'.substr($s, $start, $ptr_first_curly - $start - 1)."|\n";
                    echo '|'.substr($s, $ptr_first_curly, $i - $ptr_first_curly + 1)."|\n";
                }
                $start = $i + 1;
                $ptr_first_curly = 0;
                if ($debug) {
                    echo 'next start: '.$start."\n";
                }
            }
        }
    }
    $suffix = substr($s, $start, $total_len - $start);
    if ($debug) {
        echo 'Start:'.$start."\n";
        echo 'Pfc:'.$ptr_first_curly."\n";
        echo $suffix."\n";
    }
    if (strlen($suffix) > 0) {
        array_push($result, substr($s, $start, $total_len - $start));
    }
    return $result;
}
It converts an I18n string into an array of elements separated into translatable and untranslatable elements. For example, this code:
$message = 'The image "{file}" is too large. The height cannot be larger than {limit, number} {limit, plural, one{pixel} other{pixels}}.';
print_r($this->parse_safe_translate($message));
Generates this output:
Array
(
    [0] => The image "
    [1] => {file}
    [2] => " is too large. The height cannot be larger than 
    [3] => {limit, number}
    [4] =>  
    [5] => {limit, plural, one{pixel} other{pixels}}
    [6] => .
)
Whenever the extract process identifies a new string to translate, it breaks the string into these parts and calls the Google Translate API for each translatable part, that is, any part that doesn't begin with a left curly brace. Then it concatenates those translations with the tokenized strings back into a single string.
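On the regular-expression question: PCRE, which PHP's preg_* functions use, supports recursive patterns, so a balanced run of curly braces can be matched in a single expression. The following is a sketch of that alternative, not the code used in this tutorial; the function name splitTokens() is hypothetical, and it assumes PHP 7 or later.

```php
<?php
// Alternative sketch: split a message into {...} tokens and plain text
// using a recursive PCRE pattern instead of a character-by-character scan.
function splitTokens(string $s): array
{
    // Group 1 matches one balanced brace group:
    //   \{            an opening brace
    //   [^{}]++       a run of non-brace characters, or
    //   (?1)          a nested balanced group (recursion into group 1)
    //   \}            the matching closing brace
    $pattern = '/(\{(?:[^{}]++|(?1))*\})/';

    // DELIM_CAPTURE keeps the tokens in the result; NO_EMPTY drops empty pieces.
    return preg_split($pattern, $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$message = 'The image "{file}" is too large. The height cannot be larger than '
    . '{limit, number} {limit, plural, one{pixel} other{pixels}}.';

print_r(splitTokens($message));
```

For the sample message this yields the same seven parts as the character scan. One difference in behavior: an unbalanced brace is simply left inside the surrounding text rather than tracked as nesting.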
Translating a Tokenized String With Google Translate
Here's the function getGoogleTranslation() for a string and a destination language. The source language is determined by Yii::$app->language.
public function getGoogleTranslation($message, $language)
{
    $arr_parts = $this->parse_safe_translate($message);
    $translation = '';
    foreach ($arr_parts as $str) {
        if (!stristr($str, '{')) {
            if (strlen($translation) > 0 and substr($translation, -1) == '}') {
                $translation .= ' ';
            }
            $translation .= GoogleTranslate::staticTranslate($str, Yii::$app->language, $language);
        } else {
            // add space prefix unless it's first
            if (strlen($translation) > 0) {
                $translation .= ' '.$str;
            } else {
                $translation .= $str;
            }
        }
    }
    print_r($translation);
    return $translation;
}
I found that the combination of these approaches worked almost perfectly in my testing.
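The function above re-inserts spaces around tokens by rule. A possible variation, sketched below, is to remember the whitespace that surrounded each text part and put it back after translation, so spacing follows the source string exactly. This is an illustration under stated assumptions, not the tutorial's code: translatePreservingTokens() is a hypothetical name, the translator is passed in as a callable, and a stub that upper-cases text stands in for Google Translate so the example runs offline (PHP 7 or later).

```php
<?php
// Hypothetical helper: translate only the plain-text parts of a message,
// copying {...} tokens and the original spacing through unchanged.
function translatePreservingTokens(string $message, callable $translate): string
{
    // Split into balanced {...} tokens and the text between them
    // (recursive PCRE pattern; group 1 is one balanced brace group).
    $parts = preg_split(
        '/(\{(?:[^{}]++|(?1))*\})/',
        $message,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $out = '';
    foreach ($parts as $part) {
        if ($part[0] === '{' || trim($part) === '') {
            // Tokens and whitespace-only gaps are never sent to the translator.
            $out .= $part;
            continue;
        }
        // Separate leading/trailing whitespace from the text to translate.
        preg_match('/^(\s*)(.*?)(\s*)$/s', $part, $m);
        $out .= $m[1] . $translate($m[2]) . $m[3];
    }
    return $out;
}

// Stub translator for demonstration only: upper-cases instead of translating.
$fake = function (string $text): string {
    return strtoupper($text);
};

echo translatePreservingTokens('Registered at {0, date, MMMM dd, YYYY HH:mm} from {1}', $fake), "\n";
// REGISTERED AT {0, date, MMMM dd, YYYY HH:mm} FROM {1}
```

Neither approach handles languages whose word order would move a token to a different position in the sentence; each text part is still translated in isolation.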
Customizing Yii's Message/Extract
Yii's I18n implementation supports loading resource strings from .PO files, .PHP files (which I use) and the database. For this tutorial, I've customized Message/Extract for the PHP file generation.
I copied and extended message/extract in /console/controllers/TranslateController.php. Because of PHP 5.6.x's strict rules, I renamed saveMessagesToPHP to saveMessagesToPHPEnhanced and saveMessagesCategoryToPHP to saveMessagesCategoryToPHPEnhanced.
Here's the saveMessagesToPHPEnhanced() function:
/**
 * Writes messages into PHP files
 *
 * @param array $messages
 * @param string $dirName name of the directory to write to
 * @param boolean $overwrite if existing file should be overwritten without backup
 * @param boolean $removeUnused if obsolete translations should be removed
 * @param boolean $sort if translations should be sorted
 * @param string $language language to translate to
 */
protected function saveMessagesToPHPEnhanced($messages, $dirName, $overwrite, $removeUnused, $sort, $language)
{
    foreach ($messages as $category => $msgs) {
        $file = str_replace("\\", '/', "$dirName/$category.php");
        $path = dirname($file);
        FileHelper::createDirectory($path);
        $msgs = array_values(array_unique($msgs));
        $coloredFileName = Console::ansiFormat($file, [Console::FG_CYAN]);
        $this->stdout("Saving messages to $coloredFileName...\n");
        $this->saveMessagesCategoryToPHPEnhanced($msgs, $file, $overwrite, $removeUnused, $sort, $category, $language);
    }
}
It calls the saveMessagesCategoryToPHPEnhanced function:
/**
 * Writes category messages into PHP file
 *
 * @param array $messages
 * @param string $fileName name of the file to write to
 * @param boolean $overwrite if existing file should be overwritten without backup
 * @param boolean $removeUnused if obsolete translations should be removed
 * @param boolean $sort if translations should be sorted
 * @param string $category message category
 * @param string $language language to translate to
 * @param boolean $force whether to rebuild the file even when there are no new messages
 */
protected function saveMessagesCategoryToPHPEnhanced($messages, $fileName, $overwrite, $removeUnused, $sort, $category, $language, $force = true)
{
    if (is_file($fileName)) {
        $existingMessages = require($fileName);
        sort($messages);
        ksort($existingMessages);
        if (!$force) {
            if (array_keys($existingMessages) == $messages) {
                $this->stdout("Nothing new in \"$category\" category... Nothing to save.\n\n", Console::FG_GREEN);
                return;
            }
        }
        $merged = [];
        $untranslated = [];
        foreach ($messages as $message) {
            if (array_key_exists($message, $existingMessages) && strlen($existingMessages[$message]) > 0) {
                $merged[$message] = $existingMessages[$message];
            } else {
                $untranslated[] = $message;
            }
        }
        ksort($merged);
        sort($untranslated);
        $todo = [];
        foreach ($untranslated as $message) {
            $todo[$message] = $this->getGoogleTranslation($message, $language);
        }
        ksort($existingMessages);
        foreach ($existingMessages as $message => $translation) {
            if (!isset($merged[$message]) && !isset($todo[$message]) && !$removeUnused) {
                if (!empty($translation) && strncmp($translation, '@@', 2) === 0 && substr_compare($translation, '@@', -2, 2) === 0) {
                    $todo[$message] = $translation;
                } else {
                    $todo[$message] = '@@' . $translation . '@@';
                }
            }
        }
        $merged = array_merge($todo, $merged);
        if ($sort) {
            ksort($merged);
        }
        if (false === $overwrite) {
            $fileName .= '.merged';
        }
        $this->stdout("Translation merged.\n");
    } else {
        $merged = [];
        foreach ($messages as $message) {
            $merged[$message] = '';
        }
        ksort($merged);
    }

    $array = VarDumper::export($merged);
    $content = <<<EOD
<?php
/**
 * Message translations.
 *
 * This file is automatically generated by 'yii {$this->id}' command.
 * It contains the localizable messages extracted from source code.
 * You may modify this file by translating the extracted messages.
 *
 * Each array element represents the translation (value) of a message (key).
 * If the value is empty, the message is considered as not translated.
 * Messages that no longer need translation will have their translations
 * enclosed between a pair of '@@' marks.
 *
 * Message string can be used with plural forms format. Check i18n section
 * of the guide for details.
 *
 * NOTE: this file must be saved in UTF-8 encoding.
 */
return $array;

EOD;

    file_put_contents($fileName, $content);
    $this->stdout("Translation saved.\n\n", Console::FG_GREEN);
}
Unfortunately, the original Message/Extract code is not commented. While there may be some additional improvements that can be made, I simply added a call to the Google Translate API here:
foreach ($untranslated as $message) {
    $todo[$message] = $this->getGoogleTranslation($message, $language);
}
And I added a parameter ($force=true) to force recreation of the message files:
if (!$force) {
    if (array_keys($existingMessages) == $messages) {
        $this->stdout("Nothing new in \"$category\" category... Nothing to save.\n\n", Console::FG_GREEN);
        return;
    }
}
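Stripped of file handling and console output, the merge step comes down to one rule: keep any non-empty existing translation, and send everything else to the translator. Here's a minimal sketch of that rule in isolation. The function name mergeMessages() and the stub translator are hypothetical, it leaves out the '@@' handling for obsolete strings, and it assumes PHP 7 or later.

```php
<?php
// Sketch of the merge rule with a pluggable translator (not Yii code).
// $messages: strings extracted from source; $existing: current message file contents.
function mergeMessages(array $messages, array $existing, callable $translate): array
{
    $merged = [];
    foreach ($messages as $message) {
        if (isset($existing[$message]) && $existing[$message] !== '') {
            // Already translated: keep the existing value.
            $merged[$message] = $existing[$message];
        } else {
            // New string, or present with an empty value: translate it now.
            $merged[$message] = $translate($message);
        }
    }
    ksort($merged);
    return $merged;
}

// Stub translator for demonstration only: tags the string instead of translating.
$stub = function (string $message): string {
    return '[de] ' . $message;
};

$existing = ['Create' => 'schaffen', 'ID' => ''];
print_r(mergeMessages(['Delete', 'Create', 'ID'], $existing, $stub));
```

Here 'Create' keeps its existing translation while 'Delete' and 'ID' go to the translator. Passing an empty array for $existing covers the case where no message file exists yet, so every string is translated in a single pass.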
Translating Meeting Planner
Testing complete, I was excited to translate Meeting Planner into more languages. First, we add the new languages to the /console/config/i18n.php file:
<?php
return [
    // string, required, root directory of all source files
    'sourcePath' => __DIR__ . DIRECTORY_SEPARATOR . '..' . DIRECTORY_SEPARATOR . '..' . DIRECTORY_SEPARATOR,
    // Root directory containing message translations.
    'messagePath' => __DIR__ . DIRECTORY_SEPARATOR . '..' . DIRECTORY_SEPARATOR . 'messages',
    // array, required, list of language codes that the extracted messages
    // should be translated to. For example, ['zh-CN', 'de'].
    'languages' => ['ar','es','de','it','iw','ja','yi','zh-CN'],
Again, if you need broader language support or have a larger quantity of strings to translate, you may want to switch to Travis Tillotson's Google Translation Client and paid API access.
Then, I added translation strings to /frontend/views/layouts/main.php and /frontend/views/site/index.php in order to demonstrate translating the home page. Since these pages aren't generated by Yii's Gii code generator, the text strings had been left in plain HTML. Here's an example of what they look like now:
<div class="row">
    <div class="col-lg-4">
        <h2><?= Yii::t('frontend','Getting Started') ?></h2>
        <p><?= Yii::t('frontend','Follow along with our tutorial series at Tuts+ as we build Meeting Planner step by step. In this episode we talk about startups in general and the goals for our application.') ?></p>
        <p><a class="btn btn-default" href="http://code.tutsplus.com/tutorials/building-your-startup-with-php-getting-started--cms-21948"><?= Yii::t('frontend','Episode 1') ?> »</a></p>
Here's what the home page looks like in English:
Then, I ran google_extract:
./yii translate/google_extract /Users/Jeff/sites/mp/common/config/i18n.php
Note: Be sure when you do this that the application language is set to your default language, e.g. English. This setting is in /common/config/main.php:
<?php
return [
    'vendorPath' => dirname(dirname(__DIR__)) . '/vendor',
    // available languages
    // 'ar','de','es','it','iw','ja','yi','zh-CN'
    'language' => 'en', // english
    'components' => [
I found that it was necessary to run google_extract once to create the initial message file template and a second time to initiate the calls to Google Translate.
Then I can change the language setting in /common/config/main.php to see each translation. The results are pretty incredible for something that can be generated so quickly.
Here's the home page in Chinese:
Here's the home page in Arabic:
Here's the home page in Japanese:
Here's the home page in Yiddish:
Here's the home page in German:
What's Next?
I hope you enjoyed this tutorial. It was fun to write something that had such a broad impact on the potential reach of my Meeting Planner application. If you'd like to learn more about Meeting Planner, watch for upcoming tutorials in our Building Your Startup With PHP series. There are lots of fun features coming up.
Please feel free to add your questions and comments below; I generally participate in the discussions. You can also reach me on Twitter @reifman or email me directly.