Multi-language Solutions for ExpressionEngine
by Christofer Sandin
At Republic Factory we’ve been working with ExpressionEngine for quite a few years now. We have more or less been lurking on Twitter and searching the forums now and then. Having been to EECI in Leiden the last two years and, as I write this, looking forward to going to NY in a few weeks, we thought this would be a good time to share a few tips with the community. Over the last few years we’ve gained a lot of experience building multi lingual sites with ExpressionEngine, which we hope can lend a hand to people starting out.
Being situated in Sweden we don’t have English as our first language, so most things we develop will be in Swedish and/or English. A lot of our projects also get translated to Norwegian, Danish, or Finnish. Recently, when working with a couple of bigger clients, we’ve also worked with languages like Russian, Spanish, Italian and Turkish. In this article we’ll give a few tips based on our experience of building multi lingual websites in ExpressionEngine, and introduce a couple of things to think about when starting out.
Please note that we don’t claim that the concepts in this article were invented by us, or are unheard of, but it’s pretty rare to see an article that describes the whole process, so hopefully it serves a purpose. (Make sure to take a look at the links at the bottom too.)
Yesterday, also known as ExpressionEngine 1.x.
We started looking for ways of building multi lingual sites in ExpressionEngine back with EE 1.5. One of the first things we found was Mark Huot’s extension Simple Translator. That worked really well for the first projects, but after a while we had to start looking for other options. One of the main reasons was that we needed unique URLs for each language, without having to set a cookie to change language.
At the same time we had started to learn a thing or two about EE, and started making our own add-ons. We also found Structure early on, and spent some time hacking the old 1.x version just to add multi language support.
Fast forward to today, and we have found a way that works very well for us at the moment which I’m going to tell you a bit about here.
To keep this article from being even longer than it already is, I’m going to assume that you already have the basics of ExpressionEngine down, and that you know how to install it, set up channels and channel fields, create templates, etc. But even before you press the install button you need to get something right, which will save you from a lot of future hassle.
Getting it right from the start, using UTF-8 character encoding.
If you have only worked with English-speaking clients and sites, you might very well be totally unaware of what a nightmare it can be working with the wrong character encoding. ExpressionEngine 2 uses UTF-8 by default, but there are some potential problems:
- You think you’re working with UTF-8 but the MySQL database saves your texts as Latin1 anyway (which can happen if you’re on a shared host, or if the database connection string is misconfigured, since MySQL defaults to Latin1)
- You have ISO-8859-1 encoded templates and are outputting UTF-8 formatted data from the database
- You’re trying to save UTF-8 formatted text in a Latin1 database.
All of the scenarios above are easy to avoid if you get the character encoding right from the start. Therefore:
- Make sure the database is in UTF-8 format when you create it
- Make sure the MySQL connection uses UTF-8
- Make sure you use UTF-8 encoded template files if you save your templates as files (and you should, at least during development).
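As a rough illustration of the checklist above, here is a sketch of the two database settings we would look at first, plus a small helper for checking that a template really is UTF-8. It assumes the stock EE 2 config file at system/expressionengine/config/database.php; the helper name is_valid_utf8() is made up for this example.

```php
<?php
// Assumed location: system/expressionengine/config/database.php (stock EE 2).
// Character set and collation EE uses when talking to MySQL.
$db['expressionengine']['char_set'] = 'utf8';
$db['expressionengine']['dbcollat'] = 'utf8_general_ci';

// Hypothetical helper: returns true if the text is valid UTF-8.
// preg_match() with the /u modifier fails on invalid UTF-8 input.
function is_valid_utf8($text)
{
    return preg_match('//u', $text) === 1;
}
```

Running the contents of a template file through it, for example is_valid_utf8(file_get_contents($path)), is a quick way to spot files that were saved as ISO-8859-1 by mistake.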
What kind of multi lingual site is it?
Got the encoding part right? Good.
With that out of the way, we can start working on the actual site. There are many ways you can structure a multi lingual site, so decide your approach. If you’re building the site for a client, make sure they are onboard with your choice and understand the differences between having one site in multiple languages and multiple sites in different languages.
The most common scenario for us, and the one we will be looking at here, is building one site available in more than one language. There are almost always a few things that need to differ from language to language, so we’ll see how you can deal with this in a little while.
For this article, let’s build an English site that will have a Swedish and a German version.
Make ExpressionEngine aware of the current language
Before we start translating we need to make ExpressionEngine aware of the different languages we will use. We usually add a few lines of PHP to the index.php file, where it says Custom Config Values, that set a global language variable and the site_url.
If you point several language specific domains (www.company.com, www.company.se and www.company.de) to the same EE installation, you can do something like this:
// Pick the language from the domain the visitor used.
if ($_SERVER['SERVER_NAME'] == "www.company.se")
{
    $assign_to_config['global_vars']['language'] = "se";
}
else if ($_SERVER['SERVER_NAME'] == "www.company.de")
{
    $assign_to_config['global_vars']['language'] = "de";
}
else
{
    $assign_to_config['global_vars']['language'] = "en";
}
// site_url should be a full URL, so the scheme is added (plain http assumed here).
$assign_to_config['site_url'] = "http://" . getenv('HTTP_HOST') . "/";
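With six or seven domains the if/else chain gets long, so a lookup table can be easier to maintain. The following is only a sketch of that idea; language_for_host() is a hypothetical helper, and the host names are the example ones used above.

```php
<?php
// Hypothetical helper: map a host name to a language code.
// Unknown hosts fall back to the default language (English here).
function language_for_host($host, $default = 'en')
{
    $languages = array(
        'www.company.se' => 'se',
        'www.company.de' => 'de',
    );
    $host = strtolower($host);
    return isset($languages[$host]) ? $languages[$host] : $default;
}

// In index.php this could replace the if/else chain:
// $assign_to_config['global_vars']['language'] = language_for_host($_SERVER['SERVER_NAME']);
```

Adding a language is then a matter of adding one line to the array.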
Or, if you’d like to use the same domain but use subfolders to indicate the chosen language (e.g. www.company.com, www.company.com/se/ and www.company.com/de/), you can create the subfolders on the web server, place a copy of index.php and your .htaccess file in each, and then add this to the index.php files:
index.php:
// site_url should be a full URL; plain http is assumed in these examples.
$assign_to_config['global_vars']['language'] = "en";
$assign_to_config['site_url'] = "http://" . getenv('HTTP_HOST') . "/";
/se/index.php:
$assign_to_config['global_vars']['language'] = "se";
$assign_to_config['site_url'] = "http://" . getenv('HTTP_HOST') . "/se/";
/de/index.php:
$assign_to_config['global_vars']['language'] = "de";
$assign_to_config['site_url'] = "http://" . getenv('HTTP_HOST') . "/de/";
Be sure to add a second dot to the system path on line 26 in the files located in the subfolders as well, so it points to the system directory:
$system_path = '../system';
This way we have the correct site_url for each version and a language variable to help us out later on. Also, if you use a .htaccess file to remove index.php from the URL, make sure it points to the right subfolder instead of the main index.php.
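If you would rather not hand-edit every copy of index.php, both values could in principle be derived from the folder the file lives in. This is a sketch under a few assumptions: the subfolders are named after the language codes, the default language lives at the web root, the site is served over plain http, and both helper names are invented for this example.

```php
<?php
// Hypothetical helper: derive the language from the folder an index.php lives in.
// $supported lists the subfolder names; any other folder means the default language.
function language_from_dir($dir, array $supported, $default = 'en')
{
    $folder = basename($dir);
    return in_array($folder, $supported, true) ? $folder : $default;
}

// Hypothetical helper: build the site_url for a language, assuming plain http
// and that the default language is served from the web root.
function site_url_for($host, $language, $default = 'en')
{
    $url = 'http://' . $host . '/';
    return $language === $default ? $url : $url . $language . '/';
}
```

Each index.php could then call language_from_dir(dirname(__FILE__), array('se', 'de')) and pass the result on to site_url_for(), so the three files stay identical apart from the system path.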
What do we need to translate?
When you sit down and start planning a site you quickly realize that there are several things that you need to translate. This becomes even more important if the client is to take over the content editing process later on, or if they work with professional translators.
These are the things that you need to figure out how to translate:
- The Control Panel (which usually is easy thanks to the community).
- The main content like articles, products, and pages. (Basically, all things that live in channels.)
- All the other small things (short phrases like copyright notices, next/previous links, wording in pagination, and navigation).
- Other assets like downloadable documents, videos and images.
Ok, so how do we do this?
As with ExpressionEngine in general, there are many ways to solve a problem. We want to use the native functionality where it excels, which also gives us a robust and future proof system based on the core functionality (this doesn’t mean we don’t like add-ons, just that we like to keep the dependencies down for future updates). At the same time, doing the same thing over and over again is boring, so we want some help where it’s appropriate.
I’ll introduce a few of our own helper add-ons shortly, but all these things can be done manually as well. One add-on that I will mention is Travis Schmeisser and Jack McDade’s Structure, which we really like and use without hesitation when needed.
Step 1: The Control Panel.
Before you set out to translate the entire Control Panel, take a look at the ExpressionEngine site and see if someone else in the community already has done the work for you. If someone has, translating the CP is as easy as downloading the language files. If not, this might be a great opportunity for you to be the hero of everyone who speaks your native language and create the language files you need and then share them.
Step 2: The main content.
Let’s say we use a channel to store something we call “pages”. In this channel we need a heading and a text field for each page, so we create the following channel fields:
- en-page-heading
- en-page-text
- se-page-heading
- se-page-text
- de-page-heading
- de-page-text
This way we can have a separate heading and text for each language, and by using Publish Layouts we can create three new tabs called English, Swedish and German to make the editing experience tied to the languages and easy to use.
In the template files we can then just use {{language}-page-heading} and {{language}-page-text} to output the right content to the user.
And if we have things that differ between the language versions, we can also use the {language} variable to add things to a specific version:
{if language == "se"}
Show pics of ice bears and the bikini team...
{/if}
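To make the naming convention concrete, here is a small PHP sketch of the lookup that the template variables perform. It is only an illustration, not something EE runs: both helper names are hypothetical, and the fallback to English for empty fields is an assumption of the sketch that you would otherwise have to build yourself with conditionals.

```php
<?php
// Hypothetical helper: build the language-specific field name.
function localized_field($language, $base)
{
    return $language . '-' . $base;
}

// Hypothetical helper: pick the value for the current language from an entry's
// fields, falling back to another language when the field is missing or empty.
// The fallback is an assumption of this sketch, not built-in EE behaviour.
function localized_value(array $entry, $language, $base, $fallback = 'en')
{
    $name = localized_field($language, $base);
    if (isset($entry[$name]) && $entry[$name] !== '') {
        return $entry[$name];
    }
    $name = localized_field($fallback, $base);
    return isset($entry[$name]) ? $entry[$name] : '';
}
```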
In our experience there are two drawbacks to this approach:
The first one is that if you use a lot of custom fields combined with a lot of languages there will be a lot of data in the exp_channel_data database table. Using Pixel and Tonic’s Matrix is a good option in this case, since it saves the data in a separate table in the database.
The second one is that it’s a bit of a pain to set all the custom fields up if you’re working with a lot of fields; this is especially true if you’re using complex field types which need to be set up with a lot of details.
To reduce the amount of clicking, we developed a new field type, which clones the original and creates the other language specific fields following a chosen naming convention. This way you just need to create the custom fields for one language, and then the add-on helps out by creating the others. This helps a lot when you’re dealing with six or seven languages.
We use a custom add-on internally, but I searched Devot:ee looking for an alternative and found Max Lazar’s MX Tool Box, which should have similar functionality. It has a high rating and some great comments, so it might be well worth a try if you get sick of creating the fields manually.
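The naming logic behind such a cloning tool is simple. The sketch below is not the code of our add-on or of MX Tool Box, and it does not create anything in EE; it only lists the field names you would end up with, using a hypothetical helper called language_field_names().

```php
<?php
// Hypothetical helper: list the field names to create for every language,
// following the "<language>-<field>" convention used in this article.
function language_field_names(array $base_fields, array $languages)
{
    $names = array();
    foreach ($base_fields as $field) {
        foreach ($languages as $language) {
            $names[] = $language . '-' . $field;
        }
    }
    return $names;
}
```

Called with the fields page-heading and page-text and the languages en, se and de, it returns the six field names from the list in Step 2.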
Finally, I’d like to point out that you don’t have to use the above approach for all channels. Let’s say you have company news that will always be specific to one language: just create different channels for the different languages and use the language variable in the templates to select the right one.
Step 3: The small things.
In essence, we use the built-in global variables for this kind of thing. If you create language specific versions of the variables, you can use a similar approach to the one above.
- en-global-left = “Left”
- se-global-left = “Vänster”
- de-global-left = “Links”
- en-global-right = “Right”
- se-global-right = “Höger”
- de-global-right = “Rechts”
Then, in the template files you just use {{language}-global-left} and the right phrase for the current language appears like magic. A pro tip is to come up with a naming convention for your global variables and then stick to it. That makes it a lot easier when you’re looking at your template code.
This works well, but the interface for managing global variables in EE 2.2 and greater is a little clunky for most clients; it’s hard to get an overview of all the different language versions of the same variable. We’ve solved this by creating an add-on called Republic Variables, which we use to organize all the variables.
The visual overview also makes it a breeze for editors to update them and to see what’s missing in the translation process (the add-on is only for in-house use at the moment, but we might release it to the community if someone is interested). If building a custom add-on doesn’t seem like something you’d like to do, there are some other great options out there:
- Low Variables is a great choice, especially if you also need some more functionality like selecting parse order or want to use different types of variables (like checkboxes and dropdowns). We’ve used this on one or two sites, and it works really well.
- The Multi Language Module also seems to work in a similar way to our add-on, but we haven’t used this one so I can’t say for sure.
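Whichever tool you pick, the "what's missing" check is easy to reason about. The sketch below is not how Republic Variables or Low Variables work internally; it is just the idea, written as a hypothetical helper that takes the variables as a plain array of name and value pairs.

```php
<?php
// Hypothetical helper: report which "<language>-<name>" variables are
// missing or empty. $variables maps variable names to their values.
function missing_translations(array $variables, array $names, array $languages)
{
    $missing = array();
    foreach ($names as $name) {
        foreach ($languages as $language) {
            $key = $language . '-' . $name;
            if (!isset($variables[$key]) || trim($variables[$key]) === '') {
                $missing[] = $key;
            }
        }
    }
    return $missing;
}
```

Given a consistent naming convention, a list like this tells the editors exactly which phrases still need a translation.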
Navigation with Structure
Since Structure 3.0+ added support for a custom title, there is no need to hack it anymore to be able to use it on a multi language site. Use:
{exp:structure:nav channel:title="channel:{language}-page-title"}
and create the three fields en-page-title, se-page-title and de-page-title.
The next thing is to persuade Jack and Travis to add multi language structure_uri so you can have language specific URLs, but that’s another story. (Or, maybe something which can be done by extending Structure.)
Step 4: Other assets
Organizing files, movies and other assets is pretty easy. Decide on separate upload locations, use a Matrix and name it accordingly (like en-page-matrix) or something else that suits the current project.
As you hopefully have seen by now, the {language} variable is your friend in most scenarios.
Working with professional translators
You also need to think about how to get the translated text into ExpressionEngine. If you work with professional translators you need to decide how you want the translated texts delivered.
Having the translators working directly in the Control Panel is one option, but from our experience most translators prefer to use familiar software. At Republic we’ve found Microsoft Excel to be a good option. We can use the CSV format when exporting, the translators can use Excel to view and work with the content, and then we can export from the Excel format to CSV again for an easy import.
In the above example we’d probably build the English version and make sure everything works the way it’s supposed to. Then we’d export the entry_id, field name and content to a CSV document and send that file to the Swedish and German translators, telling them to keep the HTML formatting intact. When they’re done, we do it the other way around: import the new data to the specific entry_ids and the corresponding custom fields.
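The CSV round trip itself can be sketched with PHP's built-in fputcsv() and fgetcsv(). The two helper names and the column headers are made up for this example, and the actual database export and import are left out since they depend on the field IDs in your installation.

```php
<?php
// Hypothetical helper: write rows of (entry_id, field name, content) as CSV.
function rows_to_csv(array $rows)
{
    $handle = fopen('php://temp', 'r+');
    fputcsv($handle, array('entry_id', 'field_name', 'content'), ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($handle, $row, ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);
    return $csv;
}

// Hypothetical helper: read the translated CSV back into rows, skipping the header.
function csv_to_rows($csv)
{
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $csv);
    rewind($handle);
    fgetcsv($handle, 0, ',', '"', '\\'); // discard the header row
    $rows = array();
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}
```

Because the content is quoted, commas, quotes and line breaks inside the HTML survive the trip, as long as the translators save the file as CSV again in the same encoding.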
If you’re uncomfortable writing SQL queries, Andrew Weaver’s DataGrab can be helpful for importing data. We used it on one or two projects where we needed to get data from WordPress to ExpressionEngine, and it’s very easy to use.
Final thoughts
This is just one way of working with different languages. As I mentioned in the beginning, we have tried a few others over the years, but this is the most versatile approach so far.
We have some in-house add-ons that make it a little easier for us, but everything in the article can be done with ExpressionEngine right out of the box. If you have any questions feel free to send me an email, find @republicfactory on Twitter, or come up and say hello at EECI in New York.
Good luck with your next multi lingual project. And, if you have any tips that you think the rest of us will benefit from, please leave a comment!
Links:
There are docs describing the approach used in Step 1 above on the Wiki.
John Henry Donovan has an excellent presentation on the subject at Slideshare.
EEHarbor has an add-on called Transcribe, which is being ported to EE2 as I write this.
I also found an article by Carl Crawley where he describes an alternative method while doing some research for this article. It’s a good read, and it’s always useful to have other perspectives.
Finally, make sure to do a search for “language” on Devot:ee and have a look.