StoreCore

Internationalization and localization

by Ward van der Put

StoreCore is a multi-store ecommerce framework that supports multiple webshops in multiple languages by default. This developer guide describes how internationalization (I18N) and localization (L10N) are implemented throughout the framework.

Limitations of MVC-L

From the outset we decided that multilingual support for European languages SHOULD be a key feature of StoreCore as an open-source ecommerce community operating from Europe. Support for multiple languages would no longer be an option, but a MUST. For companies operating in bilingual and multilingual European countries like Belgium, Luxembourg, and Switzerland, this may of course be a critical feature too.

Early experiments for StoreCore were based on an OpenCart fork called SumoStore (in 2014) and on a fork of OpenCart itself (in 2015). Both OpenCart and SumoStore use an application design pattern known as MVC-L or MVC+L: traditional Model-View-Controller (MVC) with an extra Language (L) dimension. Ultimately, however, we dropped the MVC-L architecture for several reasons.

The basic MVC-L (model-view-controller-language) application structure imposes severe limitations on performance, maintenance, and scalability. For example, a single language adds over 350 files in about 40 directories to an OpenCart install. If the OpenCart MVC-L implementation is expanded to four or more languages, file management becomes a dreadful task.

Performance also suffers when a single MVC view consists not only of one template file, but also of several language files for every supported language.

Furthermore, consistency within one language is difficult to maintain if terms are spread out over dozens of language files. For example, if the store manager wants to change “shopping cart” to “shopping basket”, a developer will have to go over several files. A more centralized approach with an end-user interface for editing seems a much wiser choice.

Translation memory (TM)

StoreCore uses a translation memory (TM) to handle and maintain all language strings. The translation memory database table is defined in the main SQL DDL (data definition language) file core-mysql.sql for MySQL and MariaDB. The translations are included in a separate SQL DML (data manipulation language) data file called i18n-dml.sql.

The StoreCore translation memory (TM) has the following database table structure for MySQL and MariaDB:

CREATE TABLE IF NOT EXISTS `sc_translation_memory` (
  `translation_id`   VARCHAR(63)  CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL,
  `language_id`      VARCHAR(13)  CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL  DEFAULT 'en-GB',
  `admin_only_flag`  BIT(1)       NOT NULL  DEFAULT b'0',
  `date_modified`    TIMESTAMP    NOT NULL  DEFAULT CURRENT_TIMESTAMP  ON UPDATE CURRENT_TIMESTAMP,
  `translation`      TEXT         NULL,
  PRIMARY KEY (`translation_id`, `language_id`),
  FOREIGN KEY `fk_translation_memory_languages` (`language_id`) REFERENCES `sc_languages` (`language_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4  COLLATE=utf8mb4_unicode_ci;

The foreign key (FK) language_id in the sc_translation_memory database table points to the primary key (PK) language_id in the sc_languages table for all supported languages:

CREATE TABLE IF NOT EXISTS `sc_languages` (
  `language_id`   VARCHAR(13)       CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL,
  `parent_id`     VARCHAR(13)       CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL  DEFAULT 'en-GB',
  `enabled_flag`  BIT(1)            NOT NULL  DEFAULT b'0',
  `sort_order`    TINYINT UNSIGNED  NOT NULL  DEFAULT 0,
  `english_name`  VARCHAR(40)       NOT NULL,
  `local_name`    VARCHAR(40)       NULL  DEFAULT NULL,
  PRIMARY KEY (`language_id`),
  FOREIGN KEY `fk_languages_languages` (`parent_id`) REFERENCES `sc_languages` (`language_id`) ON DELETE RESTRICT ON UPDATE CASCADE,
  UNIQUE KEY `uk_english_name` (`english_name`),
  INDEX `ix_languages` (`enabled_flag` DESC, `sort_order` ASC, `local_name` ASC, `english_name` ASC)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4  COLLATE=utf8mb4_unicode_ci;

Database table definitions are included in the core-mysql.sql SQL file.

Root: core or master languages

Core languages, or masters, are root-level language packs. They SHOULD NOT be deleted. If the parent_id of a language is equal to en-GB, that language is a core language located at the root of the language family tree.

Currently, the StoreCore core supports four European master languages. These are defined in the SUPPORTED_LANGUAGES class constant of the LanguagePacks class:

  • de-DE for German
  • en-GB for English
  • fr-FR for French
  • nl-NL for Dutch.

If no language match is found, StoreCore always defaults to en-GB for British English. You could therefore say that British English — or European English — is the master of all masters in the StoreCore translation memory.

Master languages SHOULD NOT be deleted, but they MAY be disabled. The example below illustrates that you should not DELETE a master language (in this case de-DE for German), but you can disable the language by setting its enabled_flag to 0 (zero) or b'0' (binary zero).

Not recommended:

DELETE
  FROM sc_languages
 WHERE language_id = 'de-DE';

Recommended:

UPDATE sc_languages
   SET enabled_flag = b'0'
 WHERE language_id = 'de-DE';

Tree and branches: secondary languages

Secondary languages are derived from the core/master languages. They only contain the differences (deltas) from the master language. For example, the “English – United States” or en-US language pack only contains the differences between American English and British English in its “English – United Kingdom” or en-GB master. This allows for global localization while maintaining language consistency and a concise dictionary.
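The delta approach can be sketched in PHP as an overlay of a secondary pack on its master. This is a minimal illustration, not StoreCore's actual implementation: the function name resolveLanguagePack(), the identifier NOUN_COLOR, and the sample translations are assumptions made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: overlay a secondary language pack on its master.
 * Keys present in the delta win; all other keys fall through to the master.
 */
function resolveLanguagePack(array $master, array $delta): array
{
    // array_replace() keeps every master key and overwrites
    // only those keys that also exist in $delta.
    return array_replace($master, $delta);
}

// Sample data for illustration only; not taken from i18n-dml.sql.
$enGB = [
    'NOUN_COLOR'         => 'colour',
    'NOUN_SHOPPING_CART' => 'shopping basket',
    'VERB_PRINT'         => 'print',
];
$enUS = [
    'NOUN_COLOR'         => 'color',
    'NOUN_SHOPPING_CART' => 'shopping cart',
];

$pack = resolveLanguagePack($enGB, $enUS);
echo $pack['NOUN_COLOR'], PHP_EOL;  // overridden by the en-US delta
echo $pack['VERB_PRINT'], PHP_EOL;  // inherited from the en-GB master
```

Because the secondary pack stores only its deltas, a term that is changed in the master automatically reaches every derived language that does not override it.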

Content language negotiation

StoreCore uses the HTTP Accept-Language header to determine which content language is preferred by visitors, customers, users, and client applications. The current language can be found by supplying an array of supported languages to the Language::negotiate() method.

Class synopsis
Language {
    public string negotiate ( array $supported [, string $default = 'en-GB'] )
}

The $supported parameter must be an associative array with language codes as keys. A language is only considered supported if its value evaluates to true. For example, if an application supports both English and French, the supported languages may be defined as:

$supported = array(
    'en-GB' => true,
    'fr-FR' => true,
);

This data structure allows you to temporarily disable a supported language, without fully dropping it:

$supported = array(
    'en-GB' => true,
    'fr-FR' => false,
);
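How such a negotiation could work is sketched below. This is not the source of Language::negotiate(): the function name negotiateLanguage() is hypothetical, the Accept-Language header is passed in explicitly as a string, and the matching rule (exact tag first, then primary language subtag) is an assumption. The sketch relies on the stable sorting of PHP 8.0 or later to keep the header order for equal quality values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of content language negotiation.
 *
 * @param string $header    Value of the HTTP Accept-Language header.
 * @param array  $supported Language codes mapped to true (enabled) or false.
 * @param string $default   Language returned if nothing matches.
 */
function negotiateLanguage(string $header, array $supported, string $default = 'en-GB'): string
{
    // Keep only the languages whose flag evaluates to true.
    $enabled = array_keys(array_filter($supported));

    // Parse "fr-CH, fr;q=0.9, en;q=0.8" into [tag => quality].
    $preferences = [];
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // empty entries and wildcards fall through to the default
        }
        $quality = 1.0; // a missing q parameter means q=1
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $match)) {
                $quality = (float) $match[1];
            }
        }
        $preferences[$tag] = $quality;
    }
    arsort($preferences); // highest quality first

    foreach ($preferences as $tag => $quality) {
        if ($quality <= 0.0) {
            continue; // q=0 means "not acceptable"
        }
        $tag = (string) $tag; // numeric-looking array keys are cast to int by PHP
        $primary = explode('-', $tag)[0];
        $fallback = null;
        foreach ($enabled as $language) {
            $candidate = strtolower((string) $language);
            if ($candidate === $tag) {
                return (string) $language; // exact match, for example fr-FR
            }
            if ($fallback === null && explode('-', $candidate)[0] === $primary) {
                $fallback = (string) $language; // same primary language, for example fr
            }
        }
        if ($fallback !== null) {
            return $fallback;
        }
    }
    return $default;
}

$supported = ['en-GB' => true, 'fr-FR' => true];
echo negotiateLanguage('fr-CH, fr;q=0.9, en;q=0.8', $supported), PHP_EOL;
```

With fr-FR set to false in $supported, the same header would resolve to en-GB, which is how a language can be disabled temporarily without removing it from the array.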

Translation guidelines

Language components

The translation memory contains seven types of components, divided into two groups. These types are namespaced with an uppercase prefix.

The first group contains basic language constructs in plain text, without any formatting:

  • ADJECTIVE for adjectives
  • NOUN for nouns and names
  • VERB for verbs.

The second group is used in user interfaces and MAY contain formatting, usually HTML5 or AMP HTML:

  • COMMAND for menu commands and command buttons
  • ERROR for error messages
  • HEADING for headings and form labels
  • TEXT for anything else.
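The two groups can be told apart by the prefix of a translation ID. The sketch below is only an illustration: the function componentGroup(), the constant COMPONENT_GROUPS, and the group labels 'plain' and 'ui' are hypothetical names that do not come from StoreCore.

```php
<?php
declare(strict_types=1);

// Prefix => group, following the two lists above.
// The labels 'plain' and 'ui' are made up for this sketch.
const COMPONENT_GROUPS = [
    'ADJECTIVE' => 'plain',
    'NOUN'      => 'plain',
    'VERB'      => 'plain',
    'COMMAND'   => 'ui',
    'ERROR'     => 'ui',
    'HEADING'   => 'ui',
    'TEXT'      => 'ui',
];

/**
 * Returns the group of a translation ID, or null for an unknown prefix.
 */
function componentGroup(string $translationId): ?string
{
    // Everything before the first underscore is the component type.
    $prefix = explode('_', $translationId, 2)[0];
    return COMPONENT_GROUPS[$prefix] ?? null;
}

echo componentGroup('NOUN_SHOPPING_CART'), PHP_EOL; // plain
echo componentGroup('COMMAND_PRINT'), PHP_EOL;      // ui
```

A template layer could use such a check to escape plain-text components with htmlspecialchars() while leaving the formatted components untouched.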

Compound nouns

Compound nouns are handled as single nouns. For example, shopping cart is not stored as two terms like NOUN_SHOPPING plus NOUN_CART, but as a single segment like NOUN_SHOPPING_CART.

Names as nouns

Names are treated as nouns. Therefore they contain the default NOUN prefix, for example NOUN_PAYPAL for PayPal and NOUN_GOOGLE_ANALYTICS for Google Analytics.
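The naming convention for nouns, compound nouns, and names can be expressed as a small helper. This is a hedged sketch: the function nounId() is hypothetical and handles ASCII terms only, so translators may still need to assign identifiers by hand.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive a NOUN_ translation ID from an English term.
 * Runs of non-alphanumeric ASCII characters become a single underscore.
 */
function nounId(string $term): string
{
    $slug = (string) preg_replace('/[^A-Za-z0-9]+/', '_', trim($term));
    return 'NOUN_' . strtoupper(trim($slug, '_'));
}

echo nounId('shopping cart'), PHP_EOL;    // NOUN_SHOPPING_CART
echo nounId('PayPal'), PHP_EOL;           // NOUN_PAYPAL
echo nounId('Google Analytics'), PHP_EOL; // NOUN_GOOGLE_ANALYTICS
```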

Verbs to commands

Commands, menu commands, and command buttons usually indicate an activity. Therefore commands SHOULD be derived from verbs. The translation memory SQL file contains an example of this business logic. The general verb print in lowercase becomes the command Print… with an uppercase first letter and three dots in user interfaces:

INSERT IGNORE INTO `sc_translation_memory`
    (`translation_id`, `language_id`, `translation`)
  VALUES
    ('VERB_PRINT', 'ca-039', 'imprimir'),
    ('VERB_PRINT', 'de-DE', 'drucken'),
    ('VERB_PRINT', 'en-GB', 'print'),
    ('VERB_PRINT', 'es-ES', 'imprimir'),
    ('VERB_PRINT', 'fr-FR', 'imprimer'),
    ('VERB_PRINT', 'it-IT', 'stampare'),
    ('VERB_PRINT', 'nl-NL', 'printen'),
    ('VERB_PRINT', 'pt-PT', 'imprimir');

INSERT IGNORE INTO `sc_translation_memory`
    (`translation_id`, `language_id`, `translation`)
  VALUES
    ('COMMAND_PRINT', 'ca-039', 'Imprimeix…'),
    ('COMMAND_PRINT', 'de-DE', 'Drucken…'),
    ('COMMAND_PRINT', 'en-GB', 'Print…'),
    ('COMMAND_PRINT', 'es-ES', 'Imprimir…'),
    ('COMMAND_PRINT', 'fr-FR', 'Imprimer…'),
    ('COMMAND_PRINT', 'it-IT', 'Stampa…'),
    ('COMMAND_PRINT', 'nl-NL', 'Printen…'),
    ('COMMAND_PRINT', 'pt-PT', 'Imprimir…');

In some cases verbs are included in the translation memory for reference purposes and consistency. For example, the English verb to print has three common translations in Dutch: “printen”, “afdrukken”, and “drukken”. The definition printen of VERB_PRINT in nl-NL thus indicates the preferred translation for Dutch (nl-NL) in the Netherlands (NL).

Errors and exceptions

Error messages and exception message strings are currently not translated when they are intended primarily for developers and server administrators.