StoreCore

Internationalization and localization

by Ward van der Put

StoreCore is a multi-store e-commerce framework that supports multiple webshops in multiple languages by default. This developer guide describes how internationalization (I18N) and localization (L10N) are implemented throughout the framework.

This documentation is a work in progress. It describes prerelease software, and is subject to change. All StoreCore code is released as free and open-source software (FOSS) under the GNU General Public License.

Limitations of MVC-L

From the outset we decided that multilingual support for European languages SHOULD be a key feature of StoreCore, as an open-source e-commerce community operating from Europe. Support for multiple languages would no longer be an option, but a MUST. For companies operating in bilingual and multilingual European countries like Belgium, Luxembourg, and Switzerland, this may of course be a critical feature too.

Early experiments for StoreCore were based on an OpenCart fork called SumoStore (in 2014) and on a fork of OpenCart itself (in 2015). Both OpenCart and SumoStore use an application design pattern known as MVC-L or MVC+L: traditional Model-View-Controller (MVC) with an extra Language (L) dimension. Ultimately, however, we dropped this MVC-L architecture for several reasons.

The basic MVC-L (model-view-controller-language) application structure adds severe limitations to performance, maintenance, and scalability. For example, a single language adds over 350 files in about 40 directories to an OpenCart install. If the OpenCart MVC-L implementation is expanded to four or even more languages, file management becomes a dreadful task.

There are performance side effects if a single MVC view consists not only of one template file, but also of several language files for all supported languages.

Furthermore, consistency within one language is difficult to maintain if terms are spread out over dozens of language files. For example, if the store manager wants to change “shopping cart” to “shopping basket”, a developer will have to go over several files. A more centralized approach with an end-user interface for editing seems a much wiser choice.

Translation memory (TM)

StoreCore uses a translation memory (TM) to handle and maintain all language strings. The translation memory database table is defined in the main SQL DDL (data definition language) file core-mysql.sql for MySQL. The translations are included in a separate SQL DML (data manipulation language) data file called i18n-dml.sql.

The StoreCore translation memory (TM) has the following database table structure:

CREATE TABLE IF NOT EXISTS sc_translation_memory (
  translation_id   VARCHAR(128)          CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL,
  language_id      CHAR(5)               CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL  DEFAULT 'en-GB',
  admin_only_flag  TINYINT(1) UNSIGNED   NOT NULL  DEFAULT 0,
  date_modified    TIMESTAMP             NOT NULL  DEFAULT CURRENT_TIMESTAMP  ON UPDATE CURRENT_TIMESTAMP,
  translation      TEXT                  NULL,
  PRIMARY KEY pk_translation_memory_id (translation_id, language_id),
  FOREIGN KEY fk_translation_memory_languages (language_id) REFERENCES sc_languages (language_id) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB  DEFAULT CHARSET=utf8  COLLATE=utf8_unicode_ci;

The foreign key (FK) language_id in the sc_translation_memory table points to the primary key (PK) language_id in the sc_languages table:

CREATE TABLE IF NOT EXISTS sc_languages (
  language_id   CHAR(5)              CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL,
  parent_id     CHAR(5)              CHARACTER SET ascii  COLLATE ascii_bin  NOT NULL  DEFAULT 'en-GB',
  enabled_flag  TINYINT(1) UNSIGNED  NOT NULL  DEFAULT 0,
  sort_order    TINYINT(3) UNSIGNED  NOT NULL  DEFAULT 0,
  english_name  VARCHAR(32)          NOT NULL,
  local_name    VARCHAR(32)          CHARACTER SET utf8  COLLATE utf8_unicode_ci  NULL  DEFAULT NULL,
  PRIMARY KEY pk_language_id (language_id),
  FOREIGN KEY fk_languages_languages (parent_id) REFERENCES sc_languages (language_id) ON DELETE RESTRICT ON UPDATE CASCADE,
  UNIQUE KEY uk_english_name (english_name)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8  COLLATE=utf8_general_ci;

Root: core or master languages

Core languages, or masters, are root-level language packs. They SHOULD NOT be deleted; deletion is prevented by the foreign key constraint fk_languages_languages on the self-referencing key parent_id. If the language_id is equal to the parent_id, a language has no parent and is therefore a core language located at the root of the language family tree.

Currently, the StoreCore core supports four European master languages. These are defined in the SUPPORTED_LANGUAGES constant of the Locale class:

  • de-DE for German
  • en-GB for English
  • fr-FR for French
  • nl-NL for Dutch.

If no language match is found, StoreCore always defaults to en-GB for British English. You could therefore say that British English — or European English — is the master of all masters in the StoreCore translation memory.

Master languages cannot be deleted; they can only be disabled. If you do try to delete a master language from the database, the DELETE query fails on a foreign key constraint. The example below illustrates that you cannot DELETE a master language, but you can disable it by setting its enabled_flag to 0 (zero).

Incorrect:
DELETE FROM sc_languages
      WHERE language_id = 'de-DE';
Correct:
UPDATE sc_languages
   SET enabled_flag = 0
 WHERE language_id = 'de-DE';

Tree and branches: secondary languages

Secondary languages are derived from the core/master languages. They only contain the differences (deltas) from the master language. For example, the “English – United States” or en-US language pack only contains the differences between American English and British English in its “English – United Kingdom” or en-GB master. This allows for global localization while maintaining language consistency and a concise dictionary.
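The delta approach implies a lookup that walks from a secondary language up to its master. The following is a minimal sketch of that idea, not the StoreCore implementation. It assumes in-memory arrays in place of the sc_languages and sc_translation_memory tables; the function name translate(), the sample strings, and the final fallback to en-GB are illustrative assumptions.

```php
<?php
// Hypothetical stand-in for sc_languages: language_id => parent_id.
// A core (master) language points to itself.
$parents = array(
    'en-GB' => 'en-GB',
    'en-US' => 'en-GB',
    'nl-NL' => 'nl-NL',
);

// Hypothetical stand-in for sc_translation_memory with sample strings.
// The en-US pack only stores what differs from its en-GB master.
$memory = array(
    'NOUN_SHOPPING_CART' => array(
        'en-GB' => 'shopping basket',
        'en-US' => 'shopping cart',
        'nl-NL' => 'winkelwagen',
    ),
    'VERB_PRINT' => array(
        'en-GB' => 'print',
        'nl-NL' => 'printen',
    ),
);

// Looks up a translation, climbing the language tree until a match is found.
function translate($id, $language, array $parents, array $memory, $default = 'en-GB')
{
    $seen = array(); // guards against cycles in the parent chain
    while (!isset($seen[$language])) {
        $seen[$language] = true;
        if (isset($memory[$id][$language])) {
            return $memory[$id][$language];
        }
        // Stop at an unknown language or at a root (parent equals itself)
        if (!isset($parents[$language]) || $parents[$language] === $language) {
            break;
        }
        $language = $parents[$language];
    }
    // Assumed last resort: the default master language, else null
    return isset($memory[$id][$default]) ? $memory[$id][$default] : null;
}
```

With this sample data, a request for VERB_PRINT in en-US finds no delta and resolves to the en-GB string, while NOUN_SHOPPING_CART in en-US returns the American variant directly.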

Content language negotiation

StoreCore uses the HTTP Accept-Language header to determine which content language is preferred by visitors, customers, users, and client applications. The current language can be found by supplying an array of supported languages to the Language::negotiate() method.

Class synopsis
Language {
    public string negotiate ( array $supported [, string $default = 'en-GB'] )
}

The $supported parameter must be an associative array of ISO language codes that evaluate to true. For example, if an application supports both English and French, the supported languages may be defined as:

$supported = array(
    'en-GB' => true,
    'fr-FR' => true,
);

This data structure allows you to temporarily disable a supported language, without fully dropping it:

$supported = array(
    'en-GB' => true,
    'fr-FR' => false,
);
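To illustrate how an Accept-Language header could be matched against such an array, here is a minimal sketch. It is not the source of Language::negotiate(): the function name negotiateLanguage(), the explicit $header parameter, and the rule that a primary subtag such as fr may match fr-FR are all assumptions made for this example.

```php
<?php
// Illustrative sketch only; not the StoreCore Language::negotiate() source.
function negotiateLanguage($header, array $supported, $default = 'en-GB')
{
    // Keep only the languages whose value evaluates to true
    $enabled = array_keys(array_filter($supported));

    // Parse e.g. "fr-CH, fr;q=0.9, en;q=0.8" into tag/quality pairs
    $preferences = array();
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $quality = 1.0; // default weight when no q parameter is given
        foreach (array_slice($pieces, 1) as $parameter) {
            $parameter = trim($parameter);
            if (stripos($parameter, 'q=') === 0) {
                $quality = (float) substr($parameter, 2);
            }
        }
        if ($quality > 0) {
            $preferences[] = array('tag' => $tag, 'q' => $quality, 'pos' => $position);
        }
    }

    // Highest quality first; header order breaks ties
    usort($preferences, function ($a, $b) {
        if ($a['q'] == $b['q']) {
            return $a['pos'] - $b['pos'];
        }
        return ($a['q'] < $b['q']) ? 1 : -1;
    });

    foreach ($preferences as $preference) {
        // First try an exact, case-insensitive match
        foreach ($enabled as $language) {
            if (strtolower($language) === $preference['tag']) {
                return $language;
            }
        }
        // Then fall back to the primary subtag (assumed rule: fr-CH may use fr-FR)
        $requested = explode('-', $preference['tag']);
        foreach ($enabled as $language) {
            $candidate = explode('-', strtolower($language));
            if ($candidate[0] === $requested[0]) {
                return $language;
            }
        }
    }
    return $default;
}
```

In this sketch, setting 'fr-FR' => false removes French from the candidates, so a French-speaking visitor is served the next acceptable language or the default.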

Translation guidelines

Language components

The translation memory contains seven types of components, divided into two groups. These types are namespaced with an uppercase prefix.

The first group contains basic language constructs in plain text, without any formatting:

  • ADJECTIVE for adjectives
  • NOUN for nouns and names
  • VERB for verbs.

The second group is used in user interfaces and MAY contain formatting, usually HTML5 or AMP HTML:

  • COMMAND for menu commands and command buttons
  • ERROR for error messages
  • HEADING for headings and form labels
  • TEXT for anything else.
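The prefix convention can be checked mechanically. The sketch below maps the seven prefixes to a flag that records whether formatting is allowed; the helper names componentType() and mayContainFormatting() are hypothetical and only serve this example.

```php
<?php
// The seven prefixes from the two groups above. The boolean records whether
// a segment of that type MAY contain formatting such as HTML5 or AMP HTML.
$componentTypes = array(
    'ADJECTIVE' => false,
    'NOUN'      => false,
    'VERB'      => false,
    'COMMAND'   => true,
    'ERROR'     => true,
    'HEADING'   => true,
    'TEXT'      => true,
);

// Hypothetical helper: returns the type prefix of a translation ID,
// or null if the ID does not follow the PREFIX_NAME convention.
function componentType($translationId, array $types)
{
    $parts = explode('_', $translationId, 2);
    // A valid ID needs a known uppercase prefix plus a non-empty remainder
    if (count($parts) === 2 && $parts[1] !== '' && isset($types[$parts[0]])) {
        return $parts[0];
    }
    return null;
}

// Hypothetical helper: true only for the second group of component types.
function mayContainFormatting($translationId, array $types)
{
    $type = componentType($translationId, $types);
    return $type !== null && $types[$type];
}
```

A check like this could, for example, flag a NOUN segment that accidentally contains markup before it is added to the translation memory.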

Compound nouns

Compound nouns are handled as single nouns. For example, shopping cart is not stored as two terms like NOUN_SHOPPING plus NOUN_CART, but as a single segment like NOUN_SHOPPING_CART.

Names as nouns

Names are treated as nouns. Therefore they contain the default NOUN prefix, for example NOUN_PAYPAL for PayPal and NOUN_GOOGLE_ANALYTICS for Google Analytics.

Verbs to commands

Commands, menu commands and command buttons usually indicate an activity. Therefore commands SHOULD be derived from verbs. The translation memory SQL file contains an example of this business logic. The general verb print in lowercase becomes the command Print… with an uppercase first letter and three dots in user interfaces:

INSERT IGNORE INTO sc_translation_memory
    (translation_id, language_id, translation)
  VALUES
    ('VERB_PRINT', 'ca-AD', 'imprimir'),
    ('VERB_PRINT', 'de-DE', 'drucken'),
    ('VERB_PRINT', 'en-GB', 'print'),
    ('VERB_PRINT', 'es-ES', 'imprimir'),
    ('VERB_PRINT', 'fr-FR', 'imprimer'),
    ('VERB_PRINT', 'it-IT', 'stampare'),
    ('VERB_PRINT', 'nl-NL', 'printen'),
    ('VERB_PRINT', 'pt-PT', 'imprimir');

INSERT IGNORE INTO sc_translation_memory
    (translation_id, language_id, translation)
  VALUES
    ('COMMAND_PRINT', 'ca-AD', 'Imprimeix…'),
    ('COMMAND_PRINT', 'de-DE', 'Drucken…'),
    ('COMMAND_PRINT', 'en-GB', 'Print…'),
    ('COMMAND_PRINT', 'es-ES', 'Imprimir…'),
    ('COMMAND_PRINT', 'fr-FR', 'Imprimer…'),
    ('COMMAND_PRINT', 'it-IT', 'Stampa…'),
    ('COMMAND_PRINT', 'nl-NL', 'Printen…'),
    ('COMMAND_PRINT', 'pt-PT', 'Imprimir…');
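The relation between the two inserts above can be made concrete with a short comparison. The sketch below naively derives a command from a verb by capitalizing the first letter and appending an ellipsis, then compares the result with the stored COMMAND_PRINT string. The function commandFromVerb() is hypothetical, and the arrays hold a subset of the rows above.

```php
<?php
// Sample rows copied from the VERB_PRINT and COMMAND_PRINT inserts above.
$verbs = array(
    'de-DE' => 'drucken',
    'en-GB' => 'print',
    'it-IT' => 'stampare',
    'nl-NL' => 'printen',
);
$commands = array(
    'de-DE' => 'Drucken…',
    'en-GB' => 'Print…',
    'it-IT' => 'Stampa…',
    'nl-NL' => 'Printen…',
);

// Hypothetical helper: naive derivation of a command label from a verb.
// ucfirst() is byte-based, which is sufficient for the ASCII initials here;
// verbs starting with a non-ASCII letter would need the mbstring functions.
function commandFromVerb($verb)
{
    return ucfirst($verb) . '…';
}

foreach ($verbs as $language => $verb) {
    $derived = commandFromVerb($verb);
    $status = ($derived === $commands[$language]) ? 'matches' : 'differs from';
    echo "$language: derived \"$derived\" $status stored \"{$commands[$language]}\"\n";
}
```

The naive rule reproduces the stored command for German, English, and Dutch, but not for Italian, where the command uses the form Stampa… rather than the infinitive. This suggests why commands are stored as separate COMMAND segments instead of being generated from VERB segments at runtime.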

In some cases verbs are included in the translation memory for reference purposes and consistency. For example, in Dutch the verb to print may be translated as printen, afdrukken, or drukken. The definition printen of VERB_PRINT thus indicates the preferred translation for Dutch.

Errors and exceptions

Error messages and exception message strings are currently not translated when they are intended primarily for developers and server administrators.