WARNING - Still under review

The proposed approach is still under review - there are still considerable technical hurdles to overcome in this area of work.

Introduction

Prior to DSpace 7, the DSpace XML and JSP user interfaces had different catalogs of interface messages. Now that DSpace 7 has a single user interface, the DSpace community is transitioning to a single catalog of interface messages and better tools for translators.

Originating in the Linux world, the GNU gettext tools and the PO file format are also the backbone of localization support in WordPress, Django and Drupal.

The DSpace community is actively seeking contributors to aid in the translation of DSpace interface messages, to ensure that DSpace 7 can benefit from the most extensive localization support in the history of the project.

Getting started

Contributing to translations requires a GitHub account and a DuraSpace wiki account. Head over to www.github.com to create a free GitHub account.

Accounts for the DuraSpace wiki are free as well and can be requested in an email to sysadmin@duraspace.org, indicating that you wish to contribute to DSpace 7 Translations.

Familiarize yourself with the new .pot catalog of DSpace messages, as well as the .po file format that holds the translations for the catalog.

  • 2019-05-20 DRAFT dspace.pot (TODO: update link once the PR is merged)
    • In this catalog, all msgstr entries are empty, because the catalog is the authoritative place for the original keys, not for any translations.
  • 2019-05-20 DRAFT en.po or the older DRAFT de.po (TODO: update link once the PR is merged)
    • In en.po, both the msgid and the msgstr entries are in English.
    • In de.po, the msgid entries are in English and the msgstr entries are in German.
    • The old intermediate keys are still referenced, purely as comments.

Explore tools that can help you manage .pot and .po files. Poedit (https://poedit.net/) is one widely used desktop application.

2019-07-25 State of development

The key challenges that are still being tackled are:

2019-05-20 State of development

As part of preview release 1, the developers are still using en.json catalogs. Once Pull Request 366 is accepted, the migration to the new .POT and .PO standard files will be official.

As long as DSpace 7 is still in development, it is expected that the dspace.pot catalog, as well as the different translations, will continue to be extended and evolved.

Together, we aim to include as many translations as possible, each as complete as possible, in the official DSpace 7.0 release.

Volunteer!

Please list your name and email address alongside the languages to which you wish to contribute. Also feel free to join the #translation channel on the DuraSpace Slack for assistance and discussion around DSpace 7 translations.

Bram Luyten - bram@atmire.com - Dutch (nl.po)

Claudia Jürgen - claudia.juergen@tu-dortmund.de - German (de.po)

Translator documentation

Translation files (.po)

Comments - #

All lines in the translation files that start with # are comments, either aimed at helping developers or helping translators.

#: at the start of a line is a source reference, which makes it clear where in the source code the message is or was used. Because the original .po files were migrated from another, JSON-based format, the old JSON keys have been added as source references to show where each message came from in the previous format.

Normal key / msgid examples

For most keys that need to be translated, the English original is part of the DSpace Angular source code. This original is then used as the msgid.

As a translator, you add the translation into the msgstr variable.

As demonstrated in the example below, you are required to wrap your translation in double quotes.

Basic translation example
#: .item.edit.withdraw.description
msgid "Are you sure this item should be withdrawn from the archive?"
msgstr "Ben je zeker dat je dit item wil terugtrekken uit het archief?"

The following multi-line example shows that both the original key and the translation can be split over multiple lines.

Multi-line key and translation example
#: .submission.sections.upload.info
msgid ""
"Here you will find all the files currently in the item. You can update the "
"file metadata and access conditions or <strong>upload additional files just "
"dragging & dropping them everywhere in the page</strong>"
msgstr ""
"Hier vind je alle bestanden momenteel opgeladen in het item. Je kan hier de "
"bestands metadata en toegangsvoorwaarden aanpassen. Je kan ook <strong>extra "
"bestanden toevoegen door ze eender waar naar de pagina te slepen</strong>"

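When an entry is split over several lines, the quoted fragments are concatenated exactly as written, with nothing inserted between them, so every fragment that ends in the middle of a sentence needs its own trailing space. The following TypeScript sketch illustrates that joining rule. The helper joinPoFragments is hypothetical (it is not part of DSpace or gettext) and it deliberately ignores escape sequences.

```typescript
// Hypothetical helper: joins the quoted fragments of a multi-line msgid or
// msgstr by plain concatenation. Escape sequences are not handled here.
function joinPoFragments(lines: string[]): string {
  let result = "";
  for (let i = 0; i < lines.length; i++) {
    const trimmed = lines[i].trim();
    const isQuoted =
      trimmed.length >= 2 &&
      trimmed.charAt(0) === '"' &&
      trimmed.charAt(trimmed.length - 1) === '"';
    // Keep only lines of the form "..." and strip the surrounding quotes.
    if (isQuoted) {
      result += trimmed.substring(1, trimmed.length - 1);
    }
  }
  return result;
}

// The trailing space inside the second fragment is what separates the words.
const joinedWithSpace = joinPoFragments(['""', '"upload additional "', '"files"']);
// Without it, the two words run together.
const joinedWithoutSpace = joinPoFragments(['""', '"upload additional"', '"files"']);
```

In this sketch, joinedWithSpace reads "upload additional files", while joinedWithoutSpace reads "upload additionalfiles".
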
Because double quote is the special character that starts or ends a message, you need to prefix double quotes with \ if you want them to appear in the actual message.

Likewise, because \ is the escape character, you also need to escape \ itself (as \\) if you want it to appear in the message.

Escaping of double quotes example
#: .submission.workflow.generic.delete-help
msgid ""
"If you would to discard this item, select \"Delete\".  You will then be "
"asked to confirm it."
msgstr ""
"If you would to discard this item, select \"Delete\".  You will then be "
"asked to confirm it."

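The two escaping rules above can be sketched in TypeScript as follows. The helpers escapePoString and unescapePoString are hypothetical names for illustration, not functions from DSpace or gettext, and they cover only the backslash and double-quote cases discussed here.

```typescript
// Hypothetical helpers for illustration; not part of DSpace or gettext.

// Escapes a message so it can be placed between the double quotes of a
// msgid or msgstr line. Backslashes are escaped first, so that the
// backslashes added in front of quotes are not escaped a second time.
function escapePoString(message: string): string {
  return message.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Reverses escapePoString for the two sequences \\ and \".
function unescapePoString(escaped: string): string {
  return escaped.replace(/\\(["\\])/g, "$1");
}

const rawMessage = 'select "Delete"';
// Builds the line: msgstr "select \"Delete\""
const poLine = 'msgstr "' + escapePoString(rawMessage) + '"';
```

The order of the two replacements in escapePoString matters: escaping quotes first would cause the newly added backslashes to be doubled by the second step.
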
Dynamic key example

In parts of the DSpace Angular code, a list of very similar objects is being built and displayed in the user interface, for example, the search filters. 

These similar objects all need different labels in the user interface:

  • search.filters.filter.author.head
  • search.filters.filter.dateissued.head
  • ...

Unlike the normal keys discussed earlier on this page, these keys are built dynamically in the code, so the developer has no opportunity to put the English string into the code without hard-coding a static list of those similar objects. If this sounds too abstract, look at the snippet of Angular component code that puts these search filters on the page:

<h5 class="d-inline-block mb-0">{{'search.filters.filter.' + filter.name + '.head'| translate}}</h5>

To deal with these kinds of occurrences, we are currently still using the (old) en.json key entries for these types of messages in the en.po file, for example:

#. ENGLISH KEY: "Author"
#: .search.filters.filter.author.head
msgid "search.filters.filter.author.head"
msgstr "Author"

Notice that the English text is also added as a comment here, because msgid needs to hold the key in order for the translation to work.
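
To make the mechanism concrete, here is a minimal TypeScript sketch of how a dynamically built key is resolved against a catalog in which msgid holds the key. The names filterCatalog, buildFilterHeadKey and lookupMessage are hypothetical, the "Date" text is made up for the example, and the fallback to the key itself is an assumption of this sketch rather than a description of the actual loader.

```typescript
// Illustrative catalog: for dynamically built keys, msgid holds the key and
// msgstr holds the text. The dateissued text is made up for this example.
const filterCatalog: { [msgid: string]: string } = {
  "search.filters.filter.author.head": "Author",
  "search.filters.filter.dateissued.head": "Date",
};

// Mirrors the template expression
// 'search.filters.filter.' + filter.name + '.head'.
function buildFilterHeadKey(filterName: string): string {
  return "search.filters.filter." + filterName + ".head";
}

// Hypothetical lookup: falls back to the key itself when no entry exists,
// which makes missing entries easy to spot in the interface.
function lookupMessage(catalog: { [msgid: string]: string }, key: string): string {
  return Object.prototype.hasOwnProperty.call(catalog, key) ? catalog[key] : key;
}

const authorHead = lookupMessage(filterCatalog, buildFilterHeadKey("author"));
const unknownHead = lookupMessage(filterCatalog, buildFilterHeadKey("subject"));
```

Because the filter name is only known at run time, the English text cannot serve as the lookup key here; the catalog has to be keyed on the generated string.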

The Angular DSpace 7 message catalog (dspace.pot)

The format of the dspace.pot catalog file is very similar to the format of the translation files.

The main difference is that this file does not contain actual translated strings (msgstr), because it serves as the authoritative catalog of the source messages, without translations.


Background reading

The PO Format

GNU gettext utilities

Developer How-to

Escaping

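In Angular files (.html, .ts, ...) you need to escape double curly braces that belong to the message itself, for example by splitting them across concatenated strings: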

{{ 'copyright © 2002-{'+'{ year }'+'}' | translate:{year : dateObj | date:'y'} }}
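
The following TypeScript sketch shows what the concatenation in the example above evaluates to and how a {{ year }} placeholder could then be filled in. It assumes the braces are split so that the Angular template parser does not interpret them. The function interpolateParams is a hypothetical stand-in, not the ngx-translate implementation.

```typescript
// Splitting the braces over several string literals avoids writing "{{" and
// "}}" literally, while the resulting key still contains them.
const copyrightKey = "copyright © 2002-{" + "{ year }" + "}";

// Hypothetical stand-in for parameter interpolation: replaces {{ name }}
// placeholders with values from params and leaves unknown ones untouched.
function interpolateParams(text: string, params: { [name: string]: string }): string {
  return text.replace(
    /\{\{\s*([A-Za-z0-9_]+)\s*\}\}/g,
    function (match: string, paramName: string): string {
      return Object.prototype.hasOwnProperty.call(params, paramName)
        ? params[paramName]
        : match;
    }
  );
}

const copyrightText = interpolateParams(copyrightKey, { year: "2019" });
```

Here copyrightKey evaluates to "copyright © 2002-{{ year }}", and copyrightText to "copyright © 2002-2019".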


In .po/.pot files you need to escape

  • Double quotes (") with \
  • ...

Locating keys that have not been replaced

If you execute the following command in the Angular source directory, you get a list of keys that have not yet been replaced.

Grep command for identifying keys that have not yet been replaced
grep -snRHIiE "'.*\.[^[:space:]]+\.[^']+' \| translate" *

Sample output looks like:

app/+community-page/delete-community-page/delete-community-page.component.html:5: <h2 id="header" class="border-bottom pb-2">{{ 'community.delete.head' | translate
app/+community-page/delete-community-page/delete-community-page.component.html:7: <p class="pb-2">{{ 'community.delete.text' | translate:{ dso: dso.name } }}</p>
app/+community-page/delete-community-page/delete-community-page.component.html:12: <button class="btn btn-primary" (click)="onCancel(dso)">{{'community.delete.cancel' | translate}}
app/+community-page/edit-community-page/edit-community-page.component.html:4: <h2 id="header" class="border-bottom pb-2">{{ 'community.edit.head' | translate }}</h2>
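
For readers who prefer to see what the pattern looks for, the check can be approximated in TypeScript. This is a sketch, not an exact port of the grep command: translateKeyPattern and findTranslateKey are hypothetical names, and the expression only recognizes a single-quoted string containing at least two dots that is followed by the translate pipe.

```typescript
// Approximates the grep pattern: a single-quoted string with at least two
// dots, followed by the translate pipe. Not an exact port of the command.
const translateKeyPattern = /'([^']*\.[^\s']+\.[^']+)'\s*\|\s*translate/;

// Returns the key found on a template line, or null when there is none.
function findTranslateKey(line: string): string | null {
  const match = translateKeyPattern.exec(line);
  return match ? match[1] : null;
}

// A line that still uses an intermediate key.
const keyInHeading = findTranslateKey(
  "<h2 id=\"header\" class=\"border-bottom pb-2\">{{ 'community.edit.head' | translate }}</h2>"
);
// A line that already uses an English string is not reported.
const keyInPlainText = findTranslateKey("<p>{{ 'Delete this community' | translate }}</p>");
```

In this sketch, keyInHeading is "community.edit.head" and keyInPlainText is null.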


Future work

Message Context - msgctxt

Apart from comments (starting with #), msgid lines (the key) and msgstr lines (the translation), an entry can also contain a msgctxt line.

In GNU gettext, msgctxt is used to disambiguate messages. If "Person" is used in two different places in the application, you sometimes need the ability to give each occurrence a different translation in another language.

This is why uniqueness is enforced not on the msgid alone, but on the combination of msgid and msgctxt.

Here's an example, outside of DSpace, showing that "Normal" requires different grammar in another language, depending on the context in which it is used.

General gettext / PO example of msgctxt
#: utils/katestyletreewidget.cpp:132
msgctxt "Text style"
msgid "Normal"
msgstr "običan"

#: utils/kateautoindent.cpp:78
msgctxt "Autoindent mode"
msgid "Normal"
msgstr "obično"
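
The uniqueness rule can be sketched in TypeScript by indexing entries on the combination of msgctxt and msgid. This illustrates gettext semantics only, not the behaviour of the PO loader used by DSpace. The names PoEntry, entryKey and indexEntries are hypothetical. The EOT separator mirrors what gettext uses in compiled catalogs.

```typescript
interface PoEntry {
  msgctxt?: string;
  msgid: string;
  msgstr: string;
}

// gettext joins context and msgid with the EOT character (U+0004) in
// compiled catalogs; the same separator is reused here for the lookup key.
function entryKey(msgid: string, msgctxt?: string): string {
  return msgctxt === undefined ? msgid : msgctxt + "\u0004" + msgid;
}

// Hypothetical index: two entries with the same msgid can coexist as long
// as their msgctxt differs.
function indexEntries(entries: PoEntry[]): { [key: string]: string } {
  const index: { [key: string]: string } = {};
  for (let i = 0; i < entries.length; i++) {
    index[entryKey(entries[i].msgid, entries[i].msgctxt)] = entries[i].msgstr;
  }
  return index;
}

const kateIndex = indexEntries([
  { msgctxt: "Text style", msgid: "Normal", msgstr: "običan" },
  { msgctxt: "Autoindent mode", msgid: "Normal", msgstr: "obično" },
]);
const textStyleNormal = kateIndex[entryKey("Normal", "Text style")];
const autoindentNormal = kateIndex[entryKey("Normal", "Autoindent mode")];
```

An index keyed on msgid alone would keep only one of the two "Normal" entries, which is the limitation described for the current loader.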

At the time of writing (2019-06-01), the ngx-translate PO file support that DSpace relies on prevents translations from being rendered correctly if an entry also contains a msgctxt. This means that right now, we can only have a single translation for a single string. This is also why the catalog and the en.po translation file don't contain msgctxt entries.

What happened to the objective to leverage .po and gettext as new standard?

The Angular i18n framework we use, ngx-translate, has a third-party PO file loader: https://github.com/biesbjerg/ngx-translate-po-http-loader

Even though the community was initially very optimistic about its potential and the transition to .po and gettext, the major deal breaker was the absence of support for gettext message context (msgctxt), which would allow a translator to translate a key like "Home" into different words in the target language, depending on the context.

The initial ambition to use the English string as the key itself, and abandon intermediate keys, was also problematic, as we hit a large number of places in the code where keys are built programmatically.

As a result, the community settled for:

  • JSON5 as the format for the message catalog
  • Reverting to a flat list of keys, instead of a hierarchical tree. This makes it possible again to search for a particular key, which was no longer possible in the hierarchical format.
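
The difference between the two layouts can be illustrated with a short TypeScript sketch that converts a hierarchical catalog into a flat one. The names NestedCatalog and flattenCatalog are hypothetical, and the message texts are made up for the example. This is not the migration code used by the project.

```typescript
// A nested catalog maps each key segment to a message or to a deeper level.
interface NestedCatalog {
  [segment: string]: string | NestedCatalog;
}

// Hypothetical helper: joins the segments with dots, so that every message
// ends up under one complete, searchable key.
function flattenCatalog(nested: NestedCatalog, prefix: string = ""): { [key: string]: string } {
  const flat: { [key: string]: string } = {};
  const segments = Object.keys(nested);
  for (let i = 0; i < segments.length; i++) {
    const value = nested[segments[i]];
    const fullKey = prefix === "" ? segments[i] : prefix + "." + segments[i];
    if (typeof value === "string") {
      flat[fullKey] = value;
    } else {
      // Recurse into the next level and copy its flattened entries.
      const children = flattenCatalog(value, fullKey);
      const childKeys = Object.keys(children);
      for (let j = 0; j < childKeys.length; j++) {
        flat[childKeys[j]] = children[childKeys[j]];
      }
    }
  }
  return flat;
}

// Message texts below are illustrative only.
const flatCatalog = flattenCatalog({
  community: {
    edit: { head: "Edit Community" },
    delete: { head: "Delete Community" },
  },
});
```

In the flat result, the full key community.edit.head appears literally in the file, so a plain text search finds it. In the nested form the key is spread over three levels.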

