Contributors mailing list archives

Re: Translations handling in OCA repositories

ThinkOpen Solutions Portugal, Daniel Reis
- 13/10/2016 21:48:00

Thanks for the work Pedro!

The limitation you describe seems perfectly acceptable, since having PO files in merged PRs will not be frequent:
mostly on first module submission, or on a rare submission of a translation via PR.


Quoting Pedro M. Baeza (Tecnativa):

Hi contributors,
I have been working all day on improving the handling of translations via Transifex and our automated scripts for synchronizing Transifex > GitHub and vice versa, due to a problem I faced yesterday while translating resources.
Let me tell you a bit about the current architecture so that you can understand the problem:

When a commit lands in any repository, Travis CI runs a stage that installs Odoo, clones the repositories, installs the corresponding modules, exports the POT file from Odoo, and pushes all resources (POT files) and translations (PO files) to Transifex. This way, we ensure that all modifications made to strings (and possibly to PO files included in the PRs) are dumped to Transifex. This is performed basically by the script
On the other hand, the process of pulling back (or pushing, depending on which side you look at it from) the translations done on the platform from Transifex to GitHub is done by a non-automatic script located in This script checks for changes in the translations and prepares a commit with all of them under the label "OCA Transbot updated translations from Transifex". We run this script once a week, starting on Saturdays, from a cron job on an OCA server.

The first problem we faced in the past is that there's no way to obtain something like "all the modified files". Instead, the script has to check translation by translation whether there has been any change, taking the PO file from Transifex and from GitHub and comparing both at string level. This makes the script extremely slow, as there are a lot of requests, and we also have to deal with the rate limits applied by both GitHub (5000 requests per hour) and Transifex (5 per second and 100 per minute for GET, 2 per second and 25 per minute for POST, and 5 per second and 50 per minute for PUT, plus a general limit of 6000 requests per hour). There's also no reliable way to check these rates (we didn't get consistent error codes; maybe this has changed nowadays), so the solution we took was to have some rest time between operations. With this, the script needs the whole weekend to complete for all the OCA repositories.
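
As an illustration of the "rest time between operations" idea, here is a minimal Python sketch of a throttle that spaces out calls so a per-minute limit is not exceeded. It is not the code used by the OCA scripts: the class name MinIntervalThrottle and the dictionary layout are hypothetical, and only the per-minute figures come from the limits quoted above.

```python
import time


class MinIntervalThrottle:
    """Enforce a minimum pause between consecutive calls (hypothetical helper)."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        # Spreading calls evenly keeps the rate at or below max_per_minute.
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Per-minute Transifex figures quoted in the message; the keys are illustrative.
TRANSIFEX_LIMITS = {"GET": 100, "POST": 25, "PUT": 50}
throttles = {verb: MinIntervalThrottle(n) for verb, n in TRANSIFEX_LIMITS.items()}
```

A caller would invoke throttles["GET"].wait() before each GET request. Spacing GET calls 0.6 seconds apart also keeps them under the per-second figure, at the cost of the long total run time described above.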

The problem I have found now is that I made a translation when the module arrived in the repository, but discovered that the translation was lost after a while. The explanation is a combination of several factors:
  • Several commits have been made in the same repository, coinciding with the v10 taskforce effort.
  • The module I translated already contained a partial translation in the repository.
  • The Transifex > GitHub script wasn't run in between.

I have tried several approaches to solve this without success: deleting empty entries, trying to remove files, detecting changes... One limitation in Git, the absence of timestamp metadata in files (check official statement about that in ), has also made this harder. Another limitation is that timestamps in Transifex are naive (not timezone-aware).

But I have finally reached a satisfactory solution: marking translation files with the timestamp of the last commit that modified them (converted to UTC), and letting the Transifex client skip these files. You can see it in this commit:

The only limitation for now is that you can't change a translation both in GitHub and in Transifex within the same synchronization window, which should rarely happen if the rule is always to translate on Transifex.
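
The timestamp-marking idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the content of the referenced commit: all function names are hypothetical, the file modification time is assumed to be the place where the mark is stored, and how the Transifex client actually decides to skip a file is not shown. The only external command used is git log with the %ct (committer time, Unix seconds) format.

```python
import os
import subprocess
from datetime import datetime, timezone


def epoch_to_naive_utc(epoch_seconds):
    """Convert a Unix timestamp to a naive datetime expressed in UTC."""
    aware = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return aware.replace(tzinfo=None)


def last_commit_epoch(repo_dir, rel_path):
    """Committer time (Unix seconds) of the last commit touching rel_path."""
    out = subprocess.check_output(
        ["git", "log", "-1", "--format=%ct", "--", rel_path], cwd=repo_dir
    )
    # Raises ValueError if the path has never been committed (empty output).
    return int(out.strip())


def stamp_file(repo_dir, rel_path):
    """Set the file's mtime to its last commit time (assumed marking mechanism)."""
    epoch = last_commit_epoch(repo_dir, rel_path)
    os.utime(os.path.join(repo_dir, rel_path), (epoch, epoch))
    return epoch_to_naive_utc(epoch)


def local_is_newer(commit_epoch, remote_updated_naive_utc):
    """True when the commit is newer than a naive remote timestamp taken as UTC."""
    return epoch_to_naive_utc(commit_epoch) > remote_updated_naive_utc
```

Converting the commit time to a naive UTC value makes it directly comparable with a naive timestamp, provided the remote value is in UTC as well, which is an assumption here.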

I hope this explanation is useful to anyone interested in the process. Any suggestions are welcome.

