The Ubuntu desktop help documents and the Ubuntu Serverguide are published on the help.ubuntu.com web site in approximately 60 languages (although many translations are very incomplete). The document compile workflow is somewhat wasteful, because it creates a language-specific version of every file, even when the file is identical for all languages. For example:
doug@s15:~/docs-trunk/z/html/ubuntu-docs$ ls -l *yelp-note-warning*
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.am
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.ar
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.ast
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.az
...
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.ur
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.uz
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.zh-CN
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.zh-HK
-rw-r--r-- 1 doug doug 1088 Mar 13 23:55 yelp-note-warning.png.zh-TW
For such a file, it would be adequate to have just one file called yelp-note-warning.png.
For completeness, an example where we would not be able to make just one file:
doug@s15:~/docs-trunk/z/html/ubuntu-docs$ ls -l figures/unity-workspace-intro*
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.am
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.ar
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.ast
...
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.da
-rw-r--r-- 1 doug doug 64335 Mar 13 23:55 figures/unity-workspace-intro.png.de
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.el
...
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.id
-rw-r--r-- 1 doug doug 47152 Mar 13 23:55 figures/unity-workspace-intro.png.it
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.ja
...
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.zh-HK
-rw-r--r-- 1 doug doug 48077 Mar 13 23:55 figures/unity-workspace-intro.png.zh-TW
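Equal file sizes only hint that the contents match. A minimal sketch for telling the two cases above apart, assuming GNU coreutils is available, is to count the distinct checksums among the language versions of one file. The function name count_variants is made up for illustration:

```shell
#!/bin/bash
# count_variants BASE
# Print the number of distinct contents found among the files BASE.<lang>.
# (Hypothetical helper, not part of the ubuntu-docs build.)
count_variants() {
    # md5sum prints "<hash>  <file>"; keep only the hash and count unique values.
    md5sum "$1".* | cut -d' ' -f1 | sort -u | wc -l
}

# Possible usage, from the html/ubuntu-docs directory:
#   count_variants yelp-note-warning.png
#   count_variants figures/unity-workspace-intro.png
```

A result of 1 means every language version has the same content; anything larger means at least one language has its own version.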
My question is: How can we detect that all language-specific versions of a file are identical and, if so, replace them with one file without a language-specific suffix?
Pseudo code:
For all files, including sub-folders{
If all language specific versions of the file are identical{
Replace the language specific versions with one non-language specific version.
}
}
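To make the pseudo code concrete, here is a rough bash sketch of one way it could be written. It is only an illustration under some assumptions: the function name dedupe_languages is invented, the list of language suffixes is passed in by the caller (in the Makefile it would have to be derived from the same names the install loop produces), and GNU find, sort and cmp are available. A file that is missing for any of the listed languages is left untouched.

```shell
#!/bin/bash
# dedupe_languages DIR LANG...
# For every file under DIR that exists as FILE.<lang> for ALL listed languages
# with identical content, keep a single copy named FILE and delete the rest.
# (Hypothetical helper; the language list is supplied by the caller.)
dedupe_languages() {
    local dir=$1; shift
    local -a langs=("$@")
    local ref=${langs[0]}        # first language is used as the reference copy
    local first base same lc

    # "sort -z" reads the whole list before printing anything, so find has
    # finished walking the tree before any file is renamed or removed.
    find "$dir" -type f -name "*.$ref" -print0 | sort -z |
    while IFS= read -r -d '' first; do
        base=${first%."$ref"}
        same=yes
        for lc in "${langs[@]}"; do
            # cmp -s is silent; it also fails if the file does not exist.
            if ! cmp -s "$first" "$base.$lc"; then
                same=no
                break
            fi
        done
        if [ "$same" = yes ]; then
            mv -f -- "$first" "$base"
            for lc in "${langs[@]:1}"; do
                rm -f -- "$base.$lc"
            done
        fi
    done
}

# Possible usage (directory and language list are examples only):
#   dedupe_languages "$INSTALLDIR" en de it zh-CN
```

This follows the pseudo code literally and produces files with no language suffix at all. Whether the web server then serves such a file correctly for every language depends on how content negotiation is set up, so it would need checking against the site's .htaccess before use.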
At this time, we are only concerned with the desktop help docs, although if something is figured out, we would also do it for the Serverguide.
This could be done in the Makefile, if possible (and preferred), or as a standalone script (which I suppose could be called from the Makefile). My concern with writing a C program to do it is that not all Ubuntu documentation team members are C programmers, which could cause maintenance problems down the road.
It is envisioned, but not required, that the new code, as per the pseudo code above, would be added at the very end of the install section of the Makefile:
# Installs all HTML files to a single multilingual directory for subsequent copying to
# the web server document structure (e.g. to run with Apache and MultiViews enabled)
install:
	rm -Rf "$(INSTALLDIR)"/*; \
	mkdir -p "$(INSTALLDIR)"; \
	cp -R "$(HTMLDESTDIR)/"* "$(INSTALLDIR)"; \
	for lc in C $(help_linguas); do \
	  lang=`echo $$lc | $(SED) -e 's/[@_]/-/'`; \
	  if test "$$lang" = "C"; then lang=en; fi; \
	  if test "$$lang" = "gl"; then lang=gl-GL; fi; \
	  if test "$$lang" = "ms"; then lang=ms-MS; fi; \
	  if test "$$lang" = "pl"; then lang=pl-PL; fi; \
	  cp -af "$(INSTALLDIR)/$$lang"/*.css "$(INSTALLDIR)"; \
	  rm -Rf "$(INSTALLDIR)/$$lang"/*.css ; \
	  find "$(INSTALLDIR)/$$lang" -type f -exec mv {} {}.$$lang \; ; \
	  cp -af "$(INSTALLDIR)/$$lang"/* "$(INSTALLDIR)"; \
	  rm -Rf "$(INSTALLDIR)/$$lang" ; \
	done
... new code, per this question, goes here ...
EDIT: Gunnar's solution is great, but there are still other files that are redundant: all .js files, for example, and still some .png files in the main directory.
EDIT: Gunnar's revised solution addresses all the issues, for a net saving of 3326 fewer files for the 17.04 desktop docs web pages alone.
References:
The HTML compile Makefile (see the install segment at the end).
The entire project code.
The build procedure.
The web site .htaccess file. Gunnar's answer relies on the language fallback.
I made an attempt at fixing it:
http://bazaar.launchpad.net/~ubuntu-core-doc/ubuntu-docs/trunk/revision/607?compare_revid=605
It keeps the .en extension, and relies on the .htaccess file as a fallback.
The first try resulted in always getting the .png file that had no language extension, no matter which language was actually being asked for. The second attempt uses .png.en as the generic file extension and relies on the LanguagePriority fallback directive in the website's .htaccess file to supply that file whenever there isn't one specific to the desired language.
Example:
German will get the .png.de file, Italian the .png.it file, and any other language request will get the .png.en file.
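For illustration only, the idea of keeping the .en file and dropping identical copies could be sketched in bash roughly as follows. This is not the code from the linked revision; the function name prune_english_copies is made up, and it assumes GNU find, sort and cmp, and that every file of interest has an .en version.

```shell
#!/bin/bash
# prune_english_copies DIR
# Delete FILE.<lang> whenever it is byte-identical to FILE.en, so that only
# genuinely translated versions remain next to the English fallback.
# (Hypothetical helper, not the code from the linked revision.)
prune_english_copies() {
    local dir=$1 en base f

    # "sort -z" buffers the full list, so find is done before files are removed.
    find "$dir" -type f -name '*.en' -print0 | sort -z |
    while IFS= read -r -d '' en; do
        base=${en%.en}
        for f in "$base".*; do
            [ "$f" = "$en" ] && continue     # never remove the fallback itself
            [ -f "$f" ] || continue          # skip directories and unmatched globs
            if cmp -s "$en" "$f"; then
                rm -f -- "$f"
            fi
        done
    done
}

# Possible usage, as a last step of the install target:
#   prune_english_copies "$INSTALLDIR"
```

With the example above, the .png.de and .png.it files differ from .png.en and so are kept, while every other language's copy is removed and the request falls back to .png.en.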