GetSimple Support Forum

Full Version: Romance languages transliteration plugin
Ensures correct slug/URL transliteration for non-US-ASCII characters in Spanish, French, Italian, Portuguese, and other Romance languages.

Can be especially useful for servers that don't support mb (multibyte) encoding.
It also fixes issues such as 'ü' being converted to 'ue' (it should be 'u' in these languages), and handles the Spanish marks ¿ and ¡.

Usage: unzip and upload to your plugins folder.

Includes these character conversions:
Code:
    # vowels:
    'á'=>'a', 'é'=>'e', 'í'=>'i', 'ó'=>'o', 'ú'=>'u',
    'Á'=>'a', 'É'=>'e', 'Í'=>'i', 'Ó'=>'o', 'Ú'=>'u',
    'à'=>'a', 'è'=>'e', 'ì'=>'i', 'ò'=>'o', 'ù'=>'u',
    'À'=>'a', 'È'=>'e', 'Ì'=>'i', 'Ò'=>'o', 'Ù'=>'u',
    'ä'=>'a', 'ë'=>'e', 'ï'=>'i', 'ö'=>'o', 'ü'=>'u',
    'Ä'=>'a', 'Ë'=>'e', 'Ï'=>'i', 'Ö'=>'o', 'Ü'=>'u',
    'â'=>'a', 'ê'=>'e', 'î'=>'i', 'ô'=>'o', 'û'=>'u',
    'Â'=>'a', 'Ê'=>'e', 'Î'=>'i', 'Ô'=>'o', 'Û'=>'u',
    # consonants:
    'ñ'=>'n', 'ç'=>'c',
    'Ñ'=>'n', 'Ç'=>'c',
    # marks:
    '¿'=>'', '¡'=>''
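
To illustrate how a table like the one above can be applied to a page title, here is a minimal sketch. It is not the plugin's actual code: the function name `example_slug` and the cleanup steps after the character replacement are assumptions made for the example, and the array is only a short excerpt of the table.

```php
<?php
// Short excerpt of the conversion table above.
$translit = array(
    'á'=>'a', 'é'=>'e', 'í'=>'i', 'ó'=>'o', 'ú'=>'u',
    'ü'=>'u', 'Ü'=>'u', 'ñ'=>'n', 'Ñ'=>'n', 'ç'=>'c',
    '¿'=>'', '¡'=>'',
);

// Hypothetical helper (assumption, not part of GetSimple or the plugin).
function example_slug($title, array $map) {
    // strtr() with an array replaces whole byte sequences, so it
    // works on UTF-8 strings without needing the mbstring extension.
    $s = strtr($title, $map);
    $s = strtolower($s);
    // Collapse every run of characters outside a-z and 0-9 into one hyphen.
    $s = preg_replace('/[^a-z0-9]+/', '-', $s);
    return trim($s, '-');
}

echo example_slug('¿Qué pingüino?', $translit), "\n"; // que-pinguino
```

Because the replacement happens before any stripping of non-ASCII characters, accented letters survive as their base letters instead of being dropped.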

Please let me know if I should add any other characters.
OMG! Bro. You are really my life saver! I salute you for the great help! I'm so happy that it works like a charm now.

Again, I really appreciate your help bro!
I suppose the lang authors could simply add this to the lang file also?
Since GS now has i18n fallbacks to the default (which is core en), could we not just add this to the core en_US file? I think it's missing.
(2015-09-12, 23:26:31)shawn_a Wrote: I suppose the lang authors could simply add this to the lang file also?

Yes, in fact there are several language files that do it.
But if a user has a different language selected for the backend (e.g. they prefer working in English), then transliteration doesn't work for that user.

(2015-09-12, 23:28:31)shawn_a Wrote: Since GS now has i18n fallbacks to the default (which is core en), could we not just add this to the core en_US file? I think it's missing.

Yes but... for all languages? Only some?
There's also a problem with at least one character: ü (and Ü) - in German it is transliterated as "ue", but in Spanish it should be "u". No idea if there are other possible conflicts between langs.
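
One way to picture the ü conflict is a shared base table plus per-language overrides. The following is only a sketch of that idea: the variable names, the `translit_for` helper and the locale keys are hypothetical and are not part of GetSimple.

```php
<?php
// Shared table (excerpt): ü becomes "u", as wanted for Spanish.
$base = array('ü'=>'u', 'Ü'=>'u', 'ñ'=>'n', 'Ñ'=>'n');

// Language-specific overrides, keyed by locale (illustrative).
$overrides = array(
    'de_DE' => array('ü'=>'ue', 'Ü'=>'ue'),
);

// Hypothetical helper: returns the table to use for a given language.
function translit_for($lang, array $base, array $overrides) {
    $extra = isset($overrides[$lang]) ? $overrides[$lang] : array();
    // With string keys, entries from the later array win in array_merge().
    return array_merge($base, $extra);
}

echo strtr('pingüino', translit_for('es_ES', $base, $overrides)), "\n"; // pinguino
echo strtr('müller', translit_for('de_DE', $base, $overrides)), "\n";   // mueller
```

This keeps a single common table and isolates each conflict in a small override, but it still needs some way to decide which language a given title is in.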
We could at least include a standard Latin translit, since we only have a few chars built in to the _id func and most are just removed completely.

Not sure I understand your first comment; afaik translit is only used for slugs atm.
(2015-09-13, 05:10:45)shawn_a Wrote: We could at least include a standard Latin translit, since we only have a few chars built in to the _id func and most are just removed completely.

I agree, it would be nice.

I've merged several transliteration arrays into this one (draft), which would work for Roman/Latin (Spanish, French, Italian, Portuguese, Catalan...), Russian, Polish, Czech, Slovak, ...

Code:
// Roman
    'á'=>'a', 'é'=>'e', 'í'=>'i', 'ó'=>'o', 'ú'=>'u',
    'Á'=>'a', 'É'=>'e', 'Í'=>'i', 'Ó'=>'o', 'Ú'=>'u',
    'à'=>'a', 'è'=>'e', 'ì'=>'i', 'ò'=>'o', 'ù'=>'u',
    'À'=>'a', 'È'=>'e', 'Ì'=>'i', 'Ò'=>'o', 'Ù'=>'u',
    'ä'=>'a', 'ë'=>'e', 'ï'=>'i', 'ö'=>'o', 'ü'=>'u',
    'Ä'=>'a', 'Ë'=>'e', 'Ï'=>'i', 'Ö'=>'o', 'Ü'=>'u',
    'â'=>'a', 'ê'=>'e', 'î'=>'i', 'ô'=>'o', 'û'=>'u',
    'Â'=>'a', 'Ê'=>'e', 'Î'=>'i', 'Ô'=>'o', 'Û'=>'u',
    'ñ'=>'n', 'ç'=>'c',
    'Ñ'=>'n', 'Ç'=>'c',
    '¿'=>'', '¡'=>'',
// Special Czech chars with diacritics (except some)
    "ě"=>"e","Ě"=>"E","š"=>"s","Š"=>"S","č"=>"c",
    "Č"=>"c","ř"=>"r","Ř"=>"r","ž"=>"z","Ž"=>"z",
    "ý"=>"y","Ý"=>"y",
    "ů"=>"u","Ů"=>"u","ť"=>"t","Ť"=>"t",
    "ď"=>"d","Ď"=>"d","ň"=>"n","Ň"=>"n",
// Special Slovak chars with diacritics (except some)
    "ĺ"=>"l","ľ"=>"l","ŕ"=>"r",
    "Ĺ"=>"l","Ľ"=>"L","Ŕ"=>"r",
// Polish
    "Ą"=>"a","Ć"=>"c","Ę"=>"e",
    "Ł"=>"l","Ń"=>"n",
    "Ś"=>"s","Ź"=>"z","Ż"=>"z",
    "ą"=>"a","ć"=>"c","ę"=>"e",
    "ł"=>"l","ń"=>"n",
    "ś"=>"s","ź"=>"z","ż"=>"z",
// Russian
    "А"=>"a","Б"=>"b","В"=>"v",
    "Г"=>"g","Д"=>"d","Е"=>"e","Ё"=>"yo","Ж"=>"zh",
    "З"=>"z","И"=>"i","Й"=>"j","К"=>"k","Л"=>"l",
    "М"=>"m","Н"=>"n","О"=>"o","П"=>"p","Р"=>"r",
    "С"=>"s","Т"=>"t","У"=>"u","Ф"=>"f","Х"=>"h",
    "Ц"=>"c","Ч"=>"ch","Ш"=>"sh","Щ"=>"shh","Ъ"=>"'",
    "Ы"=>"y","Ь"=>"","Э"=>"e","Ю"=>"yu","Я"=>"ya",
    "а"=>"a","б"=>"b","в"=>"v","г"=>"g","д"=>"d",
    "е"=>"e","ё"=>"yo","ж"=>"zh","з"=>"z","и"=>"i",
    "й"=>"j","к"=>"k","л"=>"l","м"=>"m","н"=>"n",
    "о"=>"o","п"=>"p","р"=>"r","с"=>"s","т"=>"t",
    "у"=>"u","ф"=>"f","х"=>"h","ц"=>"c","ч"=>"ch",
    "ш"=>"sh","щ"=>"shh","ъ"=>"","ы"=>"y","ь"=>"",
    "э"=>"e","ю"=>"yu","я"=>"ya"," "=>"-","-"=>"-","»"=>""

However, it would not work properly with German, at least, because of the ü character (though I found that many German blogs out there don't seem to care about this). I'm not sure what to suggest about this (some switch, a gsconfig setting...?)

(2015-09-13, 05:10:45)shawn_a Wrote: Not sure I understand your first comment; afaik translit is only used for slugs atm.

If a user who has en_US selected for working in the backend creates a page with a Russian title, the translit array in ru_RU is not used when generating the page slug.
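
The situation can be sketched as follows. This is a simplified model, not GetSimple's real slug code: the `$tables` array stands in for the per-language translit arrays, and `slug_with` is a hypothetical helper.

```php
<?php
// Stand-ins for the translit arrays of two language files (excerpts).
$tables = array(
    'en_US' => array(),
    'ru_RU' => array('П'=>'p', 'р'=>'r', 'и'=>'i', 'в'=>'v', 'е'=>'e', 'т'=>'t'),
);

// Hypothetical helper: transliterate, lowercase, then strip to a-z, 0-9 and hyphens.
function slug_with(array $map, $title) {
    $s = strtolower(strtr($title, $map));
    return trim(preg_replace('/[^a-z0-9]+/', '-', $s), '-');
}

// Only the backend language's table is consulted: nothing usable is left.
var_dump(slug_with($tables['en_US'], 'Привет')); // string(0) ""

// One possible alternative: merge every available table first.
$all = array();
foreach ($tables as $table) {
    $all = array_merge($all, $table);
}
var_dump(slug_with($all, 'Привет')); // string(6) "privet"
```

Merging everything only works while the tables don't disagree on a character, which is exactly the ü problem mentioned earlier in the thread.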
You could hack this by using a modified ru file with only a transliteration array in it, and setting the backend language to this custom ru file.
Then turn langfallbackmerge on for en_US... not that that is reasonable, but it should work.

"translate front end stuff with alternate language"

Hmm, almost everything is front end, so maybe we just override using what we just discussed (a custom file), or add a gsconfig setting to load only the translit from an alternate language file.

We don't really support front end languages. Do we need to, or do we leave it to plugins to do translit on their own when saving an alternate-language page?