polyfragmented · (This post was last modified: 2011-03-08, 08:22:51 by infos.media.)

Hi,

I'm a new user from Germany testing GetSimple. Liking it lots so far.

I noticed that German umlauts (ä, ü, ö, ...) in a navigation's link title attribute and link text are being rendered with HTML entities in GetSimple's markup output. Since the .htaccess file and my own encoding metatag request UTF-8 that seems redundant? Seems to happen in navigation only here. Using GS 2.0.3.1 Edit: plus transliteration plugin.

Code:
<li class="ueber current"><a href="http://domain.info/ueber/" title="&Uuml;ber mich">&Uuml;ber mich</a></li>

Keep up the good work! Smile

Thorsten

yojoe · 2011-03-08, 08:11:22

When you create a new page, name its url without special characters,
or use this plugin: http://get-simple.info/extend/plugin/slu...ration/33/

polyfragmented · (This post was last modified: 2011-03-08, 08:18:16 by infos.media.)

Thanks for your reply,

I forgot to mention that I am indeed using the transliteration plugin which cleaned up the page slug itself fine. I'm using the default call for building the navigation.

polyfragmented · 2011-03-10, 06:05:45

I filed a bug report at http://code.google.com/p/get-simple-cms/...ail?id=142

ccagle8 · 2011-03-13, 12:13:42

I've checked this out, and I fail to see where the problem is. From the source, you may see the encoded characters, but why is this a bad thing?

We've had discussions about this very thing before (encoded chars in the page's body) and no one has ever provided proof that this is a bad thing. From what I know, I think Google will be able to read something like this:

Code:
German &Auml;&auml;&Ouml;&ouml;&Uuml;&uuml;

mvlcek · 2011-03-13, 18:19:09

ccagle8 Wrote: I've checked this out, and I fail to see where the problem is. From the source, you may see the encoded characters, but why is this a bad thing?
...

Code:
German ÄäÖöÜü

It makes a lot of operations on the page a bit more difficult, e.g.

Extracting the words for search: you have to decode the entities first.
Extracting the first n characters for an excerpt: the character count will be off if you don't decode.
Splitting a long page into multiple pages.
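To illustrate the excerpt case, here is a minimal sketch. The `$stored` string is a made-up example taken from the navigation title earlier in the thread, not actual GetSimple page data, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical stored title with the umlaut saved as an HTML entity.
$stored = '&Uuml;ber mich';

// Cutting the raw string counts the entity's six bytes and splits it.
$naive = substr($stored, 0, 4);                   // "&Uum"

// Decode first, then cut by characters rather than bytes.
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
$excerpt = mb_substr($decoded, 0, 4, 'UTF-8');    // "Über"

// The lengths differ as well: 14 bytes encoded, 9 characters decoded.
echo strlen($stored), ' vs ', mb_strlen($decoded, 'UTF-8'), "\n";
echo $naive, ' vs ', $excerpt, "\n";
```

The same mismatch affects word extraction for search and any attempt to split a page at a fixed length.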

Together with these terrible slashes, I had to get the content for I18N Search like this (it works, but is it correct?):

Code:
$content = html_entity_decode(strip_tags(stripslashes(htmlspecialchars_decode($pagedata->content))), ENT_QUOTES, 'UTF-8');
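Broken into steps, that chain can be read as follows. This is only a sketch: the `$stored` value is a hypothetical example of content escaped with htmlspecialchars and carrying extra slashes, not a confirmed description of how GetSimple stores pages.

```php
<?php
// Hypothetical stored value: markup escaped with htmlspecialchars() plus added slashes.
$stored = '&lt;p&gt;&amp;Uuml;ber \&quot;mich\&quot;&lt;/p&gt;';

// 1. Undo the htmlspecialchars() escaping so the tags become real markup again.
$step1 = htmlspecialchars_decode($stored);  // <p>&Uuml;ber \"mich\"</p>

// 2. Remove the backslashes added in front of the quotes.
$step2 = stripslashes($step1);              // <p>&Uuml;ber "mich"</p>

// 3. Drop the markup, leaving text that still contains named entities.
$step3 = strip_tags($step2);                // &Uuml;ber "mich"

// 4. Turn the remaining entities into UTF-8 characters.
$text = html_entity_decode($step3, ENT_QUOTES, 'UTF-8'); // Über "mich"

echo $text, "\n";
```

The order matters in this sketch: decoding all entities before `strip_tags` would turn any escaped angle brackets in the text itself into something that looks like a tag.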

I think it's impossible to get the page content with its tags intact but with everything except the htmlspecialchars characters decoded, which is why the pagify plugin doesn't bother.

BTW: I hope the person responsible for add/stripslashes and its "automagical" usage is never allowed to define a feature of a programming language again ;-)

polyfragmented · (This post was last modified: 2011-03-14, 22:14:54 by infos.media.)

I'll research this further from an encoding/SEO point-of-view when I find the time.

Thanks for chipping in on this from a programming point-of-view, mvlcek.

polyfragmented · 2011-03-31, 05:45:36

Someone brought up the same problem in the system's editor. Connie suggested a change to the editor config which turns off character replacement in the editor. That solves the editor problem at least and makes edits easier with lots of special characters.
