GetSimple Support Forum

Bug: htmlentities messes up code blocks
Greek characters are supported by Unicode. If entered as regular text, they behave no differently from any other text.
If entered in a <code> block, they are displayed in their entity form. This results in
Code:
&pi; = 3.14
&phi; = 1.62

instead of

Code:
π = 3.14
φ = 1.62

The problem lies in the safe_slash_html() function in admin/inc/basic.php, where htmlentities() is used. htmlentities() is identical to htmlspecialchars() in all ways, except that with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.
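
To make the difference concrete, here is a minimal sketch that runs both functions on the same string. The input string and variable names are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative input only: Greek letters plus characters that are special in HTML.
$text = 'π = 3.14 & φ < 2';

// htmlentities() converts every character that has a named HTML entity.
$asEntities = htmlentities($text, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() converts only &, <, > and (with ENT_QUOTES) both quote types.
$asSpecialChars = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $asEntities, "\n";     // &pi; = 3.14 &amp; &phi; &lt; 2
echo $asSpecialChars, "\n"; // π = 3.14 &amp; φ &lt; 2
```

Both results still escape the characters that matter for HTML safety (&, <, >, quotes); only the handling of the Greek letters differs.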

Fix:

FIND
Code:
function safe_slash_html($text) {
    if (get_magic_quotes_gpc()==0) {
        $text = addslashes(htmlentities($text, ENT_QUOTES, 'UTF-8'));
    } else {
        $text = htmlentities($text, ENT_QUOTES, 'UTF-8');
    }
    return $text;
}

REPLACE WITH
Code:
function safe_slash_html($text) {
    if (get_magic_quotes_gpc()==0) {
        $text = addslashes(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
    } else {
        $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    return $text;
}
Thanks for the info.

Did you work with GS 3.0 or with the beta version, GS 3.1?
GetSimple 3.0
Please test GetSimple 3.1B, as some changes have been made there.
Just tested with version 3.1B r646 RC3.
Problem is still there.
intelx86 Wrote:Greek characters are supported by Unicode. If entered as regular text, they behave no differently from any other text.
If entered in a <code> block, they are displayed in their entity form. This results in
Code:
&pi; = 3.14
&phi; = 1.62
instead of
[...]

You mean in the source code of a page?
I think that HTML entities are generated in the page content not only between <code> and </code> tags, but everywhere. It also happens with characters like á (&aacute;), ä (&auml;), etc.

Not a serious problem, as pages are rendered equally by the browser, the user doesn't notice.

...But anyway, I like your suggested patch: the page's source code is more readable if it is in German, French, Spanish, etc. (and I believe it wouldn't break anything)
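
A rough way to check the "rendered equally" point is to decode both stored forms and compare them. In this sketch html_entity_decode() only stands in for what a browser does when it resolves entities, and the sample string and variable names are invented for the example.

```php
<?php
// Sample text with accented and Greek characters (illustrative only).
$original = 'á ä π';

// The two candidate storage forms.
$withEntities     = htmlentities($original, ENT_QUOTES, 'UTF-8');     // "&aacute; &auml; &pi;"
$withSpecialChars = htmlspecialchars($original, ENT_QUOTES, 'UTF-8'); // unchanged: "á ä π"

// Stand-in for the browser resolving entities when it renders the page.
$renderedFromEntities     = html_entity_decode($withEntities, ENT_QUOTES, 'UTF-8');
$renderedFromSpecialChars = html_entity_decode($withSpecialChars, ENT_QUOTES, 'UTF-8');

var_dump($renderedFromEntities === $renderedFromSpecialChars); // bool(true)
```

So for plain page content the visible result should be the same either way; the difference is only in how readable the stored source is.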
The problem arose because I use Markdown syntax when writing pages. Basically, Markdown is a frontend that parses the content of each page. So it is "silly" to store entities inside the XML files for characters that are fully supported by Unicode.
Ah, are you using Zegnåt's Markdown plugin...?
Yes. The problem actually arises with the plugin, but it is basically a structural problem in GetSimple (the use of htmlentities instead of htmlspecialchars).

If Greek characters are entered in a simple <pre><code></code></pre> structure with Markdown disabled, the result is Greek characters, not their entities, even though entities are used for storage.
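
One possible explanation for why only the Markdown case breaks, sketched below under an assumption: Markdown escapes ampersands inside code blocks, so an already stored "&pi;" would get escaped a second time. The helper shown_in_code_block() is hypothetical and is not the plugin's actual code; htmlspecialchars() merely approximates the escaping step and html_entity_decode() stands in for the browser.

```php
<?php
// Hypothetical helper: approximates what a reader sees for stored text
// that ends up inside a Markdown code block.
function shown_in_code_block($stored) {
    // Assumption: the Markdown layer escapes &, < and > inside code blocks.
    $escapedByMarkdown = htmlspecialchars($stored, ENT_NOQUOTES, 'UTF-8');
    // Stand-in for the browser resolving entities once when rendering.
    return html_entity_decode($escapedByMarkdown, ENT_QUOTES, 'UTF-8');
}

// Storage with htmlentities(): "&pi;" is escaped again to "&amp;pi;".
$before = shown_in_code_block(htmlentities('π = 3.14', ENT_QUOTES, 'UTF-8'));

// Storage with htmlspecialchars(): the Greek letter is stored as-is.
$after = shown_in_code_block(htmlspecialchars('π = 3.14', ENT_QUOTES, 'UTF-8'));

echo $before, "\n"; // &pi; = 3.14
echo $after, "\n";  // π = 3.14
```

If that assumption holds, it would also fit the observation that a plain <pre><code> block with Markdown disabled shows the Greek characters correctly, since nothing escapes the stored entity a second time.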
Strange, I get the same entities in the HTML code with Greek (or German/Spanish/etc.) characters both inside and outside <pre><code></code></pre> blocks.

(Anyway, as I said, I like your patch. I think that making that change in the core would be good.)
@intelx86, thanks for the patch, I've only seen it now.

Updated the SVN with it...