1 (edited by intelx86 2012-02-04 11:57:20)

Topic: Bug: htmlentities messes code blocks

Greek characters are supported by Unicode. If entered as regular text, they behave no differently from other text.
If entered in a <code> block, they are displayed in their entity form. This results in

&pi; = 3.14
&phi; = 1.62

instead of

π = 3.14
φ = 1.62

The problem lies in the safe_slash_html() function inside admin/inc/basic.php, where htmlentities() is used. htmlentities() and htmlspecialchars() are identical in all ways, except that with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.
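
To illustrate the difference described above, here is a minimal sketch. The sample string and variable names are made up for the illustration and are not taken from GetSimple:

```php
<?php
// Illustration only: run both functions on the same UTF-8 string.
$text = 'π < 4 & φ > 1';

// htmlentities() converts every character that has a named HTML entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() converts only the markup-significant characters
// (&, <, >, and quotes) and leaves the Greek letters untouched.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $entities, "\n"; // &pi; &lt; 4 &amp; &phi; &gt; 1
echo $special, "\n";  // π &lt; 4 &amp; φ &gt; 1
```

Both results are safe to embed in HTML; only the second keeps the Greek letters readable in the stored source.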

Fix:

FIND

function safe_slash_html($text) {
    if (get_magic_quotes_gpc()==0) {
        $text = addslashes(htmlentities($text, ENT_QUOTES, 'UTF-8'));
    } else {
        $text = htmlentities($text, ENT_QUOTES, 'UTF-8');
    }
    return $text;
}

REPLACE WITH

function safe_slash_html($text) {
    if (get_magic_quotes_gpc()==0) {
        $text = addslashes(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
    } else {
        $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    return $text;
}

Re: Bug: htmlentities messes code blocks

thanks for the info,

did you work with GS 3.0 or with the Beta version, GS 3.1?

|--
The German GetSimple website: http://www.Get-Simple.de
The German-language GetSimple (sub)forum: http://get-simple.info/forum/forum/16/german-deutsch/

Re: Bug: htmlentities messes code blocks

Get-Simple 3.0

Re: Bug: htmlentities messes code blocks

Please test GetSimple 3.1B, as changes have been made there.


Re: Bug: htmlentities messes code blocks

Just tested with version 3.1B r646 RC3.
Problem is still there.

Re: Bug: htmlentities messes code blocks

intelx86 wrote:

Greek characters are supported by Unicode. If entered as regular text, they behave no differently from other text.
If entered in a <code> block, they are displayed in their entity form. This results in

&pi; = 3.14
&phi; = 1.62

instead of
[...]

You mean in the source code of a page?
I think that HTML entities are generated everywhere in the page content, not only between <code> and </code> tags. It also happens with characters like á (&aacute;), ä (&auml;), etc.

Not a serious problem: the browser renders the page the same either way, so the user doesn't notice.

...But anyway, I like your suggested patch: the page's source code is more readable if it is in German, French, Spanish, etc. (and I believe it wouldn't break anything)

Re: Bug: htmlentities messes code blocks

The problem arose because I use Markdown syntax while writing pages. Basically, Markdown is a frontend which parses the content of each page. So it is "silly" to store entities inside the XML files for characters that are fully supported by Unicode.

Re: Bug: htmlentities messes code blocks

Ah, are you using Zegnåt's Markdown plugin...?

9 (edited by intelx86 2012-02-06 15:00:53)

Re: Bug: htmlentities messes code blocks

Yes. The problem actually arises with the plugin, but basically it's a structural problem of GetSimple (the use of htmlentities instead of htmlspecialchars).

If Greek characters are entered in a simple <pre><code></code></pre> structure, with Markdown disabled, the result is Greek characters, not their entities, though entities are used for storage.
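
A rough sketch of how the entities could end up visible inside a code block. This assumes the Markdown parser escapes "&" in code blocks (as Markdown generally does) and uses htmlspecialchars() as a stand-in for that step; the plugin's actual internals are not shown in this thread:

```php
<?php
// What htmlentities() stores for a line containing a Greek letter.
$stored = htmlentities('π = 3.14', ENT_QUOTES, 'UTF-8'); // "&pi; = 3.14"

// Stand-in (assumption) for a Markdown parser escaping code-block contents:
// the "&" of the already-stored entity is escaped a second time.
$inCodeBlock = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $inCodeBlock, "\n"; // &amp;pi; = 3.14
```

A browser displays "&amp;pi;" as the literal text "&pi;", which would match the output reported at the top of the thread.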

10 (edited by Carlos 2012-02-06 15:36:14)

Re: Bug: htmlentities messes code blocks

Strange, I get the same entities in the HTML code with Greek (or German/Spanish/etc.) characters both inside and outside <pre><code></code></pre> blocks.

(Anyway, as I said, I like your patch. I think that making that change in the core would be good.)

Re: Bug: htmlentities messes code blocks

@intelx86 , thanks for the patch, I've only seen it now.

Updated the SVN with it...

Currently working on The Matrix Plugin...