Corrupted XML files
#1
Hello

I'm working on a site hosted on a Norwegian server (sysedata.no). At first there were few issues (just a login problem in Safari 4 after installation; it was the first time I wasn't redirected to the admin panel after installing), but it's getting worse, and I guess the server configuration is the cause. First there was trouble saving files: the XML files got corrupted, but only when the client tried to edit them. I fixed that. Now it's not possible to log in to the admin panel, and debug mode shows quite a long list of errors when I try to reach it. It looks like this:

Code:
Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 42: parser error : CData section not finished <p>  Hvem er vi?<br /> AAT in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: <br /> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 42: parser error : PCDATA invalid Char value 12 in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: <br /> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 82: parser error : Sequence ']]>' not allowed in content in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: E-post: luna@aatranslator.no</p> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 82: parser error : Sequence ']]>' not allowed in content in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 82: parser error : internal error in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 82: parser error : Extra content at the end of the document in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 34

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 35

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 36

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 37

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 38

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 39

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 44

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 46

Notice: Trying to get property of non-object in /rgt/207/67279/www/temp/admin/pages.php on line 47

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1291224 bytes) in /rgt/207/67279/www/temp/admin/inc/template_functions.php on line 913

There are a few plugins installed, mostly the I18N pack (with custom fields), but I have set up several sites like this on different hosts and this is the first time I've had such problems.

Any ideas? I'm not a PHP programmer, so I rely on you guys.
Reply
#2
What is the output of the server health check?
That is the most important information for the analysis.

Do you have a backup? Page backups or ZIP backups?
|--

The German-language GetSimple (sub)forum:   http://get-simple.info/forums/forumdisplay.php?fid=18
Reply
#3
Hello

Currently I can't log in (I get the errors from my previous post), but the site is working (minus one page). Last time I checked it was OK, except for one page that was corrupted (the one that isn't visible right now). I restored it back then (overwriting the I18N Navigation structure helped).
The most serious message from the server seems to be
Code:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1291224 bytes) in /rgt/207/67279/www/temp/admin/inc/template_functions.php on line 913

but to be honest I don't know what it means.

I can set the site up again, as it is still in the development stage, but I'm afraid it will do the same next time, as I think it's something with the server configuration.
Reply
#4
blazejs Wrote:Hello

Currently I can't log in (I get the errors from my previous post), but the site is working (minus one page). Last time I checked it was OK, except for one page that was corrupted (the one that isn't visible right now). I restored it back then (overwriting the I18N Navigation structure helped).

Do you mean i18n_menu_cache.xml?
You can just delete it and it will be recreated.

blazejs Wrote:The most serious message from the server seems to be
Code:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1291224 bytes) in /rgt/207/67279/www/temp/admin/inc/template_functions.php on line 913

but to be honest I don't know what it means.

So far I've only seen this error message when trying to create thumbnails for I18N Gallery from large (> 10 megapixel) images. In theory it could also happen when indexing lots of pages with I18N Search.

And, if you use GetSimple 3.1, this will probably happen when GetSimple tries to create the page cache for a large site.
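
For reference, the 134217728 bytes in the fatal error correspond to a PHP memory limit of 128M. A minimal sketch of that conversion, assuming PHP 7 or later; `shorthand_to_bytes` is a hypothetical helper name for illustration, not part of GetSimple:

```php
<?php
// Hypothetical helper: convert a php.ini shorthand size ("128M", "1G") to bytes.
function shorthand_to_bytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    $unit = strtolower(substr($value, -1)); // last character may be K, M or G
    $number = (int) $value;                 // leading digits, e.g. 128 from "128M"
    switch ($unit) {
        case 'g':
            return $number * 1024 * 1024 * 1024;
        case 'm':
            return $number * 1024 * 1024;
        case 'k':
            return $number * 1024;
        default:
            return $number;                 // plain byte count (or -1 for unlimited)
    }
}

// Compare the configured limit with the figure from the error message.
echo 'Configured limit: ', shorthand_to_bytes(ini_get('memory_limit')), " bytes\n";
echo '128M in bytes:    ', shorthand_to_bytes('128M'), "\n";
```

If the configured limit already matches the figure in the error, the script really did use all 128 MB, which points at a runaway loop or a very large allocation rather than a low server setting.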
I18N, I18N Search, I18N Gallery, I18N Special Pages - essential plugins for multi-language sites.
Reply
#5
@mvlcek

I have to disagree that the caching function of 3.1 might cause memory problems with large sites.

I've just created a site with 5100 pages:

o - every page using the I18N Custom Fields plugin with 3 extra fields and a textarea
o - all fields have data
o - all cached

The total memory used before and after the cache array is read is:

Start memory usage: 1 MB
End memory usage: 1.5 MB

Although this is way beyond what GS is designed for, it can be done. It's a little sluggish on the backend when creating the page cache after a page update, as it has to read in each file to create pages.xml, but it is usable. A lot more usable than what we had before, reading in every file on each page refresh.
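
A measurement like this can be sketched with PHP's `memory_get_usage()`. The snippet below is only an illustration: `build_fake_page_cache` is a hypothetical stand-in that builds an array of page metadata, not the real GetSimple cache loader, so the numbers it prints will differ from the ones above.

```php
<?php
// Hypothetical stand-in for reading pages.xml: one metadata entry per page.
function build_fake_page_cache(int $pages): array
{
    $cache = [];
    for ($i = 0; $i < $pages; $i++) {
        $cache["page-$i"] = ['title' => "Page $i", 'menuOrder' => $i, 'private' => ''];
    }
    return $cache;
}

$before = memory_get_usage();            // bytes in use before the cache is built
$cache  = build_fake_page_cache(5100);
$after  = memory_get_usage();            // bytes in use with the cache in memory

printf("Start memory usage: %.2f MB\n", $before / 1048576);
printf("End memory usage:   %.2f MB\n", $after / 1048576);
```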

Front end is as responsive as ever.
My Github Repos: Github
Website: DigiMute
Reply
#6
n00dles101 Wrote:I have to disagree that the caching function of 3.1 might cause memory problems with large sites.

Then I assume that only the metadata without the content itself is written to the cache file?
Reply
#7
Yes, content is never cached; we read that in directly.

It was designed primarily for caching menus, getting child pages, etc.

And on normal pages there are only 2 file reads: one for the pages.xml file and one for the content.

Before the caching we had to read in every file on the system just to create the menu.

Mike...
Reply
#8
I have deleted all plugins and page files (there were fewer than 10 pages) and I finally logged in to the admin panel. There's still one warning in debug mode:
Code:
Warning: Invalid argument supplied for foreach() in /rgt/207/67279/www/temp/admin/inc/template_functions.php on line 895

There are no warnings in the health check right now. But I've noticed the error log file is very big (ca. 4 MB). Do you want to have a look at it? I can put it somewhere for download.

It's GS 3.0 BTW.
Reply
#9
Make sure you have index.xml in the pages folder. This is required for the system to operate.
Reply
#10
Thank you! I have restored the pages and so far it seems to work. I hope there won't be any more trouble with that :)
Reply
#11
Unfortunately it happened again... This time I have a report from the health check:

Code:
Data File Integrity Check

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 53: parser error : CData section not finished &lt;p&gt; AAT er en gruppe h&amp;oslash;yt utdann in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 53: parser error : PCDATA invalid Char value 12 in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 93: parser error : Sequence ']]>' not allowed in content in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: E-post: luna@aatranslator.no&lt;/p&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 93: parser error : Sequence ']]>' not allowed in content in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 93: parser error : internal error in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 93: parser error : Extra content at the end of the document in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ]]></content><private><![CDATA[]]></private><menuOrder><![CDATA[5]]></menuOrder> in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

and also

Code:
/backups/pages/hvem-er-vi.bak.xml    XML Invalid - Error!

Just this. I had to delete pages to get into the admin panel.

What do you think it could be? Something with the server, or trouble with entities in the XML files?
Reply
#12
OK. Now I'm almost 100% sure that the problem is in the entity encoding. I have a text that will crash any GetSimple installation (tested on two different hosts).
The text is Norwegian, with some French letters as well; in general, lots of letters that get converted to entities.

I tried

Code:
# WYSIWYG Editor Options
define('GSEDITOROPTIONS', 'entities : false');

but this has no effect on the XML file, as all national letters are still converted to entities in the XML. What is strange is that this happens only with Western European letters, while Central European letters (e.g. ń, ź, ć, ą), except for the famous ó, are always stored as plain characters (no entities).

I also tried this: http://get-simple.info/forum/post/12471/#p12471 - no effect.

So when I put my text into the editor, I get lots of warnings like this when trying to save:

Code:
Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 41: parser error : CData section not finished &lt;p&gt; er en gruppe h&oslash;yt utdannede in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 41: parser error : PCDATA invalid Char value 12 in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 44: parser error : Entity 'bull' not defined in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &bull; Statsautorisert translat&oslash;r fransk&ndash;norsk (2008)&lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 44: parser error : Entity 'oslash' not defined in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &bull; Statsautorisert translat&oslash;r fransk&ndash;norsk (2008)&lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 44: parser error : Entity 'ndash' not defined in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: &bull; Statsautorisert translat&oslash;r fransk&ndash;norsk (2008)&lt;br /&gt; in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 45: parser error : Entity 'bull' not defined in /rgt/207/67279/www/temp/admin/inc/basic.php on line 219

And this happens only on that one page; national characters on other pages are OK.

Is there any way to have all those national characters saved as e.g. å instead of &aring;? Or actually &amp;aring;.
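
As background: XML itself predefines only five entities (amp, lt, gt, quot, apos), which is why the parser reports 'oslash', 'bull' and 'ndash' as not defined once it ends up reading the page content as ordinary markup. A minimal sketch of turning such HTML entities back into plain UTF-8 characters with PHP's `html_entity_decode()` follows. It is only an illustration of the idea, not how GetSimple stores pages, and the sample string is taken from the warnings above.

```php
<?php
// Sample text with HTML named entities, as seen in the parser warnings.
$stored = 'Statsautorisert translat&oslash;r fransk&ndash;norsk';

// Decode named entities into real UTF-8 characters.
// Caveat: this also decodes &lt;, &gt; and &amp;, so it is not safe to apply
// blindly to content that must stay escaped.
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n";
```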
Reply
#13
blazejs Wrote:Is there any way to have all those national characters saved as e.g. å instead of &aring;? Or actually &amp;aring;.
Oh yes, that is already underway and will be introduced in one of the new versions.
blazejs Wrote:I have a text that will crash any GetSimple installation (tested on two different hosts).
Paste it, paste it!!! I know a few strange hosts that are perfect for checking it.
Reply
#14
blazejs Wrote:I have a text that will crash any GetSimple installation (tested on two different hosts).

I'm also very interested in testing that...
Reply
#15
Looking at this error:
Code:
Entity: line 44: parser error : Entity 'bull' not defined
led me to the conclusion that you pasted some special characters that CKEditor wasn't able to process.
Most of the errors you posted are related to faulty encoding.

Try adding CKEditor's "Paste from Word" button and using it when you paste text, or clean up the character encoding by passing the text through Notepad2.

There's another thing: do you use Norwegian PHP locale settings?
Code:
setlocale(LC_ALL, 'nb_NO.UTF-8', 'nn_NO.UTF-8', 'no_NO.UTF-8');
Of course the Norwegian locale has to exist on the server, but if it's a Norwegian server, chances are it is already installed.


Edit: try to recreate all existing pages as new files, to get rid of the character encoding problems.

And one last question: have you used CKEditor as the GS WYSIWYG editor since installation?
Addons: blue business theme, Online Visitors, Notepad
Reply
#16
Hej guys!

I think I have found the source of all the problems. It's very strange, and indeed it's one of the characters that could not be processed: an invisible character I haven't seen before. It looks like an end-of-paragraph mark with additional strokes; see the attachment.

I have no idea where it comes from; it was in the original file and got copied every time. The screenshot is from Panic's Coda with invisible characters turned on.
Reply
#17
I wonder if we can implement something like this in the core:

http://stackoverflow.com/questions/14978...php-string

I would probably add it right here: http://code.google.com/p/get-simple-cms/...ic.php#621
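
A minimal sketch of that idea, assuming the culprit is the character the parser reports as "Char value 12". `strip_invalid_xml_chars` is a hypothetical helper name, and this is not necessarily the fix that ends up in the core. It removes the control characters that XML 1.0 does not allow and shows that the same document parses afterwards:

```php
<?php
// Hypothetical helper: drop control characters that XML 1.0 forbids.
// Below 0x20 only tab (0x09), line feed (0x0A) and carriage return (0x0D) are allowed.
function strip_invalid_xml_chars(string $text): string
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text);
}

// Collect parser errors quietly instead of printing a warning for each one.
libxml_use_internal_errors(true);

// A page-like document with a form feed (character 12) inside the CDATA section.
$bad = "<item><content><![CDATA[Hvem er vi?\x0C]]></content></item>";

$before = simplexml_load_string($bad);                          // parsing fails
$after  = simplexml_load_string(strip_invalid_xml_chars($bad)); // parsing succeeds

var_dump($before === false);
echo (string) $after->content, "\n";
```

Because every byte below 0x20 is a single-byte character in UTF-8, this byte-wise pattern does not touch multi-byte letters such as ø or å.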
- Chris
Thanks for using GetSimple! - Download

Please do not email me directly for help regarding GetSimple. Please post all your questions/problems in the forum!
Reply
#18
blazejs Wrote:one of the characters that could not be processed
Could you paste it here
http://andrew.hedges.name/experiments/obfuscator/
to check whether it has any unicode number?
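
The same check can be sketched locally in PHP. `find_control_chars` is a hypothetical helper that reports the byte offset and code of every control character in a string; the sample line assumes the invisible character is the one the parser reported as value 12.

```php
<?php
// Hypothetical helper: list the byte offset and code of each control character,
// ignoring tab (9), line feed (10) and carriage return (13).
function find_control_chars(string $text): array
{
    $found = [];
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $code = ord($text[$i]);
        if ($code < 0x20 && $code !== 0x09 && $code !== 0x0A && $code !== 0x0D) {
            $found[] = ['offset' => $i, 'code' => $code];
        }
    }
    return $found;
}

// Sample line ending in an invisible form feed (assumed for illustration).
$line = "[Illustrasjon til denne siden]\x0C";

foreach (find_control_chars($line) as $hit) {
    printf("Control character %d at byte offset %d\n", $hit['code'], $hit['offset']);
}
```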
Reply
#19
I have pasted the full line:

[Illustrasjon til denne siden: vedlagt foto ”Våre tjenester”]

That character comes after the ] but is invisible. I got something like this:

Code:
[illustrasjon til denne siden: vedlagt foto ”våre tjenester”]

I don't know whether it got encoded.
Reply
#20
Looks like it is twelve. But if this forum script handles it safely in posts (it is displayed, though strangely after copy-paste from your post), GS could too.
For the moment, I tried pasting it into the editing field of a page served by 3.0, and the result is that I can't access admin/pages.php from that moment on (the browser displays a blank page). All other pages display correctly, but be careful with tests! It looks like a bug in the page-rendering engine of GS.

Hmmm.
Reply
#21
As I said, all the other pages displayed correctly, frontend pages as well as the administrative backend. Only the page with the dangerous character displayed as a blank template with no content (probably this character crashes something triggered by get_page_content()). After restarting the browser the login page displayed correctly, but after logging in, admin/pages.php remained blank, so I changed the address manually to admin/backups.php and restored the earlier copy of the page into which I had pasted the treacherous character. All looks fine now, but this bug deserves a fix.
Reply
#22
Yes, nice bug catch. I put it into the title of my index page to test and it has completely borked my test site.
Even after removing it I cannot view anything, frontend or admin 8)

Edit: I just downloaded the latest SVN version and did the same test, and all is working fine... strange...

Edit again: I've just installed both 3.0 and 3.1 and I can no longer replicate this problem. I've copied the whole string into the title and content sections and all is working fine...
Reply
#23
n00dles101 Wrote:I've just installed both 3.0 and 3.1 and I can no longer replicate this problem. I've copied the whole string into the title and content sections and all is working fine...
Well, I pasted just the last character of the problematic string as the first character of the page content. Moreover, I did it on a page of an already existing site with a UTF-8 template, not on a fresh install. Is there any file related to the page structure that is always used by pages.php and not by the other screens of the admin backend?
Reply
#24
Yep, that's the result: you can't get into the control panel. Only deleting the file with that character helps.
Reply
#25
This should be fixed now in SVN - http://code.google.com/p/get-simple-cms/...ail?id=281
- Chris
Reply



