Punctuation characters in RSS import

Paul Carpenter

Tuesday 07 September 2004 3:43:13 pm

RSS import is working fine, except that special characters in titles and descriptions show up in the browser with the ampersand converted into the &amp; sequence. The user sees &#039; instead of the punctuation character.

The characters are coming in correctly from the RSS feed. What comes in as &#039; is being converted to &amp;#039;.

Can someone point me to where this conversion is taking place or how to reverse the conversion?

Chalda Pnuzig

Tuesday 11 September 2007 2:42:43 am

I added this to rssimport.php, in the function getCDATA (just before the return):

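	// Map &apos;, Windows-1252 numeric references and HTML named entities
	// to numeric references that html_entity_decode() can handle.
	$char=array('&apos;'=>'&#039;', '&minus;'=>'-', '&circ;'=>'^', '&tilde;'=>'~',
		'&#138;'=>'&#352;', '&#139;'=>'&#8249;', '&#140;'=>'&#338;', '&#145;'=>'&#8216;', '&#146;'=>'&#8217;', '&#147;'=>'&#8220;', '&#148;'=>'&#8221;', '&#149;'=>'&#8226;',
		'&#150;'=>'&#8211;', '&#151;'=>'&#8212;', '&#152;'=>'&#732;', '&#153;'=>'&#8482;', '&#154;'=>'&#353;', '&#155;'=>'&#8250;', '&#156;'=>'&#339;', '&#159;'=>'&#376;',
		'&yuml;'=>'&#255;', '&OElig;'=>'&#338;', '&oelig;'=>'&#339;', '&Scaron;'=>'&#352;', '&scaron;'=>'&#353;', '&Yuml;'=>'&#376;', '&fnof;'=>'&#402;', '&circ;'=>'&#710;', '&tilde;'=>'&#732;',
		'&Alpha;'=>'&#913;', '&Beta;'=>'&#914;', '&Gamma;'=>'&#915;', '&Delta;'=>'&#916;', '&Epsilon;'=>'&#917;', '&Zeta;'=>'&#918;', '&Eta;'=>'&#919;', '&Theta;'=>'&#920;',
		'&Iota;'=>'&#921;', '&Kappa;'=>'&#922;', '&Lambda;'=>'&#923;', '&Mu;'=>'&#924;', '&Nu;'=>'&#925;', '&Xi;'=>'&#926;', '&Omicron;'=>'&#927;', '&Pi;'=>'&#928;',
		'&Rho;'=>'&#929;', '&Sigma;'=>'&#931;', '&Tau;'=>'&#932;', '&Upsilon;'=>'&#933;', '&Phi;'=>'&#934;', '&Chi;'=>'&#935;', '&Psi;'=>'&#936;', '&Omega;'=>'&#937;',
		'&alpha;'=>'&#945;', '&beta;'=>'&#946;', '&gamma;'=>'&#947;', '&delta;'=>'&#948;', '&epsilon;'=>'&#949;', '&zeta;'=>'&#950;', '&eta;'=>'&#951;', '&theta;'=>'&#952;',
		'&iota;'=>'&#953;', '&kappa;'=>'&#954;', '&lambda;'=>'&#955;', '&mu;'=>'&#956;', '&nu;'=>'&#957;', '&xi;'=>'&#958;', '&omicron;'=>'&#959;', '&pi;'=>'&#960;',
		'&rho;'=>'&#961;', '&sigmaf;'=>'&#962;', '&sigma;'=>'&#963;', '&tau;'=>'&#964;', '&upsilon;'=>'&#965;', '&phi;'=>'&#966;', '&chi;'=>'&#967;', '&psi;'=>'&#968;', '&omega;'=>'&#969;',
		'&thetasym;'=>'&#977;', '&upsih;'=>'&#978;', '&piv;'=>'&#982;',
		'&ensp;'=>'&#8194;', '&emsp;'=>'&#8195;', '&thinsp;'=>'&#8201;', '&zwnj;'=>'&#8204;', '&zwj;'=>'&#8205;', '&lrm;'=>'&#8206;', '&rlm;'=>'&#8207;',
		'&ndash;'=>'&#8211;', '&mdash;'=>'&#8212;', '&lsquo;'=>'&#8216;', '&rsquo;'=>'&#8217;', '&sbquo;'=>'&#8218;', '&ldquo;'=>'&#8220;', '&rdquo;'=>'&#8221;', '&bdquo;'=>'&#8222;',
		'&dagger;'=>'&#8224;', '&Dagger;'=>'&#8225;', '&bull;'=>'&#8226;', '&hellip;'=>'&#8230;', '&permil;'=>'&#8240;', '&prime;'=>'&#8242;', '&Prime;'=>'&#8243;',
		'&lsaquo;'=>'&#8249;', '&rsaquo;'=>'&#8250;', '&oline;'=>'&#8254;', '&frasl;'=>'&#8260;', '&euro;'=>'&#8364;',
		'&image;'=>'&#8465;', '&weierp;'=>'&#8472;', '&real;'=>'&#8476;', '&trade;'=>'&#8482;', '&alefsym;'=>'&#8501;',
		'&larr;'=>'&#8592;', '&uarr;'=>'&#8593;', '&rarr;'=>'&#8594;', '&darr;'=>'&#8595;', '&harr;'=>'&#8596;', '&crarr;'=>'&#8629;',
		'&lArr;'=>'&#8656;', '&uArr;'=>'&#8657;', '&rArr;'=>'&#8658;', '&dArr;'=>'&#8659;', '&hArr;'=>'&#8660;',
		'&forall;'=>'&#8704;', '&part;'=>'&#8706;', '&exist;'=>'&#8707;', '&empty;'=>'&#8709;', '&nabla;'=>'&#8711;', '&isin;'=>'&#8712;', '&notin;'=>'&#8713;', '&ni;'=>'&#8715;',
		'&prod;'=>'&#8719;', '&sum;'=>'&#8721;', '&minus;'=>'&#8722;', '&lowast;'=>'&#8727;', '&radic;'=>'&#8730;', '&prop;'=>'&#8733;', '&infin;'=>'&#8734;', '&ang;'=>'&#8736;',
		'&and;'=>'&#8743;', '&or;'=>'&#8744;', '&cap;'=>'&#8745;', '&cup;'=>'&#8746;', '&int;'=>'&#8747;', '&there4;'=>'&#8756;', '&sim;'=>'&#8764;', '&cong;'=>'&#8773;', '&asymp;'=>'&#8776;',
		'&ne;'=>'&#8800;', '&equiv;'=>'&#8801;', '&le;'=>'&#8804;', '&ge;'=>'&#8805;', '&sub;'=>'&#8834;', '&sup;'=>'&#8835;', '&nsub;'=>'&#8836;', '&sube;'=>'&#8838;', '&supe;'=>'&#8839;',
		'&oplus;'=>'&#8853;', '&otimes;'=>'&#8855;', '&perp;'=>'&#8869;', '&sdot;'=>'&#8901;', '&lceil;'=>'&#8968;', '&rceil;'=>'&#8969;', '&lfloor;'=>'&#8970;', '&rfloor;'=>'&#8971;',
		'&lang;'=>'&#9001;', '&rang;'=>'&#9002;', '&loz;'=>'&#9674;', '&spades;'=>'&#9824;', '&clubs;'=>'&#9827;', '&hearts;'=>'&#9829;', '&diams;'=>'&#9830;');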
	foreach ($char as $key=>$value)
		$textCDATA=str_replace($key,$value,$textCDATA);
	$textCDATA=iconv("UTF-8","ISO-8859-1",$textCDATA);
	$textCDATA=html_entity_decode($textCDATA, ENT_QUOTES, 'ISO-8859-1');
	$textCDATA=iconv("ISO-8859-1","UTF-8",$textCDATA);
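
On PHP 5.4 or later a shorter variant may be enough, because html_entity_decode() can target UTF-8 directly and, with ENT_HTML5, also recognises &apos;. The sketch below assumes the feed text is already valid UTF-8. decodeFeedText is a hypothetical helper name, not an eZ Publish function, and I have not tried it inside rssimport.php.

```php
<?php
// Hypothetical helper (not part of eZ Publish): decode entity references
// in feed text straight to UTF-8 characters in a single pass.
function decodeFeedText($text)
{
    // ENT_QUOTES decodes both single- and double-quote references;
    // ENT_HTML5 selects the HTML5 named-entity table, which includes &apos;.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Example call with a numeric reference and two named entities.
echo decodeFeedText('Paul&#039;s feed &ndash; &Alpha;'), "\n";
```

Unlike the lookup table, this does not remap Windows-1252 references such as &#150;, so a feed that uses them would still need that part of the table.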
