* [gentoo-user] utf8_general_ci
@ 2015-05-05 15:32 Joseph
2015-05-05 16:32 ` Fernando Rodriguez
2015-05-06 22:14 ` [gentoo-user] MySQL utf8 support (was Re: utf8_general_ci) Harm Geerts
0 siblings, 2 replies; 5+ messages in thread
From: Joseph @ 2015-05-05 15:32 UTC
To: gentoo-user
I have my MySQL database "Collation" set to utf8_general_ci,
but when a customer from, for example, Japan places an order, all I see is:
&#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
Do I need to change the "Collation" setting, or is it something else?
--
Joseph
* Re: [gentoo-user] utf8_general_ci
From: Fernando Rodriguez @ 2015-05-05 16:32 UTC
To: gentoo-user
On Tuesday, May 05, 2015 9:32:15 AM Joseph wrote:
> I have my mysql database "Collation" set as: utf8_general_ci
>
> but when a customer from for example Japan places an order all I see is:
>
>
> &#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
>
> Do I need to change "Collation" setting to something else or something else?
>
>
I think that's because the web application runs the data through something
like PHP's htmlspecialchars() or similar to help prevent SQL injections. So
you'll need to either decode it before using it (I think you can use
app-text/recode), or use a different method to filter anything that could be
malicious SQL.
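One possible "different method" is a parameterized query, which keeps the text out of the SQL string so it can be stored as plain UTF-8 without any HTML escaping. The sketch below is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL, and the table name, column name, and sample input are made up, not taken from osCommerce.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, street TEXT)")

# Hypothetical input: Japanese text followed by an injection attempt.
street = "\u7af9\u9f3b\u7acb\u539f\u753a5-5'; DROP TABLE orders; --"

# The placeholder (?) passes the value separately from the SQL text,
# so it is stored verbatim and never interpreted as SQL.
conn.execute("INSERT INTO orders (street) VALUES (?)", (street,))
conn.commit()

stored = conn.execute("SELECT street FROM orders").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(stored == street, row_count)
```

With this approach the HTML escaping can be left to the point where the text is actually written into a page.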
--
Fernando Rodriguez
* Re: [gentoo-user] utf8_general_ci
From: Joseph @ 2015-05-05 17:03 UTC
To: gentoo-user
On 05/05/15 12:32, Fernando Rodriguez wrote:
>On Tuesday, May 05, 2015 9:32:15 AM Joseph wrote:
>> I have my mysql database "Collation" set as: utf8_general_ci
>>
>> but when a customer from for example Japan places an order all I see is:
>>
>>
>> &#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
>>
>> Do I need to change "Collation" setting to something else or something else?
>>
>>
>
>I think that's because the web application runs the data through something
>like PHP's htmlspecialchars() or similar to help prevent SQL injections. So
>you'll need to either decode it before using it (I think you can use
>app-text/recode), or use a different method to filter anything that could be
>malicious SQL.
I've saved the relevant information into a text file (address.txt) and tried to run:
recode ISO-8859-9..UTF8 < address.txt > address2.txt
&#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
&#23665;&#31185;&#21306;
&#20140;&#37117;&#24066;, 601-8015
&#20140;&#37117;&#24220;, Japan
It didn't help. How do you run "recode" correctly?
Yes, the customer is using the osCommerce PHP application to provide the information.
--
Joseph
* Re: [gentoo-user] utf8_general_ci
From: Fernando Rodriguez @ 2015-05-05 17:53 UTC
To: gentoo-user
On Tuesday, May 05, 2015 11:03:38 AM Joseph wrote:
> On 05/05/15 12:32, Fernando Rodriguez wrote:
> >On Tuesday, May 05, 2015 9:32:15 AM Joseph wrote:
> >> I have my mysql database "Collation" set as: utf8_general_ci
> >>
> >> but when a customer from for example Japan places an order all I see is:
> >>
> >>
> >> &#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
> >>
> >> Do I need to change "Collation" setting to something else or something else?
> >>
> >>
> >
> >I think that's because the web application runs the data through something
> >like PHP's htmlspecialchars() or similar to help prevent SQL injections. So
> >you'll need to either decode it before using it (I think you can use
> >app-text/recode), or use a different method to filter anything that could be
> >malicious SQL.
>
> I've saved the relevant information into a TXT file (address.txt) and tried to run: recode ISO-8859-9..UTF8 < address.txt > address2.txt
>
> &#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
> &#23665;&#31185;&#21306;
> &#20140;&#37117;&#24066;, 601-8015
> &#20140;&#37117;&#24220;, Japan
>
> It didn't help. How do you run "recode" correctly?
> Yes, the customer is using the osCommerce PHP application to provide the information.
It looks like they ran it through the encoding function twice. This worked for
me:
recode html..utf8 < test.txt | recode html..utf8
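If recode is not at hand, the same double decode can be sketched in Python with html.unescape from the standard library. This is only a sketch under an assumption: that the stored text really was escaped twice, so each character looks like &amp;#31481; in the raw data. The function name and the sample string are made up for illustration.

```python
import html


def decode_entities(text: str, max_passes: int = 5) -> str:
    """Unescape HTML entities repeatedly until the text stops changing."""
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
    return text


# Hypothetical sample: the first characters of the address, escaped twice.
double_encoded = "&amp;#31481;&amp;#40763;&amp;#31435;&amp;#21407;&amp;#30010;"
print(decode_entities(double_encoded))
```

Looping until the text stops changing also handles single-escaped input, but it would over-decode text that is meant to contain a literal entity, so it is best applied only to fields known to be affected.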
--
Fernando Rodriguez
* [gentoo-user] MySQL utf8 support (was Re: utf8_general_ci)
From: Harm Geerts @ 2015-05-06 22:14 UTC
To: gentoo-user
On Tuesday 05 May 2015 09:32:15 Joseph wrote:
> I have my mysql database "Collation" set as: utf8_general_ci
>
> but when a customer from for example Japan places an order all I see is:
>
> &#31481;&#40763;&#31435;&#21407;&#30010;&#65301;&#65293;&#65301;
>
> Do I need to change "Collation" setting to something else or something else?
I'm not sure which character codes are used for Japanese, but it's worth noting
that MySQL's utf8 encoding is a partial implementation that supports at most 3
bytes per character.
For full UTF-8 support you'll need to use the utf8mb4 encoding.
https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8mb4.html
https://mathiasbynens.be/notes/mysql-utf8mb4
Note that this is not relevant to your problem, which is covered by
Fernando Rodriguez's reply.
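A quick way to check whether a given string would run into the 3-byte limit is to look at how many bytes each character takes in UTF-8. The sketch below is illustrative only; the helper name is made up, and the sample is the address from this thread written with Unicode escapes.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character needs 4 bytes in UTF-8 (code point above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# The address line from the original post, written as Unicode escapes.
address = "\u7af9\u9f3b\u7acb\u539f\u753a\uff15\uff0d\uff15"
byte_lengths = [len(ch.encode("utf-8")) for ch in address]

print(byte_lengths)                   # bytes per character
print(needs_utf8mb4(address))         # the address itself
print(needs_utf8mb4("\U0001F600"))    # an emoji, outside the 3-byte range
```

Every character in this address takes 3 bytes, so it fits in MySQL's utf8; characters above U+FFFF, such as emoji, are the ones that need utf8mb4.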