ASP汉字编码转换UTF-8及UTF-8转换GB2312

作者:admin 日期:2012-04-11

function chinese2unicode(Str)
dim i
dim Str_one
dim Str_unicode
for i=1 to len(Str)
    Str_one=Mid(Str,i,1)
    Str_unicode=Str_unicode&chr(38)
    Str_unicode=Str_unicode&chr(35)
    Str_unicode=Str_unicode&chr(120)
    Str_unicode=Str_unicode& Hex(ascw(Str_one))
    Str_unicode=Str_unicode&chr(59)
next
Response.Write Str_unicode
end function

UTF-8 To GB2312

function UTF2GB(UTFStr)
    for Dig=1 to len(UTFStr)
        if mid(UTFStr,Dig,1)="%" then
            if len(UTFStr) >= Dig+8 then
                GBStr=GBStr & ConvChinese(mid(UTFStr,Dig,9))
                Dig=Dig+8
            else
                GBStr=GBStr & mid(UTFStr,Dig,1)
            end if
        else
            GBStr=GBStr & mid(UTFStr,Dig,1)
        end if
    next
    UTF2GB=GBStr
end function

function ConvChinese(x)
    A=split(mid(x,2),"%")
    i=0
    j=0

    for i=0 to ubound(A)
        A(i)=c16to2(A(i))
    next

    for i=0 to ubound(A)-1
        DigS=instr(A(i),"0")
        Unicode=""
        for j=1 to DigS-1
            if j=1 then
                A(i)=right(A(i),len(A(i))-DigS)
                Unicode=Unicode & A(i)
            else
                i=i+1
                A(i)=right(A(i),len(A(i))-2)
                Unicode=Unicode & A(i)
            end if
        next

        if len(c2to16(Unicode))=4 then
            ConvChinese=ConvChinese & chrw(int("&H" & c2to16(Unicode)))
        else
            ConvChinese=ConvChinese & chr(int("&H" & c2to16(Unicode)))
        end if
    next
end function

Tags: asp

分类:技术文章 | 固定链接 | 评论: 0 | 引用: 0 | 查看次数: 2233

PHP判断字符串是UTF-8还是GB2312编码

作者:admin 日期:2012-04-11

当在php中使用mb_detect_encoding函数进行编码识别时，很多人都碰到过识别编码有误的问题，例如对与GB2312和UTF- 8，或者UTF-8和GBK(这里主要是对于cp936的判断),网上说是由于字符短是，mb_detect_encoding会出现误判。
例如:

复制代码代码如下:

PHP代码

$encode = mb_detect_encoding($keytitle, array("ASCII",'UTF-8′,"GB2312′,"GBK",'BIG5′));
if ($encode == “UTF-8″){
$keytitle = iconv("UTF-8″,"GBK",$keytitle);
}

这段代码的作用是检测字符串的编码是否UTF-8，是的话就转换为GBK。
可是当 $keytitle = “%D0%BE%C6%AC”;时。检测结果却是UTF-8.这个bug其实不算是bug，写程序时也不应当过于依赖mb_detect_encoding，当字符串较短时，检测结果产生偏差的可能性很大。
怎么解决呢，我的办法是：
复制代码代码如下:

PHP代码