Can't open Unicode URL with Python

Website Building 820  Updated: 2025-06-08 17:01:50

Using Python 2.5.2 and Linux Debian, I'm trying to get the content from a Spanish URL that contains a Spanish char 'í':

import urllib
url = u'http://mydomain.es/índice.html'
content = urllib.urlopen(url).read()

I'm getting this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 8: ordinal not in range(128)

Before passing the URL to urllib, I tried this:

url = urllib.quote(url)

and this:

url = url.encode('UTF-8')

but neither worked.

Can you tell me what I am doing wrong?

Best answer

Per the applicable standard, RFC 1738, URLs can only contain ASCII characters. Good explanation here, and I quote:

"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."

As the URLs I've given explain, this probably means you'll have to replace that "lowercase i with acute accent" with '%ED'.
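To make the percent-encoding step concrete, here is a minimal sketch. It is not part of the original answer and rests on a few assumptions: it uses Python 3's urllib.parse rather than the Python 2.5 urllib module from the question (the rough Python 2 counterparts are urllib.quote and urlparse.urlsplit), and the helper name percent_encode_path is hypothetical. Only the path is quoted, because quoting the whole URL would also escape the ':' after the scheme. Which byte encoding the server expects is an assumption too: '%ED' is the Latin-1 form of 'í', while UTF-8 gives '%C3%AD'.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_path(url, encoding="utf-8"):
    """Percent-encode only the path of `url` (hypothetical helper).

    The scheme, host, query and fragment are passed through unchanged.
    """
    parts = urlsplit(url)
    # safe="/" keeps the path separators; non-ASCII characters are first
    # encoded to bytes with `encoding`, then written as %XX escapes.
    encoded_path = quote(parts.path, safe="/", encoding=encoding)
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


url = "http://mydomain.es/índice.html"

# UTF-8 is the common choice on current servers (assumption).
utf8_url = percent_encode_path(url)

# Latin-1 yields the '%ED' form mentioned in the answer.
latin1_url = percent_encode_path(url, encoding="latin-1")

print(utf8_url)
print(latin1_url)
```

Either resulting string is pure ASCII and can be handed to a URL opener; if one form returns a 404, the server most likely expects the other encoding.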

Published: 2023-08-28

Article link: http://torson.com.cn/wangzhan/1693208380a697203.html
