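While scraping, the program raised the error below. Going back to the original page, I found that an emoji in one of the comments was the cause. As the message says, Tk cannot display characters outside the Basic Multilingual Plane (code points above U+FFFF), which covers most emoji. After searching on Baidu, I found that adding the statements shown after the traceback solves the problem.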
Traceback (most recent call last):
  File "C:Users萌萌哒炸鸡腿Desktoppython豆瓣书评.py", line 29, in <module>
    print(x,')',comment)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 5-5: Non-BMP character not supported in Tk
import sys
# Map every code point above U+FFFF to U+FFFD (the replacement character)
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
x = 'This works! \U0001F44D'
print(x.translate(non_bmp_map))
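To make the effect of the translation table concrete, here is a minimal sketch that wraps it in two small helpers. The names `has_non_bmp` and `to_bmp` are hypothetical and only used for illustration; they are not part of the original script.

```python
import sys

# Same table as in the snippet above: every code point above U+FFFF
# (outside the Basic Multilingual Plane) maps to U+FFFD.
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)


def has_non_bmp(text):
    """Return True if text contains a character above U+FFFF (hypothetical helper)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def to_bmp(text):
    """Replace each non-BMP character with U+FFFD; other characters are unchanged."""
    return text.translate(non_bmp_map)


sample = 'This works! \U0001F44D'
cleaned = to_bmp(sample)
# The emoji is detected before translation and gone afterwards.
print(has_non_bmp(sample), has_non_bmp(cleaned))
# One character is swapped for one character, so the length is preserved.
print(len(sample) == len(cleaned))
```

Checking with `has_non_bmp` first is optional; `to_bmp` leaves strings without emoji untouched, so it can be applied to every comment unconditionally.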
code:
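import re
import requests
from bs4 import BeautifulSoup
import sys

# Map every code point above U+FFFF to U+FFFD so Tk can print the text
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

LIST = []   # all comment texts collected across pages
urls = []
url = '='   # base URL (truncated in the source); the page number is appended
for i in range(1,25):
    urls.append(url + str(i))

for u in urls:
    response = requests.get(u)
    html = response.text
    soup = BeautifulSoup(html,'lxml')
    # Each short comment sits in <span class="short">
    List = soup.find_all('span',class_ = "short")
    for i in List:
        LIST.append(i.get_text())
print(len(LIST))

x = 1
for comment in LIST:
    # Translate before printing so emoji no longer raise UnicodeEncodeError
    print('(',x,')',comment.translate(non_bmp_map))
    x += 1
'''
import sys
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
x = 'This works! \U0001F44D'
print(x.translate(non_bmp_map))
'''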
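If dropping the emoji entirely is preferable to printing U+FFFD in their place, a regular expression is one alternative. This is a sketch of a different technique from the one the script uses; the name `remove_non_bmp` and the sample comments are hypothetical stand-ins for the scraped LIST.

```python
import re

# Matches any single code point outside the Basic Multilingual Plane.
NON_BMP_RE = re.compile('[\U00010000-\U0010FFFF]')


def remove_non_bmp(text):
    """Delete non-BMP characters (most emoji) instead of replacing them."""
    return NON_BMP_RE.sub('', text)


# Hypothetical sample data standing in for the scraped comments.
comments = ['Great book \U0001F44D', 'plain text']
for x, comment in enumerate(comments, start=1):
    print('(', x, ')', remove_non_bmp(comment))
```

Unlike the translation table, this changes the length of the string, which matters only if character positions are used later.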
Reference blog post for the fix
Published: 2024-02-04 18:46:21