Preprocessing the Bank Marketing Dataset


This post uses the Bank Marketing dataset. Official download page:
UCI Machine Learning Repository

The author downloaded a copy and shares it free of charge.

Link: =e3f6
Extraction code: e3f6

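This post follows a reference article.

Because the code in that article was posted as images, the text version is given here after it ran successfully. Parts of the code were also adjusted slightly; for example, one-hot encoding is not applied to every feature column.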
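To illustrate what encoding only some columns looks like, here is a minimal sketch on a toy frame. The rows are made up for illustration; only the column names mirror the dataset. It assumes `pd.get_dummies` is called with an explicit `columns` list, which is one common way to restrict the encoding and not necessarily how the reference article did it.

```python
import pandas as pd

# Toy frame with hypothetical values; column names mirror the dataset.
toy = pd.DataFrame({
    "age": [30, 45, 52],
    "job": ["admin.", "technician", "admin."],
    "marital": ["married", "single", "married"],
    "education": ["high.school", "basic.9y", "university.degree"],
})

# Encode only the columns named in `columns`; the rest pass through unchanged.
encoded = pd.get_dummies(toy, columns=["job", "marital"])
print(encoded.columns.tolist())
```

Columns left out of the list, such as `education` here, keep their original values and can be handled separately, for example with an ordinal mapping.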

import random

import pandas as pd

# The raw file is semicolon-delimited
df = pd.read_csv('bank-additional-full.csv', encoding='utf-8-sig', sep=';')
print(df.tail())

# Count the "unknown" entries in each string-valued column
for col in df.columns:
    if type(df[col][0]) is str:
        print("unknown value count in " + col + " is " + str(df[df[col] == 'unknown']['y'].count()))

# Handle missing values
df.loc[df["job"] == "unknown", "job"] = "admin."
df.loc[df["marital"] == "unknown", "marital"] = "married"
df.loc[df["education"] == "unknown", "education"] = "university.degree"
# Note: random.choice runs once, so every unknown row receives the same value
df.loc[df["housing"] == "unknown", "housing"] = random.choice(["yes", "no"])
df.loc[df["loan"] == "unknown", "loan"] = "no"
df = df.drop(["default"], axis=1)

# Encode the data
# Binary yes/no columns become 0/1
for col in df.columns:
    if type(df[col][0]) is str:
        df.loc[df[col] == "no", col] = 0
        df.loc[df[col] == "yes", col] = 1

# Map the remaining categorical columns to integer codes
df.education = df.education.replace({"illiterate": 1, "basic.4y": 2, "basic.6y": 3, "basic.9y": 4, "high.school": 5, "professional.course": 6, "university.degree": 7})
df.month = df.month.replace({"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6, "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12})
df.day_of_week = df.day_of_week.replace({"mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5})
df.contact = df.contact.replace({"cellular": 0, "telephone": 1})
df.poutcome = df.poutcome.replace({"failure": 0, "nonexistent": 1, "success": 2})
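# One-hot encode the selected columns; passing `columns` explicitly also covers
# 'contact', which was already mapped to 0/1 above
encoded_cols = pd.get_dummies(df[['job', 'contact', 'marital']], columns=['job', 'contact', 'marital'])
# Merge the one-hot columns with the original data
df = pd.concat([df, encoded_cols], axis=1)
# Drop the original 'job', 'contact', 'marital' columns
df = df.drop(['job', 'contact', 'marital'], axis=1)
# Move the 'y' column to the end
y_column = df.pop('y')  # remove 'y' and return it
df['y'] = y_column  # append 'y' as the last column
# Save the processed data to a CSV file
df.to_csv('processed_data.csv', index=True)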
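One detail of the listing above is worth a second look: the `housing` line calls `random.choice` a single time, so all "unknown" rows end up with the same value. If a separate draw per row is wanted instead, the following is a minimal sketch of one way to do it. It uses a toy frame with made-up values and assumes a seeded NumPy generator; this is an alternative, not what the original code does.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical values.
toy = pd.DataFrame({"housing": ["yes", "unknown", "no", "unknown", "unknown"]})

rng = np.random.default_rng(0)  # fixed seed so the result is reproducible
mask = toy["housing"] == "unknown"

# Draw one value per unknown row rather than one value for all rows.
toy.loc[mask, "housing"] = rng.choice(["yes", "no"], size=int(mask.sum()))
print(toy["housing"].tolist())
```

Rows that already held "yes" or "no" are untouched because the assignment is restricted to `mask`.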

Published: 2024-02-02 03:55:57

Article link: https://www.4u4v.net/it/170681765541197.html
