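While learning data science, I wanted to pull records out of a large table of companies, using a list of row numbers that identifies the big-data companies, and collect them in a new DataFrame. Adding rows to the existing DataFrame raised an error.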
import pandas as pd

# Create a DataFrame
data = pd.DataFrame({
    'company_url': ['https://angel.co/billguard',
                    'https://angel.co/tradesparq',
                    'https://angel.co/sidewalk',
                    'https://angel.co/pangia',
                    'https://angel.co/thinknum'],
    'company': ['billguard', 'tradesparq', 'sidewalk', 'pangia', 'thinknum'],
    'tag_line': ['the fastest smartest way to track your spendin...',
                 "the world's largest social network for global ...",
                 'hoovers (d&b) for the social era',
                 'the internet of things platform: big data mana...',
                 'financial data analysis thinknum is a powerful web platform to value c...'],
    'product': ['billguard is a personal finance security app t...',
                'tradesparq is alibaba.com meets linkedin. trad...',
                'sidewalk helps companies close more sales to s...',
                'we collect and manage data from sensors embedd...',
                'thinknum is a powerful web platform to value c...'],
    'data': ['new york city · financial services · security ...',
             'shanghai · b2b · marketplaces · big data · soc...',
             'new york city · lead generation · big data · s...',
             'san francisco · saas · clean technology · big ...',
             'new york city · enterprise software · financia...']
})

# List of row numbers for the big-data companies
comp_rows = [1, 2, 3]

# Empty DataFrame to hold the filtered companies
bigdata_comp = pd.DataFrame(data=None, columns=['company_url', 'company', 'tag_line', 'product', 'data'])

# Try to add rows to the existing DataFrame
for count, item in enumerate(data.iterrows()):
    for number in comp_rows:
        if int(count) == int(number):
            # Fails: item is an (index, Series) tuple, not a row
            bigdata_comp.append(item)

# Print the result
print(bigdata_comp)
Error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-234-1e4ea9bd9faa> in <module>()
      4     for number in comp_rows:
      5         if int(count) == int(number):
----> 6             bigdata_comp.append(item)
      7

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.pyc in append(self, other, ignore_index, verify_integrity)
   3814         from pandas.tools.merge import concat
   3815         if isinstance(other, (list, tuple)):
-> 3816             to_concat = [self] + other
   3817         else:
   3818             to_concat = [self, other]

TypeError: can only concatenate list (not "tuple") to list
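The traceback points at the cause: `iterrows()` yields `(index, Series)` tuples, so `item` is a tuple rather than a row, and `append` tried to concatenate that tuple to a list. A minimal sketch of this behaviour, using a small stand-in frame (the two-column `df` below is a hypothetical example, not the original data):

```python
import pandas as pd

# Hypothetical stand-in for the company table
df = pd.DataFrame({"company": ["billguard", "tradesparq"],
                   "city": ["new york city", "shanghai"]})

# iterrows() yields (index label, row as a Series) pairs
first = next(df.iterrows())
print(type(first))        # a tuple, not a row

# Unpack the pair to get at the row itself
index_label, row = first
print(index_label)        # 0
print(row["company"])     # billguard
```

Even with a proper row, `DataFrame.append` returned a new frame instead of modifying `bigdata_comp` in place, and the method was removed in pandas 2.0, so direct indexing or `pd.concat` is the safer route.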
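Solution
Method 1: Use the .loc indexer
The .loc indexer selects specific rows by index label; the selected rows can then be concatenated onto the new DataFrame.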
# Select the rows with .loc
filtered_data = data.loc[comp_rows]

# Add the rows to the new DataFrame
bigdata_comp = pd.concat([bigdata_comp, filtered_data], ignore_index=True)

# Print the new DataFrame
print(bigdata_comp)
Output:
   company_url                  company     tag_line                                           product                                            data
0  https://angel.co/tradesparq  tradesparq  the world's largest social network for global ...  tradesparq is alibaba.com meets linkedin. trad...  shanghai · b2b · marketplaces · big data · soc...
1  https://angel.co/sidewalk    sidewalk    hoovers (d&b) for the social era                   sidewalk helps companies close more sales to s...  new york city · lead generation · big data · s...
2  https://angel.co/pangia      pangia      the internet of things platform: big data mana...  we collect and manage data from sensors embedd...  san francisco · saas · clean technology · big ...
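Note that `.loc` selects by index label while `.iloc` selects by position; the two agree in this example only because `data` has the default 0, 1, 2, ... index. A small sketch of the difference, assuming a hypothetical frame whose index labels do not match the row positions:

```python
import pandas as pd

# Hypothetical frame with a non-default index
df = pd.DataFrame({"company": ["billguard", "tradesparq", "sidewalk", "pangia"]},
                  index=[10, 11, 12, 13])

rows = [1, 2]

# Positional selection: the second and third rows
by_position = df.iloc[rows]
print(by_position["company"].tolist())   # ['tradesparq', 'sidewalk']

# Label selection with the same list raises KeyError here,
# because no row is labelled 1 or 2
try:
    df.loc[rows]
except KeyError:
    print("labels 1 and 2 are not in the index")

# reset_index(drop=True) renumbers the result from 0
renumbered = by_position.reset_index(drop=True)
print(renumbered.index.tolist())         # [0, 1]
```

If the list holds row numbers rather than labels, `.iloc` is the choice that keeps working after the frame has been filtered or re-indexed.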
Method 2: Use pd.concat()
Alternatively, collect the rows in a list and join them into one DataFrame with pd.concat().
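# Collect each big-data company as a one-row DataFrame
bigdata_list = []
for number in comp_rows:
    # Double brackets make iloc return a DataFrame rather than a Series;
    # concatenating plain Series would produce one long Series instead
    bigdata_list.append(data.iloc[[number]])

# Concatenate the pieces into a single DataFrame
bigdata_comp = pd.concat(bigdata_list, ignore_index=True)

# Print the new DataFrame
print(bigdata_comp)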
Output:
   company_url                  company     tag_line                                           product                                            data
0  https://angel.co/tradesparq  tradesparq  the world's largest social network for global ...  tradesparq is alibaba.com meets linkedin. trad...  shanghai · b2b · marketplaces · big data · soc...
1  https://angel.co/sidewalk    sidewalk    hoovers (d&b) for the social era                   sidewalk helps companies close more sales to s...  new york city · lead generation · big data · s...
2  https://angel.co/pangia      pangia      the internet of things platform: big data mana...  we collect and manage data from sensors embedd...  san francisco · saas · clean technology · big ...
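Since the row numbers are already in a list, neither the loop nor the empty starter frame is strictly required. A condensed sketch, assuming a trimmed-down stand-in for `data` (only the `company` column is reproduced here):

```python
import pandas as pd

# Trimmed-down stand-in for the company table
data = pd.DataFrame({"company": ["billguard", "tradesparq", "sidewalk",
                                 "pangia", "thinknum"]})
comp_rows = [1, 2, 3]

# Select by position in one step and renumber the rows from 0
direct = data.iloc[comp_rows].reset_index(drop=True)

# The same result built from a list of one-row frames
pieces = [data.iloc[[n]] for n in comp_rows]
looped = pd.concat(pieces, ignore_index=True)

print(direct["company"].tolist())   # ['tradesparq', 'sidewalk', 'pangia']
print(direct.equals(looped))        # True
```

The one-step form avoids growing a DataFrame inside a loop, which copies the data on every iteration.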