Q: JSON-formatted strings to a pandas DataFrame

Stack Overflow user

Asked 2016-07-21 13:54:02

2 answers, 1.3K views, 0 followers, 1 vote

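OK, I've been banging my head against this wall all afternoon. I know there are plenty of similar posts, but I keep running into errors and am most likely making a silly mistake.

I am using the apyori package, found here, to do some market basket analysis: https://pypi.python.org/pypi/apyori/1.1.1

It looks like the package's dump_as_json() method emits a dictionary of RelationRecords for each possible basket.

I want to combine these JSON-formatted dictionaries into a single pandas DataFrame, but I get various errors whenever I try pd.read_json().

Here is my code: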

import apyori, shutil, os
from apyori import apriori
from apyori import dump_as_json
import pandas as pd
import json

try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

transactions = [
    ['Jersey','Magnet'],
    ['T-Shirt','Cap'],
    ['Magnet','T-Shirt'],
    ['Jersey', 'Pin'],
    ['T-Shirt','Cap']
]
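results = list(apriori(transactions))
results_df = pd.DataFrame()
# output_file was never defined in the original post; an in-memory
# buffer is assumed here.
output_file = StringIO()
for RelationRecord in results:
    dump_as_json(RelationRecord, output_file)
print(output_file.getvalue())
# json.dumps() only wraps the text in one JSON string; it does not parse it.
json_file = json.dumps(output_file.getvalue())
print(json_file)

# data_df is never defined above (it is the DataFrame being asked for),
# so this line raises NameError.
print(data_df.head())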

Any ideas? How can I get the JSON-formatted dictionaries stored in output_file into a pandas DataFrame?

python

json

apriori

2 Answers

Stack Overflow user

Accepted answer

Answered 2016-07-21 14:36:52

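I'd suggest reading Stack Overflow's guide to creating a Minimal, Complete, and Verifiable example. Statements like "I keep running into errors" are also not much help. That said, I took a look at your code and at the source of the apyori package. Typos aside, the problem appears to be these lines: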

for RelationRecord in results:
    dump_as_json(RelationRecord,output_file)

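You are creating a file with one JSON object per line (I think this is sometimes called LSON or Line-JSON; the format is also known as JSON Lines). Taken as a whole document, it is simply not valid JSON. You could instead keep the records as a list of homogeneous dicts, or some other pd.DataFrame-friendly structure: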

output = []
for RelationRecord in results:
    o = StringIO()
    dump_as_json(RelationRecord, o)
    output.append(json.loads(o.getvalue()))
data_df = pd.DataFrame(output)
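
To make the point about one-object-per-line output concrete, here is a minimal sketch that uses only the standard library. The sample text is hand-written to imitate the general shape of what dump_as_json writes (an assumption, not captured output), and parse_json_lines is a hypothetical helper name introduced for illustration.

```python
import json

# Hand-written sample imitating two records written one per line
# (assumed shape; not actual output captured from apyori).
sample = (
    '{"items": ["Cap"], "support": 0.4, "ordered_statistics": []}\n'
    '{"items": ["Cap", "T-Shirt"], "support": 0.4, "ordered_statistics": []}\n'
)


def parse_json_lines(text):
    """Parse text holding one JSON object per line into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Parsing the whole text as a single document fails: after the first
# object the parser finds more data instead of the end of the input.
try:
    json.loads(sample)
    whole_document_ok = True
except json.JSONDecodeError:
    whole_document_ok = False

# Parsing line by line succeeds and yields one dict per record.
records = parse_json_lines(sample)
```

The resulting list of dicts can be handed to pd.DataFrame(records). pandas can also read this layout directly with pd.read_json(..., lines=True), although nested fields such as ordered_statistics would then remain as list-valued columns.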

Votes: 2

Stack Overflow user

Answered 2018-03-21 23:08:05

You can use the following script to convert the Apriori results further into a nicer-looking dataframe:

summary_df = pd.DataFrame(columns=('Items','Antecedent','Consequent','Support','Confidence','Lift'))

Support =[]
Confidence = []
Lift = []
Items = []
Antecedent = []
Consequent=[]

for RelationRecord in results: 
    for ordered_stat in RelationRecord.ordered_statistics:
        Support.append(RelationRecord.support)
        Items.append(RelationRecord.items)
        Antecedent.append(ordered_stat.items_base)
        Consequent.append(ordered_stat.items_add)
        Confidence.append(ordered_stat.confidence)
        Lift.append(ordered_stat.lift)

summary_df['Items'] = Items                                   
summary_df['Antecedent'] = Antecedent
summary_df['Consequent'] = Consequent
summary_df['Support'] = Support
summary_df['Confidence'] = Confidence
summary_df['Lift']= Lift

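The final dataframe has one row per rule, with the columns Items, Antecedent, Consequent, Support, Confidence and Lift.

Hope this helps :)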
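
As a possible variation on the loop above, the rows can be collected as a list of dicts in a single pass instead of six parallel lists. This is only a sketch: the two namedtuples are stand-ins that reuse the field names accessed in the loop above so the example runs without apyori, flatten_records is a hypothetical helper name, and the numeric values are made up for illustration.

```python
from collections import namedtuple

# Stand-ins for apyori's record types, using the field names accessed in
# the loop above; defined here only so the sketch runs without apyori.
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"])


def flatten_records(results):
    """Return one dict per (record, ordered statistic) pair."""
    rows = []
    for record in results:
        for stat in record.ordered_statistics:
            rows.append({
                "Items": sorted(record.items),
                "Antecedent": sorted(stat.items_base),
                "Consequent": sorted(stat.items_add),
                "Support": record.support,
                "Confidence": stat.confidence,
                "Lift": stat.lift,
            })
    return rows


# Made-up values, for illustration only.
example = [
    RelationRecord(
        items=frozenset(["Cap", "T-Shirt"]),
        support=0.4,
        ordered_statistics=[
            OrderedStatistic(
                frozenset(["Cap"]), frozenset(["T-Shirt"]), 1.0, 1.5),
        ],
    ),
]
rows = flatten_records(example)
```

Passing rows to pd.DataFrame(rows) should then give the same six columns in one call. Converting the frozensets to sorted lists is a design choice made here so that the cells display in a stable order.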

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/38514984
