from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.corpus import stopwords
import joblib

tfidfconverter = TfidfVectorizer(max_features=10000, min_df=5, max_df=0.7,
                                 stop_words=stopwords.words('english'))
tfidf_obj = tfidfconverter.fit(processed_text)  # this is what will be used again
X_tfidf = tfidf_obj.transform(df['liststring']).toarray()

vocab = tfidfconverter.vocabulary_
reverse_vocab = {v: k for k, v in vocab.items()}
feature_names = tfidfconverter.get_feature_names_out()

joblib.dump(tfidf_obj, 'tf-idf.joblib')
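The point of dumping the fitted vectorizer is that it can be reloaded later (e.g. at inference time) and will transform new text with the exact vocabulary learned at training time. A minimal self-contained sketch of that round trip — the corpus and file path here are placeholders, not the actual dataset:

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer

# Fit on a small placeholder corpus and persist the fitted vectorizer.
corpus = ["fake news detection", "real news headline", "fake headline body"]
tfidf_obj = TfidfVectorizer().fit(corpus)
path = os.path.join(tempfile.mkdtemp(), "tf-idf.joblib")
joblib.dump(tfidf_obj, path)

# Later: reload and transform unseen text with the same learned vocabulary.
loaded = joblib.load(path)
X_new = loaded.transform(["new fake headline"]).toarray()
# X_new has one row and one column per term in the training vocabulary.
```

Only terms seen during `fit` get columns; unseen words in new text are silently ignored, which is exactly the behavior you want when the downstream model expects a fixed input width.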
-- The solution that worked in my case is posted above. Hope this helps someone. --

How would I pass the output of a TF-IDF created with sklearn into a Keras model or tensor, to then feed into a dense neural network? I am working with the FakeNewsChallenge dataset. Any guidance would be helpful.

Training set - Headline, Body, Label
The training set is split across two different CSVs (train_bodies, train_stances):
train_bodies - Body ID (num), articleBody (text)
train_stances - Headline (text), Body ID
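To answer the question directly: a TF-IDF matrix densified with `.toarray()` is an ordinary NumPy array, and Keras accepts NumPy arrays in `model.fit` with no further conversion. A sketch with placeholder data (the headline/body strings and label encoding are invented stand-ins for the merged train_bodies/train_stances rows, not the real files):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder stand-ins for headline/body pairs after joining the two CSVs on Body ID.
headlines = ["police find mass graves", "report denies claim"]
bodies = ["a report claims graves were found", "officials deny the claim entirely"]
labels = np.array([1, 0])  # hypothetical integer encoding of the stance label

# Concatenate headline and body so one TF-IDF vector describes each pair.
text = [h + " " + b for h, b in zip(headlines, bodies)]
X = TfidfVectorizer().fit_transform(text).toarray().astype("float32")
# X is (n_samples, n_features): exactly the shape a Dense input layer expects.

# The array X then feeds straight into a dense Keras network, e.g.:
# model = keras.Sequential([
#     keras.layers.Input(shape=(X.shape[1],)),
#     keras.layers.Dense(64, activation="relu"),
#     keras.layers.Dense(4, activation="softmax"),  # 4 FNC stance classes
# ])
# model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(X, labels, epochs=5)
```

Note that `.toarray()` materializes the full dense matrix; with max_features=10000 that is manageable, but for larger vocabularies you may prefer batching the sparse matrix instead of densifying it all at once.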