Introduction to Naive Bayesian Classifier
This is a very simple Python library that implements a naive Bayes classifier.
Example code:
""" Suppose you have some texts of news and kNow their categories. You want to train a system with this pre-categorized/pre-classified texts. So, you have better call this data your training set. """ from naiveBayesClassifier import tokenizer from naiveBayesClassifier.trainer import Trainer from naiveBayesClassifier.classifier import Classifier newsTrainer = Trainer(tokenizer.Tokenizer(stop_words = [], signs_to_remove = ["?!#%&"])) # You need to train the system passing each text one by one to the trainer module. newsSet =[ {'text': 'not to eat too much is not enough to lose weight', 'category': 'health'}, {'text': 'Russia is trying to invade Ukraine', 'category': 'politics'}, {'text': 'do not neglect exercise', 'category': 'health'}, {'text': 'Syria is the main issue, Obama says', 'category': 'politics'}, {'text': 'eat to lose weight', 'category': 'health'}, {'text': 'you should not eat much', 'category': 'health'} ] for news in newsSet: newsTrainer.train(news['text'], news['category']) # When you have sufficient trained data, you are almost done and can start to use # a classifier. newsClassifier = Classifier(newsTrainer.data, tokenizer.Tokenizer(stop_words = [], signs_to_remove = ["?!#%&"])) # Now you have a classifier which can give a try to classifiy text of news whose # category is unkNown, yet. unkNownInstance = "Even if I eat too much, is not it possible to lose some weight" classification = newsClassifier.classify(unkNownInstance) # the classification variable holds the possible categories sorted by # their probablity value print classification
Naive Bayesian Classifier official site
https://github.com/muatik/naive-bayes-classifier