Implementing automated, scheduled commenting on the CSDN platform in Python

大家一起学编程
Published 2021-10-13 16:49:10

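Automated commenting on CSDN

Preface

For a while now I have seen some bloggers posting automated comments under articles. I do, of course, welcome comments on my own articles. It also made me think about how I would go about it if I were to build automated commenting myself.

First, here are the questions we need to think through, listed one by one:

1. Fetch the articles and obtain each article ID.

2. Find the API for posting a comment.

3. How to handle duplicate comments.

4. How to handle system interruptions.

Main text

I. Analyzing how to obtain the article ID

Open CSDN, go to an article list, press F12, and inspect the API that returns the article data along with its response. We extract the main fields.

Then identify the API we need; the detailed steps are not covered here.

In the same way, post a comment under one of your own articles to capture the API that submits a comment.

Next, let's think about the flow.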

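II. Flow design

There are two approaches we can take:

1. Fetch the article list, extract each article ID, and query whether we have already commented; skip the article if so, comment if not.

Pros: nothing needs to be stored, so there are fewer moving parts.

Cons: the same articles are fetched and checked for comments over and over, which is inefficient.

2. First store the whole fetched article list in a database, then read the IDs of uncommented articles from the database and comment on them. Once a comment is posted, mark the article as commented, so there is no need to query the site again.

Pros: more efficient than approach 1, since the comment check is not repeated.

Cons: requires a database server and familiarity with database operations.

Here we take the second approach. However, some earlier articles may already carry a comment from us, so we still have to check once whether we have commented. The flow therefore becomes: take an uncommented article from the database, check once whether we have already commented on it, comment only if we have not, then mark it as commented.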
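The adjusted flow can be sketched without any network or database access. In this minimal sketch the `store` dict stands in for the `comment` flag kept in the database, and `already_commented` and `post_comment` are hypothetical callables standing in for the real API calls; none of these names come from the original code.

```python
def process_pending(store, already_commented, post_comment):
    """Make sure every article still flagged 0 ends up handled.

    store: dict mapping article id -> 0 (not commented) or 1 (commented).
    already_commented(article_id) -> bool: hypothetical check against the site.
    post_comment(article_id) -> bool: hypothetical call that posts a comment.
    Returns the list of article ids that were actually commented on.
    """
    posted = []
    for article_id, flag in list(store.items()):
        if flag == 1:
            continue  # the stored flag makes a second lookup unnecessary
        if already_commented(article_id):
            store[article_id] = 1  # commented before the table existed
        elif post_comment(article_id):
            store[article_id] = 1
            posted.append(article_id)
    return posted


store = {101: 0, 102: 0, 103: 1}
posted = process_pending(store, lambda a: a == 101, lambda a: True)
print(posted)  # [102]
print(store)   # {101: 1, 102: 1, 103: 1}
```

Article 101 is only flagged, 102 receives a comment, and 103 is never looked up again, which is the saving the second approach is after.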

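III. Database design

With the flow settled, we design the database fields.

We need the article ID, the article URL, the author, a "commented" flag and a "liked" flag; you can of course add other fields as well.

Create the table (nicknames may contain emoji, so the encoding is utf8mb4 rather than utf8):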

CREATE TABLE `article` (
  `articleId` bigint NOT NULL COMMENT 'id',
  `articleDetailUrl` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'url',
  `articleTitle` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'title',
  `nickName` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'nickname',
  `hotRankScore` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'hot score',
  `comment` int NOT NULL DEFAULT '0' COMMENT 'commented (0 = no, 1 = yes)',
  `like` int NOT NULL DEFAULT '0' COMMENT 'liked (0 = no, 1 = yes)',
  `insert_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`articleId`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC;

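That gives us our table.

IV. Working with the database

The database is in place, so let's think about which operations we need. Always reasoning in terms of create, delete, update and read keeps things clear.

1. Create: insert the article list data

2. Delete: not needed here

3. Update: change "not commented" to "commented"

4. Read: query the IDs of uncommented articles; check whether an article from a list has already been inserted

So we write these functions: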

import pymysql
from dbutils.pooled_db import PooledDB

POOL = PooledDB(
    creator=pymysql, maxconnections=20, mincached=6, maxcached=None, maxshared=5,
    blocking=True, maxusage=None, setsession=[], ping=0,
    host='127.0.0.1', port=3306, user='root', password='root',
    database='csdn_article', charset='utf8mb4')  # utf8mb4 to match the table, so emoji survive

def insert_article(articleId, articleDetailUrl, articleTitle, nickName, hotRankScore):
    db = POOL.connection()
    cursor = db.cursor()  # get a cursor
    # Parameterized query: the driver escapes the values, so titles containing quotes are safe
    cursor.execute(
        "INSERT INTO `article`(`articleId`, `articleDetailUrl`, `articleTitle`, `nickName`, `hotRankScore`) "
        "VALUES (%s, %s, %s, %s, %s);",
        (articleId, articleDetailUrl, articleTitle, nickName, hotRankScore))
    db.commit()  # commit the insert
    db.close()  # hand the connection back to the pool

def select_is_insert(articleId):
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("SELECT COUNT(*) FROM `article` WHERE `articleId` = %s;", (articleId,))
    data = cursor.fetchall()  # a single row holding the count
    db.close()
    return data[0][0]

def select_is_comment():  # query articles that have not been commented on yet
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("SELECT `articleId`,`articleDetailUrl` FROM `article` WHERE `comment` = 0 LIMIT 0, 2;")
    data = cursor.fetchall()  # at most two rows per run
    db.close()
    return data

def update_article(articleId, comment=1):
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("UPDATE `article` SET `comment` = %s WHERE `articleId` = %s;", (comment, articleId))
    db.commit()  # commit the update
    db.close()
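The operations listed above can be tried without a MySQL server. The sketch below is an assumption-laden stand-in, not the article's setup: it uses the standard-library `sqlite3` module with an in-memory database, a table trimmed to three columns, `?` placeholders (SQLite's style, where PyMySQL uses `%s`), and the SQLite-specific `INSERT OR IGNORE` to skip an ID that is already stored. The function names `add_article`, `pending_articles` and `mark_commented` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    " articleId INTEGER PRIMARY KEY,"
    " articleDetailUrl TEXT,"
    " comment INTEGER NOT NULL DEFAULT 0)"
)

def add_article(article_id, url):
    # INSERT OR IGNORE silently skips a row whose primary key already exists.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO article (articleId, articleDetailUrl) VALUES (?, ?)",
            (article_id, url),
        )

def pending_articles(limit=2):
    # Values are passed separately from the SQL text, never formatted into it.
    rows = conn.execute(
        "SELECT articleId, articleDetailUrl FROM article WHERE comment = 0 LIMIT ?",
        (limit,),
    )
    return rows.fetchall()

def mark_commented(article_id):
    with conn:
        conn.execute("UPDATE article SET comment = 1 WHERE articleId = ?", (article_id,))

add_article(1, "https://blog.csdn.net/some_user/article/details/1")
add_article(1, "https://blog.csdn.net/some_user/article/details/1")  # duplicate, ignored
add_article(2, "https://blog.csdn.net/other_user/article/details/2")
mark_commented(1)
print(pending_articles())  # [(2, 'https://blog.csdn.net/other_user/article/details/2')]
```

Keeping values out of the SQL string is the point here: a title or nickname containing a quote character cannot break the statement.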

V. Fetching CSDN article data

Next we fetch CSDN article lists (from the ranking lists):

def qzzhrb():
    """Site-wide overall hot list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    for y in range(0, 4):
        time.sleep(5)
        response = requests.get("https://blog.csdn.net/phoenix/web/blog/hotRank?page=" + str(y) + "&pageSize=25", headers=headers1)
        result = response.json()
        if result["message"] == "success":
            for item in result["data"]:
                # The article id is the last path segment of the article URL
                articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                if select_is_insert(articleId) != 1:
                    insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])
# Run once every 1 to 2 days
def lynrb():
    """Per-domain content list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    channels = ["python","java","javascript","人工智能","php","c%2Fc%2B%2B","大数据","移动开发","数据结构与算法","游戏","网络","运维","测试"]
    for y in range(0, 2):
        for channel in channels:
            response = requests.get("https://blog.csdn.net/phoenix/web/blog/hotRank?page=" + str(y) + "&pageSize=25&child_channel=" + channel, headers=headers1)
            time.sleep(5)
            result = response.json()
            if result["message"] == "success":
                for item in result["data"]:
                    # The article id is the last path segment of the article URL
                    articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                    if select_is_insert(articleId) != 1:
                        insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])
# lynrb()
# Once a day at 8:00
def xjzzb():
    """New-author list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    for y in range(0, 5):
        time.sleep(5)
        response = requests.get("https://blog.csdn.net/phoenix/web/blog/newUserRank?page=" + str(y) + "&pageSize=20", headers=headers1)
        result = response.json()
        if result["message"] == "success":
            for item in result["data"]:
                # The article id is the last path segment of the article URL
                articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                if select_is_insert(articleId) != 1:
                    insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])
# xjzzb()

def recommend():
    """Recommended column"""
    header = {
        "Host": "blog.csdn.net",
        "Connection": "keep-alive",
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Referer": "https://blog.csdn.net/",
        "X-Requested-With": "XMLHttpRequest",
        "X-Tingyun-Id": "im-pGljNfnc;r=332305116",
        "Sec-Fetch-Site": "same-origin",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Dest": "empty",
        "sec-ch-ua": '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        "sec-ch-ua-mobile": "?0",
        "Cookie": "YOUR_COOKIE_HERE",  # replace with the cookie copied from your own browser session
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
    }
    categories = ["python","home","career","java","web","arch","blockchain","db","5g","game","mobile","ops","sec","engineering"]
    for category in categories:
        response = requests.get("https://blog.csdn.net/api/articles?type=more&category=%s&shown_offset=0" % category, headers=header)
        result = response.json()
        if result["status"] == "true":
            for item in result["articles"]:
                if select_is_insert(item["product_id"]) != 1:
                    insert_article(item["product_id"], item["url"], item["title"], item["nickname"], item["views"])


This way we fetch the data and insert it into the database.
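The article ID is the trailing number of the article URL. A slightly more defensive way to pull it out is sketched below, using only the standard library; `extract_article_id` is a hypothetical helper name, and the sample URLs (user name and ID) are made up for illustration.

```python
from urllib.parse import urlparse

def extract_article_id(url):
    """Return the trailing numeric path segment of an article URL, or None."""
    path = urlparse(url).path.rstrip("/")  # drops any query string and trailing slash
    last = path.rsplit("/", 1)[-1]
    return int(last) if last.isdecimal() else None

print(extract_article_id("https://blog.csdn.net/some_user/article/details/120714233"))
print(extract_article_id("https://blog.csdn.net/some_user/article/details/120714233?spm=1001"))
print(extract_article_id("https://blog.csdn.net/"))
```

Returning `None` for anything that does not end in a number lets the caller skip odd URLs instead of storing a malformed ID.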

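VI. Checking whether we have already commented

Having read the data back from the database, we need to check whether we already commented on an article in the past, so a check is required. Add your own cookie to the header.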

def comment_page(id, page=1, size=18):
    """Check whether we have already commented"""
    header = {
        'Host': 'blog.csdn.net',
        'Connection': 'keep-alive',
        'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        'Accept': '*/*',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua-mobile': '?0',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
        'X-Tingyun-Id': 'im-pGljNfnc;r=301378265',
        'Origin': 'https://blog.csdn.net',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Cookie': 'YOUR_COOKIE_HERE',  # replace with the cookie of your own logged-in session
    }
    data = "page=%s&size=%s&commentId=" % (page, size)
    response = requests.post("https://blog.csdn.net/phoenix/web/v1/comment/list/%s?page=%s&size=%s&commentId=" % (id, page, size), headers=header, data=data)
    result = response.json()
    if result["code"] == 200:
        comments = result["data"]["list"]
        count = result["data"]["count"]
        for i in comments:
            if i["info"]["userName"] == "qq_39046854":  # this is my user id; change it to your own
                update_article(id)
                return True
        if count > size:
            # More comments than one page holds: ask again for all of them at once
            return comment_page(id, 1, count)
        return False

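Overall logic:

Query the comment list. If the number of comments is less than or equal to size: return False when our own ID is not among them; when it is, update the database and return True.

If the number of comments is greater than size: when our own ID is not on the first page, recurse with size set to the total count. Return False if it is still not found; if it is found, update the database and return True.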
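The paging logic can be exercised without touching the network. In this sketch `fetch_page(page, size)` is a hypothetical callable that returns a dict with the keys `list` and `count`, mirroring the `data` field the code reads; `has_commented` and `fake_fetch` are illustrative names, not part of the original code.

```python
def has_commented(fetch_page, user_name, size=18):
    """Return True if user_name appears among an article's comments.

    fetch_page(page, size) is a hypothetical callable returning a dict with
    the keys "list" (comment entries) and "count" (total number of comments).
    """
    data = fetch_page(1, size)
    if any(c["info"]["userName"] == user_name for c in data["list"]):
        return True
    if data["count"] > size:
        # The first page did not cover everything: ask again for all comments.
        return has_commented(fetch_page, user_name, data["count"])
    return False


def fake_fetch(page, size):
    # Stand-in data: 20 comments, the last one written by "me".
    names = ["user%d" % n for n in range(19)] + ["me"]
    return {"list": [{"info": {"userName": n}} for n in names[:size]], "count": len(names)}


print(has_commented(fake_fetch, "me"))      # True, found after widening the page
print(has_commented(fake_fetch, "nobody"))  # False
```

The result of the recursive call has to be returned; otherwise a match found on the widened page is lost and the function reports False.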

VII. Commenting

OK, the preparation is done, so now we comment, using the comment-submission API captured earlier.

def comment(articleId):
    """Post a comment"""
    time.sleep(5)
    user_headers = {
        'Host': 'blog.csdn.net',
        'Connection': 'keep-alive',
        'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua-mobile': '?0',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
        'X-Tingyun-Id': 'im-pGljNfnc;r=290154423',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Origin': 'https://blog.csdn.net',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Cookie': 'YOUR_COOKIE_HERE',  # replace with the cookie of your own logged-in session
    }
    content = ["666"]  # candidate comment texts; one is chosen at random
    # Passing a dict lets requests form-encode the values and compute Content-Length itself
    data = {"commentId": "", "content": random.choice(content), "articleId": articleId}
    response = requests.post("https://blog.csdn.net/phoenix/web/v1/comment/submit",
                             data=data,
                             headers=user_headers)
    result = response.json()
    print(result)
    if result["message"] == "success":
        update_article(articleId)
    if result["code"] == 400:  # article not found: mark it so that it is skipped
        update_article(articleId)

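Your own cookie must be added here. Logic: call the comment API; if it returns success, update the status in the database; if it returns 400 (article not found), update the status as well so that the article is skipped.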
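The request body is an ordinary form-encoded string, so any comment text containing spaces, `&`, `=` or non-ASCII characters has to be percent-encoded. A minimal sketch with the standard library follows; the field names are the ones used in the request above, while `build_comment_body` and the sample article ID are hypothetical.

```python
from urllib.parse import urlencode

def build_comment_body(article_id, content):
    """Build the x-www-form-urlencoded body with every value percent-encoded."""
    return urlencode({"commentId": "", "content": content, "articleId": article_id})

print(build_comment_body(120714233, "666"))
# commentId=&content=666&articleId=120714233
print(build_comment_body(120714233, "nice & clear"))
# commentId=&content=nice+%26+clear&articleId=120714233
```

Without the encoding, the `&` in the second example would be read as the start of a new form field and the comment would be cut short.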

VIII. Putting the commenting together

With single comments working, we combine the functions: query the database, then comment.

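def is_comment():
    blacklist = {"qq_39046854"}  # put the users whose articles should not be commented on here
    for i in select_is_comment():
        # Article URLs look like https://blog.csdn.net/<user name>/article/details/<id>
        parts = i[1].split("/")
        author = parts[3] if len(parts) > 3 else ""
        if author in blacklist:
            update_article(i[0])
        elif comment_page(i[0]) is False:
            comment(i[0])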

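Logic: query the database for uncommented articles and loop over whatever comes back. If an article belongs to a blacklisted user, do not comment and just update its status; otherwise check whether we have already commented, and comment if we have not.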
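The blacklist check depends on reading the author out of the article URL. A fixed slice such as `[22:33]` takes exactly 11 characters after `https://blog.csdn.net/`, so it only fits user names of that length. The sketch below reads the whole first path segment instead and compares it against a set of full user names; `author_of` and `is_blacklisted` are hypothetical helpers, and the URL shape is an assumption based on the URLs handled in this article.

```python
from urllib.parse import urlparse

BLACKLIST = {"qq_39046854"}  # full user names whose articles are skipped

def author_of(url):
    """Return the first path segment, assuming URLs shaped like
    https://blog.csdn.net/<user name>/article/details/<id>."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else ""

def is_blacklisted(url, blacklist=BLACKLIST):
    return author_of(url) in blacklist

print(is_blacklisted("https://blog.csdn.net/qq_39046854/article/details/1"))  # True
print(is_blacklisted("https://blog.csdn.net/qq_390/article/details/2"))       # False
```

Set membership compares whole names, so adding a second user to the blacklist cannot accidentally match a partial name.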

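IX. Setting up scheduled tasks

All the functions are written, but they can only be run one call at a time, or inside a hand-written loop, so we might as well set up scheduled tasks and combine everything there. This needs the schedule module.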


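import threading, time
from schedule import every, repeat, run_pending

@repeat(every(10).minutes, func=recommend)  # fetch the recommended column
@repeat(every(3).minutes, func=is_comment)  # run the commenting step
@repeat(every().day.at("10:00"), func=qzzhrb)  # fetch the site-wide hot list
@repeat(every().day.at("09:00"), func=lynrb)  # fetch the per-domain content list
@repeat(every().day.at("07:00"), func=xjzzb)  # fetch the new-author list
def run_threaded(func):
    # Each scheduled job runs in its own thread so a slow job does not hold up the others
    job_thread = threading.Thread(target=func)
    job_thread.start()

def main():
    while True:
        try:
            run_pending()  # run_pending: run every scheduled job that is due
        except Exception as e:
            print(e)  # report the error and keep the loop alive
        time.sleep(1)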
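A long-running loop like this should survive a failing job while still leaving a record of what went wrong, which is one way to approach the "system interruption" question from the preface. The sketch below uses only the standard library; `run_safely` and `flaky` are hypothetical names, and wrapping each job this way is a suggestion rather than part of the original program.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scheduler")

def run_safely(job):
    """Run job(); log any exception with its traceback instead of discarding it.

    Returns True when the job finished, False when it raised.
    """
    try:
        job()
        return True
    except Exception:
        log.exception("job %s failed", getattr(job, "__name__", repr(job)))
        return False

def flaky():
    raise RuntimeError("network down")

print(run_safely(lambda: None))  # True
print(run_safely(flaky))         # False
```

Because the wrapper catches the exception inside the job itself, it also works for jobs started in their own thread, where an exception would otherwise never reach the main loop.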

X. Complete code

import random, time, threading
import requests
import pymysql
from dbutils.pooled_db import PooledDB
from schedule import every, repeat, run_pending


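POOL = PooledDB(
    creator=pymysql,  # module used to connect to the database
    maxconnections=20,  # maximum connections allowed in the pool; 0 or None means unlimited
    mincached=6,  # idle connections created at startup; 0 means none
    maxcached=None,  # maximum idle connections kept in the pool; 0 or None means unlimited
    maxshared=5,
    blocking=True,  # when no connection is free: True waits, False raises an error
    maxusage=None,  # maximum number of times one connection is reused; None means unlimited
    setsession=[],  # commands run before a session starts, e.g. ["set datestyle to ...", "set time zone ..."]
    ping=0,  # when to ping the MySQL server to check that it is available (0 = never)
    host='127.0.0.1', port=3306, user='root', password='root123456*',
    database='csdn_article', charset='utf8mb4')  # utf8mb4 to match the table, so emoji survive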

def insert_article(articleId, articleDetailUrl, articleTitle, nickName, hotRankScore):
    db = POOL.connection()
    cursor = db.cursor()  # get a cursor
    # Parameterized query: the driver escapes the values, so titles containing quotes are safe
    cursor.execute(
        "INSERT INTO `article`(`articleId`, `articleDetailUrl`, `articleTitle`, `nickName`, `hotRankScore`) "
        "VALUES (%s, %s, %s, %s, %s);",
        (articleId, articleDetailUrl, articleTitle, nickName, hotRankScore))
    db.commit()  # commit the insert
    db.close()  # hand the connection back to the pool

def select_is_insert(articleId):
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("SELECT COUNT(*) FROM `article` WHERE `articleId` = %s;", (articleId,))
    data = cursor.fetchall()  # a single row holding the count
    db.close()
    return data[0][0]

def select_is_comment():  # query articles that have not been commented on yet
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("SELECT `articleId`,`articleDetailUrl` FROM `article` WHERE `comment` = 0 LIMIT 0, 2;")
    data = cursor.fetchall()  # at most two rows per run
    db.close()
    return data

def update_article(articleId, comment=1):
    db = POOL.connection()
    cursor = db.cursor()
    cursor.execute("UPDATE `article` SET `comment` = %s WHERE `articleId` = %s;", (comment, articleId))
    db.commit()  # commit the update
    db.close()

def comment(articleId):
    """Post a comment"""
    time.sleep(5)
    user_headers = {
        'Host': 'blog.csdn.net',
        'Connection': 'keep-alive',
        'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua-mobile': '?0',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
        'X-Tingyun-Id': 'im-pGljNfnc;r=290154423',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Origin': 'https://blog.csdn.net',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Cookie': 'YOUR_COOKIE_HERE',  # replace with the cookie of your own logged-in session
    }
    content = ["666"]  # candidate comment texts; one is chosen at random
    # Passing a dict lets requests form-encode the values and compute Content-Length itself
    data = {"commentId": "", "content": random.choice(content), "articleId": articleId}
    response = requests.post("https://blog.csdn.net/phoenix/web/v1/comment/submit",
                             data=data,
                             headers=user_headers)
    result = response.json()
    print(result)
    if result["message"] == "success":
        update_article(articleId)
    if result["code"] == 400:  # article not found: mark it so that it is skipped
        update_article(articleId)

def qzzhrb():
    """Site-wide overall hot list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    for y in range(0, 4):
        time.sleep(5)
        response = requests.get("https://blog.csdn.net/phoenix/web/blog/hotRank?page=" + str(y) + "&pageSize=25", headers=headers1)
        result = response.json()
        if result["message"] == "success":
            for item in result["data"]:
                # The article id is the last path segment of the article URL
                articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                if select_is_insert(articleId) != 1:
                    insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])

def lynrb():
    """Per-domain content list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    channels = ["python","java","javascript","人工智能","php","c%2Fc%2B%2B","大数据","移动开发","数据结构与算法","游戏","网络","运维","测试"]
    for y in range(0, 2):
        for channel in channels:
            response = requests.get("https://blog.csdn.net/phoenix/web/blog/hotRank?page=" + str(y) + "&pageSize=25&child_channel=" + channel, headers=headers1)
            time.sleep(5)
            result = response.json()
            if result["message"] == "success":
                for item in result["data"]:
                    # The article id is the last path segment of the article URL
                    articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                    if select_is_insert(articleId) != 1:
                        insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])

def xjzzb():
    """New-author list"""
    headers1 = {
        'Host': 'blog.csdn.net',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
    }
    for y in range(0, 5):
        time.sleep(5)
        response = requests.get("https://blog.csdn.net/phoenix/web/blog/newUserRank?page=" + str(y) + "&pageSize=20", headers=headers1)
        result = response.json()
        if result["message"] == "success":
            for item in result["data"]:
                # The article id is the last path segment of the article URL
                articleId = item["articleDetailUrl"].rsplit("/", 1)[-1]
                if select_is_insert(articleId) != 1:
                    insert_article(articleId, item["articleDetailUrl"], item["articleTitle"], item["nickName"], item["hotRankScore"])

def recommend():
    """Recommended column"""
    header = {
        "Host": "blog.csdn.net",
        "Connection": "keep-alive",
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Referer": "https://blog.csdn.net/",
        "X-Requested-With": "XMLHttpRequest",
        "X-Tingyun-Id": "im-pGljNfnc;r=332305116",
        "Sec-Fetch-Site": "same-origin",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Dest": "empty",
        "sec-ch-ua": '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        "sec-ch-ua-mobile": "?0",
        "Cookie": "YOUR_COOKIE_HERE",  # replace with the cookie copied from your own browser session
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
    }
    categories = ["python","home","career","java","web","arch","blockchain","db","5g","game","mobile","ops","sec","engineering"]
    for category in categories:
        response = requests.get("https://blog.csdn.net/api/articles?type=more&category=%s&shown_offset=0" % category, headers=header)
        result = response.json()
        if result["status"] == "true":
            for item in result["articles"]:
                if select_is_insert(item["product_id"]) != 1:
                    insert_article(item["product_id"], item["url"], item["title"], item["nickname"], item["views"])

def comment_page(id, page=1, size=18):
    """Check whether we have already commented"""
    header = {
        'Host': 'blog.csdn.net',
        'Connection': 'keep-alive',
        'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
        'Accept': '*/*',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua-mobile': '?0',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36',
        'X-Tingyun-Id': 'im-pGljNfnc;r=301378265',
        'Origin': 'https://blog.csdn.net',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Cookie': 'YOUR_COOKIE_HERE',  # replace with the cookie of your own logged-in session
    }
    data = "page=%s&size=%s&commentId=" % (page, size)
    response = requests.post("https://blog.csdn.net/phoenix/web/v1/comment/list/%s?page=%s&size=%s&commentId=" % (id, page, size), headers=header, data=data)
    result = response.json()
    if result["code"] == 200:
        comments = result["data"]["list"]
        count = result["data"]["count"]
        for i in comments:
            if i["info"]["userName"] == "qq_39046854":  # this is my user id; change it to your own
                update_article(id)
                return True
        if count > size:
            # More comments than one page holds: ask again for all of them at once
            return comment_page(id, 1, count)
        return False

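# Scheduled below to run every 3 minutes
def is_comment():
    blacklist = {"qq_39046854", "weixin_43673589"}  # users whose articles should not be commented on
    for i in select_is_comment():
        # Article URLs look like https://blog.csdn.net/<user name>/article/details/<id>
        parts = i[1].split("/")
        author = parts[3] if len(parts) > 3 else ""
        if author in blacklist:
            update_article(i[0])
        elif comment_page(i[0]) is False:
            comment(i[0])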

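@repeat(every(10).minutes, func=recommend)  # fetch the recommended column
@repeat(every(3).minutes, func=is_comment)  # run the commenting step
@repeat(every().day.at("10:00"), func=qzzhrb)  # fetch the site-wide hot list
@repeat(every().day.at("09:00"), func=lynrb)  # fetch the per-domain content list
@repeat(every().day.at("07:00"), func=xjzzb)  # fetch the new-author list
def run_threaded(func):
    # Each scheduled job runs in its own thread so a slow job does not hold up the others
    job_thread = threading.Thread(target=func)
    job_thread.start()

def main():
    while True:
        try:
            run_pending()  # run_pending: run every scheduled job that is due
        except Exception as e:
            print(e)  # report the error and keep the loop alive
        time.sleep(1)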

if __name__ == '__main__':
    main()

This article was shared from the WeChat official account 大家一起学编程 (originally published 2021-10-11).
