I am starting a web crawler on 64-bit Windows 8.1. I tried running it without adding any extra libraries, and it ended with an error.
C:\Users\I>cd c:\Users\i\Desktop\heritrix-1.14.4
c:\Users\I\Desktop\heritrix-1.14.4>cd bin
c:\Users\I\Desktop\heritrix-1.14.4\bin>heritrix.cmd
You have to specify either a username and password for the
web interface or start Heritrix without the web
I want to scrape the Heritrix home page using a Python module. When I try to open this page in Chrome, I get the following error:
This server could not prove that it is 10.100.121.41; its security
certificate is not trusted by your computer's operating system. This
may be caused by a misconfiguration or an attacker intercepting your
connection.
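A warning like this usually means the server presents a self-signed or otherwise untrusted certificate, and a script will hit the same check. Below is a minimal sketch, using only the Python standard library, of how a script might fetch such a page with certificate verification switched off. The helper names `make_unverified_context` and `fetch` are made up for illustration, and the URL in the comment is a placeholder built from the address in the error message. Skipping verification is insecure, so it is only reasonable against a host you control.

```python
import ssl
import urllib.request


def make_unverified_context() -> ssl.SSLContext:
    """Build a TLS context that skips certificate checks (insecure; testing only)."""
    ctx = ssl.create_default_context()
    # check_hostname has to be disabled before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL over HTTPS without verifying the server certificate."""
    ctx = make_unverified_context()
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.read()


# Hypothetical usage; the port and path of the real page are not known here:
# html = fetch("https://10.100.121.41/")
# print(html[:200])
```

A safer alternative, if the server's certificate file is available, would be to load it into the context with `ctx.load_verify_locations(...)` and leave verification on.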
But I can still proceed to the page. When