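In Python, the content between tags can be extracted with several libraries and approaches. Three common ones are shown below: regular expressions, BeautifulSoup, and lxml.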
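# Method 1: regular expressions (standard library re module)
import re
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
# The non-greedy group (.*?) captures the text between each <p> and the nearest </p>
paragraphs = re.findall(r"<p>(.*?)</p>", html)
print(paragraphs)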
Output: ['This is a paragraph.', 'This is another paragraph.']
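The pattern above only matches a bare `<p>` on a single line. A minimal sketch of a more tolerant variant follows, assuming the markup is simple and not nested; the helper name `extract_tag_content` is hypothetical and introduced only for illustration. It allows attributes on the opening tag, content that spans several lines, and upper-case tag names.

```python
import re

def extract_tag_content(html, tag):
    """Return the raw content between <tag ...> and </tag> (hypothetical helper)."""
    name = re.escape(tag)
    # \b stops "p" from matching "pre"; [^>]* skips any attributes
    pattern = rf"<{name}\b[^>]*>(.*?)</{name}\s*>"
    # DOTALL lets "." match newlines; IGNORECASE accepts <P> as well as <p>
    return re.findall(pattern, html, flags=re.DOTALL | re.IGNORECASE)

sample = '<p class="intro">First\nline</p><P>Second</P>'
print(extract_tag_content(sample, "p"))
```

Regular expressions still cannot handle nested tags of the same name or malformed markup reliably, so this is best kept for small, predictable inputs.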
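# Method 2: BeautifulSoup (third-party package beautifulsoup4)
from bs4 import BeautifulSoup
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
soup = BeautifulSoup(html, 'html.parser')
# find_all returns every <p> element; .text gives the text inside each one
paragraphs = soup.find_all('p')
for p in paragraphs:
    print(p.text)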
Output:
This is a paragraph.
This is another paragraph.
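When a paragraph contains nested tags or surrounding whitespace, `get_text` with a separator and `strip=True` may give cleaner results than `.text`. The sketch below assumes `beautifulsoup4` is installed; the sample markup and the `note` class are made up for illustration.

```python
from bs4 import BeautifulSoup

html = '<p>This is <b>bold</b> text.</p><p class="note">  Second  </p>'
soup = BeautifulSoup(html, "html.parser")

# Join the text fragments with a space and strip whitespace around each one
texts = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
print(texts)

# Restrict the search to <p> elements carrying a given class attribute
notes = [p.get_text(strip=True) for p in soup.find_all("p", class_="note")]
print(notes)
```

Passing the separator matters here: with `strip=True` alone, the fragments of the first paragraph would be joined without spaces.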
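# Method 3: lxml with XPath (third-party package lxml)
from lxml import etree
html = "<p>This is a paragraph.</p><p>This is another paragraph.</p>"
tree = etree.HTML(html)
# //p/text() selects the text nodes that are direct children of every <p>
paragraphs = tree.xpath('//p/text()')
print(paragraphs)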
Output: ['This is a paragraph.', 'This is another paragraph.']
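Note that `//p/text()` returns only the text nodes directly under each `<p>`, so text inside nested tags is skipped and a single paragraph can come back as several fragments. A minimal sketch of one way around this, assuming `lxml` is installed and using made-up sample markup, is to select the elements and join `itertext()` for each:

```python
from lxml import etree

html = "<p>This is <b>bold</b> text.</p><p>Plain</p>"
tree = etree.HTML(html)

# Direct text nodes only: the word inside <b> is missing
direct = tree.xpath("//p/text()")
print(direct)

# itertext() walks every descendant text node, so nested text is included
full = ["".join(p.itertext()) for p in tree.xpath("//p")]
print(full)
```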
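All of these methods can extract the content between tags. Regular expressions are adequate for simple, predictable markup, while a parser such as BeautifulSoup or lxml is more robust on real-world HTML. Which one to use depends on personal preference and the project's requirements.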
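If installing third-party packages is not an option, the standard library's `html.parser` module can do the same job. This is a different technique from the three above, sketched here under the assumption that the target tag is not nested inside itself in any meaningful way; the class `TagTextExtractor` and the function `extract_text` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

class TagTextExtractor(HTMLParser):
    """Collect the text inside every occurrence of one tag (hypothetical helper)."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0        # how many target tags are currently open
        self.results = []     # one string per outermost target element
        self._buffer = []     # text fragments of the element being read

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self._buffer = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self._buffer))

    def handle_data(self, data):
        # Keep text only while inside the target tag, including nested tags
        if self.depth > 0:
            self._buffer.append(data)

def extract_text(html, tag):
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.results

html = "<p>This is a paragraph.</p><p>This is <b>another</b> paragraph.</p>"
print(extract_text(html, "p"))
```

`HTMLParser` lower-cases tag names before calling the handlers, so `tag` should be passed in lower case.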