I'm looking for a regular expression that removes every URL or domain name from a string, so that:
string='this is my content domain.com more content http://domain2.org/content and more content domain.net/page'
becomes
'this is my content more content and more content'
Removing the most common TLDs is enough for me, so I tried
string = re.sub(r'\w+(.net|.com|.org|.info|.edu|.gov|
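For comparison, here is a minimal sketch of one pattern that gives the output shown above for the sample string. It is an assumption about what is wanted, not a general URL parser: the names `TLDS`, `URL_RE`, and `strip_urls` are hypothetical, the TLD list is just the six visible in the attempt, and the optional `http://`/`https://` prefix and trailing path are handled in the simplest way that works for this input. Note that the dots are escaped here, since a bare `.` in a regex matches any character rather than a literal dot.

```python
import re

# Hypothetical TLD list taken from the attempt above; extend as needed.
TLDS = ("net", "com", "org", "info", "edu", "gov")

URL_RE = re.compile(
    r"(?:https?://)?"                   # optional scheme
    r"(?:[\w-]+\.)+"                    # one or more labels, each followed by a literal dot
    r"(?:" + "|".join(TLDS) + r")\b"    # one of the listed TLDs, ending at a word boundary
    r"(?:/\S*)?",                       # optional path up to the next whitespace
    re.IGNORECASE,
)

def strip_urls(text):
    """Remove URLs and bare domains, then collapse the leftover whitespace."""
    cleaned = URL_RE.sub("", text)
    return " ".join(cleaned.split())

string = 'this is my content domain.com more content http://domain2.org/content and more content domain.net/page'
print(strip_urls(string))
```

The final `" ".join(cleaned.split())` step is there because removing a match leaves the spaces on both sides of it behind; without it the result would contain double spaces and a trailing space.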