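I'm a beginner with regex, but I'm trying to put it to use. What I want to do is extract the links from a particular file with a regular expression and print them. Below is the input containing all the links I want to extract: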
https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/consoleFull
https://ci.adoptopenjdk.net/view/Test_openjdk/job/Test_openjdk11_hs_sanity.openjdk_x86-64_mac/26/consoleFull
To reproduce: https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/console
Jenkins Build URL: https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_extended.system_ppc64le_linux/259/
internal build `Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`
internal build `Test_openjdk11_j9_sanity.functional_aarch64_linux/46`
I want the links in this form (expected):
https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/
https://ci.adoptopenjdk.net/view/Test_openjdk/job/Test_openjdk11_hs_sanity.openjdk_x86-64_mac/26/
https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/
Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/
To solve this, I tried applying regular expressions and came up with this pattern:
(?:(?:http|ftp|https):\/\/ci.adoptopenjdk.net.+(?:consoleFull|console|\d?))|(?:\`.+?\`)
But the output looks like this (actual):
https://ci.adoptopenjdk.net/job/Test_openjdk17_j9_sanity.openjdk_x86-64_linux/33/consoleFull
https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/consoleFullhttps://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_ppc64_aix/145/`Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`
`cent7-aarch64-3``Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`
`cent7-aarch64-3`
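Now I want to match on the basis of https://....Test_openjdk{num} instead, because ci.adoptopenjdk.net is outdated. For that case I wrote a regex pattern that checks all the links above, taken as input, against Test_openjdk{num} to get the desired output. I used: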
(?:http|ftp|https):\/\/Test_openjdk/\d.+\/(?=console)|(?<=\`).+?\/(?=\`)
But I only got the following output:
Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/
Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/
I have tried all sorts of approaches, but they all failed. Could someone explain where I went wrong, and why? That would be great. Thanks.
Posted on 2021-06-24 18:35:28
(?:(?:http|ftp|https):\/\/ci\.adoptopenjdk\.net.+\/|(?:\`.+?\`))
.+\/ ensures that the match stops at the last / of the link.
Also, an unescaped . matches any character. To match a literal dot, use \. instead.
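The two points above can be checked with a small sketch. The first URL is taken from the question; the look-alike host is a made-up input used only to show what an unescaped dot lets through.

```python
import re

url = "https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/console"

# Greedy .+ first runs to the end of the line, then backtracks to the last "/".
trimmed = re.search(r"https://ci\.adoptopenjdk\.net.+/", url).group(0)
print(trimmed)

# Hypothetical input: the dots in the host are replaced by "X".
lookalike = "https://ciXadoptopenjdkXnet/job/demo/1/"

# An unescaped "." matches any character, so the look-alike host is accepted.
loose_hit = re.search(r"https://ci.adoptopenjdk.net.+/", lookalike) is not None
# With "\." only a literal dot is accepted, so the look-alike is rejected.
strict_hit = re.search(r"https://ci\.adoptopenjdk\.net.+/", lookalike) is not None
print(loose_hit, strict_hit)
```

Here trimmed ends at /111/, because that is the last slash on the line.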
If you only want the ones that end in console or /, and your regex flavor supports positive lookahead and positive lookbehind, you can use the following:
(?:http|ftp|https):\/\/ci\.adoptopenjdk\.net.+\/(?=console)|(?<=\`).+?\/(?=\`)
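(?=console) means the string console must follow the match, but it is not included in the match.
(?<=\`) means a backtick must come immediately before the match, but it is not included in the match.
(?=\`) means a backtick must come immediately after the match, but it is not included in the match.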
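As a rough illustration of the three lookarounds, here is a minimal sketch that applies each half of the pattern to one line from the question. The variable names are only for illustration.

```python
import re

line = "To reproduce: https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/console"
# (?=console) requires "console" right after the match but leaves it out of the result.
url_match = re.search(r"https://ci\.adoptopenjdk\.net.+/(?=console)", line).group(0)
print(url_match)

quoted = "internal build `Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`"
# (?<=`) and (?=`) require the surrounding backticks but leave them out of the result.
name_match = re.search(r"(?<=`).+?/(?=`)", quoted).group(0)
print(name_match)

no_slash = "internal build `Test_openjdk11_j9_sanity.functional_aarch64_linux/46`"
# No "/" sits directly before the closing backtick here, so nothing matches.
missing = re.search(r"(?<=`).+?/(?=`)", no_slash)
print(missing)
```

The last case shows why the /46 entry does not appear in the results: the pattern insists on a trailing slash before the closing backtick.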
As for your Python function, you need to join the matches with \n and print the function's result:
import re

def regexify(s):
    # Match a CI URL up to the "/" before "console", or a
    # backtick-quoted name that ends in "/" (backticks excluded).
    pattern = r"(?:http|ftp|https):\/\/ci\.adoptopenjdk\.net.+\/(?=console)|(?<=\`).+?\/(?=\`)"
    matches = re.findall(pattern, s)
    # Join the matches with newlines; fall back to a single space if nothing matched.
    return '\n'.join(matches) or ' '

x = '''
https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/consoleFull
https://ci.adoptopenjdk.net/view/Test_openjdk/job/Test_openjdk11_hs_sanity.openjdk_x86-64_mac/26/consoleFull
To reproduce: https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/console
Jenkins Build URL: https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_extended.system_ppc64le_linux/259/
internal build `Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`
internal build `Test_openjdk11_j9_sanity.functional_aarch64_linux/46`
'''
print(regexify(x))
Output:
https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/
https://ci.adoptopenjdk.net/view/Test_openjdk/job/Test_openjdk11_hs_sanity.openjdk_x86-64_mac/26/
https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/
Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/
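On the second part of the question: the Test_openjdk{num} attempt requires Test_openjdk to come immediately after ://, but in every link the host comes first, so only the backtick branch can match. One possible way to drop the dependency on the host is sketched below. It is an assumption about what counts as a build link (any URL containing Test_openjdk with a numeric build segment right before console), and HOST_FREE_PATTERN, extract_builds, and the ci.example.org line are hypothetical names and data used only for illustration.

```python
import re

# Assumption: a build link is any URL containing "Test_openjdk" whose numeric
# build segment is followed by "console"; the host name is not checked at all.
HOST_FREE_PATTERN = re.compile(
    r"https?://\S*Test_openjdk\S*?/\d+/(?=console)"   # URL branch
    r"|(?<=`)Test_openjdk\d+[^`\s]*/\d+/(?=`)"        # backtick-quoted branch
)

def extract_builds(text):
    """Return every build link or backtick-quoted build name found in text."""
    return HOST_FREE_PATTERN.findall(text)

sample = "\n".join([
    "https://ci.adoptopenjdk.net/view/Test_openjdk/job/openjdk8_hs_openjdktest_s390x_linux/18/consoleFull",
    "To reproduce: https://ci.adoptopenjdk.net/job/Test_openjdk14_hs_sanity.openjdk_ppc64_aix/111/console",
    "Jenkins Build URL: https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_extended.system_ppc64le_linux/259/",
    "internal build `Test_openjdk8_j9_sanity.openjdk_aarch64_linux/43/`",
    "internal build `Test_openjdk11_j9_sanity.functional_aarch64_linux/46`",
    # Hypothetical host, to show that the pattern no longer depends on it.
    "https://ci.example.org/job/Test_openjdk17_hs_sanity.openjdk_x86-64_linux/7/console",
])

builds = extract_builds(sample)
print("\n".join(builds))
```

As with the pattern above, the /259/ link is skipped because nothing after it satisfies (?=console), and the /46 entry is skipped because it has no trailing slash. Using \S instead of . keeps each match from running past whitespace on the same line.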
https://stackoverflow.com/questions/68113258