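I have a question about regex. Suppose I have this string:
"She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "
I would like to remove every space that follows a period and precedes the character ”, and also remove the ” character itself.
For example, this part of the sentence
She was like an eating machine. ”Trump, a man who wants to be president:
should become
She was like an eating machine.Trump, a man who wants to be president:
Thanks, everyone; regex is not easy to learn. I appreciate your help! Bye.
P.S. I am using R, but I think that is irrelevant, since regex works in every programming language.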
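Update
I solved my problem and want to share the solution in case it helps someone else. I downloaded this dataset of Trump and Hillary tweets from Kaggle.
I had to do some cleaning before importing the data into KNIME (a university project). Apart from this one, I had already solved all the encoding problems with gsub. I finally managed to fix it by writing a CSV file from R with UTF-8 encoding. Naturally, I then read the file in KNIME using the same encoding.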
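The update does not show the code that was used, so the following is only a minimal sketch of what cleaning with gsub and writing a UTF-8 CSV might look like. The data frame `tweets`, its `text` column, and the temporary path `out_file` are hypothetical stand-ins, not the actual Kaggle data. The curly quotes are written as `\u201c` and `\u201d` escapes so that the script does not depend on how the source file itself is encoded.

```r
# Hypothetical sample standing in for the Kaggle tweets (not the real data).
tweets <- data.frame(
  text = "\u201cShe was like an eating machine. \u201dTrump, a man who wants to be president",
  stringsAsFactors = FALSE
)

# Clean the text column: drop whitespace plus the closing curly quote after a dot.
tweets$text <- gsub("\\.\\s+\u201d", ".", tweets$text)

# Write the CSV as UTF-8; out_file is a placeholder path for illustration.
out_file <- tempfile(fileext = ".csv")
write.csv(tweets, out_file, row.names = FALSE, fileEncoding = "UTF-8")

# Read it back with the same encoding, as any downstream tool would need to.
check <- read.csv(out_file, fileEncoding = "UTF-8", stringsAsFactors = FALSE)
```

The key point is that the reader must be told the same encoding that the writer used; in KNIME that would be the encoding setting of the file reader node.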
Answered 2016-12-21 14:03:57
If you need to match any number of whitespace characters (one or more) between the dot and the curly double quote, you can use
x <- "She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "
gsub("\\.\\s+”", ".", x)
## => [1] "She gained about 55 pounds in...9 months. She was like an eating machine.Trump, a man who wants to be president: "
Here \\. matches a literal dot, \\s+ matches one or more whitespace characters, and ” matches the closing curly quote.
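If there is always exactly one regular space between the dot and the quote, you can use a fixed-string replacement instead:
gsub(". ”", ".", x, fixed=TRUE)
See this R demo.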
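To make the difference between the two approaches concrete, here is a small sketch that compares them on inputs with one and with two spaces. The test strings and variable names are made up for illustration, and the curly quote is written as the escape `\u201d`.

```r
# Hypothetical inputs: one space and two spaces between the dot and the quote.
one_space  <- "machine. \u201dTrump"
two_spaces <- "machine.  \u201dTrump"

# Regex version: \\s+ consumes any run of one or more whitespace characters.
regex_one <- gsub("\\.\\s+\u201d", ".", one_space)
regex_two <- gsub("\\.\\s+\u201d", ".", two_spaces)

# Fixed-string version: matches only the literal sequence dot, one space, quote.
fixed_one <- gsub(". \u201d", ".", one_space, fixed = TRUE)
fixed_two <- gsub(". \u201d", ".", two_spaces, fixed = TRUE)  # no match, unchanged

# Without fixed = TRUE the unescaped "." is a regex metacharacter (any character),
# so the "!" here is replaced as well.
loose <- gsub(". \u201d", ".", "machine! \u201dTrump")
```

In this sketch `regex_one`, `regex_two`, and `fixed_one` all come out as "machine.Trump", while `fixed_two` keeps its two spaces and the quote. The last line shows why the dot has to be escaped (or `fixed = TRUE` used) when a literal period is meant.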
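Answered 2016-12-21 14:07:00
This might help:
var str = 'She was like an eating machine. "Trump, a man who wants to be president. "New value';
// replace() returns a new string, so assign the result; note that this pattern targets the straight quote " and exactly one whitespace character
str = str.replace(/\.\s"/g, ".");
Answered 2016-12-21 15:00:50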
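http://regexr.com/ is a good tool for learning and testing regular expressions.
The only thing I would add to Wiktor's answer is that his pattern does not match "machine.”Trump", where there is no space between the dot and the quote. To match any number of whitespace characters (including none) after the dot and before the quote, use the * quantifier: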
x <- "She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "
gsub("\\.\\s*”", ".", x)
https://stackoverflow.com/questions/41264545
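As a quick check of the + versus * point, the sketch below runs both quantifiers on a string with a space and on one without. The input strings and variable names are invented for illustration, and the curly quote is written as the escape `\u201d`.

```r
# Hypothetical inputs: with and without a space between the dot and the quote.
spaced   <- "machine. \u201dTrump"
unspaced <- "machine.\u201dTrump"

plus_pat <- "\\.\\s+\u201d"  # one or more whitespace characters
star_pat <- "\\.\\s*\u201d"  # zero or more whitespace characters

res_plus_spaced   <- gsub(plus_pat, ".", spaced)
res_plus_unspaced <- gsub(plus_pat, ".", unspaced)  # no match: the quote stays
res_star_spaced   <- gsub(star_pat, ".", spaced)
res_star_unspaced <- gsub(star_pat, ".", unspaced)  # quote is removed
```

Only `res_plus_unspaced` keeps the quote; the other three results are "machine.Trump", which suggests the * version is the safer choice when the spacing in the data is inconsistent.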