How to get the indices of substrings in a Python string, in order, when some substrings repeat
Example 1:
String:
"This is paragraph one"
Substring list: ["This", "is paragraph", "one"]
I need to return the index range of each substring.
Result: [[0, 4], [5, 17], [18, 21]]
Example 2 (there may be extra whitespace, and substrings may repeat):
String:
"This is a book       a book."
Substring list: ["This", "is a", "book", "a", "book"]
Result: [[0, 4], [5, 9], [10, 14], [21, 22], [23, 27]]
Solution
You can try the following, which resumes each search where the previous match ended:
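s = "This is a book       a book."
subs = ["This", "is a", "book", "a", "book"]

bounds = []
end = 0  # each search resumes where the previous match ended
for sub in subs:
    start = s.find(sub, end)
    if start == -1:
        raise ValueError(f"{sub!r} not found at or after index {end}")
    end = start + len(sub)
    bounds.append((start, end))
print(bounds)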
It gives:
[(0, 4), (5, 9), (10, 14), (21, 22), (23, 27)]
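To see why the search position has to move forward, here is a small sketch that contrasts a naive lookup (every search starts at index 0) with the ordered one. It assumes the example string contains seven spaces between the first "book" and the second "a", which is what the indices above imply; the names `naive`, `ordered` and `pos` are only for illustration.

```python
s = "This is a book       a book."
subs = ["This", "is a", "book", "a", "book"]

# Naive version: every search starts at index 0, so a repeated
# substring always resolves to its first occurrence.
naive = [(s.find(sub), s.find(sub) + len(sub)) for sub in subs]

# Ordered version: each search starts where the previous match ended.
ordered = []
pos = 0
for sub in subs:
    start = s.find(sub, pos)
    if start == -1:
        break  # assumption: stop at the first substring that cannot be found
    pos = start + len(sub)
    ordered.append((start, pos))

print(naive)    # [(0, 4), (5, 9), (10, 14), (8, 9), (10, 14)]
print(ordered)  # [(0, 4), (5, 9), (10, 14), (21, 22), (23, 27)]
```

In the naive result the lone "a" is found inside "is a" and the second "book" collapses onto the first, which is exactly the duplicate problem described in the question.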
For fun, the same using re:
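import re
s = "This is a book       a book."
subs = ["This", "is a", "book", "a", "book"]
# Escape each substring and join with a lazy ".*?" so every group takes the earliest match
pattern = ".*?".join(f"({re.escape(t)})" for t in subs)
m = re.search(pattern, s, re.DOTALL)
print([m.span(i) for i in range(1, len(subs) + 1)])
It gives:
[(0, 4), (5, 9), (10, 14), (21, 22), (23, 27)]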
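The quantifier placed between the groups decides which occurrence of a repeated substring gets captured. The sketch below compares a greedy `.*` joiner with a lazy `.*?` one on a shortened substring list; the helper name `spans` is hypothetical, and the string again assumes seven spaces after the first "book".

```python
import re

s = "This is a book       a book."
subs = ["This", "book"]

def spans(joiner):
    # re.escape keeps characters such as "." or "(" in a substring literal
    pattern = joiner.join(f"({re.escape(t)})" for t in subs)
    m = re.search(pattern, s, re.DOTALL)  # DOTALL lets the gap span newlines
    return [m.span(i) for i in range(1, len(subs) + 1)] if m else None

greedy = spans(".*")   # [(0, 4), (23, 27)]: captures the last "book"
lazy = spans(".*?")    # [(0, 4), (10, 14)]: captures the first "book" after "This"
print(greedy, lazy)
```

If the goal is the earliest match for each substring, as in the loop-based answer, the lazy form is the closer equivalent.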
You can also use a generator function:
def get_matches(s, subs):
    end = 0  # index just past the previous match
    for sub in subs:
        # candidate start positions at or after the end of the previous match
        k = [j for j in range(end, len(s)) if s.startswith(sub, j)]
        if k:
            end = k[0] + len(sub)
            yield [k[0], end]

s = 'This is a book       a book.'
subs = ['This', 'is a', 'book', 'a', 'book']
print(list(get_matches(s, subs)))
Output:
[[0, 4], [5, 9], [10, 14], [21, 22], [23, 27]]
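Whichever approach you pick, a quick sanity check is to slice the string with each returned pair and compare the slice with the substring. A minimal sketch, where `check_spans` is a hypothetical helper and the string assumes seven spaces after the first "book":

```python
def check_spans(s, subs, spans):
    """Return True if every (start, end) pair slices out its substring
    and the pairs run left to right without overlapping."""
    if len(spans) != len(subs):
        return False
    prev_end = 0
    for sub, (start, end) in zip(subs, spans):
        if start < prev_end or s[start:end] != sub:
            return False
        prev_end = end
    return True

s = "This is a book       a book."
subs = ["This", "is a", "book", "a", "book"]
print(check_spans(s, subs, [[0, 4], [5, 9], [10, 14], [21, 22], [23, 27]]))  # True
print(check_spans(s, subs, [[0, 4], [5, 9], [10, 14], [8, 9], [10, 14]]))    # False
```

The second call fails because the pair [8, 9] starts before the previous match ended, which is what a search that always starts at index 0 would produce.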