How to capture the longest sequence of a group

Problem description

The task is to find the longest sequence of a group

For instance, given the DNA sequence "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC", which has 7 occurrences of AGATC, the pattern (AGATC) matches every occurrence. Is it possible to write a regular expression that catches only the longest sequence, i.e. AGATCAGATCAGATCAGATCAGATC in the given text? If this is not possible with regex alone, how can I iterate through each sequence (i.e. the 1st sequence is AGATCAGATC, the 2nd is AGATCAGATCAGATCAGATCAGATC, et cetera) in Python?

Recommended answer

Use:

import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
matches = re.findall(r'(?:AGATC)+', sequence)

# To find the longest subsequence
longest = max(matches, key=len)

Explanation:

Non-capturing group (?:AGATC)+

+ Quantifier: matches between one and unlimited times, as many times as possible (greedy).
AGATC matches the characters AGATC literally (case sensitive).
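The choice of a non-capturing group matters for re.findall: when a pattern contains a single capturing group, findall returns the text of that group rather than the whole match. The short sketch below contrasts the two forms on the same sequence; the variable names capturing and non_capturing are only illustrative.

```python
import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"

# With a capturing group, findall returns the group's text
# (its last repetition), not the full run that was matched.
capturing = re.findall(r'(AGATC)+', sequence)

# With a non-capturing group, findall returns each full run.
non_capturing = re.findall(r'(?:AGATC)+', sequence)

print(capturing)      # ['AGATC', 'AGATC']
print(non_capturing)  # ['AGATCAGATC', 'AGATCAGATCAGATCAGATCAGATC']
```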

Result:

# print(matches)
['AGATCAGATC', 'AGATCAGATCAGATCAGATCAGATC']

# print(longest)
'AGATCAGATCAGATCAGATCAGATC'

You can test the regex here.
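The question also asks how to iterate through each sequence. One possible approach, sketched here rather than taken from the answer, is re.finditer, which yields a match object per run so that the start position and repeat count are available. The names motif, pattern, and runs are assumptions introduced for illustration, and re.escape is used only so that the motif is treated literally.

```python
import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
motif = "AGATC"  # illustrative name for the repeated unit

# Build the same pattern as above; re.escape keeps the motif literal.
pattern = re.compile(rf'(?:{re.escape(motif)})+')

runs = []
for m in pattern.finditer(sequence):
    # Each match is one consecutive run of the motif.
    repeats = len(m.group()) // len(motif)
    runs.append((m.start(), repeats, m.group()))
    print(m.start(), repeats, m.group())

# Pick the run with the most repeats; None if the motif never occurs.
longest = max(runs, key=lambda run: run[1], default=None)
print(longest)
```

For the sequence above, this reports a run of 2 repeats starting at index 0 and a run of 5 repeats starting at index 35.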
