
How to capture the longest sequence of a group

Problem description


The task is to find the longest consecutive run of a repeated group.


For instance, given the DNA sequence "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC", which contains 7 occurrences of AGATC, the pattern (AGATC) matches every occurrence. Is it possible to write a regular expression that catches only the longest sequence, i.e. AGATCAGATCAGATCAGATCAGATC in the given text? If this is not possible with regex alone, how can I iterate through each sequence (i.e. the 1st sequence is AGATCAGATC, the 2nd is AGATCAGATCAGATCAGATCAGATC, et cetera) in Python?

Recommended answer

Use:

import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
matches = re.findall(r'(?:AGATC)+', sequence)

# To find the longest subsequence
longest = max(matches, key=len)

Explanation:

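Non-capturing group (?:AGATC)+. The group is non-capturing because re.findall returns the text of the captured group, not the whole match, when the pattern contains a capturing group.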

  • + Quantifier — Matches between one and unlimited times, as many times as possible.
  • AGATC matches the characters AGATC literally (case sensitive)
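To illustrate why the group is written as non-capturing, here is a small sketch comparing the two forms with re.findall on the sequence from the question. The variable names capturing and non_capturing are only illustrative and are not part of the original answer.

```python
import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"

# Capturing group: findall returns what the group captured,
# which is only the last repetition of each run.
capturing = re.findall(r'(AGATC)+', sequence)

# Non-capturing group: findall returns the full text of each run.
non_capturing = re.findall(r'(?:AGATC)+', sequence)

print(capturing)      # ['AGATC', 'AGATC']
print(non_capturing)  # ['AGATCAGATC', 'AGATCAGATCAGATCAGATCAGATC']
```

With the capturing form, the run lengths are lost, so max(..., key=len) could no longer pick out the longest run.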

Result:

# print(matches)
['AGATCAGATC', 'AGATCAGATCAGATCAGATCAGATC']

# print(longest)
'AGATCAGATCAGATCAGATCAGATC'
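The question also asks how to iterate through each sequence. One possible sketch uses re.finditer, which yields a match object per run, so the position and repeat count are available as well as the text. The helper name find_runs and its motif parameter are hypothetical, introduced here for illustration; re.escape is used on the assumption that the motif should be matched literally.

```python
import re

def find_runs(sequence, motif):
    """Return (start, repeat_count, text) for each consecutive run of motif."""
    # re.escape keeps the motif literal even if it contains regex metacharacters.
    pattern = re.compile(r'(?:%s)+' % re.escape(motif))
    return [
        (m.start(), len(m.group()) // len(motif), m.group())
        for m in pattern.finditer(sequence)
    ]

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
runs = find_runs(sequence, "AGATC")

# Iterate through the runs in the order they appear in the sequence.
for start, count, text in runs:
    print(start, count, text)

# Pick the run with the most repeats.
longest = max(runs, key=lambda run: run[1])
```

Note that max raises ValueError when no run is found; passing default=None to max is one way to handle a sequence that does not contain the motif.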


You can test the regex here.
