How to use regexp to search for separated sequences?

Ank Agarwal on 12 Dec 2016
Closed: MATLAB Answer Bot on 20 Aug 2021
Say I have a text file with a long run of random letters, and I want to search it for sequences of the form ABBA[AGTC]DABA. ABBA and DABA must sit at the two ends, with a variable sequence in between. That middle part (GAGA, for example) can be 0, 4, or 6 letters long and can be any combination of the letters A, G, T, and C.
Any idea how to search for such sequences using regexp? Maybe another search command?
  2 Comments
Star Strider on 12 Dec 2016
Is the ‘variable sequence in the middle’ always enclosed within square brackets []?
Can you provide a sample sequence, or a file with some sample sequences, in the format you expect to use them?
per isakson on 12 Dec 2016
Edited: per isakson on 12 Dec 2016
Are you looking for something like this?
>> regexp( 'ABBAAGTGTCDABA', 'ABBA([AGTC]{0}|[AGTC]{4}|[AGTC]{6})DABA', 'match' )
ans =
'ABBAAGTGTCDABA'
>> regexp( 'ABBADABA', 'ABBA([AGTC]{0}|[AGTC]{4}|[AGTC]{6})DABA', 'match' )
ans =
'ABBADABA'
>> regexp( 'xxxABBAAGTGTCDABAzzz', 'ABBA([AGTC]{0}|[AGTC]{4}|[AGTC]{6})DABA', 'match' )
ans =
'ABBAAGTGTCDABA'
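To scan a longer text for every occurrence at once, the same idea might be sketched as below. This is only a sketch under assumptions: the sample string txt is made up, the pattern is a slightly more compact variant of the one above (the middle part is captured as a token so its length can be checked), and for a real file something like fileread with your own file name would supply txt instead.

```matlab
% Made-up sample text. For a real file, txt = fileread('sequences.txt')
% could be used instead (the file name here is hypothetical).
txt = 'xxABBADABAyyABBAGAGADABAzzABBAAGTGTCDABAqqABBAAGDABA';

% ABBA, then a middle part of exactly 0, 4, or 6 letters from A, G, T, C,
% then DABA. The outer parentheses capture the middle part as a token.
pat = 'ABBA((?:[AGTC]{6}|[AGTC]{4})?)DABA';

% 'match'  -> the full matched sequences
% 'tokens' -> the captured middle part of each match
% 'start'  -> the 1-based position of each match in txt
[hits, mids, starts] = regexp(txt, pat, 'match', 'tokens', 'start');

% Length of the middle part of each match
midLen = cellfun(@(t) numel(t{1}), mids);

for k = 1:numel(hits)
    fprintf('%d: %s (middle length %d)\n', starts(k), hits{k}, midLen(k));
end
```

The last candidate in the sample, ABBAAGDABA, has a 2-letter middle and is skipped. Note that regexp reports non-overlapping matches only, so a hit that begins inside an earlier hit would not be listed.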
