Find words between two phrases

1587 views python
10

I am trying to extract the words between two phrases. For example, assume I have the following paragraph:

One day after they had made porridge for their breakfast they walked out into the wood while the porridge was cooling And while they were walking a little girl came into the house This little girl had golden curls that tumbled down her back to her waist and everyone called her by Goldilocks.

I would like to get all the words between "porridge for" and "golden curls", as well as the 2 words before the first phrase and the 2 words after the second.

Is there an easy way to do so? I tried getting the index of the start of each phrase, but it led to quite lengthy code.


But wouldn't that give me words, not a phrase?

One liner: txt.replace('porridge for', '~').replace('golden curls', '~').split('~')

@Joe124, can you provide the desired output?

1 Answer

13

You could use regular expressions:

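import re

# `text` holds the paragraph from the question.
match = re.search(r'(\w+ \w+) porridge for (.+) golden curls (\w+ \w+)', text)
if match:  # re.search returns None when the phrases are not found
    whole_match = match.group(0)       # everything, including the context words
    two_words_before = match.group(1)  # 'had made'
    phrase_in_middle = match.group(2)  # the words between the two phrases
    two_words_after = match.group(3)   # 'that tumbled'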
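
If the phrases change often, the same idea can be wrapped in a small helper. This is only a sketch, not part of the original answer: the function name words_between and its context parameter are made up for illustration, and it assumes context is at least 1 and that "words" are runs of \w characters separated by whitespace. It escapes the phrases with re.escape so punctuation in them is matched literally, and uses a non-greedy middle group so it stops at the first occurrence of the end phrase.

```python
import re

def words_between(text, start, end, context=2):
    """Return (words_before, middle, words_after), or None if nothing matches.

    Hypothetical helper: `context` is the number of words to capture on each
    side and is assumed to be >= 1.
    """
    # A run of exactly `context` words separated by whitespace.
    ctx = r'\w+(?:\s+\w+){%d}' % (context - 1)
    pattern = r'\b(%s)\s+%s\s+(.+?)\s+%s\s+(%s)' % (
        ctx, re.escape(start), re.escape(end), ctx)
    # DOTALL lets the middle part span line breaks.
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        return None
    return match.groups()
```

Calling words_between(text, 'porridge for', 'golden curls') on the paragraph from the question should give a tuple whose first item is 'had made' and whose last item is 'that tumbled'. A None result means one of the phrases was missing or did not have enough words around it.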

