Regular Expression in Python to Replace Items in a List

This is for an exercise we are doing at uni. I am trying to find every k1-k9 and p1-p9 string in a text file and replace each one so that k(n) becomes n ones and p(n) becomes n zeros (e.g. p5 = 00000, k3 = 111, p2 = 00). I have managed to collect the k1-k9 and p1-p9 matches in a list called codes, but I don't know how to proceed.

import re

with open("suspicious_knitting.txt") as file:
    string = file.read()
    codes = re.findall("k[1-9]|p[1-9]", string)

Printing codes gives: ['k1', 'p1', 'k1', 'p1', 'k1', 'p2', 'k1', 'p2', 'k1', 'p3', 'k1', 'p3', 'k1', 'p1', 'k2', 'p1', 'k2', 'p3', 'k1', 'p2', 'k2', 'p1', 'k2', 'p1', 'k1', 'p1', 'k1', 'p1', 'k2', 'p2', 'k3', 'p1', 'k1', 'p2', 'k1', 'p2', 'k2', 'p1', 'k1', 'p1', 'k1', 'p2', 'k1', 'p2', 'k1', 'p2', 'k2', 'p2', 'k5', 'p2', 'k3', 'p1', 'k1', 'p1', 'k1', 'p2', 'k3', 'p1', 'k2', 'p3']

Could you add a sample input?

I think you need re.sub("([kp])([1-9])", lambda x: "0" * int(x.group(2)) if x.group(1) == 'p' else "1" * int(x.group(2)), s), where s is the text read from the file.

1 Answer

You could use re.sub:

import re

text = ' '.join(
    ['k1', 'p1', 'k1', 'p1', 'k1', 'p2', 'k1', 'p2', 'k1', 'p3', 'k1', 'p3', 'k1', 'p1', 'k2', 'p1', 'k2', 'p3',
     'k1', 'p2', 'k2', 'p1', 'k2', 'p1', 'k1', 'p1', 'k1', 'p1', 'k2', 'p2', 'k3', 'p1', 'k1', 'p2', 'k1', 'p2',
     'k2', 'p1', 'k1', 'p1', 'k1', 'p2', 'k1', 'p2', 'k1', 'p2', 'k2', 'p2', 'k5', 'p2', 'k3', 'p1', 'k1', 'p1',
     'k1', 'p2', 'k3', 'p1', 'k2', 'p3'])


def repl(match):
    return int(match.group(2)) * match.group(1)


result = re.sub('([kp])([1-9])', repl, text)
print(result)

Output

k p k p k pp k pp k ppp k ppp k p kk p kk ppp k pp kk p kk p k p k p kk pp kkk p k pp k pp kk p k p k pp k pp k pp kk pp kkkkk pp kkk p k p k pp kkk p kk ppp

From the documentation of sub:

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.

It turns out that repl can be a function that receives a match object. Here, repl takes the second group (the number of repetitions), casts it to int, and multiplies the first group (the letter k or p) by it.
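
Note that the snippet above repeats the letter itself, while the question asks for ones in place of k and zeros in place of p. A minimal sketch of that variant follows, assuming a small lookup table from letter to digit. The names DIGIT_FOR, to_bits and sample are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical lookup table (an assumption, not from the answer):
# knit stitches become "1", purl stitches become "0".
DIGIT_FOR = {"k": "1", "p": "0"}


def to_bits(match):
    # group(1) is the letter (k or p), group(2) is the repeat count.
    return DIGIT_FOR[match.group(1)] * int(match.group(2))


# Short made-up sample; the real input would be the text read from the file.
sample = "k1 p1 k3 p5 k2"
encoded = re.sub(r"([kp])([1-9])", to_bits, sample)
print(encoded)  # 1 0 111 00000 11
```

Because re.sub works on the whole string, it should be possible to pass the file contents straight in, without building the codes list first.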
