Join lines together in a file when they end with '\'

I am really new to Python and for some reason this has stumped me for a while, so I figured I'd ask for help.

I am working on a Python script that reads in my files, and if a line ends with '\', it should be joined with the line after it.

So if the lines are as follows:

: Student 1 
: Student 2 \
Student 3

Any line that doesn't start with a colon, and whose previous line ends with '\', should be combined with that previous line so the result looks like this:

: Student 2 Student 3

Here is what I tried:

s = ""    
if line.endswith('\\'): 
   s.join(line) ## line being the line read from the file

Any help in the right direction would be great.

s.join doesn't do what you think it does. Also, each line read from the file ends with a newline character ('\n'), so endswith('\\') probably won't match.
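
A quick way to see the newline problem, as a small sketch (the sample line is made up to mimic what iterating over a file gives you):

```python
line = ": Student 2 \\\n"  # made-up line as read from a file, newline included

# The last character is '\n', not '\', so the plain check fails.
print(line.endswith('\\'))           # False

# Stripping trailing whitespace first lets the check see the backslash.
print(line.rstrip().endswith('\\'))  # True
```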

3 Answers

6

If you can read the full file without splitting it into lines, you can use a regex:

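import re

# Raw string, so the backslash-newline stays in the text instead of
# being treated as a line continuation inside the string literal.
text = r"""
: Student 1 
: Student 2 \
Student 3
""".strip()

# Remove the backslash, any spaces or tabs after it, and the newline,
# but only if the next line does not start with ':'.
print(re.sub(r'\\[ \t]*\n(?!:)', '', text))

: Student 1 
: Student 2 Student 3

The regex matches a \ followed by optional spaces or tabs and a newline, provided the next line does not start with a :. The negative lookahead (?!:) checks that character without consuming it, so the first letter of the joined line is kept.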
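
To apply this to an actual file, something along these lines might work. It is only a sketch: join_continued_lines is a made-up helper name, the sample file is written to a temporary directory purely for illustration, and the pattern uses a lookahead so that a continuation followed by a line starting with ':' is left alone.

```python
import re
import tempfile
from pathlib import Path

def join_continued_lines(path):
    """Read the whole file and join lines that end with a backslash."""
    text = Path(path).read_text()
    # Backslash, optional spaces/tabs, newline; skip if the next line starts with ':'.
    return re.sub(r'\\[ \t]*\n(?!:)', '', text)

# Made-up sample data, written to a temporary file for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "students.txt"
    sample.write_text(
        ": Student 1 \n: Student 2 \\\nStudent 3\n: Student 4 \\\n: Student 5\n"
    )
    result = join_continued_lines(sample)

print(result)
```

In this sample, "Student 3" is pulled up onto the "Student 2" line, while the backslash before ": Student 5" is left untouched because the next line starts with a colon.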

0

You can use a regex and join to avoid a loop if you start with a list of strings. This assumes the strings themselves contain no underscore.

import re

l = ['a\\', 'b', 'c']
s = '_'.join(l)
# Negative lookbehind: split only on underscores that are not preceded by `\`.
lx = re.split(r'(?<!\\)_', s)
# Drop the remaining `\_` pairs (replace with ' ' instead of '' if you want a space).
print([e.replace('\\_', '') for e in lx])

Output:

['ab', 'c']
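
The underscore separator breaks as soon as one of the strings already contains an underscore. A rough sketch of the same idea using '\n' as the separator instead, assuming the strings have no embedded newlines (the input list is made up for illustration):

```python
lines = ['a\\', 'b', 'c_d']  # made-up input; note the underscore in the last item

# Join with newlines, delete every backslash-newline pair, then split again.
merged = '\n'.join(lines).replace('\\\n', '').split('\n')
print(merged)  # ['ab', 'c_d']
```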

13

s.join doesn't do what you think it does. Also, each line read from the file ends with a newline character ('\n'), so .endswith('\\') won't match.

Something like this should work (although it takes a somewhat different approach):

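output = ''
with open('/path/to/file.txt') as f:
    for line in f:
        # rstrip() drops the trailing newline so endswith() can see the backslash
        if line.rstrip().endswith('\\'):
            # pull the following line; '' avoids StopIteration at end of file
            next_line = next(f, '')
            # drop the trailing whitespace and the backslash, then append
            line = line.rstrip()[:-1] + next_line
        output += line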

In the above, we use line.rstrip() to get rid of any trailing whitespace (including the newline character) so that the .endswith method matches properly.

If a line ends with \, we pull the next line out of the file iterator using the built-in function next.

Finally, we combine the two lines: we strip the trailing whitespace again (.rstrip()), drop the \ character ([:-1] means all characters up to, but not including, the last one), append the next line, and add the result to the output.

The resulting string prints out like so:

: Student 1 
: Student 2 Student 3
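
The loop above joins only one following line. Here is a sketch of a variant, assuming you also want to handle several continued lines in a row and a backslash on the very last line. join_continued is a made-up helper name, and it ignores the colon rule from the question:

```python
def join_continued(lines):
    # Yield logical lines, merging every line that ends with a backslash
    # into the line(s) that follow it.
    pending = ''
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith('\\'):
            # Keep the text before the backslash and wait for the next line.
            pending += stripped[:-1]
        else:
            yield pending + line
            pending = ''
    if pending:
        # The input ended on a continuation; emit what was collected.
        yield pending

# Made-up sample lines, shaped like what iterating over a file produces.
sample = [": Student 1 \n", ": Student 2 \\\n", "Student 3 \\\n", "Student 4\n"]
print(''.join(join_continued(sample)))
```

Because it accepts any iterable of lines, you can pass an open file object directly, for example join_continued(f) inside a with open(...) block.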

A note about s.join: it's probably best explained as the opposite of split, using s as the separator (or joining) string. It returns a new string and leaves s unchanged.

>>> "foo.bar.baz".split('.')
['foo', 'bar', 'baz']
>>> "|".join(['foo', 'bar', 'baz'])
'foo|bar|baz'
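
That also explains why the attempt in the question has no visible effect. A small sketch (the value of line is made up):

```python
s = ""
line = "abc"  # made-up stand-in for a line read from the file

# A string is an iterable of characters, so join puts s between each character.
result = s.join(line)

print(repr(s))         # '' (s itself is never modified)
print(repr(result))    # 'abc' (a new string; with an empty separator it equals line)
print("-".join(line))  # a-b-c
```

To accumulate text you have to assign the result yourself, for example s = s + line or s += line.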
