I am really new to Python, and for some reason this has stumped me for a while, so I figured I'd ask for help.
I am working on a Python script that reads in my files, and if there is a '\' at the end of a line, it should join that line with the line after it.
So if the lines are as follows:
    : Student 1
    : Student 2 \
    Student 3
If a line does not start with a colon and the previous line ends with '\', I want to combine the two so they look like this:
: Student 2 Student 3
Here is what I tried:
    s = ""
    if line.endswith('\\'):
        s.join(line)  ## line being the line read from the file
Any help in the right direction would be great.
If you can read the full file without splitting it into lines, you can use a regex:
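    import re

    # Raw string, so the backslash and the newline after it are kept literally
    text = r"""
    : Student 1
    : Student 2 \
    Student 3
    """.strip().replace('\n    ', '\n')

    # The lookahead (?!:) checks the next line without consuming its first character
    print(re.sub(r'[ \t]*\\[ \t]*\n(?!:)', ' ', text))

Output:

    : Student 1
    : Student 2 Student 3

The regex matches occurrences of \ (with any blanks around it) followed by a newline, but only when the next line does not start with a colon.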
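To tie the regex back to an actual file, here is a minimal sketch that reads the whole file with read() and then applies the substitution. The function name join_continued_file and the temporary demo file are assumptions made up for illustration, and the pattern uses a lookahead so the first character of the continued line is not consumed.

```python
import os
import re
import tempfile

# Hypothetical helper name, not part of the original answer.
def join_continued_file(path):
    """Read the whole file and join backslash-continued lines."""
    with open(path) as f:
        text = f.read()
    # blanks, backslash, blanks, newline; only if the next line does not start with ':'
    return re.sub(r'[ \t]*\\[ \t]*\n(?!:)', ' ', text)

# Demo with a throwaway file holding the sample data from the question.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(": Student 1\n: Student 2 \\\nStudent 3\n")
    demo_path = tmp.name

result = join_continued_file(demo_path)
os.remove(demo_path)
print(result, end='')
```

This only makes sense when the file is small enough to hold in memory as a single string.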
You can use join to avoid a loop if you start with a list of strings.

    import re
    l = ['a\\', 'b', 'c']
    s = '_'.join(l)
    lx = re.split(r'(?<!\\)_', s)  # negative lookbehind: split only on underscores with no `\` before them
    [e.replace('\\_', '') for e in lx]  # replace with '', or with ' ' if you need a space; gives ['ab', 'c']
s.join doesn't do what you think it does. Also consider that each line read from the file ends with a newline character (\n), so .endswith('\\') won't match.
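A quick check makes the newline problem visible. The string below is an assumed stand-in for what iterating over the file would yield for the second line of the question's sample.

```python
# Stand-in for a line as read from the file: it still carries its trailing newline.
line = ": Student 2 \\\n"

ends_raw = line.endswith('\\')                # last character is '\n', not '\'
ends_stripped = line.rstrip().endswith('\\')  # trailing whitespace removed first

print(ends_raw)       # False
print(ends_stripped)  # True
```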
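Something like this should work (although it uses a somewhat different method):

    output = ''
    with open('/path/to/file.txt') as f:
        for line in f:
            if line.rstrip().endswith('\\'):
                # pull the following line; '' if the file ends on a continuation
                next_line = next(f, '')
                # drop trailing whitespace and the backslash, then append the next line
                line = line.rstrip()[:-1] + next_line
            output += line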
In the above, we used line.rstrip() to get rid of any trailing whitespace (including the newline character) so that the .endswith method would match properly.
If a line ends with \, we go ahead and pull the next line out of the file iterator using the builtin function next.
Finally, we combine the line and the next line, taking care to once again remove the trailing whitespace (.rstrip()) and the \ character ([:-1] means all characters up to the last one), and add the result to the output.
The resulting string prints out like so:

    : Student 1
    : Student 2 Student 3
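The loop above joins only one following line per backslash. If several lines in a row can end with a backslash, one possible variation is a small generator that keeps collecting until it reaches a line without one. This is a sketch for Python 3; the name join_continued_lines is hypothetical, and like the loop above it does not check for the leading colon.

```python
# Hypothetical helper name, not part of the original answer.
def join_continued_lines(lines):
    """Yield logical lines, merging each line that ends in a backslash
    with the line(s) that follow it."""
    pending = ''
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith('\\'):
            # Drop the backslash and keep collecting.
            pending += stripped[:-1]
        else:
            yield pending + line
            pending = ''
    if pending:
        # Input ended on a continuation line; emit what was collected.
        yield pending

sample = [": Student 1\n", ": Student 2 \\\n", "Student 3\n"]
joined = list(join_continued_lines(sample))
print(''.join(joined), end='')
```

Because it accepts any iterable of lines, an open file object can be passed in directly in place of the sample list.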
s.join... It's probably best explained as the opposite of split, using s as the separator (or joining) character.

    >>> "foo.bar.baz".split('.')
    ['foo', 'bar', 'baz']
    >>> "|".join(['foo', 'bar', 'baz'])
    'foo|bar|baz'
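Applied to the attempt in the question, this also shows why s never changed. A minimal sketch, using "abc" as an assumed stand-in for the line read from the file:

```python
s = ""
line = "abc"  # stand-in for a line read from the file

# join treats line as a sequence of characters and puts s ("") between them.
result = s.join(line)
print(result)   # abc
print(repr(s))  # '' because strings are immutable and join returns a new string

# To accumulate text, assign the result back instead of discarding it:
s += line
print(s)        # abc
```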