Grouping comma-separated lines together












I have comma-delimited files like these, where the first field is sorted in increasing order:



Case 1 (first file):



abcd,1
abcd,21
abcd,122
abce,12
abcf,13
abcf,21


Case 2 (another file):



abcd,1
abcd,21
abcd,122


I want to convert the first file to this:



abcd 1,21,122
abce 12
abcf 13,21


And similarly, the second file to this:



abcd 1,21,122


I wrote some very ugly code with a lot of ifs to check whether the string before the comma on the next line is the same as on the current line; if it is, then do ...



It's so badly written that, although I wrote it myself around 6 months ago, it took me 3-4 minutes to understand why I did what I did.
In short, it's ugly. In case you would like to see it, here it is. (There is also a bug in it, which I didn't sort out because I wanted a better way than this whole code anyway. For the curious, the bug is that it doesn't print anything for the second case mentioned above, and I know why.)



    def clean_file(filePath, destination):
        f = open(filePath, 'r')
        data = f.read()
        f.close()
        curr_string = current_number = next_string = next_number = ""
        current_numbers = ""
        final_payload = ""
        lines = data.split('\n')[:-1]
        for i in range(len(lines)-1):
            print(lines[i])
            curr_line = lines[i]
            next_line = lines[i+1]
            curr_string, current_number = curr_line.split(',')
            next_string, next_number = next_line.split(',')
            if curr_string == next_string:
                current_numbers += current_number + ","
            else:
                current_numbers += current_number  # check to avoid ',' in the end
                final_payload += curr_string + " " + current_numbers + "\n"
                current_numbers = ""
        print(final_payload)
        # For last line
        if curr_string != next_string:
            # Directly add it to the final_payload
            final_payload += next_line + "\n"
        else:
            # Remove the newline, add a comma and then finally add a newline
            final_payload = final_payload[:-1] + "," + next_number + "\n"
        with open(destination, 'a') as f:
            f.write(final_payload)



Any better solutions?







python csv














edited Jan 13 at 19:02









Mast











asked Jan 13 at 17:31









temporarya





migrated from stackoverflow.com Jan 13 at 17:33


This question came from our site for professional and enthusiast programmers.



















  • Please do not update the code in your question to incorporate feedback from answers; doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
    – Mast
    Jan 13 at 19:02














2 Answers
  1. To solve the grouping problem, use itertools.groupby.

  2. To read files with comma-separated fields, use the csv module.


  3. In almost all cases, open() should be called using a with block, so that the files will be automatically closed for you, even if an exception occurs within the block:



    with open(file_path) as in_f, open(destination, 'w') as out_f:
        data = csv.reader(in_f)
        # code goes here



  4. filePath violates PEP 8, Python's official style guide, which recommends lowercase names with underscores (file_path), like your curr_line.
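
Points 1 to 3 above can be combined into a short sketch. The function name group_lines is a placeholder chosen for illustration, not something from the question. The sketch assumes every non-empty row has exactly two fields and that equal keys are already adjacent (as in the sorted input), because itertools.groupby only groups consecutive items.

```python
import csv
from itertools import groupby
from operator import itemgetter


def group_lines(file_path, destination):
    """Write one 'key v1,v2,...' line per run of equal first fields.

    Hypothetical helper: assumes two fields per row and adjacent equal keys.
    """
    # newline='' is the mode the csv documentation recommends for reader input.
    with open(file_path, newline='') as in_f, open(destination, 'w') as out_f:
        # csv.reader yields an empty list for a blank line, so skip those.
        rows = (row for row in csv.reader(in_f) if row)
        # groupby yields (key, iterator over consecutive rows sharing that key).
        for key, group in groupby(rows, key=itemgetter(0)):
            values = ','.join(row[1] for row in group)
            out_f.write(f'{key} {values}\n')
```

Because each group is written as soon as it ends, there is no special handling for the last line, which is the part the original code struggled with.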















edited Jan 13 at 20:52

























answered Jan 13 at 18:00









200_success









  • Thanks a lot, I got it using those two.
    – temporarya
    Jan 13 at 18:25














While @200_success's answer is very good (always use libraries that solve your problem), I'm going to give an answer that illustrates how to think about more general problems in case there isn't a perfect library.



Use with to automatically close files when you're done



You risk leaving a file open if an exception is raised and file.close() is never called.



    with open(input_file) as in_file:
        pass  # work with in_file here; it is closed automatically when the block ends
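
To make the point concrete, here is a small sketch showing that a with block closes the file even when the code inside it raises. The temporary file and the simulated ValueError are assumptions made purely for the demonstration; they are not part of the question.

```python
import os
import tempfile

# Create a small throwaway input file for the demonstration.
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('abcd,1\n')

try:
    with open(path) as in_file:
        in_file.readline()
        # Simulate something going wrong in the middle of processing.
        raise ValueError('simulated failure while processing')
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
file_was_closed = in_file.closed

os.remove(path)
```

With a bare open() followed by f.close(), the same exception would skip the close() call unless it were wrapped in try/finally by hand.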


Use the object to iterate, not indices



Most collections, including file objects (which yield one line at a time), can be iterated over directly, so you don't need indices.



    with open(input_file) as in_file:
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line


Use data structures to organize your data



In the end, you want to associate a letter-string with a list of numbers. In Python, a dict lets you associate one piece of data (the key) with another (the value), so we'll use one to map each letter-string to its list of numbers.



    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, number = line.split(',')
            data[letters].append(number)


This doesn't quite work yet: if a letters entry hasn't been seen before, data[letters] has nothing to return and raises a KeyError. So we have to account for that:



    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, number = line.split(',')
            try:  # there might be an error
                data[letters].append(number)  # append new number if letters has been seen before
            except KeyError:
                data[letters] = [number]  # create new list with one number for a new letter-string


Now the whole file is stored in a convenient form in the data object. To output it, just loop through the data:



    with open(input_file) as in_file:
        data = dict()
        for line in in_file:
            line = line.strip()  # get rid of '\n' at end of line
            letters, number = line.split(',')
            try:  # there might be an error
                data[letters].append(number)  # append new number if letters has been seen before
            except KeyError:
                data[letters] = [number]  # create new list with one number for a new letter-string

    with open(output_file, 'w') as out_file:
        for letters, number_list in data.items():  # iterate over all entries
            out_file.write(letters + ' ' + ','.join(number_list) + '\n')


The .join() method builds one string from the list's entries, separated by the string it is called on (',' in this case).

















answered Jan 13 at 22:47









Mark H









  • Instead of trying to append and catching the error, you can use setdefault: data.setdefault(letters, []).append(number)
    – Todd Sewell
    Jan 13 at 23:04
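
The setdefault suggestion might look like the sketch below. The function name group_pairs and the sample list are illustrative assumptions; the function takes any iterable of lines (an open file works too) so it can be tried without touching the disk.

```python
def group_pairs(lines):
    """Map each key to the list of values seen for it, in input order."""
    data = {}
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        letters, number = line.split(',')
        # setdefault returns the existing list for this key, or stores
        # and returns the new empty list if the key was missing.
        data.setdefault(letters, []).append(number)
    return data


sample = ['abcd,1\n', 'abcd,21\n', 'abcd,122\n', 'abce,12\n']
grouped = group_pairs(sample)
```

This replaces the try/except KeyError pair with a single line while keeping a plain dict.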












  • @ToddSewell Neat! That'll be useful in the future.
    – Mark H
    Jan 13 at 23:08










  • Or use collections.defaultdict of course.
    – Graipher
    Jan 14 at 14:24
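
A sketch of the collections.defaultdict variant follows. The name format_groups is a placeholder, and returning a list of output lines (rather than writing a file) is a choice made here only to keep the example self-contained. It relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onwards.

```python
from collections import defaultdict


def format_groups(lines):
    """Return 'key v1,v2,...' strings, one per distinct key, in first-seen order."""
    # defaultdict(list) creates an empty list the first time a key is accessed,
    # so no KeyError handling is needed.
    data = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        letters, number = line.split(',')
        data[letters].append(number)
    return [f"{letters} {','.join(numbers)}" for letters, numbers in data.items()]
```

Unlike a solution based on grouping consecutive lines, this one does not need the input to be sorted, at the cost of holding all groups in memory before writing anything.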













