I'm getting a segmentation fault when running the code below.
It is supposed to read a .csv file with over 3M lines and then do some further processing (not relevant to the problem), but after 207746 iterations it crashes with a segmentation fault. If I remove the p = strsep(&line,"|"); call and just print the whole line, it prints all >3M lines.
    int ReadCSV (int argc, char *argv[]){
        char *line = NULL, *p;
        size_t len = 0;
        unsigned long count = 0;
        FILE *data;
        if (argc < 2) return 1;
        if((data = fopen(argv[1], "r")) == NULL){
            printf("the CSV file cannot be open");
            exit(0);
        }
        while (getline(&line, &len, data)>0) {
            p = strsep(&line,"|");
            printf("Line number: %lu \t p: %s\n", count, p);
            count++;
        }
        free(line);
        fclose(data);
        return 0;
    }
I guess it has to do with memory allocation, but I can't figure out how to fix it.
A combination of getline and strsep often causes confusion, because both functions modify the pointer whose address you pass as their first argument. If you pass the pointer that has been through strsep to getline again, you run the risk of undefined behavior on the second iteration.

Consider an example: getline allocates 101 bytes to line and reads a 100-character string into it. Note that len is now set to 101. You call strsep, which finds '|' in the middle of the string, so it points line to what used to be line+50. Now you call getline again. It sees another 100-character line and concludes that it is OK to copy it into the buffer, because len is still 101. However, since line now points to the middle of the buffer, writing 100 characters becomes undefined behavior. The same problem affects any realloc that getline performs and the final free(line), both of which need the pointer that was originally allocated.

To fix this, make a copy of line before calling strsep and pass the copy to strsep instead, for example char *copy = line; p = strsep(&copy, "|");. Now the line that you pass to getline is preserved between loop iterations.
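The fix described above might look like the following sketch. The helper name print_first_fields and the variable copy are made up for illustration and are not from the original code. The _DEFAULT_SOURCE define is an assumption that you are building against glibc, where it keeps getline and strsep declared even under strict -std modes.

```c
#define _DEFAULT_SOURCE  /* assumption: glibc; keeps getline() and strsep() declared */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints the first '|'-separated field of every line
 * read from data and returns the number of lines processed. */
unsigned long print_first_fields(FILE *data)
{
    char *line = NULL;        /* buffer owned by getline(); we never modify this pointer */
    size_t len = 0;           /* capacity of that buffer, maintained by getline() */
    unsigned long count = 0;

    while (getline(&line, &len, data) > 0) {
        char *copy = line;              /* scratch copy of the pointer */
        char *p = strsep(&copy, "|");   /* advances copy, leaves line untouched */
        printf("Line number: %lu \t p: %s\n", count, p);
        count++;
    }

    free(line);               /* line still points at the start of the allocation */
    return count;
}
```

In the original ReadCSV you would keep the fopen and fclose calls as they are and only change the loop body, so that getline, its internal realloc, and the final free always see the same pointer and the same len.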