I'm currently trying to parse a .csv file with sscanf. I've made a function that detects quotes at the start of the line:
int checkString(const char *str) {
    if (str[0] == '"') {
        return 1;
    }
    return 0;
}
I need help with parsing the file, especially since the format of the csv file varies.
Here is my attempt at parsing the file using sscanf:
int main() {
    char inputLine[100]; //input line
    Node *pList = NULL; //initialize linked list
    FILE *inputStream = fopen("../musicPlayList.csv", "r");
    Record testArr[100];
    Record inputRecord;
    Duration inputDuration;
    int inputTimesPlayed = 0;
    int inputRating = 0;
    char artistToken[20];
    char albumTitleToken[20];
    char songTitleToken[20];
    char genreToken[20];
    char durationToken[20];
    char timesPlayedToken[20];
    char ratingToken[20];
    if (inputStream != NULL) {
        while (fgets(inputLine, sizeof(inputLine), inputStream) != NULL) {
            if (checkString(inputLine)) {
                sscanf(inputLine, "\"%[^\"]\" %s %s %s %s %s %s",
                       artistToken, albumTitleToken,
                       songTitleToken, genreToken, durationToken,
                       timesPlayedToken, ratingToken);
            } else {
                sscanf(inputLine, "%s %s %s %s %s %s %s",
                       artistToken, albumTitleToken,
                       songTitleToken, genreToken, durationToken,
                       timesPlayedToken, ratingToken);
            }
            printf("%s %s %s %s %s %s %s\n",
                   artistToken, albumTitleToken, songTitleToken,
                   genreToken, durationToken, timesPlayedToken,
                   ratingToken);
        }
    }
    return 0;
}
Here is the .csv file:
"Swift, Taylor",1989,Shake it Off,Pop,3:35,12,3
Drake,NOTHING WAS THE SAME,Own it,Rap,3:23,3,3
Drake,YOU WELCOME,The Motto,Rap,4:13,7,4
"Perri, Christina",HEAD OF HEART,Trust,Pop,2:35,3,5
"Bieber, Justin",PURPOSE,No Sense,Pop,4:12,6,1
Eminem,SHADYXV,Vegas,Rap,3:37,8,3
Adele,25,Remedy,Pop,4:11,24,4
"Swift, Taylor",RED,Stay Stay Stay,Pop,4:42,5,1
"Brooks, Garth",FRESH HORSES,The Old Stuff,Country,2:57,11,2
Here is the program output:
Swift, Taylor ,1989 , Shake it Off, Pop, 3:35,12,3 00doA
Drake, NOTHING WAS THE SAME, Own it, Rap, 3:23,3, 3
Drake, YOU WELCOME, The Motto, Rap, 4:13,7,4 SAME, Own it, Rap, 3:23,3,3
Perri, Christina ,HEAD ,3,5 HEART, Trust, Pop,2:35,3,5 it, Rap, 3:23,3,3
Bieber, Justin ,PURPOSE, No Sense, Pop, 4:12,6,1 HEART, Trust, Pop, 2:35Sense, Pop, 4:12,6,1 it, Rap, 3:23, 3, 3
Eminem, SHADYXV, Vegas, Rap, 3:37,8,3 , PURPOSE, No Sense, Pop, 4:12,6,1 HEART, Trust, Pop, 2:35Sense, Pop, 4:12,6,1 it, Rap, 3:23,3, 3
Adele, 25, Remedy, Pop, 4:11, 24,4 , PURPOSE, No Sense, Pop, 4:12,6,1 HEART, Trust, Pop, 2:35Sense, Pop, 4:12,6,1 it, Rap, 3:23,3 , 3
Swift, Taylor ,RED, Stay Stay Stay, Pop, 4:42,5,1 it, Rap, 3:23,3, 3
Brooks, Garth ,FRESH HORSES, The 1,2 Stuff, Country,2:57, 11, 2
Process finished with exit code 0
Why is it returning garbage values such as 00doA?
sscanf can be used to parse CSV lines, but it is a very tricky tool for the job, with many shortcomings. Your code has several problems:
- NOTHING WAS THE SAME has 20 characters, which will not fit in an array of 20 char because sscanf stores a null terminator at the end of the string. Make the arrays larger and pass a maximum character count between the % and the conversion specifier.
- %s parses words separated by white space; you want a format such as %19[^,] to parse fields separated by commas.
- sscanf will not be able to parse empty fields.
- You should check the return value of sscanf() to detect malformed or missing input. If any of the fields are parsed incorrectly, the remaining tokens are left unmodified, potentially uninitialized, which may explain the garbage output.
Here is a modified version:
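As a minimal sketch of these fixes, the function below parses one line with bounded scansets and checks every sscanf return value. The Song struct, its field sizes, and the name parseLine are assumptions made for illustration; they are not taken from the question's Record type.

```c
#include <stdio.h>

/* Hypothetical record layout for this sketch; the sizes are assumptions. */
typedef struct {
    char artist[64];
    char album[64];
    char song[64];
    char genre[32];
    int minutes;
    int seconds;
    int timesPlayed;
    int rating;
} Song;

/* Parse one CSV line into *out.
 * Returns 1 on success, 0 if the line is malformed or a field is empty. */
int parseLine(const char *line, Song *out) {
    int n = 0;   /* characters consumed by the artist field; stays 0 on failure */
    int fields;

    if (line[0] == '"') {
        /* Quoted artist: read up to the closing quote, then require a comma. */
        fields = sscanf(line, "\"%63[^\"]\",%n", out->artist, &n);
    } else {
        /* Unquoted artist: read up to the first comma. */
        fields = sscanf(line, "%63[^,\n],%n", out->artist, &n);
    }
    /* %n is only stored if everything before it matched. */
    if (fields != 1 || n == 0)
        return 0;

    /* Remaining fields: album, song, genre, m:ss, times played, rating. */
    fields = sscanf(line + n, "%63[^,\n],%63[^,\n],%31[^,\n],%d:%d,%d,%d",
                    out->album, out->song, out->genre,
                    &out->minutes, &out->seconds,
                    &out->timesPlayed, &out->rating);
    return fields == 7;
}
```

Inside the fgets loop, a line would be accepted only when parseLine(inputLine, &song) returns 1; otherwise it can be reported and skipped instead of printing stale buffers.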
Output:
Note, however, that this code will fail if the album title or song title is enclosed in double quotes, especially if the quoted field contains embedded newlines or embedded quotes. A more advanced hand-coded CSV parser is needed to handle these variations.
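One possible shape for such a hand-coded parser is a function that extracts one field at a time. This is only a sketch: the name nextField and its interface are hypothetical, it handles quoted fields, doubled quotes, and empty fields, and it does not handle newlines embedded in a quoted field (that would require reading more than one line).

```c
#include <stddef.h>

/* Copy the next CSV field from *pp into dst (truncated to size - 1 chars).
 * On return, *pp points just past the field's comma, or is NULL after the
 * last field of the line. Returns 1 if a field was read, 0 if none is left. */
int nextField(const char **pp, char *dst, size_t size) {
    const char *p = *pp;
    size_t i = 0;

    if (p == NULL || size == 0)
        return 0;

    if (*p == '"') {
        p++;                        /* skip the opening quote */
        while (*p != '\0') {
            if (*p == '"') {
                if (p[1] != '"') {  /* closing quote */
                    p++;
                    break;
                }
                p++;                /* doubled quote: keep one literal quote */
            }
            if (i + 1 < size)
                dst[i++] = *p;
            p++;
        }
    }
    /* Unquoted field, or stray text after a closing quote. */
    while (*p != '\0' && *p != ',' && *p != '\n' && *p != '\r') {
        if (i + 1 < size)
            dst[i++] = *p;
        p++;
    }
    dst[i] = '\0';
    *pp = (*p == ',') ? p + 1 : NULL;
    return 1;
}
```

A caller would set a pointer to the start of the line and call nextField repeatedly, converting the numeric fields with strtol or sscanf and counting the fields to detect short or long lines.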