Running function with ifstream and stringstream multiple times


I'm fairly new to C++ and would like suggestions on whether there is a better, more efficient way to use a function that relies on ifstream and stringstream.

I have a file with 150 lines and 8 columns (a small subset, with values simplified):

5.43e-08    0.0013  0.0105  0.013   0.026   0.068   0.216   0.663
6.98e-08    0.0004  0.0188  0.022   0.103   0.854   0   0
7.31e-08    0.0004  0.0125  0.017   0.074   0.895   0   0
5.82e-08    0.0006  0.0596  0.075   0.150   0.713   0   0

Each line number represents a position (pos 1 ... pos 150), and each column is the probability of a quality (Qual1 .. Qual8). My goal is to sample one quality from each line, where each line is a quality distribution, to create a string of qualities for all 150 positions. I have written a function that does this.

std::string Qual(std::ifstream &infile){
  
  std::string line;
  double Q_1,Q_2,Q_3,Q_4,Q_5,Q_6,Q_7,Q_8;
  char Qualities[] = {'1', '2', '3', '4' ,'5', '6', '7','8','\0'};
  std::string Read_qual;

  while (std::getline(infile, line)){
    std::stringstream ss(line);
    ss >> Q_1 >> Q_2 >> Q_3 >> Q_4 >> Q_5 >> Q_6 >> Q_7 >> Q_8;
    
    std::srand(std::time(nullptr));
    std::random_device rd;
    std::default_random_engine gen(rd());
    std::discrete_distribution<> d({Q_1,Q_2,Q_3,Q_4,Q_5,Q_6,Q_7,Q_8});

    Read_qual += Qualities[d(gen)];
  }
  return Read_qual;
}

The problem is that I have to call this function repeatedly to create multiple such strings based on some other input. As far as I can tell from reading Stack Overflow, I have to call .clear() and seekg() to keep the file open and read it again.

int main(int argc,char **argv){
  std::ifstream infile("Freq.txt");
  std::cout << Qual(infile) << std::endl;
  infile.clear();
  infile.seekg(0);
  std::cout << "-------" << std::endl;
  std::cout << Qual(infile);
  return 0;
}

My question is: is there a better way to accomplish this in C++, for example with functions that are faster? Does anyone have suggestions? Is it better to keep opening and closing the file?

ytlu On BEST ANSWER

My suggestion:

std::string Qual(double *a)
{  
  std::string line;
  char Qualities[] = {'1', '2', '3', '4' ,'5', '6', '7','8','\0'};
  std::string Read_qual;
 
  // Seed the engine once; re-creating it on every call is slow and unnecessary.
  static std::default_random_engine gen(std::random_device{}());
  std::discrete_distribution<> d({a[0],a[1],a[2],a[3],a[4],a[5],a[6],a[7]});
  Read_qual += Qualities[d(gen)];
  return Read_qual;
}

and the main()

int main()
{
  std::ifstream infile("Freq.txt");
  // Read the whole table into memory once: 150 positions x 8 qualities.
  double alldata[150][8];
  for (int i = 0; i < 150; i++)
    for (int j = 0; j < 8; j++) infile >> alldata[i][j];
  infile.close();

  // Sample from the in-memory table as many times as needed.
  for (int idx = 0; idx < 2000; idx++)
  {
    for (int row = 0; row < 150; row++)
      std::cout << Qual(alldata[row]) << std::endl;
  }
  return 0;
}
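
A possible refinement of the same idea, offered only as a sketch: parse each line into a std::discrete_distribution once, then reuse those distributions and a single engine for every sampled string. The names LoadDistributions and SampleQualities are hypothetical (they do not appear in the question), and the choice of std::mt19937 is an assumption; any standard engine should work.

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <random>
#include <sstream>
#include <string>
#include <vector>

using Distributions = std::vector<std::discrete_distribution<int>>;

// Hypothetical helper: build one distribution per line of 8 weights.
Distributions LoadDistributions(std::istream& in) {
  Distributions dists;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream ss(line);
    std::array<double, 8> w{};
    bool ok = true;
    for (double& x : w) {
      if (!(ss >> x)) { ok = false; break; }
    }
    if (!ok) continue;  // skip blank or malformed lines
    dists.emplace_back(w.begin(), w.end());
  }
  return dists;
}

// Hypothetical helper: draw one quality character per position.
// The distributions are taken by non-const reference because
// operator() of a distribution is not const.
std::string SampleQualities(Distributions& dists, std::mt19937& gen) {
  static const char kQualities[] = "12345678";
  std::string out;
  out.reserve(dists.size());
  for (auto& d : dists) {
    const int idx = d(gen);
    assert(idx >= 0 && idx < 8);  // 8 weights give indices 0..7
    out += kQualities[idx];
  }
  return out;
}
```

In a main() this might be used by opening Freq.txt once, calling LoadDistributions(infile), seeding one std::mt19937 from std::random_device, and then calling SampleQualities in a loop, so the file is never reread and the engine is never reseeded.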

Surt On

Let's try caching.

Totally untested, incomplete code:

struct row { // your type that goes into the distribution
  double Q_1,Q_2,Q_3,Q_4,Q_5,Q_6,Q_7,Q_8;
};
using QualData = std::vector<row>;  // typedef

QualData ReadData(std::ifstream &infile) {
  std::string line;
  double Q_1,Q_2,Q_3,Q_4,Q_5,Q_6,Q_7,Q_8;
  QualData qual;

  // Parse the file once and keep every row in memory.
  while (std::getline(infile, line)){
    std::stringstream ss(line);
    ss >> Q_1 >> Q_2 >> Q_3 >> Q_4 >> Q_5 >> Q_6 >> Q_7 >> Q_8;

    // row is an aggregate, so brace-initialize it; before C++20,
    // emplace_back cannot construct an aggregate from a list of values.
    qual.push_back(row{Q_1,Q_2,Q_3,Q_4,Q_5,Q_6,Q_7,Q_8});
  }
  return qual;
}

... do qual

int main(int argc,char **argv){
  std::ifstream infile("Freq.txt");
  auto qualData = ReadData(infile);

  std::cout << Qual(qualData) << std::endl;
  std::cout << "-------" << std::endl;
  std::cout << Qual(qualData);
  return 0;
}

You can imagine what else needs to change.
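
One way the remaining Qual function could look, as a minimal sketch rather than what the answer's author had in mind: it walks the cached rows and draws one quality per row. The row and QualData definitions are repeated so the block compiles on its own, and the function-local static engine is an assumption made so that the engine is seeded only once.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

struct row {
  double Q_1, Q_2, Q_3, Q_4, Q_5, Q_6, Q_7, Q_8;
};
using QualData = std::vector<row>;

// Sketch: sample one quality character per cached row.
std::string Qual(const QualData& qualData) {
  // Assumption: one engine for the whole program, seeded on first use.
  static std::default_random_engine gen(std::random_device{}());
  static const char Qualities[] = "12345678";

  std::string Read_qual;
  Read_qual.reserve(qualData.size());
  for (const row& r : qualData) {
    std::discrete_distribution<> d(
        {r.Q_1, r.Q_2, r.Q_3, r.Q_4, r.Q_5, r.Q_6, r.Q_7, r.Q_8});
    const int idx = d(gen);
    assert(idx >= 0 && idx < 8);  // 8 weights give indices 0..7
    Read_qual += Qualities[idx];
  }
  return Read_qual;
}
```

With this signature the main() above can call Qual(qualData) as often as needed without touching the file again.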