Is it possible to make C++ iostream std::cout be as performant as cstdio printf()?

Note: This is not a duplicate of the existing std::ios::sync_with_stdio(false) questions. I have gone through all of them and still cannot make cout run as fast as printf. Example code and timings are shown below.

I have three source code files:

// ex1.cpp

#include <cstdio>
#include <chrono>

int main()
{
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 10000000; i++) {
        printf("%d\n", i);
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1);
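    // The rep type of std::chrono::milliseconds is implementation-defined,
    // so cast explicitly to match the %lld conversion.
    fprintf(stderr, "%lld\n", static_cast<long long>(duration.count()));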
}

// ex2.cpp

#include <iostream>
#include <chrono>

int main()
{
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 10000000; i++) {
        std::cout << i << '\n';
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1);
    std::cerr << duration.count() << '\n';
}

// ex3.cpp

#include <iostream>
#include <chrono>

int main()
{
    std::ios::sync_with_stdio(false);
    std::cin.tie(nullptr);

    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 10000000; i++) {
        std::cout << i << '\n';
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1);
    std::cerr << duration.count() << '\n';
}

I am never going to mix cstdio and iostream in my code, so the hacks used in ex3.cpp are acceptable for me.

I compile them with clang++ on macOS with a solid-state drive.

clang++ -std=c++11 -O2 -Wall -Wextra -pedantic ex1.cpp -o ex1
clang++ -std=c++11 -O2 -Wall -Wextra -pedantic ex2.cpp -o ex2
clang++ -std=c++11 -O2 -Wall -Wextra -pedantic ex3.cpp -o ex3

Now I run them and time them.

$ time ./ex1 > out.txt
1282

real    0m1.294s
user    0m1.217s
sys     0m0.071s

$ time ./ex1 > out.txt
1299

real    0m1.333s
user    0m1.221s
sys     0m0.072s

$ time ./ex1 > out.txt
1277

real    0m1.295s
user    0m1.214s
sys     0m0.070s

$ time ./ex2 > out.txt
3102

real    0m3.371s
user    0m3.037s
sys     0m0.075s

$ time ./ex2 > out.txt
3153

real    0m3.164s
user    0m3.073s
sys     0m0.075s

$ time ./ex2 > out.txt
3136

real    0m3.150s
user    0m3.051s
sys     0m0.077s

$ time ./ex3 > out.txt
3118

real    0m3.513s
user    0m3.045s
sys     0m0.080s

$ time ./ex3 > out.txt
3113

real    0m3.124s
user    0m3.042s
sys     0m0.077s

$ time ./ex3 > out.txt
3095

real    0m3.107s
user    0m3.029s
sys     0m0.073s

The results are much the same when I redirect the output to /dev/null, and also at the -O3 optimization level.

Why are both ex2 and ex3 slower than ex1? Is there any way to use std::cout that gives speed comparable to printf?
