Correct xtics and keys with columnstacked histograms from CSV (Gnuplot)


I'm trying to visualize accumulated working hours for different projects in rows (Foo Stuff, Bar Stuff, ...), recorded by employees in columns (alice, bob, ...) in a CSV file like this:

id,title,alice,bob,charlie,diego
foo,Foo Stuff,2,,3,1
bar,Bar Stuff,1,5,,
baz,Baz Stuff,,1,8,5

My goal is a stacked bar histogram with the employees on the x axis and the accumulated working hours per employee on the y axis. My current approach is this gnuplot script (I'm using version 5.4.3):

set datafile separator ','

set style data histograms
set style histogram columnstacked
set style fill solid noborder
set boxwidth 0.75
set xtics rotate by 90 right
set grid ytics linestyle 0

set key outside
set xlabel "Employee"
set ylabel "Working hours"

plot for [COL=3:*] 'example.csv' using COL title columnhead

[Image: resulting plot; the x axis shows a second "diego" label after the last column]

I'm new to gnuplot and it's not that easy for me to extract everything from the documentation. My most important questions here are:

  1. How to get the xtic labels right? There's a duplicate "diego" label after the last column.
  2. How to get a nice legend with project names (2nd column)?

Best answer, by theozh:

To your questions:

  1. It looks like the open-ended loop for [COL=3:*] creates this duplicate "diego". It looks like a bug to me (maybe only together with the stacked histogram style).

You can avoid it by

  • either using for [COL=3:6] if you know the number of columns beforehand
  • or first finding out the correct number of columns via stats, which stores it in the variable STATS_columns (check help stats).

  2. Add an invisible plot just for the legend using with boxes, and store the second column in the variable t, which will be used for the title.
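
For the first point, the stats option can be sketched against the file from the question. This is a minimal sketch, assuming the file is named example.csv and sits in the working directory; only the loop bounds differ from the question's [COL=3:*], and the remaining style settings are left out for brevity.

```gnuplot
# Sketch: replace the open-ended loop [COL=3:*] by an explicit upper bound.
# Assumes 'example.csv' from the question is in the working directory.
set datafile separator ','
set style data histograms
set style histogram columnstacked

# stats stores the number of columns it found in STATS_columns
stats 'example.csv' using 0 nooutput
colMax = STATS_columns

# Columns 3..colMax hold the employees (alice, bob, charlie, diego)
plot for [COL=3:colMax] 'example.csv' using COL title columnhead
```

This only addresses the tic labels; the legend with project names still needs the second point.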

Maybe there are shorter and smarter solutions, but check the following example as a starting point. It works for gnuplot>=5.4.0; older versions might need other workarounds.

Script: (works for gnuplot>=5.4.0)

###  stacked bar chart with autocolumns and titles from column
reset session

$Data <<EOD
id,title,alice,bob,charlie,diego
foo,Foo Stuff,2,,3,1
bar,Bar Stuff,1,5,,
baz,Baz Stuff,,1,8,5
EOD

set datafile separator ','
set style data histograms
set style histogram columnstacked
set style fill solid noborder
set boxwidth 0.75
set xtics rotate by 90 right
set grid ytics linestyle 0
set key outside
set xlabel "Employee"
set ylabel "Working hours"

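# Get the number of columns and records of the datablock
stats $Data u 0 nooutput
colMax = STATS_columns
rowMax = STATS_records

# First part:  one stacked bar per employee column (3..colMax)
# Second part: one invisible box per row, only to create the legend entries
plot for [COL=3:colMax] $Data u COL ti columnhead, \
     for [ROW=1:rowMax-1] '' every ::ROW::ROW u (t=strcol(2),NaN) w boxes lc ROW ti t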
### end of script

An attempt to explain the second part of the plot command:

  • it loops from the second row to the last row (row indices in gnuplot are zero-based, so ROW=1 skips the header line)
  • every ::ROW::ROW limits it to one single row (check help every)
  • during plotting it assigns column 2 of the current row as a string to the variable t
  • because of serial evaluation in (t=strcol(2),NaN) (check help operators binary) the expression returns NaN, so nothing is actually plotted, but a legend entry is created nevertheless
  • t is used as the legend title (this only works for gnuplot>=5.4.0)
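
The serial evaluation used in (t=strcol(2),NaN) can be tried in isolation. The following is a small sketch with a hard-coded string in place of strcol(2); the variable name y is only for illustration and does not appear in the script above.

```gnuplot
# Serial evaluation: subexpressions inside parentheses run left to right,
# and the value of the rightmost one is returned.
y = (t = "Foo Stuff", NaN)

print t    # the assignment inside the parentheses has taken effect
print y    # the whole expression evaluated to the rightmost value
```

In the plot command the same pattern sets t as a side effect while the value handed to the plotting style is NaN, which is why no box is drawn.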

I still hope there is a "nicer" solution.

Result:

[Image: resulting plot]