How to use awk to reformat a set of data?


I have a large set of traveltime data for earthquakes formatted in 4 columns like so:

     0.000     1.000     0.000         0
     0.050     0.044     0.010         1
     0.100     0.075     0.010         1
     0.150     0.108     0.010         1
     0.200     0.117     0.010         1
     0.250     0.160     0.010         1
     0.300     0.197     0.010         1

I need to use awk to reformat this into a different format that has 6 columns like so:

3
s 1 2.9901 38
r 3 0 0 2.27046 0.01
r 5 0 0 2.53864 0.01
r 7 0 0 2.66227 0.01
r 9 0 0 2.82365 0.01
r 11 0 0 3.23862 0.01
r 13 0 0 3.52581 0.01
r 15 0 0 4.15172 0.01

The key point is that the original set is structured as [x position, traveltime, error, flag] and I need it to become [r, x position, z position (0), flag, traveltime, error]. I'm very new to awk and I'm wondering whether this can be done in a single script or has to be done piecewise.

What I've tried with my limited experience with awk was this:

{
printf "%s %d %d %d %f %f \n" , $1=r, $2=$1, $3=0, $4=0, $5=$2, $6=$3;
}

So far this has simply output a series of columns of zeros with the column of r's missing entirely.

EDIT: To clarify, the 2 examples are drawn from 2 different data sets just to show the formats. The original data is already in text file form and I need to change it to fit the format of the second example. This includes adding columns.

Answered by Daweo
{
printf "%s %d %d %d %f %f \n" , $1=r, $2=$1, $3=0, $4=0, $5=$2, $6=$3;
}

This is not how awk's printf statement is meant to be used. Compare it with this example from the GNU Awk User's Guide:

awk '{ printf "%-10s %s\n", $1, $2 }' mail-list

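Observe that you simply list the values you want printed, without any assignment (=). In the original, $1=r assigns the uninitialized variable r (an empty string) to the first field, which is why the r column was missing and the numeric columns came out as zeros. After removing the assignments and using the string literal "r" instead of a variable named r, for file.txt containing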

 0.000     1.000     0.000         0
 0.050     0.044     0.010         1
 0.100     0.075     0.010         1
 0.150     0.108     0.010         1
 0.200     0.117     0.010         1
 0.250     0.160     0.010         1
 0.300     0.197     0.010         1

the command

awk '{printf "%s %d %d %d %f %f \n" , "r", $1, 0, 0, $2, $3}' file.txt

gives the output

r 0 0 0 1.000000 0.000000 
r 0 0 0 0.044000 0.010000 
r 0 0 0 0.075000 0.010000 
r 0 0 0 0.108000 0.010000 
r 0 0 0 0.117000 0.010000 
r 0 0 0 0.160000 0.010000 
r 0 0 0 0.197000 0.010000 

As you can see, this is still not the desired output: it does not follow the requirement of going from [x position, traveltime, error, flag] to [r, x position, z position (0), flag, traveltime, error]. The x position is truncated to 0 by %d, and the flag column is hard-coded to 0 instead of being taken from $4. Both are fixed by passing the fields in the required order and using a suitable format code for each column:

awk '{printf "%s %f %f %d %f %f \n", "r", $1, 0, $4, $2, $3}' file.txt

and the output is now

r 0.000000 0.000000 0 1.000000 0.000000 
r 0.050000 0.000000 1 0.044000 0.010000 
r 0.100000 0.000000 1 0.075000 0.010000 
r 0.150000 0.000000 1 0.108000 0.010000 
r 0.200000 0.000000 1 0.117000 0.010000 
r 0.250000 0.000000 1 0.160000 0.010000 
r 0.300000 0.000000 1 0.197000 0.010000

Note that %f prints six digits after the decimal point by default. If you need a different number of digits for a column, give a precision, e.g. %.3f for 3 digits after the decimal point.
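A minimal sketch of the precision idea, assuming three decimals are wanted for the x position, traveltime and error, and plain integers for the z position and flag. The question does not state the precision the target format needs, so that choice is an assumption, as is the output file name reformatted.txt. Only the first three sample rows are used here.

```shell
# Recreate a few rows of the sample input (same layout as file.txt above).
cat > file.txt <<'EOF'
 0.000     1.000     0.000         0
 0.050     0.044     0.010         1
 0.100     0.075     0.010         1
EOF

# %.3f keeps three digits after the decimal point;
# %d prints the constant z position (0) and the flag ($4) as integers.
awk '{ printf "r %.3f %d %d %.3f %.3f\n", $1, 0, $4, $2, $3 }' file.txt > reformatted.txt

cat reformatted.txt
```

With this input, the second row should come out as "r 0.050 0 1 0.044 0.010". Putting the literal r directly in the format string is equivalent to passing "r" as a %s argument; it is just shorter.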

(tested in GNU Awk 5.1.0)
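As an aside, here is a small sketch of why the assignment-based attempt printed zeros and no r. It is an illustration rather than one of the tested commands above, and it assumes the printf arguments are evaluated left to right, which is what gawk does in practice. The output file name assign_demo.txt is arbitrary.

```shell
printf '0.050 0.044 0.010 1\n' | awk '{
    # r is an uninitialized variable, so $1 = r overwrites field 1 with "".
    # $2 = $1 then copies that empty string into field 2; %d prints it as 0.
    printf "[%s] [%d]\n", $1 = r, $2 = $1
    # The assignments also changed the record itself.
    print "record is now: [" $0 "]"
}' > assign_demo.txt

cat assign_demo.txt
```

The first line shows an empty %s column and a 0, and the second shows that fields 1 and 2 of the record have been blanked. An assignment in awk is an expression whose value is the assigned value, so printf receives the empty string instead of the original field.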