Problem with awk and (maybe) null characters


I have this file, which "may be" a binary file:

    DATA FIELDINFO Cloud_Mask_QA {{{
  rank: 2
  type: 20
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
  data: ... (2748620)
    (0,0) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,16) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,32) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,48) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,64) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,80) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,96) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,112) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,128) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,144) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,160) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,176) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,192) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@

If I run sed -n "l" file to make the non-printable characters visible, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{$
  rank: 2$
  type: 20$
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, $
  data: ... (2748620)$
    (0,0) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,16) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,32) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,48) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,64) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,80) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,96) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,112) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,128) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,144) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,160) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,176) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,192) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$

I am trying to use awk on it, but if I do awk '{print $0}' file, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{
  rank: 2
  type: 20
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
  data: ... (2748620)
    (0,0) 

So it seems that awk stops processing the file at the first "^@" (that is, "\000") character it finds.

How can I avoid this?

Note: it seems my awk is mawk

Javi_VM On

Using gawk instead of mawk seems to solve the problem. The awk command is generally a link to one of those two implementations, so the fix is to install gawk and call it explicitly as gawk instead of awk.
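If installing gawk is not an option, one possible workaround (not part of the answer above, just a sketch) is to remove or replace the NUL bytes with tr before the data reaches awk. The sample file below is a made-up stand-in for the dump in the question, and the variable names are only illustrative:

```shell
#!/bin/sh
# Hypothetical sample file standing in for the dump from the question:
# one header line plus one data line containing three NUL bytes.
sample=$(mktemp)
printf 'data: ... (3)\n    (0,0) \000, \000, \000\n' > "$sample"

# Option 1 (the answer above): call gawk explicitly, if it is installed.
if command -v gawk >/dev/null 2>&1; then
    gawk '{print $0}' "$sample"
fi

# Option 2 (assumed workaround): delete the NUL bytes first, so that
# whatever awk is installed never sees them.
cleaned=$(tr -d '\000' < "$sample" | awk '{print $0}')
printf '%s\n' "$cleaned"

# Option 3 (assumed workaround): map each NUL to a visible placeholder
# instead of deleting it, so the values stay countable.
marked=$(tr '\000' '0' < "$sample" | awk '{print $0}')
printf '%s\n' "$marked"

rm -f "$sample"
```

Deleting the bytes loses the values themselves, so replacing them with a placeholder is probably the better choice when the NULs are data (as they appear to be here) rather than noise.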