SAS dataset (seemingly) truncating character variable when setting with another dataset


I create a dataset from an Excel spreadsheet using PROC IMPORT and get the value '3 esoph - 2 cardia' (among others, but this is the one that shows the issue) for a character variable. This dataset is then concatenated with a native SAS dataset. After the concatenation, the value appears on screen and in printouts as '3 esoph'. Yet when displayed in hexadecimal format ($hex.), the value after concatenation is '332065736F7068202D203220636172646961', which decodes (manually) to '3 esoph - 2 cardia'. The length of the variable is $18 in both datasets.

Any thoughts on why the '- 2 cardia' doesn't display? Is this part of the value truly lost even though the data is present in hexadecimal?

I expected the value to be '3 esoph - 2 cardia' after the concatenation, not '3 esoph'.

I tried PROC SQL to perform the concatenation (UNION ALL) but got the same results.


There is 1 best solution below

Tom On

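From your symptoms, you have probably ended up with a variable whose stored LENGTH is longer than the width of the FORMAT attached to it. The full value is still stored (which is why $hex. shows all of it), but the attached format displays only the first few characters.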
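The symptom can be reproduced with a minimal sketch like the one below. The dataset name `demo`, the variable name `site`, and the `$7.` format width are assumptions made for illustration; the question does not say which format is actually attached.

```sas
/* Hypothetical reproduction: names and the $7. width are assumed. */
data demo;
  length site $18;            /* storage holds all 18 characters    */
  format site $7.;            /* display format is narrower         */
  site = '3 esoph - 2 cardia';
run;

proc print data=demo;         /* uses the attached $7. format       */
run;

data _null_;
  set demo;
  put site= $hex36.;          /* all 18 stored bytes, as hex digits */
  put site= $18.;             /* full text when the format is wide enough */
run;
```

Here PROC PRINT would show only the first seven characters, while the two PUT statements write the complete value to the log, so nothing is lost in storage.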

Ordinary character variables do not need a format attached at all. So I would recommend simply removing the formats from the character variables and seeing whether that makes your data usable.

So for example to concatenate datasets ONE and TWO to create a new dataset named WANT you could use a data step like this:

data want;
  set one two;
  format _character_ ;  /* no format named: removes formats from all character variables */
run;
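To confirm the diagnosis before or after removing the formats, it may help to compare the attributes of the datasets involved. The sketch below assumes the same dataset names (ONE, TWO, WANT) used in the example above.

```sas
/* Compare the Len and Format columns for the variable in question. */
proc contents data=one;
run;

proc contents data=two;
run;

proc contents data=want;
run;
```

If the Format column shows a width smaller than the Len column for the character variable, the display is being cut short by the format rather than by the stored length.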

Note that if the real issue is that the LENGTH of the variable is too short, the stored values themselves are truncated, not just the printed results. In that case you need to define the length you want for each variable BEFORE the SET statement.

data want;
  length id 8 string1 $30 text2 $100 ;  /* lengths set before SET take precedence */
  set one two;
  format _character_ ;                  /* remove formats from character variables */
run;
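To illustrate why the LENGTH statement has to come first, here is a small sketch with made-up datasets (`short`, `long`) and a made-up variable (`text`). When the same variable has different lengths in the input datasets, the data step uses the first length it encounters, which without a LENGTH statement is the one from the first dataset on the SET statement.

```sas
/* Hypothetical inputs: the same variable with two different lengths. */
data short;
  length text $7;
  text = '3 esoph';
run;

data long;
  length text $18;
  text = '3 esoph - 2 cardia';
run;

/* TEXT takes its length ($7) from SHORT, so the value from LONG is
   truncated in storage; SAS normally warns about multiple lengths. */
data truncated;
  set short long;
run;

/* Declaring the length before SET keeps the full 18 characters. */
data full;
  length text $18;
  set short long;
run;
```

In this case a $hex. display of `truncated` would show the shortened value as well, which is how true truncation differs from the format issue described in the question.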