I have a table in SAS with missing values, like below:
col1 | col2 | col3 | ... | coln
-----|------|------|-----|-------
111 | | abc | ... | abc
222 | 11 | C1 | ... | 11
333 | 18 | | ... | 12
... | ... | ... | ... | ...
I need to drop from the above table every variable in which 80% or more of the values are missing (>= 80%).
How can I do that in SAS?
The macro below creates a macro variable named `&drop_vars` that holds a list of the variables in your dataset that meet or exceed a missing-value threshold. It works for both character and numeric variables. If the dataset has a very large number of variables the macro will fail, but it can easily be modified to handle any number of them. You can save and reuse this macro. It has three parameters:
- `lib`: Library of your dataset
- `dsn`: Dataset name without the library
- `threshold`: Proportion of missing values a variable must meet or exceed to be dropped

For example, let's generate some sample data and use this. `col1`, `col2`, and `col3` all have 80% missing values. We'll run the macro and check the log:
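A minimal sketch of a macro with these three parameters, together with sample data and the call, could look like the following. The macro name `get_missing_vars`, the dataset name `have`, and the intermediate tables `pct_missing` and `drop_list` are assumptions chosen for illustration. It relies on the fact that `nmiss()` in `PROC SQL` counts missing values for both character and numeric columns.

```sas
/* Sketch only: macro name and work table names are hypothetical */
%macro get_missing_vars(lib=, dsn=, threshold=);
    %global drop_vars;
    %local calculate_pct_missing;
    %let drop_vars = ;   /* reset so a stale list is never reused */

    /* Build one expression per variable:
       nmiss(var1)/count(*) as var1, nmiss(var2)/count(*) as var2, ... */
    proc sql noprint;
        select cat('nmiss(', strip(name), ')/count(*) as ', strip(name))
            into :calculate_pct_missing separated by ','
        from dictionary.columns
        where libname = upcase("&lib")
          and memname = upcase("&dsn");
    quit;

    /* One row holding the proportion of missing values per variable */
    proc sql noprint;
        create table pct_missing as
            select &calculate_pct_missing
            from &lib..&dsn;
    quit;

    /* Long format: one row per variable, proportion in COL1 */
    proc transpose data=pct_missing out=drop_list;
        var _numeric_;
    run;

    /* Keep the names that meet or exceed the threshold */
    proc sql noprint;
        select _name_
            into :drop_vars separated by ' '
        from drop_list
        where col1 ge &threshold;
    quit;
%mend get_missing_vars;

/* Sample data: 10 rows, col1-col3 are missing in rows 3-10 (80%) */
data have;
    array col[10];
    do i = 1 to 10;
        do j = 1 to 10;
            col[j] = i;
            if i > 2 and j in (1, 2, 3) then col[j] = .;
        end;
        output;
    end;
    drop i j;
run;

%get_missing_vars(lib=work, dsn=have, threshold=0.8);
%put &drop_vars;
```

All the `nmiss()` expressions are collected in a single macro variable, so a dataset with a very large number of variables can exceed the maximum macro variable length. That is one plausible way a macro like this fails on very wide tables.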
The log shows the list stored in `&drop_vars`, which for this sample data should be `col1 col2 col3`.
Now we can pass this into a simple data step.
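As a rough sketch, the data step can use the `drop=` dataset option. The dataset names `have` and `want` are placeholders, and the step assumes `&drop_vars` already holds at least one variable name.

```sas
/* Sketch: have/want are hypothetical dataset names.
   Assumes &drop_vars is non-empty, e.g. col1 col2 col3. */
data want;
    set have(drop=&drop_vars);
run;
```

Dropping the variables on the `set` statement means they are never read into the data step, which avoids carrying the mostly empty columns through any later processing.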