I have a file that I am consuming into StreamSets and in that I have the following sample:

Source_id: {String} "1234"
Partition_id: {String} "ABC"
Key: {String} "W3E"

(the field names are dynamic, sometimes it changes so we can't hardcode those field names).

I want to be able to somehow get these to two separate fields so that I can send the entire to a stored procedure that uses dynamic SQL to insert into various tables. For this purpose I need to have two fields with in this format.

ColumnName: {string} " 'Source_id', 'Partition_id', 'Key' " 
ValueName: {String} "'1234', 'ABC', 'W3E' "

I've tried field mappers and other processors but unable to get it working.

I don't know Java/ Groovy enough to make it work. Any help would be appreciated.

Thanks

Regards, NonClever human.

1

There are 1 best solutions below

0
Roman On

here's a Groovy option that should do the trick:

// Sample Groovy code
records = sdc.records
String keys = ""
String values = ""
for (record in records) {
try {
      keys = "";
      values = "";
  
      for(String key: record.value.keySet()) {
        keys = keys + (keys==""?"":", ") + key;
        values = values + (values==""?"":", ") + record.value[key];
  }
  record.value = ["keys":keys, "values": values];

    // Write a record to the processor output
        sdc.output.write(record)
    } catch (e) {
    // Write a record to the error pipeline
        sdc.log.error(e.toString(), e)
        sdc.error.write(record, e.toString())
    }
}

I strongly suggest that you learn the basics of either Groovy or Javascript or Python. That should help a lot with corner cases like this.