How to extract substrings using Right, Left, Mid, LEN and FIND

64 Views Asked by At

I am trying to extract data from a string using different functions.

I have the input data in the first column, which can have two different formats.

Here is an example for both formats, with the expected output next to it:

Data First name Last name Number Date
**NM1*IL*1*EEKS*MILBORD****MI*123456789~** EEKS MILBORD 123456789
**DTP*472*D9*20240212~ ** 20240212

Currently, I got as far as these results:

Data MIDFUNCTION First name Nmbr Date
**NM1*IL*1*EEKS*MILBORD****MI*123456789~** EEKS*MILBORD****MI*123456789~ EEK BORD****MI*123456789
**DTP*472*D9*20240212~ ** 20240212~

For the mid function -- which I placed in the D column -- I used:

=MID(A15,FIND("NM1*IL*1*",A15)+9,LEN(A15)-FIND("NM1*IL*1*",A15))

For the first name I used: =LEFT(D15,FIND("*",D15)-1)

For the Nmbr I used: =RIGHT(A15,FIND("****MI*",A15))

For the date I could not exclude the tilde. I used: =RIGHT(A16,FIND("*D9*",A16))

What I want is to extract using the find and Len based on the data in the data field... The mid function seems to work but does not exclude what is not needed.

I am trying to extract First Name, Last Name, Number and on the next line, the date and exclude the tilde.

1

There are 1 best solutions below

2
trincot On BEST ANSWER

I would suggest to really do this step by step, using intermediate cells for intermediate results. First get all key positions in the data, and only when you have all of them, design your mid extractions based on those positions.

I would use these formulas. I'm assuming the data is in column A, and starting at A2. Then I'd allocate the columns as follows:

# A B C D E F G H I J K
1 Data Start First End First End Last Start Num End Num Start Date First Name Last Name Number Date
2 ...

Use these formulas:

  • In B2: =FIND("*1*",A2)+3
  • In C2: =FIND("*",A2,B2)+1
  • In D2: =FIND("*",A2,C2)
  • In E2: =FIND("*MI*",A2,D2)+4
  • In F2: =FIND("~",A2,E2)
  • In G2: =FIND("*D9*",A2)+4
  • In H2: =IFERROR(MID(A2,B2,C2-B2-1),"")
  • In I2: =IFERROR(MID(A2,C2,D2-C2),"")
  • In J2: =IFERROR(MID(A2,E2,F2-E2),"")
  • In K2: =IFERROR(DATE(MID(A2,G2,4),MID(A2,G2+4,2),MID(A2,G2+6,2)),"")

This is the result I got with that:

enter image description here

To have so many intermediate cells may look like overkill, but it makes it easier to see where things go wrong if they do. Once all is fine, you can just hide columns B-G.