How to add a separator between English and Arabic Text in a cell

81 Views Asked by At

I'm working on a workbook where I insert manually mixed English and Arabic text in the same cell "D" this text contain the section code and unit number in English in addition to some description in Arabic like this for example (NH7B-AV-06A-23 تم الانتهاء من العمل خلال 48 ساعه) in this text (NH7B) is the section code and (AV-06A-23) is the unit number... then in column "C" I have a formula to extract the "section code" (=LEFT(E7,SEARCH("-",E7)-1) and this is ok no problem at all... and in column "E" i have another formula to extract the "unit number (=MID(E70,SEARCH("-",E7)+1,FIND("@",E7)-SEARCH("-",E7)-1) ....by this way and in order to make the formula in "E" run, I have to add @ symbol to my text in column "D" manually and now i need it to be added automatically

One of my friends created a code for me to run it with a button after inserting text to D but it has one problem: it removes the formula in column "E" and extract all the english text from "D" to "E" not only the unit number....

I need your help please to fix this issue in the code or if you have any new suggestions will be appreciated

Sub englishATarabic()
    Dim txt As String, L, r, count As Long
    For r = 1 To ActiveSheet.Cells(ActiveSheet.Rows.count, "E").End(xlUp).Row
        If IsNumeric(ActiveSheet.Range("B" & r).Value) And ActiveSheet.Range("B" & r).Value <> "" And ActiveSheet.Range("E" & r).Value <> "" Then
            If InStr(ActiveSheet.Range("E" & r).Text, "@") < 1 Then
                count = 0
                For L = 1 To Len(ActiveSheet.Range("E" & r).Text)
                    Debug.Print L, AscW(Mid(ActiveSheet.Range("E" & r).Text, L, 1))
                    If AscW(Mid(ActiveSheet.Range("E" & r).Text, L, 1)) < 1000 Then count = count + 1 Else Exit For
                Next
                txt = Trim(Left(ActiveSheet.Range("E" & r).Text, count))
                If InStrRev(txt, "-") = Len(txt) Then ActiveSheet.Range("D" & r).Value = Trim(Left(txt, Len(txt) - 1)) Else ActiveSheet.Range("D" & r).Value = txt
            End If
        End If
    Next
End Sub
1

There are 1 best solutions below

0
Ron Rosenfeld On BEST ANSWER

Your method (of determining the position of the first Arabic character) can be used without VBA in various ways. However, please note that your method will leave a trailing hyphen for the Unit in your second example mentioned in your comment.

To replicate your method, using formulas, assuming you have 365, you could use this function. Note that I had to add a specific step to remove that trailing hyphen (tUnit)

(Note that in both the formula and the Power Query methods, we test for an Arabic character by looking for a character code greater than 255. In a more complex scenario, we could test specifically for the arabic character set, but, at present, this comprises 1519 characters in 11 separate groupings, so would add considerable complexity to the formula. However, if applicable to your dataset, you could just test for the range 0600-06FF. You'd need to investigate whether that is sufficient.)

B2: =LET(
    firstHyphen, FIND("-", A2),
    firstArabic, XMATCH(TRUE, UNICODE(MID(A2, SEQUENCE(LEN(A2)), 1)) > 255),
    section, LEFT(A2, firstHyphen - 1),
    unit, TRIM(MID(A2, firstHyphen + 1, firstArabic - firstHyphen - 1)),
    tUnit, IF(RIGHT(unit) = "-", LEFT(unit, LEN(unit) - 1), unit),
    arb, MID(A2, firstArabic, 255),
    HSTACK(section, tUnit, arb)
)

and fill down. The results SPILL into the adjacent columns

enter image description here

This can also be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac)

To use Power Query

  • Select some cell in your Data Table
  • Data => Get&Transform => from Table/Range or from within sheet
  • When the PQ Editor opens: Home => Advanced Editor
  • Make note of the Table Name in Line 2
  • Paste the M Code below in place of what you see
  • Change the Table name in line 2 back to what was generated originally.
  • Read the comments and explore the Applied Steps to understand the algorithm

Note that it is easier to trim that trailing hyphen.

let
    Source = Excel.CurrentWorkbook(){[Name="Table30"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    
//determine positions on which to split the string
    #"Split Positions" = Table.AddColumn(#"Changed Type","Splits", (c)=>
        let 
            firstHyphen = Text.PositionOf(c[Column1],"-"),

        //1st Arabic character assumed to be first character with a character code > 255
            firstArabic = 
                List.PositionOf(
                    List.Transform(
                        Text.ToList(c[Column1]), 
                    each Character.ToNumber(_)>255),true,Occurrence.First),

        //"0" is first position in PQ
            splits={0,firstHyphen,firstArabic},
            texts = Splitter.SplitTextByPositions(splits)(c[Column1]),

        //Trim any leading/trailing spaces or hyphens. Other characters can be added to this list to trim
            trims = List.Transform(texts, each Text.Trim(_,{" ","-"}))
        in 
            Record.FromList(trims,{"Section","Unit","Description"}), type record),
    
    #"Expanded Splits" = Table.ExpandRecordColumn(#"Split Positions", "Splits", {"Section", "Unit", "Description"})
in
    #"Expanded Splits"

enter image description here

Column1 can be deleted from the final table in PQ if that is preferable

If, for some reason, you prefer to use VBA, you can write a UDF that will do the entire splitting task, and then use that function on your worksheet. eg:

Option Explicit
Function splitWithArabic(s As String)
    Dim RE As Object, MC As Object
    Dim I As Long
    Dim vParts(0 To 2)
    
'Initialize regular expression object
Set RE = CreateObject("vbscript.regexp")
With RE
    .Pattern = "^([^-\s]*)[-\s]+([^\u0600-\u06FF]+)[-\s]+(.*)"
    .MultiLine = True
    If .test(s) = True Then
        Set MC = .Execute(s)
        For I = 0 To 2
            vParts(I) = MC(0).submatches(I)
        Next I
    End If
End With
splitWithArabic = vParts
        
End Function

The function will return a three element array. Depending on your version of Excel, it may either SPILL to the three columns, or you may need to array-enter it across the three columns, or use the INDEX function to return each element of the array

Explanation of the Regex used
^([^-\s])[-\s]+([^\u0600-\u06FF]+)[-\s]+(.)

^([^-\s]*)[-\s]+([^\u0600-\u06FF]+)[-\s]+(.*)

Options: Case insensitive; ^$ match at line breaks

Created with RegexBuddy