If values in one cell are delimited by line breaks, change them to commas in the design sheet

168 Views Asked by At

I am preparing one PowerShell script which needs to check If values in one cell are delimited by line breaks, change them to commas and update the designsheet (excelsheet). I am using below code for the same and it is executing without any errors but it is directly giving the output as "No changes made to the Excel file" and design sheet is also not updated. I have tried multiple ways but still facing the issue. Can some one please help me to resolve this issue. Below is the code:

# Load the Excel file
$designSheet = "/home/lokeshdir/NSG_Rules.xlsx"
$excel = Import-Excel -Path $designSheet

# Flag to track changes
$changesMade = $false

# Iterate through each cell and replace line breaks with commas if present
foreach ($row in $excel) {
    foreach ($property in $row.psobject.Properties) {
        # Check if the cell content has line breaks and replace them with commas
        if ($property.Value -match "`r`n") {
            $property.Value = $property.Value -replace "`r`n", ","
            $changesMade = $true  # Set flag indicating changes
        }
    }
}

# If changes were made, save the updated Excel file
if ($changesMade) {
    $excel | Export-Excel -Path $designSheet -Show
    Write-Host "Excel file updated successfully."
} else {
    Write-Host "No changes made to the Excel file."
}

Output:

PS /home/lokeshdir> ./NSGRulesadd.ps1
No changes made to the Excel file.
PS /home/lokeshdir>

Can someone please help me to resolve this issue to check the design sheet if any line breaks present and change those line breaks to commas and update the design sheet.

enter image description here

1

There are 1 best solutions below

2
mklement0 On

Update:

  • Closer examination of the screenshot added later suggests that it isn't actually line breaks (newline) inside your cells that are the problem, but values that are conceptually single ones being spread across two cells - a solution to the latter problem can be found in this answer.

  • The answer below addresses the question as asked.

    • This related question (which suffers from the inverse misconception) has screenshots that show actual intra-cell line breaks.

Perhaps surprisingly, Excel converts Windows-format CRLF newlines ("`r`n") to Unix-format LF-only newlines ("`n") on assigning multiline strings to cells, irrespective of whether values are entered interactively or set programmatically, as verified with v7.8.6 of the ImportExcel module, both with a programmatically created workbook and one created interactively via Office 2019.[*]

Therefore, simply replacing "`r`n" with "`n" in your code should solve your problem.


As an aside: what Theo suggests - "`r?`n" - is a regex that matches either newline form, more typically expressed via regex escape sequences, '\r?\n', but it isn't necessary here.


Self-contained sample code:

The following:

  • uses Export-Excel to create sample workbook sample.xlsx with a single data row with two multiline values, one with a CRLF-only newline ("line1`r`nline2"), and another with a LF-only newline ("20`n21").

  • reads the file back in with Import-Excel and verifies that the CRLF newline turned into a LF-only ("`n") newline.

  • replaces the now-LF-only newlines with "," and saves the workbook, then re-imports it to verify that the changes were performed as intended, which is indeed the case for me with version 7.8.6 of the ImportExcel module.

    • Number-handling pitfall:
      • By default, Export-Excel exports even string values that look like numbers as numbers, which can be undesired.
      • In the case at hand, transforming "20`n21" into "20,21" would result in interpretation of the latter as a number, namely as 2021 (the , is interpreted as a thousands-grouping symbol).
      • The -NoNumberConversion <propertyNames> parameter suppresses this behavior; -NoNumberConversion * can be used to apply it to all exported properties (columns).
        However, the latter would then format even actual numbers as strings, so it is better to only specify affected properties, as done below.
# File path of a sample workbook.
$sampleWb = "./sample.xlsx"
# Remove a preexisting version of the sample workbook, to avoid potential side effects
# from exporting into it.
Remove-Item -ErrorAction Ignore $sampleWb

# Create the sample workbook with a sample object with
# two multiline values, one with a CRLF newline, the other with an LF-only newline.
[pscustomobject] @{
  Foo              = 'Bar'
  MultilineText    = "line1`r`nline2"  # CRF newline
  MultiLineNumbers = "20`n21"          # LF newline
} | Export-Excel $sampleWb

# Read the sample workbook('s first worksheet)
$rows = Import-Excel $sampleWb

# Initialize a hashset for collecting the names of the properties
# whose values were modified.
$modifedProperties = [System.Collections.Generic.HashSet[string]]::new()

# Iterate through the cells and replace line breaks with commas.
foreach ($row in $rows) {
  foreach ($property in $row.psobject.Properties) {
    $value = $property.Value
    # Report on the newline format found.
    "Cell in column '$($property.Name)' contains " + $(if ($value -match '\r\n') { "a CRLF" } elseif ($value -match '\n') { 'a LF-only' } else { 'NO' }) + ' newline.'
    # Now replace the LF-only newline with ","
    $property.Value = $value -replace '\n', ','
    $null = $modifedProperties.Add($property.Name)
  }
}

# Save the updated workbook.
# Note:
#  * `-NoNumberConversion $modifedProperties` formats the columns
#    of the modified properties *as text*, 
#    to prevent accidental to-number conversion.
$rows | Export-Excel $sampleWb -NoNumberConversion $modifedProperties

# Read the workbook back in and print the resulting object(s),
# to verify that the update worked as intended:
Import-Excel $sampleWb

You should see the following output - note how only LF-only newlines are reported, and how the multiline values were converted to ,-separated single-line ones:

Cell in column 'Foo' contains NO  newline.
Cell in column 'MultilineText' contains a LF-only newline.
Cell in column 'MultiLineNumbers' contains a LF-only newline.

Foo MultilineText MultiLineNumbers
--- ------------- ----------------
Bar line1,line2   20,21

[*] If you have information about different behavior in earlier or later versions, please let us know.