Writing style to prevent string concatenation in a list of strings


Suppose I have a list/tuple of strings,

COLOURS = [
    "White",
    "Black",
    "Red"
    "Green",
    "Blue"
]

for c in COLOURS:
    ...  # rest of the code

Sometimes I forget to put a comma after an entry in the list (after "Red" in the snippet above). The result is a single "RedGreen" item instead of two separate items, "Red" and "Green".
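To make the symptom concrete, here is a minimal sketch of what the list above actually contains. Adjacent string literals are joined at compile time, so the list ends up with four items rather than five:

```python
COLOURS = [
    "White",
    "Black",
    "Red"      # missing comma: this literal is merged with the next one
    "Green",
    "Blue"
]

print(len(COLOURS))   # 4
print(COLOURS[2])     # RedGreen
```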

Since this is valid Python, no IDE/text editor shows a warning/error. The incorrect value only comes to light during testing.

What writing style or code structure should I use to prevent this?


0
wjandrea On BEST ANSWER

You're incorrect that "no IDE/text editor shows a warning/error". Pylint can identify this problem using rule implicit-str-concat (W1404) with flag check-str-concat-over-line-jumps. (And for that matter, there are lots of things that are valid Python that a linter will warn you about, like bare except: for example.)

Personally, I'm using VSCode, so I enabled Pylint via the Python extension (python.linting.pylintEnabled) and set up a pylintrc like this:

[tool.pylint]
check-str-concat-over-line-jumps = yes

Now VSCode gives this warning for your list:

Implicit string concatenation found in list  pylint(implicit-str-concat)  [Ln 4, Col 1]


Lastly, there are probably other linters that can find the same problem, but Pylint is the first one I found.
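If installing a linter isn't an option, the same kind of check can be roughed out with the standard library alone. The following is only a sketch, not how Pylint implements its rule: the helper name find_implicit_concat is made up for illustration, and it simply flags any string token that directly follows another string token, ignoring line breaks and comments in between. It does not try to handle f-strings or distinguish deliberate concatenation from accidental.

```python
import io
import tokenize

# Token types that can sit between two string literals without separating them.
_SKIPPED = {tokenize.NL, tokenize.COMMENT}


def find_implicit_concat(source):
    """Return (line, column) of each string literal that directly follows another one."""
    hits = []
    previous = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if (
            tok.type == tokenize.STRING
            and previous is not None
            and previous.type == tokenize.STRING
        ):
            hits.append(tok.start)
        previous = tok
    return hits


SAMPLE = '''COLOURS = [
    "White",
    "Black",
    "Red"
    "Green",
    "Blue"
]
'''

print(find_implicit_concat(SAMPLE))
```

Each reported position is the start of the second literal in a pair, so for the sample it points at "Green".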

1
JRose On

Using what @wjandrea found, here is a possible solution. It is not pretty, but it can correctly detect that this code has implicit string concatenation. I had it just print out the warning instead of crashing, but if you want it could be modified to throw an error.

You could also just run pylint on your file, but I assumed you were looking for a "programmatic" way to detect when this, and only this, occurs.

import pylint.lint
import sys

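def check_for_implicit_str_concat():
    # Silence the JSON reporter's own output so only the filtered messages below are printed.
    pylint.lint.reporters.json_reporter.JSONReporter.display_messages = lambda self, layout: None

    options = [
        sys.argv[0],  # lint the script that is currently running
        '--output-format=pylint.reporters.json_reporter.JSONReporter',
        '--check-str-concat-over-line-jumps=y'
    ]
    results = pylint.lint.Run(options, do_exit=False)
    for message in results.linter.reporter.messages:
        # W1403 (implicit-str-concat-in-sequence) is the ID shown in the output below;
        # the accepted answer cites W1404 (implicit-str-concat), so match either.
        if message['message-id'] in ('W1403', 'W1404'):
            print(message)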

if __name__ == "__main__":
    check_for_implicit_str_concat()
    COLOURS = [
        "White",
        "Black",
        "Red"
        "Green",
        "Blue"
    ]

Output:

{
  "type": "warning",
  "module": "test",
  "obj": "",
  "line": 22,
  "column": 0,
  "path": "test.py",
  "symbol": "implicit-str-concat-in-sequence",
  "message": "Implicit string concatenation found in list",
  "message-id": "W1403"
}

0
Vuk Marković On

I'd go with an approach like this, which may even increase the maintainability and extensibility of the colours:

  • Use enums (or dictionaries) for colours, which would let you refer to them as Colours.RED, Colours.BLUE. Probably the best solution out there, but it really depends on the context of your application.

  • Create a very simple Colour class that holds the colour as a string field and has a constructor that initialises it. The list then looks like this:

    COLOURS = [Colour("white"), Colour("green")]
    

    The downside is that you are probably over-architecting your code to a certain extent and now have to treat colour objects slightly differently (perhaps with c.get_colour() or c.colour). Maybe it's not a downside at all, depending on the scale of your application.

  • Another "solution" would be to go in a functional style and create a deliberately redundant function that takes the colour as a string and just returns it. I'm not a fan of this solution, as it may not give you any benefit later.

Most of the time in production code, if you have a bunch of constants like this in a plain list (except where you're defining them specifically for an enumeration of some sort) rather than in an enum or something similar, you're most likely doing something wrong, so the solution would be to avoid that.
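As a rough sketch of the first two options (the member names, the name field and the compiles helper are illustrative assumptions, not taken from any particular codebase): with an enum each colour is its own assignment, so there is no comma to forget, and with a wrapper class a missing comma between two constructor calls becomes a SyntaxError instead of a silent merge.

```python
from dataclasses import dataclass
from enum import Enum


class Colours(Enum):
    # One assignment per colour; no separating commas are involved.
    WHITE = "White"
    BLACK = "Black"
    RED = "Red"
    GREEN = "Green"
    BLUE = "Blue"


@dataclass(frozen=True)
class Colour:
    # Hypothetical minimal wrapper around the colour string.
    name: str


def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Iterating over the enum yields the members in definition order.
COLOUR_NAMES = [c.value for c in Colours]
print(COLOUR_NAMES)

# Plain strings merge silently; wrapped values refuse to compile without the comma.
print(compiles('X = ["Red" "Green"]'))
print(compiles('X = [Colour("Red") Colour("Green")]'))
```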

0
S.B On

Use Black formatter and Pylint together and you're good to go.

If your list looks like this (without a trailing comma):

COLOURS = [
    "White",
    "Black",
    "Red"
    "Green",
    "Blue"
]

Then after formatting it'll become:

COLOURS = ["White", "Black", "Red" "Green", "Blue"]

They're now on the same line, so Pylint complains about it: Implicit string concatenation found in list.

If you have the same list but with a trailing comma:

COLOURS = [
    "White",
    "Black",
    "Red"
    "Green",
    "Blue",
]

It'll become:

COLOURS = [
    "White",
    "Black",
    "Red" "Green",
    "Blue",
]

Again, as you can see, Black forces these onto one line in either case, so you will get Pylint's warning.