Natural sort of alphanumerical strings in JavaScript (excluding hexadecimal)


I'm looking for an efficient way to natural-sort a list of alphanumeric labels without treating hexadecimal values as numbers, as in the list below.

arr = [
  "a_19",
  "a_2",
  "b_645500",
  "b_6d4500"
]

I've read the solution here

Using { numeric: true } sorts the hexadecimal values incorrectly, because their digit runs are compared as numbers instead of the labels being compared as the strings shown:

arrNumeric = [
  "a_2",
  "a_19",
  "b_6d4500",
  "b_645500"
]

Removing the numeric flag sorts "a_19" and "a_2" incorrectly:

arrNonNumeric = [
  "a_19",
  "a_2",
  "b_645500",
  "b_6d4500"
]
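Both outcomes can be reproduced with a short script. This is a minimal sketch that assumes the sort is done with localeCompare and a fixed 'en' locale; the variable names are only for illustration.

```javascript
const labels = ["a_19", "a_2", "b_645500", "b_6d4500"];

// Numeric collation compares digit runs as numbers, so the "6" in "6d4500"
// sorts before "645500".
const withNumeric = [...labels].sort((a, b) =>
  a.localeCompare(b, "en", { numeric: true })
);

// Plain collation compares character by character, so "19" sorts before "2".
const withoutNumeric = [...labels].sort((a, b) => a.localeCompare(b, "en"));

console.log(withNumeric);
console.log(withoutNumeric);
```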

The desired sorted array is:

arrSorted = [
  "a_2",
  "a_19",
  "b_645500",
  "b_6d4500"
]

Please note that the labels may not contain an underscore (_), so splitting each label on that specific character to extract the non-numeric part is not always possible.


There are 2 best solutions below

8
Alexander Nenashev On BEST ANSWER

Create two compare functions with Intl.Collator and choose between them depending on whether both compared strings end in a number (tested with a regular expression).

Note that in your particular case it's not possible to distinguish a 6-digit hex value made up only of decimal digits from a 6-digit integer.

That is why the regular expression /\D\d{1,5}$|\D\d{7,}$/ is used here: it does not match a 6-digit number, so such a value is treated as hex and compared as a string. As the OP wanted, the delimiter can be any non-digit character.

The number/hex delimiter is also a problem. Assuming the delimiter is a non-digit (\D), we need to make sure that a potential number is not part of a hex value, since a letter such as d inside a hex value would otherwise act as a number delimiter. So a label counts as a number only if it also fails the hex test.

To test more accurately I've added b_55 so that it is sorted against the hex values with the b_ prefix:

const arr = [
  "a_19",
  "a_2",
  "b_645500",
  "b_55",
  "b_6d4500",
];

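// compare[0]: plain string comparison; compare[1]: numeric-aware comparison
const compare = [
  new Intl.Collator('en').compare,
  new Intl.Collator('en', { numeric: true }).compare,
];

const isNum = /\D\d{1,5}$|\D\d{7,}$/; // trailing run of 1-5 or 7+ decimal digits
const isHex = /\D[0-9a-f]{6}$/; // trailing 6-character hex value

// A label counts as numeric only if its suffix is a number and not a hex value
const isNumeric = (s) => isNum.test(s) && !isHex.test(s);

arr.sort((a, b) => compare[+(isNumeric(a) && isNumeric(b))](a, b));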

console.log(arr);
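To see which comparison each label is eligible for, it can help to print how the two regular expressions classify it. The sketch below reuses the patterns from the snippet above; the classify helper is a hypothetical name added only for illustration.

```javascript
const isNum = /\D\d{1,5}$|\D\d{7,}$/; // same patterns as in the answer
const isHex = /\D[0-9a-f]{6}$/;

// Hypothetical helper: reports whether a label is treated as a number or a string
const classify = (s) => (isNum.test(s) && !isHex.test(s) ? "number" : "string");

const labels = ["a_19", "a_2", "b_645500", "b_55", "b_6d4500"];
const classes = Object.fromEntries(labels.map((s) => [s, classify(s)]));

console.log(classes);
```

Only pairs in which both labels are classified as "number" go through the numeric collator. Every other pair, including any pair with a 6-digit decimal such as 645500, is compared as plain strings.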

1
Mr. Polywhirl On

You will need to split on the _ (underscore) and compare the left values first. If they are the same, check the right values. You need to determine the "type" of the value. I created a simple function called valueType that determines if a given string is a decimal (dec), hexadecimal (hex), or a string (str).

Note that this is ad hoc; ideally, hex values would be denoted with a 0x prefix (e.g. 0x6d450) so they cannot be confused with decimals.

const valueType = (v) =>
  /^\d+$/.test(v) ? 'dec' : /^[a-f0-9]+$/.test(v) ? 'hex' : 'str';

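const valueComparator = (a, b) => {
  const at = valueType(a), bt = valueType(b);
  if (at === 'dec' && bt === 'dec') return +a - +b; // Both decimal: integer comparison
  if (at === 'dec') return -1; // Decimals sort before hex and plain strings
  if (bt === 'dec') return 1;
  return a.localeCompare(b); // Neither is decimal: compare as strings (avoids NaN)
}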

// Compare left and right values
const customComparator = (a, b) => {
  const [ax, ay] = a.split('_'), [bx, by] = b.split('_');
  return ax.localeCompare(bx) || valueComparator(ay, by);
}

const arr = [
  'a_19',
  'a_2',
  'b_64550',
  'b_6d450',
];

console.log([...arr].sort(customComparator));
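If the labels can be changed so that hex values carry the 0x prefix suggested above, the type check no longer has to guess. The following is only a sketch under that assumption: the sample labels, the parseValue and compareLabels helpers, and the choice to order hex values by magnitude after all decimals are hypothetical and not part of the original answer.

```javascript
// Assumption: hex values are written with an explicit 0x prefix, e.g. "b_0x6d450".
const parseValue = (v) => {
  if (/^\d+$/.test(v)) return { type: "dec", n: Number(v) };
  if (/^0x[0-9a-f]+$/i.test(v)) return { type: "hex", n: parseInt(v, 16) };
  return { type: "str", n: NaN };
};

// Decimals first, then hex values, then any other string
const rank = { dec: 0, hex: 1, str: 2 };

const compareLabels = (a, b) => {
  const [ax, ay = ""] = a.split("_"), [bx, by = ""] = b.split("_");
  const pa = parseValue(ay), pb = parseValue(by);
  return (
    ax.localeCompare(bx) ||               // compare the prefixes first
    rank[pa.type] - rank[pb.type] ||      // then group by value type
    (pa.type === "str" ? ay.localeCompare(by) : pa.n - pb.n) // then by value
  );
};

const prefixed = ["b_0x6d450", "a_19", "b_64550", "a_2", "b_0x1f"];
const sortedPrefixed = [...prefixed].sort(compareLabels);

console.log(sortedPrefixed);
```

Like the answer above, this still splits on the underscore, so it does not cover labels that use a different delimiter.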