Is it somehow possible to transform an entire collection instead of doing them one by one?

I am often in a situation where I need to convert the elements in a list from one type to another. The solution I usually end up with looks something like this:

using System;
using System.Collections.Generic;
using System.Linq;

namespace ListProcessing
{
    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Department { get; set; }
    }

    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Position { get; set; }
    }
    


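    class Program
    {
        // Builds a new list by converting each Person into an Employee.
        public static List<Employee> Hire(List<Person> persons)
        {
            var output = new List<Employee>();
            foreach (var p in persons)
            {
                var employee = new Employee { Id = p.Id, Name = p.Name, Position = "Software Engineer" };
                // Without this Add call, the method would return an empty list.
                output.Add(employee);
            }

            return output;
        }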
        static void Main(string[] args)
        {
            List<Person> persons = new List<Person>
            {
                new Person { Id = 1, Name = "John Doe", Department = "Software" },
                new Person { Id = 2, Name = "Jane Smith", Department = "Marketing" },
                new Person { Id = 3, Name = "Bob Johnson", Department = "Software" },
                new Person { Id = 4, Name = "Sally Jones", Department = "HR" }
            };

            //Attempt 1
            IEnumerable<Employee> employees = persons
                .Select(p => new Employee { Id = p.Id, Name = p.Name, Position = "Software Engineer" });

            //Attempt 2
            var employees2 = Hire(persons);

            foreach (Employee employee in employees)
            {
                Console.WriteLine($"ID: {employee.Id}, Name: {employee.Name}, Position: {employee.Position}");
            }
        }
    }
}

That is, I either use LINQ or add each converted element to a new list and return it.

Some sort of iteration is always needed, and the fact that the same instruction has to be executed for each item is what annoys me.

I feel like this could be solved either by using the right type from the beginning, or by implementing some sort of overload mechanism that does not iterate the items but processes the collection as a whole and returns the same collection "processed", as an O(1) operation.

There are 2 best solutions below

1
Olivier Jacot-Descombes On

Converting a list with N elements is an O(N) operation, and there is no way to make it an O(1) operation. A list of a reference type is not a monolith; it stores references to objects that live somewhere else. However, your design could be improved.

You can make an inheritance hierarchy like this:

public abstract class Person
{
    public int Id { get; init; }
    public string Name { get; init; }

    public override string ToString() =>
        $"ID: {Id}, Name: {Name}";
}

public class NaturalPerson : Person
{
    public string Department { get; init; }

    public override string ToString() =>
        base.ToString() + $", Department: {Department}";
}

public class Employee : Person
{
    public string Position { get; init; }

    public override string ToString() =>
        base.ToString() + $", Position: {Position}";
}

Note that because the Person class is abstract, you cannot instantiate it; however, you can declare a List<Person> and add objects of the two derived classes to it.

var persons = new List<Person>
{
    new NaturalPerson { Id = 1, Name = "John Doe", Department = "Software" },
    new NaturalPerson { Id = 2, Name = "Jane Smith", Department = "Marketing" },
    new Employee { Id = 3, Name = "Bob Johnson", Position = "Software Engineer" },
    new Employee { Id = 4, Name = "Sally Jones", Position = "Software Engineer" }
};
foreach (Person person in persons) {
    Console.WriteLine(person);
}

If the employees should have a department as well, then change the hierarchy to this (Person is not abstract here):

public class Person
{
    public int Id { get; init; }
    public string Name { get; init; }
    public string Department { get; init; }

    public override string ToString() =>
        $"ID: {Id}, Name: {Name}, Department: {Department}";
}

public class Employee : Person
{
    public Employee() { }

    // Copy constructor creating an Employee from a Person.
    public Employee(Person person, string position)
    {
        Id = person.Id;
        Name = person.Name;
        Department = person.Department;
        Position = position;
    }

    public string Position { get; init; }

    public override string ToString() =>
        base.ToString() + $", Position: {Position}";
}

var persons = new List<Person>
{
    new Person { Id = 1, Name = "John Doe", Department = "Software" },
    new Person { Id = 2, Name = "Jane Smith", Department = "Marketing" },
    new Employee { Id = 3, Name = "Bob Johnson", Department = "Software", Position = "Software Engineer" },
    new Employee { Id = 4, Name = "Sally Jones", Department = "Marketing", Position = "Chief marketing officer" }
};
foreach (Person person in persons) {
    Console.WriteLine(person);
}

I also added a copy constructor to Employee to ease the transformation of a Person to an Employee.

var employee = new Employee(person, "Software Engineer");

However, since persons and employees are individual objects, there is no way to convert them all at once. Don't overestimate the time this takes, though: you can transform one million persons like this and are unlikely to notice any lag.
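If the per-item loop at the call site is the annoying part, the constructor can be combined with List<T>.ConvertAll so that the whole list is converted in one expression. This is only a sketch: it assumes the non-abstract Person/Employee pair shown above, and the Hiring class and HireAll method are hypothetical names introduced for illustration. It is still O(N) internally.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Id { get; init; }
    public string Name { get; init; }
    public string Department { get; init; }
}

public class Employee : Person
{
    public Employee() { }

    // Creates an Employee from an existing Person.
    public Employee(Person person, string position)
    {
        Id = person.Id;
        Name = person.Name;
        Department = person.Department;
        Position = position;
    }

    public string Position { get; init; }
}

// Hypothetical helper, not part of the original answer.
public static class Hiring
{
    // ConvertAll invokes the converter once per element and returns
    // a new List<Employee>; the input list is left unchanged.
    public static List<Employee> HireAll(List<Person> persons, string position) =>
        persons.ConvertAll(p => new Employee(p, position));
}
```

The call site then reads `var employees = Hiring.HireAll(persons, "Software Engineer");`, which hides the iteration without removing it.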

0
julealgon On

There is no such thing as an O(1) projection operation. The closest you could get to a more optimized variation would be to rely on SIMD vector operations, but that is fairly advanced and recommended only if this is a critical path that has been identified as a performance bottleneck, which doesn't appear to be the case.

The one thing that jumps out at me in your modelling is that an Employee is perhaps a Person, so you could use inheritance to share the semantics and some of the fields, like Id and Name, which wouldn't change.

The closest thing that resembles an "automatic" conversion (but still not O(1)) would be a cast operator from one entity to the other, but I wouldn't recommend that here necessarily.
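For illustration only, such a cast operator could look like the sketch below. It assumes the two unrelated classes from the question (C# does not allow a user-defined conversion between a base class and a class derived from it), the fixed "Software Engineer" position is just the question's placeholder, and CastDemo is a hypothetical name. Note that Enumerable.Cast<T>() does not invoke user-defined operators, so the per-element cast still has to go through Select.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Department { get; set; }
}

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Position { get; set; }

    // Explicit conversion: callers must write (Employee)person.
    public static explicit operator Employee(Person p) =>
        new Employee { Id = p.Id, Name = p.Name, Position = "Software Engineer" };
}

// Hypothetical helper, not part of the original answer.
public static class CastDemo
{
    // Still O(N): the operator runs once for every element.
    public static List<Employee> Convert(IEnumerable<Person> persons) =>
        persons.Select(p => (Employee)p).ToList();
}
```

The cast hides the mapping logic behind what looks like a cheap type conversion, which is one reason to be cautious about it here.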

Other options include creating constructors on one entity that take the other as a parameter, or using a mapping library such as AutoMapper to centralize the mapping logic. Again, nothing here is O(1): there will still be an iteration at some point and a per-entity map, although AutoMapper would allow you to call Map<List<Employee>>(persons), which is, at least semantically, a direct list-to-list conversion.

I think a direct transformation that comes closer to an O(1) operation would be to rely on the member layout and cast an entire collection to another type, but that's not really feasible in C#. It is more of a very low-level thing you could do in a language like C, where you can more easily "interpret" a segment of memory as any type. Again, I would not recommend something like that either.
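For completeness, .NET does offer a narrow form of this reinterpretation for structs that contain no references, through MemoryMarshal.Cast. The sketch below is only meant to show what "interpreting memory as another type" looks like. The PersonRecord and EmployeeRecord structs and the Reinterpret class are hypothetical, and the technique does not apply to the classes in the question, because they are reference types with string members.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structs with identical layout and no reference fields.
public struct PersonRecord
{
    public int Id;
    public int DepartmentId;
}

public struct EmployeeRecord
{
    public int Id;
    public int PositionId;
}

public static class Reinterpret
{
    // No elements are copied: the returned span is a view over the same
    // memory, so DepartmentId is simply read as PositionId.
    public static Span<EmployeeRecord> AsEmployees(Span<PersonRecord> persons) =>
        MemoryMarshal.Cast<PersonRecord, EmployeeRecord>(persons);
}
```

Because both spans share the same memory, a write through one is visible through the other, and the department value is silently reused as a position value. That is exactly the kind of fragility that makes this approach hard to recommend.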

The only other thing that comes to mind, and it is only tangentially related, is how something such as EF Core could handle this if you opted for inheritance. For instance, with an approach such as TPH (table-per-hierarchy), all that really needs to be done at the lower level to convert a person to a hire is to flip a single "discriminator" column that specifies the type of the instance. You might thus be able to achieve this a bit more efficiently directly in the database, without all the instancing and copying of values. However, since you are still returning the hired Employee objects, some transformation would still take place, and even then none of this is O(1).