How to convert string representation of coordinates to geopandas LineString?


I retrieve geometries from an API and save them as a column in a pandas DataFrame. Next, I convert them to LineStrings using geopandas, filtering out LineStrings that consist of fewer than 3 points. However, the data format is awkward: each row is a list of coordinate strings, and my current conversion takes too many steps, relying on explode and groupby. Is there a faster way to convert coordinates in this format to geopandas LineStrings and to filter out the short ones?

import pandas as pd

data = {'geometry': [
    ['48.0309079, 11.0873018', '48.03204, 11.08798', '49.5073963,8.6355505'],
    ['48.03204, 11.08798', '48.033, 11.089', '49.5073963,8.6355505'],
]}

df = pd.DataFrame(data)

UPD:

Expected outcome:

LINESTRING (48.0309079 11.0873018, 48.03204 11.08798, 49.5073963 8.6355505)
LINESTRING (48.03204 11.08798, 48.033 11.089, 49.5073963 8.6355505)

There is 1 solution below

Aymen Azoui

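Try this: parse each coordinate string into a tuple of floats, build a LineString only when a row has at least 3 points, and drop the remaining rows.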

import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString

data = {
    'geometry': [
        ['48.0309079, 11.0873018', '48.03204, 11.08798', '49.5073963,8.6355505'],
        ['48.03204, 11.08798', '48.033, 11.089', '49.5073963,8.6355505'],
    ]
}

df = pd.DataFrame(data)

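def to_linestring(coords):
    # Parse each 'a, b' string into a tuple of floats.
    points = [tuple(map(float, coord.split(','))) for coord in coords]
    # Keep only lines with at least 3 points; shorter ones become None.
    if len(points) >= 3:
        return LineString(points)
    return None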

df['geometry'] = df['geometry'].apply(to_linestring)
df = df.dropna(subset=['geometry'])
gdf = gpd.GeoDataFrame(df, geometry='geometry')

print(gdf)
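
One thing to check: shapely reads each pair as (x, y), which for geographic data means (longitude, latitude). The strings in the question look like they might be "latitude, longitude", but that is only a guess from the values. The sketch below is one possible variant of the parsing step, not a definitive implementation. It uses a hypothetical helper, parse_line, with made-up parameters min_points and swap_axes. It checks the length before parsing, so short rows are skipped without any float conversion, and it can optionally swap the axis order. It needs only the standard library, and each returned list of tuples can be passed to LineString exactly as in the code above.

```python
def parse_line(coords, min_points=3, swap_axes=False):
    """Parse 'a, b' strings into float tuples.

    Hypothetical helper: returns None when the row has fewer than
    min_points entries, so short lines are skipped before any parsing.
    """
    if len(coords) < min_points:
        return None
    points = []
    for coord in coords:
        # float() ignores surrounding whitespace, so '48.0, 11.0'
        # and '48.0,11.0' are parsed the same way.
        first, second = (float(part) for part in coord.split(','))
        # Optionally swap to (second, first), e.g. (lon, lat),
        # ASSUMING the input order is (lat, lon).
        points.append((second, first) if swap_axes else (first, second))
    return points


rows = [
    ['48.0309079, 11.0873018', '48.03204, 11.08798', '49.5073963,8.6355505'],
    ['48.03204, 11.08798', '48.033, 11.089'],  # too short, will be dropped
]

parsed = [parse_line(row) for row in rows]
kept = [points for points in parsed if points is not None]
print(kept)
```

To reproduce the expected outcome in the question exactly, leave swap_axes at its default so the order in the strings is kept as-is.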