I am new to SQL. I have the the following query that works, attached below.
I tried it multiple times without the GROUP BY function and it didn't work. I was curious as to what this function does as I can't understand any descriptions of it online, specifically this article: https://www.w3schools.com/sql/sql_groupby.asp. I know it causes the rows of the items table to be collected into groups, but why do I need to do that for this query to work?
SELECT
usertype,
CONCAT(start_station_name," to ", end_station_name) AS route,
COUNT(*) as num_trips,
ROUND(AVG(cast(tripduration as int64)/60), 2) AS duration
FROM
`bigquery-public-data.new_york_citibike.citibike_trips`
GROUP BY
start_station_name, end_station_name, usertype
LIMIT 10
This kind of SQL refers just to the rows concatenate(col1, col2) also refers to the rows.
COUNT(*) is one of the aggregate functions that means they are referring to the groups of rows, that's why you need to use the GROUP BY clause. With group by you can perform HAVING -> what is WHERE for the rows, that is HAVING for the groups of rows.
Try to create table with 3 fields of any type as above, populate them with few rows, -> col1 has all values present, col2 has just 2 values, col3 has 1 value, and then try to perform counting of each column to see what happens.