Dataframe grouping functions
Dataframe grouping methods
- class pystarburst.relational_grouped_dataframe.GroupingSets(*sets: Column | List[Column])
Bases: object
Creates a GroupingSets object from a list of column/expression sets that you pass to DataFrame.group_by_grouping_sets(). See DataFrame.group_by_grouping_sets() for examples of how to use this class with a DataFrame. See GROUP BY GROUPING SETS for its counterpart in SQL (several examples are shown below).
Python interface                                            SQL interface
GroupingSets([col("a")], [col("b")])                        GROUPING SETS ((a), (b))
GroupingSets([col("a"), col("b")], [col("c"), col("d")])    GROUPING SETS ((a, b), (c, d))
GroupingSets([col("a"), col("b")])                          GROUPING SETS ((a, b))
GroupingSets(col("a"), col("b"))                            GROUPING SETS ((a, b))
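The Python-to-SQL pairs listed above follow a simple rule: each list argument becomes one grouping set, and bare (unbracketed) arguments are collected into a single set. The sketch below mirrors that rule in plain Python so the mapping can be checked without a cluster. It is not pystarburst's implementation: `render_grouping_sets` is a hypothetical helper, plain strings stand in for `Column` objects, and rejecting a mix of bare columns and lists is an assumption, since the pairs above do not cover that case.

```python
from typing import List, Union

# Stand-in for a pystarburst Column; this sketch uses plain column names.
ColumnName = str


def render_grouping_sets(*sets: Union[ColumnName, List[ColumnName]]) -> str:
    """Render the SQL clause that the listing above pairs with each call shape."""
    if not sets:
        raise ValueError("at least one column or column list is required")

    if all(isinstance(s, list) for s in sets):
        # GroupingSets([a], [b]) -> one grouping set per list
        groups = [list(s) for s in sets]
    elif all(isinstance(s, str) for s in sets):
        # GroupingSets(a, b) -> all bare columns form a single grouping set
        groups = [list(sets)]
    else:
        # Assumption: mixed forms are not described in the source, so reject them.
        raise ValueError("mixing bare columns and lists is not covered here")

    rendered = ", ".join("(" + ", ".join(group) + ")" for group in groups)
    return f"GROUPING SETS ({rendered})"


print(render_grouping_sets(["a"], ["b"]))
print(render_grouping_sets("a", "b"))
```

The second call shows why `GroupingSets(col("a"), col("b"))` and `GroupingSets([col("a"), col("b")])` map to the same SQL clause.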
- class pystarburst.relational_grouped_dataframe.RelationalGroupedDataFrame(df: DataFrame, grouping_exprs: List[Expression], group_type: _GroupType)
Bases: object
Represents an underlying DataFrame with rows that are grouped by common values. Can be used to define aggregations on these grouped DataFrames.
- agg(*exprs: Column | Tuple[ColumnOrName, str] | Dict[str, str]) -> DataFrame
Returns a DataFrame with computed aggregates. See examples in DataFrame.group_by().
- Parameters:
exprs – A variable-length argument list where every element is one of:
- a Column object
- a tuple where the first element is a Column object or a column name and the second element is the name of the aggregate function
- a list of the above
- a dict that maps column names to aggregate function names
Note
The name of the aggregate function to compute must be a valid Trino aggregate function.
See also
DataFrame.agg(), DataFrame.group_by()
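The tuple, dict, and list forms accepted by agg() all describe the same thing: pairs of a column and an aggregate function name. As a rough illustration, the sketch below flattens those forms into one list of pairs. It is plain Python, not pystarburst code: `flatten_agg_specs` is a hypothetical helper, column names are plain strings, and Column objects are left out because they already carry their own aggregate expression.

```python
from typing import List, Tuple


def flatten_agg_specs(*exprs: object) -> List[Tuple[str, str]]:
    """Flatten tuple, dict, and list arguments into (column, aggregate) pairs."""
    pairs: List[Tuple[str, str]] = []
    for expr in exprs:
        if isinstance(expr, tuple):
            # ("a", "sum"): column name first, aggregate function name second
            column, agg_name = expr
            pairs.append((column, agg_name))
        elif isinstance(expr, dict):
            # {"b": "max"}: maps column names to aggregate function names
            pairs.extend(expr.items())
        elif isinstance(expr, list):
            # A list of any of the above forms
            pairs.extend(flatten_agg_specs(*expr))
        else:
            raise TypeError(f"unsupported aggregate spec: {expr!r}")
    return pairs


specs = flatten_agg_specs(("a", "sum"), {"b": "max", "c": "min"}, [("d", "avg")])
print(specs)
```

Each aggregate name in the resulting pairs would still need to be a valid Trino aggregate function, as the note above states.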
- builtin(agg_name: str) -> Callable
Computes the builtin aggregate agg_name over the specified columns. Use this function to invoke any aggregates not explicitly listed in this class. See examples in DataFrame.group_by().
- function(agg_name: str) -> Callable
Computes the builtin aggregate agg_name over the specified columns. Use this function to invoke any aggregates not explicitly listed in this class. See examples in DataFrame.group_by().
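Because builtin() and function() return a Callable, using them is a two-step call: first select the aggregate by name, then apply the returned callable to the columns. The toy class below imitates that call shape over in-memory rows. It is a minimal sketch and not how pystarburst works internally: `ToyGrouped`, its constructor, and the `_AGGREGATES` table are hypothetical, and treating `function` as an alias of `builtin` is an assumption based only on the two methods sharing the same description above.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical lookup table; the real library delegates to Trino aggregates.
_AGGREGATES: Dict[str, Callable] = {"sum": sum, "min": min, "max": max, "avg": mean}


class ToyGrouped:
    """In-memory stand-in for a grouped DataFrame, grouped by one key column."""

    def __init__(self, rows: List[dict], key: str) -> None:
        self._groups: Dict[object, List[dict]] = defaultdict(list)
        for row in rows:
            self._groups[row[key]].append(row)

    def builtin(self, agg_name: str) -> Callable[[str], Dict[object, float]]:
        agg = _AGGREGATES[agg_name]  # step 1: pick the aggregate by name

        def apply(column: str) -> Dict[object, float]:
            # step 2: apply it to the named column within each group
            return {
                group_key: agg([row[column] for row in group_rows])
                for group_key, group_rows in self._groups.items()
            }

        return apply

    # Assumption: modeled as an alias because the descriptions are identical.
    function = builtin


rows = [{"k": "x", "v": 1}, {"k": "x", "v": 3}, {"k": "y", "v": 10}]
grouped = ToyGrouped(rows, "k")
print(grouped.builtin("sum")("v"))
print(grouped.function("max")("v"))
```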