pyspark.pandas.DataFrame.map
- DataFrame.map(func)
Apply a function to a DataFrame elementwise.
This method applies a function that accepts and returns a scalar to every element of a DataFrame.
New in version 4.0.0: DataFrame.applymap was deprecated and renamed to DataFrame.map.
Note
This API executes the function once to infer the type, which is potentially expensive, for instance, when the dataset is created after aggregations or sorting.
To avoid this, specify the return type in func, for instance, as below:
>>> def square(x) -> np.int32:
...     return x ** 2
pandas-on-Spark uses the return type hint and does not try to infer the type.
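For instance, a minimal sketch of applying the hinted square function above (assuming numpy is imported as np, pyspark.pandas as ps, and a small sdf frame created here purely for illustration):
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> sdf = ps.DataFrame([[1, 2], [3, 4]])  # hypothetical input frame for illustration
>>> sdf.map(square)  # the np.int32 return type hint avoids the extra inference pass
   0   1
0  1   4
1  9  16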
- Parameters
- func : callable
Python function that returns a single value from a single value.
- Returns
- DataFrame
Transformed DataFrame.
Examples
>>> df = ps.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567
>>> def str_len(x) -> int:
...     return len(str(x))
>>> df.map(str_len)
   0  1
0  3  4
1  5  5
>>> def power(x) -> float:
...     return x ** 2
>>> df.map(power)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
You can omit the type hint and let pandas-on-Spark infer the return type.
>>> df.map(lambda x: x ** 2)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
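The elementwise function is not limited to numeric results. As a minimal sketch (assuming the same df as above; as_text is a name introduced here only for illustration), a str return type hint produces string-typed columns:
>>> def as_text(x) -> str:
...     return str(x)
>>> df.map(as_text).dtypes
0    object
1    object
dtype: object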