transforms

Functions

parse_kafka(→ pyspark.sql.DataFrame)

Util function to parse the json message based on a given schema

apply_transforms(→ pyspark.sql.DataFrame)

Util apply basic transformation to the DataFrame

Module Contents

transforms.parse_kafka(df: pyspark.sql.DataFrame, schema: pyspark.sql.types.StructType) pyspark.sql.DataFrame

Util function to parse the json message based on a given schema :param df: A stream based dataframe from the kafka stream read :type df: DataFrame :param schema: schema definition :type schema: StructType

Returns:

parsed json of the stream dataframe

Return type:

df (DataFrame)

transforms.apply_transforms(df: pyspark.sql.DataFrame) pyspark.sql.DataFrame

Util apply basic transformation to the DataFrame

Parameters:

df (DataFrame) – flawed Spark DataFrame

Returns:

rectified DataFrame

Return type:

parsed (DataFrame)