pyspark.sql.functions.from_json
pyspark.sql.functions.from_json(col, schema, options=None)
Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType with the specified schema. Returns null if the string cannot be parsed.
New in version 2.1.0.
- Parameters
- col : Column or str
string column in JSON format.
- schema : DataType or str
a StructType or ArrayType of StructType to use when parsing the JSON column.
Changed in version 2.3: the DDL-formatted string is also supported for schema.
- options : dict, optional
options to control parsing. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
Examples
>>> from pyspark.sql.functions import from_json, lit, schema_of_json
>>> from pyspark.sql.types import *
>>> data = [(1, '''{"a": 1}''')]
>>> schema = StructType([StructField("a", IntegerType())])
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=Row(a=1))]
>>> df.select(from_json(df.value, "a INT").alias("json")).collect()
[Row(json=Row(a=1))]
>>> df.select(from_json(df.value, "MAP<STRING,INT>").alias("json")).collect()
[Row(json={'a': 1})]
>>> data = [(1, '''[{"a": 1}]''')]
>>> schema = ArrayType(StructType([StructField("a", IntegerType())]))
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=[Row(a=1)])]
>>> schema = schema_of_json(lit('''{"a": 0}'''))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=Row(a=None))]
>>> data = [(1, '''[1, 2, 3]''')]
>>> schema = ArrayType(IntegerType())
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=[1, 2, 3])]