I'm using PySpark 2.4 and I have already enabled Hive support:
spark = SparkSession.builder.appName("spark").enableHiveSupport().getOrCreate()
but when I run:
spark.sql("""
CREATE TABLE reporting.sport_ads AS
SELECT
*
, 'Home' as HomeOrAway
, HomeTeam as TeamName
FROM adwords_ads_brand
UNION
SELECT
*
, 'Away' as HomeOrAway
, AwayTeam as TeamName
FROM adwords_ads_brand
""")
I hit the error:
pyspark.sql.utils.AnalysisException: "Hive support is required to CREATE Hive TABLE (AS SELECT);;\n'CreateTable `reporting`.`sport_ads`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, ErrorIfExists\n+- Distinct\n +- Union\n :-
....
This makes no sense to me, since Hive support is clearly enabled on the builder. Am I doing something wrong?
P.S. I should add that this exact code works fine both in Databricks and in Spark with Scala.