I am using spark-sql 2.4.1 with Java 8. I have a scenario like the one below:
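// imports: java.util.*, org.apache.spark.sql.*, org.apache.spark.sql.types.*
// assumes an existing SparkSession named spark
List<Row> data = Arrays.asList(
RowFactory.create("20", "score", "school", "2018-03-31", 14, 12, 20),
RowFactory.create("21", "score", "school", "2018-03-31", 13, 13, 21),
RowFactory.create("22", "rate", "school", "2018-03-31", 11, 14, 22),
RowFactory.create("21", "rate", "school", "2018-03-31", 13, 12, 23)
);
StructType schema = new StructType()
.add("id", DataTypes.StringType).add("code", DataTypes.StringType).add("entity", DataTypes.StringType).add("date", DataTypes.StringType)
.add("column1", DataTypes.IntegerType).add("column2", DataTypes.IntegerType).add("column3", DataTypes.IntegerType);
Dataset<Row> df = spark.createDataFrame(data, schema);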
Dataset<Row> resultDs = df
.withColumn("column_names",
array(Arrays.asList(df.columns()).stream().map(s -> new Column(s)).toArray(Column[]::new))
);
**But this shows each row's column values instead of the column names. What is wrong here? How can I get "column_names" in Java?**
I am trying to solve the use-case below:
Let's say I have 100 columns, column1 to column100. The calculation for each column differs depending on the column name and data. Each time I run my Spark job, I am told which columns I need to calculate, but my code contains the logic for all columns, and each column's logic might be different. I need to skip the logic for the unspecified columns. Because I select only the specified columns from the DataFrame, the logic for the non-selected columns throws a "column not found" exception. I need to fix this.
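To make the use-case concrete, here is a rough sketch of the selection step I have in mind, written in plain Java so it runs without Spark. Everything in it is hypothetical and only for illustration: the class name `ColumnLogicSelector`, the `LOGIC` registry, the sample expressions, and the `_calc` suffix are my assumptions, not existing code. The idea is to apply a column's logic only when the column was requested for this run, is present in the DataFrame, and has logic registered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

class ColumnLogicSelector {

    // Hypothetical registry: column name -> logic for that column.
    // Each entry builds a SQL expression string so the sketch needs no Spark classes;
    // the expressions themselves are placeholders.
    static final Map<String, UnaryOperator<String>> LOGIC = new LinkedHashMap<>();
    static {
        LOGIC.put("column1", c -> c + " * 2");
        LOGIC.put("column2", c -> c + " + 1");
        LOGIC.put("column3", c -> "abs(" + c + ")");
    }

    // Builds expressions only for columns that were requested for this run,
    // are present in the DataFrame, and have logic registered.
    // Any other column is skipped instead of causing a lookup failure.
    static List<String> expressionsFor(List<String> requested, List<String> available) {
        List<String> out = new ArrayList<>();
        for (String name : requested) {
            if (available.contains(name) && LOGIC.containsKey(name)) {
                out.add(LOGIC.get(name).apply(name) + " AS " + name + "_calc");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "available" stands in for Arrays.asList(df.columns()).
        List<String> available = Arrays.asList("id", "code", "entity", "date", "column1", "column3");
        List<String> requested = Arrays.asList("column1", "column2");
        // column2 is requested but not present, so it is skipped.
        System.out.println(expressionsFor(requested, available));
        // prints [column1 * 2 AS column1_calc]
    }
}
```

In the real job I assume the resulting strings could be passed to `df.selectExpr(...)`, with `available` taken from `df.columns()`, but I have not verified that this is the right approach.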