@SharathHebbar
Created October 16, 2024 13:58
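from pyspark.sql import SparkSession

# Reuse the active session (already defined as `spark` in notebooks and the pyspark shell)
spark = SparkSession.builder.getOrCreate()

# Read the CSV with header parsing disabled, so the first row is loaded as data
# and the columns get the default names _c0, _c1, ...
df = spark.read.format("csv").option("header", "false").load("/path/to/csvfile")
# Extract the first row to use as the header
new_header = df.first()
# Drop the header row. eqNullSafe keeps rows whose first column is null, which a plain != would silently drop.
# Caveat: any data row whose first column equals the header's first value is dropped too.
df_without_first_row = df.filter(~df["_c0"].eqNullSafe(new_header["_c0"]))
# Rename the columns to the values from the first row
new_column_names = [new_header[col] for col in df.columns]
df_with_new_header = df_without_first_row.toDF(*new_column_names)
df_with_new_header.show()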
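
The rename step above passes the first-row values straight to `toDF`, which can cause trouble if a header cell is empty or two cells hold the same text. A minimal sketch of one way to tidy the names first follows. The helper `clean_header_names` is hypothetical (it is not part of PySpark or of the original snippet), and the fallback naming and suffix scheme are assumptions chosen for illustration.

```python
def clean_header_names(values):
    """Turn raw first-row values into unique, non-empty column names.

    Hypothetical helper for illustration; not part of PySpark.
    """
    seen = {}
    names = []
    for position, value in enumerate(values):
        # Fall back to a positional name when the header cell is missing or blank
        name = str(value).strip() if value is not None else ""
        if not name:
            name = f"_c{position}"
        # Append a numeric suffix to repeated names so each column stays addressable
        count = seen.get(name, 0)
        seen[name] = count + 1
        names.append(name if count == 0 else f"{name}_{count}")
    return names
```

Under these assumptions it could be used as `df_without_first_row.toDF(*clean_header_names(new_header))`, since a `Row` iterates over its values. The sketch does not guard against a suffixed name colliding with a name that already exists in the header.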