Spark Scala: Grouping Values In A Key-Value Pair

BY: @sqlinsix | CREATED: Sept. 24, 2021, 3:12 p.m.

First, we'll create our data set. In this test data, each ID has one or more uni values in the Val column (ID 2 has three, ID 3 has one). Ultimately, we'd like to output these per ID as key-value pairs, with the column name as the key and the uni value as the value. For this example, we're using Databricks. The $"..." column syntax, toDF, and display all rely on what a Databricks notebook provides, so running this elsewhere needs a SparkSession with its implicits imported and show() in place of display().

// DataFrame types plus the built-in column functions (collect_set, struct)
import org.apache.spark.sql._
import org.apache.spark.sql.functions._

val table1 = Seq(
  (1,"A","uni1"),
  (1,"A","uni2"),
  (2,"B","uni1"),
  (2,"B","uni2"),
  (2,"B","uni3"),
  (3,"C","uni1"),
  (4,"D","uni1"),
  (5,"E","uni1")
).toDF("ID","Letter","Val")
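
Outside Databricks, toDF and the $"..." syntax only resolve once a session exists and its implicits are imported. The following is a minimal sketch of that setup, assuming the spark-sql library is on the classpath and the code runs in a REPL or script. The app name, the local master setting, and the two-row sample are illustrative choices, not part of the original example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimentation. A Databricks notebook
// already provides `spark`, in which case getOrCreate() reuses it.
val spark = SparkSession.builder()
  .appName("grouping-values") // hypothetical name
  .master("local[*]")
  .getOrCreate()

// Brings toDF and the $"column" syntax into scope.
import spark.implicits._

// A two-row sample, only to confirm that toDF and $ now resolve.
val sample = Seq((1, "A", "uni1"), (1, "A", "uni2")).toDF("ID", "Letter", "Val")

// show() is the plain-Spark counterpart to the notebook's display().
sample.select($"ID", $"Val").show()
```

With this in scope, the table1 definition above should run unchanged.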

Next, we'll group by ID and Letter and collect each group's Val entries into an array in a new column called UniSets. Note that collect_set drops duplicate values within a group and does not guarantee the order of the elements. From here, we want each value stored as a key-value pair keyed by the name of the column.

display(
  table1
  .groupBy($"ID",$"Letter")
  .agg(
    collect_set($"Val").as("UniSets")
  )
)

[IMAGE: https://images.hive.blog/DQmcKY2zCjGkNkL2hakTqsyEdf5mPuiYA7Lqv7hvNoXKd2m/1.png]
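
Because collect_set makes no ordering promise, the array contents can appear in a different order between runs. If a stable result matters (for comparisons or tests, say), one option is to wrap the aggregate in sort_array. This is a sketch of that variation rather than part of the original walkthrough, and it assumes table1 and the session implicits from the earlier cells are in scope. The name grouped is introduced here for illustration.

```scala
import org.apache.spark.sql.functions.{collect_set, sort_array}

// Same grouping as above, with each array sorted ascending so the
// output no longer depends on the order in which rows were collected.
val grouped = table1
  .groupBy($"ID", $"Letter")
  .agg(sort_array(collect_set($"Val")).as("UniSets"))
  .orderBy($"ID")

// false turns off truncation so the full arrays are printed.
grouped.show(false)
```

If duplicates should be kept rather than dropped, collect_list can be used in place of collect_set.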

Finally, we'll wrap Val in a struct before collecting. A struct keeps its field name, so each element of the array becomes a key-value pair such as {"Val": "uni1"}:

display(
  table1
  .groupBy($"ID",$"Letter")
  .agg(
    collect_set(struct($"Val")).as("UniSets")
  )
)

[IMAGE: https://images.hive.blog/DQmZZyBLdQzu3fSZP61eFSFmik4qRDJ2edeTLJqKUTSuGRw/2.png]
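
To check what the struct version produces without relying on the notebook rendering, the schema can be printed and the column serialized with to_json. This is a small sketch under the same assumptions as before (table1 and the session implicits already in scope). The names keyed, asJson, and UniSetsJson are introduced here for illustration.

```scala
import org.apache.spark.sql.functions.{collect_set, struct, to_json}

val keyed = table1
  .groupBy($"ID", $"Letter")
  .agg(collect_set(struct($"Val")).as("UniSets"))

// UniSets should be reported as an array of structs with one field, Val.
keyed.printSchema()

// to_json renders each array of structs as JSON text, which makes the
// key-value shape explicit, e.g. [{"Val":"uni1"}] for ID 3.
val asJson = keyed.withColumn("UniSetsJson", to_json($"UniSets"))
asJson.show(false)
```

The key comes from the struct's field name, so aliasing the column inside struct (for example $"Val".as("uni")) would change the key in the output.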

TAGS: [ #data ] [ #development ] [ #powershell ] [ #etlhelp ]
