How can I replace the value of the timestamp1 column with 999 when session == 0?
Expected output:

+-------+----------+----+
|session|timestamp1| id2|
+-------+----------+----+
|      1|         1|null|
|      1|         2| 5.0|
|      1|         3| NaN|
|      1|         4|null|
|      0|       999|10.0|
|      1|         6| NaN|
|      0|       999| NaN|
+-------+----------+----+

Is it possible to do it using replace() in PySpark?
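Recommended answer
replace() is not a good fit here, because it substitutes values based on the column's own contents, not on a condition over another column such as session. Use the when function (with otherwise) instead: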
from pyspark.sql.functions import when

# Set timestamp1 to 999 where session is 0; keep the existing value otherwise.
targetDf = df.withColumn(
    "timestamp1",
    when(df["session"] == 0, 999).otherwise(df["timestamp1"]),
)