Reading a file in Spark Scala whose name contains the special characters '{' and '}'

Posted 2025-01-27 03:14:54

I want to read a file in Spark Scala named monthlyPurchaseFile{202205}-May.TXT.
I am using the following code:
val df = spark.read.text("handel_special_ch/monthlyPurchaseFile{202205}-May.TXT")

But I am getting the exception below:

org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/hdp_batch_datalake_dev/handel_special_ch/monthlyPurchaseFile{202205}-May.TXT
  at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$3(DataSource.scala:792)
  at org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
  at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
  at scala.util.Success.$anonfun$map$1(Try.scala:255)
  at scala.util.Success.map(Try.scala:213)
  at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
  at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
  at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
  at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
  at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
  at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
  at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
  at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

Please suggest how I can read a file that has the characters { and } in its name.

2 comments

雨落□心尘 2025-02-03 03:14:54

The path you pass to the spark.read.text method is treated as a glob pattern rather than a literal path. Since { and } are glob metacharacters, Spark tries to match paths against that pattern instead of looking for the literal file name. You can use the ? character, which matches any single character, so the following should work:

val df = spark.read.text("handel_special_ch/monthlyPurchaseFile?202205?-May.TXT"
爱人如己 2025-02-03 03:14:54

The character \\ works as an escape sequence, so the code below works as expected and solves the issue:

val df = spark.read.text("handel_special_ch/monthlyPurchaseFile\\{202205\\}-May.TXT"
