当 Parse_Json 尝试解析“NaN”时,会给出 invalid 值。值是 pandas 数据帧中的空值

发布于 2025-01-09 18:01:37 字数 654 浏览 1 评论 0原文

    select
parse_json('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}');

Img 显示无效的 json

在解析雪花中具有 NaN 值的 JSON 时遇到问题。它说无效的 JSON,因此我们无法正确解析或使用 JSON 函数,尤其是在操作时。

目前,我们尝试了

x = 'NaN' then NULL else x end 的情况。

但这只是解决问题的临时方法。有没有其他方法/雪花功能可以帮助我。 ?

注意:我已尝试使用 try_parse_json,但问题仍然相同。

    select
parse_json('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}');

Img show the invalid json

I face an issue when parsing JSON having NaN values in snowflake. it says invalid JSON and so we can not parse or use JSON function properly especially when we manipulate.

currently, we tried with the

case when x = 'NaN' then NULL else x end.

but that is a temporary way to solve it. is there any other way/ snowflake function that helps me out. ?

Note: I have tried with try_parse_json, but still the issue remains the same.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

夏至、离别 2025-01-16 18:01:37

您的 JSON 数据无效,只需检查任何 JSON 验证器,它就会抛出错误,并且行为与屏幕截图中显示的行为相同。您的 json 文档有以下错误

在此处输入图像描述

以及修复逗号后的有效 json和 引号。

create
    or replace table vartab (n number(2), v variant);

insert into
    vartab
select
    column1 as n,
    parse_json(column2) as v
from
values
    (
        1,
        '{
   "fullName":"Niks",
   "age":15,
   "address":{
      "areaCode":"1234"
   },
   "state":"NaN",
   "movies":[
      {
         "name":"Iron Man",
         "budget":"$150M",
         "producer":"Avi"
      },
      {
         "name":"Sherlock Holmes",
         "budget":"$200M",
         "producer":"Joel Silver"
      }
   ]
}'
    ) as vals;
Select
    *,
    v:state
from
    vartab;
   

输入图片此处描述

Your JSON data is invalid and just check any JSON validator , it will throw out error and it is same behaviour what is shown in your screenshot.Your json doc has below errors

enter image description here

and below valid json after fixing comma and quotes.

create
    or replace table vartab (n number(2), v variant);

insert into
    vartab
select
    column1 as n,
    parse_json(column2) as v
from
values
    (
        1,
        '{
   "fullName":"Niks",
   "age":15,
   "address":{
      "areaCode":"1234"
   },
   "state":"NaN",
   "movies":[
      {
         "name":"Iron Man",
         "budget":"$150M",
         "producer":"Avi"
      },
      {
         "name":"Sherlock Holmes",
         "budget":"$200M",
         "producer":"Joel Silver"
      }
   ]
}'
    ) as vals;
Select
    *,
    v:state
from
    vartab;
   

enter image description here

初见你 2025-01-16 18:01:37

NaN 处理

NaN 是一个有效的双精度值,因此,如果您使用 TRY_TO_DOUBLE 解析它,您会得到正确的值:

select column1, 
    try_to_number(column1) as as_number, 
    try_to_double(column1) as as_double
from values
    ('1.0'),
    ('NaN');

给出:

COLUMN1AS_NUMBERAS_DOUBLE
111
NaNnullNaN

但如果您想替换它:

String replacement

您可以采用该字符串形式:

select
    regexp_replace(column1,'NaN', 'null') as nan_free
    ,parse_json(column1):state as var_state_nan
    ,parse_json(nan_free):state as var_state_normal
    ,is_null_value(var_state_normal) as is_json_null
from values ('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}')
NAN_FREEVAR_STATE_NANVAR_STATE_NORMALIS_JSON_NULL
{"fullName":"尼克斯", "年龄":15, "地址": { "areaCode":"1234", }, "状态": null , "电影": [ {"名称":"钢铁侠" ,"预算":"1.5 亿美元","制作人":"Avi"}, {"name":"夏洛克Holmes","budget":"$200M"," Producer":"Joel Silver"},]}NaNnullTRUE

JSON NaN 处理

try_to_double 可以使用,技巧是 json 值是 varaint,并且 to_double 想要一个 TEXT/ STRING/VARCHAR 输入,因此您需要通过 ::text 将其转换为字符串

并且您可以通过直接比较来测试双精度值与 NaN,文档有关于这是如何的注释 与 IEEE 754 不同行为。

但 NULLIF 函数无法正确处理此问题,但您可以根据您的问题使用 IFF 而不是 CASE。

select
    parse_json(column1):state as var_state_nan
    ,try_to_double(var_state_nan::text) as double_val
    ,double_val = 'NaN' as is_nan
    ,NULLIF(double_val, 'NaN') as should_work
    ,IFF(double_val = 'NaN', null, double_val) as does_work
from values ('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}')

VAR_STATE_NANDOUBLE_VALIS_NANSHOULD_WORKDOES_WORK
NaNNaNTRUENaNnull

NaN handling

NaN is a valid double value, thus if you parse it with TRY_TO_DOUBLE you get correct value:

select column1, 
    try_to_number(column1) as as_number, 
    try_to_double(column1) as as_double
from values
    ('1.0'),
    ('NaN');

gives:

COLUMN1AS_NUMBERAS_DOUBLE
111
NaNnullNaN

but if you are wanting to replace it:

String replacing

you can do that string form:

select
    regexp_replace(column1,'NaN', 'null') as nan_free
    ,parse_json(column1):state as var_state_nan
    ,parse_json(nan_free):state as var_state_normal
    ,is_null_value(var_state_normal) as is_json_null
from values ('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}')
NAN_FREEVAR_STATE_NANVAR_STATE_NORMALIS_JSON_NULL
{"fullName":"Niks", "age":15, "address": { "areaCode":"1234", }, "state": null , "movies": [ {"name":"Iron Man","budget":"$150M","producer":"Avi"}, {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}NaNnullTRUE

JSON NaN processing

try_to_double as can be used, the trick is json value are varaint, and to_double wants a TEXT/STRING/VARCHAR input, thus you need to cast it to string via ::text

And you can test for a double against NaN by comparing directly, the doc's have notes about how this is different to IEEE 754 behavior.

But the NULLIF function does not correctly handle this, but you can use IFF instead of CASE as per you question.

select
    parse_json(column1):state as var_state_nan
    ,try_to_double(var_state_nan::text) as double_val
    ,double_val = 'NaN' as is_nan
    ,NULLIF(double_val, 'NaN') as should_work
    ,IFF(double_val = 'NaN', null, double_val) as does_work
from values ('{"fullName":"Niks",
      "age":15,
      "address": {
            "areaCode":"1234",
                },
       "state": NaN ,
      "movies": [
          {"name":"Iron Man","budget":"$150M","producer":"Avi"},
          {"name":"Sherlock Holmes","budget":"$200M","producer":"Joel Silver"},]}')

VAR_STATE_NANDOUBLE_VALIS_NANSHOULD_WORKDOES_WORK
NaNNaNTRUENaNnull
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文