.csv to .arff function in Python

Posted on 2025-01-25 21:34:57

I'm trying to write a conversion function from CSV to ARFF. Right now I have this:

def csv2arff(csv_path, arff_path=None):
    with open(csv_path, 'r') as fr:
        attributes = []
        
        if arff_path is None:
            arff_path = csv_path[:-4] + '_prueba.arff'  # *.csv -> *_prueba.arff
            
        write_sw = False
        with open(arff_path, 'w') as fw:
            fw.write('@relation base_datos_modelo_3_limpia \n')
            firstline = fr.readlines()[0].rstrip()
            fw.write(firstline)

and that gives me:

@relation base_datos_modelo_3_limpia

DVJ_Valgus_KneeMedialDisplacement_D_discr,BMI,AgeGroup,ROM-PADF-KE_D,DVJ_Valgus_FPPA_D_discr,TrainFrequency,DVJ_Valgus_FPPA_ND_discr,Asym_SLCMJLanding-pVGRF(10percent)_discr,Asym-ROM-PHIR(≥8)_discr,Asym_TJ_Valgus_FPPA(10percent)_discr,TJ_Valgus_FPPA_ND_discr,Asym-ROM-PHF-KE(≥8)_discr,TJ_Valgus_FPPA_D_discr,Asym_SLCMJ-Height(10percent)_discr,Asym_YBTpl(10percent)_discr,Position,Asym-ROM-PADF-KE(≥8º)_discr,DVJ_Valgus_KneeMedialDisplacement_ND_discr,DVJ_Valgus_Knee-to-ankle-ratio_discr,Asym-ROM-PKF(≥8)_discr,Asym-ROM-PHABD(≥8)_discr,Asym-ROM-PHF-KF(≥8)_discr,Asym-ROM-PHER(≥8)_discr,AsymYBTanterior10percentdiscr,Asym-ROM-PHABD-HF(≥8)_discr,Asym-ROM-PHE(≥8)_discr,Asym(>4cm)-DVJ_Valgus_Knee;edialDisplacement_discr,Asym_SLCMJTakeOff-pVGRF(10percent)_discr,Asym-ROM-PHADD(≥8)_discr,Asym-YBTcomposite(10percent)_discr,Asym_SingleHop(10percent)_discr,Asym_YBTpm(10percent)_discr,Asym_DVJ_Valgus_FPPA(10percent)_discr,Asym_SLCMJ-pLFT(10percent)_discr,DominantLeg,Asym-ROM-PADF-KF(≥8)_discr,ROM-PHER_ND,CPRDmentalskills,POMStension,STAI-R,ROM-PHER_D,ROM-PHIR_D,ROM-PADF-KF_ND,ROM-PADF-KF_D,Age_at_PHV,ROM-PHIR_ND,CPRDtcohesion,Eperience,ROM-PHABD-HF_D,MaturityOffset,Weight,ROM-PHADD_ND,Height,ROM-PHADD_D,Age,POMSdepressio,ROM-PADF-KE_ND,POMSanger,YBTanterior_Dnorm,YBTanterior_NDnorm,POMSvigour,Soft-Tissue_injury_≥4days

So I want to put "@attribute" before each attribute and replace each "," with "\n". I don't know how to do it: I tried writing a function to replace the "," but it didn't work. Any ideas?

Thank you guys.

明媚殇 2025-02-01 21:34:57

Try the liac-arff library.

Here is an example for converting the UCI iris dataset from ARFF to CSV and then back to ARFF:

import csv
import arff

# arff -> csv
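with open('./iris.arff', 'r') as fp:
    content = arff.load(fp)
# newline='' stops the csv module from writing blank rows on Windows
with open('./out.csv', 'w', newline='') as fp:
    writer = csv.writer(fp)
    # each attribute is a (name, type) pair; the CSV header only needs the names
    header = [name for name, _type in content['attributes']]
    writer.writerow(header)
    writer.writerows(content['data'])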

# csv -> arff
with open('./out.csv', 'r', newline='') as fp:
    reader = csv.reader(fp)
    header = next(reader)  # first row holds the attribute names
    data = list(reader)    # remaining rows are the instances

content = {}
content['relation'] = "from my csv file"
content['attributes'] = []
for n in header:
    if n == "class":
        content['attributes'].append((n, ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']))
    else:
        content['attributes'].append((n, 'NUMERIC'))
content['data'] = data
with open('./out.arff', 'w') as fp:
    arff.dump(content, fp)

NB: For the last stage, we need to specify the nominal class values, which you could determine by scanning the data.
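
As a rough sketch of that scanning step (not part of the original answer), the snippet below guesses each column's type from the rows read out of the CSV. The helper names `infer_attributes` and `_is_number` are made up for illustration, and the rule "every value parses as a float means NUMERIC, anything else is nominal" is an assumption that may not suit every dataset. It returns `(name, type)` pairs in the same shape the example above assigns to `content['attributes']`; the `'STRING'` fallback for empty columns is, as far as I know, also a type name liac-arff accepts.

```python
import csv
import io


def _is_number(text):
    """Return True if text parses as a float (hypothetical helper)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_attributes(header, rows):
    """Guess an ARFF type for every column by scanning all rows.

    Assumption: a column whose values all parse as floats is NUMERIC;
    any other column is nominal, with its sorted distinct values as labels.
    """
    attributes = []
    for index, name in enumerate(header):
        # Distinct values of this column, ignoring missing markers.
        values = {row[index] for row in rows if row[index] not in ('', '?')}
        if not values:
            # Nothing to infer from, so fall back to a free-text attribute.
            attributes.append((name, 'STRING'))
        elif all(_is_number(v) for v in values):
            attributes.append((name, 'NUMERIC'))
        else:
            attributes.append((name, sorted(values)))
    return attributes


# Small in-memory CSV so the sketch runs without any files.
sample = io.StringIO("sepallength,class\n5.1,Iris-setosa\n7.0,Iris-versicolor\n")
reader = csv.reader(sample)
header = next(reader)
rows = list(reader)

attributes = infer_attributes(header, rows)
print(attributes)
# [('sepallength', 'NUMERIC'), ('class', ['Iris-setosa', 'Iris-versicolor'])]
```

With something like this, the hard-coded `if n == "class"` branch could be replaced by `content['attributes'] = infer_attributes(header, data)`. Columns that hold numeric codes for categories (0/1 flags, for example) would come out as NUMERIC under this rule, so those would still need to be listed by hand.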
