.csv to .arff function in Python

Posted 2025-01-25 21:34:57, 1953 characters, 3 views, 0 comments

I'm trying to write a function that converts CSV to ARFF. Right now I have this:

def csv2arff(csv_path, arff_path=None):
    with open(csv_path, 'r') as fr:
        attributes = []
        
        if arff_path is None:
            arff_path = csv_path[:-4] + '_prueba.arff'  # *.csv -> *_prueba.arff
            
        write_sw = False
        with open(arff_path, 'w') as fw:
            fw.write('@relation base_datos_modelo_3_limpia \n')
            firstline = fr.readlines()[0].rstrip()
            fw.write(firstline)

and that gives me:

@relation base_datos_modelo_3_limpia

DVJ_Valgus_KneeMedialDisplacement_D_discr,BMI,AgeGroup,ROM-PADF-KE_D,DVJ_Valgus_FPPA_D_discr,TrainFrequency,DVJ_Valgus_FPPA_ND_discr,Asym_SLCMJLanding-pVGRF(10percent)_discr,Asym-ROM-PHIR(≥8)_discr,Asym_TJ_Valgus_FPPA(10percent)_discr,TJ_Valgus_FPPA_ND_discr,Asym-ROM-PHF-KE(≥8)_discr,TJ_Valgus_FPPA_D_discr,Asym_SLCMJ-Height(10percent)_discr,Asym_YBTpl(10percent)_discr,Position,Asym-ROM-PADF-KE(≥8º)_discr,DVJ_Valgus_KneeMedialDisplacement_ND_discr,DVJ_Valgus_Knee-to-ankle-ratio_discr,Asym-ROM-PKF(≥8)_discr,Asym-ROM-PHABD(≥8)_discr,Asym-ROM-PHF-KF(≥8)_discr,Asym-ROM-PHER(≥8)_discr,AsymYBTanterior10percentdiscr,Asym-ROM-PHABD-HF(≥8)_discr,Asym-ROM-PHE(≥8)_discr,Asym(>4cm)-DVJ_Valgus_Knee;edialDisplacement_discr,Asym_SLCMJTakeOff-pVGRF(10percent)_discr,Asym-ROM-PHADD(≥8)_discr,Asym-YBTcomposite(10percent)_discr,Asym_SingleHop(10percent)_discr,Asym_YBTpm(10percent)_discr,Asym_DVJ_Valgus_FPPA(10percent)_discr,Asym_SLCMJ-pLFT(10percent)_discr,DominantLeg,Asym-ROM-PADF-KF(≥8)_discr,ROM-PHER_ND,CPRDmentalskills,POMStension,STAI-R,ROM-PHER_D,ROM-PHIR_D,ROM-PADF-KF_ND,ROM-PADF-KF_D,Age_at_PHV,ROM-PHIR_ND,CPRDtcohesion,Eperience,ROM-PHABD-HF_D,MaturityOffset,Weight,ROM-PHADD_ND,Height,ROM-PHADD_D,Age,POMSdepressio,ROM-PADF-KE_ND,POMSanger,YBTanterior_Dnorm,YBTanterior_NDnorm,POMSvigour,Soft-Tissue_injury_≥4days

So I want to put "@attribute" before each attribute and replace each "," with "\n", but I don't know how to do it. I tried writing a function to replace the "," but it didn't work. Any ideas?

Thank you guys.

Comments (1)

明媚殇 2025-02-01 21:34:57

Try the liac-arff library.

Here is an example for converting the UCI iris dataset from ARFF to CSV and then back to ARFF:

import csv
import arff

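# arff -> csv
with open('./iris.arff', 'r') as fp:
    content = arff.load(fp)  # close the input file once it has been parsed
with open('./out.csv', 'w', newline='') as fp:  # newline='' as the csv docs recommend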
    writer = csv.writer(fp)
    header = []
    for n, t in content['attributes']:
        header.append(n)
    writer.writerow(header)
    writer.writerows(content['data'])

# csv -> arff
with open('./out.csv', 'r', newline='') as fp:
    reader = csv.reader(fp)
    header = None
    data = []
    for row in reader:
        if header is None:
            header = row
        else:
            data.append(row)

content = {}
content['relation'] = "from my csv file"
content['attributes'] = []
for n in header:
    if n == "class":
        content['attributes'].append((n, ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']))
    else:
        content['attributes'].append((n, 'NUMERIC'))
content['data'] = data
with open('./out.arff', 'w') as fp:
    arff.dump(content, fp)

NB: For the last stage, we need to specify the nominal class values, which you could determine by scanning the data.
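
One way to do that scan is sketched below. It is only an illustration: the helper names `infer_attributes` and `_is_number` and the sample data are made up for this example, and the rule used (a column is NUMERIC if every non-missing value parses as a float, otherwise nominal) is an assumption that may not suit every dataset. In particular, discretised columns coded as 0/1 would be guessed as NUMERIC, so you may want to override some columns by hand.

```python
import csv
import io


def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def infer_attributes(header, rows):
    """Guess an ARFF type for each column by scanning every row.

    Returns (name, type) pairs in the same shape as content['attributes']
    above: 'NUMERIC', or a sorted list of the distinct nominal values.
    Assumes every row has as many fields as the header.
    """
    attributes = []
    for index, name in enumerate(header):
        # Treat empty strings and '?' as missing values and ignore them.
        values = [row[index] for row in rows if row[index] not in ("", "?")]
        if values and all(_is_number(v) for v in values):
            attributes.append((name, "NUMERIC"))
        else:
            # Nominal column (an all-missing column ends up with an empty list).
            attributes.append((name, sorted(set(values))))
    return attributes


# Small in-memory CSV standing in for the real file (made-up sample data).
sample = io.StringIO("Height,Position,Age\n1.80,GK,17\n1.75,DF,16\n1.82,GK,\n")
reader = csv.reader(sample)
header = next(reader)
rows = list(reader)

for name, kind in infer_attributes(header, rows):
    print(name, kind)
```

The returned list could then be assigned to content['attributes'] in place of the hard-coded "class" check.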
