Formatting string output with newlines and tabs in Python?

Posted on 2024-12-14 04:43:27

I'm trying to extract some data from a large batch of files and convert it to a specific (JSON) format for importing into a database using Django fixtures.

I've been able to get this far:

'{ {\n "pk":2,\n "model": trials.conditions,\n "fields": {\n "trial_id": NCT00109798,\n "keyword": Brain and Central Nervous System Tumors,\n }{\n "pk":3,\n "model": trials.conditions,\n "fields": {\n "trial_id": NCT00109798,\n "keyword": Lymphoma,\n }{\n "pk": 2,\n "model": trials.criteria,\n "fields": {\n "trial_id": NCT00109798,\n "gender": Both,\n "minimum_age": 18 Years,\n "maximum_age": N/A,\n "healthy_volunteers": No,\n "textblock": ,\n }\n\t\t"pk":2,\n\t\t"model": trials.keyword,\n\t\t"fields": {\n\t\t"trial_id": NCT00109798,\n\t\t"keyword": primary central nervous system non-Hodgkin lymphoma,\n\t\t}\n\t\t

...many lines later.....

After completion of study treatment, patients are followed every 3 months for 1 year, every\n 4 months for 1 year, and then every 6 months for 3 years.\n\n PROJECTED ACCRUAL: A total of 6-25 patients will be accrued for this study.\n ,\n "overall_status": Recruiting,\n "phase": Phase 2,\n "enrollment": 25,\n "study_type": Interventional,\n "condition": 2,3,\n "criteria": 1,\n "overall_contact": testdata,\n "location": 4,\n "lastchanged_date": March 31, 2010,\n "firstreceived_date": May 3, 2005,\n "keyword": 2,3,\n "condition_mesh": ,\n }\n \n {\n "pk": testdata,\n "model": trials.contact,\n "fields": {\n "trial_id": NCT00109798,\n "last_name": Pamela Z. New, MD,\n "phone": ,\n "email": ,\n }}'

The output actually needs to look like this:

{
    "pk": trial_id,
    "model": trials.trial,
    "fields": {
            "trial_id": trial_id,
            "brief_title": brief_title,
            "official_title": official_title,
            "brief_summary": brief_summary,
            "detailed_Description": detailed_description,
            "overall_status": overall_status,
            "phase": phase,
            "enrollment": enrollment,
            "study_type": study_type,
            "condition": _______________,
            "elligibility": elligibility,
            "criteria": ______________,
            "overall_contact": _______________,
            "location": ___________,
            "lastchanged_date": lastchanged_date,
            "firstreceived_date": firstreceived_date,
            "keyword": __________,
            "condition_mesh": condition_mesh,
    }

    "pk": null,
    "model": trials.locations,
    "fields": {
           "trials_id": trials_id,
           "facility": facility,
           "city": city,
           "state": state,
           "zip": zip,
           "country": country,
    }

Any advice would be much appreciated.

撩发小公举 2024-12-21 04:43:27

Alternative to json.dumps indent parameter:

Python has a pretty printer, pprint, documented at http://docs.python.org/library/pprint.html. It is very simple to use, but it only pretty-prints Python objects: you can't give it a JSON string and expect formatted output.

For example:

pydict = {"name":"Chateau des Tours Brouilly","code":"chateau-des-tours-brouilly-2009-1","region":"France > Burgundy > Beaujolais > Brouilly","winery":"Chateau Des Tours","winery_id":"chateau-des-tours","varietal":"Gamay","price":"14.98","vintage":"2009","type":"Red Wine","link":"http://www.snooth.com/wine/chateau-des-tours-brouilly-2009-1/","tags":"colorful, mauve, intense, purple, floral, violet, lively, rich, raspberry, berry","image":"http://ei.isnooth.com/wine/b/7/8/wine_6316762_search.jpeg","snoothrank":3,"available":1,"num_merchants":10,"num_reviews":1}
from pprint import pprint
pprint(pydict)

The output is

{'available': 1,
 'code': 'chateau-des-tours-brouilly-2009-1',
 'image': 'http://ei.isnooth.com/wine/b/7/8/wine_6316762_search.jpeg',
 'link': 'http://www.snooth.com/wine/chateau-des-tours-brouilly-2009-1/',
 'name': 'Chateau des Tours Brouilly',
 'num_merchants': 10,
 'num_reviews': 1,
 'price': '14.98',
 'region': 'France > Burgundy > Beaujolais > Brouilly',
 'snoothrank': 3,
 'tags': 'colorful, mauve, intense, purple, floral, violet, lively, rich, raspberry, berry',
 'type': 'Red Wine',
 'varietal': 'Gamay',
 'vintage': '2009',
 'winery': 'Chateau Des Tours',
 'winery_id': 'chateau-des-tours'}
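
To make the limitation on JSON strings concrete, here is a minimal sketch (an illustration only; the sample string is made up). It suggests that a JSON string has to be parsed with json.loads before pprint can lay it out, and that what pprint emits is Python literal syntax rather than JSON.

```python
import json
from pprint import pformat

# A made-up JSON string standing in for data read from a file.
json_text = '{"pk": 1, "fields": {"phase": "Phase 2", "enrollment": 25}}'

# Given the string itself, pprint only produces a quoted string
# (for a string this short, simply its repr), not an indented structure.
as_string = pformat(json_text)

# Parsing first hands pprint a dict, which it can break across lines.
parsed = json.loads(json_text)
as_object = pformat(parsed, width=40)

print(as_string)
print(as_object)
```

If the goal is valid JSON for a fixture file, json.dumps with indent is probably the better fit, since pprint output uses single quotes and Python literals such as None.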

回眸一笑 2024-12-21 04:43:27

There is a pretty printer in the json module: pass indent to json.dumps, for example print(json.dumps(s, indent=4)).

>>> import json
>>> s = {'pk': 5678, 'model': 'trial model', 'fields': {'brief_title': 'a short title', 'trial_id': 1234}}
>>> print(json.dumps(s, indent=4))
{
    "pk": 5678,
    "model": "trial model",
    "fields": {
        "brief_title": "a short title",
        "trial_id": 1234
    }
}
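
Applied to the fixture in the question, the same call can replace hand-built strings entirely. The following is a minimal sketch, assuming each record is first collected into a plain dict. The helper name make_record is hypothetical, and the sample values are copied from the question's output.

```python
import json


def make_record(pk, model, fields):
    """Build one fixture entry as a plain dict (hypothetical helper)."""
    return {"pk": pk, "model": model, "fields": fields}


# Sample values taken from the question; real code would fill these
# in from the parsed source files.
records = [
    make_record(2, "trials.conditions", {
        "trial_id": "NCT00109798",
        "keyword": "Brain and Central Nervous System Tumors",
    }),
    make_record(3, "trials.conditions", {
        "trial_id": "NCT00109798",
        "keyword": "Lymphoma",
    }),
]

# A Django JSON fixture is a list of such objects. json.dumps takes care
# of quoting, commas, and escaping newlines inside long text fields.
fixture_text = json.dumps(records, indent=4)
print(fixture_text)
```

Because the serializer adds the quotes and separators, string values such as the trial ID no longer need manual "\n" and "\t" formatting, and the result can be checked by loading it back with json.loads.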
