Python: Reading configuration files with multiple lines per key

Posted on 2024-08-21 08:12:01

I am writing a small DB test suite, which reads configuration files with queries and expected results, e.g.:

query         = "SELECT * from cities WHERE name='Unknown';"
count         = 0
level         = 1
name          = "Check for cities whose name should be null"
suggested_fix = "UPDATE cities SET name=NULL WHERE name='Unknown';"

This works well; I divide each line using Python's string.partition('=').
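As a quick illustration of why partition copes with the single-line format, here is a minimal sketch. The sample line is taken from the configuration above; the variable names are only illustrative. Because partition splits at the first = only, any later = inside the SQL stays in the value.

```python
# Sample line from the configuration file above.
line = "suggested_fix = \"UPDATE cities SET name=NULL WHERE name='Unknown';\""

# partition() splits at the FIRST '=' only, so later '=' signs stay in the value.
key, sep, value = line.partition("=")
key = key.strip()
value = value.strip()

print(key)    # suggested_fix
print(value)  # "UPDATE cities SET name=NULL WHERE name='Unknown';"
```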

My problem is very long SQL queries. Currently, I just paste these queries as a one-liner, which is ugly and unmaintainable.

I want to find an elegant, Pythonic way to read the right-hand side of an expression, even if it spans many lines.

Notes:

My SQL queries might contain the = character.
I don't fancy the idea of forcing double quotes around the right-hand side, because there are many existing files without them.

EDIT:

ConfigParser is great, but it forces me to add a space or tab at the beginning of every line in a multiline entry. This might be a great pain.

Thanks in advance,

Adam

厌倦 2024-08-28 08:12:01

The Python standard library module ConfigParser (renamed configparser in Python 3) supports this by default. The configuration file has to be in the standard INI-style format:

[Long Section]
short: this is a normal line
long: this value continues
    in the next line

The configuration file above could be read with the following code:

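import configparser  # named ConfigParser in Python 2

config = configparser.ConfigParser()
config.read('longsections.cfg')
# Indented continuation lines are joined into a single value.
long_value = config.get('Long Section', 'long')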
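To make the result concrete, here is a small self-contained sketch that parses the same section from a string instead of a file. It assumes Python 3, where the module is called configparser; read_string is used only so that the example needs no file on disk.

```python
import configparser

# The same section as above, kept in a string so no file is needed.
text = """\
[Long Section]
short: this is a normal line
long: this value continues
    in the next line
"""

config = configparser.ConfigParser()
config.read_string(text)

# The indented continuation line is joined to the value with a newline,
# and its leading whitespace is stripped.
value = config.get("Long Section", "long")
print(repr(value))  # 'this value continues\nin the next line'
```

Note that the continuation line has to be indented; this is the restriction mentioned in the question's edit.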

呆橘 2024-08-28 08:12:01

This is almost exactly the use case that made us switch to YAML (see its Wikipedia article, the Python implementation, and the documentation at http://pyyaml.org/wiki/PyYAMLDocumentation; you might also want to look at JSON as an alternative). YAML has some advantages over configparser or json:

human readability (better than JSON for larger files);
it can serialize arbitrary Python objects (which makes it as unsafe as pickle, but there is a safe_load function in the Python implementation to alleviate this issue). This is already useful for something as simple as a datetime object.

For completeness' sake, the main disadvantages (IMO):

the Python implementation is an order of magnitude slower than the JSON implementation;
it is less portable across platforms than JSON.

For example:

import yaml

sql = """
query         : "SELECT * from cities
WHERE name='Unknown';"
count         : 0
level         : 1
name          : "Check for cities whose name should be null"
suggested_fix : "UPDATE cities SET name=NULL WHERE name='Unknown';"
"""

sql_dict = yaml.safe_load(sql)

print(sql_dict['query'])

prints

SELECT * from cities WHERE name='Unknown';

新人笑 2024-08-28 08:12:01

I would suggest using a regular expression. The code might look like this to give you a start:

import re

test = """query = "select * from cities;"
count = 0
multine_query = "select *
from cities
     where name='unknown';"
"""

# A key, an equals sign, then either a double-quoted string (which may span
# several lines) or an integer; re.M lets ^ and $ match at every line boundary.
re_config = re.compile(r'^(\w+)\s*=\s*((?:".[^"]*")|(?:\d+))$', re.M)
for key, value in re_config.findall(test):
    if value.startswith('"'):
        value = value[1:-1]  # strip the surrounding quotes
    else:
        value = int(value)
    print(key, '=', repr(value))

The output of this example is:

~> python test.py
query = 'select * from cities;'
count = 0
multine_query = "select *\nfrom cities\n     where name='unknown';"

Hope that helps!

Regards,
Christoph

愿得七秒忆 2024-08-28 08:12:01

If you can use Python 3.11+, TOML is the new hotness, and a parser for it is included in the standard library. It's like a simpler YAML that is much more focused on being an editable config format (many TOML libraries, the standard-library one included, don't provide a way to write TOML).

# yourconfig.toml  (the filename assumed below, tho this is also a comment.)
query         = "SELECT * from cities WHERE name='Unknown';"
count         = 0
level         = 1   # 1 = foo, 2 = bar, 3 = baz  (other example comment)
name          = "Check for cities whose name should be null"
suggested_fix = """
UPDATE cities 
SET name=NULL 
WHERE name='Unknown';"""

The provided library is tomllib, and you can use it like so:

import tomllib

# tomllib.load() requires a file opened in binary mode.
with open("yourconfig.toml", "rb") as f:
    conf = tomllib.load(f)

import pprint
pprint.pprint(conf)
# {'count': 0,
#  'level': 1,
#  'name': 'Check for cities whose name should be null',
#  'query': "SELECT * from cities WHERE name='Unknown';",
#  'suggested_fix': "UPDATE cities \nSET name=NULL \nWHERE name='Unknown';"}

Note that the suggested_fix key includes the newlines (\n) implied by the multiline string. You can use a trailing backslash to remove them, though if the value is SQL, they don't matter, so I wouldn't clutter it up.

Other benefits are that the count and level values are integers. TOML's built-in types include strings, arrays/lists, maps/dicts, ints, floats, datetimes, and booleans.

TOML has rigid types that aren't "cleverly" inferred.

In YAML, on might be a boolean or a string, depending on whether you're on version 1.1 or 1.2. In TOML it's a syntax error unless you quote it. If you want a boolean, use exactly one of true or false.
In YAML, O.1 is a string (it starts with an uppercase O); in TOML it's a syntax error.
In YAML, you basically need https://yaml-multiline.info/ to figure out how to get your multiline string to be exactly what you expect, and even then it still takes me a while. In TOML it can kinda be the same as if you wrote it in Python (the escaping rules are a bit different, using single or double quotes affects them, and a trailing backslash trims whitespace).