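Adding thousands separators to a DataFrame in Julia
I have a DataFrame with two columns, a and b. At the moment both look like column a, but I want to add separators so that column b looks like the one below. I tried using the Format.jl package, but I haven't gotten the result yet. It may be worth mentioning that both columns are Int64 and that the column names a and b are of type Symbol.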
a | b
150000 | 1500,00
27 | 27,00
16614 | 166,14
Is there another way to solve this besides using Format.jl? Or is Format.jl the way to go?
Comments (3)
Assuming you want the commas in their typical positions rather than how you wrote them, this is one way:
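The snippet that originally accompanied this answer is not preserved here. A minimal sketch of the approach, assuming DataFrames.jl and Format.jl are installed, could look like the following. The sample values and the column name `a_formatted` are illustrative assumptions, and `f` is assumed to be a thin wrapper around Format.jl's `format`.

```julia
using DataFrames
using Format   # provides `format`

# Hypothetical sample data, not taken from the question
df = DataFrame(a = [1_000_000, 200_000, 30_000])

# Wrapper around Format.jl's `format` that inserts commas as thousands separators
f(x) = format(x, commas = true)

# Keep the numeric column and add a formatted String column next to it
# (`a_formatted` is an illustrative name)
df_formatted = transform(df, :a => ByRow(f) => :a_formatted)
```

The numeric column `a` stays untouched, so it can still be sorted or filtered.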
If you instead want the column replaced, use `transform(df, :a => ByRow(f), renamecols=false)`. If you just want the output vector rather than changing the DataFrame, you can use `format.(df.a, commas=true)`.

You could write your own function `f` to achieve the same behavior, but you might as well use the one someone already wrote inside the Format.jl package.

However, once you transform your data to `String`s as above, you won't be able to filter/sort/analyze the numerical data in the DataFrame. I would suggest that you apply the formatting in the printing step (rather than modifying the DataFrame itself to contain strings) by using the PrettyTables package. This can format the entire DataFrame at once.
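As a rough illustration of formatting at print time, the sketch below passes a formatter function to `pretty_table`. It assumes PrettyTables 2.x, where the `formatters` keyword accepts a function with the signature `(value, row, column)`; other releases may expect a different form. The name `comma_formatter` is made up for this example.

```julia
using DataFrames
using Format
using PrettyTables

# Values from the question; the column stays Int64 throughout
df = DataFrame(a = [150000, 27, 16614])

# Formatter with the (value, row, column) signature:
# integers are rendered with commas, everything else is passed through unchanged
comma_formatter(v, i, j) = v isa Integer ? format(v, commas = true) : v

# Only the printed output is formatted; `df` itself is not modified
pretty_table(df; formatters = comma_formatter)
```

Because the DataFrame still holds integers, operations such as `sort(df, :a)` keep working as before.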
(Edited to reflect the updated specs in the question)
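The code from this answer is not preserved here. One way to produce the `b` column shown in the question is sketched below. The function name `insert_decimal_comma` is hypothetical, and the rule it implements is inferred only from the three example rows: values below 100 get `,00` appended, larger values get a comma before their last two digits. Negative inputs are assumed not to occur.

```julia
using DataFrames

# Hypothetical helper reproducing the question's examples:
#   150000 -> "1500,00", 27 -> "27,00", 16614 -> "166,14"
function insert_decimal_comma(n::Integer)
    n >= 0 || throw(ArgumentError("negative values are not handled in this sketch"))
    s = string(n)
    # One- and two-digit values are kept whole and padded with ",00"
    n < 100 && return s * ",00"
    # Otherwise place the comma before the last two digits
    return s[1:end-2] * "," * s[end-1:end]
end

df = DataFrame(a = [150000, 27, 16614])
df.b = insert_decimal_comma.(df.a)   # broadcast over the column
```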
Note that the `b` column will necessarily be a `String` column after this change, as integer types cannot store formatting information in them.

If you have a lot of data and find that you need better performance, you may also want to use the `InlineStrings` package. This stores the `b` column's data as fixed-size strings (`String7` type here), which are generally treated like normal `String`s, but can be significantly better for performance.
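A small sketch of the InlineStrings suggestion, assuming the package is installed: the `inlinestrings` function converts a collection of strings to the smallest inline string type that fits every element. The input vector here simply repeats the formatted values from the question.

```julia
using InlineStrings

# Formatted values as they would appear in the `b` column
b = ["1500,00", "27,00", "166,14"]

# Convert to fixed-size inline strings; the longest value is 7 bytes,
# so the element type is expected to be String7
b_inline = inlinestrings(b)
```

The result can be assigned back to a DataFrame column in the usual way, for example `df.b = b_inline`.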
Maybe someone will come here looking for a space as the thousands separator. I cooked up the following regex-based transformation:
You only need to handle numbers < 100.00 yourself, as the regex doesn't account for them (it's simplistic, but it does the job). I use this function as a formatter when pretty-printing a DataFrame.
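The regex from this answer is not preserved here. The following is a separate, Base-only sketch for integers rather than a reconstruction of it; the function name `group_digits` and its `sep` keyword are hypothetical. It uses lookarounds to insert the separator at every position that is followed by a multiple of three digits, so small numbers need no special case.

```julia
# Hypothetical helper: group the digits of an integer in threes,
# separated by `sep` (a space by default)
function group_digits(n::Integer; sep::AbstractString = " ")
    digits_only = string(abs(n))
    # Zero-width match: preceded by a digit and followed by 3, 6, 9, ... digits up to the end
    grouped = replace(digits_only, r"(?<=\d)(?=(\d{3})+$)" => sep)
    return n < 0 ? "-" * grouped : grouped
end

group_digits(1234567)           # "1 234 567"
group_digits(16614; sep = ",")  # "16,614"
```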