Extract matrices from a DataFrame by the values in a column
I am trying something that could be a little hard to understand, but I will try to be very specific.
I have a pandas DataFrame like this:
Locality | Count | Lat. | Long. |
---|---|---|---|
Krasnodar | Russia | 44 | 39 |
Tirana | Albania | 41.33 | 19.83 |
Areni | Armenia | 39.73 | 45.2 |
Kars | Armenia | 40.604517 | 43.100758 |
Brunn Wolfholz | Austria | 48.120396 | 16.291722 |
Kleinhadersdorf Flur Marchleiten | Austria | 48.663197 | 16.589687 |
Jalilabad district | Azerbaijan | 39.3607139 | 48.4613556 |
Zeyem Chaj | Azerbaijan | 40.9418889 | 45.8327778 |
Jalilabad district | Azerbaijan | 39.5186111 | 48.65 |
And a file cities.txt with the names of some countries:
Albania
Armenia
Austria
Azerbaijan
And so on.
Next, I convert the Lat. and Long. values to radians and then, using the values from the list, do something like:

    with open('cities.txt') as file:
        lines = file.readlines()
    x = np.where(df['Count'].eq(lines), pd.DataFrame(
        dist.pairwise(df[['Lat.', 'Long.']].to_numpy()) * 6373,
        columns=df.Locality.unique(), index=df.Locality.unique()))

where pd.DataFrame(dist.pairwise(df[['Lat.','Long.']].to_numpy())*6373, columns=df.Locality.unique(), index=df.Locality.unique()) converts the Lat. and Long. values in radians into distances in km and creates a DataFrame as a matrix for each line (country).
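For reference, the arithmetic behind that step can be sketched with the standard library alone. This assumes `dist` is a haversine metric (its definition is not shown here), and the helper name `haversine_km` is hypothetical:

```python
import math

EARTH_RADIUS_KM = 6373  # the factor used in the expression above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    # Convert degrees to radians first; the haversine formula works on radians.
    phi1, lam1, phi2, lam2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin((lam2 - lam1) / 2) ** 2)
    # Central angle on the unit sphere, scaled by the Earth radius.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Areni and Kars (both in Armenia) from the table above
d = haversine_km(39.73, 45.2, 40.604517, 43.100758)
print(round(d, 1))
```

A pairwise matrix is this same computation repeated for every pair of rows, which is why its diagonal is always 0.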
In the end I will have many 2D matrices (in theory), grouped by country, and I want to apply this:

    >>> Russia.min()
    0
    >>> Russia.max()
    5

to get the .min() and .max() values of each matrix and save the results in cities.txt as
Country | Max. Dist. | Min. Dist. |
---|---|---|
Albania | 5 | 1 |
Armenia | 10 | 9 |
Austria | 5 | 3 |
Azerbaijan | 0 | 0 |
Unfortunately, 1) I'm stuck on the first part, where I get the error ValueError: Lengths must be equal; 2) is it possible to have these matrices grouped by country; and 3) how can I save my .min() and .max() values?
Comments (1)
I am not sure exactly what you want as the minimum. In this solution, the minimum is 0 if a country has only one city; otherwise it is the shortest distance between two cities within the country. Also, the file cities.txt seems to be just a filter. I didn't implement that, but it seems straightforward.

Here is just some sample data:
Create and apply a custom aggregate for groupby():
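A minimal sketch of what such an aggregate could look like (not the original code). It assumes a haversine great-circle distance, the column names from the question, and the rows of the question's table as sample data. The helper names `haversine_matrix` and `max_min_distance` are hypothetical:

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6373  # same radius the question multiplies by

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "Locality": ["Krasnodar", "Tirana", "Areni", "Kars", "Brunn Wolfholz",
                 "Kleinhadersdorf Flur Marchleiten", "Jalilabad district",
                 "Zeyem Chaj", "Jalilabad district"],
    "Count": ["Russia", "Albania", "Armenia", "Armenia", "Austria", "Austria",
              "Azerbaijan", "Azerbaijan", "Azerbaijan"],
    "Lat.": [44, 41.33, 39.73, 40.604517, 48.120396, 48.663197,
             39.3607139, 40.9418889, 39.5186111],
    "Long.": [39, 19.83, 45.2, 43.100758, 16.291722, 16.589687,
              48.4613556, 45.8327778, 48.65],
})


def haversine_matrix(lat_deg, lon_deg):
    """Pairwise great-circle distances in km between all given points."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def max_min_distance(group):
    """Custom aggregate: largest and smallest distance inside one country."""
    d = haversine_matrix(group["Lat."], group["Long."])
    pairs = d[np.triu_indices(len(group), k=1)]  # each pair once, no diagonal
    if pairs.size == 0:  # a single city has no pairs, so report 0
        return pd.Series({"Max. Dist.": 0.0, "Min. Dist.": 0.0})
    return pd.Series({"Max. Dist.": pairs.max(), "Min. Dist.": pairs.min()})


# One row per country, with the two aggregated columns.
result = df.groupby("Count")[["Lat.", "Long."]].apply(max_min_distance)
print(result)
```

Restricting the result to the countries named in cities.txt would then be a matter of filtering before grouping, for example with `df[df['Count'].isin(names)]`, where `names` holds the lines of that file with the trailing newlines stripped. The result can be written out with `result.to_csv(...)`.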
In my case this prints something like