How do you copy data from one dataframe to another?
I am having a difficult time getting the correct data from a reference CSV file into the one I am working on.
I have a CSV file that has over 6 million rows and 19 columns. It looks something like this: [image]
For each row there is a brand and a model of a car amongst other information.
I want to add to this file the fuel consumption per 100km traveled and the type of fuel that is used.
I have another CSV file that has the fuel consumption of every car model. It looks something like this: [image]
What I ultimately want to do is add the matching values of columns G, H, I and J from the second file to the first one.
Because of the size of the file, I was wondering whether there is a way to do this other than with a "for" or a "while" loop?
EDIT:
For example...
The first df would look something like this
ID | Brand | Model | Other_columns | Fuel_consu_1 | Fuel_consu_2 |
---|---|---|---|---|---|
1 | Toyota | Rav4 | a | NaN | NaN |
2 | Honda | Civic | b | NaN | NaN |
3 | GMC | Sierra | c | NaN | NaN |
4 | Toyota | Rav4 | d | NaN | NaN |
The second df would be something like this
ID | Brand | Model | Fuel_consu_1 | Fuel_consu_2 |
---|---|---|---|---|
1 | Toyota | Corrola | 100 | 120 |
2 | Toyota | Rav4 | 80 | 84 |
3 | GMC | Sierra | 91 | 105 |
4 | Honda | Civic | 112 | 125 |
The output should be:
ID | Brand | Model | Other_columns | Fuel_consu_1 | Fuel_consu_2 |
---|---|---|---|---|---|
1 | Toyota | Rav4 | a | 80 | 84 |
2 | Honda | Civic | b | 112 | 125 |
3 | GMC | Sierra | c | 91 | 105 |
4 | Toyota | Rav4 | d | 80 | 84 |
The first df may contain the same brand and model many times, under different IDs. The row order is completely random.
Comments (1)
Thank you for providing the updates. I was able to put something together that should help you.
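As a minimal sketch (not necessarily the approach this answer had in mind), a left merge in pandas on `Brand` and `Model` fills in the fuel columns without any explicit loop. The two small frames below are stand-ins built from the example tables in the question; with the real files they would presumably come from `pd.read_csv`. The sketch assumes the brand and model strings are spelled identically in both files.

```python
import pandas as pd

# Stand-in for the large working file (the real one would be read with pd.read_csv).
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Brand": ["Toyota", "Honda", "GMC", "Toyota"],
    "Model": ["Rav4", "Civic", "Sierra", "Rav4"],
    "Other_columns": ["a", "b", "c", "d"],
    "Fuel_consu_1": [float("nan")] * 4,
    "Fuel_consu_2": [float("nan")] * 4,
})

# Stand-in for the reference file with one row per car model.
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Brand": ["Toyota", "Toyota", "GMC", "Honda"],
    "Model": ["Corrola", "Rav4", "Sierra", "Civic"],
    "Fuel_consu_1": [100, 80, 91, 112],
    "Fuel_consu_2": [120, 84, 105, 125],
})

keys = ["Brand", "Model"]
fuel_cols = ["Fuel_consu_1", "Fuel_consu_2"]

# Keep only the join keys and the values to copy, one row per (Brand, Model).
lookup = df2[keys + fuel_cols].drop_duplicates(subset=keys)

# Drop the empty placeholder columns so the merge does not create _x/_y suffixes,
# then left-join: every row of df1 is kept, in its original order.
result = (
    df1.drop(columns=fuel_cols, errors="ignore")
       .merge(lookup, on=keys, how="left", validate="many_to_one")
)

print(result)
```

Rows of the first frame whose brand and model do not appear in the reference file keep `NaN` in the fuel columns, so they are easy to find afterwards with `result[result["Fuel_consu_1"].isna()]`.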