Using a for loop with an if statement to check whether a DataFrame column value starts with a specific value

Posted 2025-02-04 08:05:00 · 974 characters · 3 views · 0 comments

I have a shapefile that I am bringing in with geopandas. I do not need all the rows, just certain rows that start with 01, 02, and 03 for the route_id. I use a for loop and an if statement to try to append the data I want to an empty dataframe. Below is the sample Data:

ROUTE_ID            FROM_MEASU  TO_MEASURE  STREET_PRE  BASE_NAME
0100006595050034-D  5.799725    9.678965                215th
0200006595050034-D  0           9.678965    ST          220th
0300006595050034-D  5.799725    9.678965                215th
0400006595050034-D  0           9.678965    ST          220th

my code is as follows:

mnlrshwy = pd.DataFrame(columns=['ROUTE_ID','FROM_MEASU','TO_MEASURE','STREET_PRE',
                                 'BASE_NAME'])
for x in mnlrs['ROUTE_ID']:
    if x.startswith(('01','02','03')) is True:
        mnlrshwy = x.append(mnlrs,ignore_index = True)

I get a concatenation error, and I don't understand why.
Any suggestions would be helpful.

Comments (3)

叹倦 2025-02-11 08:05:00

If you use pandas, you won't need a for loop.

df[df['ROUTE_ID'].str.contains("^0(1|2|3)", regex=True, na=False)]
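
A variant of the same idea, as a minimal sketch: compare the first two characters of each ROUTE_ID against the wanted prefixes, which avoids a regular expression altogether. The sample frame is rebuilt from the data in the question, and the name `prefixes` is an assumption made for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative values only).
df = pd.DataFrame({
    "ROUTE_ID": ["0100006595050034-D", "0200006595050034-D",
                 "0300006595050034-D", "0400006595050034-D"],
    "FROM_MEASU": [5.799725, 0.0, 5.799725, 0.0],
    "TO_MEASURE": [9.678965, 9.678965, 9.678965, 9.678965],
    "STREET_PRE": [None, "ST", None, "ST"],
    "BASE_NAME": ["215th", "220th", "215th", "220th"],
})

prefixes = ["01", "02", "03"]  # hypothetical name for the wanted prefixes

# Boolean mask: True where the first two characters are a wanted prefix.
# A missing ROUTE_ID stays missing after .str[:2], and isin() reports False for it.
mask = df["ROUTE_ID"].str[:2].isin(prefixes)

# Keep whole rows with all their columns; copy() detaches the result from df.
mnlrshwy = df[mask].copy()
print(mnlrshwy["ROUTE_ID"].tolist())
```

Because the mask selects entire rows, there is no need to create an empty dataframe first or to append to it.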

过气美图社 2025-02-11 08:05:00

Perform a type check at each stage, or convert the value to a string before appending it to your data frame.
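
One way to read this advice, sketched under the assumption that the column may hold non-string values such as a missing entry or an integer. The mixed-type `route_ids` series below is hypothetical and not taken from the question's shapefile.

```python
import pandas as pd

# Hypothetical column with mixed types: strings, a missing value, an integer.
route_ids = pd.Series(
    ["0100006595050034-D", None, 300006595050034, "0400006595050034-D"]
)

wanted = ("01", "02", "03")

# Option 1: type check each value, so non-strings are skipped instead of
# raising AttributeError when .startswith is called on them.
checked = [isinstance(x, str) and x.startswith(wanted) for x in route_ids]

# Option 2: convert each value with str() first. A missing value turns into
# ordinary text and an integer has no leading zero, so neither one matches.
converted = [str(x).startswith(wanted) for x in route_ids]

print(checked)
print(converted)
```

Converting an integer ID to a string cannot restore a leading zero that was lost when the data was read, so the type check is the safer of the two options when the prefixes begin with 0.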

忆梦 2025-02-11 08:05:00

df Output

             ROUTE_ID  FROM_MEASU  TO_MEASURE STREET_PRE BASE_NAME
0  0100006595050034-D    5.799725    9.678965        NaN     215th
1  0200006595050034-D    0.000000    9.678965         ST     220th
2  0300006595050034-D    5.799725    9.678965        NaN     215th
3  0400006595050034-D    0.000000    9.678965         ST     220th

Above is the data of your dataframe.

mnlrshwy = pd.DataFrame(columns=['ROUTE_ID', 'FROM_MEASU', 'TO_MEASURE', 'STREET_PRE',
                                 'BASE_NAME'], index=[0, 1, 2, 3])
for x in range(0, len(df['ROUTE_ID'])):
    if df.loc[x, 'ROUTE_ID'].startswith(('01', '02', '03')) is True:
        mnlrshwy.loc[x, :] = df.loc[x, 'ROUTE_ID']

print(mnlrshwy)

Output

             ROUTE_ID          FROM_MEASU          TO_MEASURE  \
0  0100006595050034-D  0100006595050034-D  0100006595050034-D   
1  0200006595050034-D  0200006595050034-D  0200006595050034-D   
2  0300006595050034-D  0300006595050034-D  0300006595050034-D   
3                 NaN                 NaN                 NaN   

           STREET_PRE           BASE_NAME  
0  0100006595050034-D  0100006595050034-D  
1  0200006595050034-D  0200006595050034-D  
2  0300006595050034-D  0300006595050034-D  
3                 NaN                 NaN  

Each iteration gives you a single value. If the empty dataframe has no index, you cannot assign a value to it unless you wrap the value in square brackets. Here you can read more about empty dataframes.

On each iteration I filled in every column of the row. With loc, the first argument is the index label and the second is the column name.
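
Because the loop above assigns `df.loc[x, 'ROUTE_ID']` to every column, each kept row holds only the route ID, as the output shows. If the goal is to keep the original values, one possible sketch is to collect the matching index labels inside the loop and select those rows in a single step afterwards. This also sidesteps `DataFrame.append`, which was removed in pandas 2.0. The sample data is rebuilt from the frame printed above, and `keep` is a hypothetical name.

```python
import pandas as pd

# Sample data rebuilt from the frame printed in this answer.
df = pd.DataFrame({
    "ROUTE_ID": ["0100006595050034-D", "0200006595050034-D",
                 "0300006595050034-D", "0400006595050034-D"],
    "FROM_MEASU": [5.799725, 0.0, 5.799725, 0.0],
    "TO_MEASURE": [9.678965, 9.678965, 9.678965, 9.678965],
    "STREET_PRE": [None, "ST", None, "ST"],
    "BASE_NAME": ["215th", "220th", "215th", "220th"],
})

keep = []  # hypothetical name: index labels of the rows to keep
for idx, route_id in df["ROUTE_ID"].items():
    # route_id is a plain str here, so startswith accepts a tuple of prefixes.
    if route_id.startswith(("01", "02", "03")):
        keep.append(idx)

# Select the matching rows once, after the loop; every column keeps its own value.
mnlrshwy = df.loc[keep]
print(mnlrshwy)
```

In the question's code, `x` is a single ROUTE_ID string rather than a dataframe, which is why calling `x.append(...)` on it cannot work; collecting labels keeps the loop while leaving row selection to pandas.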
