For loop with an if statement to check whether a dataframe column value startswith a specific value

Posted 2025-02-04 08:05:00

I have a shapefile that I am bringing in with geopandas. I do not need all the rows, just the rows whose ROUTE_ID starts with 01, 02, or 03. I use a for loop and an if statement to try to append the data I want to an empty dataframe. Below is the sample data:

ROUTE_ID	FROM_MEASU	TO_MEASURE	STREET_PRE	BASE_NAME
0100006595050034-D	5.799725	9.678965		215th
0200006595050034-D	0	9.678965	ST	220th
0300006595050034-D	5.799725	9.678965		215th
0400006595050034-D	0	9.678965	ST	220th

My code is as follows:

mnlrshwy = pd.DataFrame(columns=['ROUTE_ID','FROM_MEASU','TO_MEASURE','STREET_PRE',
                                 'BASE_NAME'])
for x in mnlrs['ROUTE_ID']:
    if x.startswith(('01','02','03')) is True:
        mnlrshwy = x.append(mnlrs,ignore_index = True)

I get a concatenation error, and I don't understand why.
Any suggestions would be helpful.

叹倦 2025-02-11 08:05:00

If you use pandas, you won't need a for loop.

df[df['ROUTE_ID'].str.contains("^0(1|2|3)", regex=True, na=False)]
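As a self-contained sketch of this approach, the snippet below rebuilds the sample rows from the question (the frame name `df` and the hand-typed values are assumptions for illustration) and filters them with a boolean mask. It uses the character class `^0[123]` rather than a capture group, since pandas warns when a `str.contains` pattern includes match groups. Comparing the first two characters with `isin` is an equivalent alternative that needs no regex.

```python
import pandas as pd

# Sample rows rebuilt from the question; in practice df would come from geopandas.
df = pd.DataFrame({
    "ROUTE_ID": ["0100006595050034-D", "0200006595050034-D",
                 "0300006595050034-D", "0400006595050034-D"],
    "FROM_MEASU": [5.799725, 0.0, 5.799725, 0.0],
    "TO_MEASURE": [9.678965, 9.678965, 9.678965, 9.678965],
    "STREET_PRE": [None, "ST", None, "ST"],
    "BASE_NAME": ["215th", "220th", "215th", "220th"],
})

# Boolean mask: True where ROUTE_ID begins with 01, 02, or 03.
mask = df["ROUTE_ID"].str.contains(r"^0[123]", regex=True, na=False)
filtered = df[mask]

# Equivalent mask without a regex: compare the first two characters.
prefix_mask = df["ROUTE_ID"].str[:2].isin(["01", "02", "03"])
filtered_by_prefix = df[prefix_mask]

print(filtered)
```

Both masks keep the first three sample rows and drop the one whose ROUTE_ID starts with 04.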

过气美图社 2025-02-11 08:05:00

Cast the value at each step, or convert it to a string, and then append it to your dataframe.

忆梦 2025-02-11 08:05:00

df Output

             ROUTE_ID  FROM_MEASU  TO_MEASURE STREET_PRE BASE_NAME
0  0100006595050034-D    5.799725    9.678965        NaN     215th
1  0200006595050034-D    0.000000    9.678965         ST     220th
2  0300006595050034-D    5.799725    9.678965        NaN     215th
3  0400006595050034-D    0.000000    9.678965         ST     220th

Above is the data of your dataframe.

mnlrshwy = pd.DataFrame(columns=['ROUTE_ID', 'FROM_MEASU', 'TO_MEASURE', 'STREET_PRE',
                                 'BASE_NAME'], index=[0, 1, 2, 3])
for x in range(0, len(df['ROUTE_ID'])):
    if df.loc[x, 'ROUTE_ID'].startswith(('01', '02', '03')) is True:
        mnlrshwy.loc[x, :] = df.loc[x, 'ROUTE_ID']

print(mnlrshwy)

Output

             ROUTE_ID          FROM_MEASU          TO_MEASURE  \
0  0100006595050034-D  0100006595050034-D  0100006595050034-D   
1  0200006595050034-D  0200006595050034-D  0200006595050034-D   
2  0300006595050034-D  0300006595050034-D  0300006595050034-D   
3                 NaN                 NaN                 NaN   

           STREET_PRE           BASE_NAME  
0  0100006595050034-D  0100006595050034-D  
1  0200006595050034-D  0200006595050034-D  
2  0300006595050034-D  0300006595050034-D  
3                 NaN                 NaN

Iterating over a column gives you one value at each iteration, not a row. And if an empty dataframe has no index, it will not be possible to assign a value into it, unless you wrap the value in square brackets. See here for more on empty dataframes.
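To illustrate the first point, here is a small sketch (the two-row frame is an assumption that stands in for the question's `mnlrs`). Iterating over a column yields plain Python strings, one per row, so there is no dataframe row to append from inside the loop.

```python
import pandas as pd

# Hypothetical stand-in for the question's mnlrs frame.
mnlrs = pd.DataFrame({
    "ROUTE_ID": ["0100006595050034-D", "0400006595050034-D"],
    "BASE_NAME": ["215th", "220th"],
})

# Iterating over a column yields the cell values, not rows.
value_types = [type(x).__name__ for x in mnlrs["ROUTE_ID"]]
print(value_types)

# A str has no append method, so x.append(...) cannot work on these values.
has_append = [hasattr(x, "append") for x in mnlrs["ROUTE_ID"]]
print(has_append)
```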

On each iteration I filled in all columns of the row. With loc, the first argument is the index label and the second is the column name.
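If the goal is to keep the original values of every column rather than repeating the ROUTE_ID across the row, one possible variant (a sketch, not the answer's own code) is to collect the index labels of matching rows in the loop and then select them in one step with loc. The frame `df` is rebuilt here from the sample data as an assumption.

```python
import pandas as pd

# Sample rows rebuilt from the question for illustration.
df = pd.DataFrame({
    "ROUTE_ID": ["0100006595050034-D", "0200006595050034-D",
                 "0300006595050034-D", "0400006595050034-D"],
    "FROM_MEASU": [5.799725, 0.0, 5.799725, 0.0],
    "TO_MEASURE": [9.678965, 9.678965, 9.678965, 9.678965],
    "STREET_PRE": [None, "ST", None, "ST"],
    "BASE_NAME": ["215th", "220th", "215th", "220th"],
})

prefixes = ("01", "02", "03")

# Collect the index labels of rows whose ROUTE_ID starts with a wanted prefix.
keep = []
for idx in df.index:
    if df.loc[idx, "ROUTE_ID"].startswith(prefixes):
        keep.append(idx)

# Select all columns of the matching rows at once; copy() detaches the result.
mnlrshwy = df.loc[keep, :].copy()
print(mnlrshwy)
```

Note that `DataFrame.append` was removed in pandas 2.0, so a single selection like this (or `pd.concat`) is the way to build the result on current versions.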
