How to read a SQL Server query result in as a Dask DataFrame (AttributeError: module 'dask.dataframe' has no attribute 'read_sql_query')
I currently use pyodbc to read the query result into a pandas DataFrame, then convert it to a Dask DataFrame. Is there any way to read it in as a Dask DataFrame directly?
Here's the code I'm currently using:
import pandas as pd
import pyodbc
from dask.dataframe import from_pandas

def conn_sql_server(file_path):
    # Connect to SQL Server
    conn = pyodbc.connect('Driver={SQL Server Native Client 11.0};'
                          'Server=Server1;'
                          'Database=Database1;'
                          'Trusted_Connection=yes;')
    # Read the query text from the .sql file
    with open(file_path, 'r') as query_file:
        query = query_file.read()
    # Run the query in chunks of 10,000 rows and combine them into one DataFrame
    chunks = pd.read_sql_query(query, conn, chunksize=10**4)
    df_comb = pd.concat(chunks)
    conn.close()
    return df_comb

# Load in as a pandas DataFrame (raw string so the backslashes are kept literally)
data = conn_sql_server(r'.\input\data pull.sql')
# Convert to a Dask DataFrame
ddf = from_pandas(data, npartitions=3)
I tried to use dd.read_sql_query with both the pyodbc package and the sqlalchemy package. Both raised AttributeError: module 'dask.dataframe' has no attribute 'read_sql_query'
(1) pyodbc:
import pyodbc
import dask.dataframe as dd

def conn_sql_server(file_path):
    # Connect to SQL Server
    conn = pyodbc.connect('Driver={SQL Server Native Client 11.0};'
                          'Server=Server1;'
                          'Database=Database1;'
                          'Trusted_Connection=yes;')
    # Run the query and output the result to df
    with open(file_path, 'r') as query:
        df = dd.read_sql_query(query.read(), conn)  # raises the AttributeError below
    return df

data = conn_sql_server(r'.\input\data pull.sql')
AttributeError: module 'dask.dataframe' has no attribute 'read_sql_query'
(2) sqlalchemy:
from sqlalchemy import create_engine
import dask.dataframe as dd

Server = 'Server1'
Database = 'Database1'
Driver = 'SQL Server Native Client 11.0'
uri = f'mssql://{Server}/{Database}?driver={Driver}'
with open(r'.\input\data pull.sql', 'r') as query:
    dd.read_sql_query(query.read(), uri)  # raises the same AttributeError
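As a side note on the URI itself: SQLAlchemy connection strings for SQL Server over pyodbc are usually written with the mssql+pyodbc:// scheme, and a driver name that contains spaces needs to be URL-encoded. A small sketch, reusing the placeholder server, database, and driver names from above and assuming Windows authentication is passed as a trusted_connection query parameter:

```python
from urllib.parse import quote_plus

# Placeholder names taken from the question, not real hosts.
server = "Server1"
database = "Database1"
driver = "SQL Server Native Client 11.0"

# quote_plus turns the spaces in the driver name into '+' so the URI stays valid.
uri = (
    f"mssql+pyodbc://{server}/{database}"
    f"?driver={quote_plus(driver)}&trusted_connection=yes"
)
print(uri)
```

This only builds the string; whether it connects depends on the ODBC driver actually installed on the machine.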
As Nick suggested, I upgraded dask to the latest version using python -m pip install dask distributed --upgrade. I also checked all of the functions listed in the dask.dataframe module using the following script, and found that there is only read_sql_table, no read_sql_query:
from inspect import getmembers, isfunction
import dask.dataframe as dd
getmembers(dd, isfunction)