The project that I am working on is a bit confidential, but I will try to explain my issues and be as clear as possible because I need your opinion.
Project:
They asked me to set up a local ELK environment and to use Python scripts to communicate with the stack: store data, retrieve it, analyse it, and visualise it with Kibana. Finally, decisions are made based on that data (AI). So, as you can see, it is a data engineering project with some AI in the decision-making process. The issues I am facing are:
- I don't know how to use Python to communicate with the stack, and I couldn't find resources about it
- Since the data is confidential, how can I ensure strong security?
- How many instances to use?
- I am lost because I am new to ELK and my team is not Dev oriented
I am new to ELK, so any advice would be really helpful!
Comments (2)
To learn how to interact with your stack from Python, use the official Python client library, elasticsearch. You can install it with:
pip3 install elasticsearch
The following link collects a wealth of tutorials on almost anything you would need to do:
https://kb.objectrocket.com/category/elasticsearch?filter=python
I suggest you start with these two:
https://kb.objectrocket.com/elasticsearch/how-to-parse-lines-in-a-text-file-and-index-as-elasticsearch-documents-using-python-641
https://kb.objectrocket.com/elasticsearch/how-to-query-elasticsearch-documents-in-python-268
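To give a feel for what those two tutorials cover, here is a minimal sketch of indexing text lines and querying them back. It assumes the 8.x Python client and a local secured node; the URL, the credentials, the ca.crt path, the index name demo-lines and the helper names are all placeholders I made up, not anything from your setup.

```python
"""Index lines of text into Elasticsearch and query them back (sketch)."""

INDEX_NAME = "demo-lines"  # hypothetical index name


def lines_to_actions(lines, index_name):
    """Yield one bulk-index action per non-empty line."""
    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines
        yield {
            "_index": index_name,
            "_id": str(line_no),
            "_source": {"line_no": line_no, "message": text},
        }


def match_query(field, text):
    """Build a full-text match query clause for one field."""
    return {"match": {field: text}}


def main():
    # Imported here so the helpers above work without the client installed.
    from elasticsearch import Elasticsearch, helpers

    # Assumed connection details; replace them with your own.
    es = Elasticsearch(
        "https://localhost:9200",
        basic_auth=("elastic", "changeme"),
        ca_certs="ca.crt",
    )

    sample = ["first event", "", "second event"]
    indexed, _errors = helpers.bulk(es, lines_to_actions(sample, INDEX_NAME))
    print(f"indexed {indexed} documents")

    es.indices.refresh(index=INDEX_NAME)  # make the new documents searchable
    result = es.search(index=INDEX_NAME, query=match_query("message", "event"))
    for hit in result["hits"]["hits"]:
        print(hit["_id"], hit["_source"]["message"])


if __name__ == "__main__":
    main()
```

Keeping the document-building and query-building parts as plain functions means you can check them without a running cluster; only main() needs the stack to be up.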
For security, see:
https://www.elastic.co/guide/en/elasticsearch/reference/current/authorization.html
https://nl.devoteam.com/expert-view/field-level-security-and-data-masking-in-elasticsearch/
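As a rough illustration of what the authorization and field-level security pages describe, the sketch below defines a read-only role that only exposes selected fields. It assumes the 8.x Python client; the role name, index pattern, field names and connection details are hypothetical, and you should check that your license includes field-level security before relying on it.

```python
"""Create a read-only role restricted to a few fields (sketch)."""


def read_only_role_indices(index_pattern, visible_fields):
    """Build the index-privileges part of a role definition."""
    return [
        {
            "names": [index_pattern],
            "privileges": ["read"],
            # Only the granted fields are visible to users holding this role.
            "field_security": {"grant": list(visible_fields)},
        }
    ]


def main():
    # Imported here so the helper above works without the client installed.
    from elasticsearch import Elasticsearch

    # Assumed connection details; replace them with your own.
    es = Elasticsearch(
        "https://localhost:9200",
        basic_auth=("elastic", "changeme"),
        ca_certs="ca.crt",
    )

    es.security.put_role(
        name="analyst_readonly",  # hypothetical role name
        indices=read_only_role_indices("demo-*", ["line_no", "message"]),
    )


if __name__ == "__main__":
    main()
```

The idea is that your Python scripts and Kibana users log in with accounts that hold only narrow roles like this one, rather than with the elastic superuser.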
I suggest you start with 1 Elasticsearch node. If you're on AWS, use a t3a.large or equivalent, and run Elasticsearch, Kibana and Logstash all on the same machine.
For setting it up: https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-stack-docker.html#run-docker-secure
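One thing to be aware of with a single node: a replica shard cannot be placed on the same node as its primary, so an index created with replicas leaves the cluster status at yellow. A minimal sketch of creating an index without replicas and checking health, again assuming the 8.x Python client and made-up connection details and index name:

```python
"""Create an index suited to a one-node cluster and check health (sketch)."""


def single_node_index_settings():
    """Index settings for a cluster that has only one node."""
    return {
        "number_of_shards": 1,
        # Replicas cannot be allocated on a single node, so ask for none.
        "number_of_replicas": 0,
    }


def main():
    # Imported here so the helper above works without the client installed.
    from elasticsearch import Elasticsearch

    # Assumed connection details; replace them with your own.
    es = Elasticsearch(
        "https://localhost:9200",
        basic_auth=("elastic", "changeme"),
        ca_certs="ca.crt",
    )

    es.indices.create(index="demo-lines", settings=single_node_index_settings())
    health = es.cluster.health()
    print(health["status"], health["number_of_nodes"])


if __name__ == "__main__":
    main()
```

If you later add nodes, you can raise number_of_replicas on the existing index; it is a dynamic setting.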
If you want to use Python as your integration tool for Elasticsearch, you can use the Elasticsearch Python client.
Another option is to use Python to produce the results and either save them to a log file or insert them into a database; Logstash will then pick up your data from there.
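For the log-file route, something like the sketch below could work: Python appends one JSON object per line, and Logstash reads the file (for example with a file input and a JSON codec, which is my assumption about how you would configure it). The file name, the field names and the append_event helper are invented for illustration.

```python
"""Append results as JSON lines for Logstash to pick up (sketch)."""
import json
from datetime import datetime, timezone


def append_event(path, event):
    """Append one event to the file as a single JSON line and return it."""
    record = {"@timestamp": datetime.now(timezone.utc).isoformat(), **event}
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    # Hypothetical result produced by your analysis script.
    append_event("results.log", {"sensor": "s1", "score": 0.93})
```

One JSON object per line keeps the file easy to parse on the Logstash side and easy to inspect by hand.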
As for security, the Elastic Stack has good security features, from API authorization and user authentication to cluster security. See "Secure the Elastic Stack" in the Elastic documentation.
I use just 1 instance, but feel free to run Kibana separately from Elasticsearch and Logstash (if you use it) if you think you need to, or use Docker to separate them.
In my experience, if you are going to load a lot of data in a short time, it is wise to separate them so the processes don't interfere with each other.