Please note: I am using Kafka for the first time and do not know much about its terminology.
I want to build a data pipeline that reads logs from multiple servers. The logs are available through REST APIs.
Currently, to read those logs I am using Python GET requests. But since there will be many web servers, and new data needs to be picked up from each server in (near) real time, I am thinking of using Kafka.
Question 1: Can I use Kafka to read the data that I am currently getting with Python GET requests?
Question 2: If yes, how? Is there any architecture reference? If not, how can I read from multiple APIs using Python GET requests?
Question 3: How can I send this data to Elasticsearch?
Question 4: Can I send this data directly from Kafka to TypeDB, skipping Elasticsearch?
Thanks
I'm not sure what this means. Kafka doesn't use HTTP, so it wouldn't be making or receiving GET requests itself.
Use a loop?
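For example, here is a minimal polling sketch, assuming hypothetical server URLs, a topic named `server-logs`, and the `kafka-python` client; the endpoints and polling interval are placeholders:

```python
import json
import time

import requests
from kafka import KafkaProducer  # pip install kafka-python

# Hypothetical log API endpoints; replace with your real servers.
LOG_APIS = [
    "http://server1.example.com/api/logs",
    "http://server2.example.com/api/logs",
]

# Producer that serializes each record as JSON before sending it to Kafka.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    for url in LOG_APIS:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        # Assumes the API returns a JSON list of log entries.
        for entry in resp.json():
            producer.send("server-logs", value=entry)
    producer.flush()
    time.sleep(5)  # poll every few seconds; tune for your latency needs
```

Kafka only buffers and distributes whatever you put into it; something (this loop, Filebeat, a Kafka Connect source, etc.) still has to fetch the data from your APIs.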
I think that is a problem. While you can expose logs through an API, it would really be beneficial (assuming you control the API servers) to write the logs to disk. Then you can use various log forwarding tools (like Elastic's Filebeat) to forward that data to Kafka and/or Elasticsearch.
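As a rough illustration of that setup, a `filebeat.yml` shipping log files into Kafka might look like the following; the paths, broker address, and topic name are assumptions, and Filebeat only allows one output at a time, so you would pick either Kafka or Elasticsearch here:

```yaml
filebeat.inputs:
  - type: log                    # read log files your app writes to disk
    paths:
      - /var/log/myapp/*.log     # assumed log location

output.kafka:                    # or output.elasticsearch to skip Kafka
  hosts: ["kafka-broker:9092"]   # assumed broker address
  topic: "server-logs"           # assumed topic name
```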
Like I said, Filebeat is an option. Otherwise, look into Kafka Connect for both Elasticsearch and TypeDB.
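For the Kafka-to-Elasticsearch hop, here is a hedged sketch of registering Confluent's Elasticsearch sink connector through the Kafka Connect REST API; the connector name, topic, and URLs are assumptions. For TypeDB you would follow the same pattern if a sink connector exists for it, or write a small Kafka consumer that inserts into TypeDB yourself.

```python
import requests

# Assumed address of the Kafka Connect worker's REST API.
CONNECT_URL = "http://localhost:8083/connectors"

connector = {
    "name": "logs-to-elasticsearch",  # hypothetical connector name
    "config": {
        # Confluent's Elasticsearch sink connector.
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "server-logs",                     # topic the producer above writes to
        "connection.url": "http://localhost:9200",   # assumed Elasticsearch URL
        "key.ignore": "true",                        # records have no meaningful keys
        "schema.ignore": "true",                     # plain JSON values, no schema registry
        "tasks.max": "1",
    },
}

resp = requests.post(CONNECT_URL, json=connector)
resp.raise_for_status()
print(resp.json())
```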