My project scrapes 10 websites for about 20,000 records, and the scrape takes too long. I introduced multi-threading, but it still takes a lot of time. Is there a way to scale my servers to make it faster? I am already using multi-threading, so I want to know how to scale out across servers to resolve this.
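To make the setup concrete, here is a minimal sketch of the kind of multi-threaded scrape described above. It is not the actual project code. Python is assumed from the Django tag, and `scrape_record`, `scrape_all`, `shard`, `SITES`, and `RECORDS` are hypothetical names with placeholder data. The `shard` helper only illustrates one possible way the same work list could be split so that several servers each take a disjoint part.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product

# Hypothetical inputs: the real project has 10 websites and about 20,000 records.
SITES = [f"site-{i}" for i in range(3)]
RECORDS = [f"record-{i}" for i in range(5)]


def scrape_record(site: str, record: str) -> dict:
    """Placeholder for the real scraping call (an HTTP request plus parsing)."""
    return {"site": site, "record": record, "status": "ok"}


def scrape_all(sites, records, max_workers=8):
    """Run scrape_record for every (site, record) pair on a thread pool."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(scrape_record, site, record)
            for site, record in product(sites, records)
        ]
        # Collect results as each task finishes, in completion order.
        for future in as_completed(futures):
            results.append(future.result())
    return results


def shard(items, num_shards, index):
    """Return the part of `items` that worker number `index` would handle.

    Assumption: each server is started with its own `index`, so the
    servers process disjoint slices of the same list.
    """
    return items[index::num_shards]
```

With two servers, for example, one would process `shard(RECORDS, 2, 0)` and the other `shard(RECORDS, 2, 1)`, each running its own thread pool over its slice.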