I am designing search functionality where a user searches for a word across millions of, say, tweets. I will build an index service that stores some sort of mapping from words to tweets. I also want to introduce a cache that stores the results for the most frequently searched words.

My doubt is this: once the result for some word, say "abc", lands in the cache, all requests for that word will be served from the cache. Suppose the word is trending so heavily that it stays in the cache for a week. During that week there will be a lot of new tweets, and the index mapping will have been updated with them. How can we specify that a cached result which is an older entry should be discarded and fresh results fetched instead? Obviously I could use one of the cache write policies, but I think a write-through policy will affect search performance, because we would write to the cache and the DB at the same time. Am I missing something here? How can I approach this?
Adding cache in search service
Asked by rahul sharma
There is 1 best solution below
For something like tweets, it's okay to respond with search results that are a few minutes or even hours old.
I would not recommend a write-through cache because of the complication it adds. Unless you have some specific use case, a low TTL is the better approach: each cached entry expires after a short time, and the next request for that word goes back to the index and picks up the new tweets. Also, since search systems are pretty good these days, it does not hurt to have the same text searched against the index every few minutes.
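To make the low-TTL idea concrete, here is a minimal in-process sketch. The class name `TTLSearchCache`, the `search_index` callable, the 60-second TTL and the fake clock are all assumptions made up for illustration; in a real deployment you would more likely set an expiry on each key in memcached or a similar store rather than hand-roll this.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLSearchCache:
    """Caches search results per word; each entry expires after ttl_seconds."""

    def __init__(
        self,
        search_index: Callable[[str], List[str]],  # hypothetical index lookup
        ttl_seconds: float = 60.0,                 # assumed TTL, tune to taste
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._search_index = search_index
        self._ttl = ttl_seconds
        self._clock = clock
        # word -> (expiry timestamp, cached results)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def search(self, word: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(word)
        if entry is not None:
            expires_at, results = entry
            if now < expires_at:
                return results  # still fresh: serve from cache
        # Missing or expired: query the index and cache the new results.
        results = self._search_index(word)
        self._entries[word] = (now + self._ttl, results)
        return results


# Demo with a fake index and a fake clock so expiry can be shown without sleeping.
index = {"abc": ["tweet-1"]}
fake_now = [0.0]
cache = TTLSearchCache(
    lambda w: list(index.get(w, [])),
    ttl_seconds=60.0,
    clock=lambda: fake_now[0],
)

first = cache.search("abc")         # cache miss, reads the index
index["abc"].append("tweet-2")      # a new tweet is indexed
fake_now[0] = 30.0
still_cached = cache.search("abc")  # within TTL: stale result is served
fake_now[0] = 61.0
refreshed = cache.search("abc")     # TTL expired: index is queried again
```

Even a word that trends for a week is re-read from the index at most once per TTL window, so results are never staler than the TTL, and the write path to the index does not have to touch the cache at all.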
With that thought, I made this Twitter System Design video. Thoughts?
Or you can find a short summary on CodeKarle's Website here.