It's a deliberately stupid question. But I'm just curious - what would happen if I mount HDFS using FUSE binding as a volume and launch PostgreSQL with a cluster stored on this HDFS volume and start writing massive amounts of data and/or do high-intensity reading?
What would happen if I deployed PostgreSQL with HDFS as the filesystem in a high-load scenario?
Asked by Gill Bates
First, I don't think it's a stupid question. With that said, let's start with some definitions and continue from there:
Fuse:
HDFS (Hadoop Distributed File System):
So I think a short version of your question, @Gill Bates, is: does HDFS affect the performance of a Postgres DB (assuming, of course, that the Postgres cluster is stored in HDFS)?
The short answer is: it depends on your configuration, but likely yes. As mentioned above, you can think of HDFS as a file system, and Postgres stores its data in the file system, so it will be affected by whichever file system you use. Say you perform many read/write operations: one of the great advantages of a distributed file system such as HDFS is that it supports multiple replicas of files, which considerably reduces the common bottleneck of many clients accessing a single file, so that may help it scale better.
So, answering your question directly: what happens if I start writing massive amounts of data and/or do high-intensity reading?
Regardless of whether your file system is HDFS (which may help you scale better and, at the same time, add fault tolerance to your file system), the parameters that could directly determine how well your DB responds under stress tests are:
And of course, it also depends on your stack (how well provisioned your server/host is). In my experience, these are the factors that most affect a Postgres DB (some links attached below may help clarify further).
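As a rough illustration of how the file system shows up in database latency, here is a minimal Python sketch that times small writes followed by fsync in a given directory. It is only an approximation of what a database does when it flushes data to disk, not a Postgres benchmark. The function name time_synced_writes and the target directory are hypothetical, the 8192-byte block size is chosen to match the default Postgres page size, and whether a FUSE-mounted HDFS honors these calls at all depends on the mount and is not something this sketch verifies.

```python
import os
import tempfile
import time


def time_synced_writes(directory, count=50, size=8192):
    """Write `count` blocks of `size` bytes into a temp file in `directory`,
    calling fsync after each write, and return the per-write latencies in seconds."""
    block = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # ask the OS to push the data to the underlying storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Hypothetical target: replace with the mount point you want to compare.
    target = tempfile.gettempdir()
    results = time_synced_writes(target)
    avg_ms = sum(results) / len(results) * 1000
    max_ms = max(results) * 1000
    print(f"avg {avg_ms:.3f} ms, max {max_ms:.3f} ms over {len(results)} writes")
```

Running it once against a local disk and once against the mounted volume should give a first impression of the difference, before setting up a full stress test.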
Hope the above helps to clarify!